feat(seo): canonical, JSON-LD, sitemap directive, richer meta descriptions #533
The default Hugo-generated robots.txt only contained `User-agent: *` with no Sitemap declaration. Search engines could find the sitemap by direct probe, but Bing and Yandex rely on the directive to discover it reliably. Adding a Sitemap line makes the canonical sitemap location explicit to all crawlers.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: tym83 <6355522@gmail.com>
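A minimal sketch of what such a template could look like. It assumes `enableRobotsTXT` is turned on in the site configuration so that `layouts/robots.txt` replaces Hugo's built-in template, and that the sitemap is published at the default `sitemap.xml` path; the file actually shipped in this PR may differ.

```go-html-template
{{- /* Sketch only: assumes the sitemap lives at the default /sitemap.xml path. */ -}}
User-agent: *
Sitemap: {{ "sitemap.xml" | absURL }}
```

Using `absURL` keeps the directive tied to the configured `baseURL`, so preview and production builds would each point at their own sitemap.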
…rsions
The site previously emitted no canonical link tag and no structured data. Pages across legacy documentation versions (v0, v1.0, v1.1) duplicate the latest version's content but had no dedup signals, so search engines split ranking authority across all five copies.
Changes:
- Emit `<link rel="canonical">` pointing to the page's own permalink for every page outside legacy doc versions.
- Emit `<meta name="robots" content="noindex, follow">` on legacy doc-version pages (any version present in `params.versions[]` whose `id` is neither `params.latest_version_id` nor `next`). Pages remain reachable for users following links; they no longer compete with the current version for ranking.
- Inline JSON-LD `Organization` schema on every page so search engines can build a consistent knowledge entity for Cozystack (CNCF Landscape, GitHub, Slack, Telegram in `sameAs`).
- Inline JSON-LD `WebSite` on the homepage to expose the site's name, URL, and description to AI search and rich result generators.
- Inline JSON-LD `BlogPosting` on single blog posts with title, description, dates, author, image, and publisher; these are required for Google Discover eligibility and AI Overview citation.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: tym83 <6355522@gmail.com>
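For illustration, the every-page `Organization` entity could be emitted roughly as follows. This is a sketch under assumptions, not the PR's partial: the logo path and both `sameAs` URLs are placeholders, and building the object with `dict` before serializing it is just one common Hugo pattern.

```go-html-template
{{- /* Sketch only: the logo path and sameAs URLs are placeholders, not the PR's actual values. */ -}}
{{- $org := dict
    "@context" "https://schema.org"
    "@type" "Organization"
    "name" site.Title
    "url" site.BaseURL
    "logo" ("img/logo.svg" | absURL)
    "description" site.Params.description
    "sameAs" (slice "https://github.com/example-org" "https://example.com/community-chat")
-}}
<script type="application/ld+json">{{ $org | jsonify | safeJS }}</script>
```

Serializing the whole object in one step with `jsonify | safeJS` sidesteps per-field quoting inside the `<script>` element, where Go's template engine applies JavaScript-context escaping.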
…erage
Section index pages and the site default `params.description` carried
generic blurbs ("Free Cloud Platform based on Kubernetes",
"Operational guides on the storage subsystem") with little keyword
overlap with the actual content. Search snippets from these pages
gave little context to scanning users and missed terms that real
queries use ("KubeVirt", "LINSTOR", "VictoriaMetrics", "managed
PostgreSQL", "Cilium eBPF").
Updates:
- Site default description now covers the platform's main capability
set (VMs, managed databases, S3, GPU) and its CNCF Sandbox status.
- v1.2 docs root, applications, virtualization, storage, networking,
and operations section indexes each get descriptions naming the
underlying components and concrete services they document.
Each description stays under ~155 characters to fit a typical SERP
snippet without truncation.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: tym83 <6355522@gmail.com>
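The ~155-character budget could also be checked at build time. A hedged sketch, assuming it is placed in a partial that receives the page as its context; the warning text and the hard-coded limit are illustrative and not part of this PR.

```go-html-template
{{- /* Sketch only: 155 mirrors the limit named in the commit message above. */ -}}
{{- with .Description -}}
  {{- if gt (strings.RuneCount .) 155 -}}
    {{- warnf "description is %d characters (limit 155): %s" (strings.RuneCount .) $.RelPermalink -}}
  {{- end -}}
{{- end -}}
```

`warnf` only logs during the build, so an over-long description would show up in the build output without failing the deploy.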
Code Review
This pull request implements SEO enhancements by updating meta descriptions across documentation sections and the site configuration, adding a robots.txt file, and introducing JSON-LD structured data for Organization, WebSite, and BlogPosting types. It also includes logic for managing canonical URLs and noindex tags for legacy documentation versions. Feedback identifies a potential version mismatch in the SEO logic that could lead to the current documentation being incorrectly excluded from search results and suggests providing a fallback for blog post descriptions in the structured data.
    {{- $latestVersion := .Site.Params.latest_version_id | default "v1.2" -}}
    {{- $isOldDocsVersion := false -}}
    {{- range .Site.Params.versions -}}
      {{- if and .id (ne .id $latestVersion) (ne .id "next") -}}
        {{- if in $.RelPermalink (printf "/docs/%s/" .id) -}}
          {{- $isOldDocsVersion = true -}}
        {{- end -}}
      {{- end -}}
    {{- end -}}
    {{- if $isOldDocsVersion }}
    <meta name="robots" content="noindex, follow" />
There is a discrepancy between the SEO logic and the site configuration. The hugo.yaml file (line 133) defines latest_version_id as v1.3, but this PR applies SEO improvements to the v1.2 documentation. Under the current logic in lines 21-27, all v1.2 pages will be marked with noindex (line 29), which negates the value of the new meta descriptions and JSON-LD for those pages. If v1.2 is intended to be the primary version for search engines, latest_version_id in hugo.yaml should be updated to v1.2.
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": {{ .Title | jsonify }},
    "description": {{ .Description | jsonify }},
If a blog post is missing the description field in its frontmatter, the BlogPosting JSON-LD will have an empty description field. Using .Summary as a fallback ensures a valid description is always provided for search engines.
    - "description": {{ .Description | jsonify }},
    + "description": {{ .Description | default .Summary | jsonify }},
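One possible refinement of this fallback, offered as a sketch rather than as part of the suggestion: `.Summary` can contain HTML and run long, so it may be worth stripping markup and capping the length before it reaches the structured data. The `$desc` variable name and the 155-character cutoff are assumptions.

```go-html-template
{{- /* Sketch only: $desc and the 155-character cutoff are illustrative assumptions. */ -}}
{{- $desc := .Description | default (.Summary | plainify | truncate 155) -}}
```

`$desc` could then stand in for `.Description` on the `description` line of the `BlogPosting` block.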
Cozystack is increasingly cited by AI search engines (Google AI Overview, ChatGPT search, Perplexity, Claude). Two improvements:
1. /llms.txt: emerging standard for AI crawlers (Anthropic, OpenAI, Perplexity read it). Vendor-neutral content describing the project, docs structure, releases, community, and authoritative facts.
2. SoftwareApplication JSON-LD on homepage: helps AI engines classify Cozystack correctly when answering "what is Cozystack" type queries. Adds Apache 2.0 license, free offer, and dependency hints.
Both changes are vendor-neutral. No marketing content; no funnel-links.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: tym83 <6355522@gmail.com>
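A rough sketch of how a homepage-only `SoftwareApplication` block might be gated and built. The `applicationCategory`, `operatingSystem`, and currency values are illustrative assumptions, and the dependency hints mentioned above are not shown; only the Apache 2.0 license and the free offer come from the commit message.

```go-html-template
{{- /* Sketch only: applicationCategory, operatingSystem, and priceCurrency are assumed values. */ -}}
{{- if .IsHome -}}
  {{- $app := dict
      "@context" "https://schema.org"
      "@type" "SoftwareApplication"
      "name" site.Title
      "url" site.BaseURL
      "description" site.Params.description
      "applicationCategory" "DeveloperApplication"
      "operatingSystem" "Linux"
      "license" "https://www.apache.org/licenses/LICENSE-2.0"
      "offers" (dict "@type" "Offer" "price" "0" "priceCurrency" "USD")
  -}}
  <script type="application/ld+json">{{ $app | jsonify | safeJS }}</script>
{{- end -}}
```

Gating on `.IsHome` keeps the product-level entity on a single URL, while page-level schemas stay on their own pages.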
Summary
Adds the missing technical-SEO scaffolding to cozystack.io plus AI-search optimization (GEO). The site is currently indexable but emits no canonical link, no structured data, no AI-crawler discovery file, and lets five duplicate doc-version copies (v0, v1.0, v1.1, v1.2, next) compete for the same ranking signals.
This PR addresses the highest-leverage fixes without changing site architecture or adding any vendor-specific outbound links.
What changed
Technical SEO
- `layouts/robots.txt`: adds an explicit `Sitemap:` directive so Bing, Yandex, and other crawlers discover the sitemap reliably.
- `layouts/partials/hooks/head-end.html`:
  - `<link rel="canonical">` pointing to each page's permalink.
  - `<meta name="robots" content="noindex, follow">` on legacy doc versions. Pages stay reachable via direct link; they no longer compete with the current version for ranking authority.
  - `Organization` (every page): name, URL, logo, description, foundingDate, `sameAs` (GitHub, CNCF Landscape, Slack, Telegram).
  - `WebSite` on the homepage: eligibility for sitelinks searchbox.
  - `BlogPosting` on single blog posts: eligibility for Google Discover and AI Overview citation.
AI search optimization (GEO)
- `static/llms.txt`: vendor-neutral guide for AI crawlers (Anthropic, OpenAI, Perplexity). Describes the project, docs structure, releases, community, authoritative facts. No marketing content; no funnel-links.
- `layouts/partials/hooks/head-end.html`: adds JSON-LD `SoftwareApplication` on the homepage. Helps AI engines classify Cozystack correctly when answering "what is Cozystack" type queries (Apache 2.0 license, free offer, dependency hints).
Content
- `hugo.yaml`: `params.description` rewritten to cover the platform's core capability surface (VMs, managed databases, S3, GPU) and the CNCF Sandbox status.
- `content/en/docs/v1.2/{,applications,virtualization,storage,networking,operations}/_index.md`: `description` frontmatter rewritten on each section index to name the underlying components (KubeVirt, LINSTOR, Cilium eBPF, VictoriaMetrics, Velero, etc.) and concrete services. Each stays under ~155 characters so SERP snippets are not truncated.
Why
- Without `Organization` schema, the project is not modeled as a recognized entity in Google's Knowledge Graph and tends not to be cited in AI Overview results.
- `llms.txt` is an emerging standard (2025-2026). Anthropic, OpenAI, and Perplexity have begun reading it. Adding a vendor-neutral file describing the project gives AI search engines an authoritative, structured reference when answering Cozystack-related queries.
- `SoftwareApplication` JSON-LD on the homepage is the schema.org pattern AI engines look for when classifying software products. Combined with the existing `Organization` schema, it gives the entity recognition signals needed for AI search citation.
Vendor neutrality
All changes preserve cozystack.io's vendor-neutral CNCF positioning:
- `llms.txt` describes the project, not Aenix
Test plan
- `make serve` builds without errors
- `/llms.txt` returns the static file as expected
- `/robots.txt` includes the Sitemap directive