feat(seo): canonical, JSON-LD, sitemap directive, richer meta descriptions #533

Draft
tym83 wants to merge 4 commits into main from feat/seo-canonical-jsonld

@tym83 tym83 commented May 8, 2026

Summary

Adds the missing technical-SEO scaffolding to cozystack.io plus AI-search optimization (GEO). The site is currently indexable but emits no canonical link, no structured data, no AI-crawler discovery file, and lets five duplicate doc-version copies (v0, v1.0, v1.1, v1.2, next) compete for the same ranking signals.

This PR addresses the highest-leverage fixes without changing site architecture or adding any vendor-specific outbound links.

What changed

Technical SEO

  • layouts/robots.txt — adds an explicit Sitemap: directive so Bing, Yandex, and other crawlers discover the sitemap reliably.
  • layouts/partials/hooks/head-end.html:
    • Emits <link rel="canonical"> pointing to each page's permalink.
    • Emits <meta name="robots" content="noindex, follow"> on legacy doc versions. Pages stay reachable via direct link; they no longer compete with the current version for ranking authority.
    • Inlines JSON-LD Organization (every page) — name, URL, logo, description, foundingDate, sameAs (GitHub, CNCF Landscape, Slack, Telegram).
    • Inlines JSON-LD WebSite on the homepage, exposing the site's name, URL, and description to search engines and rich-result generators.
    • Inlines JSON-LD BlogPosting on single blog posts — eligibility for Google Discover and AI Overview citation.

AI search optimization (GEO)

  • static/llms.txt — vendor-neutral guide for AI crawlers (Anthropic, OpenAI, Perplexity). Describes the project, docs structure, releases, community, authoritative facts. No marketing content; no funnel-links.
  • layouts/partials/hooks/head-end.html — adds JSON-LD SoftwareApplication on homepage. Helps AI engines classify Cozystack correctly when answering "what is Cozystack" type queries (Apache 2.0 license, free offer, dependency hints).

Content

  • hugo.yaml: params.description rewritten to cover the platform's core capability surface (VMs, managed databases, S3, GPU) and the CNCF Sandbox status.
  • content/en/docs/v1.2/{,applications,virtualization,storage,networking,operations}/_index.md: description frontmatter rewritten on each section index to name the underlying components (KubeVirt, LINSTOR, Cilium eBPF, VictoriaMetrics, Velero, etc.) and concrete services. Each stays under ~155 characters so SERP snippets are not truncated.

Why

  • The site's strongest backlinks (kubernetes.io 44 dofollow, ripe.net 102 links, opennet.ru 107) are not converting to ranking authority because search engines split signals across five duplicate copies of every doc page. Marking legacy versions noindex consolidates authority on the current version.
  • Without Organization schema, the project is not modeled as a recognized entity in Google's Knowledge Graph and tends not to be cited in AI Overview results.
  • llms.txt is an emerging standard (2025-2026). Anthropic, OpenAI, and Perplexity have begun reading it. Adding a vendor-neutral file describing the project gives AI search engines an authoritative, structured reference when answering Cozystack-related queries.
  • SoftwareApplication JSON-LD on the homepage is the schema.org pattern AI engines look for when classifying software products. Combined with the Organization schema added in this PR, it gives the entity recognition signals needed for AI search citation.

Vendor neutrality

All changes preserve cozystack.io's vendor-neutral CNCF positioning:

  • llms.txt describes the project, not Aenix
  • No funnel-links to aenix.io
  • No commercial CTAs
  • No marketing copy

Test plan

  • make serve builds without errors
  • HTML output validates: canonical link, robots noindex on legacy versions, JSON-LD Organization/WebSite/BlogPosting present
  • /llms.txt returns the static file as expected
  • /robots.txt includes the Sitemap directive
  • Production deploy pending merge
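
The HTML checks in the test plan above could be scripted. The following is a minimal Python sketch, not part of this PR: `SeoTagCollector` and `inspect_page` are hypothetical names, and it assumes each JSON-LD script block holds a single JSON object.

```python
import json
from html.parser import HTMLParser


class SeoTagCollector(HTMLParser):
    """Collects the canonical link, robots meta, and JSON-LD types of one page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None
        self.jsonld_types = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "meta" and attrs.get("name") == "robots":
            self.robots = attrs.get("content")
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            doc = json.loads("".join(self._buffer))
            # Assumption: one JSON object per block; other shapes are skipped.
            if isinstance(doc, dict):
                self.jsonld_types.append(doc.get("@type"))


def inspect_page(html):
    """Parse one rendered HTML page and return the collected SEO signals."""
    collector = SeoTagCollector()
    collector.feed(html)
    collector.close()
    return collector
```

Running `inspect_page` over files in Hugo's output directory and comparing `canonical`, `robots`, and `jsonld_types` against the expectations listed above would turn the manual check into a repeatable one.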

tym83 and others added 3 commits May 8, 2026 23:12
The default Hugo-generated robots.txt only contained `User-agent: *`
with no Sitemap declaration. Search engines could find the sitemap by
direct probe, but Bing and Yandex rely on the directive to discover it
reliably. Adding a Sitemap line makes the canonical sitemap location
explicit to all crawlers.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: tym83 <6355522@gmail.com>
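
One way to confirm that the directive from the commit above is picked up is to parse the generated file with Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later). This is a sketch only: `sitemap_urls` is a hypothetical helper, and the example.org URL is a placeholder rather than the site's real sitemap location.

```python
from urllib.robotparser import RobotFileParser


def sitemap_urls(robots_txt):
    """Return the Sitemap URLs declared in a robots.txt body (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# Placeholder inputs: the old default output versus a file with the directive.
before = "User-agent: *\n"
after = "User-agent: *\nSitemap: https://example.org/sitemap.xml\n"

print(sitemap_urls(before))
print(sitemap_urls(after))
```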
…rsions

The site previously emitted no canonical link tag and no structured
data. Pages across legacy documentation versions (v0, v1.0, v1.1)
duplicate the latest version's content but had no dedup signals, so
search engines split ranking authority across all five copies.

Changes:

- Emit `<link rel="canonical">` pointing to the page's own permalink for
  every page outside legacy doc versions.
- Emit `<meta name="robots" content="noindex, follow">` on legacy
  doc-version pages (any version present in `params.versions[]` whose
  `id` is neither `params.latest_version_id` nor `next`). Pages remain
  reachable for users following links; they no longer compete with the
  current version for ranking.
- Inline JSON-LD `Organization` schema on every page so search engines
  can build a consistent knowledge entity for Cozystack (CNCF Landscape,
  GitHub, Slack, Telegram in `sameAs`).
- Inline JSON-LD `WebSite` on the homepage to expose the site's name,
  URL, and description to AI search and rich result generators.
- Inline JSON-LD `BlogPosting` on single blog posts with title,
  description, dates, author, image, and publisher — required for
  Google Discover eligibility and AI Overview citation.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: tym83 <6355522@gmail.com>
…erage

Section index pages and the site default `params.description` carried
generic blurbs ("Free Cloud Platform based on Kubernetes",
"Operational guides on the storage subsystem") with little keyword
overlap with the actual content. Search snippets from these pages
gave little context to scanning users and missed terms that real
queries use ("KubeVirt", "LINSTOR", "VictoriaMetrics", "managed
PostgreSQL", "Cilium eBPF").

Updates:

- Site default description now covers the platform's main capability
  set (VMs, managed databases, S3, GPU) and its CNCF Sandbox status.
- v1.2 docs root, applications, virtualization, storage, networking,
  and operations section indexes each get descriptions naming the
  underlying components and concrete services they document.

Each description stays under ~155 characters to fit a typical SERP
snippet without truncation.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: tym83 <6355522@gmail.com>
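
The ~155-character budget mentioned in the commit above could be enforced with a small check. The sketch below is an assumption-laden illustration, not code from this PR: `extract_description` and `overlong_descriptions` are hypothetical helpers, the regex only handles a single-line `description:` key, and 155 is the figure quoted in the commit message rather than a limit published by any search engine.

```python
import re

# Figure quoted in the commit message; treated here as a soft budget.
MAX_SNIPPET_CHARS = 155

# Matches a single-line `description: "..."` (quotes optional) in front matter.
_DESCRIPTION_RE = re.compile(r'^description:\s*"?(.*?)"?\s*$', re.MULTILINE)


def extract_description(front_matter):
    """Return the description value from a front matter string, or None."""
    match = _DESCRIPTION_RE.search(front_matter)
    return match.group(1) if match else None


def overlong_descriptions(pages, limit=MAX_SNIPPET_CHARS):
    """Map each path to its description length when that length exceeds limit."""
    result = {}
    for path, front_matter in pages.items():
        description = extract_description(front_matter)
        if description is not None and len(description) > limit:
            result[path] = len(description)
    return result
```

Feeding it a mapping of `_index.md` paths to their front matter text would list every section index whose description exceeds the budget.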

netlify Bot commented May 8, 2026

Deploy Preview for cozystack ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 3199165 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/cozystack/deploys/69ffe52eb7e17c00091d964f |
| 😎 Deploy Preview | https://deploy-preview-533--cozystack.netlify.app |


coderabbitai Bot commented May 8, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 21a8737b-0f21-4c6a-8d78-c49c2b943c7e

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements SEO enhancements by updating meta descriptions across documentation sections and the site configuration, adding a robots.txt file, and introducing JSON-LD structured data for Organization, WebSite, and BlogPosting types. It also includes logic for managing canonical URLs and noindex tags for legacy documentation versions. Feedback identifies a potential version mismatch in the SEO logic that could lead to the current documentation being incorrectly excluded from search results and suggests providing a fallback for blog post descriptions in the structured data.

Comment on lines +19 to +29
{{- $latestVersion := .Site.Params.latest_version_id | default "v1.2" -}}
{{- $isOldDocsVersion := false -}}
{{- range .Site.Params.versions -}}
{{- if and .id (ne .id $latestVersion) (ne .id "next") -}}
{{- if in $.RelPermalink (printf "/docs/%s/" .id) -}}
{{- $isOldDocsVersion = true -}}
{{- end -}}
{{- end -}}
{{- end -}}
{{- if $isOldDocsVersion }}
<meta name="robots" content="noindex, follow" />

high

There is a discrepancy between the SEO logic and the site configuration. The hugo.yaml file (line 133) defines latest_version_id as v1.3, but this PR applies SEO improvements to the v1.2 documentation. Under the current logic in lines 21-27, all v1.2 pages will be marked with noindex (line 29), which negates the value of the new meta descriptions and JSON-LD for those pages. If v1.2 is intended to be the primary version for search engines, latest_version_id in hugo.yaml should be updated to v1.2.
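
The concern can be reproduced outside Hugo with a small Python sketch of the same selection logic. The function name and the version list below are assumptions for illustration; the real list lives in `params.versions` in hugo.yaml.

```python
def is_legacy_docs_page(rel_permalink, versions, latest_version_id="v1.2"):
    """Mirror the template: a page is legacy if its path falls under a version
    that is neither the latest version nor "next"."""
    for version in versions:
        version_id = version.get("id")
        if not version_id or version_id in (latest_version_id, "next"):
            continue
        # Same substring test as `in $.RelPermalink (printf "/docs/%s/" .id)`.
        if f"/docs/{version_id}/" in rel_permalink:
            return True
    return False


# Hypothetical version list, including the v1.3 entry named in the review.
versions = [{"id": v} for v in ("v0", "v1.0", "v1.1", "v1.2", "v1.3", "next")]

# With latest_version_id set to v1.3, v1.2 pages are treated as legacy.
print(is_legacy_docs_page("/docs/v1.2/storage/", versions, latest_version_id="v1.3"))
# With latest_version_id set to v1.2, they are not.
print(is_legacy_docs_page("/docs/v1.2/storage/", versions, latest_version_id="v1.2"))
```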

"@context": "https://schema.org",
"@type": "BlogPosting",
"headline": {{ .Title | jsonify }},
"description": {{ .Description | jsonify }},

medium

If a blog post is missing the description field in its frontmatter, the BlogPosting JSON-LD will have an empty description field. Using .Summary as a fallback ensures a valid description is always provided for search engines.

Suggested change
- "description": {{ .Description | jsonify }},
+ "description": {{ .Description | default .Summary | jsonify }},
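
The effect of the suggested fallback can be illustrated with a short Python sketch. It approximates Hugo's `default` with Python's `or` (both fall back when the description is an empty string); `blog_posting_jsonld` is a hypothetical helper, not code from this PR, and only the two fields shown in the excerpt are included.

```python
import json


def blog_posting_jsonld(title, description, summary):
    """Build a minimal BlogPosting object, using the summary when the
    description is empty or missing."""
    return json.dumps(
        {
            "@context": "https://schema.org",
            "@type": "BlogPosting",
            "headline": title,
            "description": description or summary,
        },
        indent=2,
    )


# A post without a description in its front matter still gets a non-empty field.
print(blog_posting_jsonld("Example post", "", "First paragraph of the post."))
```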

Cozystack is increasingly cited by AI search engines (Google AI Overview,
ChatGPT search, Perplexity, Claude). Two improvements:

1. /llms.txt — emerging standard for AI crawlers (Anthropic, OpenAI,
   Perplexity read it). Vendor-neutral content describing the project,
   docs structure, releases, community, and authoritative facts.

2. SoftwareApplication JSON-LD on homepage — helps AI engines classify
   Cozystack correctly when answering "what is Cozystack" type queries.
   Adds Apache 2.0 license, free offer, and dependency hints.

Both changes are vendor-neutral. No marketing content; no funnel-links.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: tym83 <6355522@gmail.com>