XML Sitemaps and RSS Feeds for Search Indexation | OpsBlu Docs

XML Sitemaps and RSS Feeds for Search Indexation

Configure XML sitemaps and RSS/Atom feeds to accelerate search engine indexation. Covers sitemap generation, RSS autodiscovery, and Google News.

XML sitemaps and RSS feeds serve complementary roles in search engine indexation. Sitemaps provide a complete inventory of your indexable pages. RSS feeds signal freshly published or updated content. Together, they give crawlers both the full picture and real-time updates.

XML Sitemaps

Sitemap Structure

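A well-formed XML sitemap lists only canonical, indexable URLs and gives each one an accurate lastmod value. Google ignores the optional changefreq and priority elements, so they are omitted here: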

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-11-15T08:30:00+00:00</lastmod>
  </url>
  <url>
    <loc>https://example.com/products/widget</loc>
    <lastmod>2024-11-10T14:00:00+00:00</lastmod>
  </url>
</urlset>
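
If you assemble this structure by hand rather than through a library, the main pitfalls are unescaped characters in URLs and inconsistent timestamps. The sketch below is one minimal way to handle both; the `buildSitemap` helper and the `{ loc, lastmod }` entry shape are illustrative assumptions, not part of any library.

```javascript
// Escape the five characters that must be entity-encoded in XML text.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build a sitemap document from [{ loc, lastmod }] entries (hypothetical shape).
// lastmod is optional; when present it is normalized to an ISO 8601 UTC timestamp.
function buildSitemap(entries) {
  const urls = entries.map(({ loc, lastmod }) => {
    const lastmodTag = lastmod
      ? `\n    <lastmod>${new Date(lastmod).toISOString()}</lastmod>`
      : '';
    return `  <url>\n    <loc>${escapeXml(loc)}</loc>${lastmodTag}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    '</urlset>',
  ].join('\n');
}

const xml = buildSitemap([
  { loc: 'https://example.com/', lastmod: '2024-11-15T08:30:00+00:00' },
  { loc: 'https://example.com/search?q=widget&page=2' },
]);
console.log(xml);
```

The second entry shows why escaping matters: the raw `&` in the query string becomes `&amp;`, which keeps the document parseable.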

Image and Video Sitemaps

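Extend your sitemap to include media for image and video search. Image entries require the image namespace on the urlset element and need only image:loc; Google has deprecated image:title and the other descriptive image tags:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/products/widget</loc>
    <lastmod>2024-11-10</lastmod>
    <image:image>
      <image:loc>https://example.com/images/widget-hero.jpg</image:loc>
    </image:image>
  </url>
</urlset>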

For video content, declare the video namespace on the urlset element:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://example.com/tutorials/setup-guide</loc>
    <video:video>
      <video:thumbnail_loc>https://example.com/thumbs/setup.jpg</video:thumbnail_loc>
      <video:title>Widget Setup Guide</video:title>
      <video:description>Step-by-step setup instructions</video:description>
      <video:content_loc>https://example.com/videos/setup.mp4</video:content_loc>
      <video:duration>180</video:duration>
    </video:video>
  </url>
</urlset>

Dynamic Sitemap Generation

For sites with frequently changing content, generate sitemaps dynamically:

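// Express.js sitemap endpoint
// Assumption: `db` is a database client whose query() resolves to an array of
// rows (for example pg-promise). With node-postgres, read `result.rows` instead.
const express = require('express');
const { SitemapStream, streamToPromise } = require('sitemap');

const app = express();

app.get('/sitemap.xml', async (req, res) => {
  try {
    // Fetch all published pages from database
    const pages = await db.query(
      'SELECT url, updated_at FROM pages WHERE status = $1 ORDER BY updated_at DESC',
      ['published']
    );

    const stream = new SitemapStream({ hostname: 'https://example.com' });
    // Start collecting output before writing so the stream is drained as it fills
    const sitemapPromise = streamToPromise(stream);

    pages.forEach(page => {
      stream.write({
        url: page.url,
        lastmod: page.updated_at.toISOString(),
      });
    });
    stream.end();

    const sitemap = await sitemapPromise;

    res.header('Content-Type', 'application/xml');
    // Let clients and CDNs cache the sitemap for one hour
    res.header('Cache-Control', 'public, max-age=3600');
    res.send(sitemap.toString());
  } catch (err) {
    // Without this, a failed query leaves the request hanging in Express 4
    console.error('Sitemap generation failed:', err);
    res.status(500).end();
  }
});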
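
A single sitemap file is limited by the sitemaps.org protocol to 50,000 URLs (and 50 MB uncompressed), so an endpoint that returns every published page eventually needs to split its output and publish a sitemap index. The sketch below shows the splitting logic only; the `/sitemaps/pages-N.xml` path scheme and both helper names are assumptions made for illustration.

```javascript
const MAX_URLS_PER_SITEMAP = 50000; // per-file limit from the sitemaps.org protocol

// Split a list into consecutive chunks of at most `size` items.
function chunkUrls(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Build a sitemap index pointing at one child sitemap per chunk.
// The child path scheme (/sitemaps/pages-N.xml) is a hypothetical convention.
function buildSitemapIndex(baseUrl, chunkCount, lastmod) {
  const entries = [];
  for (let i = 1; i <= chunkCount; i += 1) {
    entries.push(
      '  <sitemap>\n' +
      `    <loc>${baseUrl}/sitemaps/pages-${i}.xml</loc>\n` +
      `    <lastmod>${lastmod}</lastmod>\n` +
      '  </sitemap>'
    );
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</sitemapindex>',
  ].join('\n');
}

// 120,000 URLs need three child sitemaps: 50,000 + 50,000 + 20,000.
const allUrls = Array.from({ length: 120000 }, (_, i) => `https://example.com/p/${i}`);
const urlChunks = chunkUrls(allUrls, MAX_URLS_PER_SITEMAP);
const indexXml = buildSitemapIndex('https://example.com', urlChunks.length, '2024-11-15');
console.log(urlChunks.length);
```

Each chunk would then be served from its own route, with the index submitted in place of the single sitemap.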

RSS and Atom Feeds

Why RSS Matters for SEO

RSS feeds provide several SEO advantages:

  • Faster discovery: Google recommends providing both sitemaps and RSS/Atom feeds. A feed lists only recent changes, so crawlers can fetch it more often than a full sitemap and pick up new URLs sooner
  • Google News discovery: A feed is not a requirement for Google News inclusion, but an RSS or Atom feed (or a Google News sitemap) helps Google find new articles quickly
  • Content syndication: RSS feeds power content aggregators that can drive referral traffic and backlinks
  • Freshness signals: Accurate publication dates in a feed tell crawlers what changed and when, which helps them prioritize recrawling

RSS 2.0 Feed Structure

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example Blog</title>
    <link>https://example.com/blog</link>
    <description>Expert insights on web technology</description>
    <language>en-us</language>
    <lastBuildDate>Fri, 15 Nov 2024 08:00:00 GMT</lastBuildDate>
    <atom:link href="https://example.com/feed.xml" rel="self" type="application/rss+xml"/>

    <item>
      <title>How to Optimize Core Web Vitals</title>
      <link>https://example.com/blog/core-web-vitals</link>
      <guid isPermaLink="true">https://example.com/blog/core-web-vitals</guid>
      <pubDate>Fri, 15 Nov 2024 08:00:00 GMT</pubDate>
      <description>A practical guide to improving LCP, INP, and CLS scores.</description>
    </item>
  </channel>
</rss>
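
RSS 2.0 expects RFC 822 style dates in pubDate and lastBuildDate, which JavaScript's `Date.prototype.toUTCString()` produces directly. The sketch below renders one item from a post record; `buildRssItem` and the `{ title, url, publishedAt, summary }` field names are assumptions for illustration, not part of any feed library.

```javascript
// Entity-encode text so titles and summaries cannot break the XML.
function escapeRssText(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Render a single <item> from a post record (hypothetical field names).
function buildRssItem({ title, url, publishedAt, summary }) {
  return [
    '    <item>',
    `      <title>${escapeRssText(title)}</title>`,
    `      <link>${escapeRssText(url)}</link>`,
    // The permalink doubles as the guid, so the item has a stable identity.
    `      <guid isPermaLink="true">${escapeRssText(url)}</guid>`,
    // toUTCString() yields the "Fri, 15 Nov 2024 08:00:00 GMT" form RSS expects.
    `      <pubDate>${new Date(publishedAt).toUTCString()}</pubDate>`,
    `      <description>${escapeRssText(summary)}</description>`,
    '    </item>',
  ].join('\n');
}

const rssItem = buildRssItem({
  title: 'Improving LCP & INP',
  url: 'https://example.com/blog/core-web-vitals',
  publishedAt: '2024-11-15T08:00:00Z',
  summary: 'A practical guide to improving LCP, INP, and CLS scores.',
});
console.log(rssItem);
```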

Atom Feed Structure

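Atom (RFC 4287) is an alternative to RSS with a stricter specification: every feed and entry must carry an id, a title, and an updated timestamp in RFC 3339 format, and the feed needs an author unless every entry declares its own:

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <link href="https://example.com/blog"/>
  <link rel="self" href="https://example.com/atom.xml"/>
  <updated>2024-11-15T08:00:00Z</updated>
  <id>https://example.com/blog</id>
  <author>
    <name>Example Blog</name>
  </author>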

  <entry>
    <title>How to Optimize Core Web Vitals</title>
    <link href="https://example.com/blog/core-web-vitals"/>
    <id>https://example.com/blog/core-web-vitals</id>
    <updated>2024-11-15T08:00:00Z</updated>
    <summary>A practical guide to improving LCP, INP, and CLS scores.</summary>
  </entry>
</feed>
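
Because Atom makes id, title, and updated mandatory on every entry, it can be worth checking entries before they are serialized. The sketch below is one way to do that; `checkAtomEntry` and the plain-object entry shape are illustrative assumptions, and the timestamp pattern covers the common RFC 3339 forms rather than every edge case.

```javascript
const REQUIRED_ENTRY_FIELDS = ['id', 'title', 'updated'];

// Matches timestamps such as 2024-11-15T08:00:00Z or 2024-11-15T08:00:00+01:00.
const RFC3339_PATTERN = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;

// Return a list of human-readable problems; an empty list means the entry passes.
function checkAtomEntry(entry) {
  const problems = REQUIRED_ENTRY_FIELDS
    .filter((field) => !entry[field])
    .map((field) => `missing <${field}>`);
  if (entry.updated && !RFC3339_PATTERN.test(entry.updated)) {
    problems.push('<updated> is not an RFC 3339 timestamp');
  }
  return problems;
}

const validEntry = {
  id: 'https://example.com/blog/core-web-vitals',
  title: 'How to Optimize Core Web Vitals',
  updated: '2024-11-15T08:00:00Z',
};
// An RSS-style date and a missing id are both rejected.
const invalidEntry = {
  title: 'How to Optimize Core Web Vitals',
  updated: 'Fri, 15 Nov 2024 08:00:00 GMT',
};

console.log(checkAtomEntry(validEntry));
console.log(checkAtomEntry(invalidEntry));
```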

RSS Autodiscovery

Add autodiscovery links so browsers and crawlers find your feeds automatically:

<head>
  <link rel="alternate" type="application/rss+xml"
        title="Example Blog RSS Feed"
        href="https://example.com/feed.xml">
  <link rel="alternate" type="application/atom+xml"
        title="Example Blog Atom Feed"
        href="https://example.com/atom.xml">
</head>
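
To confirm that templates actually emit these tags, a small check can scan rendered HTML for feed links. The sketch below uses regular expressions and only handles double-quoted attributes, which is an assumption that fits the markup above; a production check would be better served by a real HTML parser. The `findFeedLinks` name is illustrative.

```javascript
// Collect autodiscovery links (rel="alternate" with an RSS or Atom type) from HTML.
function findFeedLinks(html) {
  const feeds = [];
  const linkTags = html.match(/<link\b[^>]*>/gi) || [];
  for (const tag of linkTags) {
    // Parse double-quoted attributes into a lowercase-keyed map.
    const attrs = {};
    const attrPattern = /([\w-]+)\s*=\s*"([^"]*)"/g;
    let match;
    while ((match = attrPattern.exec(tag)) !== null) {
      attrs[match[1].toLowerCase()] = match[2];
    }
    const isFeedType = /^application\/(rss|atom)\+xml$/i.test(attrs.type || '');
    const isAlternate = (attrs.rel || '').toLowerCase() === 'alternate';
    if (isAlternate && isFeedType && attrs.href) {
      feeds.push({ type: attrs.type, title: attrs.title || '', href: attrs.href });
    }
  }
  return feeds;
}

const sampleHtml = `
<head>
  <link rel="stylesheet" href="/main.css">
  <link rel="alternate" type="application/rss+xml"
        title="Example Blog RSS Feed"
        href="https://example.com/feed.xml">
  <link rel="alternate" type="application/atom+xml"
        title="Example Blog Atom Feed"
        href="https://example.com/atom.xml">
</head>`;

const discoveredFeeds = findFeedLinks(sampleHtml);
console.log(discoveredFeeds.map((feed) => feed.href));
```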

Feed Best Practices

  • Limit feed size: Include the most recent 20-50 items. Larger feeds slow down parsers.
  • Include full GUIDs: Use permanent, unique URLs as guid values to prevent duplicate indexation
  • Update timestamps accurately: Only update lastBuildDate or <updated> when content genuinely changes
  • Validate feeds: Use the W3C Feed Validation Service to check syntax
  • Reference in robots.txt: Add Sitemap: directives for your XML sitemaps. Feeds are normally discovered via autodiscovery tags, although Google also accepts RSS 2.0 and Atom 1.0 feeds submitted as sitemaps
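
The size and GUID recommendations above can be enforced at the point where posts are selected for the feed. The sketch below keeps the newest posts, drops repeated URLs, and caps the list; `selectFeedItems`, the `{ url, publishedAt }` shape, and the default limit of 30 are assumptions chosen for illustration.

```javascript
// Pick the newest `limit` posts, keeping one item per URL so guid values stay unique.
function selectFeedItems(posts, limit = 30) {
  const seenUrls = new Set();
  return [...posts] // copy so the caller's array is not reordered
    .sort((a, b) => new Date(b.publishedAt) - new Date(a.publishedAt))
    .filter((post) => {
      if (seenUrls.has(post.url)) return false;
      seenUrls.add(post.url);
      return true;
    })
    .slice(0, limit);
}

const samplePosts = [
  { url: 'https://example.com/blog/a', publishedAt: '2024-11-10T09:00:00Z' },
  { url: 'https://example.com/blog/b', publishedAt: '2024-11-15T08:00:00Z' },
  { url: 'https://example.com/blog/c', publishedAt: '2024-11-12T12:00:00Z' },
  { url: 'https://example.com/blog/a', publishedAt: '2024-11-10T09:00:00Z' },
];

const feedItems = selectFeedItems(samplePosts, 2);
console.log(feedItems.map((post) => post.url));
```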

Monitoring

  • Check the Search Console Sitemaps report weekly for submission status and error counts
  • Verify that the RSS feed is well-formed by loading it in a browser or feed reader
  • Monitor indexation rate: on established sites, new content published via RSS often appears in Google within 24-48 hours; treat this as a rough benchmark rather than a guarantee
  • Track the ratio of submitted sitemap URLs vs. indexed URLs as a site health metric
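
The submitted-versus-indexed ratio is simple to compute once the two counts have been read from the Search Console Sitemaps report. The sketch below turns them into a status line; the function names and the 0.8 alert threshold are arbitrary assumptions for illustration, not a recommended target.

```javascript
// Fraction of submitted URLs that are indexed, or null when nothing was submitted.
function indexCoverage(submitted, indexed) {
  if (!Number.isFinite(submitted) || submitted <= 0) return null;
  return indexed / submitted;
}

// Summarize coverage; `alertBelow` is a hypothetical threshold to tune per site.
function describeCoverage(submitted, indexed, alertBelow = 0.8) {
  const ratio = indexCoverage(submitted, indexed);
  if (ratio === null) return 'no URLs submitted';
  const percent = (ratio * 100).toFixed(1);
  return ratio < alertBelow
    ? `ALERT: ${percent}% indexed`
    : `OK: ${percent}% indexed`;
}

console.log(describeCoverage(1200, 1080));
console.log(describeCoverage(1200, 600));
```

Tracking this value over time is more informative than any single reading, since a sudden drop usually points to a crawl or canonicalization problem.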