Automatically Generating a Sitemap with Node.js and JavaScript

Sitemaps are an important part of SEO. Google and other search engines can use a sitemap to discover the pages on your site. In this tutorial we will create an automatically generated sitemap with Node.js and Express.

I will be using MongoDB as the database tool, but if you use MySQL or something else, you can easily swap these components out.

Concept

Most sites are split into two vague categories:

  • Pages whose URLs are created over time (e.g. articles, blog posts, etc).
  • Pages whose URLs rarely change (e.g. home pages, etc).

To automate a sitemap, we can list out the URLs we know are unlikely to change, and then use JavaScript to read new pages from the database and add them to our sitemap. Google and other search engines re-crawl the sitemap periodically, so new URLs can be discovered without any manual work.
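
The idea can be sketched in a few lines of plain JavaScript before any framework is involved. In this minimal sketch, `staticPaths`, `buildUrlList`, and the sample records are hypothetical names chosen for illustration, and the records stand in for rows fetched from a database.

```javascript
// Hypothetical list of pages whose URLs rarely change.
const staticPaths = ['/', '/about'];

// Combine the fixed paths with paths derived from database records.
// `records` stands in for rows fetched from a database; each has a `name`.
function buildUrlList(records) {
  const dynamicPaths = records.map(({ name }) => `/article/${name}`);
  // A Set drops accidental duplicates while keeping insertion order.
  return [...new Set([...staticPaths, ...dynamicPaths])];
}

const urlList = buildUrlList([{ name: 'first-post' }, { name: 'second-post' }]);
console.log(urlList);
```

The rest of this tutorial does the same thing, except that the records come from MongoDB and the list is written out as XML.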

1. Install Components

First of all, we have to install all the components we need to accomplish this. Below is the skeleton of our index.js file:

javascript
import express from 'express'
import { SitemapStream, streamToPromise } from 'sitemap'
import mongoose from 'mongoose'

const port = 3000
const app = express()

let sitemap;

app.get('/sitemap.xml', async function(req, res) {
    try {
        // Code to try
    } catch (e) {
        // Catch errors
        console.log(e);
    }
});

app.listen(port, function() {
    console.log(`Listening on port ${port}`)
});

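The three main components we need for this demo are express, mongoose and sitemap. As mentioned, I am using MongoDB, so mongoose is relevant here. If you use a different database, you might have to use different packages. Because the code uses import syntax, either set "type": "module" in your package.json or name the file index.mjs. All of the packages can be installed with the commands below: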

shell
npm i express
npm i sitemap
npm i mongoose

2. Get your database contents

Before we begin, let's get our database contents. With MongoDB and mongoose, I can do the following to get my database contents:

javascript
// Article and Category are your own model modules (not shown here)
const allArticles = await Article.Article.find({ published: true }).select('name')
const allCategories = await Category.Category.find().select('name')

Now we have two arrays of documents, one for articles and one for categories, each of which corresponds to a URL. Next, let's map that data to simple arrays of URLs:

javascript
// Use the query results from the previous step
const articles = allArticles.map( ({ name }) => `/article/${name}`)
const categories = allCategories.map( ({ name }) => `/category/${name}`)
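
If article or category names can contain spaces or other characters that are not safe in a URL, it may be worth encoding them before building the path. The sketch below assumes that `name` is used directly as the URL slug, as in the mapping above; `toPath` is a hypothetical helper, not part of mongoose or sitemap.

```javascript
// Hypothetical helper: build a URL path from a section prefix and a record name.
function toPath(prefix, name) {
  // encodeURIComponent escapes spaces, slashes, and other reserved characters.
  return `/${prefix}/${encodeURIComponent(name)}`;
}

// Sample records standing in for the results of the database queries.
const sampleArticles = [{ name: 'hello world' }, { name: 'node-sitemaps' }];
const articlePaths = sampleArticles.map(({ name }) => toPath('article', name));
console.log(articlePaths);
```

If your names are already stored as URL-safe slugs, the plain template strings are enough.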

3. Combine with sitemap

Now let's write all of those URLs to our sitemap. We write each article and category URL to a stream, which is gzipped and sent to whoever requests the sitemap. The final code for our sitemap.xml route looks like this:

javascript
// Add this import at the top of index.js
import { createGzip } from 'zlib'

// Cached copy of the gzipped sitemap, filled on the first request
let sitemap;

app.get('/sitemap.xml', async function(req, res) {
    res.header('Content-Type', 'application/xml');
    res.header('Content-Encoding', 'gzip');

    // Serve the cached sitemap if we already built one
    if (sitemap) {
        res.send(sitemap)
        return
    }

    try {
        const allArticles = await Article.Article.find({ published: true }).select('name')
        const allCategories = await Category.Category.find().select('name')
        const articles = allArticles.map( ({ name }) => `/article/${name}`)
        const categories = allCategories.map( ({ name }) => `/category/${name}`)

        // Change yourWebsite.com to your website's URL
        const smStream = new SitemapStream({ hostname: 'https://yourWebsite.com/' })
        const pipeline = smStream.pipe(createGzip())

        // Add each article URL to the stream
        articles.forEach(function(item) {
            // Update as required
            smStream.write({ url: item, changefreq: 'weekly', priority: 0.8})
        });

        // Add each category URL to the stream
        categories.forEach(function(item) {
            // Update as required
            smStream.write({ url: item, changefreq: 'monthly', priority: 0.6})
        });

        // cache the response
        streamToPromise(pipeline).then(sm => sitemap = sm)
        smStream.end()

        // Show errors and response
        pipeline.pipe(res).on('error', (e) => {throw e})
    } catch (e) {
        console.error(e)
        res.status(500).end()
    }
});

Notice that in each forEach loop we set the change frequency and priority:

  • A higher priority means that a page is more important relative to the other pages on your site. Values range from 0.0 to 1.0, and you should give your most important pages a higher priority.
  • Change frequency does not necessarily mean search engines will only crawl pages at that frequency, but it is an indication.
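
One thing to be aware of in the route above: the `sitemap` variable is filled once and then reused until the server restarts, so pages published later will not appear in the cached copy. A possible way around this is to give the cache a maximum age. The following is a minimal sketch, not part of the sitemap package; `createCache`, `maxAgeMs`, and the injectable `now` clock are hypothetical names.

```javascript
// Hypothetical time-limited cache for a generated sitemap buffer.
// `now` is injectable so the expiry logic can be exercised without waiting.
function createCache(maxAgeMs, now = () => Date.now()) {
  let value = null;
  let storedAt = 0;
  return {
    // Return the cached value, or null if nothing is stored or it has expired.
    get() {
      if (value !== null && now() - storedAt < maxAgeMs) return value;
      return null;
    },
    // Store a new value and remember when it was stored.
    set(newValue) {
      value = newValue;
      storedAt = now();
    },
  };
}

// Usage: keep the sitemap for at most one hour.
const sitemapCache = createCache(60 * 60 * 1000);
sitemapCache.set(Buffer.from('<urlset/>'));
console.log(sitemapCache.get() !== null);
```

In the route, the `if (sitemap)` check would then become a check on `sitemapCache.get()`, and the `streamToPromise` callback would call `sitemapCache.set(sm)` instead of assigning to `sitemap`.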

4. Add any other pages

If you want to add other pages not found in your database, just add another smStream.write line before the call to smStream.end(). For example:

javascript
smStream.write({ url: '/about', changefreq: 'monthly', priority: 0.4})
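
If there are several such pages, keeping them in one array may be tidier than repeating the call. In the sketch below, `staticPages` and `writeStaticPages` are hypothetical names, and the small stub only stands in for `smStream` so the example runs without the sitemap package; the entries have the same shape as the object in the line above.

```javascript
// Hypothetical list of pages that do not live in the database.
const staticPages = [
  { url: '/', changefreq: 'daily', priority: 1.0 },
  { url: '/about', changefreq: 'monthly', priority: 0.4 },
  { url: '/contact', changefreq: 'yearly', priority: 0.3 },
];

// Write every static page to anything with a write() method, such as smStream.
function writeStaticPages(stream, pages = staticPages) {
  for (const page of pages) {
    stream.write(page);
  }
}

// Stub stream that records what was written, used here instead of smStream.
const written = [];
writeStaticPages({ write: (entry) => written.push(entry) });
console.log(written.map((entry) => entry.url));
```

In the real route you would call `writeStaticPages(smStream)` alongside the two forEach loops.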

5. Index it on Google

Great! Now you have a sitemap, and it's live. I would recommend you use Google's Search Console to ensure the sitemap is registered for your domain, under the 'Sitemaps' section.

Don't forget to update your robots.txt file with a link to your sitemap as well!

text
User-agent: *
Allow: /
Sitemap: https://www.fjolt.com/sitemap.xml
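
If you would rather generate this file from the same Node.js app than maintain it by hand, the content is simple enough to build as a string. This is only a sketch: `buildRobotsTxt` is a hypothetical helper, and the URL is a placeholder to replace with your own. Note that the Sitemap line needs a full absolute URL.

```javascript
// Hypothetical helper: build robots.txt content that points at the sitemap.
function buildRobotsTxt(sitemapUrl) {
  return ['User-agent: *', 'Allow: /', `Sitemap: ${sitemapUrl}`].join('\n') + '\n';
}

// Replace the URL with your own site's sitemap address.
const robotsTxt = buildRobotsTxt('https://yourWebsite.com/sitemap.xml');
console.log(robotsTxt);
```

In Express, this string could be returned from a `/robots.txt` route with `res.type('text/plain').send(robotsTxt)`.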

With these simple steps, you've just improved your SEO game ever so slightly.

Last Updated Monday, 11 January 2021
