Automatically Generating a Sitemap with Node.js and JavaScript
Sitemaps are an important part of SEO. Google and other search engines can use a sitemap to discover the pages on your site and learn how often they change. In this tutorial we will build an automatically generated sitemap with Node.js and Express.
I will be using MongoDB as the database, but if you use MySQL or something else, you can swap out the database-specific parts.
Concept
Most sites have two broad kinds of pages:
- Pages whose URLs are created over time (e.g. articles, blog posts, etc).
- Pages whose URLs rarely change (e.g. the home page, an about page, etc).
To automate a sitemap, we can list out the URLs we know are unlikely to change, and then use JavaScript to read new pages from the database and add them to the sitemap. Google and other search engines re-fetch the sitemap periodically, so new URLs can be discovered without submitting them by hand.
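The idea can be sketched in plain JavaScript before any libraries are involved. In this minimal sketch, `staticUrls`, `articleRows` and `collectSitemapUrls` are hypothetical stand-ins; in the real route the rows come from the database.

```javascript
// URLs that rarely change, listed by hand (hypothetical examples)
const staticUrls = ['/', '/about']

// Stand-in for database rows; in a real app these come from a query
const articleRows = [{ name: 'first-post' }, { name: 'second-post' }]

// Turn database rows into URL paths
const dynamicUrls = articleRows.map(({ name }) => `/article/${name}`)

// Merge both lists, dropping duplicates while keeping the original order
function collectSitemapUrls(fixed, generated) {
    return [...new Set([...fixed, ...generated])]
}

// allUrls contains '/', '/about', '/article/first-post', '/article/second-post'
const allUrls = collectSitemapUrls(staticUrls, dynamicUrls)
```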
1. Install Components
First of all, we have to install all the components we need to accomplish this. Below is the skeleton of our index.js file:
import express from 'express'
import { SitemapStream, streamToPromise } from 'sitemap'
// createGzip is used later to compress the sitemap; zlib ships with Node.js
import { createGzip } from 'zlib'
import mongoose from 'mongoose'

const port = 3000
const app = express()

// Holds the generated sitemap so later requests can reuse it
let sitemap;

app.get('/sitemap.xml', async function(req, res) {
    try {
        // Code to try
    } catch (e) {
        // Catch errors
        console.log(e);
    }
});

app.listen(port, function() {
    console.log(`Listening on port ${port}`)
});
The three main packages we need for this demo are express, mongoose and sitemap. As mentioned, I am using MongoDB, so mongoose is relevant here; with another database you will need a different package. The createGzip function used later comes from Node's built-in zlib module, so it needs no install. The three packages can be installed with the commands below:
npm i express
npm i sitemap
npm i mongoose
2. Get your database contents
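Before we begin, let's get our database contents. Article.Article and Category.Category below are mongoose models from my own project, so substitute your own. With MongoDB and mongoose, I can do the following to get my database contents: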
const allArticles = await Article.Article.find({ published: true }).select('name')
const allCategories = await Category.Category.find().select('name')
Now we have two arrays, one of articles and one of categories, and every entry corresponds to a URL. Next, let's map that data to simple arrays of URLs:
const articles = allArticles.map( ({ name }) => `/article/${name}`)
const categories = allCategories.map( ({ name }) => `/category/${name}`)
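If a name can contain spaces or other characters that are not URL-safe, it may be worth percent-encoding it while building the path. Whether this is needed depends on how your own routes look up names. A minimal sketch, assuming each record has a `name` field as in the queries above; the sample records and the `toPath` helper are hypothetical.

```javascript
// Hypothetical records shaped like the query results above
const sampleArticles = [{ name: 'hello-world' }, { name: 'tips & tricks' }]

// Build a path under the given prefix, percent-encoding the name
function toPath(prefix, name) {
    return `/${prefix}/${encodeURIComponent(name)}`
}

// samplePaths: '/article/hello-world' and '/article/tips%20%26%20tricks'
const samplePaths = sampleArticles.map(({ name }) => toPath('article', name))
```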
3. Combine with sitemap
Now let's write all of those URLs to our sitemap. We write each article and category URL to a stream, which is gzipped and sent to whoever requests the sitemap. The final code for our sitemap.xml route looks like this:
// Cached copy of the gzipped sitemap, filled in on the first request
let sitemap;

app.get('/sitemap.xml', async function(req, res) {
    res.header('Content-Type', 'application/xml');
    res.header('Content-Encoding', 'gzip');

    // If we have a cached sitemap, send it and stop here
    if (sitemap) {
        res.send(sitemap)
        return
    }

    try {
        const allArticles = await Article.Article.find({ published: true }).select('name')
        const allCategories = await Category.Category.find().select('name')
        const articles = allArticles.map( ({ name }) => `/article/${name}`)
        const categories = allCategories.map( ({ name }) => `/category/${name}`)

        // Change yourWebsite.com to your website's URL
        const smStream = new SitemapStream({ hostname: 'https://yourWebsite.com/' })
        // createGzip comes from Node's built-in zlib module:
        // import { createGzip } from 'zlib'
        const pipeline = smStream.pipe(createGzip())

        // Add each article URL to the stream
        articles.forEach(function(item) {
            // Update as required
            smStream.write({ url: item, changefreq: 'weekly', priority: 0.8})
        });

        // Add each category URL to the stream
        categories.forEach(function(item) {
            // Update as required
            smStream.write({ url: item, changefreq: 'monthly', priority: 0.6})
        });

        // Cache the gzipped result for later requests
        streamToPromise(pipeline).then(sm => sitemap = sm)
        smStream.end()

        // Stream the response to the client and surface stream errors
        pipeline.pipe(res).on('error', (e) => {throw e})
    } catch (e) {
        console.error(e)
        res.status(500).end()
    }
});
Notice that in each forEach loop we set the change frequency and priority:
- Priority is a value from 0.0 to 1.0 that says how important a page is relative to the other pages on your site. Give your most important pages a higher priority.
- Change frequency is a hint rather than a rule: search engines may crawl a page more or less often than the value you give.
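One thing to be aware of: the route stores the gzipped sitemap in the `sitemap` variable and never clears it, so pages published after the first request will not appear until the server restarts. One possible way around this is to give the cache an expiry time. The sketch below is plain JavaScript; `createTtlCache`, its `ttlMs` parameter and the one-hour value are assumptions for illustration, not part of the sitemap package.

```javascript
// Minimal time-based cache: stores one value and forgets it after ttlMs.
// The `now` parameter exists only so the clock can be replaced in tests.
function createTtlCache(ttlMs, now = Date.now) {
    let value = null
    let storedAt = 0
    return {
        // Return the cached value, or null if nothing is stored or it has expired
        get() {
            if (value !== null && now() - storedAt < ttlMs) return value
            value = null
            return null
        },
        // Store a new value and restart the timer
        set(newValue) {
            value = newValue
            storedAt = now()
        }
    }
}

// Keep the sitemap for one hour (an arbitrary choice)
const sitemapCache = createTtlCache(60 * 60 * 1000)
```

In the route, the `if (sitemap)` check would then read from `sitemapCache.get()`, and `sitemap = sm` would become `sitemapCache.set(sm)`.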
4. Add any other pages
If you want to add other pages not found in your database, just add another smStream.write line before smStream.end() is called. For example:
smStream.write({ url: '/about', changefreq: 'monthly', priority: 0.4})
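If there are several such pages, a small list keeps them in one place. This is a sketch that assumes the same smStream as in the route; the `staticPages` entries are made-up examples, and `writeStaticPages` is a hypothetical helper that works with any object exposing a `write` method.

```javascript
// Hand-maintained pages that are not stored in the database (examples only)
const staticPages = [
    { url: '/', changefreq: 'daily', priority: 1.0 },
    { url: '/about', changefreq: 'monthly', priority: 0.4 },
    { url: '/contact', changefreq: 'yearly', priority: 0.3 }
]

// Write every static page to the given stream and return how many were written
function writeStaticPages(stream, pages) {
    pages.forEach(page => stream.write(page))
    return pages.length
}
```

Calling `writeStaticPages(smStream, staticPages)` before `smStream.end()` would add these entries alongside the database URLs.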
5. Index it on Google
Great! Now you have a sitemap, and it's live. I recommend submitting it in Google's Search Console, under the 'Sitemaps' section, so that it is registered for your domain.
Don't forget to update your robots.txt file with a link to your sitemap as well, replacing the domain below with your own:
User-agent: *
Allow: /
Sitemap: https://www.fjolt.com/sitemap.xml
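If your site does not already have a static robots.txt file, the same content could be generated in code. A minimal sketch: `buildRobotsTxt` is a hypothetical helper, and the domain is a placeholder.

```javascript
// Build the robots.txt body; sitemapUrl should be an absolute URL
function buildRobotsTxt(sitemapUrl) {
    return [
        'User-agent: *',
        'Allow: /',
        `Sitemap: ${sitemapUrl}`
    ].join('\n') + '\n'
}

// Example with a placeholder domain
const robotsTxt = buildRobotsTxt('https://yourWebsite.com/sitemap.xml')
```

With the Express app from earlier, this could be served from a route such as `app.get('/robots.txt', (req, res) => res.type('text/plain').send(robotsTxt))`.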
With these simple steps, you've just improved your SEO game ever so slightly.