[SSS] Creating a sitemap


Search engines use sitemaps to know which pages to crawl; a sitemap is basically a list of all the URLs available on a website. Last time we saw how to create an RSS feed by hand, by building an XML document, and a sitemap is also just an XML document.

First, let's add the route to our droplet:

func addRoutes() -> Droplet {
	get("/sitemap.xml", handler: SitemapController.create)
	// [...]
}

As we saw in the previous post, we need a controller that has a method with the same signature as the handler:

struct SitemapController {

	static func create(with request: Request) throws -> ResponseRepresentable {

		// 1
		request.setValue("application/xml", forHTTPHeaderField: "Content-Type")

		// 2
		let posts = try Post.query().sorted().run()

		// [...]
	}

}

We start by setting the content type to XML (1) again, and by fetching our posts (2). We don't need to check whether there are any posts: if there are none, the sitemap will still list all the static pages.

static func create(with request: Request) throws -> ResponseRepresentable {

	request.setValue("application/xml", forHTTPHeaderField: "Content-Type")

	// 1
	let noPriority = [
		"privacy-policy",
		"feed"
	]

	// 2
	let lowPriority = [
		"projects/bouncyb",
		"projects/sosmorse",
		"projects/iwordjuggle",
		"projects/carminder"
	]

	// 3
	let highPriority = [
		"about",
		"projects",
		"projects/expenses-planner"
	]

	let posts = try Post.query().sorted().run()
	let urls = noPriority + lowPriority + highPriority
	let root = "https://rolandleth.com/"
	var xml = ""

	xml += "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
	xml += "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">"

	// 4
	func priority(for url: String) -> Float {
		if highPriority.contains(url) {
			return 0.9
		}

		if lowPriority.contains(url) {
			return 0.3
		}

		return 0.1
	}

	// 5
	urls.forEach {
		xml += "<url>"
		xml += "<loc>\(root)\($0)</loc>"
		xml += "<changefreq>yearly</changefreq>"
		xml += "<priority>\(priority(for: $0))</priority>" // 6
		xml += "</url>"
	}

	// 7
	posts.forEach {
		xml += "<url>"
		xml += "<loc>\(root)\($0.link)</loc>"
		xml += "<changefreq>monthly</changefreq>"
		xml += "<priority>0.5</priority>" // 8
		xml += "<lastmod>\($0.modified)</lastmod>"
		xml += "</url>"
	}

	xml += "</urlset>" // 9

	return xml // 10
}

A sitemap entry has an optional property called priority, which tells the crawler how important a page is compared with the other pages on the same site. Valid values range from 0.0 to 1.0, with a default value of 0.5.
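Since the value has to stay within that range, a small helper can clamp and format it before it goes into the XML. This is only a sketch: `sitemapPriority` is a hypothetical name, not part of Vapor or of the controller in this post.

```swift
import Foundation

// Hypothetical helper: clamps a priority into the valid 0.0...1.0 range
// and formats it with one decimal place for the <priority> tag.
func sitemapPriority(_ value: Double) -> String {
	let clamped = min(max(value, 0.0), 1.0)
	return String(format: "%.1f", clamped)
}

// Out-of-range input is pulled back to the nearest valid value.
let tag = "<priority>\(sitemapPriority(0.9))</priority>"
```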

In my particular case, I wanted some pages to be totally unimportant (1), some to have low priority, because they are old projects (2), and some to have high priority (3).

We iterate through all the static URLs (5), and with the help of a nested function (4), we give each one its desired priority (6).
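The lookup in (4) could also be written as a table. The sketch below is a standalone alternative, not the code used on the site: it maps each path to its priority in a dictionary and falls back to 0.1 for anything unlisted. The name `pagePriorities` is made up for the example.

```swift
// Hypothetical table mapping static paths to their priorities.
let pagePriorities: [String: Float] = [
	"about": 0.9,
	"projects": 0.9,
	"projects/expenses-planner": 0.9,
	"projects/bouncyb": 0.3,
	"projects/sosmorse": 0.3,
	"projects/iwordjuggle": 0.3,
	"projects/carminder": 0.3
]

// Unlisted paths (such as "privacy-policy" or "feed") get the lowest priority.
func priority(for path: String) -> Float {
	return pagePriorities[path] ?? 0.1
}
```

With a table, adding a page means adding one line, at the cost of repeating the priority value for every entry.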

We then continue by iterating through all posts (7), and this time we set the priority to 0.5 (8), just to be explicit about what's happening: we want all posts to have neutral priority; they are all equally important.

Finally, we close the urlset tag (9), and return the xml string (10), wrapping up the sitemap creation.
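One thing the controller above does not do is escape the URLs. The sitemap protocol requires that characters such as `&` be entity-escaped, so a path containing a query string would need extra care. Below is a minimal, framework-free sketch of the same string-building approach with escaping added. `SitemapEntry`, `xmlEscaped`, and `makeSitemap` are hypothetical names, and `https://example.com/` is a placeholder root.

```swift
import Foundation

// Hypothetical value type describing one sitemap entry.
struct SitemapEntry {
	let path: String
	let changeFrequency: String
	let priority: Double
}

// Escapes the five characters that XML reserves.
// "&" is replaced first so already-produced entities are not escaped twice.
func xmlEscaped(_ text: String) -> String {
	return text
		.replacingOccurrences(of: "&", with: "&amp;")
		.replacingOccurrences(of: "<", with: "&lt;")
		.replacingOccurrences(of: ">", with: "&gt;")
		.replacingOccurrences(of: "\"", with: "&quot;")
		.replacingOccurrences(of: "'", with: "&apos;")
}

// Builds the sitemap document for the given entries.
func makeSitemap(root: String, entries: [SitemapEntry]) -> String {
	var xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
	xml += "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">"

	for entry in entries {
		xml += "<url>"
		xml += "<loc>\(xmlEscaped(root + entry.path))</loc>"
		xml += "<changefreq>\(entry.changeFrequency)</changefreq>"
		xml += "<priority>\(entry.priority)</priority>"
		xml += "</url>"
	}

	xml += "</urlset>"
	return xml
}

let sitemap = makeSitemap(
	root: "https://example.com/",
	entries: [
		SitemapEntry(path: "about", changeFrequency: "yearly", priority: 0.9),
		SitemapEntry(path: "search?q=a&b", changeFrequency: "yearly", priority: 0.1)
	]
)
```

Keeping the builder free of any request or database types also makes it easy to test on its own.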