A File Format For Static Websites

Everybody’s doing it wrong. Use this file format.

Ever hear of a “static website?” They’re the Next Big Thing in website hosting — and a return to yesteryear. Unlike today’s normal hosting companies, static-site hosting companies don’t run any of your code. Economies of scale make your website crazy-fast and crazy-cheap.

Static websites can’t write to a database on the server. No sign-in forms; no online shopping; no photo upload. With a bit of creative programming, a static site can handle just about anything else.

GitHub Pages, Jekyll, Middleman and Roots all generate static websites. They all do it by “compiling” your website into a directory full of files.

And that’s wrong.

What Is A Static Page?

For instance, here’s the dialogue between a browser and a server for the world’s first web page:

GET /hypertext/WWW/TheProject.html HTTP/1.1
Host: info.cern.ch

HTTP/1.1 200 OK
Date: Tue, 04 Apr 2017 22:25:34 GMT
Server: Apache
Last-Modified: Thu, 03 Dec 1992 08:37:20 GMT
ETag: "40521e06-8a9-291e721905000"
Accept-Ranges: bytes
Content-Length: 2217
Connection: close
Content-Type: text/html

<HEADER>
<TITLE>The World Wide Web project</TITLE>
... (obsolete HTML continues...)

That’s a static page: an HTTP response. It has three components:

  1. path: the URL path the browser asks for, which is where the page lives on the server.
  2. headers: instructions the web browser needs to decode the content.
  3. body: the data.
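
To make the three components concrete, here is a minimal sketch of a static page as a data structure. The names StaticPage and to_http_response are made up for this illustration; they are not taken from in-memory-website or any other library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticPage:
    """One static page: everything a server needs to answer one GET."""
    path: str       # URL path, e.g. "/hypertext/WWW/TheProject.html"
    headers: dict   # response headers, e.g. {"Content-Type": "text/html"}
    body: bytes     # raw response body


def to_http_response(page: StaticPage) -> bytes:
    """Render the page as the bytes of an HTTP/1.1 200 response."""
    headers = dict(page.headers)
    # Fill in Content-Length only if the page did not set it itself.
    headers.setdefault("Content-Length", str(len(page.body)))
    head = "HTTP/1.1 200 OK\r\n" + "".join(
        f"{name}: {value}\r\n" for name, value in headers.items()
    )
    # A blank line separates the headers from the body.
    return head.encode("ascii") + b"\r\n" + page.body


page = StaticPage(
    path="/hypertext/WWW/TheProject.html",
    headers={"Content-Type": "text/html"},
    body=b"<HEADER>\n<TITLE>The World Wide Web project</TITLE>\n",
)
response = to_http_response(page)
```

A file on disk would hold only page.body; the path and headers have nowhere to go.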

Here’s what today’s static website generators would produce:

<HEADER>
<TITLE>The World Wide Web project</TITLE>
... (more really-old HTML stuff)

That’s the static page body. The site generator will store it at a certain path — but often the wrong path. And there’s no sign of headers.

Today’s site generators do this:

  1. Produce static pages
  2. Write the static pages to files
  3. Upload the files to a website

They should skip step 2, because it’s problematic.

The Price Of Files

Files can’t store ETag and Cache-Control. These headers let you configure some responses to be fast and cheap (like images) and other responses to be easy to overwrite at a moment’s notice (like breaking news articles).
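
As a rough sketch of what those two headers look like when the page carries them itself: the helper below derives an ETag from the body and attaches a Cache-Control policy. The function name with_caching, the SHA-1 choice and the max-age values are all assumptions picked for the example.

```python
import hashlib


def with_caching(headers: dict, body: bytes, max_age_seconds: int) -> dict:
    """Return a copy of headers with ETag and Cache-Control added."""
    # An ETag is a quoted opaque string; hashing the body means it changes
    # exactly when the content changes.
    etag = '"' + hashlib.sha1(body).hexdigest() + '"'
    cache_control = f"public, max-age={max_age_seconds}"
    return {**headers, "ETag": etag, "Cache-Control": cache_control}


# Fast and cheap: let caches keep an image for a year (illustrative value).
image_headers = with_caching({"Content-Type": "image/png"}, b"fake png bytes", 31536000)

# Easy to overwrite: a breaking-news page is re-checked after 30 seconds.
news_headers = with_caching({"Content-Type": "text/html"}, b"<p>Breaking</p>", 30)
```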

Files aren’t aware of Content-Encoding. Some hosting companies (S3 springs to mind) won’t compress static pages for you, making them load more slowly and at a higher cost than you’d like. You can solve that problem by compressing the body and setting Content-Encoding … but your filesystem doesn’t understand this.
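
Here is a small sketch of that fix, assuming a page is just a headers dict plus a bytes body. The helper name gzip_page is hypothetical; the point is that the compressed body and its Content-Encoding header travel together.

```python
import gzip


def gzip_page(headers: dict, body: bytes) -> tuple:
    """Return (headers, body), gzip-compressed when that makes the body smaller."""
    compressed = gzip.compress(body, mtime=0)  # mtime=0 keeps the output reproducible
    if len(compressed) >= len(body):
        # Tiny bodies can grow when gzipped; leave those untouched.
        return dict(headers), body
    new_headers = {**headers, "Content-Encoding": "gzip"}
    if "Content-Length" in new_headers:
        # Content-Length must describe the bytes actually sent.
        new_headers["Content-Length"] = str(len(compressed))
    return new_headers, compressed


original = b"<p>hello</p>" * 100
headers, body = gzip_page({"Content-Type": "text/html"}, original)
```

Write that body to a plain file and the filesystem only sees opaque bytes; nothing records that a browser has to gunzip them.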

And with files, you can’t name a file after a directory. For instance, https://github.com/huffpostdata is a web page, and https://github.com/huffpostdata/in-memory-website is a web page. But if your static site generator generates a file called output/huffpostdata, it can’t generate another file called output/huffpostdata/in-memory-website because that would only work if output/huffpostdata were a directory … but it’s a file.

Maybe your site generator works around this by storing output/huffpostdata/in-memory-website and output/huffpostdata/index, where “index” is some special keyword. But then you can’t create a file called “index”. In general, files don’t map to web-server endpoints.
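
The collision is easy to reproduce. In the sketch below, a plain dictionary keyed by path holds both endpoints without complaint, while a naive attempt to write the same two paths into a temporary directory fails on the second one. The function write_to_directory is a made-up stand-in for a file-writing site generator.

```python
import os
import tempfile

# In memory, paths are just dictionary keys, so both endpoints coexist.
site = {
    "/huffpostdata": b"<h1>huffpostdata</h1>",
    "/huffpostdata/in-memory-website": b"<h1>in-memory-website</h1>",
}


def write_to_directory(site: dict, root: str) -> list:
    """Naively write each path as a file; return the paths that could not be written."""
    failed = []
    for path, body in site.items():
        target = os.path.join(root, path.lstrip("/"))
        try:
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as f:
                f.write(body)
        except OSError:
            # "huffpostdata" already exists as a file, so it cannot also be a directory.
            failed.append(path)
    return failed


with tempfile.TemporaryDirectory() as root:
    failed = write_to_directory(site, root)
```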

Plus, different filesystems allow different filenames. Linux filesystems can store a file named foo:bar, but 2017’s Windows and Mac filesystems can’t.

Finally, reading and writing files is slow compared with keeping the pages in memory.

In sum: files don’t describe a static website. So static website generators shouldn’t write files.

The Solution: In-Memory Website

I’ve coded this in in-memory-website: some NodeJS tools and a language-agnostic specification.

The idea is:

  • Any static website generator can output a static website in this form. At HuffPostData I coded a couple of site generators: hpd-asset-pipeline and hpd-page-generator.
  • You can store a static website in a file and load it later.
  • You can upload a static website to a hosting service like S3.
  • You can modify a static website — for instance, add ETags or gzip-compress.
  • You can develop a static website using a development server that re-runs your framework every time a file changes. (My NodeJS development server even makes the browser refresh, with LiveReload.) The development server can mimic S3 almost exactly.
  • You can stream a static website, if it’s too large to fit in memory. And if you must, you can always stream it into filesystem files.
  • You can mix and match programming languages: pipe a StaticWebsite from your Ruby site generator to your Node S3 uploader.
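
To show how the store, load, stream and pipe ideas fit together, here is a toy round trip. The wire format below (one JSON object per line, with a base64-encoded body) is invented for this example and is not the format defined by the in-memory-website specification; dump_site and load_site are hypothetical names as well.

```python
import base64
import io
import json


def dump_site(pages, stream):
    """Write one JSON line per (path, headers, body) page to a text stream."""
    for path, headers, body in pages:
        record = {
            "path": path,
            "headers": headers,
            # JSON cannot hold raw bytes, so the body is base64-encoded.
            "body": base64.b64encode(body).decode("ascii"),
        }
        stream.write(json.dumps(record) + "\n")


def load_site(stream):
    """Yield (path, headers, body) tuples back, one line at a time."""
    for line in stream:
        record = json.loads(line)
        yield record["path"], record["headers"], base64.b64decode(record["body"])


pages = [
    ("/huffpostdata", {"Content-Type": "text/html"}, b"<h1>huffpostdata</h1>"),
    ("/huffpostdata/in-memory-website", {"Content-Type": "text/html"}, b"<h1>docs</h1>"),
]

buffer = io.StringIO()  # stands in for a file or a pipe between two programs
dump_site(pages, buffer)
buffer.seek(0)
restored = list(load_site(buffer))
```

Because load_site reads one line at a time, the reader never needs the whole site in memory, and the writer on the other end of the pipe could be in any language that can print JSON.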

I look forward to the day static website generators live in a happy community. Our first step: ditch the filesystem.
