I want to illustrate one aspect of progress with this basic discussion of the various types of references in texts. While paper printouts certainly have the benefit of explicitly labeling notable document fragments, digital formats take this feature to a new level, as you can now actively navigate your document by interacting with the labels. Importantly, this is true not only for web documents, but also for any current PDF document. While PDFs were originally created for platform-independent but print-oriented documents, they have partially evolved to also support various structural annotations and active features, akin to HTML documents.

Why is this important? Because it demonstrates the direction of evolution of our digital documents: originally print-oriented formats are starting to allow the embedding of more abstract structural information. In a nutshell:

Structural document formats, such as HTML, generalize over and may eventually supersede print-oriented formats, such as PDF.

But if you want to see them on paper today, you would have to do some legwork.

The Typesetting Iceberg

If you want to author a book as a web document and print it out, you could be in for a ride. On the surface, browsers do a good job of interpreting a CSS stylesheet and displaying a document beautifully on your screen, but they are far from great at printing it out. Printers and screens are very different animals.

Printers require different information from your browser’s rendering engine. And that information is very low-level. To create a printable document, we first need to know how to:

  • generate a table of contents,

  • determine the page- and line-breaks,

  • position page numbers,

  • position footnotes and other bits conveniently hidden in hover and click events,

  • position floating objects, such as figures and tables,

  • … the list goes on.
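To make the first item on the list concrete, here is a minimal sketch of generating a table of contents from the headings of an HTML document. It uses only Python's standard `html.parser` module. The names `HeadingCollector` and `build_toc` and the sample markup are hypothetical and chosen for illustration; this is not how any particular typesetting engine does it.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for h1..h6 elements in document order."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []   # finished (level, text) pairs
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        # Text inside nested inline elements (e.g. <em>) also lands here.
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            # Join the fragments and normalize runs of whitespace.
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def build_toc(html):
    """Return a plain-text table of contents, indented by heading level."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    if not collector.headings:
        return ""
    top = min(level for level, _ in collector.headings)
    return "\n".join(
        "  " * (level - top) + text for level, text in collector.headings
    )


# Hypothetical sample document.
sample = """
<h1>The Typesetting Iceberg</h1>
<p>Printers and screens are very different animals.</p>
<h2>Page breaks</h2>
<h2>Floating <em>objects</em></h2>
"""
print(build_toc(sample))
```

Note what the sketch cannot produce: page numbers. Those only exist once page breaks have been determined, which is the next item on the list, and that dependency is a good part of why the iceberg runs so deep.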

The invisible tasks that a typesetting engine performs behind the scenes are numerous, and one should venture into that sea with caution.

There is a meaningful symmetry here with the LaTeX-to-HTML direction. For a vanilla set of well-behaved HTML structures, you can get a beautiful PDF that satisfies most use cases. For more advanced effects, usually involving CSS and/or JavaScript, there are again two ways out: either you forbid the use of certain techniques, or you dedicate developer time to special stylesheets.

What’s good for the goose is good for its webpage

In 2015, if you are creating a manuscript with the intent of publishing it as (part of) a book or some other formal proceedings, you, or more probably your publisher, will target three distinct forms of your work: a printed book, an e-Book and, more rarely, a web document. Luckily for us, the world is simple in at least one aspect, as e-Books are just web documents with a bit of extra metadata.
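As a rough illustration of the "web document plus metadata" view, the sketch below packs a single XHTML chapter into an EPUB-style ZIP container using Python's standard `zipfile` module. The layout (a `mimetype` entry stored first and uncompressed, a `META-INF/container.xml` pointing at a package file, and a package file with metadata, manifest and spine) follows the general EPUB container conventions, but the file names, identifier and title are hypothetical, and the result deliberately omits parts a validator would insist on, such as a navigation document. Treat it as a picture of the structure, not as a conforming e-Book.

```python
import io
import zipfile

# Tells a reading system where the package (metadata) file lives.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# The "bit of extra metadata": identifier, title, language, file list, reading order.
# All values here are hypothetical placeholders.
PACKAGE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:example:hypothetical-book</dc:identifier>
    <dc:title>Hypothetical Book</dc:title>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

# The content itself is an ordinary (X)HTML web document.
CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><h1>Chapter 1</h1><p>Hello, reader.</p></body>
</html>
"""


def build_epub_like_archive():
    """Return the bytes of a ZIP archive laid out like a minimal EPUB."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        for name, body in [
            ("META-INF/container.xml", CONTAINER_XML),
            ("content.opf", PACKAGE_OPF),
            ("chapter1.xhtml", CHAPTER_XHTML),
        ]:
            archive.writestr(name, body, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


data = build_epub_like_archive()
print(zipfile.ZipFile(io.BytesIO(data)).namelist())
```

Everything a reader actually sees lives in `chapter1.xhtml`; the other three entries only describe and locate it.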

Why should we care about both printable and web documents? One view is that we are in the process of transitioning from a print-only to a digital-only age. Another is that neither is going away, and there are different uses for different media. (Fans of audio books may find this view sensible, as no one thinks we are moving to an audio-only publishing world.)

But the status quo sends a clear message:

At least for the moment, authors and their tools need to concern themselves with both print and web workflows.

Translating between the islands of the “paradigm shift” is far from trivial, and getting all the little bits right is difficult, but it isn’t impossible. In fact, it’s necessary.