Enhancements to HTML Documentation



The upcoming release of R (version 4.2.0) features several enhancements to the HTML help system.

The most noticeable features are that LaTeX-like mathematical equations in help pages are now typeset using either KaTeX or MathJax, and usage and example code are highlighted using Prism. Additionally, the output of examples and demos can now be shown within the browser if the knitr package is installed. This is especially useful if the examples produce graphical output. Apart from these, several less noticeable changes have been made to improve useablity.

The goal of this post is to introduce these enhancements, and request testing of these features during the run-up to the release of R 4.2.0 so that as many bugs can be taken care of before release as possible.

Rendering of mathematics

The R documentation format supports LaTeX-like mathematics through the \eqn{} and \deqn{} commands. Historically, these have been rendered properly when converted to PDF (via LaTeX), but not when converted to other formats. However, support for rendering mathematics in HTML has matured over time, and in particular the MathJax and KaTeX Javascript libraries are stable and popular solutions. In fact, several R packages provide indirect support for mathematics in R documentation, notably mathjaxr and katex. Naturally, both of these require special markup.

From R 4.2.0, \eqn{} and \deqn{} commands can be rendered as mathematics in HTML output, using either KaTeX or MathJax. The details can be controlled using the help.htmlmath option. For dynamic help, the default is equivalent to setting

options(help.htmlmath = "katex")

which uses a local copy of KaTeX that ships with R. The help.htmlmath option can also be set to "mathjax"; in this case, if the mathjaxr package is installed then its local copy of MathJax is used, otherwise an online CDN is used. If the option is set to any other non-NULL value, the displayed pages will fall back to the basic substitutions that were used previously.

Static HTML output, if enabled, will always use KaTeX via a CDN (as relative paths to a local copy cannot be reliably computed).

Code highlighting

R now ships with Javascript and CSS files downloaded from https://prismjs.com/ to provide syntax highlighting of R code in the Usage and Examples sections in R help pages. Such code are now marked with a <code class='language-R'> tag, which are processed and rendered using the Prism Javascript and CSS.

Syntax highlighting is currently disabled for static HTML output, again due to the difficulty in computing relative paths, and the lack of a CDN.

Examples and demos

R provides specific facilities to run two types of example code included in package documentation, namely examples and demos, via the example() and demo() functions respectively. The example code is shown as part of the corresponding help page, and demos can be accessed via dynamic help through the package’s index page. Versions of R prior to 4.2.0 provided no option to run the examples via the dynamic help system. Demos could be run by clicking a link, but the result was to run the demo in the console.

Displaying the result of running examples or demos within the help system has certain obvious benefits, especially if the output contains graphics. While this is a fairly non-trivial task, the powerful knitr package makes it quite simple.

If the knitr package is installed, R 4.2.0 will include links in help pages that allow its examples to be run, which when clicked will run the examples as a .Rhtml document and display the output as an HTML page. Such pages can be also be created directly by accessing links of the form

http://127.0.0.1:<PORT>/library/<PKG>/Example/<TOPIC>

Previously available demo links of the form

http://127.0.0.1:<PORT>/library/<PKG>/Demo/<TOPIC>

will similarly display output in the browser instead of running the demo in the console.

Both the example() and demo() functions have a new type argument, which can be set to "html" to show output in a browser instead of the console using links of this form.

One caveat that may be considered either a feature or a bug depending on your perspective: The code to create the HTML output using knitr sets some but not all knitr options. This means that the output may be affected by settings previously modified by the user. For example, if one sets

knitr::opts_chunk$set(dev = "svg")

or

library(svglite)
knitr::opts_chunk$set(dev = "svglite")

then the embedded images will be SVG instead of PNG (which in most cases will be an improvement). Similarly, loading packages that define additional methods for knit_print() may change the output from how it would have appeared in the console.

HTML5

Help pages are now HTML5, the current HTML standard, which in particular helps facilitating some of the enhancements described previously (and in the future may be used for additional enhancements).

Considerable effort was put into ensuring creating valid HTML5. The old validation toolchain could not handle HTML5, so a new one was created based on HTML Tidy and integrated into the tools package.

Validation not only identified HTML generation issues in R, but also Rd markup problems in add-on packages, often from outputting raw HTML. So a new check for the validity of the package HTML help pages was added which is turned on for the CRAN submission checks, and can generally be activated by setting env var _R_CHECK_RD_VALIDATE_RD2HTML_ to something true. (This needs HTML tidy available on the system path for executables. The checks are currently not performed for Rd pages generated by roxygen2, which still needs updating for the HTML changes.)

These checks report validation problems for the generated HTML, and relating these to the Rd sources is not always immediate. In such cases, what seems to work best is calling help(<TOPIC>, help_type = "html") on a TOPIC from the offending Rd file, and then use the browser to view the HTML source. This allows to identify the context, and in fact, often already highlights invalid content.

R help pages now also add the viewport meta tag inside <head>, which improves rendering and zooming on mobile devices.

Stylesheet

The R.css stylesheet that is used to style R help pages features some simple improvements. More usefully, dynamic help now serves the copy of R.css located in $R_HOME/doc/html/R.css rather than the copy created for every package during installation (the latter is still necessary for static HTML). This means that any local changes made to $R_HOME/doc/html/R.css after R is installed will affect all help pages subsequently displayed through the dynamic help system. For example, one might include

@import url("https://cdn.jsdelivr.net/npm/bootstrap@4.1.3/dist/css/bootstrap.min.css");

at the top of R.css to include the Bootstrap 4 CSS. Note that this does not affect RStudio’s built-in browser, which uses its own styling.

Disabling

Many of these enhancements can be disabled by setting the environment variable _R_HELP_ENABLE_ENHANCED_HTML_ to a false value.