Study Guide: Converting HTML to PDF on the command line using Google Chrome

macOS Command

<PATH_TO_CHROME> --headless --disable-gpu --print-to-pdf=<OUTPUT_FILE.pdf> <INPUT>

  1. <INPUT> can be a local file or a website URL.

Website Example

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --headless --disable-gpu --print-to-pdf=file1.pdf http://localhost:1313/index.html

Local File Example

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --headless --disable-gpu --print-to-pdf=file2.pdf ./index.html
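
The two examples differ only in their input and output arguments, so the command can be wrapped in a small shell function. This is a minimal sketch and not part of the original guide: the function name html2pdf and the CHROME variable are hypothetical names chosen for illustration, and the default path is simply the one used in the examples above.

```shell
#!/bin/sh
# Path taken from the examples above. Set CHROME beforehand if your
# Chrome binary lives elsewhere (CHROME is a name chosen for this sketch).
CHROME="${CHROME:-/Applications/Google Chrome.app/Contents/MacOS/Google Chrome}"

# html2pdf INPUT OUTPUT.pdf   (hypothetical helper)
# INPUT may be a local file or a URL, as in the examples above.
html2pdf() {
  if [ "$#" -ne 2 ]; then
    echo "usage: html2pdf INPUT OUTPUT.pdf" >&2
    return 2
  fi
  # Quoting "$CHROME" replaces the backslash-escaped spaces in the path.
  "$CHROME" --headless --disable-gpu --print-to-pdf="$2" "$1"
}
```

With this function loaded, html2pdf ./index.html file2.pdf should run the same command as the local file example.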

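Essential Flags (Shell Switches)

--headless Don’t open the GUI.
--disable-gpu Was once needed for the command line to work when the GUI was already running. It may no longer be needed, but seems to do no harm.
--print-to-pdf Prints the page to a PDF file. Takes the output path as its value (--print-to-pdf=<OUTPUT_FILE.pdf>); the input file or URL is passed separately as the last argument.

More Flags (Shell Switches)

Prefix each flag below with -- when using it on the command line.
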
default-background-color The background color to be used if the page doesn’t specify one. Provided as RGBA integer value in hex, e.g. ‘ff0000ff’ for red or ‘00000000’ for transparent.
disable-cookie-encryption Whether cookies stored as part of user profile are encrypted.
enable-begin-frame-control Whether or not begin frames should be issued over DevToolsProtocol (experimental).
enable-crash-reporter Enable crash reporter for headless.
disable-crash-reporter Disable crash reporter for headless. It is enabled by default in official builds.
crash-dumps-dir The directory breakpad should store minidumps in.
deterministic-mode A meta flag. This sets a number of flags which put the browser into deterministic mode where begin frames should be issued over DevToolsProtocol (experimental).
disk-cache-dir Use a specific disk cache location, rather than one derived from the UserDatadir.
dump-dom Instructs headless_shell to print document.body.innerHTML to stdout.
hide-scrollbars Hide scrollbars from screenshots.
password-store Specifies which encryption storage backend to use. Possible values are kwallet, kwallet5, gnome, gnome-keyring, gnome-libsecret, basic. Any other value will lead to Chrome detecting the best backend automatically.
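print-to-pdf-no-header Do not display header and footer in the PDF file. Did not work when tested; see Gotchas.
proxy-bypass-list A list of hosts for which the proxy is bypassed and a direct connection is used.
proxy-server Use the specified proxy server instead of the system proxy settings.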
no-system-proxy-config-service Do not use system proxy configuration service.
remote-debugging-address Use the given address instead of the default loopback for accepting remote debugging connections. Should be used together with --remote-debugging-port. Note that the remote debugging protocol does not perform any authentication, so exposing it too widely can be a security risk.
repl Runs a read-eval-print loop that allows the user to evaluate JavaScript expressions.
screenshot Save a screenshot of the loaded page.
ssl-key-log-file Causes SSL key material to be logged to the specified file for debugging purposes. See https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS/Key_Log_Format for the format.
timeout Issues a stop after the specified number of milliseconds. This cancels all navigation and causes the DOMContentLoaded event to fire.
use-gl Sets the GL implementation to use. Use a blank string to disable GL rendering.
use-angle Sets the ANGLE implementation to use. Only relevant if “use-gl” is set to “angle”.
user-agent A string used to override the default user agent with a custom one.
user-data-dir Directory where the browser stores the user profile. Note that if this switch is added, the session will no longer be Incognito.
virtual-time-budget If set, the system waits the specified number of virtual milliseconds before deeming the page to be ready. For determinism, virtual time does not advance while there are pending network fetches (i.e. no timers will fire). Once all network fetches have completed, timers fire, and if the system runs out of work, virtual time is fast-forwarded so that the next timer fires immediately, until the specified virtual time budget is exhausted.
window-size Sets the initial window size. Provided as string in the format “800,600”.
auth-server-allowlist Allowlist for Negotiate Auth servers.
font-render-hinting Sets font render hinting when running headless, affects Skia rendering and whether glyph subpixel positioning is enabled. Possible values: none
block-new-web-contents If true, then all pop-ups and calls to window.open will fail.
explicitly-allowed-ports Allows overriding the list of restricted ports by passing a comma-separated list of port numbers.
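
The flags above are combined on the command line in the same way as the essential flags. The sketch below shows two combinations under stated assumptions: the helper names page_screenshot and page_dom and the CHROME variable are hypothetical, and passing a file name to --screenshot with = is assumed to work the same way as for --print-to-pdf. The default values (800,600 and 5000) are illustrative only.

```shell
#!/bin/sh
# Path from the examples earlier in this guide. Override CHROME if needed
# (CHROME is a name chosen for this sketch).
CHROME="${CHROME:-/Applications/Google Chrome.app/Contents/MacOS/Google Chrome}"

# page_screenshot INPUT OUTPUT.png [WIDTH,HEIGHT]   (hypothetical helper)
# Saves a screenshot at a fixed window size with scrollbars hidden.
page_screenshot() {
  "$CHROME" --headless --disable-gpu \
    --window-size="${3:-800,600}" \
    --hide-scrollbars \
    --screenshot="$2" \
    "$1"
}

# page_dom INPUT [VIRTUAL_MS]   (hypothetical helper)
# Prints the page markup to stdout (see dump-dom above) after allowing
# the given budget of virtual milliseconds for timers to run.
page_dom() {
  "$CHROME" --headless --disable-gpu \
    --virtual-time-budget="${2:-5000}" \
    --dump-dom \
    "$1"
}
```

For example, page_screenshot http://localhost:1313/index.html shot.png 1280,720 requests a screenshot in a 1280 by 720 window.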

Gotchas

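  1. --print-to-pdf-no-header does not produce output. Newer Chrome releases appear to use --no-pdf-header-footer for the same purpose, so that flag is worth trying if the older one has no effect.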
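
Because a flag can fail without an obvious error, it may help to confirm that a PDF was actually written before relying on it. The following is a minimal sketch of such a check. The helper name pdf_ok is hypothetical, and the check only looks at the file's first bytes, so it does not prove the PDF is complete or correctly rendered.

```shell
#!/bin/sh
# pdf_ok FILE   (hypothetical helper)
# Succeeds only if FILE exists, is non-empty, and starts with the
# "%PDF" signature that PDF files begin with.
pdf_ok() {
  [ -s "$1" ] || return 1
  [ "$(head -c 4 "$1")" = "%PDF" ]
}
```

A typical use is to run the conversion command and then test the result, for example: pdf_ok file1.pdf || echo "no PDF was written" >&2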

More Information

  1. https://github.com/chromium/chromium/blob/master/headless/app/headless_shell_switches.cc
  2. https://pandoc.org/MANUAL.html#creating-a-pdf
  3. https://discourse.gohugo.io/t/generate-hugo-website-as-a-pdf/22855/6
  4. https://github.com/jgazeau/website2pdf
  5. https://stackoverflow.com/questions/11338049/how-to-convert-html-with-mathjax-into-latex-using-pandoc
  6. https://superuser.com/questions/157484/start-google-chrome-on-mac-with-command-line-switches


Source: https://class.ronliskey.com/study/unix/unix-html-to-pdf/