When you find a well-designed web page, you may want to save it as a nicely formatted PDF file. This post introduces a useful tool for converting web pages to PDF files.
Percollate is a command-line tool that turns web pages into beautifully formatted PDFs. It works as follows:
- Fetch the page(s) using got
- If an AMP version of the page exists, use that instead (disable with the --no-amp flag)
- Enhance the DOM using jsdom
- Pass the DOM through mozilla/readability to strip unnecessary elements
- Apply the HTML template and the print stylesheet to the resulting HTML
- Use puppeteer to generate a PDF from the page
Here is an example spread from the generated PDF of a chapter in Dimensions of Colour; rendered here in black & white for a smaller image file size.
[Image: the original web page]
[Image: the PDF generated from the web page]
How to install Percollate?
You can install Percollate globally using npm or yarn:
# using npm
npm install -g percollate
# using yarn
yarn global add percollate
To keep the package up-to-date, you can run:
# using npm, upgrading is the same command as installing
npm install -g percollate
# yarn has a separate command
yarn global upgrade --latest percollate
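Before installing or upgrading, it can help to check whether the command is already on your PATH. The following is a minimal sketch, assuming a POSIX shell; have_cmd is a hypothetical helper name used only for illustration and is not part of Percollate.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper: it succeeds if the named command
# can be found by the shell, and fails otherwise.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd percollate; then
    echo "percollate is already installed"
else
    echo "percollate not found; install it with: npm install -g percollate"
fi
```

The check only reports what it finds; it does not install or upgrade anything by itself.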
How to use Percollate?
Run percollate --help for a list of available commands. For a particular command, percollate <command> --help lists all available options.
|Command||What it does|
|percollate pdf||Bundles one or more web pages into a PDF|
|percollate epub||Not implemented yet|
|percollate html||Not implemented yet|
The pdf, epub, and html commands have these options:
|Option||What it does|
|-o, --output||The path of the resulting bundle; when omitted, the output file name is derived from the title of the web page.|
|--individual||Export each web page as an individual file.|
|--template||Path to a custom HTML template|
|--style||Path to a custom CSS stylesheet|
|--css||Additional CSS styles you can pass from the command line to override the default/custom stylesheet styles|
|--no-amp||Don't prefer the AMP version of the web page|
|--debug||Print more detailed information|
|--toc||Include a Table of Contents page|
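To illustrate how several of these options might be combined, here is a hedged sketch. It assumes the flag names --toc, --no-amp, --css and --output as given in the Percollate README; the URL, the CSS snippet and the output file name are placeholders. The run function is a hypothetical helper, not part of Percollate: with DRY_RUN=1 it prints each argument on its own line instead of executing the command, so you can check how the shell splits the arguments first.

```shell
#!/bin/sh
# run is a hypothetical helper: when DRY_RUN is 1 it prints one argument
# per line; otherwise it executes the command with those arguments.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$@"
    else
        "$@"
    fi
}

# Dry run only: nothing is fetched or written.
# Placeholder URL and file name; flag names assumed from the README.
DRY_RUN=1
run percollate pdf --toc --no-amp \
    --css "html { font-size: 14pt }" \
    --output article.pdf \
    https://example.com/article
```

Setting DRY_RUN=0 (or leaving it unset) makes the same call execute percollate for real, provided it is installed.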
Generate Basic PDF
To transform a single web page to PDF:
percollate pdf --output some.pdf https://w3cgeek.com
To bundle several web pages into a single PDF, specify them as separate arguments to the command:
percollate pdf --output some.pdf https://w3cgeek.com/page1 https://w3cgeek.com/page2
You can use common Unix commands and keep the list of URLs in a newline-delimited text file:
cat urls.txt | xargs percollate pdf --output some.pdf
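If the URL list contains blank lines or notes, filtering it before it reaches xargs avoids passing stray arguments to the command. The following is a minimal sketch, assuming that lines starting with # are comments; clean_urls is a hypothetical helper name, and urls.txt is the file from the example above.

```shell
#!/bin/sh
# clean_urls is a hypothetical helper: it reads stdin, drops blank lines
# and lines whose first non-space character is '#', and prints the rest.
clean_urls() {
    grep -v -E '^[[:space:]]*(#|$)'
}

# Only runs percollate when urls.txt is present in the current directory.
if [ -f urls.txt ]; then
    clean_urls < urls.txt | xargs percollate pdf --output some.pdf
fi
```

A "#" that appears inside a URL (a fragment) is left alone, because only lines that begin with "#" are treated as comments.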
To transform several web pages into individual PDF files at once, use the --individual flag:
percollate pdf --individual https://w3cgeek.com/page1 https://w3cgeek.com/page2