This is just a little aside from the audio focus of this blog. I recently completed my PhD thesis, which involved a lot of costly printing. Many printers charge one rate for black & white pages, and a separate, much higher rate for colour pages. As a result I wanted to anticipate how much one printing of the thesis would cost and so needed to know:
- The number of pages in the thesis
- How many of these are colour, and
- How many of these are black & white
How to perform this task on a PDF file was not immediately clear to me. Thankfully this is a well-trodden path, so the answer was readily available after a bit of searching and compiling, but I am writing it up here so that anyone going down the same path doesn’t have to repeat the search.
The solution presented here uses a shell script, written for bash. This was run in the terminal on macOS High Sierra, but should be portable to other systems with a bash terminal. The target document is a PDF as generated by LaTeX.
For those who just want the code, here it is in a GitHub gist:
I wanted a shell script that I could call with a single argument, the file name, and that would report the total number of pages and the number of colour and black & white pages. To do this I made a file called count_colour_pages.sh in which I wrote the shell script. This file needs to be executable and sit in a directory on your $PATH so that you can call it from anywhere in your file system.
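For readers who want to see the general shape of such a script, here is a minimal sketch. It is not necessarily identical to the gist: it assumes Ghostscript (the gs command) is installed and uses its inkcov device, which prints one line of cyan, magenta, yellow and black coverage per page. The function names summarise_inkcov and count_colour_pages are my own labels for illustration.

```shell
#!/usr/bin/env bash
# Sketch of count_colour_pages.sh. Assumes Ghostscript ("gs") is installed.

# Read inkcov-style lines ("C M Y K CMYK OK", one per page) from stdin
# and print the page totals. A page counts as colour if any of the
# cyan, magenta or yellow coverage values is above zero.
summarise_inkcov() {
  awk '
    NF >= 4 && $1 ~ /^[0-9.]+$/ {
      total++
      if ($1 + $2 + $3 > 0) colour++
    }
    END {
      printf "Number of pages: %d\n", total
      printf "Number of b&w: %d\n", total - colour
      printf "Number of colour: %d\n", colour
    }'
}

# Run Ghostscript on the given PDF and summarise its per-page ink coverage.
count_colour_pages() {
  local pdf="$1"
  if [ ! -f "$pdf" ]; then
    echo "File not found: $pdf" >&2
    return 1
  fi
  gs -q -o - -sDEVICE=inkcov "$pdf" | summarise_inkcov
}

# Only run when a file name was passed, so the functions can also be sourced.
if [ "$#" -gt 0 ]; then
  count_colour_pages "$1"
fi
```

One caveat with this approach: the classification relies on ink coverage, so greys that were specified in RGB may end up reported as colour. To make the sketch callable from anywhere, mark it executable with chmod +x count_colour_pages.sh and move it into a directory that is listed in $PATH.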
The output is then three lines in the terminal reporting the desired properties of the pdf document.
iMac: thesis-latex$ count_colour_pages.sh main.pdf
Number of pages: 213
Number of b&w: 141
Number of colour: 72
main.pdf is my compiled LaTeX document. Despite the ridiculous length of theses, only around a third of the pages in this one have any colour on them, which saved me a bunch when printing.
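With the page counts in hand, the cost of one printing is simple arithmetic. The sketch below shows one way to do it in the shell. The function name estimate_cost and the per-page rates (0.05 for black & white, 0.50 for colour) are made up for illustration, so substitute whatever your printer actually charges.

```shell
# Estimate the cost of one printing from page counts and per-page rates.
# Arguments: b&w pages, colour pages, b&w rate, colour rate.
# awk is used because bash arithmetic only handles integers.
estimate_cost() {
  awk -v bw="$1" -v col="$2" -v rate_bw="$3" -v rate_col="$4" \
    'BEGIN { printf "%.2f\n", bw * rate_bw + col * rate_col }'
}

# Page counts from the output above, with hypothetical rates.
estimate_cost 141 72 0.05 0.50
```

Running the same function with all 213 pages charged at the colour rate gives a quick sense of how much separating the two counts can save.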