How to remove background from a PDF file?

Buffu · Dec 23, 2016

I have a PDF file with a weird dark yellowish background that renders it unreadable when printed in grayscale. I am not interested in the yellow background; I just want the black text from the PDF.

I want to do this for quite a large number of PDFs, so I want a process that is either automated or requires little effort.

I tried converting the PDF to images and then extracting the text from them, but this takes a lot of effort because I have to do it page by page. Is there a better way to do this?

Nidum · Dec 23, 2016

Try using photo processing software. Use the controls to fade out the background colours and enhance the text colours.
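
To make the idea concrete, here is a minimal sketch of what "fade out the background and enhance the text" amounts to in code: turn each pixel into a brightness value, then push everything lighter than a cutoff to white and everything darker to black. The function names, the cutoff of 100, and the sample colours are illustrative assumptions, not taken from any particular program. A real script would also need an imaging library to read and write the page images.

```python
def luminance(rgb):
    """Approximate brightness (0-255) of an (R, G, B) pixel, Rec. 601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_black_and_white(pixels, threshold=100):
    """Map each RGB pixel to 0 (black) or 255 (white).

    pixels: list of rows, each row a list of (R, G, B) tuples.
    threshold: assumed cutoff; pixels darker than this count as text.
    """
    return [
        [0 if luminance(px) < threshold else 255 for px in row]
        for row in pixels
    ]


# Tiny made-up page: dark-yellow background with two near-black "text" pixels.
BACKGROUND = (180, 150, 40)
TEXT = (20, 20, 20)
page = [
    [BACKGROUND, TEXT, BACKGROUND],
    [BACKGROUND, TEXT, BACKGROUND],
]

print(to_black_and_white(page))  # [[255, 0, 255], [255, 0, 255]]
```

The cutoff is the only knob here: too low and thin strokes disappear, too high and the background turns black as well.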

Buffu · Dec 23, 2016

Nidum said:

Try using photo processing software.

Software like GIMP or Photoshop?

JorisL · Dec 23, 2016

Buffu said:

Software like GIMP or Photoshop?

Possibly. InDesign is another viable option, and maybe your best bet if you have access to it.

Nidum said:

Try using photo processing software. Use the controls to fade out the background colours and enhance the text colours.

If I'm not mistaken, PDFs can be saved using layers. If the background sits on its own layer, that would be the perfect scenario.

robphy · Dec 23, 2016

Is the background a decorative feature under the text (like in a PowerPoint)?
Or is it the background of a scanned image?

Does your printer have an option to print in "Black and white"?
Does printing "as an image" help?

Buffu · Dec 24, 2016

robphy said:

Is the background a decorative feature under the text (like in a PowerPoint)?
Or is it the background of a scanned image?

Does your printer have an option to print in "Black and white"?
Does printing "as an image" help?

The colour of the page itself is dark yellow. It is not a decoration.
Yes, my printer has a black and white option, but it does not help much. I tried printing "as an image" but got the same results.

robphy · Dec 24, 2016

Are you using Acrobat Reader on Windows?
Or something else?

As mentioned above, does your document have "layers"?
I wonder if there is an option to somehow "print without using yellow ink" (possibly in an advanced option when printing).

Can you provide a small example PDF?

john101 · Dec 24, 2016

You could try Krita and open the PDF as a layered image.
Then select a layer and do some colour correction, a brightness-contrast shift, or something from the filter menu.
Then select the next layer and, in Filters, click Apply Filter Again.
Or use the macro record option to record the steps you take and apply the macro to each layer in turn.

Possibly there is some terminal scripting you can do to batch process layers. I know nothing about that.

Edit to add, re scripting: likewise, you can open the PDF in something like GIMP as multiple images, run a scripted batch process on them, then stack the images in turn into one image as layers and export/save that as a PDF.

(I seem to remember that PhotoImpact (16?) allows batch processing of layers. I don't know if you can save as PDF in it, though.)

In Krita, similarly, save the layered image as a PDF.

Because a scanned PDF is essentially a stack of page images, be careful when saving: depending on the export settings, the result may or may not be lossless. The best idea is to always keep the originals and not overwrite them.
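
For the batch side, a small sketch of the "keep the originals" rule might look like the following. It only plans the work: it pairs each PDF in a source folder with a new file name in a separate output folder and never targets the original. The function name `plan_batch`, the `_clean` suffix, and the folder names are assumptions made up for illustration; the actual clean-up step would be whatever tool ends up working for your files.

```python
import tempfile
from pathlib import Path


def plan_batch(src_dir, out_dir, suffix="_clean"):
    """Pair every PDF in src_dir with a new output path inside out_dir.

    Originals are never used as targets, and outputs that already
    exist are skipped so a rerun does not redo or overwrite them.
    """
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    if src_dir.resolve() == out_dir.resolve():
        raise ValueError("write results to a separate folder")
    jobs = []
    for src in sorted(src_dir.glob("*.pdf")):
        dst = out_dir / f"{src.stem}{suffix}.pdf"
        if dst.exists():
            continue  # already processed earlier; leave it alone
        jobs.append((src, dst))
    return jobs


# Demo on a throwaway folder holding two empty stand-in "PDFs".
with tempfile.TemporaryDirectory() as tmp:
    scans = Path(tmp) / "scans"
    scans.mkdir()
    for name in ("b.pdf", "a.pdf", "notes.txt"):
        (scans / name).touch()
    for src, dst in plan_batch(scans, Path(tmp) / "cleaned"):
        print(src.name, "->", dst.name)
```

Each (source, target) pair can then be handed to the conversion step, so a failed run never damages the input files.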

Buffu · Dec 29, 2016

robphy said:

Are you using Acrobat Reader on Windows?
Or something else?

As mentioned above, does your document have "layers"?

I am using Acrobat DC professional edition.
I don't know how to check whether my document has layers or not.

robphy said:

I wonder if there is an option to somehow "print without using yellow ink" (possibly in an advanced option when printing).

Can you provide a small example PDF?

I don't see an option for printing without a specific ink.

@john101 I tried Krita and GIMP, but I don't know what colour correction I need to do in them; the learning curve for this software is steep.

Anyhow, I used Inkscape to solve my problem.
This is what I did; it may be of some use to someone.

Steps:
1) Convert the PDF to JPG or PNG (http://pdftoimage.com/). Importing the PDF directly into Inkscape is possible, but the tracing step below does not work on it. (Please let me know if you know why.)
2) Drag the image onto a new Inkscape document. Select the option that says "Import image with default dimensions" (or something like that).
3) Go to Path > Trace Bitmap.
4) Select the 'OK' option and wait for the magic to happen. The default settings work for me, but you might have to adjust them. This step, weirdly, does not work if the PDF is imported directly into Inkscape.
5) Convert the result back to PDF (http://jpg2pdf.com/).

I tried it with a few different types of images, and it does not work properly with all of them.
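
One possible reason the defaults fail on some images is that a single fixed brightness cutoff does not suit every scan. As a hedged sketch, the cutoff can be picked per page with Otsu's method, which chooses the value that best separates dark pixels from light ones. This is a swapped-in technique for illustration, not what Inkscape itself does, and the function names and sample values below are made up.

```python
def otsu_threshold(gray_values):
    """Pick a cutoff t (0-255) by Otsu's method; values <= t count as dark.

    gray_values: flat list of integer brightness values in 0..255.
    """
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1

    total = len(gray_values)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    n_dark = 0      # number of pixels at or below the candidate cutoff
    sum_dark = 0    # sum of their brightness values
    for t in range(256):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        n_light = total - n_dark
        if n_dark == 0:
            continue        # no dark class yet
        if n_light == 0:
            break           # no light class left
        mean_dark = sum_dark / n_dark
        mean_light = (sum_all - sum_dark) / n_light
        # Between-class variance, up to a constant factor of 1/total**2.
        score = n_dark * n_light * (mean_dark - mean_light) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(gray_values, cutoff):
    """Black (0) for values at or below the cutoff, white (255) otherwise."""
    return [0 if v <= cutoff else 255 for v in gray_values]


# Made-up sample: 10 dark text pixels among 90 dark-yellow background pixels
# (146 is roughly the brightness of RGB (180, 150, 40)).
samples = [20] * 10 + [146] * 90
cutoff = otsu_threshold(samples)
print(cutoff)  # 20
print(binarize(samples, cutoff).count(0))  # 10
```

On pages where text and background brightness overlap heavily (faded ink, stains, photos), no single cutoff will separate them cleanly, which matches the observation that the method does not work on every image.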
