How to remove background from a PDF file?

  • Thread starter: Buffu
  • Tags: File, Pdf

Discussion Overview

The discussion revolves around methods to remove a dark yellowish background from PDF files to enhance readability, particularly when printed in grayscale. Participants explore various software options and techniques for automating or simplifying the process, as the original poster seeks a solution for multiple PDFs.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Experimental/applied

Main Points Raised

  • The original poster seeks an automated method to remove a yellow background from PDFs, as manual conversion to images is too labor-intensive.
  • Some participants suggest using photo processing software like GIMP, Photoshop, or InDesign to adjust background colors and enhance text visibility.
  • There is a discussion about whether the background is a decorative feature or part of a scanned image, which may influence the approach taken.
  • Participants inquire about printer settings, such as printing in "Black and white" or "as an image," to see if these options yield better results.
  • One participant mentions using Krita for color correction and suggests batch processing layers, although they acknowledge a lack of knowledge about scripting for automation.
  • The original poster shares a step-by-step method using Inkscape to convert the PDF to an image, apply bitmap tracing, and convert it back to PDF, noting that results may vary with different images.

Areas of Agreement / Disagreement

Participants express various methods and software options, but there is no consensus on a single best approach. The discussion includes multiple competing views and techniques, reflecting uncertainty about the most effective solution.

Contextual Notes

Some participants mention the potential for PDFs to contain layers, but there is uncertainty about how to check for this feature. The effectiveness of different software and methods appears to depend on specific document characteristics and user familiarity with the tools.

Buffu
I have a PDF file with a dark yellowish background that makes it unreadable when printed in grayscale. I don't care about the yellow background; I just want the black text from the PDF.

I want to do this for quite a large number of PDFs, so I need a process that is either automated or requires little effort.

I tried converting the PDF to images and then extracting the text from them, but that takes a lot of effort because I have to do it page by page. Is there a better way to do this?
 
Try using photo processing software. Use the controls to fade out the background colours and enhance the text colours.
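
To make the idea of fading out the background concrete, here is a minimal sketch of a levels adjustment in plain Python. It is not taken from any particular photo editor; the function names and the `black_point` / `white_point` values are assumptions chosen for illustration and would need tuning per document. Anything brighter than the white point (the yellowish page) is pushed to white, and anything darker than the black point (the text) is pushed to black.

```python
def luminance(rgb):
    """Approximate perceived brightness of an (R, G, B) pixel, 0..255."""
    r, g, b = rgb
    # Rec. 601 luma weights, the usual choice for RGB-to-gray conversion.
    return 0.299 * r + 0.587 * g + 0.114 * b


def stretch_levels(pixels, black_point=80, white_point=150):
    """Map RGB pixels to gray values so the page goes white and the text black.

    black_point and white_point are hypothetical defaults: brightness at or
    below black_point becomes 0, at or above white_point becomes 255, and
    values in between are scaled linearly.
    """
    if white_point <= black_point:
        raise ValueError("white_point must be greater than black_point")
    span = white_point - black_point
    result = []
    for rgb in pixels:
        scaled = (luminance(rgb) - black_point) / span * 255
        # Clamp to the valid 8-bit range.
        result.append(max(0, min(255, round(scaled))))
    return result
```

With these defaults, a dark text pixel such as `(20, 20, 20)` maps to 0 and a dark yellow page pixel such as `(200, 170, 60)` maps to 255. Reading and writing the actual page images would need an imaging library, which is left out here.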
 
Nidum said:
Try using photo processing software.
Software like GIMP or Photoshop?
 
Buffu said:
Software like GIMP or Photoshop?

Possibly. InDesign is another viable option, and maybe your best bet if you have access to it.

Nidum said:
Try using photo processing software. Use the controls to fade out the background colours and enhance the text colours.

If I'm not mistaken, PDFs can be saved with layers, so that would be the perfect scenario.
 
Is the background a decorative feature under the text (like in a powerpoint)?
Or is it the background of a scanned image?

Does your printer have an option to print in "Black and white"?
Does printing "as an image" help?
 
robphy said:
Is the background a decorative feature under the text (like in a powerpoint)?
Or is it the background of a scanned image?

Does your printer have an option to print in "Black and white"?
Does printing "as an image" help?
The page itself is dark yellow; it is not a decoration.
Yes, my printer has a black-and-white option, but it doesn't help much. I tried printing "as an image" but got the same results.
 
Are you using Acrobat Reader on Windows?
Or something else?

As mentioned above, does your document have "layers"?
I wonder if there is an option to somehow "print without using yellow ink" (possibly in an advanced option when printing).

Can you provide a small example pdf?
 
You could try Krita and open the PDF as a layered image.
Then select a layer and apply some colour correction, a brightness-contrast shift, or something else from the filter menu.
Then select the next layer and, under Filters, click "apply filter again".
Or use the macro record option to record the steps you take and apply the macro to each layer in turn.

Possibly there is some terminal scripting you can do to batch process layers. I know nothing about that.

Edit to add, re scripting: likewise, you can open the PDF in something like GIMP as multiple images, run a scripted batch process on them, then stack the images as layers in one image and export/save that as a PDF.

(I seem to remember that PhotoImpact (16?) allows for batch processing of layers(don't know if you can save as pdf in it though))

In Krita, similarly, save the layered image as a PDF.

Because a PDF like this is effectively a stack of images, be careful when saving: depending on how you save, the result may or may not be lossless. The best idea is to always keep the originals and not overwrite them.
 
robphy said:
Are you using Acrobat Reader on Windows?
Or something else?

As mentioned above, does your document have "layers"?

I am using Acrobat DC professional edition.
I don't know how to check whether my document has layers.

robphy said:
I wonder if there is an option to somehow "print without using yellow ink" (possibly in an advanced option when printing).

Can you provide a small example pdf?

I don't see an option for printing without a specific ink.

@john101 I tried Krita and GIMP, but I don't know what colour correction I need to do in them; the learning curve for this software is steep.

Anyhow, I used Inkscape to solve my problem.
This is what I did; it may be of some use to someone.

Steps :-
1) :- Convert the PDF to JPG or PNG (http://pdftoimage.com/). Importing the PDF directly into Inkscape is possible, but this method does not work that way. (Please let me know if you know how.)
2) :- Drag the image onto a new Inkscape document. Select the option that says "Import image with default dimensions" (or something like that).
3) :- Then go to Path > Trace Bitmap.
4) :- Then just select the 'Ok' option and wait for the magic to happen. The default settings work for me, but you might have to adjust them.
This step weirdly does not work if the PDF is directly imported in Inkscape.
5) :- Then convert it back to PDF (http://jpg2pdf.com/).

I tried it with a few different kinds of images, and it does not work properly with all of them.
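
The manual steps above (rasterize, reduce to black and white, reassemble) could in principle be scripted for a whole folder. The sketch below is one possible way, not the method used in this thread: it assumes the Poppler tool `pdftoppm` and ImageMagick's `convert` are installed and on the PATH, and it swaps Inkscape's Trace Bitmap for a plain brightness threshold. The function names, the 150 dpi resolution, and the 50% threshold are illustrative assumptions.

```python
import subprocess
import tempfile
from pathlib import Path


def rasterize_command(pdf_path, prefix, dpi=150):
    """Build the command that renders each page to <prefix>-<page>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(prefix)]


def assemble_command(page_images, out_pdf, threshold="50%"):
    """Build the command that thresholds the pages and joins them into one PDF."""
    return ["convert", *[str(p) for p in page_images],
            "-colorspace", "Gray", "-threshold", threshold, str(out_pdf)]


def clean_pdf(pdf_path, out_pdf, dpi=150, threshold="50%"):
    """Rasterize one PDF, force it to black and white, and write a new PDF."""
    with tempfile.TemporaryDirectory() as tmp:
        prefix = Path(tmp) / "page"
        subprocess.run(rasterize_command(pdf_path, prefix, dpi), check=True)
        # pdftoppm zero-pads page numbers, so sorting by name keeps page order.
        pages = sorted(Path(tmp).glob("page-*.png"))
        subprocess.run(assemble_command(pages, out_pdf, threshold), check=True)


def clean_folder(in_dir, out_dir, dpi=150, threshold="50%"):
    """Process every PDF in in_dir, writing results to out_dir (originals untouched)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(in_dir).glob("*.pdf")):
        clean_pdf(pdf, out_dir / pdf.name, dpi, threshold)
```

A call such as `clean_folder("originals", "cleaned")` would write the results to a separate folder, so the source files are never overwritten. The output pages are plain images, so any selectable text in the original is lost, and the threshold will likely need adjusting for pages that are darker or lighter than expected.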
 
