Automatically OCR PDF files

by NeoDevin
Tags: ocr, pdf
NeoDevin
#1
May6-13, 11:42 AM
P: 687
Can anyone recommend a way to automatically OCR all PDF files in a given folder?

My scanner saves files as pdf, but I would like them to be searchable.

Thanks in advance.
ChrisJA
#2
May9-13, 01:00 AM
P: 42
It would help to know what operating system you are using. Mac OS X or Linux?

Adobe Acrobat will do what you want.
NeoDevin
#3
May11-13, 02:09 AM
P: 687
I have computers running Windows and Linux, so a method for either would be fine, preferably a free option.

Routaran
#4
May16-13, 11:38 AM
P: 292
There are several OCR options available. I did a Google search for 'linux ocr pdf' and this was the first hit on the list:
http://ubuntuforums.org/showthread.php?t=1456756

If the program doesn't have flags for processing multiple PDFs at once, you can write a small script with a for loop that goes through the contents of a directory and OCRs each PDF file.
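
A rough sketch of the kind of loop described above, written in Python so it runs on both Windows and Linux. Assumptions not from this thread: the OCR step uses the free `ocrmypdf` command-line tool, called as `ocrmypdf INPUT OUTPUT`, and the function names and the `_ocr` suffix are made up for illustration. Any other tool that takes an input path and an output path could be swapped in through `ocr_command`.

```python
import subprocess
from pathlib import Path


def find_unprocessed_pdfs(folder, suffix="_ocr"):
    """Return PDFs in `folder` that have no OCR'd sibling yet, sorted by name."""
    folder = Path(folder)
    pending = []
    for pdf in sorted(folder.glob("*.pdf")):
        if pdf.stem.endswith(suffix):
            continue  # this file is itself an OCR output
        if pdf.with_name(pdf.stem + suffix + ".pdf").exists():
            continue  # already processed on an earlier run
        pending.append(pdf)
    return pending


def ocr_folder(folder, ocr_command=("ocrmypdf",), suffix="_ocr"):
    """Run `ocr_command INPUT OUTPUT` once for each pending PDF.

    The default command is an assumption; pass a different tuple
    to use another OCR program with the same argument order.
    """
    done = []
    for pdf in find_unprocessed_pdfs(folder, suffix):
        out = pdf.with_name(pdf.stem + suffix + ".pdf")
        # check=True raises CalledProcessError if the OCR tool fails
        subprocess.run([*ocr_command, str(pdf), str(out)], check=True)
        done.append(out)
    return done
```

Calling `ocr_folder("/path/to/scans")` from a scheduled task or cron job would pick up new scans on each run. Originals are left in place and already-converted files are skipped, so rerunning it should be harmless.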
ChrisJA
#5
May16-13, 11:54 AM
P: 42
Sorry, I thought I had replied to this days ago. It seems the way to go is "tesseract": http://code.google.com/p/tesseract-ocr/
It has its own GUI, but there are other third-party GUIs, or you can run it from the command line or a script.
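
For the command-line route, here is a hedged sketch of how a script might drive Tesseract. Tesseract reads images rather than PDFs, so this assumes the pages are first rendered to PNG with `pdftoppm` (from poppler-utils, not mentioned in this thread), and that the installed Tesseract is recent enough to accept the `pdf` output config. The function names, the 300 dpi default, and the `eng` language are illustrative choices.

```python
import subprocess
from pathlib import Path


def rasterize_command(pdf, prefix, dpi=300):
    """Build the pdftoppm call that renders each page of `pdf` to PREFIX-<page>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf), str(prefix)]


def tesseract_command(image, output_base, lang="eng"):
    """Build the Tesseract call; the trailing "pdf" asks for a searchable PDF
    written to OUTPUT_BASE.pdf."""
    return ["tesseract", str(image), str(output_base), "-l", lang, "pdf"]


def ocr_pdf_pages(pdf, workdir):
    """Render `pdf` to page images in `workdir`, then OCR each page to its own PDF."""
    pdf = Path(pdf)
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(rasterize_command(pdf, workdir / pdf.stem), check=True)
    outputs = []
    # Process the page images in filename order
    for image in sorted(workdir.glob(pdf.stem + "-*.png")):
        subprocess.run(tesseract_command(image, image.with_suffix("")), check=True)
        outputs.append(image.with_suffix(".pdf"))
    return outputs
```

This leaves one searchable PDF per page in `workdir`; joining them back into a single file would need a separate merge step that is not shown here.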

