Automatically OCR PDF files

  1. Can anyone recommend a method to automatically OCR all PDF files in a given folder?

    My scanner saves files as PDF, but I would like them to be searchable.

    Thanks in advance.
     
  3. It would help to know what operating system you are using. Mac OS X or Linux?

    Adobe Acrobat will do what you want.
     
  4. I have computers running Windows and Linux, so a method for either would be fine, preferably a free one.
     
  5. There are several OCR options available. I did a Google search for 'linux ocr pdf', and this was the first hit on the list:
    http://ubuntuforums.org/showthread.php?t=1456756

    If the program doesn't have flags that let you process multiple PDFs at once, you can write a small script with a for loop that goes through the contents of a directory and OCRs each PDF file.
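
A for loop over a directory, as suggested here, might look roughly like the Python sketch below. It assumes the `ocrmypdf` command-line tool is installed and on the PATH; that tool is not mentioned in this thread and is only one possible choice (it uses Tesseract as its OCR engine). The function names `find_pdfs`, `ocr_command`, and `ocr_folder` are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def find_pdfs(folder):
    """Return the PDF files directly inside `folder`, sorted by name."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.is_file() and p.suffix.lower() == ".pdf")


def ocr_command(src, dst):
    """Build the command line for one file (assumes the ocrmypdf CLI)."""
    # --skip-text leaves pages that already contain text untouched.
    return ["ocrmypdf", "--skip-text", str(src), str(dst)]


def ocr_folder(folder, out_dir, run=subprocess.run):
    """OCR every PDF in `folder`, writing searchable copies to `out_dir`."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    done = []
    for src in find_pdfs(folder):
        dst = out_dir / src.name
        if dst.exists():
            continue  # already processed on an earlier run
        result = run(ocr_command(src, dst))
        if result.returncode == 0:
            done.append(dst)
        else:
            print(f"OCR failed for {src}", file=sys.stderr)
    return done


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python ocr_folder.py <scan_folder> <output_folder>
    ocr_folder(sys.argv[1], sys.argv[2])
```

Writing to a separate output folder keeps the original scans untouched and lets the script skip files it has already handled, so it can be re-run on a schedule (cron on Linux, Task Scheduler on Windows) whenever the scanner adds new files.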
     
  6. Sorry, I thought I had replied to this days ago. It seems the way to go is "tesseract": http://code.google.com/p/tesseract-ocr/
    It has no GUI of its own, but there are third-party GUIs, or you can run it from the command line or a script.
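
Calling tesseract from a script could look something like the Python sketch below. It rests on a few assumptions: the tesseract command-line program takes an image rather than a PDF as input, so the pages are first rendered to PNG with `pdftoppm` from Poppler (a tool not mentioned in this thread), and the tesseract build is recent enough to support the `pdf` output config. The helper names are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def rasterize_command(pdf, prefix, dpi=300):
    """pdftoppm writes one PNG per page, named <prefix>-<page>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf), str(prefix)]


def tesseract_command(image, out_base, lang="eng"):
    """With the `pdf` config, tesseract writes <out_base>.pdf with a text layer."""
    return ["tesseract", str(image), str(out_base), "-l", lang, "pdf"]


def ocr_pages(pdf, out_dir, run=subprocess.run):
    """Produce one searchable single-page PDF per page of `pdf`."""
    pdf, out_dir = Path(pdf), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    with tempfile.TemporaryDirectory() as tmp:
        # Step 1: render every page of the scan to a PNG image.
        run(rasterize_command(pdf, Path(tmp) / "page"), check=True)
        # Step 2: OCR each page image into its own PDF.
        for image in sorted(Path(tmp).glob("page-*.png")):
            out_base = out_dir / f"{pdf.stem}-{image.stem}"
            run(tesseract_command(image, out_base), check=True)
            outputs.append(Path(f"{out_base}.pdf"))
    return outputs
```

This yields one PDF per page; joining the pages back into a single file is left out here and would need an additional tool.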
     