Reading in an image?

  • Thread starter Maxwell
  • #1
513
0
Hey guys,

I was wondering if anyone knows of an open source solution for reading in a black and white image and converting it to text (preferably in a text file)?

I've tried Google, but I could only find online generators.

Thanks.
 

Answers and Replies

  • #2
79
0
Will Gimp do what you want?

Otherwise, I have a vague recollection (10+ years ago) of using some command-line utilities on Unix, like png2pbm. No idea if that kind of stuff is still around.
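
A note on the PBM route: the plain (ASCII) variant of the PBM format is itself a text file that uses 1 for a black pixel and 0 for a white one, so anything that converts to plain PBM already produces roughly the kind of output being asked about. The sketch below writes that format by hand in Python. The function name `to_plain_pbm` and the sample rows are made up for illustration, and it assumes the pixels are already available as rows of 0/1 integers.

```python
def to_plain_pbm(rows):
    """Return plain-PBM ("P1") text for rows of 0/1 pixels (1 = black, 0 = white)."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    lines = ["P1", f"{width} {height}"]
    for row in rows:
        bits = "".join(str(pixel) for pixel in row)
        # The PBM spec asks for lines of at most 70 characters,
        # so long rows are wrapped onto several lines.
        for start in range(0, len(bits), 70):
            lines.append(bits[start:start + 70])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [
        [0, 1, 1, 0],
        [1, 0, 0, 1],
        [0, 1, 1, 0],
    ]
    print(to_plain_pbm(sample), end="")
```

Apart from the two header lines (the "P1" marker and the width and height), the output is just the 1's and 0's of the image.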
 
  • #3
robphy
Science Advisor
Homework Helper
Insights Author
Gold Member
5,863
1,170
Some things that come up in a Google search for "freeware ocr":

http://www.simpleocr.com/
http://www.download.com/SimpleOCR/3640-2070_4-10152129.html

http://www.inftyproject.org/en/software.html#InftyReader
http://www.sciaccess.net/en/InftyReader/index.html

http://www.gnu.org/software/ocrad/ocrad.html

http://documents.cfar.umd.edu/ [Broken] (repository)
http://www.adams1.com/pub/russadam/ocr.html [Broken]
http://www.heatonresearch.com/articles/42/page1.html (project)
http://www.codeproject.com/dotnet/simple_ocr.asp [Broken] (project)
http://code.google.com/p/ocropus/
 
  • #4
Zen
10
0
I've tried some of the open source alternatives, and I must say I was disappointed. :( It might be that some settings needed to be adjusted, though. I gave up after a while.
 
  • #5
513
0
Thank you for those links, but they are kind of different from the type of manipulation I need.

Aren't there any algorithms that just straight take in a black & white image and turn it into binary?
 
  • #6
robphy
What is the format of the input image?
Once the image is read in, it shouldn't be too hard to complete a short computer program in (say) perl, python, c, java...
 
  • #7
513
0
The images can either be PNG or TIFF files. If there is code that only works for one of the previously mentioned file types, that is fine.
 
  • #8
robphy
If I had to quickly write something to do this, I would personally choose one of these approaches:

write a program in Python using the Python Imaging Library http://www.pythonware.com/products/pil/

write a program in Java using ImageJ and some selection from its plugins http://rsb.info.nih.gov/ij/ http://rsb.info.nih.gov/ij/plugins/index.html

The above, however, would not be very portable... except to other computers with these installed.

As I hinted above, the hardest part would be to read in the image file... probably best handled by someone else's routines that the rest of your program would call.
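
To illustrate the point about leaving the file decoding to someone else's routines, here is a minimal Python sketch. It assumes Pillow (the maintained fork of the Python Imaging Library) is installed; `load_gray_rows` and `split_rows` are names invented for this example. Pillow decodes the PNG or TIFF, and the rest of the program only sees rows of grayscale values (0 = black, 255 = white).

```python
def split_rows(flat, width):
    """Split a flat sequence of pixel values into rows of `width` values each."""
    if width <= 0:
        return []
    return [list(flat[i:i + width]) for i in range(0, len(flat), width)]


def load_gray_rows(path):
    """Decode an image file with Pillow and return its pixels as rows of 0..255 ints."""
    from PIL import Image  # assumption: Pillow is installed (pip install Pillow)

    with Image.open(path) as img:
        gray = img.convert("L")          # 8-bit grayscale, one byte per pixel
        width, _height = gray.size
        return split_rows(gray.tobytes(), width)
```

Calling `load_gray_rows("scan.png")` (a hypothetical file name) would return one list per image row, which the rest of the program can then turn into whatever text form it needs.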
 
  • #9
1,997
5
Quote from #6: "What is the format of the input image? Once the image is read in, it shouldn't be too hard to complete a short computer program in (say) perl, python, c, java..."

Converting an image to text obviously cannot be done with a short computer program. It is actually quite complex.

- Oh, never mind, you were commenting on converting the image to a binary.

What I am waiting for is for some student to write an OCR program that converts formulas into LaTeX. Would be a nice project and would benefit many.
 
  • #10
robphy
Quote from #9: "Converting an image to text obviously cannot be done with a short computer program. It is actually quite complex."

The OP is not looking for an OCR program... but a program which converts a 2-color image into some kind of text file with a binary or hexadecimal representation of the image.

Quote from #9: "What I am waiting for is for some student to write an OCR program that converts formulas into LaTeX. Would be a nice project and would benefit many."

Did you see the InftyReader project in the links I posted above?
Here are some samples: http://www.inftyproject.org/en/demo.html#0002
 
  • #11
1,997
5
Quote from #10: "The OP is not looking for an OCR program... but a program which converts a 2-color image into some kind of text file with a binary or hexadecimal representation of the image."

I see, sorry for the confusion.


Quote from #10: "Did you see the InftyReader project in the links I posted above? Here are some samples: http://www.inftyproject.org/en/demo.html#0002"

Heh, interesting! I am going to check it out!
 
  • #12
513
0
Quote from #10: "The OP is not looking for an OCR program... but a program which converts a 2-color image into some kind of text file with a binary or hexadecimal representation of the image."

Exactly. This conversion from a 2 color image to text is only a small part of an overall project, so if I don't need to mess around with writing the program myself, and could perhaps use an open source solution, that would be fantastic.
 
  • #13
robphy
Quote from #12: "Exactly. This conversion from a 2 color image to text is only a small part of an overall project, so if I don't need to mess around with writing the program myself, and could perhaps use an open source solution, that would be fantastic."

The Python PIL and ImageJ solutions are open source platforms... But I doubt you will find an already written program that does what you want. (However, see below.) If this is part of a larger project, these solutions above might be worth looking into.

Here is one question though... are you looking to create a text file comprised of only "0" and "1" corresponding to the 2 colors of an image? [Rather than (say) a text file with a hexadecimal representation, each line corresponding to eight rows of the image.] In other words, are you looking for something like [but not precisely] this: http://www.text-image.com/ or http://ascii.dyne.org/ ?
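
The two text layouts contrasted here, one character per pixel versus a packed hexadecimal form, can be sketched in Python as follows. This is only an illustration under stated assumptions: the pixels are given as rows of 0/1 integers, and the hex variant packs eight horizontally adjacent pixels into one byte (most significant bit first), padding the last byte of a row with zeros. That is one possible packing, not necessarily the one meant above, and the function names are hypothetical.

```python
def row_to_bits(row):
    """One character per pixel: '1' for a set pixel, '0' otherwise."""
    return "".join("1" if pixel else "0" for pixel in row)


def row_to_hex(row):
    """Pack a row of 0/1 pixels into hex, eight pixels per byte, MSB first."""
    bits = row_to_bits(row)
    bits += "0" * (-len(bits) % 8)       # pad the row to a whole number of bytes
    return "".join(
        f"{int(bits[i:i + 8], 2):02X}" for i in range(0, len(bits), 8)
    )


if __name__ == "__main__":
    sample = [
        [1, 1, 1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1, 1, 1],
    ]
    for row in sample:
        print(row_to_bits(row), row_to_hex(row))
```

The bit form is easier to read by eye; the hex form is four times shorter per row, at the cost of needing the image width to undo the padding.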
 
  • #14
513
0
Quote from #13: "The Python PIL and ImageJ solutions are open source platforms... But I doubt you will find an already written program that does what you want. (However, see below.) If this is part of a larger project, these solutions above might be worth looking into.

Here is one question though... are you looking to create a text file comprised of only "0" and "1" corresponding to the 2 colors of an image? [Rather than (say) a text file with a hexadecimal representation, each line corresponding to eight rows of the image.] In other words, are you looking for something like [but not precisely] this: http://www.text-image.com/ or http://ascii.dyne.org/ ?"

I'm not sure either really captures what I'm looking for, but if I had to choose one, I'd say the second link better depicts what I'm trying to do.

For the first link, the images are converted to 1's and 0's, but it doesn't seem like those values represent anything.

What I'm looking for is to take a black and white image in, and have the black areas represented by 1's and the white areas represented by 0's.
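
A minimal sketch of this mapping in Python, assuming Pillow is installed and that a pixel counts as black when its grayscale value is below 128. The threshold, the function names, and the file names mentioned afterwards are all assumptions made for illustration.

```python
def gray_rows_to_text(rows, threshold=128):
    """Map grayscale rows to text: '1' where a pixel is darker than threshold, else '0'."""
    return "\n".join(
        "".join("1" if value < threshold else "0" for value in row)
        for row in rows
    )


def image_to_text_file(image_path, text_path, threshold=128):
    """Read a PNG/TIFF with Pillow and write one line of 1's and 0's per image row."""
    from PIL import Image  # assumption: Pillow is installed (pip install Pillow)

    with Image.open(image_path) as img:
        gray = img.convert("L")          # 8-bit grayscale, one byte per pixel
        width, height = gray.size
        data = gray.tobytes()
    # Slice the flat pixel buffer into one bytes object per image row.
    rows = [data[y * width:(y + 1) * width] for y in range(height)]
    with open(text_path, "w") as out:
        out.write(gray_rows_to_text(rows, threshold) + "\n")
```

Calling `image_to_text_file("input.png", "output.txt")` would then write a text file in which each line is one row of the image, with 1 for black and 0 for white.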
 
