Text color detection algorithm given pixels of text

SlurrerOfSpeech · Oct 5, 2019

I'm writing some image processing software and I need to get the color that best represents the text once I have identified the pixels in a region of text. I've tried simple strategies like averaging the pixels or taking the most common pixel color, but these produce bad results.

An example I can show is a screenshot of this very message board post.

Suppose I want to get the color of "PF thrives ...". A good representation is the hex value #4C4C35, which I found by using the color picker in MS Paint and clicking on a pixel that appeared to be the same color as the text looks to the human eye.

However, that color would be chosen by "most common color" just barely, and as you can see, there are many other colors (turquoise, purple, tan, etc.) that might have won if the text were a little thinner. I need a more foolproof algorithm because mine fails in many cases.


Filip Larsen · Oct 5, 2019

If the regions you talk about are known to be text only, then it seems logical that the most common color (or hue) would be that of the background and the second most common one would be the "real" color of the text, with smaller peaks where the foreground color mixes with the background color because of anti-aliased font rendering. Perhaps you can even "filter out" those minor peaks on a second pass (once you know the first and second peaks) if their hue lies "between" the background and foreground hues.
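A rough Python sketch of the two-peak idea, for illustration only. The function names and the `tolerance` value are made up here, and "between" is tested as closeness to the straight RGB segment from background to text rather than by hue, which is a simplification of what is described above.

```python
from collections import Counter
import math

def two_peak_colors(pixels):
    """Return (background, text) as the two most common colours.

    Assumes `pixels` is a flat list of (r, g, b) tuples from a text-only region.
    """
    top = Counter(pixels).most_common(2)
    if len(top) < 2:
        raise ValueError("need at least two distinct colours")
    return top[0][0], top[1][0]

def is_blend(color, background, text, tolerance=30.0):
    """True if `color` lies near the RGB segment strictly between the two peaks.

    `tolerance` is a hypothetical distance in RGB units, not a tuned value.
    """
    axis = [t - b for t, b in zip(text, background)]
    length_sq = sum(a * a for a in axis)
    if length_sq == 0:
        return False
    offset = [c - b for c, b in zip(color, background)]
    # Position of the projection along the segment: 0 = background, 1 = text.
    t = sum(o * a for o, a in zip(offset, axis)) / length_sq
    if not 0.0 < t < 1.0:
        return False
    closest = [b + t * a for b, a in zip(background, axis)]
    return math.dist(color, closest) <= tolerance
```

On a second pass, colours for which `is_blend` is true could be dropped before re-counting. Coloured fringes from subpixel rendering would probably not sit on that segment, so this alone may not be enough.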

I am not sure whether you also want to pick up on cases where only some of the text is rendered in a different color, say if a few words are red?

SlurrerOfSpeech · Oct 5, 2019

Filip Larsen said:

If the regions you talk about are known to be text only, then it seems logical that the most common color (or hue) would be that of the background

No, I'm saying that I already have the coordinates of the text pixels, e.g. a 2-D array of booleans where true is a text pixel and false is a background pixel, and I'm choosing the most common color among the true coordinates within a region containing text.

Below is a better example of the pixels of small, thin text on this page when I take a screenshot. The most common colors are actually a blue or gold instead of black as intended. So most common color is not a reliable formula.

Ibix · Oct 5, 2019

I'd do some reading about how text is composited onto a background. Presumably you can identify the background colour - given that, can you invert the compositing process?
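As a minimal sketch of what inverting the compositing could look like, assume plain alpha blending with one coverage value per pixel: observed = alpha * foreground + (1 - alpha) * background. Both function names are hypothetical, and subpixel rendering effectively uses a separate coverage per colour channel, so this is only a first approximation.

```python
import math

def unblend(observed, background, alpha):
    """Solve observed = alpha*fg + (1 - alpha)*bg for fg, channel by channel.

    Assumes `alpha` (the glyph coverage of this pixel) is somehow known.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    return tuple(
        min(255, max(0, round((o - (1.0 - alpha) * b) / alpha)))
        for o, b in zip(observed, background)
    )

def estimate_foreground(text_pixels, background):
    """Pick the text pixel farthest from the background colour in RGB.

    Assumption: that pixel is fully covered by the glyph (alpha = 1), so it
    already equals the foreground colour. Very thin text may have no such pixel.
    """
    return max(text_pixels, key=lambda p: math.dist(p, background))
```

Since alpha is normally unknown, the foreground is only determined up to a point along the ray leaving the background colour; taking the farthest observed pixel is the most conservative choice on that ray.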

Ibix · Oct 5, 2019

Alternatively, measure the spatial distance of each text pixel from the nearest "pure background" pixel, order by distance, keep the top 10%, and take the modal or median colour of those.

Muck around with the distance measure (Euclidean, anisotropic Euclidean, taxicab, etc), percentage to keep, and averaging methodology to see if you can find decent performance.
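One way this could be sketched in Python, using taxicab distance computed by a breadth-first search from all background pixels at once. It assumes `mask[y][x]` is the boolean array from earlier in the thread (true = text), treats every false cell as "pure background", and takes the per-channel median of the kept pixels. The function names and the `keep_fraction` parameter are illustrative.

```python
from collections import deque
from statistics import median

def taxicab_distance_to_background(mask):
    """Grid of taxicab distances from each cell to the nearest False cell.

    Cells stay None if the mask contains no background at all.
    """
    h, w = len(mask), len(mask[0])
    dist = [[None] * w for _ in range(h)]
    queue = deque()
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                dist[y][x] = 0
                queue.append((y, x))
    # Multi-source BFS over 4-connected neighbours gives taxicab distance.
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist

def core_text_color(image, mask, keep_fraction=0.10):
    """Median colour of the text pixels that lie deepest inside the glyphs."""
    dist = taxicab_distance_to_background(mask)
    ranked = sorted(
        (
            (dist[y][x], image[y][x])
            for y in range(len(mask))
            for x in range(len(mask[0]))
            if mask[y][x] and dist[y][x] is not None
        ),
        key=lambda item: item[0],
        reverse=True,
    )
    if not ranked:
        raise ValueError("no text pixels with a reachable background pixel")
    keep = max(1, round(len(ranked) * keep_fraction))
    core = [color for _, color in ranked[:keep]]
    return tuple(round(median(channel)) for channel in zip(*core))
```

Swapping the BFS for a Euclidean distance transform, or the median for the mode, changes only one function each, which makes it easy to experiment with the variations suggested above.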

Filip Larsen · Oct 5, 2019

Apparently subpixel rendering has been invented since the last time I did any work on font rendering (back when only non-colored anti-aliasing was used), so I will venture a guess that the colors you see are an artifact of that particular rendering technique. It is not clear to me how colored text is rendered with this technique, but perhaps it's still possible to apply some kind of averaging filter that extracts the original color of each letter.

Or perhaps you can look into whether any OCR libraries or similar have put effort into adjusting for this effect when "reading" off a screen. A quick search gave a link to a GitHub project with an accompanying blog post.

Tom.G · Oct 6, 2019

A somewhat obvious simplification would be to first convert to grey-scale, then perhaps threshold to convert to black and white.
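A minimal sketch of that in Python, assuming the image is a list of rows of (r, g, b) tuples and the text is dark on a light background. The Rec. 601 luma weights and the threshold of 128 are choices made here for illustration, not something specified in the thread.

```python
def to_gray(pixel):
    """Convert an (r, g, b) pixel to a 0-255 grey level using Rec. 601 luma weights."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def binarize(image, threshold=128):
    """Return a boolean grid: True where the pixel is darker than `threshold`.

    Assumes dark text on a light background; flip the comparison otherwise.
    """
    return [[to_gray(p) < threshold for p in row] for row in image]
```

Grey-scale conversion also collapses the coloured fringes from subpixel rendering into plain light/dark values, at the cost of discarding the colour information itself.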

Be aware that the 'characters' will not be perfectly formed no matter the method of obtaining them. There are always missing and extraneous pixels to contend with. Your recognition routine will have to operate on the 'nearest match' approach.

Cheers,
Tom
