Resolution in visual system in physical terms

  1. Sep 6, 2017 #1
    I've often wondered at the "resolution" of the human visual system but it's not at all clear from what I've read whether this question even makes sense. As a sort of general position, many of the articles I've read suggest that the human eye, over the full field of vision, delivers around 400-600 megapixels of detail. The central foveal field is considerably less, perhaps no more than 10 megapixels. And only around 1 million fibres extend from retina to brain.

    However, I don't think this is really what I'm thinking about. The visual system does a lot of processing along the way and at some point in the system an image becomes conscious. And of course, vision is more like a video stream than a static image.

    What I am curious about is whether we know how detailed a conscious image is. While we might only discern 10 megapixels in terms of visual field at the retina's foveal region, and only 1 million fibres extend to the brain, I assume that the constant stream being generated produces a far more detailed conscious image. That is, the "resolution" of a conscious image must be greater than what is physically derived at a single moment at the retina.

    If that is so (or even if it isn't I suppose) is it known how many neurons contribute to each pixel in a conscious image? I mean by this that as conscious images are formed over time, does a single neuron contribute to one pixel or many pixels? Do we know the ratio of neurons to pixels at the stages of image processing that most likely give rise to conscious experience of a scene?

    Or is this not really a valid question because I don't understand enough about visual processing?
  3. Sep 6, 2017 #2


    Science Advisor
    Gold Member
    2017 Award

    I think that is a good question, but the answer is very complicated. The mind fills in missing parts of what the eye sees -- right or wrong. Mental recognition of the pattern that the eye sends it is very complicated. If you are asking about that, then I don't believe we know enough to answer your question or even how to rigorously express an answer.
  4. Sep 6, 2017 #3

    jim mcnamara

    Staff: Mentor

    How about a measure of resolution, or visual acuity? What the fovea picks up - a small subsection of the visual panorama. Is that what you mean?

    @FactChecker is close to the heart of the perception issues. Your brain does a lot of postprocessing of visual input to fill in images it interprets as belonging to the whole picture, and it can be fooled pretty easily. Optical illusions like trompe l'oeil, and the illusions stage magicians perform, come to mind.

    This helps:
    https://en.wikipedia.org/wiki/Visual_acuity. It discusses what you asked and all of the attendant factors. IMO.
  5. Sep 6, 2017 #4


    Science Advisor
    2017 Award

    Measuring the resolution of an image by number of pixels is a relatively new phenomenon that has largely come about because of how digital cameras store information. For images not stored digitally (e.g. on film), it's unclear how you could describe the resolution in terms of megapixels. Traditionally, the spatial resolution of an imaging system would be measured by how close together two objects could be while remaining distinguishable in the image. Features such as the numerical aperture/f-number of the imaging system help determine its spatial resolution. Ultimately, the spatial resolution of an optical imaging system is limited by the diffraction of light.
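
    As a rough illustration of the diffraction limit mentioned above, the Rayleigh criterion gives the smallest angular separation a circular aperture can resolve as about 1.22 times the wavelength divided by the aperture diameter. The sketch below applies it to the eye. The wavelength (550 nm, green light) and pupil diameter (3 mm) are assumed values chosen for illustration, and the eye is treated as an ideal, aberration-free lens, which it is not.

```python
import math


def rayleigh_limit_arcmin(wavelength_m: float, aperture_m: float) -> float:
    """Smallest resolvable angle (in arcminutes) for a circular aperture,
    using the Rayleigh criterion: theta = 1.22 * wavelength / diameter."""
    theta_rad = 1.22 * wavelength_m / aperture_m
    return math.degrees(theta_rad) * 60.0


# Assumed values, not taken from the thread.
WAVELENGTH_M = 550e-9      # green light
PUPIL_DIAMETER_M = 3e-3    # pupil in fairly bright conditions

limit = rayleigh_limit_arcmin(WAVELENGTH_M, PUPIL_DIAMETER_M)
print(f"Diffraction limit: {limit:.2f} arcmin")  # prints about 0.77 arcmin
```

    Under these assumptions the limit comes out a little under one arcminute, the same order as the figure of about 1 arcmin quoted elsewhere in this thread for normal vision.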

    Measuring the resolution of the human eye in megapixels presupposes the image information gets stored/interpreted in a pixelated, digital format. Is this true?
  6. Sep 6, 2017 #5


    Science Advisor
    Gold Member
    2017 Award

    Some issues with the mental image of what a person sees depend a lot on how long he gets to study the image. It is well established that eye witness testimony of a brief criminal event is very unreliable. The mind fills in many details that are not really there. On the other hand, given time to study a picture and take mental notes of the details, the results are much more reliable.
  7. Sep 6, 2017 #6

    Andy Resnick

    Science Advisor
    Education Advisor

    "perfect vision" corresponds to about 1 arcmin of resolution. However, as you correctly note, that doesn't begin to address visual acuity: it is known that the angular resolution of vision depends on the contrast between object and background, if the object is moving, if the object is 'blinking' (and the temporal characteristics of the blinks- duration, repetition rate, etc.), if the image is located at the fovea or periphery.... And everything depends on wavelength and if you mean scotopic (dim light, so the rods) or photopic (bright light, the cones).

    Your retina has about 7 layers of processing: local averaging, edge detection, movement tracking, temporal averaging, and more.

    The best book I have read on this topic is "Basic Vision" https://www.amazon.com/Basic-Vision-Introduction-Visual-Perception/dp/019957202X
  8. Sep 6, 2017 #7


    Science Advisor

    I have heard pixel based digital picture resolutions compared to the number of grains/area in film and negatives, mostly from photographers.

    The human visual system functions differently from film and digital images.

    There is processing of visual image information in the retina itself.
    Changes vs. time will affect how this processing proceeds, so the movie-like aspect can be important.
    A flash, or a sudden increasingly large dark area, in the peripheral visual field will more strongly attract attention than areas of constant illumination.

    Different areas of the retina serve different functions in the internal reconstruction of the visual area (like a stage), the objects in that area, and movements between the various parts. Information concerning the depth (distance; derived from eye convergence, eye lens focusing, other visual cues) of objects in a scene could also be included.
    Retinal areas surrounding the fovea have rods (B&W), which are better for low light conditions. Spatially, they may provide a more low-res. regional awareness of objects in the area.

    The fovea is constantly moving, focusing on a spot for a short period of time and then moving to another part of the scene (eye movement information would also be combined with the photo-receptor derived information). This provides more information, at a higher density, from particular areas under observation, unconsciously selected for you by the visual system. Many areas just get filled in based on their surroundings. There are optical illusions based on this.

    Therefore, your observed scene could vary quite a bit in what you might be able to resolve.
    While pictures or movies (digital or non-digital) store an even density of picture components (pixels or film grains) across the image, the internal human visual image (as normally used to explore an environment) is more like a built-up model, based mostly on visual inputs, but sometimes also on other senses (such as sensing eye/lens movements or orientation vs. gravity).

    When you observe a visual field, you are probably looking at an internally constructed model of the objects and areas you are observing.
    The inputs to this system are not evenly distributed across the retina. Color receptors are found at their highest densities at the fovea. The fovea is aimed at areas of interest in a visual field to get detailed information about that particular part of the field.

    With continued observation from different observation points, points of increasingly high resolution could be built up, exceeding what might otherwise seem like limits. Of course technology (such as microscopes/telescopes) has allowed us to further extend our "vision" to even higher resolutions (which is still going through your visual system).
    In that sense, there are no limits to the resolution of the human visual system.

    If you are just interested in the limits of direct observation under controlled conditions, then a simple psychophysical approach can determine how close together two points can be and still be resolved.

    However, normal use of your visual system involves much more than just this.
  9. Sep 6, 2017 #8
    Thank you for the interesting responses. I'll get back to my question in a moment. Thanks too for the book recommendation, Andy Resnick. I think it's very interesting to learn that there are several layers of "processing" in the retina - I'd love to know what that actually means given that this physical area should be relatively open to study. How do retinal cells process information in that way?

    In terms of resolution or visual acuity, to which I admit to only a very sketchy understanding, this page seems to be referenced quite a bit.

    Here, the author proposes that typical visual acuity of 1.7 when measured via a line pair corresponds to 0.59 arc minute per line pair, and hence the "pixel" spacing works out to 0.3 arc-minute. He makes some interesting calculations in terms of the pixel detail needed for an image to reach the limits of human visual acuity.
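
    The arithmetic in the paragraph above can be reproduced in a few lines. This is only a sketch of the reasoning as summarised here: an acuity of 1.7 is read as 1.7 line pairs per arcminute, and one line pair is taken to need two "pixels". The 120 degree square field of view used for the total count is an assumed value for illustration and is not given in this post.

```python
def line_pair_arcmin(acuity: float) -> float:
    """Arcminutes per resolvable line pair for a given decimal acuity."""
    return 1.0 / acuity


def pixel_count(field_deg: float, pixel_arcmin: float) -> float:
    """Number of pixels needed to tile a square field at a given spacing."""
    pixels_per_side = field_deg * 60.0 / pixel_arcmin
    return pixels_per_side ** 2


lp = line_pair_arcmin(1.7)   # about 0.59 arcmin per line pair
spacing = lp / 2.0           # two pixels per line pair, about 0.29 arcmin

FIELD_DEG = 120.0            # assumption: square field of view, in degrees
total = pixel_count(FIELD_DEG, 0.3)  # uses the rounded 0.3 arcmin spacing

print(f"{lp:.2f} arcmin per line pair, {spacing:.2f} arcmin per pixel")
print(f"{total / 1e6:.0f} megapixels over a {FIELD_DEG:.0f} degree field")
```

    With the rounded 0.3 arc-minute spacing and the assumed field, the total lands at 576 megapixels, inside the 400-600 megapixel range mentioned in the first post.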

    Returning to my question, I think what I am getting at is best addressed in the comments above that talk about how our brains/minds construct scenes and fill in details etc. Now, without straying into philosophy, I am going to assume that when I see a scene, regardless of how much in-filling or construction is going on, what I experience must be a physical artefact.

    Assuming the reference above is fairly accurate, it seems that there could be assumed to be a finite limit to the number of points or pixels in an experienced scene. At some fine limit of detail, two points resolve into one as far as vision goes. I am asking whether we know how many neurons are required to represent any of these points at the limit of acuity.

    Put another way, when I look at a wall of uniform colour and brightness, I experience something different from a scene of detail, such as a wall of fine coloured dots. As we decrease the size and spacing of the dots, at some point the wall ceases to be experienced as separate dots and becomes visible as a solid colour (I assume!). Is there a difference in how many neurons represent the separate dots versus the solid colour? I don't see that there can be, as the initial sensory perception seems to be largely the same (scattered cones responding to photons of x wavelength). Those signals are then passed through the visual cortex and eventually make it to consciousness, but is it a one to one relationship (I doubt that), or is the final "image" composed of more points (i.e. neurons) than the original sensory response passed from the retina (I imagine that it must be)?

    Does that make sense? I suppose we run into the problem of not knowing what makes something conscious, but in terms of the visual cortex processing, is there a stage at which processing finishes, so to speak, where my question might apply?
  10. Sep 6, 2017 #9


    Staff: Mentor

    That's what I was thinking: [to OP] try calculating what angle the letters on an eye chart subtend for 20/20 vision....
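
    One way to work through that calculation is sketched below. It assumes the usual Snellen convention that a 20/20 letter subtends 5 arcminutes at 20 feet, with each stroke one fifth of the letter height. The function names are made up for this example.

```python
import math


def size_for_angle(angle_arcmin: float, distance_m: float) -> float:
    """Physical size (metres) that subtends a given angle at a distance."""
    half_angle_rad = math.radians(angle_arcmin / 60.0) / 2.0
    return 2.0 * distance_m * math.tan(half_angle_rad)


def subtended_arcmin(size_m: float, distance_m: float) -> float:
    """Angle (arcminutes) subtended by an object of a given size."""
    return math.degrees(2.0 * math.atan(size_m / (2.0 * distance_m))) * 60.0


DISTANCE_M = 6.096                           # 20 feet
letter_m = size_for_angle(5.0, DISTANCE_M)   # assumed 5 arcmin letter height
stroke_m = letter_m / 5.0                    # stroke is one fifth of the letter

print(f"Letter height: {letter_m * 1000:.2f} mm")
print(f"Stroke width:  {stroke_m * 1000:.2f} mm")
print(f"Stroke angle:  {subtended_arcmin(stroke_m, DISTANCE_M):.2f} arcmin")
```

    Under that convention the letter is just under 9 mm tall and the detail that has to be resolved (the stroke) subtends about 1 arcminute, which matches the "perfect vision" figure given earlier in the thread.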
  11. Sep 7, 2017 #10


    Staff Emeritus
    Science Advisor

    No, retinal cells are not linked to the brain on a one-for-one basis. There are far more receptors in the retina than there are "paths" in the optic nerve. Receptors are linked together in different ways and the signals from multiple receptors are usually added together and processed in some fashion before ever reaching the optic nerve. The receptors can also respond differently, with some being turned "on" by light and some being turned "off" instead. Interestingly, the different layers of this processing chain appear to perform things like edge detection and shape detection, among others.
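
    A toy model can show how pooling receptors with "on" and "off" responses ends up signalling edges. In the sketch below each output compares one receptor with the average of its immediate neighbours. This centre-minus-surround rule, the one-dimensional layout, and the neighbourhood size are all simplifying assumptions made for illustration, not a description of the actual retinal circuitry.

```python
def center_surround(signal, radius=1):
    """ON-type response: each sample minus the mean of its neighbours
    within `radius` positions. Negating it gives the OFF-type response."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        neighbours = [signal[j] for j in range(lo, hi) if j != i]
        out.append(signal[i] - sum(neighbours) / len(neighbours))
    return out


# A dark region followed by a bright region: one edge in the middle.
stimulus = [0.0] * 5 + [1.0] * 5

response = center_surround(stimulus)
on_cells = [max(0.0, r) for r in response]    # fire on the bright side
off_cells = [max(0.0, -r) for r in response]  # fire on the dark side

print(response)  # [0.0, 0.0, 0.0, 0.0, -0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
```

    Ten receptor values collapse to two non-zero responses, both sitting at the edge, while the uniform regions produce no output at all. That is one simple sense in which the signal leaving such a stage carries less data than the raw input but keeps the structure.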

    Once the signals from the receptors make it through the beginning of the optic nerve, they end up being processed again at various points up the chain and the signals eventually spread out to a great many neurons in the form of a conscious image. Each stage does different things. For example, the lateral geniculate nucleus (LGN) performs ranging and velocity detection of major objects in the visual field before up-channeling the visual signals to the other parts of the visual system. Then these perform more processing to start to piece together all the different types of information into a coherent "global view", which is then passed on.

    See the following image: https://upload.wikimedia.org/wikipedia/commons/c/cd/Lisa_analysis.png

    Note that the output of the retina is very, very low quality compared to the input image. Instead of a single raw image you have many different lower quality "images" which give the rest of the visual system basic information to work with. I put the word "image" in quotes for a reason. It's easy to think that the output of the retina is an actual image, when it is very likely that it is more like streams of data which are all processed in parallel and added together to form the image in your head that you "see". Given all of the processing and compression that takes place, it seems almost a miracle that we can see anything at all!

    If you were to think of this process in terms of digital imaging, then imagine that you're trying to film a scene at 60 FPS with a 100 megapixel camera, but the cable and equipment transmitting the image to your computer can only support a fraction of that amount of data. So the camera has to do a lot of pre-processing to compress the data without losing valuable information. Now, this transmitted image isn't a raw image, it may not even be an "image" at all. The computer has to do its own processing of the data to recover the image from the incoming data, correct for artifacts introduced by the compression process, store the image in memory, and keep track of changes in the image and all of the other things that need to happen.
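
    To put rough numbers on this analogy, the sketch below works out the raw data rate of such a camera and the compression a link would force on it. The bytes per pixel and the link capacity are assumed values picked for illustration. The last lines apply the same ratio idea to the retina, using the 1 million fibre figure from the first post and an approximate textbook receptor count that should also be read as an assumption.

```python
# Camera side of the analogy: 60 FPS at 100 megapixels.
FPS = 60
PIXELS_PER_FRAME = 100e6
BYTES_PER_PIXEL = 3          # assumption: 8-bit RGB, uncompressed
LINK_BYTES_PER_S = 1.25e9    # assumption: a 10 Gbit/s cable

raw_bytes_per_s = FPS * PIXELS_PER_FRAME * BYTES_PER_PIXEL
compression_needed = raw_bytes_per_s / LINK_BYTES_PER_S

# Rough retinal counterpart.
PHOTORECEPTORS = 126e6       # assumption: approximate textbook figure
OPTIC_NERVE_FIBRES = 1e6     # figure quoted in the first post
convergence = PHOTORECEPTORS / OPTIC_NERVE_FIBRES

print(f"Raw camera stream: {raw_bytes_per_s / 1e9:.0f} GB/s")
print(f"Compression needed for the link: {compression_needed:.1f}:1")
print(f"Receptors per optic nerve fibre: {convergence:.0f}:1")
```

    With these assumed figures the camera would need roughly 14:1 compression, while the retina averages on the order of a hundred receptors per outgoing fibre, so the biological "cable" is the tighter bottleneck of the two.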

    See the following links for more info:
  12. Sep 7, 2017 #11
    There are two elements of image processing that are important in answering your question. The first is that the use of a priori information is fair game in image processing - and the human brain certainly employs this. The second is a form of integration - either spatially or over time.

    Taken together, they allow the resolution of some image details well below the pixel level. For example, if you move a sword across the view of a camera, using certain assumptions, I can process that image to determine the position of the sword to high precision - much greater than my pixel resolution. Those assumptions, my a priori information, will include: the shape of the sword is not significantly changing; there are limits to the acceleration of that sword; the background image is also not changing or changing relatively slowly. I can combine images across time to deduce a precise shape of the sword - then, with each frame of imagery, I can spatially integrate the entire sword image to precisely determine its location.
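
    A minimal sketch of that idea, reduced to one dimension: a blurred object whose true centre sits between pixels is sampled on an integer pixel grid with noise, located in each frame by an intensity-weighted centroid (the spatial integration), and the per-frame estimates are then averaged (the integration over time). The Gaussian shape, the noise level, and the stationary object are assumptions standing in for the a priori information. A real tracker for a moving sword would need a motion model as well.

```python
import math
import random


def render(center, rng, n_pixels=21, sigma=1.5, noise=0.01):
    """One noisy 1D frame of a blurred object centred at `center` (pixels)."""
    return [
        math.exp(-0.5 * ((x - center) / sigma) ** 2) + rng.gauss(0.0, noise)
        for x in range(n_pixels)
    ]


def centroid(frame):
    """Intensity-weighted mean position across the whole frame."""
    return sum(x * v for x, v in enumerate(frame)) / sum(frame)


rng = random.Random(0)       # fixed seed so the run is repeatable
TRUE_CENTER = 10.3           # deliberately not on a pixel boundary

# Spatial integration per frame, then averaging over 200 frames.
estimates = [centroid(render(TRUE_CENTER, rng)) for _ in range(200)]
mean_estimate = sum(estimates) / len(estimates)

print(f"True centre:     {TRUE_CENTER}")
print(f"Single frame:    {estimates[0]:.3f}")
print(f"200-frame mean:  {mean_estimate:.3f}")
```

    Even though the samples are a whole pixel apart, the averaged estimate should land within a small fraction of a pixel of the true centre, because every pixel the object touches contributes a little information about where it is.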

    That the mammalian visual cortex processes edge and motion at a very early stage has been observed since the 1960's. Cats have been a favorite subject. For example: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1359523/pdf/jphysiol01247-0121.pdf

    From there, this edge and motion information is passed on to other parts of the brain. Further processing produces both obvious and not-so-obvious results. It allows us to consciously recognize and appreciate the form and 3D location of the objects we see. Less obvious is how visual information can be used without direct conscious recognition. https://en.wikipedia.org/wiki/Blindsight

    But the key is that we don't consciously see pixels. What we consciously see are objects in our field of vision - along with their relative 3D locations and an experience-based recognition of what those objects are.