Knowing how close you are to an object requires depth perception. Without it, you can’t reach the apple you want to eat, walk down a path, or see danger coming toward you. No playing basketball or football.
Monocular Cues
The most common explanation of depth perception is cue theory. Essentially, we find environmental hints which we use to infer depth. These unconscious calculations can require two eyes, but most cues need only one eye. These cues are the same ones artists use in paintings or drawings.
There is no agreed-upon list of cues but here are ten top contenders.
1. Size
Interestingly, ancient paintings don’t show size cues. Size isn’t incorporated until the Renaissance in the 1500s. You can clearly see size cues in the work of Leon Battista Alberti (1404-1472). His method of painting (Alberti’s window) is widely used today. Think of the scene you want to paint as being a window with panes of glass. In each pane, replicate what you see. The result will be a painting with the proper relative perspective.
But why didn’t painters before the Renaissance use size cues? One suggestion is that depth perception didn’t emerge until then. An interesting argument but highly unlikely. It’s not like earlier people couldn’t walk without falling down. A sudden acquisition of depth perception isn’t reasonable.
David Hockney and Charles Falco have a different explanation. Several years ago, Hockney wrote to encourage me to better explain his theory; I’ll try but probably still won’t get it right. Sorry, David.
The idea is not so much that pane-by-pane replication is the secret but that the realistic paintings of the Renaissance were the result of optics. The belief is that artists used concave mirrors to project images on canvas and traced them.
Another theory is cultural. Before the Renaissance, position was more important than size. In a totem pole, for example, the figure on the bottom was the most important. The figures are stacked as a rank order. In other artwork, depicting figures the same size could be more about aesthetics than vision.
There are three types of size cues: absolute, relative and familiar.
Relative Size
Relative size shows two objects at the same time. The smaller one seems farther away, particularly if they aren’t presented side by side. This cue relies on a difference in visual angle, which is how much the object fills the view. Things which fill the frame are close. Things far away take up less space in the visual field.
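The relationship between distance and visual angle can be sketched with a little trigonometry. This is a minimal illustration, not part of the original discussion: the function name and the example sizes and distances are assumptions chosen for the demonstration.

```python
import math


def visual_angle_deg(object_size, distance):
    """Visual angle, in degrees, subtended by an object of a given size.

    Size and distance must use the same unit (here, meters).
    """
    return math.degrees(2 * math.atan(object_size / (2 * distance)))


# Hypothetical example: the same 1.8 m tall person seen from 2 m and from 20 m.
near = visual_angle_deg(1.8, 2.0)
far = visual_angle_deg(1.8, 20.0)
print(f"near: {near:.1f} degrees, far: {far:.1f} degrees")
```

The same object fills far less of the view at the greater distance, which is the difference the relative size cue relies on.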
Absolute Size
Absolute size uses two or more items being individually placed in the same setting. Pictures which include a ruler are good examples of this approach. But separate relative size judgments against a consistent standard will work too.
Familiar Size
Things we are familiar with can act as cues. Most people know how big a car or an average person is. These are things we come in contact with on a regular basis. We use this knowledge as a basis for making decisions about depth.
2. Linear Perspective
We are greatly impacted by the convergence of parallel lines. Lines that converge toward each other as they approach the horizon are powerful clues about depth and distance. When a vanishing point is implied, the effect is strong.
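One way to see why parallel lines converge in an image is a simple pinhole projection, where image position is the real position divided by distance. The sketch below is an illustration under assumed values: the focal length, rail spacing, and eye height are hypothetical, not measurements from the text.

```python
def project(x, y, z, focal_length=1.0):
    """Project a 3-D point onto an image plane using a pinhole model."""
    return (focal_length * x / z, focal_length * y / z)


# Hypothetical railway track: rails 0.75 m to each side, viewed from 1.6 m up.
rail_half_width = 0.75
eye_height = 1.6

gaps = []
for z in (2, 5, 20, 100):  # distance ahead along the track, in meters
    left = project(-rail_half_width, -eye_height, z)
    right = project(rail_half_width, -eye_height, z)
    gap = right[0] - left[0]
    gaps.append(gap)
    print(f"{z:>4} m ahead: image gap between rails = {gap:.3f}")
```

The rails stay the same distance apart in the world, but their projected gap shrinks toward zero with distance, which is the implied vanishing point.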
3. Interposition (occlusion)
We know from experience that we encounter most objects serially. We walk across the room, reaching one item, then another. Objects occur in order. When we see an image with objects layered on top of each other, we remember our real-world experience and infer depth. Closer objects cut off part of the view of other objects, so objects in an image that cut off another must be closer too.
4. Aerial Perspective
Haze and blueness are part of our experience with the sky. Objects near us are clear, and objects in the distance are hazy.
5. Relative Brightness
Assuming all objects can be clearly seen, brighter objects are closer. We are used to the sun reflecting more off of closer objects. Used in conjunction with size, brightness is a helpful tool. Notice in this picture taken from space that closer (bottom) appears brighter. There is also a haze and a horizon effect present.
6. Shade & Shadow
Shadows give us information about the directionality of light. Something with a shadow below it is interpreted as being convex. It makes the top part look like it is sticking up. Shadows on top indicate a concave or depressed area. It looks like a button has been pressed.
7. Texture Density Gradients
Density is a subtle cue. You know something is giving you a sense of depth but nothing leaps to mind. You can see it when there is a series of mountains or sand dunes going off in the distance. They are all the same distance from each other but the density increases as they progress. Similarly, the paintings of Gustave Caillebotte show a good use of density. The farther away things are, the denser they look.
8. Depth of Field
The bokeh effect in photography is pleasing, and it illustrates a broader concept of blur. We are used to everything being in focus because our visual system adjusts quickly. We look at something up close, mid-distance and far away without having to change lenses or processing equipment. It is seamless.
In our experience, the depth of field only occurs in photographs. As we gain experience in making photographs or watching movies, we learn to interpret blur as a shift in importance. Important things are clear, unimportant things are unfocused.
Secondarily, focus indicates distance. Our eyes don’t do this refocusing trick. But with experience we associate blur with distance. Think of it as a photographic figure-ground effect.
9. Elevation
We use the horizon to help locate objects. The angle changes as we change elevation, so things that are closest to the horizon are thought to be farther away.
10. Motion
Motion parallax
When you move around your environment, the shift in angles gives you clues about the objects and their locations. Shifts in shading give you hints about depth. You can gain some help by bobbing your head, tilting it forward or back, or turning it side to side. But the biggest help is walking around the environment.
As you move, the near points move in the opposite direction, and the far points move in the same direction as you. Consequently, near objects are displaced faster than far ones. Things in the distance don’t seem to move but things close to you change.
You’ll notice when the moon is full and close to the horizon, it both looks bigger and seems to follow you.
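The difference in how fast near and far objects shift can be approximated with a simple ratio: for a stationary object directly to your side, its angular speed across your view is roughly your walking speed divided by its distance. The sketch below is an illustration only. The walking speed and distances are assumed values, and the moon's distance is the commonly quoted average of about 384,400 km.

```python
import math


def angular_speed_deg_per_s(observer_speed, distance):
    """Angular speed of a stationary object directly beside a moving observer.

    Uses the approximation: angular speed (radians/s) = speed / distance.
    """
    return math.degrees(observer_speed / distance)


walking_speed = 1.5  # meters per second (assumed)
scene = {
    "fence post": 2.0,
    "tree": 20.0,
    "hill": 2_000.0,
    "moon": 384_400_000.0,  # average distance in meters
}

for name, distance in scene.items():
    rate = angular_speed_deg_per_s(walking_speed, distance)
    print(f"{name:>10}: {rate:.9f} degrees per second")
```

The fence post sweeps past quickly while the moon's shift is far too small to notice, which is consistent with the impression that it keeps pace with you.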
Moving Objects
In this case, we are standing still and an object is moving. We use all the previous cues but depth isn’t our biggest concern. We are really interested in velocity. That is, our primary concern with moving objects is self-preservation. We are particularly interested in calculating the time of contact (TOC), a sort of estimated time of arrival (ETA) for objects. If a meteor is headed our way, we are less concerned about the change in depth perception than in the likelihood of impact.
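One commonly described way to estimate arrival time without knowing the object's size or distance is to divide its current visual angle by the rate at which that angle is growing. The sketch below illustrates that idea under stated assumptions: the ball size, speed, and sampling interval are hypothetical, and this is not presented as the visual system's actual computation.

```python
import math


def visual_angle(size, distance):
    """Visual angle in radians for an object of a given size and distance."""
    return 2 * math.atan(size / (2 * distance))


def time_to_contact(angle_before, angle_now, dt):
    """Estimate seconds until contact as angle / rate of angular expansion."""
    expansion_rate = (angle_now - angle_before) / dt
    if expansion_rate <= 0:
        return math.inf  # the image is not growing, so nothing is approaching
    return angle_now / expansion_rate


# Hypothetical ball: 0.22 m wide, 20 m away, approaching at 10 m/s.
size, speed, dt = 0.22, 10.0, 0.01
distance_before = 20.0
distance_now = distance_before - speed * dt

estimate = time_to_contact(
    visual_angle(size, distance_before),
    visual_angle(size, distance_now),
    dt,
)
actual = distance_now / speed
print(f"estimated: {estimate:.2f} s, actual: {actual:.2f} s")
```

The estimate uses only how the image changes over time, so it works even when the true size and distance of the object are unknown.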
Combinations
Most situations have a combination of monocular cues present at the same time. We often get horizons, bright near portions, and haze in the distance. Depth perception isn’t all or none; it is all the available cues all the time.
Kinetic Cues
In addition to changing angles, walking around an object gives you kinetic cues. You can feel the slope of the green as you walk toward your golf ball. You can get a “feel” for the auditorium when you walk the platform. You judge the depth as you walk the top of a cliff. These changes in muscles report proprioceptive stimuli to the brain. We use body information to tell us about our environment.
Binocular Cues
Binocular cues require two eyes. There are not many of them but they are very powerful. When information is presented to two eyes at once, each eye sees something different. The distance between the eyes determines how disparate the images are.
Because the eyes are not in the identical place, each retina responds to different angles of light. This retinal disparity fools the visual system into seeing 3-D images.
Having two eyes looking at a scene doesn’t lower the number of monocular cues present. Each eye sees relative size, vanishing points and occlusion. They just see them from a different angle. Using two eyes is a bit like walking around an object without having to move. Better.
Monocular and kinetic cues provide only relative distance information. Binocular cues provide absolute distance. Retinal disparity is our key tool. Since each eye gets a slightly different view of the world, we must match up the images in the two eyes. Crossing at the optic chiasm results in the left sides of each eye being matched up and the right sides of each eye being matched up. The occipital lobes share information, giving you detailed stereo images. Adding in kinetic and monocular cues, we have a very complete picture of our environment.
For distant images, the eyes have one setting. The pupils are relaxed, there is no squinting, and there is only one focus. At 20, 200, or 2,000 feet, it is all the same. The eye is a set-focus camera. The cornea provides the focus but is not adjustable. We are a large pinhole camera on legs.
The problem arises at about 20 feet and under. To see close up we have two tools: convergence and accommodation. As the name suggests, convergence is pointing the eyes toward each other, converging in the middle. Convergence is our built-in triangulation system. We become increasingly cross-eyed as objects get closer to us.
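The triangulation idea can be sketched geometrically: the two eyes and the object form a triangle, so the angle between the two lines of sight depends on the distance. This is a minimal sketch, assuming an eye separation of 63 mm (a typical adult value, not a figure from the text) and an object straight ahead. The function names are hypothetical.

```python
import math

EYE_SEPARATION_M = 0.063  # assumed distance between the eyes, in meters


def convergence_angle_deg(distance_m, eye_separation_m=EYE_SEPARATION_M):
    """Angle between the two lines of sight for an object straight ahead."""
    return math.degrees(2 * math.atan(eye_separation_m / (2 * distance_m)))


def distance_from_convergence(angle_deg, eye_separation_m=EYE_SEPARATION_M):
    """Invert the triangle: recover distance from the convergence angle."""
    return eye_separation_m / (2 * math.tan(math.radians(angle_deg) / 2))


for distance in (0.25, 1.0, 6.1, 60.0):  # 6.1 m is roughly 20 feet
    angle = convergence_angle_deg(distance)
    print(f"{distance:>5} m: convergence angle = {angle:.2f} degrees")
```

Under these assumptions the angle changes a great deal at arm's length but very little beyond about 20 feet, so convergence carries the most information for nearby objects.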
The other 20-feet process, accommodation, is a change in the shape of the lens. For looking at the horizon, we relax our eyes, causing the lens to be flat. When things are up close, we adjust the focus by bending the lens. The cornea provides the majority of the focus but by bending the lens we get another 1.3 change in focus. Think of it as fine tuning the image.
Mach: Economy principle
Ernst Mach (1838-1916) proposed a minimum principle for perception. The idea is that we organize an image to keep differences to a minimum (more stable world). Although Mach’s proposal is general, his research was among the first in what we would now call edge detection.
We have a good optical system but we have a great processing system. We augment the optics with preprocessing and with computing power. We use horizontal cells in the preprocessing phase. When a photoreceptor fires, the horizontal cells attached to it tell the neighboring photoreceptors not to fire. This enhances the edges. We get a sharper edge than we would get from optics alone. Later on, the brain processes the image even further and compares the result to images it has on file, aiding in object recognition.
The process of telling neighboring cells not to fire is called lateral inhibition. The inhibition of firing through horizontal cells moves laterally across the retina. This edge enhancement makes contours stand out.
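The edge-enhancing effect can be illustrated with a toy one-dimensional model: each cell's response is its own input minus a fraction of its neighbors' inputs. This is a simplified sketch, not a model of real retinal circuitry. The inhibition strength and the stimulus values are assumptions chosen for the example.

```python
def lateral_inhibition(signal, strength=0.2):
    """Subtract a fraction of each cell's two neighbors from its own input.

    Cells at the ends of the row reuse their own value for the missing neighbor.
    """
    last = len(signal) - 1
    response = []
    for i, value in enumerate(signal):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, last)]
        response.append(value - strength * (left + right))
    return response


# A dim region next to a bright region: a single step edge in the middle.
stimulus = [1.0] * 5 + [3.0] * 5
response = lateral_inhibition(stimulus)

for light, output in zip(stimulus, response):
    print(f"input {light:.1f} -> response {output:.2f}")
```

The dim cell beside the edge responds less than the other dim cells, and the bright cell beside the edge responds more than the other bright cells. The step is exaggerated at the boundary, which is the contour sharpening described above.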
There is a certain amount of information processing before the signals from the nerves enter the optic nerve and the brain. The system works well. It highlights boundaries and increases contrast. Notice how the gray dots appear darker when they touch darker dots.
But the system breaks down when sensors are fatigued from repeated firing. Illusory spots appear between shapes that are close together but not touching. The illusory spots are due to lateral inhibition gone astray.
If you stare at the bottom image, particularly when your eyes are tired, the point where the boxes meet will appear to be a small square of a different shade. It is just white but as your eyes fatigue, it is harder and harder to maintain edge detection.
Direct Perception
James Gibson (1904-1979) proposed that much of perception doesn’t need cognitive interpretation. His general complaint was that there is too much emphasis on cognitive interpretation of visual data. It is unnecessary. We can use direct perception.
There is no need to infer cognitive processes. We can directly perceive the qualities of a distal stimulus. It has properties we can detect without relying on inference or memory.
Similarly, we can use sensory information we gather without needing to think about it. The pattern across the entire retinal image gives direct awareness of depth. Cues might be present but we can get along without them.
His specific complaint was that too much emphasis was placed on illusions. We should be studying real perception. Illusions are artificially constrained; they lack “ecological validity.”
Illusions
Illusions are like magic tricks. They rarely occur in real life but they are lots of fun. The moon illusion of the moon looking bigger the closer it is to the horizon occurs in real life but like artificial illusions the magic happens in your head.
Unlike mirages, where the sensory system is working fine but the input is extremely unusual, illusions take advantage of gaps and errors in our processing. They show where our senses don’t function well.
Illusions occur when stimuli are ambiguous and information is incompletely specified. Both of these are key ingredients in a good magic show. Stage magicians don’t let you see their props from every angle, and you’re not allowed to walk around. They want you to see only the things they want you to see. Close-up magicians fool you by sleight of hand but also by lying to you. Often the trick is done, from the magician’s point of view, before you are asked to pick a card, any card.
There are no good tools for investigating illusions. Investigating them with some information constrained limits the scientific method. Removing the constraints ruins the illusion. In the real world, our investigatory approach is to reach out and touch things. Not very scientific but very practical.
In some sense, there is nothing more Gestaltic than illusions. We mostly just appreciate them.
Muller-Lyer illusion
Equal length lines don’t look equal.
Vertical-Horizontal Illusion
Shows how we more accurately estimate horizontal lines than vertical ones. Again, lines of the same length.
Ames’ Window
Adelbert Ames (1880-1955) used a trapezoidal cardboard shape which he hung on a string. When you watch the trapezoid spin, it looks like a rectangle spinning in a smaller arc. We have a lot of experience with rectangles, and can predict what a spinning rectangle would look like. We have very little experience in real life with trapezoids, so we assume what we are seeing is something different. We normalize the situation using our best experience.
Cutaneous Rabbit
This is a tactile illusion I can never get to work. See how well you do.
Discovered by Geldard & Sherrick, the effect is the result of sequential taps. Give a couple of rapid taps near your wrist, followed by a couple of rapid taps closer to the elbow. You should feel a hop.
Did it work?
Checkerboard illusion
Edward Adelson (1952-) demonstrated an illusion of color constancy. Two squares of identical brightness look different because of their context.
Photo by Robert Bye on Unsplash