A Personal View of
James Gibson’s Approach to Perception
Accepted 20 August 2008
James Gibson, theory, heuristic
This commentary argues that James Gibson’s contribution to the field of perception can be best understood as a set of heuristics that directs researchers to describe the stimulus information that makes it possible for animals to function effectively in the environment in which they evolved. He argued that the description of stimulation provided by traditional physics could not account for veridical perception and needed to be replaced by a new, biologically relevant ecological description of the input for perceptual functioning. (Cog Crit 1: 31-41, 2008)
At the outset I want to say that James Gibson was one of the most important thinkers, if not the most creative worker, in the field of perception in the last half of the 20th century. His views about the nature of stimulus information were far in advance of the field of sensation/perception in his own time, and many of his positions as to what should be the central topics of study in our evolving field have been accepted as obvious by many in 2008.
This was not the case in the 1960s when sensory psychophysics was the dominant approach to perception. It was widely assumed that we could build from the bottom up, starting with understanding how photons were caught by the receptors and changed the chemical structure of the retinal cells, leading eventually to their depolarization. Once we understood how the lowest level of the sensory machinery worked, so it was thought, we could work our way from describing the psychophysics of sensitivity to energy to understand the more complex processes that led to perception. Sensations of light and dark and color were the foundation on which perceptions were inferred by higher cognitive functions.
Every idea in the history of science has precursors that provide notions that are transformed or rejected by new formulations. Gibson’s position was strongly influenced by the tradition of the Gestalt psychologists, who rejected the reductionism of their era and argued, for example, that relationships (ratios between luminance values rather than isolated sensations of luminance) made lightness constancy possible. Kurt Koffka, one of the leaders of the Gestalt movement in Europe, had come to Smith College in 1925, four years before Gibson joined the faculty as an assistant professor. That Koffka’s ideas had a substantial impact on Gibson’s thinking became very clear to me in the fall of 1966 when the students in his graduate seminar read and discussed Koffka’s book, Principles of Gestalt Psychology (1935). Koffka argued that understanding how the constancies (size, shape, lightness) are accomplished is the central goal of perceptual research. How is it, he asked, that we are able to perceive “invariant” properties of the world, given the constant change in sensory input? While his answer was that it lay in the way the brain functions, Gibson claimed that if we looked, we would find invariant properties in the stimulation itself.
Much has been written about James Gibson’s ideas, and I do not wish to present myself as an expert Gibson scholar. That said, I will offer my understanding of the positions and arguments that he put forward in the late 1960s. In the years that followed, his point of view did not change in any substantial way, although, in his spirit, it is only appropriate for this commentary to serve as the beginning of a dialogue.
My comments are based on my personal discussions with Gibson, my mentor, as they occurred in his seminar that convened each semester, in the departmental office at the Ithaca airport laboratory and, occasionally, at gatherings in the early hours of the morning at his home. The period of these interactions began in the fall of 1964 when I enrolled in the graduate school at Cornell and continued after I graduated in the summer of 1968.
The spirit of James Gibson deserves some attention. He was a charming man who loved animated debate and took great joy in using his intellect to find arguments that would confound those who would engage with him. As a teacher, he was a difficult man to agree with. Unlike most academics, he showed little interest in the comments of a student who tried to present evidence or new work that supported his position. In contrast, when a student introduced data that conflicted with his point of view, he stopped opening his mail (as he was prone to do until a conversation caught his interest) and in short order played out a series of moves refuting the proposition of his opponent. For example, when a graduate student suggested that the Ames trapezoidal window (which creates the illusion of a rectangular object slanted in depth) was evidence that perception was based on assumptions, Gibson acted like a chess master who enjoyed showing how the other player’s position would lead to that player’s checkmate in only two moves. There was a playfulness about these interactions, as well as a readiness to treat anyone who wanted to play with him with respect. His interest lay always in the ideas and never in denigrating those who tried to challenge his claims.
Gibson’s approach to perception was more a heuristic than a theory of perception. My perspective is that many of Gibson’s positions were heuristic. He argued about what we should be studying and how we, as scientists of perception, should spend our time. He felt that too few of us were engaged in the activity that would make for progress, i.e., discovering the relational, higher-order properties of stimulation that make possible the adaptive functioning of animals. And he had strong opinions about popular approaches that he felt would not provide explanatory profit. His claims were not theoretical in the sense that we use the term today. That is, Gibson did not provide the kind of story that we normally take to be an explanation of a phenomenon.
It will be clear to anyone who has read Gibson’s books that I am making a very strange and perhaps bizarre argument. Over and over again, he put forward what he claimed to be a radical theory of perception. Previous theory, based on the processing of sensations detected by receptors and reprocessed by successively more cognitive mechanisms, eventually generating perceptions of the external world, was, according to him, fundamentally flawed. What I want to claim is that Gibson was not putting forward a theory of visual perception at all. What he was telling us is that we could make more progress as scientists, at this point in history, by using our time and talents to describe the information in the stimulation that specifies the properties of the environment than by attempting to understand the neurological, sensory, perceptual and cognitive processes that take place inside the head. He also suggested that in describing the stimulus information we should devote ourselves to discovering not elementary sensory properties but the higher-order amodal properties of stimulation that are available over space and time.
Gibson called for the creation of a new kind of physics, ecological optics, to describe stimulus information. This new optics involves the patterns of energy that are appropriate for understanding how evolving organisms function. It explains how animals locomote, find food, avoid getting eaten, and engage in the activities that produce the next generation. Coming up with a correct account of the problems that the perceptual system tries to solve is one of the most difficult tasks for scientists in this field. Getting this functional level of theory right is all-important if we are going to produce a science that successfully connects biology to other levels of explanation.
Gibson’s claim was that when one had properly described the stimulus information that perceptual systems had evolved to pick up, understanding how the brain “resonated” to that information would prove simple. He maintained that although such information might well be mathematically difficult to describe, the brain would respond to the information in a biologically simple fashion. Gibson believed that when scientists did come to understand the internal processes that make perception possible, these mechanisms would be very different from those previously proposed. When we were able to characterize the internal mechanisms that pick up information, he said, they would be direct processes that resonated to the information in the stimulation, like a tuning fork that resonates when the particular frequency to which it is tuned is present in the air. It was his perspective that no intermediate steps of reasoning would be required and the process would not involve anything like a sequence of propositions following one another in a logical fashion.
Gibson was pessimistic, as well, about the utility of the neurophysiological accounts of vision that were offered in the 1960s. He thought that the hierarchical account of form perception, based on the work of Hubel and Wiesel, was on the wrong track and unlikely to lead anywhere. Although he did not direct criticism to the work that led them to win the Nobel Prize, he was simply uninterested in it. At that time many, including this author, believed that the elements of vision had been discovered by studies that placed microelectrodes in the visual cortex of the cat. We also seemed to understand how the elements were added together in the cortex to give us the molecules of form. Retinal ganglion cells detecting contrast provided the input for simple cortical cells that detected the orientation of contours. The simple cells were combined to create complex cells and then hyper-complex cells. By combining distinctive features, we could perceive the shapes of letters and recognize other objects. According to Gibson, our widely accepted approach was fundamentally flawed, “muddled,” as he would say.
Theory or Heuristic
What do we mean by “theory” at the present period in the history of the cognitive sciences? The label for part of our field that has come into vogue, cognitive-neuroscience, suggests an answer. The word implies an explanation that relates the phenomenon to be explained to a set of facts on another more fundamental level. Some set of facts about sensation, perception, or cognition is tied to and explained by a set of facts about the functioning of the neurons or, more recently, of neural networks. One does not need to claim that the functioning at the higher level is “nothing but” the processes that go on at the more fundamental level because it is often the case that new properties emerge at the higher level. An example of such a theory that was popular when Gibson was a graduate student was the Gestalt account of perception. Perceptual phenomena, such as the perception of form, were the result of field-forces in the brain that produced the simplest and most compact percept. Thus, brain processes were given as explanation of perceptual facts.
In contrast to a theory that provides an explanation, a heuristic is a technique or method that directs one’s attention toward discovery. What Gibson was doing was providing such heuristics to those who would listen. His advice was that we should direct our efforts to describing stimulus information that had not yet been discovered. He had shown us that this was possible by making many discoveries of new stimulus information throughout his own career. The texture gradient discussed in his 1950 book, Perception of the Visual World, described a new property of the visual field that made size constancy possible. Regardless of their distance, objects of the same size would cover approximately the same number of texture elements when the objects rested on the ground. Similarly, the description of the optical information for impending collision, or looming, was a perfect example of the scientific riches that could be gained if we would only make an effort to describe the properties of the stimulation that guide perception and action.
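The texture-gradient relation can be illustrated with a small numerical sketch. It is not taken from Gibson's book: the visual-angle formula is ordinary projective geometry, and the function names and numbers (a 0.5 m object resting on ground covered with 0.1 m texture elements) are assumptions chosen only for illustration. The sketch compares the visual angle of the object with that of a texture element at the same distance.

```python
import math

def visual_angle_deg(size_m, distance_m):
    """Visual angle subtended by a frontal extent of size_m at distance_m."""
    return math.degrees(2.0 * math.atan(size_m / (2.0 * distance_m)))

def texture_elements_covered(object_width_m, element_width_m, distance_m):
    """Ratio of the object's visual angle to that of one texture element
    located at the same distance (where the object meets the ground)."""
    return (visual_angle_deg(object_width_m, distance_m)
            / visual_angle_deg(element_width_m, distance_m))

OBJECT_WIDTH = 0.5   # metres (hypothetical value)
ELEMENT_WIDTH = 0.1  # metres (hypothetical value)

for d in (2.0, 5.0, 20.0):
    angle = visual_angle_deg(OBJECT_WIDTH, d)
    covered = texture_elements_covered(OBJECT_WIDTH, ELEMENT_WIDTH, d)
    print(f"distance {d:5.1f} m  angle {angle:6.2f} deg  elements covered {covered:4.2f}")
```

Under these assumptions the visual angle of the object shrinks by roughly a factor of ten between 2 m and 20 m, while the number of texture elements it covers stays close to five. The ratio, not the angle, is the candidate invariant.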
The effects of Gibson’s heuristics on the field
In many ways the impact of Gibson’s arguments has been enormous. The International Society for Ecological Psychology and the journal it sponsors, Ecological Psychology, are both in their 20th year. The presentations of the members of the organization and the articles in the journal present studies of dynamic event perception, perception of affordances, relationships between properties of animals and properties of the environment, and a variety of topics that would probably not have been explored with the same intensity had Gibson not been such a forceful advocate.
On the other hand, Gibson’s point of view remains a minority position in the field of perception. The modern version of Helmholtz’s unconscious inference theory, Bayesian models of how probabilistic prior information is weighted by incoming cues, is growing in popularity. Researchers have not run out of ways to make up models of processes that take place in the head. Their models also provide heuristic guidance. For example, researchers such as Marc Ernst and Marty Banks (2002) have investigated how variability in stimulation changes the weight given to that information. There continues to be a vast effort to understand functions of the many visual projection areas that have been discovered and how the connections between these areas can be tied to perceptual phenomena. Clearly, Gibson’s pessimism about the utility of this work is not shared outside the ecological community.
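The kind of weighting studied in this cue-combination literature can be sketched in a few lines. What follows is a minimal illustration of inverse-variance (maximum-likelihood) weighting of independent cues, which is the usual textbook form of such models; the function name and all numbers are hypothetical and are not data from Ernst and Banks (2002).

```python
def combine_cues(estimates, variances):
    """Inverse-variance weighted average of independent cue estimates.

    Returns (combined_estimate, combined_variance, weights).
    """
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    combined = sum(w * e for w, e in zip(weights, estimates))
    return combined, 1.0 / total, weights

# Hypothetical size estimates (mm) from vision and from touch.
visual_mm, haptic_mm = 55.0, 50.0

# Reliable vision (low variance): the visual cue dominates.
est_clear, var_clear, w_clear = combine_cues([visual_mm, haptic_mm], [1.0, 4.0])

# Noisy vision (high variance): the weight shifts toward touch.
est_noisy, var_noisy, w_noisy = combine_cues([visual_mm, haptic_mm], [16.0, 4.0])

print("clear vision:", w_clear, est_clear, var_clear)
print("noisy vision:", w_noisy, est_noisy, var_noisy)
```

In this sketch the combined estimate moves toward whichever cue is less variable, and the combined variance is never larger than that of the better single cue.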
If Gibson’s heuristics are correct, one might well inquire, why hasn’t a larger group of scientists devoted themselves to discovering invariant information? And why is there no long list of newly discovered properties of stimulation that allow us to perceive the objects, layouts and events that surround us? It is not the case that no one has tried. Over the last 30 years computer-vision researchers have been strongly motivated and highly paid to build general-purpose computer systems that would perceive the external world and direct flexible robots to do our work without pay or complaint. They tried but found the general-purpose problem very difficult, and they have moved on, with some success, to building single-purpose visual guidance systems that control robots in highly constrained factory situations. One of the problems computer vision engineers tried to investigate was the perception of shape from shading (Barrow and Tenenbaum 1978). Gibson had also explored the problem in a very similar manner. In his discussion of the structuring of ambient light (Gibson 1966, pp 208-216), he described how reflectance, illumination, and surface orientation determine the luminance level projected to an eye from a scene. The process is totally determined by optics, and numerous computer programs can generate realistic images of scenes when the three variables (lighting, orientation of surfaces, and the reflectance of the surfaces) are specified as inputs. But one cannot run optics in reverse. Gibson notes that any combination of these three factors can generate a particular luminance level or a whole scene, so how is it possible to distinguish slant from lightness or illumination? A problem with three unknowns on one side of the equation and only one measured value on the other cannot be solved.
Barrow and Tenenbaum (1978) argue that if you assume that the surface does not vary in reflectance, i.e., reflects light in all directions equally, and the illumination that is reaching the surface does not vary in brightness, one can calculate the surface orientation. Their approach is to provide constraints or assumptions that make the problem solvable.
Gibson’s approach to the problem is different. He does not want us to try to understand the assumptions that underlie a process that transforms the percept of a bump into a dent when a shape-from-shading photograph is rotated. Instead he directs us to a different problem. He states that there “has to be special information in the array to specify a slant as against a color, or either against a shadow” (Gibson 1966, p. 215). He tells us to describe the differences in the way light is structured by variations in surface orientation that create shading and the way that shadows cast on a surface are generated. He promises us that by working hard on the problem we will come up with an answer.
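The asymmetry between running optics forward and running it in reverse can be made concrete with a deliberately simplified sketch. It assumes a matte surface whose luminance is the product of reflectance, illumination, and the cosine of the slant relative to the light; this simplification, the function names, and the numbers are assumptions for illustration and are not the algorithm of Barrow and Tenenbaum (1978).

```python
import math

def luminance(reflectance, illumination, slant_deg):
    """Forward optics under a simplified matte-surface assumption:
    luminance = reflectance * illumination * cos(slant)."""
    return reflectance * illumination * math.cos(math.radians(slant_deg))

def slant_from_luminance(lum, reflectance, illumination):
    """Inverse problem, solvable here only because reflectance and
    illumination are assumed known and constant across the surface."""
    return math.degrees(math.acos(lum / (reflectance * illumination)))

# Three hypothetical scenes that send the same luminance to the eye.
scenes = [
    (0.8, 100.0, 60.0),  # light surface, steeply slanted
    (0.4, 100.0, 0.0),   # darker surface facing the light
    (0.8, 50.0, 0.0),    # light surface under dimmer illumination
]
values = [luminance(rho, e, slant) for rho, e, slant in scenes]
for scene, value in zip(scenes, values):
    print(scene, round(value, 6))

# With the two constraints in place, one measurement fixes the slant.
recovered = slant_from_luminance(40.0, reflectance=0.8, illumination=100.0)
print("recovered slant (deg):", round(recovered, 3))
```

The forward function is trivial to evaluate, and the three scenes are indistinguishable from a single luminance value. The inverse becomes a one-line calculation only after two of the three unknowns have been fixed by assumption.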
Perceiving the nature of the material that makes up a surface could make it possible to rule out the possibility that some of the variations in luminance are caused by illumination or surface orientation, but how are we able to know from an image whether what we are seeing is gray paper, plywood or granite? This is a fine problem for scientists to devote themselves to, and Gibson’s argument that there has to be special information available points researchers in a certain direction but does not give these researchers much help getting started on the problem of describing the information. The reason there has not been more work describing stimulus information is that it is very difficult to discover something that is genuinely new.
Understanding the problems that are solved by the perceptual system
Although many researchers cling to the view that traditional physical analysis provides the most useful description of stimulation, Gibson’s heuristic approach has begun to influence others. His ecological approach tells us to first define the problems that the perceptual system is trying to solve. Having a clear understanding of the problem, one can go on to discover the stimulus information that makes solving the problem possible. The ecological approach argues that the problems have to do with how animals function effectively in the environment.
For example, animals perceive that they are approaching an object, or more importantly that an object is approaching them. This information indicates that they must act to protect themselves from collision. The optical information for collision is symmetrical expansion that begins slowly when the object is distant and increases in velocity as the object approaches. When the object projects a visual angle of about 30 degrees the rate of expansion increases greatly and ends in an explosive fashion due to the hyperbolic nature of the geometry. The problem is how the visual system processes this pattern of change, a flow field with vectors that vary over time from the center of expansion in velocity and direction. From a developmental point of view, one can ask how sensitivity to this kind of information changes with maturation and experience. The reason for interest in sensitivity to this kind of display is the belief that it presents functionally important information.
The traditional approach grounds itself not in biology and the problems that animals must solve but in the notion that traditional physics provides us the simplest and most useful understanding of the input for vision. From this viewpoint, the fundamental problem is the perception of the kinds of optical motion defined by the mathematician, not the biologist. All rigid motions in the environment project retinal motions that can be made up of rotation, translation, dilation (expansion and contraction), and shear. The constant velocity version of these motions is assumed to be the place to begin work, and researchers are busy studying infants' sensitivity to constant velocity expansion patterns even though these displays specify an object that slows in velocity as it approaches, thereby specifying a non-dangerous event. Human infants do not blink to this display in the first month of life as they do when an event involves an increasing rate of optical expansion and specifies impending collision.
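The accelerating character of optical expansion follows from simple geometry, and a short sketch can make it visible. The code below assumes an object of fixed size approaching the eye head-on at constant speed; the function names and the values (a 0.5 m object starting 10 m away and moving at 2 m/s) are hypothetical choices for illustration.

```python
import math

def visual_angle_deg(size_m, distance_m):
    """Visual angle subtended by an object of size_m at distance_m."""
    return math.degrees(2.0 * math.atan(size_m / (2.0 * distance_m)))

def expansion_rate_deg_per_s(size_m, distance_m, speed_m_s):
    """Rate of change of visual angle for a constant approach speed.

    From theta = 2*atan(S / (2*D)) and dD/dt = -v it follows that
    d(theta)/dt = S*v / (D**2 + S**2 / 4).
    """
    return math.degrees(size_m * speed_m_s / (distance_m ** 2 + size_m ** 2 / 4.0))

SIZE, SPEED, START = 0.5, 2.0, 10.0  # metres, m/s, metres (hypothetical)

for t in (0.0, 2.0, 4.0, 4.5, 4.9):
    d = START - SPEED * t
    angle = visual_angle_deg(SIZE, d)
    rate = expansion_rate_deg_per_s(SIZE, d, SPEED)
    print(f"t={t:3.1f} s  distance={d:5.2f} m  angle={angle:6.2f} deg  rate={rate:8.2f} deg/s")
```

Because the rate grows roughly as the inverse square of the remaining distance, most of the expansion is concentrated in the last fraction of a second before contact, which is the "explosive" ending described above.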
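The claim that a constant-velocity expansion pattern specifies a decelerating object can be checked by inverting the projective geometry. The sketch below assumes an object of fixed size viewed head-on; the function names and the values (a 0.5 m object expanding at a steady 5 degrees per second) are hypothetical and serve only to illustrate the argument.

```python
import math

def distance_for_angle(size_m, angle_deg):
    """Distance at which an object of size_m subtends angle_deg."""
    return size_m / (2.0 * math.tan(math.radians(angle_deg) / 2.0))

def approach_speed(size_m, angle_deg, rate_deg_per_s):
    """Approach speed implied by a given expansion rate.

    From D = S / (2*tan(theta/2)) it follows that
    v = -dD/dt = rate * S / (4 * sin(theta/2)**2), with rate in rad/s.
    """
    rate_rad = math.radians(rate_deg_per_s)
    half_angle = math.radians(angle_deg) / 2.0
    return rate_rad * size_m / (4.0 * math.sin(half_angle) ** 2)

SIZE, RATE = 0.5, 5.0            # metres, degrees per second (hypothetical)
ANGLES = (5.0, 10.0, 20.0, 40.0, 80.0)

speeds = [approach_speed(SIZE, a, RATE) for a in ANGLES]
for a, v in zip(ANGLES, speeds):
    d = distance_for_angle(SIZE, a)
    print(f"angle={a:5.1f} deg  distance={d:5.2f} m  implied speed={v:6.3f} m/s")
```

Holding the expansion rate fixed, the implied approach speed falls steadily as the image grows, so the display corresponds to an object that brakes as it nears the observer rather than one on a collision course.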
Gibson was primarily concerned with describing the stimulus information that makes it possible for humans to succeed in accomplishing tasks that we share with other animals. These tasks involve moving. What is the information that makes possible safe locomotion in an environment with cliffs one could fall from, and obstacles one could walk into? There are a large number of similar tasks to be considered and in the years ahead they will be explored. But there are other tasks humans solve with great skill that cannot be understood by describing the stimulus information because they are fundamentally representational in nature. When we recognize an object, person or place as familiar we can access relevant prior experience. That prior experience is not in the light to the eye. We can recall the name of an object, its properties and consider alternative plans for what to do with it. Some of the cognitive processes that humans use so effectively, and that made it possible for Gibson to argue for his ecological perspective, are outside the range of Gibson’s heuristics.
Gibson took great joy in confounding his audience by saying things that were novel and stating them in ways that were difficult to grasp. The message was often that others’ thinking about perception was based on unexamined and wrong assumptions about how the process must work. He argued that sensations were irrelevant to perception, leaving many new listeners with little understanding of what he was trying to say, except that they were engaged in an activity that would be part of the discipline of physiology but not psychology, if he had his way. I think Gibson was using shock and awe to get people to pay attention to what he was saying. His claim was, in part, that one did not have to experience sensations to arrive at perceptions, and except for a rare sense-datum philosopher, few would disagree with him. His heuristic claim was that if one wanted to understand, for example, how we are able to perceive the color of the material that makes up the environment, across the variations in lighting that occur, we would not make progress by understanding the functioning of the cones.
The overriding value of his perspective remains clear and strong. We need to understand the tasks that perceptual processes fulfill and the problems these processes solve. To understand how this is accomplished we must describe the input, the stimulus information that makes perception possible. We need to describe the consistent properties of the world that make solutions possible.
In the opening chapters of his book, Vision, David Marr (1982) considered the many levels of analysis that were ignored by many of his colleagues in computer vision who concerned themselves with how processes could be implemented in computer hardware. He gave Gibson credit for directing our attention to understanding the problems that must be solved by the visual system and the information available in stimulation that makes the solution possible. But Marr also pointed out two other levels of explanation that eventually must be included to provide a full understanding of vision. Eventually we will need an effective computational account of how stimulus information is processed and an understanding of how the computations are implemented in neurophysiology or computer hardware. Fundamentally, I agree with Gibson and Marr that unless we understand the problems that vision solves and how information makes solutions possible, it will be difficult to make progress on either the computational or the implementation level. The task of properly describing stimulus information is to a large degree ahead of us, but Gibson pointed us in the right direction and gave us a good start.
Barrow HG, Tenenbaum JM (1978) Recovering intrinsic scene characteristics from images. In: Hanson AR, Riseman EM (eds) Computer vision systems. Academic Press, New York
Ernst MO, Banks MS (2002) Humans integrate visual and haptic information in a statistically optimal fashion. Nature 415(6870):429-433
Gibson JJ (1950) Perception of the visual world. Houghton Mifflin Company, Boston
Gibson JJ (1966) The senses considered as perceptual systems. Houghton Mifflin Company, Boston
Koffka K (1935) Principles of gestalt psychology. Harcourt, Brace & World, Inc, New York
Marr D (1982) Vision. WH Freeman and Company, San Francisco