Tuesday 14 April 2020

The eyes have it





Computer vision: The Summer Vision Project


Why have humans found it so hard to make it possible for computers and robots to see? Most versions of the story of computer vision begin at MIT in July 1966. There, Seymour Papert set a summer project for ten of his students in the MIT Artificial Intelligence group. Completely underestimating the difficulties involved, Papert described it as:
“… an attempt to use our summer workers effectively in the construction of a significant part of a visual system.” 
He expected them to have come up with a simple form of computer vision by the end of August, by which time the computer would be able to identify objects in an image, such as:
“… complex surfaces and backgrounds, e.g. cigarette packs with writing and bands of different colour, or a cylindrical battery, then to extend the class of objects to objects like tools, cups, etc” [1].
As we now know, this was not accomplished by August 1966. In fact, work on computer vision continues to this day, as we discover just how complicated the process of vision really is. The early AI scientists simply did not understand that, however much they knew about the mechanisms of vision, they were completely naïve about the process.*

I continue to find it extremely helpful to explore the differences between how we teach computers to see, and how we as humans learn to see. Very often these differences illustrate profound differences between machine and human intelligences.

So how do we see?

Recently I typed “How do we see?” into my browser. Top of the list of recommended websites was the American website of The National Eye Institute.

This is what it said:
“All the different parts of your eyes work together to help you see. First, light passes through the cornea (the clear front layer of the eye). The cornea is shaped like a dome and bends light to help the eye focus. Some of this light enters the eye through an opening called the pupil (PYOO-pul). The iris (the colored part of the eye) controls how much light the pupil lets in. Next, light passes through the lens (a clear inner part of the eye). The lens works together with the cornea to focus light correctly on the retina. When light hits the retina (a light-sensitive layer of tissue at the back of the eye), special cells called photoreceptors turn the light into electrical signals.
“These electrical signals travel from the retina through the optic nerve to the brain. Then the brain turns the signals into the images you see.”** [2]
In addition to this text, there was a diagram. It was just like the ones I had seen and learnt to copy in my school days. Any computer today could learn how to copy this diagram. Even I as a silly secondary school pupil could do it.
The human eye
But what did the diagram teach me? It taught me that if I learnt to reproduce it accurately, my teachers and examiners would apparently believe that I understood how we see.*** But of course that was not the case at all. The diagram glossed over the very bit I was interested in then, and that I am still interested in today. When the description ends with “Then the brain turns the signals into the images you see”, my response has always been, “Yes, but how does my brain turn these signals into images?”

As a child I tried to imagine whether deep within my brain there could be another, smaller, eye that looked at the data sent back there by my real eye. But how did the smaller eye see? Did it send the data to another, still smaller eye behind it? And so on ad infinitum?


Today I know that in reality the description and the diagram show only how the eye collects a certain kind of data from outside. They say nothing, absolutely nothing, about how we see, how we make sense of the data.

No, that infinite recursion wasn't going to work as an explanation. And since school and college failed to come up with a better explanation, I have spent far too many hours thinking about this question: how do we see?

The evolution of the eye

Light and Dark

I began by studying how the eye evolved in nature, and how it became progressively more refined as a sense organ, in the hope that I might find clues about the evolution of the process of seeing.

So let us start at the start:






Euglena are primitive single-celled organisms that live in water. They are so primitive that biologists place them somewhere between animal and vegetable. Like plants, they make their own food by photosynthesis using light. You will see from the picture above that the organism has a red spot. This is what biologists call an eyespot, a light-sensitive structure that detects whether light is falling on it or not. The Euglena use this basic information to change their position to face the light.




This, I would argue, has nothing to do with seeing. It is simply collecting the most basic data from the world in which the Euglena live: formless light or dark.




The movement of shadows



A step up from Euglena are Planaria. These freshwater flatworms don’t have eyespots: instead they have two cup-shaped indentations on their bodies, lined with pigmented light-sensitive cells. Where the red eyespot on Euglena can only detect whether there is light or not, the Planaria's cup-shaped indentations give the creature rather more information. Planaria can tell more than whether it is light or dark. If light is coming in at an angle, it will fall on the cells on one side of each indentation rather than on the cells on the other side. This means that the Planaria can tell the direction of the light source. If a moving shadow interrupts the light, the opposite will happen: light will be blocked from falling on the cells on one side of each indentation before it is blocked from the cells on the other side. So Planaria are not only sensitive to the presence of light and to the direction the light is coming from but, most significantly, they can also tell the direction of travel of a shadow (a possible predator) that interrupts the light.
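The two inferences described above can be put into a few lines of Python. This is only a toy sketch under my own simplifying assumptions: each indentation is reduced to two patches of cells, a "left" and a "right" one, and the function names, brightness scale and tolerance are invented for illustration. It is not a model of real planarian physiology.

```python
def light_side(left_patch: float, right_patch: float, tolerance: float = 0.05) -> str:
    """Report which patch of cells is receiving more light.

    Brightness values run from 0.0 (dark) to 1.0 (fully lit); the scale
    and the tolerance are illustrative assumptions.
    """
    if max(left_patch, right_patch) < tolerance:
        return "dark"
    if abs(left_patch - right_patch) <= tolerance:
        return "even"
    if left_patch > right_patch:
        return "left patch brighter"
    return "right patch brighter"


def shadow_travel(left_dark_time: float, right_dark_time: float) -> str:
    """Infer the direction a shadow moves from the order in which patches go dark.

    Each argument is the moment (in arbitrary time units) at which that
    patch stopped receiving light.
    """
    if left_dark_time == right_dark_time:
        return "no direction"
    if left_dark_time < right_dark_time:
        return "left to right"
    return "right to left"
```

Calling `light_side(0.9, 0.2)` reports that the left patch is brighter, and `shadow_travel(1.0, 1.4)` reports a shadow crossing from the left patch to the right one. Notice how little is needed: two comparisons, and no image anywhere.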

Planaria do not see images. But does the fact that Planaria are aware of movement in the world around them mean they can see?   


Sight as a sensation






Suppose a creature had the ability to form an image in the same way that a pinhole camera does, with a much deeper depression for capturing light, and a pinhole to provide some focus. With this kind of lens-less proto-eye, the organism would be able to distinguish some features of its surroundings. The cephalopod mollusc called Nautilus has just this kind of eye. But in fact the image that is formed is very faint and lacks all detail.

True, these dim unfocussed patterns on the retina give the Nautilus a little more information than the movements of shadows across the cup-shaped indentations of Planaria. But in themselves, I would argue that they barely constitute ‘seeing’.

I would argue that the significant difference between how Planaria and Nautilus respond to visual information coming from the world around them is that, as well as having two light-sensitive pinhole eyes, Nautilus comes equipped with a very sophisticated sense of smell, a sense of touch and, additionally, both a short-term and a simple long-term memory. Planaria have only a crude sense of touch and some sensitivity to certain chemicals. It is possible that they have a very basic memory of locations.****

As the Nautilus forages, the patterns of light and shade that move across the simple retina will inevitably coincide with smells and the sensations from its tentacles. (Some of the tentacles are used for smell, some for touch only.)

The Nautilus will experience patterns of coincidence occurring over time. (The Nautilus can live for 20 years.) Let us imagine a young Nautilus who finds that whenever this dim image forms, it smells this odour and feels this texture at the end of its tentacles. And when that smell is weaker, and that part of the image is smaller, the Nautilus's tentacles can no longer touch that thing out there. The young Nautilus might guess that the thing in the image is further away. The young Nautilus begins to be able to make guesses about objects in the world around it.
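To make the young Nautilus's guessing concrete, here is a minimal Python sketch. Everything in it is my own assumption: the `Experience` record, the arbitrary units for image size and smell strength, and the choice of "find the most similar remembered moment" as the guessing rule. It is a way of picturing the argument, not a claim about what a Nautilus actually computes.

```python
from dataclasses import dataclass


@dataclass
class Experience:
    """One remembered moment in which three senses coincided (hypothetical record)."""
    image_size: float      # how large the dim patch appeared, arbitrary units
    smell_strength: float  # how strong the odour was, arbitrary units
    touched: bool          # whether the tentacles could reach the thing


def guess_within_reach(memory: list[Experience], image_size: float,
                       smell_strength: float) -> bool:
    """Guess whether the thing is reachable, from the most similar past experience."""
    if not memory:
        raise ValueError("no experience to draw on yet")

    def difference(past: Experience) -> float:
        # Squared distance between the present sensations and a remembered moment.
        return ((past.image_size - image_size) ** 2
                + (past.smell_strength - smell_strength) ** 2)

    return min(memory, key=difference).touched
```

Given a memory in which large images with strong smells were always touchable, and small images with faint smells never were, a new small, faint impression is guessed to be out of reach. The guess comes entirely from past coincidences between the senses, not from the image alone.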

When the mature Nautilus ‘sees’ something which is out of reach, the data going down the optic nerve are not perceptions. The guesses are not seeing.

The seeing is based on our Nautilus's experience of what is likely to be out there, given the regularity with which the sensations across its senses match its past experience.

I begin to understand that it is our experience of coincidence and consistency across senses that is the beginning of perception as sight.

I will try to tease this out in the next entry.






* It reminds me of the old man in the story who said, “I can give you the explanation, but I cannot give you the understanding.”

** The National Eye Institute article ends rather poetically by saying, “Your eyes also need tears to work correctly”, adding a mysteriously romantic world view to its very limited explanation.

*** When I learnt to reproduce the diagram without any understanding, somehow it was I who felt a fraud, when of course – as so often in secondary education –  it was the teachers who were fraudulently passing off a physical description as an explanation of something far more complex.

**** A recent paper describes a rather ghoulish experiment which claims to show that Planaria can “exhibit environmental familiarization, and that this memory persists for at least 14 days - long enough for the brain to regenerate. We further show that trained, decapitated planarians exhibit evidence of memory retrieval in a savings paradigm after regenerating a new head.” [3]




[1] Papert, S. The Summer Vision Project. Artificial Intelligence Group, Vision Memo No. 100, July 7 1966. http://people.csail.mit.edu/brooks/idocs/AIM-100.pdf (accessed 14 April 2020)
[2] National Eye Institute. How the Eyes Work. 16 July 2019. https://www.nei.nih.gov/learn-about-eye-health/healthy-vision/how-eyes-work (accessed 11 April 2020)
[3] Shomrat T, Levin M (October 2013). "An automated training paradigm reveals long-term memory in planarians and its persistence through head regeneration". The Journal of Experimental Biology 216 (Pt 20): 3799–810.