Since the early years of artificial intelligence, scientists have dreamed of creating computers that can see the world.
But like many other goals in AI, computer vision has proven to be easier said than done. In 1966, AI pioneers at MIT famously launched the Summer Vision Project, expecting a group of students to solve a substantial part of the problem in a few months. But it took much more than a summer break to achieve those goals.

In the past decades, advances in machine learning and neuroscience have helped make great strides in computer vision. Biological and Computer Vision, a book by Harvard Medical School professor Gabriel Kreiman, helps us understand the differences between biological and computer vision. Kreiman also discusses what separates contemporary computer vision systems from their biological counterparts.

Biological vision is the product of millions of years of evolution.
There is no reason to reinvent the wheel when developing computational models.

We can learn from how biology solves vision problems and use the solutions as inspiration to build better algorithms.
Biological vision runs on an interconnected web of cortical cells and organic neurons.
Computer vision, on the other hand, runs on electronic chips composed of transistors.

When drawing inspiration from the brain, the question is how much detail to copy. Approaches that abstract the brain away entirely have proven to be very brittle and inefficient; on the other hand, studying and simulating brains at the molecular level would be computationally intractable. Kreiman instead advocates what he calls the Goldilocks resolution, a level of abstraction that is neither too detailed nor too simplified.
"I am not a big fan of what I call copying biology," Kreiman told TechTalks. "There are many aspects of biology that can and should be abstracted away. We probably do not need units with 20,000 proteins and a cytoplasm and complex dendritic geometries. That would be too much biological detail. On the other hand, we cannot merely study behavior; that is not enough detail."
In Biological and Computer Vision, Kreiman defines the Goldilocks scale of neocortical circuits as neuronal activities per millisecond. It is at this scale that the visual cortex transforms raw input, from simple features such as edges and lines up to complex objects (faces, chairs, cars, etc.).
"The word layers is, unfortunately, a bit ambiguous," Kreiman said. In biology, each brain region contains six cortical layers (and subdivisions), and it remains unclear which aspects of this circuitry we should include in neural networks. Some may argue that aspects of the six-layer motif are already incorporated into today's architectures. But there is probably enormous richness missing.
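In artificial networks, by contrast, a "layer" is simply one stage in a stack of transformations. A minimal sketch in NumPy makes this thinner notion concrete (the number and sizes of the layers here are illustrative assumptions, not anything from the book):

```python
import numpy as np

rng = np.random.default_rng(1)

# In an artificial network, a "layer" is just one transformation in a
# sequence; here, three dense ReLU stages with arbitrary example sizes.
sizes = [16, 32, 32, 10]
weights = [rng.normal(0, 0.1, (m, n)) for m, n in zip(sizes, sizes[1:])]

def forward(x):
    for W in weights:
        x = np.maximum(0, x @ W)  # each "layer" feeds only the next one
    return x

out = forward(rng.normal(size=16))
print(out.shape)  # (10,)
```

Each "layer" here is nothing more than a matrix multiplication followed by a nonlinearity; the six-layer cortical motif has no analog in this stack.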
Also, as Kreiman highlights in Biological and Computer Vision, information in the brain moves in several directions. Visual signals move forward through the layers of the visual cortex, but each layer also provides feedback to its predecessors. And within each layer, neurons interact and pass information between each other.
In contrast, in artificial neural networks, data usually moves in a single direction. There is a feedback mechanism called backpropagation, which helps correct mistakes and tune the parameters of neural networks. But backpropagation is computationally expensive and is only used during the training of neural networks. And it's not clear whether backpropagation directly corresponds to the feedback mechanisms of cortical layers.
Image caption: In the visual cortex (right), information moves in several directions, while in artificial neural networks (left) it moves in one direction.
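To make that contrast concrete, here is a toy sketch in NumPy (the two-layer network, its sizes, and the squared-error loss are my own illustrative assumptions): inference runs strictly forward, while backpropagation sends error signals backward, and only during training:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-layer network; the sizes are arbitrary choices.
W1 = rng.normal(0, 0.1, (4, 8))   # input -> hidden
W2 = rng.normal(0, 0.1, (8, 2))   # hidden -> output

def forward(x):
    """Inference: information moves in one direction only."""
    h = np.maximum(0, x @ W1)     # ReLU hidden layer
    y = h @ W2
    return h, y

def train_step(x, target, lr=0.01):
    """Backpropagation: error signals flow backward through the
    network, but only here, during training; inference never uses it."""
    global W1, W2
    h, y = forward(x)
    dy = y - target                # gradient of squared-error loss
    dW2 = np.outer(h, dy)
    dh = (W2 @ dy) * (h > 0)       # backprop through the ReLU
    dW1 = np.outer(x, dh)
    W2 -= lr * dW2
    W1 -= lr * dW1

x = rng.normal(size=4)
target = np.array([1.0, 0.0])
before = np.sum((forward(x)[1] - target) ** 2)
for _ in range(200):
    train_step(x, target)
after = np.sum((forward(x)[1] - target) ** 2)
print(before, after)  # the loss shrinks after training
```

Note that `forward` never consults any feedback signal: once trained, the deployed network is purely feedforward, unlike the recurrent, bidirectional traffic of the visual cortex.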
He also pointed out that neurons have complex temporal integrative properties that are missing in current networks.
Goal differences
Evolution has managed to develop a neural architecture that can accomplish many tasks.
Several studies have shown that our visual system can dynamically tune its sensitivities to the goals we want to accomplish.
Creating computer vision systems that have this kind of flexibility remains a major challenge, however.
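Machine learning does have crude analogs of this flexibility, such as task-dependent gating, where the same feature vector is re-weighted according to the current goal. A toy NumPy sketch (the features, goals, and gain values are all invented for illustration):

```python
import numpy as np

# The same sensory features, read out under two different "goals" by
# applying goal-specific gains -- a crude stand-in for how biological
# vision tunes its sensitivities to the task at hand.
features = np.array([0.9, 0.6, 0.1, 0.7])   # e.g. color, motion, ...

gain_find_red   = np.array([2.0, 0.5, 0.5, 0.5])  # boost the color channel
gain_track_move = np.array([0.5, 2.0, 0.5, 0.5])  # boost the motion channel

def attend(features, gain):
    """Multiplicative gating: one feature vector, goal-dependent readout."""
    g = features * gain
    return g / g.sum()  # normalized salience

red = attend(features, gain_find_red)
move = attend(features, gain_track_move)
print(red.argmax(), move.argmax())  # different goals pick different winners
```

The point of the toy example is that a single set of features can serve multiple goals when the readout is modulated, which is closer in spirit to biological vision than training one frozen network per task.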
Current computer vision systems, in contrast, are designed to accomplish a single task: each neural network can accomplish only a single task alone. In the brain, vision is tightly integrated with other senses and cognitive functions; in AI systems, each of these capabilities exists separately.
Do we need this kind of integration to make better computer vision systems?
"As scientists, we often like to divide problems to conquer them," Kreiman said. "I personally think that this is a reasonable way to start. We can see very well without smell or hearing. Consider a Chaplin movie (and remove all the minimal music and text). You can still understand a lot. If a person is born deaf, they can still see very well."
However, a more complicated matter is the integration of vision with more complex areas of the brain. Some (most?) aspects of visual understanding depend on knowledge that goes beyond recognizing the objects in a scene. He pointed to a picture of former U.S. president Barack Obama as an example: making sense of it requires background knowledge about the world and the people in it, not just pattern recognition. No current architecture can do this.
Areas such as language and common sense are themselves great challenges for the AI community.
You can read the original article here.