Look at the short video below.
Can you answer the following questions: Which object caused the ball to change direction?
Where will the ball go next?

What would happen if you removed the bat from the scene?
You might consider these questions very dumb.
But interestingly, today’s most advanced artificial intelligence systems would struggle to answer them.

But for current artificial intelligence technology, they’re two fundamentally different disciplines.
It’s amazing what pattern recognition alone can achieve.

Neural networks have also made some inroads in generating descriptions of videos and images.
But there are also very clear limits to how far you can push pattern recognition.
While an important part of human vision, pattern recognition is only one of its many components.

Visual reasoning is an active area of research in artificial intelligence.
Researchers have developed several datasets that evaluate AI systems’ ability to reason over video segments.
Whether deep learning alone can solve the problem is an open question.

But so far, progress in fields that require common sense and reasoning has been small and incremental.
It is inspired by CLEVR, a visual question-answering dataset developed at Stanford University in 2017.
CLEVR is a set of problems that present still images of solid objects.

CLEVRER is the first visual reasoning dataset that is designed for causal reasoning in videos.
A controlled environment
CLEVRER is a fully controlled synthetic environment, according to the authors of the paper.
The model might work on other limited environments, however.

The basic deep learning models performed modestly on descriptive challenges and poorly on the rest.
Some of the advanced models performed decently on descriptive challenges.
But on the rest of the challenges, the accuracy dropped considerably.
Pure neural network-based AI models lack understanding of causal and temporal relations between objects and their behavior.
Another significant benefit of NS-DR is that it requires much less data in the training phase.
The benefits of NS-DR do come with some caveats.
NS-DR is our preliminary attempt to approach this complex problem.
Gan acknowledges that NS-DR has several limitations that make it hard to extend to rich visual environments.
CLEVRER is one of several efforts that aim to push research toward artificial general intelligence.
“NS-DR is a stepping stone towards future practical applications,” Gan says.
You can read the original article here.