Two paths for AI
40% off TNW Conference!
Some scientists believe that assembling multiple narrow AI modules will produce higher intelligent systems.
This is basically how nature works.

Billions of years of natural selection and random variation have filtered lifeforms for their fitness to survive and reproduce.
The rest were eliminated.
Thus, success, as measured by maximising reward, demands a variety of abilities associated with intelligence.

In such environments, any behaviour that maximises reward must necessarily exhibit those abilities.
For example, consider a squirrel that seeks the reward of minimizing hunger.
But a squirrel that can only find food is bound to die of hunger when food becomes scarce.

This is why it also has planning skills and memory to cache the nuts and restore them in winter.
And the squirrel has social skills and knowledge to ensure other animals dont steal its nuts.
For example, sensory skills serve the need to survive in complicated environments.

Meanwhile, hearing helps detect threats where the animal cant see or find prey when theyre camouflaged.
Rewards and environments also shape innate and learned knowledge in animals.
By performing actions, the agent changes its own state and that of the environment.

Based on how much those actions affect the goal the agent must achieve, it is rewarded or penalized.
However, the researchers stress that some fundamental challenges remain unsolved.
Reinforcement learning is notoriously renowned for requiring huge amounts of data.
For instance, a reinforcement learning agent might need centuries worth of gameplay to master a computer game.
Therefore, slight changes to the environment often require the full retraining of the model.
However, Churchland pointed it out to possible flaws in the papers discussion about social decision-making.
The DeepMind researchers focus on personal gains in social interactions.
I may be wrong, but I tend to see this as a milestone.
Reinforcement learning assumes that the agent has a finite set of potential actions.
A reward signal and value function have been specified.
you’re able to read the original articlehere.