When faced with a new learning task, our brains replay events in reverse, much like a video on rewind, a new study suggests.
This type of reverse-replay is also used in artificial intelligence research to help computers make decisions. The finding could explain why we learn tasks more easily if we take frequent study breaks: the pauses between sessions give our brains time to review information.
The finding was detailed in a Feb. 12 online issue of the journal Nature.
The researchers measured brain activity in rats as the animals ran back and forth on a linear track. Specifically, they monitored a brain region called the hippocampus, which is known to be important for memory and navigation in both rats and humans.
When the rats completed a lap, they were given a food reward. After eating, the animals would pause briefly before starting another lap. Outwardly, the rats didn't seem to be doing much during these rest periods. They would fidget, groom or remain still. The brain recordings told a different story, however. During times of rest, a rat's hippocampus was a hotbed of activity.
As the rodents ran up and down the track, hippocampal cells fired in certain patterns. This sequence of firing repeated when the animals rested, but in reverse order. The reverse-replays were repeated several times; each replay took only a few hundred milliseconds.
"In that compressed time, the rat is replaying the entire track from where it currently is all the way back to the very beginning," said study team-member David Foster from the Massachusetts Institute of Technology. "This result suggests that the immediate experience is actually recapitulated several times. The processing going on outside of the original experience may be important for learning."
The finding could help explain how rats solve something called the "temporal credit assignment problem." And because the hippocampus performs many of the same functions in rats and humans, the current study suggests that our brains may work in the same way.
The problem, a classic dilemma in decision-making theory, is this: If an animal has to perform a sequence of actions before it can get a reward, how does it know which actions were ultimately important and which weren't? Actions performed right before the reward was obtained are easy to identify as important, but what about actions performed at the beginning of the sequence? Which of those were important?
Richard Sutton, a computer scientist at the University of Alberta in Canada who was not involved in the study, likens the problem to playing backgammon for the first time.
"How do you evaluate the opening move if you don't know how to play yet?" he said.
In the fields of computer science and artificial intelligence, the temporal credit assignment problem is solved by having the machines work backward, replaying events in reverse and assigning more credit to actions near the end of a sequence than to those at the beginning.
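The backward pass described above can be illustrated with a few lines of Python. This is only a minimal sketch of the general idea, not the method used in the study or in any particular AI system: the function name, the step labels and the `decay` value of 0.5 are all assumptions chosen for illustration.

```python
def assign_credit_backward(actions, reward, decay=0.5):
    """Replay a sequence of actions in reverse and hand out credit.

    The last action gets the full reward; each step further back
    gets a fraction (`decay`, a hypothetical value) of the credit
    given to the step that followed it.
    """
    credits = [0.0] * len(actions)
    signal = reward
    # Walk from the final action back to the first one.
    for index in range(len(actions) - 1, -1, -1):
        credits[index] = signal
        signal *= decay
    # Pair each action with its credit, in the original order.
    return list(zip(actions, credits))


# Hypothetical three-step lap ending in a food reward.
lap = ["leave start", "run middle", "reach end"]
lap_credit = assign_credit_backward(lap, reward=1.0)

for action, credit in lap_credit:
    print(f"{action}: {credit}")
```

In this toy version the step just before the reward receives the most credit, and the opening step still receives some, which is the point of working backward.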
"You know that the final move was the right thing to do, so you can send that information back through the set of actions that were taken leading up to the final state," Foster said in a phone interview.
If reverse replay also takes place in humans, it could explain why cramming hours before a test doesn't typically work. The new finding suggests that our brains learn best when there are frequent pauses between study sessions; during these breaks, our brains unconsciously review the new information several times, making it easier to commit to memory when the time comes.
How reverse replay leads to learning
Scientists have long known that the release of the chemical dopamine is an important part of the brain's reward system. The release of this neurotransmitter floods us with feelings of joy and motivates us to perform certain activities.
When this knowledge is paired with the new suggestion that our brains may replay new experiences in reverse, a possible mechanism for learning emerges, Foster said.
The researchers hypothesize the existence of a special "value area" of the brain where dopamine signals and reverse-replay signals are fed and become paired. If the dopamine signal is one that decays over time, meaning that it is stronger at the beginning of transmission than at the end, then the following might happen:
As a reverse replay signal plays out in the brain's value area, it is associated with the beginning of a strong dopamine signal; as the replay continues, the dopamine signal becomes weaker. In this scenario, actions that appear near the beginning of a reverse replay event, which are the ones performed closest to the reward, will be more important to an organism than actions that appear later in the replay.
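To make the hypothesized pairing concrete, here is a rough Python sketch. It is a toy model under stated assumptions, not the researchers' model: the exponential shape of the dopamine decay, the time constant `tau`, the 0.3-second replay length (picked to match "a few hundred milliseconds") and the track position names are all hypothetical.

```python
import math


def pair_replay_with_dopamine(positions, replay_duration=0.3, tau=0.1, peak=1.0):
    """Pair each replayed position with the dopamine level at that moment.

    Assumptions (not from the study): dopamine decays exponentially
    with time constant `tau` seconds, and the replay visits the
    positions at evenly spaced times across `replay_duration` seconds.
    """
    if not positions:
        return {}
    step = replay_duration / len(positions)
    values = {}
    # The replay runs in reverse: the most recent position comes first.
    for order, position in enumerate(reversed(positions)):
        elapsed = order * step
        values[position] = peak * math.exp(-elapsed / tau)
    return values


# Hypothetical track positions, listed in the order the rat ran them.
track = ["start", "quarter", "half", "three-quarter", "end"]
values = pair_replay_with_dopamine(track)

for position in track:
    print(f"{position}: {values[position]:.3f}")
```

Under these assumptions the position nearest the reward is paired with the strongest dopamine signal, and positions further back along the track are paired with progressively weaker ones.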
Hints in psychology
Sutton said he would not be surprised if reverse replay occurred in animals as well as machines. If anything, he said, this mechanism had long been suspected from early psychological experiments, such as Ivan Pavlov's classical conditioning experiments with dogs.
"Pavlov rang the bell and gave the dog the steak and after a while, just ringing the bell was rewarding," Sutton told LiveScience. "So somehow it worked backward from the steak to the bell."
Foster agreed, but added that the current study suggests we make trains of associations going much further back than previously thought.
"It's taking the animals several seconds to run around, so this replay could be sending that information back through several stages and rewarding a long sequence of actions," Foster said. "It's that long sequence that is new."
The current study looked specifically at spatial learning; however, in rats, and probably in humans too, the hippocampus is involved in other types of learning as well.
"So [reverse replay] could very well be a mechanism to deal with a broad variety of information, not just spatial," Foster said.