Roughly 7,000 languages are used around the world, and many thousands more have cycled in and out of existence throughout human history. Where did these languages come from, and how did our ancestors create the very first ones? One basic unanswered question is whether the first languages began as gestures, like modern-day signed languages of the deaf, or as vocalizations, like most extant human languages, which are spoken.
Unfortunately for scientists interested in these questions, languages don’t leave fossils. So instead, experimental psychologists like me try to understand how language evolved by conducting communication studies with modern human beings.
Recently, my colleagues and I ran a series of experiments to examine how effectively people can communicate vocally without speech. Can they use vocalizations to express their thoughts without using words – and what can their efforts tell us about how the very first languages may have arisen?
‘Iconic’ clues from signed languages’ recent roots
Estimates of when the first spoken languages arose are highly uncertain, ranging from tens of thousands to hundreds of thousands of years ago, or more. The first languages are far too ancient for us to detect any evidence of an original “proto” language in what people speak today.
However, signed languages may offer a clue. These gestural languages created by the deaf typically have much more recent roots, on the order of just tens or hundreds of years.
In a handful of cases – for instance, when deaf children without a native signed language have come together in schools for the deaf, or in isolated rural communities with a high incidence of genetic deafness – scientists have actually had the opportunity to observe how signed languages are created anew.
What they find is that people in these circumstances first invent “iconic” gestures – that is, gestures that somehow depict or enact their meaning. For instance, think of scribbling your signature in the air to ask the server for the bill at a restaurant, or pointing and tracing a route to give someone directions. These gestures show what you are trying to express.
Iconic gestures, which can be understood even when communicators lack a common language, can then be molded into a system of signs and grammatical rules that are shared between members of a community. Over time and generations, they can develop into a fully complex and expressive language.
Can voices make the same leap?
But can this same process work with the vocalizations of speech? Can people similarly use their voice to depict their meaning and bootstrap the creation of a spoken language without gestures?
Many scholars have argued that the answer is “no.” They reason that it is much easier to show a concept with a visible gesture than to represent it with some kind of noise. This intuition is illustrated by an example from psychologist Michael Tomasello – trying to request Parmesan in an Italian restaurant by twiddling your fingers over your pasta as if sprinkling grated cheese. But what kind of vocalization would you produce to express this?
About this challenge, the renowned linguist Charles Hockett once wrote:
When a representation of some four-dimensional hunk of life has to be compressed into the single dimension of speech, most iconicity is necessarily squeezed out. In one-dimensional projection, an elephant is indistinguishable from a woodshed.
Was Hockett right about the limited potential for people to create iconic vocalizations? To what extent can people create vocalizations whose acoustic properties resemble their meaning, in the way that iconic gestures do?
Creating new ‘words’ in the lab
Of course, our research participants come to the lab already knowing a spoken language – this is unavoidable. Yet, we have found that just by asking people to vocalize without speaking, we are able to learn a lot about their ability to communicate with iconic vocalizations, and also about their ability to use these vocalizations to create simple systems of vocal “words.”
For example, in our most recent study, published in the journal Royal Society Open Science, we asked university students to communicate with each other in a 10-round game of vocal charades. Their task was to communicate a set of various meanings – such as smooth, slow, big, up or down – to their partner with vocalizations, without using words.
We found that participants shared similar ideas of how certain properties of their voice – such as pitch, loudness, timbre and duration – translated to particular meanings. With few exceptions, each meaning was expressed with characteristic properties that distinguished it from every other meaning.
For example, vocalizations meant to convey “rough” were aperiodic and noisy.