A Bit About Languages

Languages spread from centers to far-flung regions. When modern humans came out of Africa, they brought languages with them, and some linguists claim to find commonalities in all world languages outside of southern Africa. The evidence is elusive, and so far unconvincing to most, but the human radiation was real, so linguistic relationships must have once been there.

Failing proto-world, there is considerable suggestive evidence that all or most of the northern Eurasian languages have some distant relationship (Pagel et al. 2013). "Nostratic phylum" has been proposed as the term for these northern Eurasian languages (and some North American ones) if they are indeed related. The resemblances could be due to borrowing across tens of thousands of years, because the steppes have been a highway since Neanderthal times. But there is no reason to reject common origin out of hand. Evidence may someday resolve the issue.

Today, the world's languages are grouped into a large number of families and combined in a somewhat smaller but still impressive number of phyla. Typical families are Germanic, Romance, and Sinitic (Chinese). Typical phyla are Indo-European, which includes most European languages and many Asian ones, and Tibeto-Burman, or Sino-Tibetan, which includes Chinese, Tibetan, Burman, and hundreds of related but extremely disparate languages in eastern and southern Asia. Many languages, including Basque, have no known relatives; Basque is its own tiny family and phylum.

A recent theory, developed from East Asian data, relates the spread of agriculture to the spread of language phyla. This theory was developed by Peter Bellwood to account for the dramatic spread of the Austronesian phylum (see Bellwood 2009, with critiques by other scholars appended). Beginning around 6,000 years ago, Austronesian speakers began to move outward from southeast China. They colonized Taiwan, evolving there into the so-called "Taiwan Aborigines." This was only the beginning. Taking to the sea, the Austronesians exploded over the vast realms of the Pacific and Indian Oceans. Today, somewhat closely related Austronesian languages are spoken from Madagascar to Hawai'i and from New Zealand to Micronesia. Wherever these people went, they took agriculture. Cognate words for chickens, coconuts, root crops, and dozens of other agricultural items and techniques are found all over their vast realm, indicating that the early Austronesians had all these things.

It occurred to Bellwood, and to other scholars, that other linguistic spreads might also be associated with agriculture. This has been the subject of much research, culminating in a volume edited by Bellwood and Colin Renfrew (2002). Independent genetic evidence confirms that many migrations occurred and that agriculture released spectacular demographic expansions. Agricultural peoples moved rapidly from the Near East through Europe and into Africa, expanding in numbers as they did so. Evidence from Europe, Southeast Asia, and western Africa confirms striking demographic expansions directly after the introduction of agriculture in these regions (Gignoux et al. 2011). Farmers multiplied fast and moved out to new lands. Local people were not wiped out but rather merged with the expanding farmers, leaving varying degrees of genetic admixture.

Of course, not all linguistic spreads were accompanied by farming. The Inuit, Athapaskans, and several other hunting peoples spread over thousands of miles without benefit of agriculture. Moreover, having agriculture does not guarantee spread; the Georgian-language speakers have probably had agriculture almost since it began, 11,000 years ago, but have remained confined to a tiny area in the Caucasus. Within eastern Asia, the Yao-Mian phylum has recently spread from southeast China into Southeast Asia, but before that it seems to have been narrowly confined to a small part of south China. The Miao-Hmong phylum started in northwest China, according to some Miao myths. It survives in central and south China, with recent radiation into northern Southeast Asia. It has certainly spread with agriculture but has never gained much territory.

But some groups do spread. Bellwood and others have made a very convincing case for the association of the Tibeto-Burman (Sino-Tibetan) language phylum with the spread of millet agriculture. The dates and geography make this seem reasonable. The Tibeto-Burman languages, including the ancestor of the Chinese languages of today, are about as different as you would expect if they branched off from each other 7,000 or 8,000 years ago. I find the association convincing, but it is controversial. G. van Driem thinks the stock originated in Sichuan (van Driem 1999, 2002). Others (myself included) think it originated further north but then differentiated in Sichuan. Either way, the stock originated very close to the point of origin of millet agriculture.

The spread of the Thai-Kadai phylum is clearly associated with the spread of rice agriculture (Bellwood and Renfrew 2002). We know that Thai-Kadai languages diversified in, and probably spread from, the Yangzi Valley area, where rice was domesticated. Their routes of spread and the probable timing of the spread fit well with the spread of rice agriculture south and southwest. The Austronesian phylum was associated with rice agriculture early and has some very Thai-sounding words; it may be related to Thai-Kadai (P. Benedict 1975), or, more likely, it simply may have become connected with rice agriculture and a few loanwords in very early times. The Thai-Kadai languages branched from each other perhaps 6,000 years ago. Their speakers were, however, probably not the only rice-growers, and Hmong/Miao and Yao/Mien languages were in the right area then, too, and have been associated with the spread of rice by some scholars.

A significant fact is the spread of the Thai root for "chicken," kai. This word was borrowed into Chinese early, becoming ji in Mandarin but remaining kai in Cantonese. (The Cantonese language is likely the result of Thai speakers switching to Chinese in the Tang Dynasty and since. The Cantonese word for "chicken" is far from the only Thai-sounding word in that language.) Not stopping there, kai went on—increasingly distorted—into Korean, Japanese, the Central Asian languages, and thence into the Western world, eventually as far afield as Morocco (Blench 2007). It is awfully hard to escape the conclusion that the Thai peoples domesticated the chicken, which is native to south China and Southeast Asia. Borrowed words surely indicate borrowed chickens. Other local peoples in Southeast Asia, such as the Austronesians, have their own words for the bird, implying that they were aware of wild chickens before domestication.

Bellwood's correlation of advanced agriculture with the spread of the Austronesian languages in the islands east of Asia is no longer controversial. Millet reached Taiwan by 3000-2500 BCE; a recent find revealed large amounts of foxtail millet and rice at Nan-kuanli. This and related sites probably represent the ancestors of today's Austronesian-speaking "aborigines" of Taiwan, recently arrived from south China with seeds in hand (Tsang 2005). There is clear archaeological evidence for an explosive radiation of advanced farming and pottery-making people from south China to Taiwan and thence to the Philippines and the islands south and east, the lands inhabited by Austronesian peoples today (Bellwood 1997, 2002, 2005; Donohue and Denham 2010 dispute this, but Bellwood has a very effective answer in the commentary section of their article). However, subsequent profound changes in both language and agriculture took place when Austronesians mingled with Papuans in Melanesia (Paz 2002), with the result that Oceanic Austronesian agriculture looks much more Papuan than Chinese.

When we move to western Eurasia, however, we are in a very different situation. Bellwood and Renfrew hypothesized that the Indo-European (IE) phylum was present at the birth of agriculture in the Near East and spread along with it. This is certainly false. Agriculture in the Near East began at least 11,000 years ago, and the IE phylum is a very close-knit one. Suffice it to say that the Hittite for water is wadar. This and hundreds of other close pairs prove that IE cannot possibly have split up more than about 6,000 years ago. Languages change very fast, especially in the days before books, radio, and television. Languages diverge and differentiate faster than we once thought (Brown 2010), and this process probably happened even faster in preliterate times.

Moreover, we know that agriculture began in the dry Levantine back country. But the IE phylum has shared cognates for a whole host of cool-temperate plants and animals, including laks for salmon. (No, that word isn't of Jewish origin.) These biota firmly fix the IE origin somewhere between northeast Europe and the Caucasus—most likely in and around what is now the Ukraine. Conversely, IE significantly lacks words for dry Levantine commodities.

Also, there is plenty of evidence for pre-IE farmers in Europe. The surviving Basque language is the most obvious piece, but there are also the host of agricultural and rural words in Germanic that have no IE roots: "wheat," "sheep," "eel," "delve," "roe" (deer), "boar," and even "land," among others (Witzel 2006). Greek also borrowed from non-IE languages a whole host of agricultural and settlement words. Speakers of IE languages would hardly have borrowed such words from hunter-gatherers. Spreading in the other direction, IE speakers of the language ancestral to today's Iranic and north Indian (Sanskritic) languages borrowed a similar large range of words, including terms for camel, donkey, and brick as well as a whole host of religious terms (including names of gods, like Indra) and literary usages (Witzel 2006).

A recent hypothesis, based on virus epidemiology, has the IE languages originating in Anatolia and spreading with agriculture (Bouckaert et al. 2012). But again the timing is wrong; agriculture had already spread widely by the later, and more believable, timing they reconstruct, and one wonders what happened to the earlier propagators. Viruses do not make a very good model for humans.

What, then, accounts for the spread of the IE peoples? The traditional explanation was that they developed riding, horse traction, and horse-based warfare (chariots and, probably later, riding). This explanation has received a powerful boost lately from increasingly clear evidence that the horse was domesticated in Kazakhstan around 5,500 years ago (Anthony 2007; Harris 2010)—just east of the place and time reconstructed for the IE homeland. The horse was probably a food animal first. Only after domestication could it be milked and ridden. Anyone familiar with wild equines will know that they would not stand still for either process! Horses, unlike ruminant livestock, are neither stolid nor intellectually limited. They are high-strung, sensitive, extremely intelligent animals, and to this day it takes a tremendous amount of empathy and skill to work with them. Instead of dull servants like cows, they can become super-smart companions. In Mongolia, my wife saw small boys riding bareback, standing up, controlling the horses by pressure of feet. The horses went through the most amazing maneuvers, sensitive to every touch and knowing exactly what to do.

Breeding to maintain this level of intelligence while getting rid of the natural ferocity and cunning of wild equines was truly a piece of work. Domestication must have been a long process with a lot of mutual learning. The advantages of skilled horsemen include the military edge made famous by both ancient Greek and early Chinese authors but do not stop there; imagine the edge ancient horsemen had in herding, communications, trading, plowing, and just about every other mobile activity.

This theory has recently been supplemented by the idea that the IE peoples had the gene that allowed them to digest lactose as adults. In most humans, the enzyme that breaks down lactose (milk sugar) is not produced after about age six. After that, they suffer major intestinal discomfort if they consume much fresh dairy food over time. Among the world's milk-dependent peoples, however, mutant genes convey the ability to digest lactose throughout life. These genes are close to universal in Europe. They evolved quite separately among the herding peoples of East Africa (who actually have more genes for this trait than Europeans do). This is a recent development, known from many lines of evidence to be a product of dairy-dependent economics, and arose long after the beginning of agriculture.

The European gene is fairly common in the Near East but fades out rapidly in the rest of Asia. There—in Central and South Asia, in particular—people rely on lactose metabolism by Lactobacillus bacteria to make milk palatable. Lactobacillus breaks down lactose into lactic acid, not only making the food digestible but also preserving it (lactic acid is a strong preservative). Lactobacillus fermentation has thus proved very useful: it gives us yogurt, sourdough bread, pickles, sauerkraut, soy sauce, salami, and many other preserved foods. Asians, using this technology, did not need lactase. However, to carry out and maintain Lactobacillus fermentation requires a rather sophisticated lifestyle on the part of its users. It is not something that dates back to the dawn of farming.

Indo-Europeans were probably too mobile to maintain the sensitive, delicate cultures that give us yogurt and sourdough, so the mutant gene allowed the IE peoples to depend heavily on fresh milk products (Cochran and Harpending 2009). This may have given them a major competitive advantage as they took to nomadic herding. They could easily spread south and east into lands lacking the gene. I think this is quite probable, but evidently the pre-IE peoples in Europe also had the gene, since we know they were relying on cattle and sheep and doing at least some dairying.

The complex of riding and lactase allowed the IE peoples to depend on nomadic stockherding and to be superior at it. This allowed them to spread with lightning speed, which they evidently did, for their languages soon cropped up from the Atlantic to the frontiers of China.

Important to our history is one IE family in particular, the Indo-Iranian. The Iranic languages and the Sanskrit-derived languages of India are modern representatives of this early but compact branch. A number of sound shifts unite them and separate them from other IE families. Indo-Iranian speakers evidently nomadized east and south, eventually conquering and occupying vast realms from Iran to Bengal and from south Russia to westernmost China. In the process they assimilated many speakers of languages now lost. They have, in turn, lost most of their central Asian territory to Turkic languages, showing how fast languages can spread widely and then retreat. Most Turkic speakers today had ancestors who spoke IE tongues.

Bellwood and Renfrew also thought the Afroasiatic phylum—whose most visible representative is the Semitic family—might have developed along with agriculture and spread from the Near East. This, too, is impossible. The Afroasiatic center of diversity is Ethiopia, in or near which this language phylum certainly arose. The intrusion of one small branch, the Semitic family, into Asia is a relatively recent phenomenon, probably going back not much before their entrance to history, with the Babylonians. The original Afroasiatic speakers were certainly like modern Ethiopians physically and culturally.

So, who developed Near Eastern agriculture? The Sumerian language (and its possible relative Elamite) is in the right place at the right time. The Sumerians spread far and successfully into the best farmland before being conquered and linguistically assimilated by the Semites. Their art shows that they looked like modern Middle Eastern people—their genes have survived much better than their languages. My money is on the Sumerians.

We will have to deal with a few other language phyla in this book. The Uralic languages (Finnish, Hungarian, and relatives) arose near enough to IE to have exchanged ancient loanwords. The Austroasiatic phylum evidently arose in eastern India—that is its center of diversity and linguistic range. It spread east, possibly as recently as two or three thousand years ago. One branch is the Mon-Khmer family, which includes Khmer, Vietnamese, and many more obscure highland languages. Bellwood thought the Austroasiatic phylum began in China, but all evidence is against this; all evidence is consistent with an origin in east-central India. The Austroasiatic phylum probably spread with the rise of agriculture in India, about 6,000 years ago.

Finally, a fatefully important, putative phylum for Asian history is the Altaic, including the Turkic, Mongol, and Tungusic families. These three branches are very distantly related, if they are related at all. Much doubt has been cast on whether there really is an Altaic phylum. (It was once extended even farther, to include Korean and Japanese, but very few linguistic scholars accept this now, and evidence is overwhelmingly against it.) The Mongol languages show rather puzzling similarities not only to Turkic but also to Uralic and IE languages, ranging from such startling word resemblances as minii "mine," to the Mongol noun case system's similarities to Russian and Finnish (as opposed to Turkic, which structures these things differently). However, these similarities are notably lacking in pattern, indicating that they are likely borrowed. Those not borrowed very possibly have an older common origin in the "Nostratic" universe. Mongol has three roots for "I, me" (bi, min-, and na-), and all of them sound like pronouns in various languages all over the world (cf. Yucatec Maya in "I, mine"). Is this evidence for Proto-World (as some maintain) or merely a result of these being extremely easy sounds for the human mouth to make? People tend to save energy when talking ("television" becomes "TV"), and the commonest words, especially those much used by children, naturally become short and simple.

The Turks and Mongols certainly nomadized, camped, and fought side by side for thousands of years. They also lived near Uralic peoples, and had early contacts with the Tocharians and probably other Indo-Europeans. Mutual influence was inevitable. The basic vocabularies of Mongolian and Turkic languages, however, do not show any believable relationship. No one can miss the similarities of English one, two, three, Latin unum, duo, tres, and Sanskrit eka, dva, tri, and there are hundreds of other such sets of cognates, even for quite complex concepts (Celtic ri, Latin rex, Sanskrit raja, ...), in Indo-European. But try to find any similarities between (Khalkha) Mongolian neg, khoyor, gurvan and Turkish bir, iki, üç, "one, two, three." The basic vocabulary words in the Mongol and Turkic languages are usually very different, unless they are so similar that they must be recent borrowings. On the other hand, there are some very deep and basic cognates, including the word for milk. The word for water is close (su in Turkic, us in modern Mongol), but Chinese is similar too (shui from earlier soi or swu). Perhaps we are looking at a very ancient common origin and a great deal of subsequent mutual influence. In any case, the idea that Turkic, Mongol, and Tungusic are related in an Altaic phylum seems extremely shaky, if not downright defunct (Vovin 2005).

Color words are as confusing as in English: just as English color words are half Germanic (blue, white) and half French (violet, purple), modern Mongol has basically Turkic loans for black, yellow, and deep blue, but utterly un-Turkic words for white, red, and gray, and even a thoroughly un-Turkic word for blue (now used for pale blue). The borrowing for deep blue is significant: it refers to the sacred blue of Heaven (medieval Turkic gök, Mongol kök, now khökh; the change from k to kh, pronounced like the ch in German ich, is standard in modern Khalkha Mongolian). The native word tsenkher refers basically to nonsacred blue. Anyone familiar with Mongolia will know the sky-blue silk scarves wrapped around every venerable tree, rock, cairn, shrine, and other object (including the occasional telephone pole) that is sacred, fortunate, or deserving of spiritual respect. Borrowing the Turkic word for the sacred color may indicate respect for Turkic cultural forms in the early medieval period, when the Gök Turkic Empire ruled Mongolia and much of Central Asia.

The Altaic phylum, or cluster, bears the name of the Altai Mountains, where it supposedly originated. If it did not originate there, at least the Turkic languages apparently did. All these languages come from the cold steppes and forests of high Central Asia. The Altaic peoples emerge into history fairly late but were obviously active much earlier, having quickly acquired nomadic herding and riding, presumably from Indo-Europeans.

The Altaic peoples have shown a truly astonishing ability to build huge empires. The Mongol Empire is only the most conspicuous of many. Turkic, though not the other languages, has shown a monumental ability to flourish at the expense of local languages. Millions of square miles of formerly IE and other languages' territory are now Turkic speaking. There are parts of Turkey that in historic times have switched from Hittite to Phrygian to Greek to Turkish—yet archaeology reveals no change in the people themselves. They were and are genetically the same lineages. They switched languages according to who had most recently conquered them.

Similar, if less complicated, language shifts are almost universal in Central Asian history, as elsewhere. Conquered peoples usually pick up the languages of their conquerors, but if the conquerors are few in number, the reverse takes place. In China, the spread of Chinese languages within historic times has led to linguistic absorption of many Thai, Miao, Yao-Mian, Austronesian, and other groups. Often, these older languages leave traces. Cantonese, in particular, seems to have begun as a form of Tang Dynasty Chinese spoken by Thai people; its tone system, much of its vocabulary (remember kai), and other traits reflect the massive linguistic acculturation of the Zhuang and other Thai-related minorities in historic times. This sort of linguistic acculturation guarantees that any language phylum is going to include languages spoken by very diverse peoples.

It is certain that the Chinese languages proper have expanded with the Chinese state. The core of geographic China became Chinese speaking by the Shang Dynasty. The state of Chu, in and around what is now Hunan, seems to have originally spoken various Thai-Kadai languages. It became Chinese speaking in the latter part of the first millennium BCE—first among the elite, later—slowly—among all. With the spread of the Chinese-speaking groups, several very different languages developed: Cantonese, Shanghainese (Wu), Hakka, two or more Fujianese languages, Gan, Xiang, and so on. These have often been miscalled dialects for political reasons: political leaders have generally promoted the dominant and by far the most widely spoken language, Mandarin or Guoyu ("National Language"). A dialect is, correctly, a subvariant of a language—not a language in its own right. Guoyu is now rapidly replacing local languages and their (actual) dialects. This is, demonstrably, a huge loss to local cultures, literature, the arts, and free expression.

As with the term "China" in its geographical sense, referring to the inhabitants of the region as "the Chinese" or "the Chinese people" before the Qin Dynasty is technically wrong. I try to avoid it but obviously do not always succeed. From Qin on, there is the problem of whether one is using "the Chinese" to mean the linguistic Chinese, or the people of the Chinese state, or the people of the geographical region called China. I usually try to stick with language, but consistency is simply impossible, if only because one must quote sources that use the term quite differently. The linguistic Chinese are now called the Han Chinese, from the Han Dynasty. However, many of the citizens of the Han empire were Tai, Yao, Miao, Vietnamese, Austronesian, Mongol, proto-Turkic, and so on and on. Some spoke languages now extinct and unclassified, like the language of the Xiongnu. So "Han" is as misleading a term as "Chinese." However, it is established, and I cannot escape it.

