Tuesday, 25 December 2018

Was the Voynich manuscript written in Nahuatl?

Excerpt of the text from the Voynich Codex, showing the odd script.

Recently, a number of papers by a group of botanists from Purdue University have proposed that the enigmatic Voynich manuscript, which has so far resisted decipherment, was written in Nahuatl in the 16th century.

The Voynich manuscript is a codex written on early 15th-century vellum. It clearly includes botanical illustrations, but also a number of baffling illustrations that seem to be cosmological, as well as maps. The pictures are accompanied by writing in a mysterious script that has been subject to multiple analyses and decipherment attempts.

In this blog post, I give my impression of the linguistics of the proposed decipherment of the Voynich manuscript as a kind of Nahuatl.

Excerpt from the 16th-century Nahuatl-language herbal Codex Badianus, showing the similarity of the illustrations. (Actually, I think the Badianus has much better illustrations.)

The scholars who have advanced the proposal that the codex is written in a form of Nahuatl are Arthur Tucker, Rexford Talbert and Jules Janick. They published their 2013 proposal, titled "A Preliminary Analysis of the Botany, Zoology, and Mineralogy of the Voynich Manuscript", in HerbalGram, the journal of the American Botanical Council, with additional material published at their institutional repository.

Now, in 2018, Janick and Tucker have published a book titled "Unraveling the Voynich Codex" with Springer, which presents the entire argument in favor of seeing the Voynich manuscript as a Mexican codex, written largely in Nahuatl - with some Spanish and Taino mixed in.

The Codex: 

Folio 9r of the Voynich Manuscript, showing a plant with odd-shaped leaves.

The codex has 240 pages, some of which are wide fold-out pages. Analysis of the parchment has shown it to be from the early 15th century, made from calf skin. Most of the contents are illustrations of plants with small texts written in an odd script. Other pages are astrological charts, populated with little nude ladies who bathe and shower in odd tubs connected with pipes.

The first known owner was Georg Baresch, a 17th-century alchemist in Prague. Other owners seem to have been Emperor Rudolph II and Athanasius Kircher, the Jesuit scholar and self-proclaimed decipherer of the Egyptian hieroglyphs. When the Jesuit order decided to sell the manuscript, it was bought by the Lithuanian bibliophile Wilfrid Voynich, after whom it is named. Today it is housed in the Beinecke Library at Yale University, where it is catalogued as "Beinecke MS 408" and has been digitized and put online for anyone to inspect (located here: https://brbl-dl.library.yale.edu/vufind/Record/3519597).

All of the pages have writing in the odd script, and in spite of a host of the world's quirkiest minds working to decipher it, it has still not been read.

Here is a chart of the symbols (from Wikipedia) - the correspondence with the Latin alphabet serves only to give each glyph a name using letters from A to Z:

As mentioned, the mysterious manuscript has been scrutinized by many of the world's quirkiest minds - the same type of mind that would spend a career seeking to prove that Basque or Burushaski are Indo-European languages - and they have produced an amazing gamut of different proposals: codes and ciphers, a hoax, shorthand Latin, glossolalia, an East Asian language, and now, Nahuatl.

But most of these odd proposals have not been published as presumably(?) peer-reviewed edited volumes by Springer, so the Nahuatl proposal does merit serious attention. Especially given the fact that no Nahuatl specialists have been involved in the decipherment.

The Argument for Nahuatl: 

There are three main arguments used for identifying the manuscript as written in Nahuatl:

  1. The herbological part of the codex has similarities to Mexican herbological codices produced in the mid 16th century, and the botanists argue that many of the plants can be identified as New World species. They also argue that a map of a city can be identified as "angelopolis", which they identify as the city of Puebla (de los Ángeles) in the state of Puebla.
  2. A character that is very frequent in the manuscript is similar to a ligature character found in some Mexican codices representing the Nahuatl consonant tl. (It also sort of looks like the way I write capital H when I write my signature, and like how many people write a double l.)
  3. The proponents argue that some of the plants can be identified by Nahuatl names, and claim that they can read some of the text in Nahuatl, using their identification of the glyphs with Nahuatl phonemes.

The proposed tl-letter looks like the first letter in this word, tlanequilis, from an 18th-century Nahuatl testament.

I will look primarily at the third of these arguments, because this is the actual claim to a decipherment, and because arguments one and two can be true even if the language is not Nahuatl. All claims to decipherment of course rest on the degree to which they actually allow us to read the texts written in the script that they are claiming to decipher.

The main argument of the book is that the manuscript contains elements of Nahuatl and New World flora, that it contains inspiration from the Jewish Kabbalah (which they claim was practiced among Franciscans in the New World), and that it refers to the city of Puebla de los Angeles, which was founded by the Franciscan friar Toribio Benavente "Motolinia".

Nevertheless, an odd chapter by the linguist Fernando Moreira looks at the readings and compares them with different Mesoamerican languages, finding that they don't really match any of them - and then proposes an undescribed Mesoamerican language which he calls "acolhuacatlatolli" (the Nahuatl word for "language of the Acolhua"). The Acolhuas were the Nahuatl-Otomí ethnic group that lived in Texcoco. We know their language very well, since most of what we today call "Classical Nahuatl" is in fact the Acolhua dialect of Nahuatl. Moreira nevertheless, oddly, suggests that it could have been a form of Popoloca (which is what Nahuas called all the languages they couldn't understand, including at first Spanish).

So while the general argument of the book is that the language is a form of mixed Nahuatl-Spanish, the chapter by Moreira argues that it is not, and then introduces an unknown and undescribed language as a sort of deus ex machina that allows them to maintain the main parts of their hypothesis when the evidence is shown not to support it. In the rest of this blog post, I will base my argument on the original proposal, that the language is Nahuatl or has a Nahuatl element, and not on the alternative hypothesis that it represents an undescribed Mesoamerican language, nor on the possibility that it represents a language spoken by space aliens who built the Mexican pyramids.

The Problems: 

OK, I am already getting into the problems with the proposal. The most nefarious problem is that it is pseudo-rigorous - that is, it works hard to give the appearance of being rigorous scholarship while in fact it is not at all. They cite lots of serious scholarship, and mostly they cite it correctly, but nevertheless all the citations are used only for circumstantial evidence. As soon as we look at the concrete examples and the readings, they are unsupported by this evidence and rest on pure speculation - often uninformed speculation.

For me the best problem, best because it is so solid that it clearly invalidates the entire endeavor, is the fact that none of the proposed readings are valid - hardly a single one of the proposed words actually reads like a bona fide Nahuatl word.

Many of them are completely alien to Nahua phonological structure. And to be honest I am surprised that the scholars haven't found it to be odd that a few of the letters are so frequent that they appear in almost all words - for example more than half of the proposed plant names (and names of the nude ladies they call "nymphs") start with the letter that they read as /a/ - that would be very odd in a natural language, unless the a was a very frequent grammatical prefix (which it isn't in Nahuatl).

The readings:
Table from Janick & Tucker 2018:141

Janick and Tucker produce a full set of proposed readings for the Voynichese symbols, given in two tables on pages 141-142. I reproduce the first part of the table here to the right (non-underlined Latin equivalents are "tentative").

Following the tradition of comparing letter frequencies in decipherment proposals, the table also supplies the frequency of each symbol in the Voynich Manuscript and the frequency of the proposed Latin equivalent in a randomly selected Nahuatl manuscript.
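The kind of frequency comparison described above can be sketched in a few lines of Python. This is only an illustration and not the procedure Janick and Tucker used: the digraph list and the sample words are assumptions chosen for the example, and a real comparison would need a full transcription of both texts.

```python
from collections import Counter

# Digraphs counted as single units. This list is an assumption made for
# illustration; colonial Nahuatl orthography has more conventions than this.
DIGRAPHS = ("tl", "ch", "tz", "cu", "hu", "qu")

def graphemes(word):
    """Split a word into units, treating the digraphs above as one unit each."""
    units, i = [], 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            units.append(word[i:i + 2])
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units

def relative_frequencies(text):
    """Return each unit's share of all units in the text (between 0 and 1)."""
    counts = Counter(u for w in text.lower().split() for u in graphemes(w))
    total = sum(counts.values())
    return {unit: n / total for unit, n in counts.items()}

# Hypothetical three-word sample, only to show the call.
sample = "nochtli metl tlilli"
freqs = relative_frequencies(sample)
print(f"tl: {freqs['tl']:.1%}")
```

Running the same function over a transliterated Voynich passage and over a Nahuatl text would give the two columns of frequencies that such tables compare.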

It is odd that the proposed readings include both signs for single phonemes and signs for syllables: câ (we are not told what the circumflex above the â is supposed to mean - does it represent a saltillo?) and yâ/hâ (hâ is not actually a possible Nahuatl syllable).

It also seems that Janick and Tucker fail to realize that the letter u found after c and h in Classical Nahuatl texts is not actually a vowel, but represents the sound of the consonant /w/ or the lip rounding in the phoneme /kw/. This is basic stuff, and it is why it makes no sense to attempt a decipherment using a language that one does not in fact understand (Champollion knew this, and that was why he spent so much time studying Coptic and several Semitic languages).

Here are some of their readings of the names of plants.

First the one that seems to be their showpiece: the reading of the name of a cactus-like plant as <NĀSHTLI>. They argue that this reading resembles the Nahuatl name of the fruit of the nopal cactus, which is nōchtli. And sure, it does look similar to that word. The -tli ending looks like the absolutive suffix, the root NĀSH is superficially similar to nōch-, and the reading follows Nahua phonological rules. Nevertheless, a and o are different vowels in Nahuatl, and sh (x) and ch are different consonants - so only one out of three letters in the root of the proposed reading actually matches; the others are "near matches" at best.

Other readings fare a lot worse. Look for example at these proposals:

As is known to any serious student of Nahuatl, Nahuatl does not allow consonant clusters in the beginning or end of syllables, and also does not allow clusters of more than two consonants in the middle of words. Words like ichpchi or itlmamcho or itlmaca or itlmchi are not possible words in any dialect of Nahuatl.
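The two cluster constraints just stated can be written as a small mechanical check. The sketch below is a simplification under stated assumptions: digraphs such as ch and tl are counted as single consonants (the digraph list is my own choice for the example), and only cluster length is tested, not which consonants may stand next to each other.

```python
# Digraphs counted as single consonants (an assumption for this sketch).
DIGRAPHS = ("tl", "ch", "tz", "cu", "hu", "qu")
VOWELS = set("aeiouāēīō")

def graphemes(word):
    """Split a word into units, treating the digraphs above as one unit each."""
    units, i = [], 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            units.append(word[i:i + 2])
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units

def consonant_runs(units):
    """Yield (start, length) for each maximal run of consecutive consonants."""
    start = None
    for i, unit in enumerate(units):
        if unit in VOWELS:
            if start is not None:
                yield start, i - start
                start = None
        elif start is None:
            start = i
    if start is not None:
        yield start, len(units) - start

def cluster_violations(word):
    """List the cluster-length constraints that a word breaks."""
    units = graphemes(word.lower())
    problems = []
    for start, length in consonant_runs(units):
        end = start + length
        if start == 0 and length > 1:
            problems.append("word-initial cluster")
        elif end == len(units) and length > 1:
            problems.append("word-final cluster")
        elif start > 0 and end < len(units) and length > 2:
            problems.append("medial cluster of more than two consonants")
    return problems

for word in ("ichpchi", "itlmchi", "nochtli"):
    print(word, cluster_violations(word))
```

Because the sketch checks cluster length only, passing it does not by itself make a form a possible Nahuatl word; it shows only that the most blatant violations can be detected mechanically.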

It seems reasonable to expect more of a decipherment than for it to produce one near match and then a load of meaningless gibberish.

Some of the syllables or even sequences of two syllables that occur frequently in their readings do have potential readings - but this is only natural given that Nahuatl has a rather small phoneme inventory and therefore not many different potential syllables. For example they note that mā means "hand", cui means "to take" and māca means "to give" - but given how short these sequences are and how frequent the elements are, this is simply coincidence. When there are so few distinct letters, and the letters have been assigned Nahuatl equivalents, some sequences in the reading are bound to look like some sequences in the vast vocabulary of Nahuatl.

The problem of random matches gets even worse when they admit the possibility of readings in Spanish and Taino and of weird mixtures of the two (unlike anything found in any colonial document). Why, for example, would a Nahua or Nahuatl speaker, given that Nahuas were expert cultivators of agave, use the Taino word for agave, "maguey" (in the mangled spelling <MAHUEOI>), and not the Nahuatl word metl?

A Test: 

The best way to assess a proposed decipherment is of course to test it on a piece of text and see what it produces, and whether it is intelligible.

I tried such a test on a piece of text from the top of folio 28v, and below is the result. It is utterly unintelligible, and it has only the vaguest resemblance to Nahuatl - and that is only because of the strong association between the /TL/ phoneme and Nahuatl. The phonology is alien to Nahuatl, allowing for example consonant clusters in the beginning and end of words, and failing to respect the Nahua phonological rules of assimilation. Nahuatl is of course a language that has few phonemes and a lot of a's and a lot of tl's and cu's and hu's, and so does this proposed reading - but that is only because the decipherers have deliberately assigned those readings to the most frequent letters. Furthermore, these letters are about twice as frequent in this "language" as they are in Nahuatl according to their own count - for example, tl has a frequency of only 4.7% in Nahuatl, whereas the corresponding sign has a frequency of 10% in the Voynich manuscript. So what we get is a text that superficially looks like Nahuatl, but only to someone who doesn't actually know any Nahuatl.

Nevertheless, anyone who knows any dialect of Nahuatl will be able to see that the below is not Nahuatl, and that only certain words resemble Nahuatl because they have the sounds and endings that are frequent in Nahuatl, such as -tli and -câ (why do the decipherers add the ^ symbol above the a in the câ letter? They never explain what it is supposed to represent).

Following the proposed decipherment this text reads:

Could it be Nahuatl or inspired by Nahuatl?

The language of the proposed reading clearly is not Nahuatl. It has only the most superficial structural resemblance to Nahuatl, even if we were to admit the possibility of undescribed dialects. When we decide to read the most frequent signs of the script as their most frequent Nahuatl counterparts, the text naturally comes to resemble Nahuatl. But since it violates the phonological rules expected of Nahuatl, and is entirely void of any recognizable grammatical structure from Nahuatl (we can't even see differences between verbs and nouns, much less actual grammatical morphology), the proposal can safely be discarded.

A further argument against the plausibility of the background story of the proposal is historical: In mid-16th century Mexico anyone who would be able to produce a codex would also have been able to write it in proper Nahuatl - even Spanish friars (this was a requirement for being a priest in Mexico at this time). So, OK maybe they would want to invent a new script so that nobody could read what they had written about all those little naked ladies - but one would of course assume that they would then write intelligible Nahuatl. Otherwise why bother?

The Libellus de Medicinalibus Indorum Herbis, also known as the Codex Badianus, is an actual herbal manuscript, known to have been written by the Nahua scholar Martin de la Cruz in 1552 and later translated into Latin by another Nahua scholar, Juan Badiano. Nahua people in the 16th century were not only able to write intelligible Nahuatl, they were also able to translate it into Latin. And to boot, the illustrations are much better and allow easy identification of the different species - the Voynich plant drawings come across as crude by comparison.

Finally, as I read the example, it bothered me that there is a certain repetitiveness in the deciphered text: the same letters seem to occur very frequently in combination with specific other letters. This is not usually the case for natural languages - but it is very frequent in something like glossolalia of the baby-speech "lalala balala malalaba" type.
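One way to put a number on this impression of repetitiveness is the conditional entropy of letter pairs: how uncertain the next letter is once the current one is known. The sketch below is my own illustration of that measure, not something taken from the book, and the sample strings are made up. Lower values mean that letters predict their neighbors more strongly.

```python
import math
from collections import Counter

def conditional_entropy(text):
    """Entropy in bits of the next letter given the current one, within words."""
    pairs = Counter()   # counts of (current letter, next letter)
    firsts = Counter()  # counts of the current letter, over all pairs
    for word in text.lower().split():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += 1
            firsts[a] += 1
    total = sum(pairs.values())
    entropy = 0.0
    for (a, b), n in pairs.items():
        # P(a, b) * log2 P(b | a), summed over all observed pairs
        entropy -= (n / total) * math.log2(n / firsts[a])
    return entropy

# Made-up samples: a fully predictable babble and a slightly less predictable one.
print(conditional_entropy("lalala"))
print(conditional_entropy("ab ac"))
```

Comparing this value for a transliterated Voynich passage with the value for a genuine Nahuatl text of similar length would show whether the repetitiveness is more than an impression.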

Some of the little naked ladies, these ones from folio 80r. 


  • Janick, J., & Tucker, A. O. (2018). Unraveling the Voynich Codex. Springer.

Tuesday, 24 July 2018

Meat and Mushrooms: Food words in Nahuan and Coracholan

Food-related words make for fun etymology, especially in Mesoamerican languages because Mesoamerican food is so delicious. I have previously dealt with the Nahuatl etymologies of the words for salt, avocado, chocolate and cocoa.
In this blog post, I will look at some food words in Nahuan and Coracholan noting what seems to be an intricate web of semantic changes between the languages. The words show changes of meaning that cross between general and specific terms, and between animal- and plant-based foods. 

It is a common thing in the world's languages that words for food products shift their meanings to other foods, and that words for general types of food change their meaning to become specific, or words for specific foods become general. This is of course because we have a tendency to think in terms of staple foods, so that the name of whatever kind of food we eat the most tends to become the general term for food, or conversely, we tend to use the general term "food" to refer to the specific kind of food we eat the most (for example, in Danish the general word for food, "mad", when used as a count noun ("en mad"), refers specifically to an open-faced rye bread sandwich).

In the history of English and the Nordic languages we see for example that the English word "meat" is related to the Nordic word "mat" meaning "food", that the word "meal" is related to the Nordic word "mel" meaning "flour", and that "flæsk", the Nordic cognate of the English word "flesh", means "pork". When I asked about similar changes in the Historical Linguistics Facebook group, it was pointed out that the Semitic root lħm probably meant "basic food", since the meanings of its modern cognates are "meat" in Arabic, "cow" in Ethiosemitic, "fish" in Modern South Arabian [edit: thanks to Whyght], and "bread" in Hebrew.

Now, take a look at these sets of words in Nahuatl and reconstructed Corachol:

Nahuatl:
nakatl "meat"
nanakatl "mushroom"
xonakatl "onion"
yetl/etl "beans"
nohpalitl "nopal cactus" (Opuntia spp.)

Corachol (reconstructed):
*nakari "nopal cactus"
*muume "beans"
*wai "meat"
*yekwa "mushroom"

At first glance we notice that the root *naka looks similar in Nahuan and Corachol. In Nahuatl it refers to meat but also to two kinds of foods that both have an umami-like, meaty taste and texture - namely onions and mushrooms. In Corachol the root refers to another plant with an umami-like meaty taste and texture, namely the nopal cactus. So either the root naka- originally referred to meat and was then extended to refer to meaty plants, or else it originally simply meant "meaty food" (the kind that can carry a good meal all by itself) and was then in Nahuatl changed to refer specifically to animal meat. Either of these processes seems plausible.

Knowing a bit about the sound changes that have operated in Nahuan and Corachol we can see one more likely cognate: In Corachol initial w- often comes from a previous *p. And in Nahuatl e often comes from a previous *ai, and initial y- before e often corresponds to a previous *p. Knowing this, we see that Corachol wai "meat" is in fact a potential cognate of Nahuan yetl/etl "beans". No good etymology has been proposed for the Nahuatl root ye/e "beans" and Nahuan is alone among the Southern Uto-Aztecan languages in not having a cognate of the root *muni "beans". So here it seems as if Nahuatl has changed a word *pai (or *pa'i) previously meaning "meat" to meaning instead "beans", and dropping the original word for beans altogether. The semantic change from "meat" to "beans" may seem implausible at first, but I swear if you ever taste a thick, salty broth of ayocote beans the umami is so strong that you will be willing to bet there is bacon in there. 
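The correspondences used in this argument can be applied mechanically to the assumed proto-form. The sketch below is only an illustration: it treats the three tendencies named above as if they were exceptionless rules, which they are not ("often" is the operative word), and the function names are my own.

```python
import re

def to_corachol(proto):
    """Apply the Corachol tendency: initial *p becomes w."""
    return re.sub(r"^p", "w", proto)

def to_nahuatl(proto):
    """Apply the Nahuatl tendencies: *ai becomes e, then initial *p before e becomes y."""
    form = proto.replace("ai", "e")
    # This rule must come second, because it depends on the e created above.
    return re.sub(r"^p(?=e)", "y", form)

proto = "pai"  # the assumed proto-form *pai, written without the asterisk
print(to_corachol(proto))        # Corachol form
print(to_nahuatl(proto) + "tl")  # Nahuatl stem plus the absolutive suffix -tl
```

The order of the two Nahuatl rules matters here: the change of initial *p to y is conditioned by a following e, which only exists once *ai has become e.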

The Corachol root for mushroom *yekwáh seems related to the Uto-Aztecan root *pakuwa "mushroom" (reconstructed by Stubbs for Numic, Tepiman, Tarahumaran and Cahita). But we don't usually get the reflex y from PUA *p in Coracholan - only Nahuan seems to have y from *p. So maybe this word was loaned into Coracholan from Nahuan (where yekwa would be the expected reflex of *pakuwa, with the intermediate stage *yakɨwa), and then subsequently the root was swapped for nanakatl in Nahuan! (This is admittedly speculative, but the pattern fits.)

This would make a scheme of semantic changes something like this: 

Model 1. Red is proto-forms, blue is Nahuan, and purple is Coracholan. It looks like Corachol is conservative and Nahuan innovative. (Photos from wikicommons https://commons.wikimedia.org.)

But there is an alternative that may be preferable, because in the Northern Uto-Aztecan language group Numic naka- is the name of the bighorn sheep (which is presumably tasty). So perhaps the original meaning of naka was "bighorn sheep" which then in Southern Uto-Aztecan became "meat" which in Nahuatl and Corachol was extended to "meaty plants" and then in Corachol was fixed as "nopal". 

And guess what? It turns out that wai "meat" in Corachol (and yetl "bean" in Nahuatl), which must have come from something like *pa'i, may also originally have referred to bighorn sheep (Stubbs reconstructs *pa'a)!

Model 2. If we accept this model, Coracholan shared the "bighorn>meat" change with Nahuan and then innovated the nopal meaning. The Nahuan change of nakatl to mean "meaty" plants would then be a subsequent unrelated, but semantically convergent, change. (Photos from wikicommons, https://commons.wikimedia.org)

But it is also possible that the original meaning of naka- was "meaty umami-tasting food", which for the Northern Uto-Aztecan hunter-gatherers came to refer prototypically to the bighorn sheep, and came to refer to meat in Nahuan (but kept its connotation of meatiness in the words for onion and mushroom), and that it separately came to refer to the nopal cactus among the desert-dwelling Coracholan nomads.

Model 3. Here the original meaning of naka is assumed to have been meat and meaty food, and Numic (in green) is assumed to have changed this to bighorn sheep. 

Interestingly, I have been able to observe a semantic change like this in progress in Nahuatl: A couple of years ago, when I was working in the Zongolica region, a Nahuatl-speaking friend of mine pointed out that he was annoyed at how some people in the region had started using the word to:chin "rabbit" in the meaning "meat". He made fun of how they would for example say "tochin de puerco" (i.e. literally "rabbit of pig") in the meaning "pork".

Am I the only one who could eat a grilled bighorn sheep with mushrooms, onions, and beans right about now?

Saturday, 30 June 2018

Salt and Whiteness: The etymology of white stuff in Nahuan

This post arises from a conversation I had yesterday with R. Joe Campbell, who is one of the world's great Nahuatl scholars as well as an amazingly knowledgeable and kind man, whom I had the great fortune to get to know when I lived in the US. Joe is working on a major analytical database that analyzes the morphology of all of the words in Alonso de Molina's dictionary. For that reason he is extremely interested in finding out how all of the thousands of Nahuatl words in the dictionary can best be analyzed. This often leads to interesting questions.

The question of today's debate is this: Is the Nahuatl adjective istāk, which names the color white, derived from istatl, the noun meaning "salt"; or is the noun 'salt' derived from the adjective 'white'?

The question is relevant because it has ramifications for how we understand some basic things about Nahuatl grammar. 

In Nahuatl there is a clear tendency for color words to be derived from nouns that describe something with a particular color. This is of course very common in the world's languages: "orange" being an obvious example of this in English. In Nahuatl, many color names are similarly derived. The word chichiltik "red" is transparently derived from the word chilli "chili", and the color word tlīltik "black" is derived from the word tlīlli 'ink/soot'. Indeed in modern Nahuatl, one can productively derive new color terms by using the suffix -tik, which produces an adjective with the sense of "like X". So nēxtik "like ashes" can mean 'grey', and cafēntik "like coffee" or chocolatik "like chocolate" can mean "brown". And sometimes color words are even borrowed from Spanish with the -tik suffix, so that azultik is used for 'blue' in several dialects that I have encountered.

This -tik suffix is generally regarded as a kind of participial form where the -k is the preterit ending describing a completed action, and the -ti- morpheme is related to the intransitive version of the causative (sort of like an inchoative) that means 'to become' (e.g. in tlākati "to be born", composed of tlāka "human" and -ti). This means, interestingly, that apparently denominal adjectives in Nahuatl are in fact deverbal, since the noun has to be "verbed" before the adjective can be derived. Many other adjectives are derived from verbs using only the preterit ending -k, for example tomāwak 'fat' and chikāwak 'strong', derived respectively from the inchoative verbs tomāwa 'to become fat' and chikāwa 'to become strong'.
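The productive "like X" derivation with -tik can be sketched as a simple string operation: remove the absolutive ending of the noun, if there is one, and add -tik. This is a rough illustration under stated assumptions. The list of endings is my own simplification, and the function names are hypothetical. Reduplication is not modelled, so the sketch does not produce chichiltik from chilli.

```python
# Absolutive endings to strip, longest first. The list is a simplifying
# assumption for this sketch, not a full account of Nahuatl noun morphology.
ABSOLUTIVE_ENDINGS = ("tli", "tl", "li")

def noun_stem(noun):
    """Return the noun without its absolutive ending, if it has one."""
    for ending in ABSOLUTIVE_ENDINGS:
        if noun.endswith(ending):
            return noun[: -len(ending)]
    return noun  # loanwords such as Spanish azul carry no absolutive ending

def tik_adjective(noun):
    """Derive a 'like X' adjective by adding -tik to the bare stem."""
    return noun_stem(noun) + "tik"

print(tik_adjective("tlīlli"))  # native noun 'ink/soot'
print(tik_adjective("azul"))    # Spanish loan 'blue'
```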

But not all denominal adjectives have the -tik ending, and neither do all color words. Notably, the word for 'white', istāk, does not, but seems to have a simple -k suffix that is added to the stem ista- 'salt', producing the same effect as the -tik suffix. Other tik-less adjectives are xokok 'sour' (related to xokotl 'fruit'), kokok 'spicy' (related to kokoa 'hurt'), and sesēk 'cold' (related to setl 'ice' or to sēwa 'be cold'). This challenges us to think about how the derivational process works in these cases, where the noun does not seem to have been verbalized before derivation, but where the denominal adjective nevertheless carries the preterit marker -k.

Joe's proposal for how to deal with this is that the noun has indeed been verbalized, but that the verbalizing morpheme has been deleted. His argument goes like this: 

There is another verbalizing suffix in Nahuatl which is -ya, and it also gives an inchoative meaning 'to become X' or 'to make x Y'. For example from the adjective itztik 'cold' (maybe related to the noun itztli 'obsidian'), one can derive a deadjectival verb itztiya 'to become cold', and then one may form a participle with the preterit suffix -k so we get itztiyak 'cold' (but in a sense of "cooled down", implying that it was hot before). There is also such a verb derived from istatl 'salt', namely istaya 'to become salty'. 

So what if, Joe proposes, there is a grammatical rule that allows the -ya- to be deleted, so that itztik really is a shortened form of itztiyak, and istāk really is a shortened form of istayak? This would explain the seemingly non-verbalized adjectives derived from nouns.

My argument is that this assumption is unnecessary, and in fact contradicted by the etymological evidence regarding the words for 'white' and 'salt' in Nahuatl. 

Let me give a bit of theoretical context for my disagreement: 

Nahuatl is of course a Uto-Aztecan language, and to understand the history of words one should not look only at the productive derivational processes in the language, but also at other related languages to reconstruct the deep history of the language. 

Nahuatl did not emerge as a fully formed context-free grammatical system of generative processes that derive words through well-defined rules from a well-defined set of lexical items. Rather, it developed gradually and incrementally through phonological and grammatical alterations caused by speakers interacting with each other, borrowing from each other, and imitating each other's ways of using the language. It is simply unrealistic to expect to be able to explain all vocabulary through synchronic grammatical processes. Rather, we should invoke the historical process to explain the anomalies and irregularities that all languages have.

Let me now describe how the Nahuatl words for salt and white relate to the same words in other Uto-Aztecan languages.

Nahuatl:  istatl
Huichol: únaa
Cora: uná
Yaqui: óna
Tarahumara: oná
Northern Tepehuán: ónai
Shoshone: oŋa

Here we see that all the Uto-Aztecan languages have the word 'salt' derived from a single root that can be reconstructed as *ona. Nahuatl is the only Uto-Aztecan language to have a word for salt from a different root. This is not odd of course, Nahuatl could for example have borrowed its word for 'salt' from another language, or have innovated it from some other root.  

Nahuatl:  istāk
Huichol: tuxa
Northern Tepehuán: tóha
Yaqui: tosa'i
Tarahumara: tosakame
Shoshone: tosa

Here Nahuatl again appears to be the odd one out, but in fact Nahuatl istak is cognate to the other Uto-Aztecan words for "white". What happened in Nahuatl is that when a word of the shape CVCV had the accent on the second syllable, the vowel in the first syllable was weakened to the point of disappearing - after which a prothetic i- was inserted in front of the consonant cluster. So Nahuatl followed this development: tòsá > tsa > itsa. "Oh, but that gives *itsa and not the desired ista", I hear you object. And you are right, but when the vowel syncope produces a cluster of certain consonants, the two consonants then switch places through metathesis. This happens particularly with the cluster /ts/, which regularly metathesizes to /st/ after the syncope, perhaps to avoid confusion with the affricate phoneme /ʦ/. (Another example of this syncope with subsequent metathesis is the word for 'cave', ostotl, which comes from Uto-Aztecan *tɨso through the process *tɨso > tso > itso > isto > osto.) So while the word for salt in Nahuatl is not related to the Uto-Aztecan root for salt, the word for white is related and clearly derives from the ancient root *tosa. Nahuatl also has another word derived from the same root, but without syncope and metathesis, namely tīsatl 'chalk'. Here we must assume that the proto-language had two versions of *tosa distinguished by the placement of the accent, namely *tòsá "white" and *tósà "chalk" - the accented *ó developed into i, while the unaccented *ò was weakened and lost, producing the consonant cluster that subsequently underwent metathesis.
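The chain of changes described above (syncope, prothetic i-, metathesis) can be written out as three small steps applied in order. This is only a sketch. The input forms are written without accent marks, and the accent is simply assumed to fall on the second syllable. The last step of the 'cave' example (isto > osto) is not modelled. The function names are my own.

```python
VOWELS = "aeiouɨ"

def syncope(form):
    """Drop the first vowel of a CVCV form (accent assumed on the second syllable)."""
    is_cvcv = (
        len(form) == 4
        and form[0] not in VOWELS and form[1] in VOWELS
        and form[2] not in VOWELS and form[3] in VOWELS
    )
    return form[0] + form[2:] if is_cvcv else form

def prothesis(form):
    """Insert i- in front of a word-initial consonant cluster."""
    if len(form) >= 2 and form[0] not in VOWELS and form[1] not in VOWELS:
        return "i" + form
    return form

def metathesis(form):
    """Swap the cluster ts to st."""
    return form.replace("ts", "st", 1)

def derive(proto):
    """Return every stage from the proto-form to the final form."""
    stages = [proto]
    for step in (syncope, prothesis, metathesis):
        stages.append(step(stages[-1]))
    return stages

print(" > ".join(derive("tosa")))
print(" > ".join(derive("tɨso")))
```

Returning all stages rather than only the result makes it easy to compare each intermediate form with the derivation given in the text.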

On this ground alone, even though it is not a very common process in the world's languages, we can conclude that the noun meaning 'salt' in Nahuatl is derived from the adjective "white", and not the other way round. At some point speakers of Nahuatl stopped referring to salt as 'ona', and instead started calling it "white stuff". And other speakers of Nahuatl liked this new way of talking about salt so much that they all began doing it, and eventually forgot the word 'ona' had ever existed.

This, however, also means that we still have to explain the -k ending, which then cannot really be considered a participial ending, as this would require the root to be verbalized.

Here comes my attempt at an explanation:

Whenever we learn a language, whether as children or adults, the main task is to observe and understand the different patterns of the language in a way that allows us to produce utterances that other speakers will understand. When we hear what others say, they can help us understand by using constructions that we have heard before, and that we can therefore be expected to understand. And when we speak we do the same to allow others to understand us. Irregularities hinder this process, and therefore we tend to over time convert irregular patterns to regular ones. This process is called analogy. 

Speakers of Nahuatl have used a set of patterns to help themselves distinguish well between different parts of their language. The final segment of a word tends to give a clear hint to the listener about whether the word is a verb, a noun or something else. Nahuatl has two major open word classes: verbs and nouns (and then some minor closed word classes such as particles, and a small class of true adjectives). Because Nahuatl has very free word order, it is helpful to be able to recognize words as nouns or verbs by their phonological form.

Verb stems always end in a vowel, and this vowel is usually a, less frequently i, very rarely o, and never e. Most nouns end with the absolutive suffix that has the most frequent form -tl/-tli. Perfective forms, both verbal and participial (participials of course being sort of mid-way between verbs and nouns), end with -k or -ki.
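These word-final cues can be turned into a rough heuristic that guesses what a listener might infer from the shape of a word alone. It is a sketch under stated assumptions, not a parser. The ending -li is added here as a variant of the absolutive, and a verb stem that happens to end in -ki would be misread as perfective.

```python
def shape_cue(word):
    """Guess the word class that the final segment of a word points to."""
    if not word:
        return "no clear cue"
    # Check the absolutive first: -tli and -li end in a vowel and would
    # otherwise be mistaken for verb stems.
    if word.endswith(("tl", "tli", "li")):
        return "noun-like"
    if word.endswith(("k", "ki")):
        return "perfective-like"
    # Verb stems end in a, i or o (short or long), never in e.
    if word[-1] in "aioāīō":
        return "verb-like"
    return "no clear cue"

for word in ("nakatl", "istāk", "chikāwa", "kwalli", "wēyi", "wēwe"):
    print(word, "->", shape_cue(word))
```

Note that wēyi comes out as verb-like and wēwe gets no clear cue, which is one way of seeing why a class of adjectives without a marker of its own is hard to recognize by shape.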

In Nahuatl adjectives form an odd word class. Some adjectives are 'verby', either by being derived from verbs or by being participial forms of verbs. Others are 'nouny' and take nominal morphology (for example kwalli 'good', which originated as a nominal form of the verb 'eat' and originally meant 'edible'). And yet others are neither verby nor nouny (the ones we could call "true adjectives"): for example wēwe 'old', wēyi 'big'. Most adjectives, however, are verby participials ending in -k or -ki. This ambiguity, where a single class of words has no single overt mark associated with it, is a kind of irregularity that makes it harder for listeners to cognitively process utterances. This is the kind of situation that can cause processes of analogical change to kick in, by enforcing the dominant pattern on the irregular cases. The dominant pattern is that adjectives end in -k or -ki.

What I propose is therefore that the class of true adjectives was originally unmarked in the Uto-Aztecan languages, as is also the case in most of the languages today. But speakers of Nahuatl began to derive adjectives deverbally as participials, creating a huge class of adjectives ending in -k. They then started gradually extending the -k pattern also to those true adjectives that originally ended in a vowel (and therefore looked verby), making them more recognizably adjectival.

ista 'white' became istak
yankwi 'new' became yankwik 
yeti 'heavy' became yetik 
koko 'spicy' became kokok
xoko 'sour' became xokok
yawi 'blue' became yawik 

In processes of analogical change, the most frequently used words are often the last to become assimilated to the regular pattern. This seems to be exactly what we see in Nahuatl, as wēyi, kwalli and wēwe are among the most frequently used adjectives. Perhaps in the future they will become *wēyik, *kwallik and *wēwek.
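The proposed change can be sketched as a toy rule in Python. This is only an illustration of the reasoning, not a model of attested sound change: the function name `regularize`, the vowel set and the `HIGH_FREQUENCY` list are assumptions introduced for the example, with the word forms taken from the list above.

```python
# Toy model of the analogical extension of -k proposed above.
# Assumption: "true adjectives" are vowel-final, and the only words that
# resist the change are the high-frequency ones named in the text.

VOWELS = set("aeiouāēīōū")

# Hypothetical stand-in for "most frequently used adjectives".
HIGH_FREQUENCY = {"wēyi", "kwalli", "wēwe"}


def regularize(adjective: str, resistant: set[str] = HIGH_FREQUENCY) -> str:
    """Return the adjective after analogical extension of the -k pattern."""
    if adjective in resistant:
        return adjective        # frequent words keep their old shape
    if adjective[-1] in VOWELS:
        return adjective + "k"  # vowel-final form is made to look adjectival
    return adjective            # already consonant-final: nothing to do


if __name__ == "__main__":
    for word in ["yankwi", "yeti", "koko", "xoko", "yawi", "wēyi", "kwalli", "wēwe"]:
        print(word, "->", regularize(word))
```

Calling `regularize("wēyi", resistant=set())` gives the hypothetical future form *wēyik, which is the point of the speculation above: the rule is the same, and only frequency holds these words back.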

It is interesting to think that perhaps istatl is not the only noun derived from an adjective: xokotl 'fruit' might originally have meant "something sour", and yawitl 'blue corn' might originally have meant "something blue". There is no word *kokotl in Nahuatl with a meaning similar to "something spicy" (kokotl in fact means "pimple"), but the word for chile in Corachol and other Southern Uto-Aztecan languages is kukuri, where the -ri could well be considered equivalent to the Nahua absolutive suffix -tli. Perhaps Nahuatl used this same word *kokotl or *kokol with the meaning "chile", before introducing the word chilli.

The point of it all is a reminder that even though Nahuatl is a language with an insane amount of productive morphology, where derivations can be stacked upon derivations, back and forth between the categories - that does not necessarily mean that everything can be (or should be) explained through synchronic processes and grammatical rules. Even as we strive to accurately describe the different grammatical processes that operate in the Nahuatl language, we must remember that it is not in fact the grammatical rules that determine how people speak, but rather the ways in which people speak that produce the rules of grammar.

Friday, December 22, 2017

How similar is Nahuatl to Hopi?

I recently encountered a surprising claim in a book called "Our Sacred Maíz Is Our Mother: Indigeneity and Belonging in the Americas", by Roberto Cintli Rodríguez. The claim is that Nahuatl and Hopi are so closely related that people who speak one will also be able to understand the other.

Nahuatl and Hopi are both Uto-Aztecan languages, but linguists classify them as being as far from each other in the Uto-Aztecan language family as is possible. So given that even dialects of Nahuatl can be impossible to understand for speakers of other dialects, it is a remarkable claim that a Nahuatl speaker should be able to understand Hopi.

The Flag of the Hopi nation,
with cornstalks and the four corners.

Rodríguez notes that this claim is contrary to everything linguists would have to say about the relation between the two languages, but states that a Nahuatl speaker he calls Maestra Cobb has talked about an experience when she was able to understand words spoken in Hopi by Hopi elders. While no linguist can of course say that Mtra. Cobb is wrong about her own experience, we can certainly suggest that if it is true it is such an exceptionally odd occurrence that it would normally require more than anecdotal evidence for others to accept. 
From a linguistic point of view, the claim is similar to an English speaker stating that she understood spoken Greek without having ever heard the language before. The saying "it's all Greek to me" is meaningful exactly because this does not usually happen (that is, ever). The distance between Nahuatl and Hopi, whether measured in miles between the two current speech communities, or in years since the last common ancestor, is about the same as the distance between English and Greek. The father of empiricism, David Hume, once wrote that "A wise man ... proportions his belief to the evidence" (repeated by Carl Sagan as "extraordinary claims require extraordinary evidence").

In the following, I will compare Nahuatl and Hopi to demonstrate just how extraordinary the claim made by Rodríguez and Mtra. Cobb is. Since I don't know Hopi myself, I will take phrases and words from Milo Kalectaca and Ronald Langacker's 1978 "Lessons in Hopi" and compare them to their Nahuatl equivalents.

Let's start with 10 basic vocabulary items:
kuuyi /paahu

Of these ten, only two are close enough that a person knowing the word in either Hopi or Nahuatl might reasonably be expected to guess the meaning of the word in the other language: "I" and "man". Of the other words, four more are in fact related ("moon", "corn", "water", "star"), but are so far from each other in sound that it would be very surprising if someone was able to guess the meaning of the related word in the other language. The last four are not related at all but come from separate roots: for these, one of the languages probably borrowed its term from another, unrelated language.

Now let's compare actual sentences. The grammar of the two languages is also very different.

"What is your name":
Hopi:       Um hin maatsiwa?   (literally: "you how be.named")
Nahuatl:  kenin timotoka?  (literally "how you.name.yourself") (or in the Huasteca variants kenihki motoka)

Here, Hopi has three words and Nahuatl has two. The only words that seem related are the words for "how" - but I am not in fact sure they are. Certainly a person who speaks one language but is asked in the other would only be able to guess the meaning from the context, in the same way that we might guess that someone speaking a foreign language is introducing themselves if we see them shaking hands and saying the same thing to various people. One would be understanding the context, but not the words (this is of course how we all learn our first language, without a dictionary).

"She is eating"
Hopi: Pam tuumoyta (literally.  he/she/it     is.eating)
Nahuatl: (yeh) tlakwa (literally. he/she/it something-eat)

Here we see that neither the third person singular pronoun nor the verb "to eat" seems related. In Nahuatl the pronoun can be omitted, but in Hopi it cannot.

Here we see another very big difference between Hopi and Nahuatl:

maana   tiyot      tsotsoona "the girl kisses the boy"
girl        boy       kisses

tiyo      maanat   tsotsoona "the boy kisses the girl"
boy     girl          kisses

In Hopi the subject of the sentence usually comes first, the object second and the verb last. Nouns have a special object form (the ending in -t) that makes it possible to see whether a noun is the object of a sentence.

In Nahuatl the same sentence can be said in any of the following ways:

kipitsoa    in piltontli    in ichpokatl
kisses.it    the boy         the girl

kipitsoa in ichpokatl in piltontli
kisses.it     the girl        the boy

in ichpokatl kipitsoa    in piltontli
the girl        kisses.it    the boy

in piltontli kipitsoa     in ichpokatl
the boy      kisses.it    the girl

in piltontli in ichpokatl    kipitsoa
the boy      the girl           kisses.it

But regardless of the order of the elements, the sentence can mean either "the boy kisses the girl" or "the girl kisses the boy". The order of the words is irrelevant, and there is no specific object or subject form on the nouns that lets us see what role the noun has in the sentence. Only intonation and context allow us to decide whether the sentence means that the boy or the girl does the kissing. Also, the Nahuatl verb has the prefix ki-, which marks that the object is third person singular, i.e. "he/she/it". This is a very big difference in the way that the grammar of the two languages works: Hopi is a language with fixed word order and grammatical case marking on nouns, Nahuatl is a language with free word order and grammatical marking on verbs. Additionally, of course, none of the words in this sentence are related or even look like each other.
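The contrast can be made concrete with a small Python sketch. It is a deliberate simplification that only covers the example sentences above: the function names are hypothetical, the Hopi function assumes exactly two nouns followed by the verb, and it treats any noun ending in -t as the object form.

```python
from itertools import permutations


def hopi_roles(sentence: str) -> dict[str, str]:
    """Assign roles in a simple Hopi transitive clause using the -t object form."""
    *nouns, verb = sentence.split()
    # The noun carrying -t is the object; the other noun is the subject.
    obj_index = next(i for i, noun in enumerate(nouns) if noun.endswith("t"))
    subj_index = 1 - obj_index  # assumption: exactly two nouns
    return {
        "subject": nouns[subj_index],
        "object": nouns[obj_index][:-1],  # strip the -t to get the base form
        "verb": verb,
    }


def nahuatl_readings(noun_a: str, noun_b: str, verb: str) -> list[dict[str, str]]:
    """Neither noun is marked, so both role assignments stay open in any word order."""
    return [
        {"subject": subject, "object": obj, "verb": verb}
        for subject, obj in permutations([noun_a, noun_b], 2)
    ]


if __name__ == "__main__":
    print(hopi_roles("maana tiyot tsotsoona"))
    print(hopi_roles("tiyo maanat tsotsoona"))
    print(nahuatl_readings("in piltontli", "in ichpokatl", "kipitsoa"))
```

The Hopi function returns one answer per sentence because the noun itself carries the information. The Nahuatl function always returns two candidate readings, which is the job that intonation and context have to do for a real listener.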

The last example I will give is:

"I see you"
Hopi:       na ung tuwa
Nahuatl:  nimitzitta

Here Hopi has three words and Nahuatl has one. In fact two of the elements are related: the verb "to see" is tuwa in Hopi and itta in Nahuatl, and although they do not look alike they are in fact related; and the Hopi word for "I", na, is related to the Nahuatl prefix for the first person subject, ni-. But even though the elements are related, I have a very hard time imagining that any Nahuatl speaker or Hopi speaker will be able to understand the meaning of the word in the opposite language.

So I would say that while Rodríguez' friend Mtra. Cobb may have been able to guess the meaning of a sentence in Hopi, or perhaps had heard some words of Hopi before that allowed her to understand some parts of a sentence, it seems highly unlikely that she would be able to understand Hopi in general - and even more unlikely that a random Nahuatl speaker would be able to understand a random Hopi speaker, much less to converse.

But in the end it is of course an empirical question that can only be answered by carrying out the experiment.

I have always wanted to go visit Hopi, and I have Nahua friends who I am sure will be happy to come with me to meet their distant cousins up there.

Friday, July 28, 2017

Reviewing Kaufman’s evidence for Mixe-Zoque, Wastekan and Totonakan borrowings in proto-Nahuan

In a 2001 paper, distributed on the internet through the website of the Project for the Documentation of Languages of Mesoamerica (PDLMA), the eminent linguist and expert in Mesoamerican languages Terrence Kaufman analyzed the prehistory of Nahuan languages. He focused specifically on showing how influence from the languages of the Mesoamerican Language Area participated in shaping the Southern Uto-Aztecan dialect proto-Nahuan into the Mesoamerican language Nahuatl. The data used for the paper is very impressive, his conclusions well argued, and Kaufman’s writing style is as always very authoritative, and so the paper has been cited quite a few times (30 citations in Google Scholar).

In this post, I take issue with some of the conclusions in Kaufman’s paper. Specifically, I will show that Kaufman significantly overstates his evidence for substantial lexical influence from Mesoamerican languages on proto-Nahuan, because he does not adequately take into account alternative, potential or probable etymologies from Uto-Aztecan sources. I show that most of his proposed borrowings into proto-Nahuan are in fact equally (or more) likely to have Uto-Aztecan etymologies, either from proto-Uto-Aztecan or from proto-Corachol-Nahuan, or can be plausibly analyzed as combinations of Nahuan roots.

My conclusion is that there are far fewer borrowings from Mixe-Zoquean, Wastekan and Totonakan in proto-Nahuan than often thought, and that we therefore cannot use this contact as evidence that proto-Nahuan was spoken in the area of north-eastern Mesoamerica where Kaufman locates the speech community. Rather we should locate the proto-Nahuan speech community on the north-western periphery of Mesoamerica in close contact with Corachol and with Oto-Pamean languages.

Proposed loans from Mixe-Zoque in all Nahuan

Kaufman’s source | Potential UA etymology
 | PUA *kawa “shell”
PZ *kɨ’ak | PCN *kakai
PMZ *kopak “head” | PUA *kupa “top of head/hair”
PMi *pus | Huichol *purusi “stub, cut short” < PCN *puyusi “stub”
PZo *pata’ | PCN *pɨta
Old man
PMZ *na’w | PCN *nawari “thief”, *nawa “steal”
PMZ *(hah)-¢uku |
PMZ *tu’nuk | Corachol *tutuvi “large parrot”, Nahua toto “bird”
PMZ *sam “heat” | PCN *sia “sand/clay” + mi “collective plural”
PMZ calque of *tɨ’k-ɨy “house enter” |

Kaufman proposes 9 borrowings and a lexical calque from proto-Mixe-Zoque, proto-Zoque or proto-Mixe into proto-Nahuan. Of these borrowings, 7 have equally probable Uto-Aztecan etymologies, and 5 have definite cognates in Corachol, suggesting that if they are borrowings and not inherited, then the borrowing would have been between proto-Mixe and proto-Corachol-Nahuan. The calque seems likely, and the word for ant seems possible. I also think the word for cacao is a likely borrowing from Mixe-Zoque, since the alternative “shell” etymology proposed by Dakin and Wichmann is somewhat weak, and since it is extremely unlikely that proto-Nahua was spoken by people who lived in a cacao-producing region, whereas proto-Mixe-Zoque almost certainly was. Nevertheless, the claim of Mixe-Zoque contact with proto-Nahuan seems to lack real support once the alternative etymologies are examined. This is particularly significant because the words proposed as borrowings are highly culturally significant: they suggest that Mixe-Zoque speakers had a profound culturalizing influence on proto-Nahua speakers, teaching them to use footwear, to live in adobe houses with domesticated livestock such as turkeys, and to use the culturally salient luxury good cacao, and that through them the Nahuas adopted the pan-Mesoamerican belief in shapeshifting sorcerers. With these borrowings in doubt, the role of Mixe-Zoque in this regard seems much less significant. Kaufman has been a major proponent of seeing Mixe-Zoque speaking Olmecs as the drivers of the development of the Mesoamerican cultural area, and they probably were, but it does not seem to me that there was any significant contact between Mixe-Zoque speakers and the proto-Nahuan speech community. This probably means that the Nahuas entered Mesoamerica after the decline of Olmec civilization in the centuries before the beginning of the first millennium.

Proposed loans from Wastekan in all Nahuan

Kaufman’s proposal
Potential UA etymology
Sp. of Parrot
Nahua oši-ƛ “sticky dirt”
Nahua ne:-te:č “reciprocal-together”

Kaufman proposes 5 loans from Wastek Maya into proto-Nahuan. Of these, only pulque and deer-foot seem likely loans. Kochotl is not a general Nahuan word, and there is no reason to reconstruct it for proto-Nahuan; it is likely an exclusively eastern or Huasteca Nahua loan. Netech is morphologically analyzable as ne-te:ch. Ohoxihtli seems likely to be a reduplicated form of oxitl “dirt that comes off when you wash”. Nahuas in fact associated the origin of pulque with the Huastecs, so it seems likely that this is indeed a loan. In conclusion, there may have been contact between proto-Nahuan and Wastekan, but if there was, it was quite limited: the only likely loan is the word for pulque, and in fact not all Nahuan varieties have this root, as many use the inherited word for “honey”, nekwƛi, instead.

Proposed loans from Totonac in all Nahuan

Kaufman’s Totonac source
Alternative source
PUA (Stubbs, 2011, #557)
-¢in diminutive
Otomi-Mazahua či-
Corachol ¢i-/-ši (š is a regular cognate of Nahuan ¢ in Corachol)

Corachol *siuri “tadpole” regularly becomes Nahuan *šoli-.
Nahua: wa(k/h)-kal-ƛi “drying house”.
Corachol ¢ɨ¢ɨ
Sp. Of fish
waapa “tilapia”
Brother in law
Older sister
-pi “sister” (not younger)
SUA *saka “grass”, Hopi tïïsaqa ”grass”, NUA *saka “willow” (Stubbs 2011 #1055)
Plate/flat bowl
Wild avocado
Avocado is yewka in Coracholan suggesting an origin as proto-Cora-Nahuan *pewaka
Proto-Corachol-Nahuan *siwi “sour/bitter”. *iw becomes Nahua o, but the question is where the -ko element then comes from.
Totonac *ƛ

Kaufman’s 14 proposed loans from Totonac fare a little better when checked for plausible alternative etymologies. The forms šolotl, wahkalli, chichi, pihtli, pawatl have viable UA etymologies. Šolotl and chichi are shared with Corachol. The diminutive -tzin could be borrowed from Totonac, but Otomi-Mazahua has a diminutive/honorific prefix či- and Coracholan has a diminutive prefix ¢i- and an honorific suffix -ši. The Totonac form does match the Nahua form better than either of those sources. In any case there is a basis for considering the -¢i diminutive morpheme to be an areal trait, since it is shared between Mesoamerican languages of three different linguistic families (Totonakan, Oto-Pamean and Uto-Aztecan).
The words pochotl and xonotl describe species with restricted distribution, and likely arose as local borrowings in the Nahuatl varieties spoken where these species are found, only subsequently spreading through inter-Nahua contact. I would not reconstruct these words to proto-Nahuan. Wapotl and čone are not found in all (or most?) Nahuan dialects, but are local (recent) borrowings.
That leaves the words for plate, brother-in-law, tilapia and xonote, as well as the phoneme ƛ, as likely borrowings from Totonacan into proto-Nahuan.


Out of 29 proposed borrowings, only 9 seem more likely to have been borrowed than to have been inherited. So, having reviewed the evidence of borrowings from Mixe-Zoquean, Totonac and Wastekan, I must conclude that the extent of lexical borrowings from Mesoamerican languages into proto-Nahuan is greatly overstated by Kaufman.
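The totals can be checked with a short Python tally. The number of proposals per family comes straight from the three sections above; which items count as "likely borrowed" is one reading of the three summaries that happens to reproduce the total of 9, and should be treated as an assumption rather than as a definitive list.

```python
# Proposals per source family, as stated in the text
# (Mixe-Zoque: 9 borrowings plus 1 calque).
proposed = {"Mixe-Zoque": 10, "Wastekan": 5, "Totonac": 14}

# Assumed reading of which proposals survive the review.
likely_borrowed = {
    "Mixe-Zoque": ["calque 'house enter'", "ant", "cacao"],
    "Wastekan": ["pulque"],
    "Totonac": ["plate", "brother-in-law", "tilapia", "xonote", "phoneme ƛ"],
}

total_proposed = sum(proposed.values())
total_likely = sum(len(items) for items in likely_borrowed.values())

if __name__ == "__main__":
    print(f"{total_likely} of {total_proposed} proposals seem likely borrowings")
    for family, items in likely_borrowed.items():
        print(f"  {family}: {len(items)} of {proposed[family]}")
```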

Kaufman also shows a long list of borrowings from Wasteko into Huastecan Nahuatl, the Nahuatl variety that we know has been spoken in close contact with Wastekan Maya for centuries. Here, most of the proposed borrowings seem completely plausible, but a couple suggest to me that the direction of borrowing is the opposite of what Kaufman assumes.

For example the Wastekan word kw’itš’a “grind in mortar” which Kaufman proposes as the source of Huastecan Nahua tekwicha “pestle” seems likely to be related to the Nahuan word for grinding kwečoa “to grind” from PN kwe¢iwa, and related to Huichol rakwi¢i “nixtamal”, Cora kwei¢i “dough” – suggesting a loan from Nahuan into Wasteko.

The Wastek word molik “elbow” is suggestive, but it is not restricted to Huastecan Nahuatl as Kaufman implies; it is also found in the western Nahua branch (and as molic in Molina’s dictionary). This suggests either borrowing into Wastek from Nahuan or an additional example of Wastek contact with PN. Given the otherwise unconvincing evidence for Wastek/proto-Nahuan contact, it is probably best to see the default hypothesis as a loan from Nahuan into Wastek. The proposed borrowing of Wastek či’im “maguey juice” as čiimiƛ “mother's milk” in Wastek Nahuan is unlikely, since Cora has ¢i’imé “mother's milk”, suggesting again borrowing in the opposite direction.

In the paper itself, Kaufman states that Mesoamerican languages are seemingly reluctant to borrow, and that therefore any situation in which a language is permeated by borrowings shows very intense contact. I think this review of the paper suggests that proto-Nahuan was not permeated with borrowings from Wastekan, Totonakan and Mixe-Zoquean.

References Cited: 

*Dakin, K., & Wichmann, S. (2000). Cacao and chocolate. Ancient Mesoamerica, 11(1), 55-75.

*Kaufman, T. (2001). The history of the Nawa language group from the earliest times to the sixteenth century: Some initial results. Paper posted online at http://www.albany.edu/anthro/maldp/Nawa.pdf. University of Pittsburgh.