Tuesday, 25 December 2018

Was the Voynich manuscript written in Nahuatl?

Excerpt of the text from the Voynich Codex
 showing the odd script.

Recently, a number of papers by a group of botanists from Purdue University have proposed that the enigmatic Voynich manuscript, which has so far resisted decipherment, was written in Nahuatl in the 16th century.

The Voynich manuscript is a codex written on vellum. It clearly includes botanical illustrations, but also a number of baffling illustrations that seem to be cosmological, as well as maps. The pictures are accompanied by writing in a mysterious script that has been subject to multiple analyses and decipherment attempts.

In this blog post, I give my impression of the linguistics of the proposed decipherment of the Voynich manuscript as a kind of Nahuatl.

Excerpt from the 16th century Nahuatl language
herbal Codex Badianus showing the similarity of the illustrations
(Actually, I think the Badianus has much better illustrations.)
The scholars who have advanced the proposal that the codex is written in a form of Nahuatl are Arthur Tucker, Rexford Talbert, and Jules Janick. Their 2013 proposal, titled "A Preliminary Analysis of the Botany, Zoology, and Mineralogy of the Voynich Manuscript", was published in HerbalGram, the journal of the American Botanical Council, with additional material published at their institutional repository.

Now, in 2018, Janick and Tucker have published a book titled "Unraveling the Voynich Codex" with Springer, which presents the entire argument in favor of seeing the Voynich manuscript as a Mexican codex, written largely in Nahuatl - with some Spanish and Taino mixed in.


The Codex: 

Folio 9r of the Voynich Manuscript
showing a plant with odd shaped leaves.

The codex has 240 pages, some of which are wide fold-out pages. Analysis of the parchment has shown it to be from the early 15th century, made from calf skin. Most of the contents are illustrations of plants with small texts written in an odd script. Other pages are astrological charts, populated with little nude ladies who bathe and shower in odd tubs connected with pipes.

The first known owner was Georg Baresch, a 17th-century alchemist in Prague. Other owners seem to have been Emperor Rudolph II and Athanasius Kircher, the Jesuit scholar and self-proclaimed decipherer of the Egyptian hieroglyphs. When the Jesuit society decided to sell the manuscript, it was bought by the Lithuanian bibliophile Wilfrid Voynich, after whom it is named. Today it is housed in the Beinecke Library at Yale University, where it is catalogued as "Beinecke MS 408" and has been digitized and put online for anyone to inspect (located here: https://brbl-dl.library.yale.edu/vufind/Record/3519597).

All of the pages have writing in the odd script, and in spite of a host of the world's quirkiest minds working to decipher it, it has still not been read.

Here is a chart of the symbols (from Wikipedia) - the correspondence with the Latin alphabet serves only to name each glyph with a letter from A to Z:


As mentioned, the mysterious manuscript has been scrutinized by many of the world's quirkiest minds - the same type of mind that would spend a career seeking to prove that Basque or Burushaski are Indo-European languages - and they have produced an amazing gamut of different proposals: codes and ciphers, a hoax, shorthand Latin, glossolalia, an East Asian language, and now, Nahuatl.

But most of these odd proposals have not been published as presumably(?) peer-reviewed edited volumes by Springer, so the Nahuatl proposal does merit serious attention, especially given the fact that no Nahuatl specialists have been involved in the decipherment.

The Argument for Nahuatl: 

There are three main arguments used for identifying the manuscript as written in Nahuatl:

  1. The herbological part of the codex has similarities to Mexican herbological codices produced in the mid 16th century, and the botanists argue that many of the plants can be identified as New World species. They also argue that a map of a city can be identified as "angelopolis", which they identify as the city of Puebla (de los Ángeles) in the state of Puebla.
  2. One character, which is very frequent in the manuscript, is similar to a ligature character found in some Mexican codices representing the Nahuatl consonant tl. (It also sort of looks like the way I write capital H when I write my signature, and like how many people write a double l.)
  3. The proponents argue that some of the plants can be identified by Nahuatl names, and claim that they can read some of the text in Nahuatl, using their identification of the glyphs with Nahuatl phonemes.

The proposed tl-letter looks like the first letter in
this word tlanequilis from
an 18th-century Nahuatl testament.

I will look primarily at the third of these arguments, because this is the actual claim to a decipherment; arguments one and two can be true even if the language is not Nahuatl. All claims to decipherment of course rest on the degree to which they actually allow us to read the texts written in the script that they are claiming to decipher.

The main argument of the book is that the manuscript contains elements of Nahuatl and New World flora, that it contains inspiration from the Jewish Kabbalah (which they claim was practiced among Franciscans in the New World), and that it refers to the city of Puebla de los Ángeles, which was founded by the Franciscan friar Toribio de Benavente "Motolinia".

Nevertheless, an odd chapter by the linguist Fernando Moreira looks at the readings and compares them with different Mesoamerican languages, finding that they don't really match any of them - and then proposes an undescribed Mesoamerican language which he calls "acolhuacatlatolli" (the Nahuatl word for "language of the Acolhua"). The Acolhuas were the Nahuatl-Otomí ethnic group that lived in Texcoco. We know their language very well, since most of what we today call "Classical Nahuatl" is in fact the Acolhua dialect of Nahuatl. Moreira nevertheless oddly suggests that it could have been a form of Popoloca (which is what Nahuas called all the languages they couldn't understand, including at first Spanish).

So while the general argument of the book is that the language is a form of mixed Nahuatl-Spanish, the chapter by Moreira argues that it is not, and then introduces an unknown and undescribed language as a sort of deus ex machina that allows them to maintain the main parts of their hypothesis when the evidence is shown not to support it. In the rest of this blog post, I will argue based on the original proposal that it is Nahuatl or has a Nahuatl element, and not based on the alternative hypothesis that it represents an undescribed Mesoamerican language, nor the possibility that it represents a language spoken by space aliens who built the Mexican pyramids.

The Problems: 

OK, I am already going into the problems with the proposal. The most nefarious problem is that it is pseudo-rigorous - that is, it works hard to give the appearance of being rigorous scholarship while in fact it is not at all. They cite lots of serious scholarship, and mostly they cite it correctly, but nevertheless all the citations are used only for circumstantial evidence. As soon as we look at the concrete examples and the readings, they are unsupported by this evidence and rest on pure speculation - often uninformed speculation.

For me the best problem, best because it is so solid that it clearly invalidates the entire endeavor, is the fact that none of the proposed readings are valid - hardly a single one of the proposed words actually reads like a bona fide Nahuatl word.

Many of them are completely alien to Nahua phonological structure. And to be honest I am surprised that the scholars haven't found it to be odd that a few of the letters are so frequent that they appear in almost all words - for example more than half of the proposed plant names (and names of the nude ladies they call "nymphs") start with the letter that they read as /a/ - that would be very odd in a natural language, unless the a was a very frequent grammatical prefix (which it isn't in Nahuatl).

The readings:
Table from Janick & Tucker 2018:141

Janick and Tucker produce a full set of proposed readings for the Voynichese symbols, given in two tables on pages 141-142. I reproduce the first part of the table here to the right (non-underlined Latin equivalents are "tentative").

Following the tradition of comparing letter frequencies in decipherment proposals, the table also supplies the frequency of each symbol in the Voynich Manuscript and the frequency of the proposed Latin equivalent in a randomly selected Nahuatl manuscript.

It is odd that the proposed readings include both signs for single phonemes and signs for the syllables câ (we are not told what the circumflex above the â is supposed to mean; does it represent a saltillo?) and yâ/hâ (hâ is not actually a possible Nahuatl syllable).

It also seems that Janick and Tucker fail to realize that the letter u found after c and h in Classical Nahuatl texts is not actually a vowel, but represents the sound of the consonant /w/ or the lip rounding in the phoneme /kw/. This is basic stuff, and it is why it makes no sense to attempt a decipherment using a language that one does not in fact understand (Champollion knew this, and that was why he spent so much time studying Coptic and a number of Semitic languages).

Here are some of their readings of the names of plants.

First the one that seems to be their clou: the reading of the name of a cactus-like plant as <NĀSHTLI>. They argue that this reading resembles the Nahuatl name of the fruit of the nopal cactus, which is nōchtli. And sure, it does look similar to that word. The -tli ending looks like the absolutive suffix, the root NĀSH is superficially similar to nōch-, and the reading follows Nahua phonological rules. Nevertheless, a and o are different vowels in Nahuatl, and sh (x) and ch are different consonants - so only one out of three letters in the root of the proposed reading actually matches; the others are "near matches" at best.

Other readings fare a lot worse. Look for example at these proposals:





As is known to any serious student of Nahuatl, Nahuatl does not allow consonant clusters in the beginning or end of syllables, and also does not allow clusters of more than two consonants in the middle of words. Words like ichpchi or itlmamcho or itlmaca or itlmchi are not possible words in any dialect of Nahuatl.

It seems reasonable to expect more of a decipherment than for it to produce one near match and then a load of meaningless gibberish.

Some of the syllables or even sequences of two syllables that occur frequently in their readings do have potential readings - but this is only natural given that Nahuatl has a rather small phoneme inventory and therefore not many different potential syllables. For example they note that mā means "hand" and cui means "to take" and māca means "to give" - but given how short these sequences are and how frequent the elements are, it is simply a coincidence. When there are so few letters, and the letters have been assigned Nahuatl equivalents, some sequences in the reading are bound to look like some sequences in the vast vocabulary of Nahuatl.

The chance of random matches gets even worse when they admit the possibility of readings in Spanish and Taino and of weird mixtures of the two (unlike anything found in any colonial document). Why, for example, would a Nahua or Nahuatl speaker, given that Nahuas were expert cultivators of agave, use the Taino word for agave "maguey" (in the mangled spelling <MAHUEOI>) and not the Nahuatl word metl?
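
To see why such short matches carry so little weight, here is a rough back-of-the-envelope sketch in Python. It assumes about 15 consonants and 4 vowel qualities for Classical Nahuatl (ignoring vowel length), and the share of syllables that happen to be meaningful (`meaningful_fraction`) is a made-up placeholder, not a measured figure. The function names are my own.

```python
# Illustrative only: the inventory sizes are approximate, and the
# "meaningful fraction" passed in below is an assumed placeholder.

N_CONSONANTS = 15  # approximate Classical Nahuatl consonant inventory
N_VOWELS = 4       # vowel qualities a, e, i, o (length ignored)


def cv_syllable_count(consonants=N_CONSONANTS, vowels=N_VOWELS):
    """Number of distinct consonant+vowel syllables the inventory allows."""
    return consonants * vowels


def chance_of_some_match(meaningful_fraction, n_sequences):
    """Probability that at least one of n random syllables is a real morpheme."""
    return 1.0 - (1.0 - meaningful_fraction) ** n_sequences


print("possible CV syllables:", cv_syllable_count())
for n in (5, 10, 20):
    # 0.25 is a hypothetical share of syllables that mean something.
    print(n, "sequences ->", round(chance_of_some_match(0.25, n), 3))
```

With only 60 possible consonant-vowel syllables, even a modest share of meaningful ones makes it close to certain that a handful of arbitrary short sequences will "mean" something.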


A Test: 

The best way to assess a proposed decipherment is of course by testing it on a piece of text and seeing what it produces, and whether it is intelligible.

I tried such a test on a piece of text from the top of folio 28v, and below is the result. It is utterly unintelligible, and it has only the vaguest resemblance to Nahuatl - and that is only because of the strong association between the /TL/ phoneme and Nahuatl. The phonology is alien to Nahuatl, allowing for example consonant clusters in the beginning and end of words, and failing to respect the Nahua phonological rules of assimilation. Nahuatl is of course a language that has few phonemes and a lot of a's and a lot of tl's and cu's and hu's, and so does this proposed reading - but that is only because the decipherers on purpose have assigned those readings to the most frequent letters. Furthermore, these letters are twice as frequent in this "language" as they are in Nahuatl according to their own count - for example Nahuatl only has the frequency 4.7% for tl, whereas the Voynich has the frequency 10%. So what we get is a text that superficially looks like Nahuatl, but only to someone who doesn't actually know any Nahuatl.

Nevertheless, anyone who knows any dialect of Nahuatl will be able to see that the below is not Nahuatl, and that only certain words resemble Nahuatl because they have the sounds and endings that are frequent in Nahuatl such as -tli and -câ (why do the decipherers add the ^ symbol above the a in the câ letter? They never explain what it is supposed to represent).

Following the proposed decipherment this text reads:
TLMCÂ CUAALL MAE  HUMOLL  MAHUMI CUATLI CHIMAEI
ITLMACÂI CUATLO MICHI CUATLMAE MAE TLMI CUATLMAECHI MAEA
MAE MACÂ MI MALL
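
The consonant clusters in this reading can be checked mechanically. Below is a minimal sketch in Python that tests only the cluster rule described earlier: no clusters at the beginning or end of a word, and no more than two consonants in a row inside a word. It rests on simplifying assumptions of mine: the digraphs tl, ch, tz, cu, hu and qu each count as one consonant, ll counts as two, and diacritics are ignored. The function names are my own.

```python
import unicodedata

# Assumption: these spellings each stand for a single consonant.
DIGRAPHS = ("tl", "ch", "tz", "cu", "hu", "qu")
VOWELS = set("aeio")  # u is not treated as a vowel here


def strip_marks(text):
    """Remove diacritics such as the circumflex or macron."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def cv_shape(word):
    """Turn a word into a string of C (consonant) and V (vowel) symbols."""
    letters = strip_marks(word).lower()
    shape = []
    i = 0
    while i < len(letters):
        if letters[i:i + 2] in DIGRAPHS:
            shape.append("C")
            i += 2
        else:
            shape.append("V" if letters[i] in VOWELS else "C")
            i += 1
    return "".join(shape)


def violates_cluster_rule(word):
    """True if the word starts or ends with a cluster, or has 3+ consonants in a row."""
    shape = cv_shape(word)
    return shape.startswith("CC") or shape.endswith("CC") or "CCC" in shape


READING = """TLMCÂ CUAALL MAE  HUMOLL  MAHUMI CUATLI CHIMAEI
ITLMACÂI CUATLO MICHI CUATLMAE MAE TLMI CUATLMAECHI MAEA
MAE MACÂ MI MALL"""

words = READING.split()
flagged = [w for w in words if violates_cluster_rule(w)]
print(len(flagged), "of", len(words), "words break the cluster rule:", flagged)
```

Under these assumptions the check flags TLMCÂ, CUAALL, HUMOLL, TLMI and MALL. Passing it says nothing about whether a word is actually Nahuatl; it only catches the crudest violations.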

Could it be Nahuatl or inspired by Nahuatl?

The language of the proposed reading clearly is not Nahuatl. It has only the most superficial structural resemblance to Nahuatl, even if we were to admit the possibility of undescribed dialects. When we decide to read the most frequent signs of the script as their most frequent Nahuatl counterparts, the text naturally comes to resemble Nahuatl. But since it violates the phonological rules expected of Nahuatl, and is entirely void of any recognizable grammatical structure from Nahuatl (we can't even see differences between verbs and nouns, much less actual grammatical morphology), the proposal can safely be discarded.

A further argument against the plausibility of the background story of the proposal is historical: In mid-16th century Mexico anyone who would be able to produce a codex would also have been able to write it in proper Nahuatl - even Spanish friars (this was a requirement for being a priest in Mexico at this time). So, OK maybe they would want to invent a new script so that nobody could read what they had written about all those little naked ladies - but one would of course assume that they would then write intelligible Nahuatl. Otherwise why bother?

The Libellus de Medicinalibus Indorum Herbis, also known as the Codex Badianus, is an actual herbal manuscript which is known to have been written by Nahua scholar Martín de la Cruz in 1552 and later translated into Latin by another Nahua scholar, Juan Badiano. Nahua people in the 16th century were not only able to write intelligible Nahuatl, they were also able to translate it into Latin. And to boot, the illustrations are much better and allow easy identification of the different species - the Voynich plant drawings come across as crude by comparison.

Finally, as I read the example it bothered me that there is a certain repetitiveness in the deciphered text: the same letters seem to occur very frequently in combinations with specific other letters. This is not usually the case for natural languages - but it is very frequent in something like glossolalia of the baby-speech "lalala balala malalaba" type.

Some of the little naked ladies, these ones from folio 80r. 

Bibliography:

  • Janick, J., & Tucker, A. O. (2018). Unraveling the Voynich Codex. Springer.