without the filler:
Excavations have been taking place at Boğazköy-Hattusha for more than a century under the direction of the German Archaeological Institute (DAI).
Around 30,000 clay tablets have been found at the site to date, which have shed light on various aspects of life during the Hittite period, according to the Julius-Maximilians-Universität Würzburg. The tablets contain inscriptions in cuneiform—what is generally considered to be the oldest known writing system. Developed by the ancient Sumerians of Mesopotamia more than 5,000 years ago, cuneiform is a script that was used to write several languages of the ancient Near East.
Most of the inscriptions found at Boğazköy-Hattusha record the extinct Hittite language, which is the oldest attested member of the Indo-European family. Other languages, such as Luwian and Palaic, are also represented at the site.
However, excavations conducted this year, led by professor Dr. Andreas Schachner of the DAI’s Istanbul Department, surprisingly uncovered a recitation of a previously unknown extinct language. The language was hidden on a cuneiform tablet containing a ritual text written in Hittite. The Hittite ritual text refers to the lost tongue as the language of the land of Kalašma, an area that likely corresponds to where the towns of Bolu or Gerede in northern Turkey are located today.
“The new language was written in cuneiform,” Schachner told Newsweek. “It is the same writing system the Hittites used. The text is part of a longer text starting in Hittite. As it continues it says at one point: ‘Continue in the language of the Land [of] Kalašma.’”
“The Hittites were uniquely interested in recording rituals in foreign languages,” Daniel Schwemer, head of the Chair of Ancient Near Eastern Studies at Julius-Maximilians-Universität Würzburg, said in a press release.
The recently discovered language remains largely incomprehensible. However, Professor Elisabeth Rieken with the Philipps University of Marburg, Germany, a specialist in Anatolian languages, has confirmed that the Kalasmaic tongue belongs to the Indo-European family, according to Julius-Maximilians-Universität Würzburg.
EDIT: a more readable article with some other details here - https://www.uni-wuerzburg.de/en/news-and-events/news/detail/news/new-indo-european-language-discovered/
That’s amazing. Not only does it have value in itself, but input from Anatolian languages is especially useful for IE studies, as they branched off first.
From this link, with the same news:
despite its geographic proximity to the area where Palaic was spoken, the text seems to share more features with Luwian. How closely the language of Kalasma is related to the other Luwian dialects of Late Bronze Age Anatolia will be the subject of further investigation.
I’m especially interested in whether Kalasmaic shows the same palatalisation that Luwian shows but Hittite doesn’t. Or perhaps even a second pattern of palatalisation - because while traditionally those languages are classified as centum, they split off before the whole centum/satem division happened, so it’s possible that Kalasmaic already shows *ḱ ǵ ǵʰ palatalising but not *k g gʰ. If it does, that would provide evidence that further helps to date satemisation and “centumisation” in the rest of the family as a late PIE phenomenon.
The more I read into your comment, the fewer words I understand, but it was interesting nonetheless, thanks!
Sorry. I have a nasty habit of digging into posh vocab without explaining it. I’ll try to unpack it; feel free to ask questions, as I know this will still be a bit messy.
IE = Indo-European = a fucking huge family of languages. It includes Russian, Hindi, English, Italian, Greek, plus a lot more. The common ancestor of all those languages was nicknamed “Proto-Indo-European”, or PIE for short.
After people coined the term “Proto-Indo-European”, they realised that PIE was actually two languages:
- Early PIE - ancestor of Late PIE and Proto-Anatolian
- Late PIE - ancestor of the five languages I mentioned above
Proto-Anatolian is the ancestor of a bunch of dead languages from what’s today Turkey. For example: Hittite, Palaic, Luwian… and likely the language just discovered, nicknamed “Kalasmaic” (after the Kalasma principality from ~3500 years ago).
Palatalisation is a sound change. Usually it’s some sound like /t/ or /k/ being pronounced as /tʃ/ (as in church) or similar. It happens extremely often: Latin did it, Japanese did it, English did it, my dialect of Portuguese did it, etc.
*ḱ ǵ ǵʰ k g gʰ kʷ gʷ gʷʰ are nine sounds reconstructed for early and late PIE. Nobody knows exactly how they were pronounced (but everyone has a guess about it), that’s why they’re annotated with Zalgo-like diacritics. The asterisk means “we did not attest this directly”.
Centum and satem are groups of Indo-European languages, based on what happened with those nine sounds:
- satem - merged *k g gʰ with *kʷ gʷ gʷʰ; palatalised *ḱ ǵ ǵʰ
- centum - merged *k g gʰ with *ḱ ǵ ǵʰ; kept *kʷ gʷ gʷʰ alone
Note that in both cases *k g gʰ merged with something different.
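If it helps to see it laid out, here is a rough sketch of those two merger patterns as a small Python lookup. The series labels ("palatovelar", "plain", "labiovelar"), the outcome labels and the extra "all_distinct" pattern are names I made up for illustration; it is a toy model of the bookkeeping, not a claim about how any real language pronounced these sounds.

```python
# The nine reconstructed stops, grouped into three series.
# Series names are illustrative labels, not part of the explanation above.
SERIES = {
    "palatovelar": ["ḱ", "ǵ", "ǵʰ"],
    "plain": ["k", "g", "gʰ"],
    "labiovelar": ["kʷ", "gʷ", "gʷʰ"],
}

# For each pattern, the outcome label that every series ends up under.
# Two series sharing a label means they merged.
PATTERNS = {
    "satem": {"palatovelar": "palatalised", "plain": "velar", "labiovelar": "velar"},
    "centum": {"palatovelar": "velar", "plain": "velar", "labiovelar": "labiovelar"},
    # Hypothetical case where nothing merges, for comparison only.
    "all_distinct": {"palatovelar": "palatovelar", "plain": "plain", "labiovelar": "labiovelar"},
}


def outcomes(pattern):
    """Group the nine sounds by the outcome label they share under `pattern`."""
    result = {}
    for series, sounds in SERIES.items():
        result.setdefault(PATTERNS[pattern][series], []).extend(sounds)
    return result


def surviving_contrasts(pattern):
    """Count how many of the three original series are still kept apart."""
    return len(outcomes(pattern))


if __name__ == "__main__":
    for name in PATTERNS:
        print(name, surviving_contrasts(name), outcomes(name))
```

Running it shows two surviving contrasts for both satem and centum, with *k g gʰ landing in a different merged group each time, and three contrasts for the hypothetical no-merger case.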
Traditionally Hittite and other Anatolian languages are classified as centum. However, this division might not apply well to them, because maybe the sound changes above happened to Late PIE, after Hittite and co. branched off. Luwian happened to palatalise a lot of sounds, but that was a later sound change, unrelated to the one that the satem languages went through.
I was wondering if Kalasmaic underwent the same palatalisation as Luwian, and how much. It would be especially useful if it kept all three trios somehow distinct - because then we can say “yup, centum/satem does not apply to Anatolian languages”, and date the change to after Late PIE and Proto-Anatolian diverged.
Thanks for the explanation, it’s really a super interesting topic
I would be interested to hear about how it’s determined what language family it belonged to. Personal or place names? A smattering of cognates? Which ones?
This is just a guess, but since they were able to place it within the Anatolian branch but not solve where in it, they likely found only a handful of cognates with Hittite and/or Luwian. Probably showing either rhotacism or palatalisation, since they’re tentatively linking it with Luwian.
Personal and place names alone are not reliable for that, as they’re often borrowed, unless you have a lot of them.
Here you go! Unfortunately Wikipedia seems to be a little sparse on details relevant to this situation but there’s plenty of info there to provide some good jumping-off points for further research.
“A fifth… element.”
This is the best summary I could come up with:
According to the Julius-Maximilians-Universität Würzburg in Germany, a public research university, the lost language belongs to the Indo-European family, which includes hundreds of related tongues that are all thought to share a single prehistoric ancestor.
The latest Indo-European language to be identified was discovered thanks to a ritual text inscribed on a tablet at the UNESCO World Heritage Site of Boğazköy-Hattusha in Turkey’s northern Çorum province.
The Hittite ritual text refers to the lost tongue as the language of the land of Kalašma, an area that likely corresponds to where the towns of Bolu or Gerede in northern Turkey are located today.
“The Hittites were uniquely interested in recording rituals in foreign languages,” Daniel Schwemer, head of the Chair of Ancient Near Eastern Studies at Julius-Maximilians-Universität Würzburg, said in a press release.
However, Professor Elisabeth Rieken with the Philipps University of Marburg, Germany, a specialist in Anatolian languages, has confirmed that the Kalasmaic tongue belongs to the Indo-European family, according to Julius-Maximilians-Universität Würzburg.
The original article contains 565 words, the summary contains 206 words. Saved 64%. I’m a bot and I’m open source!
This is exactly what a large language model (LLM) could help translate. It would be amazing to train an LLM on this and all other languages and see if it can learn to translate it.
Putting aside all the other issues… How do you expect to train a large language model on what is probably one clay tablet of text?
Dude the blockchain will literally fix it all. Sprinkle some federated protocol on there and it’ll cure cancer by tomorrow.
My bad, I thought it said 30,000 tablets were found in the unknown language.
Ouch, ok.
Also, as far as I know, LLMs require parallel corpora (i.e. the same text in different languages) to learn to translate. Otherwise I see no way they could establish connections across the different languages.
Interesting. I wonder if we can eventually figure out how to train against an unknown language and map relationships to a known language without parallel corpora.
Wouldn’t there be similar patterns that emerge that could be correlated to approximate the matching symbols using alignment techniques?
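As a very rough illustration of what "correlating patterns" could mean, here is a toy Python sketch that pairs up symbols from two texts purely by frequency rank, with no parallel corpus. Everything in it is an assumption for the sake of the example: the function names, the made-up substitution, and the sample strings. It only works because the "unknown" text is literally the known text with the symbols swapped, which is nothing like what a real decipherment effort would face, especially with a single tablet.

```python
from collections import Counter


def rank_symbols(text):
    """Return non-space symbols ordered from most to least frequent.

    Ties are broken by the symbol itself so the result is deterministic.
    """
    counts = Counter(ch for ch in text if not ch.isspace())
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [symbol for symbol, _ in ordered]


def align_by_frequency(unknown_text, known_text):
    """Pair each unknown symbol with the known symbol of the same frequency rank."""
    return dict(zip(rank_symbols(unknown_text), rank_symbols(known_text)))


# Hypothetical toy data: `known` stands in for a well-understood language,
# `unknown` is the same text under an invented symbol substitution.
known = "aaaaa bbbb ccc dd e"
unknown = known.translate(str.maketrans("abcde", "XQZWK"))

mapping = align_by_frequency(unknown, known)
decoded = "".join(mapping.get(ch, ch) for ch in unknown)

if __name__ == "__main__":
    print(mapping)
    print(decoded)
```

With so little text the frequency counts would be far too noisy to trust, which is the same objection raised above about training on one tablet.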