Archaeologists discover previously unknown language from ancient tablet

antonim@lemmy.dbzer0.com · edit-2 1 year ago

Lvxferre@lemmy.ml · edit-2 1 year ago

That’s amazing. Not only does it have value in itself, but input from Anatolian languages is especially useful for IE studies, as they branched off first.

From this link, with the same news:

despite its geographic proximity to the area where Palaic was spoken, the text seems to share more features with Luwian. How closely the language of Kalasma is related to the other Luwian dialects of Late Bronze Age Anatolia will be the subject of further investigation.

I’m especially interested in whether Kalasmaic shows the same palatalisation that Luwian shows but Hittite doesn’t. Or perhaps even a second pattern of palatalisation: while those languages are traditionally classified as centum, they split off before the whole centum/satem division happened, so it’s possible that Kalasmaic already shows *ḱ ǵ ǵʰ palatalising but not *k g gʰ. If it does, that would be further evidence for dating satemisation and “centumisation” in the rest of the family as a Late PIE phenomenon.

nyoooom@lemmy.world · 1 year ago

The more I read into your comment, the fewer words I understand, but it was interesting nonetheless, thanks!

Lvxferre@lemmy.ml · 1 year ago

Sorry. I have a nasty habit of digging into posh vocab without explaining it. I’ll try to unpack it; feel free to ask questions, as I know this will still be a bit messy.

IE = Indo-European = a fucking huge family of languages. It includes Russian, Hindi, English, Italian, Greek, plus a lot more. The common ancestor of all those languages was nicknamed “Proto-Indo-European”, or PIE for short.

After people coined the term “Proto-Indo-European”, they realised that PIE was actually two languages:

Early PIE - ancestor of Late PIE and Proto-Anatolian
Late PIE - ancestor of the five languages I mentioned above

Proto-Anatolian is the ancestor of a bunch of dead languages from what’s today Turkey. For example: Hittite, Palaic, Luwian… and likely the language just discovered, nicknamed “Kalasmaic” (after the Kalasma principality from ~3500 years ago).

Palatalisation is a sound change. Usually it’s some sound like /t/ or /k/ being pronounced as /tʃ/ (as in church) or similar. It happens extremely often: Latin did it, Japanese did it, English did it, my dialect of Portuguese did it, etc.

*ḱ ǵ ǵʰ k g gʰ kʷ gʷ gʷʰ are nine sounds reconstructed for Early and Late PIE. Nobody knows exactly how they were pronounced (but everyone has a guess about it), which is why they’re annotated with Zalgo-like diacritics. The asterisk means “we did not attest this directly”.

Centum and satem are groups of Indo-European languages, based on what happened with those nine sounds:

satem - merged *k g gʰ with *kʷ gʷ gʷʰ; palatalised *ḱ ǵ ǵʰ
centum - merged *k g gʰ with *ḱ ǵ ǵʰ; kept *kʷ gʷ gʷʰ alone

Note that in both cases *k g gʰ merged with something different.
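
If it helps to see the two merger patterns laid out, here is a minimal Python sketch. The names (`SERIES`, `MERGERS`, `outcome`, `distinct_series`) and the outcome labels are made up for illustration, and the `no_merger` row is a hypothetical scenario showing what "kept all three trios apart" would look like, not a claim about any attested language.

```python
import unicodedata


def _nfc(text):
    """Normalise to NFC so precomposed and combining spellings compare equal."""
    return unicodedata.normalize("NFC", text)


# The nine reconstructed sounds, grouped into three trios.
SERIES = {
    "palatovelar": [_nfc(s) for s in ("ḱ", "ǵ", "ǵʰ")],
    "plain": [_nfc(s) for s in ("k", "g", "gʰ")],
    "labiovelar": [_nfc(s) for s in ("kʷ", "gʷ", "gʷʰ")],
}

# Simplified outcome of each trio per group (labels are illustrative only).
MERGERS = {
    # satem: plain merges with labiovelar; palatovelars get palatalised
    "satem": {"palatovelar": "palatalised", "plain": "velar", "labiovelar": "velar"},
    # centum: plain merges with palatovelar; labiovelars stay separate
    "centum": {"palatovelar": "velar", "plain": "velar", "labiovelar": "labiovelar"},
    # hypothetical: nothing merges, all three trios stay distinct
    "no_merger": {"palatovelar": "palatovelar", "plain": "plain", "labiovelar": "labiovelar"},
}


def outcome(sound, group):
    """Return the outcome label of one reconstructed sound in a given group."""
    sound = _nfc(sound)
    for series, sounds in SERIES.items():
        if sound in sounds:
            return MERGERS[group][series]
    raise ValueError(f"not one of the nine sounds: {sound!r}")


def distinct_series(group):
    """Count how many separate trios survive in a given group."""
    return len(set(MERGERS[group].values()))


if __name__ == "__main__":
    for group in MERGERS:
        print(group, "keeps", distinct_series(group), "distinct trios")
```

Both `satem` and `centum` end up with two trios because *k g gʰ got absorbed in each, just by a different neighbour; only the hypothetical `no_merger` row keeps three.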

Traditionally Hittite and other Anatolian languages are classified as centum. However, this division might not apply well to them, because maybe the sound changes above happened to Late PIE, after Hittite and co. branched off. Luwian happened to palatalise a lot of sounds, but that was a later sound change, unrelated to the one that the satem languages went through.

I was wondering if Kalasmaic underwent the same palatalisation as Luwian, and how much. It would be especially useful if it kept all three trios somehow distinct - because then we can say “yup, centum/satem does not apply to Anatolian languages”, and date the change to after Late PIE and Proto-Anatolian diverged.

nyoooom@lemmy.world · 1 year ago

Thanks for the explanation, it’s really a super interesting topic

fred@lemmy.ml · 1 year ago

I would be interested to hear about how it’s determined what language family it belonged to. Personal or place names? A smattering of cognates? Which ones?

Lvxferre@lemmy.ml · 1 year ago

This is just a guess, but since they were able to place it within the Anatolian branch but not work out where in it, they likely found only a handful of cognates with Hittite and/or Luwian. Probably showing either rhotacism or palatalisation, since they’re tentatively linking it with Luwian.

Personal and place names alone are not reliable for that, as they’re often borrowed, unless you have a lot of them.

Obi@sopuli.xyz · 1 year ago

sleep_deprived@lemmy.world · 1 year ago

Here you go! Unfortunately Wikipedia seems to be a little sparse on details relevant to this situation but there’s plenty of info there to provide some good jumping-off points for further research.

aeronmelon@lemm.ee · 1 year ago

“A fifth… element.”

AutoTL;DR@lemmings.world · 1 year ago

This is the best summary I could come up with:

According to the Julius-Maximilians-Universität Würzburg in Germany, a public research university, the lost language belongs to the Indo-European family, which includes hundreds of related tongues that are all thought to share a single prehistoric ancestor.

The latest Indo-European language to be identified was discovered thanks to a ritual text inscribed on a tablet at the UNESCO World Heritage Site of Boğazköy-Hattusha in Turkey’s northern Çorum province.

The Hittite ritual text refers to the lost tongue as the language of the land of Kalašma, an area that likely corresponds to where the towns of Bolu or Gerede in northern Turkey are located today.

“The Hittites were uniquely interested in recording rituals in foreign languages,” Daniel Schwemer, head of the Chair of Ancient Near Eastern Studies at Julius-Maximilians-Universität Würzburg, said in a press release.

However, Professor Elisabeth Rieken with the Philipps University of Marburg, Germany, a specialist in Anatolian languages, has confirmed that the Kalasmaic tongue belongs to the Indo-European family, according to Julius-Maximilians-Universität Würzburg.

In a study published in the journal Transactions of the Philological Society, a team of scientists describe how they partially deciphered the “unknown” Kushan script, an ancient writing system that was once in use in parts of Central Asia between around 200 B.C.

The original article contains 565 words, the summary contains 206 words. Saved 64%. I’m a bot and I’m open source!

notfromhere@lemmy.one · 1 year ago

This is exactly what a large language model (LLM) could help translate. It would be amazing to train an LLM on this and all other languages and see if it can learn to translate it.

antonim@lemmy.dbzer0.com · 1 year ago

Putting aside all the other issues… How do you expect to train a large language model on what is probably one clay tablet of text?

Tvkan@feddit.de · edit-2 1 year ago

Dude the blockchain will literally fix it all. Sprinkle some federated protocol on there and it’ll cure cancer by tomorrow.

notfromhere@lemmy.one · 1 year ago

My bad, I thought it said 30,000 tablets were found in the unknown language.

antonim@lemmy.dbzer0.com · edit-2 1 year ago

Ouch, ok.

Also, as far as I know, LLMs require parallel corpora (i.e. the same text in different languages) to learn to translate. Otherwise I see no way they could establish connections across the different languages.

notfromhere@lemmy.one · 1 year ago

Interesting. I wonder if we can eventually figure out how to train against an unknown language and map relationships to a known language without parallel corpora.

notfromhere@lemmy.one · 1 year ago

Wouldn’t there be similar patterns that emerge that could be correlated to approximate the matching symbols using alignment techniques?

Blue and Orange@lemm.ee · 1 year ago

Ud Reaaa