r/IndoEuropean • u/lpetrich • 19d ago
Linguistics How much can one deduce about Proto-Indo-European if one only had the present-day Indo-European languages?
Many language families are known only from members documented over the last few centuries, so it is interesting to speculate about how much we could learn about Proto-Indo-European if we had only its present-day members. As part of this exercise, let us suppose that we already know how to do historical-linguistics research.
Some families would be easy to recognize: Goidelic, Brythonic, Romance, Germanic, Baltic, Slavic, Albanian, Greek, Armenian, Iranian, and Indic. Some would be more difficult: Celtic, Balto-Slavic, and Indo-Iranian. Indo-European itself would be even more difficult, but I think that it could still be recognized.
One would try to avoid the complication of borrowed words by using lists of highly conserved and seldom-borrowed words, like the Swadesh, Dolgopolsky, and Leipzig-Jakarta lists. These lists contain pronouns, "name", small numerals, words for human beings and close kinship terms, body parts, and common animals, plants, natural phenomena, qualities, and actions.
With these lists, one can find sound correspondences like Grimm's law.
Grammar would be more difficult, but one can make a little progress.
Although articles (a, an, the) are common, they have a lot of variety, and one will conclude that they are later inventions and that if PIE had any articles, they were lost.
Noun plurals have a lot of variety, as do noun cases, which range from no cases at all to seven cases in Baltic and some Slavic languages. Some languages have more cases in pronouns than in nouns, and some of these are closely related to languages with more cases. Did they partially lose cases?
There are some correspondences in the noun cases:
- Dative plural: Icelandic -um, German -n, Baltic, most Slavic -m-
- Nominative singular -s absent from accusative singular: Greek, Baltic (Lithuanian, Latvian); nominative but not accusative singular -r: Icelandic
Turning to verbs, several of the languages have similar personal endings, subject-agreement ones, though several others have much-reduced endings or no endings. For the present tense, I come up with these simplified forms for the more distinct endings:
- Icelandic: -, -r, -r; -um, -idh, -a
- Spanish: -o, -s, -; -mos, -is, -n
- Irish: -im, -ir, -ann; -imid, -ann, -id
- Lithuanian: -u, -i, -a; -me, -te, -a
- Russian: -yu, -sh, -t; -m, -te, -t
- Greek: -o, -is, -i; -ume, -ete, -un
- Albanian: -(j), -(n), -(n); -m, -n, -n
- Persian: -am, -i, -ad; -im, -id, -and
- Bengali: -i, (-ish, -o), -e (singular, plural)
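To make the comparison above a little more concrete, here is a rough Python sketch that tabulates the simplified endings from the list and checks which languages share a given letter in a given person/number slot. The table layout, the `languages_with` helper, and the use of `""` for a zero ending are my own assumptions for illustration. Matching on a single letter is far cruder than the comparative method, and Bengali is left out because its listed endings do not distinguish singular from plural.

```python
# Simplified present-tense endings copied from the list above.
# Slot order: 1sg, 2sg, 3sg, 1pl, 2pl, 3pl. "" marks a zero ending.
SLOTS = ["1sg", "2sg", "3sg", "1pl", "2pl", "3pl"]

ENDINGS = {
    "Icelandic": ["", "r", "r", "um", "idh", "a"],
    "Spanish": ["o", "s", "", "mos", "is", "n"],
    "Irish": ["im", "ir", "ann", "imid", "ann", "id"],
    "Lithuanian": ["u", "i", "a", "me", "te", "a"],
    "Russian": ["yu", "sh", "t", "m", "te", "t"],
    "Greek": ["o", "is", "i", "ume", "ete", "un"],
    "Albanian": ["(j)", "(n)", "(n)", "m", "n", "n"],
    "Persian": ["am", "i", "ad", "im", "id", "and"],
}


def languages_with(slot, letter):
    """Return the sorted languages whose ending in `slot` contains `letter`.

    This is a plain substring test on the romanized endings, not a
    statement about regular sound correspondences.
    """
    idx = SLOTS.index(slot)
    return sorted(
        lang for lang, forms in ENDINGS.items() if letter in forms[idx]
    )


if __name__ == "__main__":
    print("1pl with m:", languages_with("1pl", "m"))
    print("3pl with n:", languages_with("3pl", "n"))
```

With the data as entered, all eight languages show an m in the first person plural, while n in the third person plural turns up only in Albanian, Greek, Persian, and Spanish.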
Greek also has mediopassive endings: -ome, -ese, -ete; -omaste, -este, -onde -- the only language to have such endings.
In general, however, verb tense, aspect, mood, and voice constructions are often subfamily-specific and hard to relate across the subfamilies.
There is an exception: the suppletion in the verb "to be":
- English: (inf) be, (3s) is, (past 3s) was
- Spanish: (past 3s) fue, (3s) es
- Lithuanian: (inf) būti, (3s) yra, (2s) esi
- Serbo-Croatian: (inf) biti, (3s) jest
- Persian: (inf, vb noun) budan, (3s) ast
- Irish: is
- Welsh: (vb noun) bod
- Albanian: (3s) është
- Greek: (3s) ine, (2s) ise
- Armenian: (3s) ê, (2s) es
The Romance f- is related to others' b- by a sound correspondence: Italian fratello ~ French frère ~ English brother ~ Welsh brawd (pl. brodyr) ~ Lithuanian brolis ~ Russian brat ~ Czech bratr ~ Persian barâdar ~ Hindi bhâi
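As a small illustration of how one would start tabulating such a correspondence, the Python sketch below groups the "brother" words quoted above by their initial consonant. The `initial` and `group_by_initial` helpers are hypothetical names of mine, and treating Hindi "bh" as a single aspirated segment is my assumption. A single cognate set proves nothing by itself; a real correspondence needs the same pattern to recur across many word sets.

```python
from collections import defaultdict

# "Brother" words exactly as quoted in the post above.
BROTHER = {
    "Italian": "fratello",
    "French": "frère",
    "English": "brother",
    "Welsh": "brawd",
    "Lithuanian": "brolis",
    "Russian": "brat",
    "Czech": "bratr",
    "Persian": "barâdar",
    "Hindi": "bhâi",
}


def initial(word):
    """Return the word-initial consonant, counting "bh" as one segment."""
    return "bh" if word.startswith("bh") else word[0]


def group_by_initial(words):
    """Map each initial consonant to the languages that show it."""
    groups = defaultdict(list)
    for language, word in words.items():
        groups[initial(word)].append(language)
    return dict(groups)


if __name__ == "__main__":
    for consonant, languages in group_by_initial(BROTHER).items():
        print(f"{consonant}-: {', '.join(languages)}")
```

On this one word set, Italian and French fall under f-, Hindi under bh-, and the remaining six languages under b-.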
Looking about halfway back to the emergence of the Latin and Greek literary traditions (~200 BCE and ~800 BCE), that is, to around 900 CE, one finds that Old English, Old Saxon, Old High German, and Old Norse have grammar much like Icelandic grammar. Old Church Slavonic is much like reconstructed Proto-Slavic, noun cases and all.
One finds much less borrowing, and one finds a little more support for PIE grammatical features. In particular, Old Irish has dative plural -b, much like Germanic, Baltic, and Slavic -m, and Old French has a curious declension: nom sing -s, acc sing -, nom pl -, acc pl -s, something like Greek, Baltic, Icelandic, and Old Norse -s and -r.
3
u/lpetrich 18d ago edited 18d ago
Indo-European is unusual among language families of its time depth in having attested members of great antiquity (see List of languages by first written account - Wikipedia). Among families with present-day speakers, only Semitic beats it in oldest attested member, with Akkadian (2600 BCE) beating Hittite (1700 BCE).
However, IE and Sino-Tibetan beat Semitic in oldest attested members with descendants that are still spoken:
- IE: Greek (1450 BCE), Sanskrit (1500 - 1000 BCE: Indic)
- Sino-Tibetan: Old Chinese (1250 BCE: Chinese)
- Semitic: Hebrew (1000 BCE), Aramaic (1000 BCE)
Bayesian phylogenetic analysis of Semitic languages identifies an Early Bronze Age origin of Semitic in the Near East - PMC has date 3750 BCE for Proto-Semitic.
- East Semitic: Akkadian: 2600 BCE X
- Central Semitic: (calc) 2450 BCE
  - Arabic: 750 BCE
  - Northwest Semitic: (calc) 2100 BCE
- South Semitic: (calc) 2650 BCE
Northwest Semitic: (calc) 2100 BCE
- Hebrew-Aramaic: (calc) 1500 BCE
  - Hebrew: 1000 BCE
  - Aramaic: 1000 BCE
South Semitic:
- Southeastern Semitic (Modern South Arabian): (calc) 100 BCE
- Southwestern Semitic
  - Old South Arabian: Sabaean: 850 BCE X
  - Ethiosemitic: (calc) 800 BCE
    - Ge'ez: 350 CE X
    - Amharic: 1200 CE (?)
2
u/Eannabtum 18d ago
> Among families with present-day speakers, only Semitic beats it in oldest attested member, with Akkadian (2600 BCE) beating Hittite (1700 BCE).
Actually, Semitic is part of Afro-Asiatic, so the family is attested from ca. 3200 BC (Egyptian).
2
u/lpetrich 18d ago
I was a bit reluctant to mention Afroasiatic, because there is not nearly as much that can be firmly reconstructed as there is for (say) Indo-European or Austronesian or Bantu. One can reconstruct personal pronouns and pronoun-related inflections, a bit of other grammar like feminine *-t, some basic vocabulary like *dam- "blood", but not much more.
Estimates of the age of AA range from 20,000 to 10,000 years (18,000 BCE to 8000 BCE), roughly from the end of the Last Glacial Maximum to the beginning of the Holocene.
There are two major attempts to reconstruct AA, and they do not agree on very much.
- Reconstructing Proto-Afroasiatic (Proto-Afrasian) by Christopher Ehret (University of California Press)
- Hamito-Semitic Etymological Dictionary: Materials for a Reconstruction by Vladimir E. Orel and Olga V. Stolbova (Brill)
- On calculating the reliability of the comparative method at long and medium distances: Afroasiatic Comparative Lexica as a test case by Robert Ratcliffe
1
u/Reasonable_Regular1 18d ago
> Old South Arabian: Sabaean: 850 BCE X
The paper you linked doesn't mention Sabaic, and that's because it's a Central Semitic language. (Likewise, its claim that "no consensus exists for placing Arabic in either the Central or South Semitic group" is a good thirty years out of date.)
2
u/lpetrich 18d ago
List of languages by first written account - Wikipedia
Families first attested after the Indo-European Big Three, sometimes long after, with an age comparable to IE itself, yet generally accepted:
- Austronesian origin time 3500 BCE: Malayo-Polynesian o.t. 2250 BCE: Cham 350 CE, Old Malay 683 CE, Old Javanese 804 CE
- Austroasiatic o.t. 2000 BCE: Old Khmer 611 CE, Vietnamese 1440 CE
- Uralic o.t. 2000 BCE: Hungarian, Finnic 1200 CE
- Kra-Dai o.t. 2000 BCE: Thai 1292 CE
- Uto-Aztecan o.t. 2000 BCE: Classical Nahuatl 1550 CE
- Niger-Congo o.t. (?): Benue-Congo o.t. 4500 BCE: Bantu o.t. 3000 BCE: Kikongo 1557 CE, Swahili 1711 CE
- Pama-Nyungan o.t. 3000 BCE: Guugu Yimidhirr 1886 CE
Sources:
- The Austronesian Homeland and Dispersal | Annual Reviews
- Early Austronesians: Into and Out Of Taiwan - PMC
- Austroasiatic etymologies of words related to household structures
- Drastic demographic events triggered the Uralic spread | John Benjamins
- Uralic archaeolinguistics | The Oxford Handbook of Archaeology and Language | Oxford Academic
- Phylogenetic evidence reveals early Kra-Dai divergence and dispersal in the late Holocene | Nature Communications
- Lack of linguistic support for Proto-Uto-Aztecan at 8900 BP | PNAS
- Project MUSE - A recent northern origin for the Uto-Aztecan family
- Phylogeographic analysis of the Bantu language expansion supports a rainforest route | PNAS
- Automated Dating of the World’s Language Families Based on Lexical Similarity | Current Anthropology: Vol 52, No 6 -- Malayo-Polynesian, Benue-Congo, Pama-Nyungan dates from archeology
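The gap between each family's origin time and its first attested member in the list above can be tallied with a short Python sketch. The tuple layout and the `gap_years` helper are my own, years are entered as negative for BCE and positive for CE, and for Niger-Congo I have used the Bantu origin time because both attested members listed are Bantu. The missing year 0 is ignored, since it is far below the precision of the estimates.

```python
# (family, origin time, earliest attested member, year of first attestation)
# Negative years are BCE, positive years are CE, as given in the list above.
FAMILIES = [
    ("Austronesian", -3500, "Cham", 350),
    ("Austroasiatic", -2000, "Old Khmer", 611),
    ("Uralic", -2000, "Hungarian, Finnic", 1200),
    ("Kra-Dai", -2000, "Thai", 1292),
    ("Uto-Aztecan", -2000, "Classical Nahuatl", 1550),
    ("Bantu", -3000, "Kikongo", 1557),
    ("Pama-Nyungan", -3000, "Guugu Yimidhirr", 1886),
]


def gap_years(origin, attested):
    """Years between origin time and first attestation (year 0 ignored)."""
    return attested - origin


# Map each family to its unattested stretch in years.
GAPS = {family: gap_years(origin, year) for family, origin, _, year in FAMILIES}

if __name__ == "__main__":
    # Print the families from the longest unattested stretch to the shortest.
    for family, gap in sorted(GAPS.items(), key=lambda item: -item[1]):
        print(f"{family}: about {gap} years before first attestation")
```

On these figures every family has well over two millennia of unattested history, with Pama-Nyungan and Bantu at the long end and Austroasiatic at the short end.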
1
u/lpetrich 18d ago edited 18d ago
List of languages by first written account - Wikipedia
Most language families that have a time depth comparable to Indo-European (about 4000 - 2000 BCE) and surviving members have those members first attested later than the traditional Big Three of Indo-European: Latin, Greek, and Sanskrit.
- Latin: earliest writing 550 BCE, beginning of surviving literary tradition 200 BCE
- Greek: earliest writing 1450 BCE, beginning of surviving literary tradition 800 BCE
- Sanskrit: earliest surviving compositions 1500 - 1000 BCE: the Vedas
I'd mentioned Semitic as having surviving members with comparable age: Hebrew, Aramaic, Arabic. Others:
- Sino-Tibetan: origin time 6000 - 4000 BCE, oldest attested members Old Chinese 1200 BCE, Old Tibetan 765 CE, Burmese 1058 CE
- Dravidian: origin time (?), oldest attested members Old Tamil 200 BCE, Old Kannada 450 CE, Telugu 575 CE, Old Malayalam 850 CE
Sources for Sino-Tibetan origin:
- Archaeological evidence for initial migration of Neolithic Proto Sino-Tibetan speakers from Yellow River valley to Tibetan Plateau | PNAS
- Dated phylogeny suggests early Neolithic origin of Sino-Tibetan languages | Scientific Reports
- Phylogenetic evidence for Sino-Tibetan origin in northern China in the Late Neolithic | Nature
0
u/Waste_Cartographer49 19d ago edited 19d ago
Nvm I misread the post
3
u/lpetrich 19d ago
Basque is one language, so it isn't comparable to Indo-European.
Furthermore, the only generally-accepted relatives of Basque are Aquitanian and possibly Iberian.
Numerous other ones have been proposed, with none of them gaining general acceptance.
1
u/Willing-One8981 19d ago
I'm not a linguist either, but I'd assume a proto language could not be reconstructed from a language isolate.
3
u/lpetrich 18d ago
For a multi-dialect language like Basque, one may reconstruct a protolanguage from its dialects. But that isn't anywhere near comparable to Indo-European.
7
u/Wagagastiz 19d ago
When laryngeals were first proposed, before they were confirmed by Hittite, was the proposal relying on Sanskrit and Ancient Greek, or were they deducible from modern languages?