r/IndoEuropean 19d ago

Linguistics How much can one deduce about Proto-Indo-European if one only had the present-day Indo-European languages?

Many language families are known only from members documented over the last few centuries, so it is interesting to ask how much we could learn about Proto-Indo-European if we had only its present-day members. As part of this exercise, let us suppose that we already know how to do historical-linguistics research.

Some families would be easy to recognize: Goidelic, Brythonic, Romance, Germanic, Baltic, Slavic, Albanian, Greek, Armenian, Iranian, and Indic. Some would be more difficult: Celtic, Balto-Slavic, and Indo-Iranian. Indo-European itself would be even more difficult, but I think that it could still be recognized.

One would try to avoid the complication of borrowed words by using lists of highly conserved and seldom-borrowed words, such as the Swadesh, Dolgopolsky, and Leipzig-Jakarta lists. These lists contain pronouns, "name", small numerals, terms for human beings and close kin, body parts, and common animals, plants, natural phenomena, qualities, and actions.

With these lists, one can find sound correspondences like Grimm's law.

Grammar would be more difficult, but one can make a little progress.

Although articles (a, an, the) are common, they have a lot of variety, and one will conclude that they are later inventions and that if PIE had any articles, they were lost.

Noun plurals have a lot of variety, as do noun cases, ranging from no cases to seven cases in Baltic and some Slavic languages. Some languages have more cases in pronouns than in nouns, and some of these are closely related to languages with more cases. Did they partially lose cases?

There are some correspondences in the noun cases:

  • Dative plural: Icelandic -um, German -n, Baltic, most Slavic -m-
  • Nominative singular -s, absent from the accusative singular: Greek, Baltic (Lithuanian, Latvian); nominative singular -r, absent from the accusative singular: Icelandic

Turning to verbs, several of the languages have similar personal (subject-agreement) endings, though several others have much-reduced endings or none at all. For the present tense, I come up with these simplified forms for the more distinct endings:

  • Icelandic: -, -r, -r; -um, -idh, -a
  • Spanish: -o, -s, -; -mos, -is, -n
  • Irish: -im, -ir, -ann; -imid, -ann, -id
  • Lithuanian: -u, -i, -a; -me, -te, -a
  • Russian: -yu, -sh, -t; -m, -te, -t
  • Greek: -o, -is, -i; -ume, -ete, -un
  • Albanian: -(j), -(n), -(n); -m, -n, -n
  • Persian: -am, -i, -ad; -im, -id, -and
  • Bengali: -i, (-ish, -o), -e (singular, plural)
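One way to make the similarity concrete is to tabulate the endings and ask which languages share a given consonant in a given slot. The following is only a rough sketch: it matches Latin-letter transcriptions as written above rather than real phonemes, the names `ENDINGS`, `SLOTS`, and `languages_with` are made up for illustration, and Bengali is left out because its list does not separate singular from plural.

```python
# Present-tense endings copied from the list above, in the order
# (1sg, 2sg, 3sg, 1pl, 2pl, 3pl). "-" stands for a zero ending.
ENDINGS = {
    "Icelandic": ("-", "-r", "-r", "-um", "-idh", "-a"),
    "Spanish": ("-o", "-s", "-", "-mos", "-is", "-n"),
    "Irish": ("-im", "-ir", "-ann", "-imid", "-ann", "-id"),
    "Lithuanian": ("-u", "-i", "-a", "-me", "-te", "-a"),
    "Russian": ("-yu", "-sh", "-t", "-m", "-te", "-t"),
    "Greek": ("-o", "-is", "-i", "-ume", "-ete", "-un"),
    "Albanian": ("-(j)", "-(n)", "-(n)", "-m", "-n", "-n"),
    "Persian": ("-am", "-i", "-ad", "-im", "-id", "-and"),
}
SLOTS = ("1sg", "2sg", "3sg", "1pl", "2pl", "3pl")


def languages_with(letter, slot):
    """Return the languages whose ending in `slot` contains `letter`.

    This is a plain substring test on the transcription, not a
    phonological comparison.
    """
    index = SLOTS.index(slot)
    return sorted(
        lang for lang, endings in ENDINGS.items() if letter in endings[index]
    )


# Every language in the table has an m in its first-person plural ending.
print(languages_with("m", "1pl"))
# Only three of them keep a t in the second-person plural.
print(languages_with("t", "2pl"))
```

With this table, the first-person plural -m- shows up in all eight languages, while the second-person plural -t- survives only in Greek, Lithuanian, and Russian, which matches the impression that some slots are better preserved than others.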

Greek also has mediopassive endings: -ome, -ese, -ete; -omaste, -este, -onde -- the only language to have such endings.

In general, however, verb tense, aspect, mood, and voice constructions are often subfamily-specific and hard to relate across the subfamilies.

There is an exception: the suppletion in the verb "to be":

  • English: (inf) be, (3s) is, (past 3s) was
  • Spanish: (past 3s) fue, (3s) es
  • Lithuanian: (inf) bûti, (3s) yra, (2s) esi
  • Serbo-Croatian: (inf) biti, (3s) jest
  • Persian: (inf, vb noun) budan, (3s) ast
  • Irish: is
  • Welsh: (vb noun) bod
  • Albanian: (3s) është
  • Greek: (3s) ine, (2s) ise
  • Armenian: (3s) ê, (2s) es

The Romance f- is related to others' b- by a sound correspondence: Italian fratello ~ French frère ~ English brother ~ Welsh brawd (pl. brodyr) ~ Lithuanian brolis ~ Russian brat ~ Czech bratr ~ Persian barâdar ~ Hindi bhâi
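The f- ~ b- correspondence can be checked mechanically by grouping the "brother" words by their first letter. This is a minimal sketch under loud assumptions: it looks only at the first written letter (so Hindi bh- is counted as b), it uses a single cognate set where a real comparison would need many, and the names `BROTHER`, `ROMANCE`, and `initial_by_group` are invented for the example.

```python
from collections import Counter

# The "brother" words quoted above, keyed by language.
BROTHER = {
    "Italian": "fratello",
    "French": "frère",
    "English": "brother",
    "Welsh": "brawd",
    "Lithuanian": "brolis",
    "Russian": "brat",
    "Czech": "bratr",
    "Persian": "barâdar",
    "Hindi": "bhâi",
}
ROMANCE = {"Italian", "French"}


def initial_by_group(words, group):
    """Count first letters inside and outside a set of languages."""
    inside = Counter(w[0] for lang, w in words.items() if lang in group)
    outside = Counter(w[0] for lang, w in words.items() if lang not in group)
    return inside, outside


romance, others = initial_by_group(BROTHER, ROMANCE)
print(dict(romance))  # {'f': 2}
print(dict(others))   # {'b': 7}
```

A split this clean across one word proves little by itself, but the same split recurring across many basic words is what one would count as a regular correspondence.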

Looking about halfway back to the emergence of the Latin and Greek literary traditions (~200 BCE and ~800 BCE), that is, to around 900 CE, one finds that Old English, Old Saxon, Old High German, and Old Norse have grammar much like that of Icelandic. Old Church Slavonic is much like reconstructed Proto-Slavic, noun cases and all.

One finds much less borrowing, and one finds a little more support for PIE grammatical features. In particular, Old Irish has dative plural -b, much like Germanic, Baltic, and Slavic -m, and Old French has a curious declension: nom sing -s, acc sing -, nom pl -, acc pl -s, something like Greek, Baltic, Icelandic, and Old Norse -s and -r.

u/Wagagastiz 19d ago

When laryngeals were first proposed, before they were confirmed by Hittite, did the proposal rely on Sanskrit and Ancient Greek, or were they deducible from modern languages?

u/lpetrich 19d ago

Mainly Ancient Greek and Sanskrit, from their having long vowels that alternate with short ones: Laryngeal theory - Wikipedia

Modern Greek has no vowel-length distinction, however, and later Indic languages lost Sanskrit ablaut, as far as I know.

Indo-European ablaut alternation is usually e, o, -, and that is common in Greek, but Greek also has (long vowel) - (long vowel) - (short vowel) where that alternation ought to be, and Ferdinand de Saussure proposed "coefficients sonantiques" ("sonant coefficients") that were a sort of consonant that turns into vowels and lengthens vowels. Those CS's eventually became known as laryngeals from comparison to some consonants in some Semitic languages that do things to nearby vowels.

So I think that laryngeals would be difficult to deduce from the present-day IE languages.

u/Willing-One8981 19d ago

Probably not - Sanskrit was key to Saussure's early laryngeals proposal. Without Sanskrit the sound changes would have probably appeared random.

u/lpetrich 18d ago

Sanskrit? Ancient Greek has better distinction of inherited vowel qualities: ancestral e, a, o > Sanskrit a.

u/lpetrich 18d ago edited 18d ago

Indo-European is unusual among language families of its time depth in having attested members of great antiquity (List of languages by first written account - Wikipedia). Among families with present-day speakers, only Semitic beats it in oldest attested member, with Akkadian (2600 BCE) beating Hittite (1700 BCE).

However, IE and Sino-Tibetan beat Semitic in oldest attested members with descendants that are still spoken:

  • IE: Greek (1450 BCE), Sanskrit (1500 - 1000 BCE: Indic)
  • Sino-Tibetan: Old Chinese (1250 BCE: Chinese)
  • Semitic: Hebrew (1000 BCE), Aramaic (1000 BCE)

Bayesian phylogenetic analysis of Semitic languages identifies an Early Bronze Age origin of Semitic in the Near East - PMC has date 3750 BCE for Proto-Semitic.

  • East Semitic: Akkadian: 2600 BCE X
    • Central Semitic: (calc) 2450 BCE
      • Arabic: 750 BCE
      • Northwest Semitic: (calc) 2100 BCE
    • South Semitic: (calc) 2650 BCE

Northwest Semitic: (calc) 2100 BCE

  • Hebrew-Aramaic (calc) 1500 BCE
    • Hebrew: 1000 BCE
    • Aramaic: 1000 BCE

South Semitic:

  • Southeastern Semitic (Modern South Arabian): (calc) 100 BCE
  • Southwestern Semitic
    • Old South Arabian: Sabaean: 850 BCE X
    • Ethiosemitic: (calc) 800 BCE
      • Ge'ez: 350 CE X
      • Amharic: 1200 CE (?)

u/Eannabtum 18d ago

> Among families with present-day speakers, only Semitic beats it in oldest attested member, with Akkadian (2600 BCE) beating Hittite (1700 BCE).

Actually, Semitic is part of Afro-Asiatic, so the family is attested from ca. 3200 BC (Egyptian).

u/lpetrich 18d ago

I was a bit reluctant to mention Afroasiatic, because there is not nearly as much that can be firmly reconstructed as there is for (say) Indo-European or Austronesian or Bantu. One can reconstruct personal pronouns and pronoun-related inflections, a bit of other grammar like feminine *-t, some basic vocabulary like *dam- "blood", but not much more.

Estimates of the age of AA range from 20,000 to 10,000 years ago (18,000 BCE - 8000 BCE), roughly from the end of the Last Glacial Maximum to the beginning of the Holocene.

There are two major attempts to reconstruct AA, and they do not agree on very much.

u/Reasonable_Regular1 18d ago

> Old South Arabian: Sabaean: 850 BCE X

The paper you linked doesn't mention Sabaic, and that's because it's a Central Semitic language. (Likewise, its claim that "no consensus exists for placing Arabic in either the Central or South Semitic group" is a good thirty years out of date.)

u/lpetrich 18d ago

List of languages by first written account - Wikipedia

Families first attested after the Indo-European Big Three, sometimes long after, with an age comparable to that of IE itself, yet generally accepted:

  • Austronesian origin time 3500 BCE: Malayo-Polynesian o.t. 2250 BCE: Cham 350 CE, Old Malay 683 CE, Old Javanese 804 CE
  • Austroasiatic o.t. 2000 BCE: Old Khmer 611 CE, Vietnamese 1440 CE
  • Uralic o.t. 2000 BCE: Hungarian, Finnic 1200 CE
  • Kra-Dai o.t. 2000 BCE: Thai 1292 CE
  • Uto-Aztecan o.t. 2000 BCE: Classical Nahuatl 1550 CE
  • Niger-Congo o.t. (?): Benue-Congo o.t. 4500 BCE: Bantu o.t. 3000 BCE: Kikongo 1557 CE, Swahili 1711 CE
  • Pama-Nyungan o.t. 3000 BCE: Guugu Yimidhirr 1886 CE

Sources:

u/lpetrich 18d ago edited 18d ago

List of languages by first written account - Wikipedia

Most language families with a time depth comparable to Indo-European, about 4000 - 2000 BCE, that have surviving members, have those members first attested later than the traditional Big Three of Indo-European: Latin, Greek, and Sanskrit.

  • Latin: earliest writing 550 BCE, beginning of surviving literary tradition 200 BCE
  • Greek: earliest writing 1450 BCE, beginning of surviving literary tradition 800 BCE
  • Sanskrit: earliest surviving compositions 1500 - 1000 BCE: the Vedas

I'd mentioned Semitic as having surviving members with comparable age: Hebrew, Aramaic, Arabic. Others:

  • Sino-Tibetan: origin time 6000 - 4000 BCE, oldest attested members Old Chinese 1200 BCE, Old Tibetan 765 CE, Burmese 1058 CE
  • Dravidian: origin time (?), oldest attested members Old Tamil 200 BCE, Old Kannada 450 CE, Telugu 575 CE, Old Malayalam 850 CE

Sources for Sino-Tibetan origin:

u/Waste_Cartographer49 19d ago edited 19d ago

Nvm I misread the post

u/lpetrich 19d ago

Basque is one language, so it isn't comparable to Indo-European.

Furthermore, the only generally-accepted relatives of Basque are Aquitanian and possibly Iberian.

Numerous other ones have been proposed, with none of them gaining general acceptance.

u/Willing-One8981 19d ago

I'm not a linguist either, but I'd assume a proto language could not be reconstructed from a language isolate.

u/lpetrich 18d ago

For a multi-dialect language like Basque, one may reconstruct a protolanguage from its dialects. But that isn't anywhere near comparable to Indo-European.