Discuss results - Methodology in Greek and Latin

Methodology in Greek and Latin
Posted by: Dionysius Thrax on Sept. 9, 2013, 21:06
There are some serious problems in the distance calculations of Greek-to-Modern Greek and Latin-to-Italian. If indicative of your entire methodology, these problems cast substantial doubt on all your comparisons and measures. I hope you can correct them and adjust your overall results accordingly.
In the case of ancient Greek to modern Greek: you list the ancient language as extinct, and calculate a distance based on a small sample of words. However, ALL of the ancient words you use are perfectly familiar to a fluent speaker of modern Greek, and are in regular use in daily conversation, in public media, in books high and low, and so on. I can go through ALL of your sample words this way, but will refrain in the interest of time and space. The point is that the distance you calculate is completely erroneous -- given this particular sample, the distance is plainly ZERO. You seem to have simply gone through Google Translate and not bothered to ask native speakers. To be fair, there are ancient words not in current use, but they are not among those of your list. Even those more obscure words, however, are more a matter of vocabulary -- some people have a greater and some a lesser vocabulary, even within a single language. A person with a large modern Greek vocabulary will also have a correspondingly large ancient Greek vocabulary, and the distance will again be smaller than you calculate.
This leads to the next problem.
My comments above also hold for Italian versus Latin -- most Italians will understand most of the Latin words presented -- perhaps not the 100% that I mention for Greek, but something quite close. However, they will not be able to handle as much of Latin syntax as a modern Greek reader can handle of ancient Greek syntax.
Thus, even apart from vocabulary, modern Greek is closer to ancient Greek than Italian is to Latin. Again, your methodology misses this dimension entirely, skewing the results in a way that confuses anyone not familiar with the relevant languages. Instead of shedding light on your subject, you are, inadvertently, spreading misinformation, made worse by the false appearance of objectivity.
What I am pointing to regarding Italian and Latin carries through to all the other languages derived from Latin. In every case (Greek, Latin, or any other language), the comparisons you are making should at least track etymologically related words -- or the results are simply not meaningful. To give an example: a comparison between "automobile" and "auto" says something about linguistic evolution. A comparison of "automobile" and "car" does not, at least not in a comparison of consonant matches. The words are completely unrelated, but the use of "car" does not mean that people do not understand and use "automobile." Furthermore, the "auto" will also lead to Greek, and the "mobile" will lead to Latin. "Car" would go through French to Latin to Gaelic, along a completely different etymological line. Comparing consonants between "auto" and "car" is like comparing apples to lobsters or gyroscopes.
This kind of error is not trivial, and needs to be corrected before any reliable inferences can be drawn from your study. I hope you can make the necessary corrections, because your project is admirable and would be very useful if it could be relied upon.
Posted by: Vincent on September 9, 2013, 22:34
Thank you for your interesting comparison, which gives me a chance to explain better what my system reflects.
The words I have chosen are very stable semantically, have few or no synonyms, and are resistant to borrowing.
I took the words from different sources - mostly dictionaries - aiming to get the words in the exact shape as they are today or - for extinct languages - as they were at a given time. What really matters from there is the codification in consonant groups which is the basis for the algorithm.
The genetic distance my system reflects is just based on the same 18 words throughout the study. So it doesn't reflect syntax at all. The idea is to consider these 18 words as a "gene" which reflects one aspect of languages: a kind of "lexical nucleus". Using the same "lexical nucleus" across the whole study is a step toward objectivity.
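As a rough illustration of the idea of coding words into consonant groups and comparing a fixed word list, here is a minimal Python sketch. It is not the actual algorithm used in this study, whose details are not given in this thread: the consonant classes, the two-consonant skeleton, the 0-100 scoring, and the function names `skeleton`, `word_distance` and `list_distance` are all assumptions made for the example, and the Latin/Italian pairs are only a small illustrative sample.

```python
# Hypothetical consonant classes (an assumption, loosely in the spirit of
# sound-class approaches; the study's real grouping is not stated here).
CLASSES = {
    "P": "pbfv",
    "T": "td",
    "K": "kgcqx",
    "S": "sz",
    "M": "m",
    "N": "n",
    "R": "rl",
    "H": "h",
}
LETTER_TO_CLASS = {ch: cls for cls, letters in CLASSES.items() for ch in letters}


def skeleton(word, size=2):
    """Return the first `size` consonant classes of a word, ignoring vowels."""
    out = []
    prev = None
    for ch in word.lower():
        cls = LETTER_TO_CLASS.get(ch)  # None for vowels and unmapped letters
        # Skip vowels, and collapse doubled consonants such as "tt" or "cq".
        if cls is not None and cls != prev:
            out.append(cls)
        prev = cls
    return out[:size]


def word_distance(a, b):
    """0 if the skeletons agree position by position, 100 if nothing agrees."""
    sa, sb = skeleton(a), skeleton(b)
    n = max(len(sa), len(sb))
    if n == 0:
        return 0.0
    matches = sum(x == y for x, y in zip(sa, sb))
    return 100.0 * (n - matches) / n


def list_distance(pairs):
    """Average word distance over a fixed list of word pairs."""
    return sum(word_distance(a, b) for a, b in pairs) / len(pairs)


# Illustrative Latin / Italian pairs: water, nose, tooth, night.
pairs = [("aqua", "acqua"), ("nasus", "naso"), ("dens", "dente"), ("nox", "notte")]
print(list_distance(pairs))  # 12.5 (only nox/notte differs, by one class)
```

Because the score depends only on the consonant skeleton, a sketch like this says nothing about syntax or about whether speakers still understand the older word. It measures only how far the chosen forms have drifted in their consonants.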
Most modern languages have a particular history - some are a compromise between dialects which has become an official language, others are a main dialect which has prevailed over other dialects in a particular area. I will check again with my Greek source (a colleague of mine) and come back again. But I think your point about Greek may have to do with a particular fact: the Greek language is one of the few world languages having been a lingua franca for a very long time. As such, it is certainly more difficult to draw lines between the modern state and more ancient ones: the ancient state has always had a bigger influence on the evolution than with languages which have evolved without constant reference to their protolanguage. It is certainly an interesting topic and I am not an expert here - but I think that all this should not have any impact on the "lexical nucleus" used in this study - except of course, if I took the wrong words...
As to Italian vs. Latin - the words are very close, but still different. What matters is the degree of difference. In fact, the genetic distance between ancient Greek and modern Greek (30) is quite close to the distance between Latin and Italian (26), which reflects the respective historical distance between these languages.
Regarding your automobile / car example - I agree completely with you, and this is the reason why I have chosen only 18 words which are so basic to life that they can come only from the very core of each language.
One example showing that using these core words as a "lexical nucleus" reflects a genetic distance regardless of historical influences is the case of Arabic vs. Maltese: Maltese is Arabic at its core but has undergone such massive outside influences that it looks more distant from Arabic than Italian does from Latin. Still, the genetic distance based on the core vocabulary is 19 - historically, Maltese split from Arabic later than Italian did from Latin. In this sense, is the problem with Greek that modern Greek has never really split from ancient Greek?
Posted by: Dionysius Thrax on Sept. 10, 2013, 01:50
In the meantime, the words: as I pointed out, ALL of them are understandable to a modern Greek, and all are in everyday use, so that part of the answer doesn't really need further comment. It's just 100%, all the way down the list.
What I can help with is this, though: some words are not comparable.
Eye: ὀφθαλμός versus μάτι. If you want to use μάτι, you should compare it to ομμάτιον from όμμα (nominative), όμματος (possessive). Basically, all that has changed is that the initial omicron has been dropped, but even that not always. People in daily conversation would not skip a beat if they heard these words. Οφθαλμός is in current use. Everyone goes to the ophthalmologist, who is called an ὀφθαλμόλογος.
Incidentally: there is an iPhone app called Epetymon that traces the etymology of Greek to the most fundamental stems. It allows you to go from modern to ancient Greek, listing all sorts of intermediate forms, and showing the substitutions along the way. It would help to use it with a Greek speaker nearby.
Next word: Nose: again you are using two unrelated stems. If you use ῥίς (ρινός, possessive), you would be referring to the same root from which rhinoceros comes, and everyone who has ever heard of a rhinoceros (ρινόκερος) would understand you. The proper comparison would be between ῥίς, ρινός, and ρίνα, ρίνας. If, on the other hand, you stayed with μύτη, well, that's ancient already, coming from μύτις, snout.
Οτορινολαρυγκολογος is a fun word: Ear-Nose-Throat-o-logist. Οτο-ρινο-λαρυγκο-λογος, all there, all ancient, in modern Greek.
Next: τελευτάω versus θάνατος. Well, θάνατος is both in current use and is older than Homer. Likewise, τελευτάω and its derivatives. Of course, θάνατος means "death" directly, and τελευτάω is a verb and means, in this form, "I end" -- which could and still can mean "I die" or "I make someone die," just as if we said "I end (myself)" or "I end (someone)" -- this works with or without the various niceties: "I [put to an] end" for instance.
Next, same issue: ὕδωρ is a current-ancient word. A modern Greek couldn't say "hydraulics" or "plumber" (υδραυλικός) without it. Likewise, νερό has ancient roots, going back to νεόω (νέος), meaning young and flowing, and connecting to the names of a million things from ships to water spirits (Diana Nyad's name, for instance, from ναιάς). Some of that thread is here:
νηρός , ά , όν , of fish, = νεαρός, fresh, PCair.Zen. 616 (iii B.C.), Xenocr. ap. Orib.2.58.63; cf. ἡμίνηρος. νηρόν, τό, or νηρός, ὁ, water, OGI 201.21 (Nubia, vi A.D. ), cf. Phryn. 29; acc. sg. written τὸν νιρόν PSI3.165.3 (vi A.D.); cf. Mod.Gr. νερό. f.l. for νειρός in Lyc. 896, glossed κάθυγρος, Suid.s.v. νηρίτης (interpol.), but ταπεινός, Hsch.
Then there are some strange entries, like ἄνεμος versus άνεμος! That's identical! How does it get a 66? Shall we blame Erasmus? His pronunciation has been questioned and is no longer to be trusted. It turns out that by Hellenistic times, Greek sounded pretty much the way it sounds today. To insist otherwise is a little like insisting that Versailles should be pronounced "Ver-sails" -- not very persuasive to a French person.
To repeat, the rest of the words are basically also 100% matches. Everyone knows νύξ and uses it interchangeably with νύχτα, usually choosing according to formality of occasion, but more often as a matter of creative expression. In ordinary speech, people play with the language as if it were music, using the form that sounds best or that generates emphasis where emphasis is desired.
Did I miss anything? Not really. Even the low scores are not accurate: for instance ὀδούς versus δόντι. The possessive case of ὀδούς is ὀδόντος, from which δόντι derives directly. To make the comparison fair, the nearest forms should be compared.
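The suggestion to compare the nearest forms can be sketched in code. The following Python example is only an illustration under stated assumptions: it works on naive romanizations (odous, odontos, donti, and so on) rather than on Greek script, and the consonant classes, the digraph handling, the scoring, and the helper names `skeleton`, `word_distance` and `nearest_form` are hypothetical, not the method used on the site.

```python
# Hypothetical consonant classes and romanization digraphs (assumptions).
CLASSES = {"P": "pbfv", "T": "td", "K": "kgcqx", "S": "sz",
           "M": "m", "N": "n", "R": "rl", "H": "h"}
LETTER_TO_CLASS = {ch: cls for cls, letters in CLASSES.items() for ch in letters}
DIGRAPHS = {"ph": "f", "th": "t", "ch": "k"}


def skeleton(word, size=2):
    """First `size` consonant classes of a romanized word, vowels ignored."""
    word = word.lower()
    for digraph, single in DIGRAPHS.items():
        word = word.replace(digraph, single)
    out, prev = [], None
    for ch in word:
        cls = LETTER_TO_CLASS.get(ch)  # None for vowels and unmapped letters
        if cls is not None and cls != prev:  # collapse doubled consonants
            out.append(cls)
        prev = cls
    return out[:size]


def word_distance(a, b):
    """0 when the skeletons agree position by position, 100 when nothing agrees."""
    sa, sb = skeleton(a), skeleton(b)
    n = max(len(sa), len(sb))
    if n == 0:
        return 0.0
    matches = sum(x == y for x, y in zip(sa, sb))
    return 100.0 * (n - matches) / n


def nearest_form(ancient_forms, modern_form):
    """Pick the ancient form (e.g. nominative or possessive) closest to the modern word."""
    scored = ((form, word_distance(form, modern_form)) for form in ancient_forms)
    return min(scored, key=lambda pair: pair[1])


# Tooth: nominative "odous" and possessive "odontos" against modern "donti".
print(nearest_form(["odous", "odontos"], "donti"))
# Eye: three ancient candidates against modern "mati".
print(nearest_form(["ophthalmos", "omma", "ommation"], "mati"))
```

Under these assumptions the possessive odontos and the diminutive ommation come out as the nearest forms, each with a distance of 0, whereas comparing the nominative odous with donti would give 50 even though the modern word derives directly from that stem.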
The background for all this goes back many centuries. Over its long history, the country has become accustomed to having a low and high language as well as an old and new variation, and many dialects. Added to these causes is the fact that many people are multilingual, and, more recently, that everyone is affected by the Internet. The outcome is a rather delightful linguistic versatility, a kind of polysemantic polyphony which is very hard to translate but that can be explained bit by bit, as I am trying to do here. I can explain the parts, but I can't quite convey the fluidity of shifting registers mid-sentence and never tripping. If you like some of John Zorn's rapid-transition music, though, you can get a sense of it there.
Ah, I did miss one: τίς versus Ποιος. Different roots, both ancient, both in current use. The ancients would have been more discriminating in when to use which, but both are understandable to this day. The fair comparison would have been between ποίος and ποιός -- meaning that only the accent has changed, and then not always. Τις is still in use, as in the ancient and still used phrase "τις ει" -- "who's there?".
From this explanation, I would say that given these words, the score should be perfect or near perfect. If you would like to use other words, I'll be happy to comment on them. As I mentioned in my initial reply, there are ancient words that are not used -- but this is more related to vocabulary loss than to them becoming incomprehensible. Anyone who takes the time to develop a better vocabulary has access to the ancient language. The more difficult difference is the syntax, but that is not insuperable. The greatest difference, though, is that the ancient language is a vast superset of the modern -- meaning that there are things you can say directly with it that you can only express in more convoluted ways in the modern. This is accomplished by having access to the full grammar, with all its tenses and moods and declensions, a construct which is really quite formidable and beautiful to behold. Thus it is easier to read and understand the ancient language than to use it -- but, with practice, one can get spoiled -- the modern language begins to feel simplistic and contaminated, and the old one far superior. People who have studied it for centuries have come upon this realization. It's quite eye-opening. It helps one realize what the Renaissance was really all about.
In any event, the linguistic continuity is real and palpable and more easily accessible than perhaps any other. I've taken the time to reply in detail because I am genuinely delighted by this phenomenon -- it is really quite incredible to pick up and read the letters between Alexander the Great and his teacher, Aristotle, for example, and to have the centuries vanish. Or to read Plato and laugh with Socrates as if you're reading a contemporary novel, with the whole chatter of the dialog bursting to life in a way that translations can only hint at. The effect is stunning in its clarity, and nothing like comparing English to Old English, for example, where the language is unstable and strange and gives the impression of being more primitive than the present one.
Homer's Greek is more difficult because one needs to be reminded of more of the vocabulary, but it's not terribly hard, and as one reads it, one finally realizes where most modern Greek words come from. The Grammar of Dionysios Thrax is just luminous in its clarity and brevity. Quite honestly, I wouldn't mind seeing it being taught in today's schools of Greek. By the time we get to Hellenistic Greek (Koine), reading becomes effortless. And here we are!
OK, enough for now. I hope this is helpful and interesting. Your project is fascinating and important. I'll be happy to watch it develop.
All the best!
Posted by: Vincent on September 11, 2013, 21:37
Thank you again for this. The history of the Greek language is fascinating and it is difficult for me to make the right stem choices as I have no knowledge of Greek. On the other hand, it is a very important language for most linguists and the Greek data used in this study should be perfect...
I have changed the data according to your remarks - with a few exceptions for which I have questions. But the genetic distance has already gone down to 22...
My questions:
1) Regarding "Death" - it is probably the most difficult word to handle among the 18 words I have selected for the study. It is quite exposed to semantic shift - although the shift occurs almost always between "die", "kill" and "murder". In fact, I have to use the stem - whether it comes from a verb or a noun. For death, I have used the stem for "die" in many languages, as it seems to be more stable and resistant to semantic shift in a verb than in a noun. Interestingly, the "-M-T-" or "-M-*-T-" stem is common to many Indo-European and Semitic languages... A "T(H)-N-T-" stem seems to prevail in Greek?..
2) Are the "ω" and "ς" at the end of τελευτάω and θάνατος part of the stem or are they a mark for nominative? I have to compare the lexical phonemes only...
3) Re. Water / νερό / ὕδωρ, I am not sure what is the best thing to use, as I should keep using words having exactly the same meaning... It is the same dilemma as for "death". Any ideas? At present, the "50" I get when comparing the two different stems is in fact just chance, not because the stems are cognates...
There is another topic I have long been looking for information about, but until now without result: Tsakonian. Do you know a source where I can get these 18 words in Tsakonian? There is a Swadesh list for many languages, but not for Tsakonian. I suppose there are great resources available only in Greek, which is the reason why I didn't find anything. It would be interesting to see which genetic distance we get between Tsakonian and Greek on the basis of these 18 words - I suppose it will be somewhere between 30 and 40...
Best regards
Posted by: Dionysius Thrax on Sept. 16, 2013, 21:56
I'm happy you're finding all this useful. No one can know all languages, so I realize it's imperative for people to help you do what you've set out to do, and I'm pleased to be able to assist. Besides, all of this is an interest of mine, too, so it's fun to look into these matters.
About Tsakonian: I don't know much about it specifically (except what might be available to me through modern and ancient Greek, which might still be useful), but I can ask and see if people I know might be able to help -- I do know some people involved in the study of dialects in Greece, and the dialects are often closer to the ancient Greek than most of the modern language is. If I can locate anybody, I'll let you know.
If you have any specific questions, send them along, they might be useful in getting people interested...
1) Regarding "death": "τελευτάω" comes from "τέλος," which we know and use as telos, teleology, etc. The word means "end" but also "purpose," "perfection," and so on. Any comparison would have to contain the "τελ" part, which, of course, "θάνατος" does not. "Θάνατος" comes from "θείνω" (I strike, I slay), so the meanings are related regarding the end, but one is the natural and expected outcome (which, of course, can be brought about by force, just as we can say "he ended him") and the other is more about the way in which this end comes around, and the fact that even an expected end comes as a blow to both the person suffering it and to those who care. Both words may have some relation to your stem ("T(H)-N-T-") probably through the verb "τείνω" (to tend), but that doesn't help in terms of distance, because the distinctions are ancient (they are already present in Homer) and all three variations are still in everyday use. Whichever one you use, the distance would be zero, but it would be best to pick one and stay with it for the sake of correctness of data and integrity of results.
2) Neither the "ω" nor the "ς" is part of the stem. The "ω" is the first-person singular suffix of the verb; the "ς" is part of the nominative, as you state. Both vary accordingly.
3) Regarding water: water being as fundamental as it is, the roots go as far back as possible. As I said, both νερό and ὕδωρ are perfectly comprehensible to this day, but you are more likely to ask for a glass of νερό. There are plenty of situations where the preference would be reversed. If we stay with νερό and trace backward, we would find that the stem is already implied by the very first letter -- the ν~ already hints at everything liquid, flowing, new, mutable, etc. After that, the following sounds can and do vary -- it could become νε~ or να~ or νη~... or νεαρ~, νηαρ~, ναιαρ~, or νησ~, ναιαδ~, and all the permutations would (and do) retain their connection to water. These shifts happened thousands of years ago, though, and, as with all the other cases, are present in the modern language, so they do not signify present distance. Whatever distance they signify must have happened a much longer time ago, perhaps in Linear B times, or before. Some of the changes back then were not necessarily conceptual; they had more to do with dialects, which eventually were consolidated into (mostly) the Athenian Attic and the Hellenistic/Alexandrian Koine.
I'm not sure this is helping you. My suggestion is that you pick one option and stay with it. The only thing that does not work is to try to say that νερό is distant from ὕδωρ -- that's not apples and oranges, it's oranges and clockwork Apple computers, or some such contrived comparison.
To make this a bit simpler and usable, you could compare "νεαρόν" to "νερό" -- this would give the linguistic continuity. A parallel comparison of the ancient "νεαρόν" to "νηρόν" to "νερόν" would be appropriate. The word chain would probably be something like this: ναω (I flow) > νεοργόν (refreshing) > νεαρόν > νηρόν > νερόν (water, formal) > νερό (water, casual). Proof of the distinctness of the words is given in that νερό is actually taken from the phrase "νηρόν ὕδωρ," meaning, literally, "refreshing water" (from flowing, rather than stagnant, water), which, over time, has been abbreviated to just νερό. It's literally as if we used the phrase "may I have some fresh water, please" and finally just got lazy and decided to say "may I just have a glass of fresh, please"... which is not all that different from saying that "refreshments will be served."
With all of that in mind, it's now quite clear that the "ὕδωρ" is the substance itself, now as then. The root for that is *ud-n-t(o) (sorry, I don't have the character for the "n" on my keyboard, it has a small circle beneath it). The words that come from it often begin with "υδρ~"... while "νερό" is related to "new," "novus," etc.
Well, that's it. I hope it's useful!
All the best.
