Quantitative comparative linguistics

eLinguistics.net's Mission & Statement:
  -> Making language relatedness easily perceivable with a simple quantification.
  -> Setting up a 100% automated language classification.
  -> Check the updated language tree (July 2024).

This blog presents a completely computerized model for comparative linguistics. The quantification of language relationships is based on a probabilistic model, assessing basic vocabulary according to clear rules. It makes it possible to generate a 100% automated language classification into families and subfamilies and to generate objective distance values for language comparisons.

You can compare languages in the calculator and get values for the relatedness (genetic proximity) between 550 languages. An evolutionary tree summarizes all the results, showing how these languages relate to each other (single isolated languages are not displayed in the tree).

This comparative linguistics approach takes you on a short digital trip through the history of languages. You will see how 18 carefully chosen words can deliver values that are sufficient to calculate a distance between two or more languages and to represent it on a tree. The distances are expressed as values from 0 (the smallest distance, i.e. the same language) to 100 (the largest possible distance). Play with these values in the calculator! You will recognize proximities that you can sense yourself if you know some of the languages used in this study.

A few examples illustrate the idea behind this comparative linguistics project. The system's assessment of the distance, on a scale from 0 to 100, between the following languages is:
  • English to German: 31
  • Dutch to German: 14
  • Danish to Norwegian: 4
  • Russian to German: 50
  • Russian to Polish: 9
  • Arabic to Hebrew: 28
  • Arabic to Maltese: 24
  • Finnish to Hungarian: 50
  • Finnish to Estonian: 17

This gives you a first idea of what this site is about. From the few examples above, you can conclude that the degree of proximity between Russian and German (both Indo-European languages) is about the same as that between Finnish and Hungarian (both Finno-Ugric): both pairs score 50.

Once we have such values, we can generate a matrix like this one, summing up the genetic distances between some languages (values from the few examples above have a green background in the matrix):

Language matrix
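
As a rough illustration of how pairwise values become a matrix, here is a minimal Python sketch. It uses only the nine distances quoted in the examples above; the names PAIRS, build_matrix and distance are hypothetical helpers made up for this sketch, not part of the site's software. Pairs for which no value was quoted are left as None rather than guessed.

```python
# Distances quoted in the examples above (0 = same language, 100 = largest distance).
PAIRS = {
    ("English", "German"): 31,
    ("Dutch", "German"): 14,
    ("Danish", "Norwegian"): 4,
    ("Russian", "German"): 50,
    ("Russian", "Polish"): 9,
    ("Arabic", "Hebrew"): 28,
    ("Arabic", "Maltese"): 24,
    ("Finnish", "Hungarian"): 50,
    ("Finnish", "Estonian"): 17,
}


def build_matrix(pairs):
    """Return (languages, matrix) where matrix[i][j] is the distance
    between languages[i] and languages[j], or None if it is unknown."""
    languages = sorted({lang for pair in pairs for lang in pair})
    index = {lang: i for i, lang in enumerate(languages)}
    n = len(languages)
    matrix = [[None] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 0  # a language is at distance 0 from itself
    for (a, b), value in pairs.items():
        # The distance is symmetric, so fill both halves of the matrix.
        matrix[index[a]][index[b]] = value
        matrix[index[b]][index[a]] = value
    return languages, matrix


languages, matrix = build_matrix(PAIRS)


def distance(a, b):
    """Look up the distance between two languages by name."""
    return matrix[languages.index(a)][languages.index(b)]


print(distance("German", "English"))
print(distance("English", "Polish"))
```

The second lookup returns None because no English to Polish value appears in the examples; a complete matrix needs a value for every pair of languages.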
Finally, out of this distance matrix, we generate an evolutionary tree, using the same system, and in fact the same software, as in genetics and biology (details under Resources):
Language evolutionary tree
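
To give a feel for how a tree can be derived from a distance matrix, here is a minimal Python sketch of average-linkage clustering (UPGMA), a classic distance-based method from phylogenetics. This is an assumption chosen for illustration: this page does not say which algorithm or software the site uses (see Resources for that). The labels A to D and the values in the matrix are made-up placeholders, not results from the calculator, and upgma is a hypothetical function name.

```python
def upgma(labels, dist):
    """Average-linkage (UPGMA) clustering on a full symmetric distance matrix.
    Returns the tree as nested 2-tuples of labels."""
    # Active clusters: id -> (subtree, number of leaves in it).
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    # Distances between active clusters, keyed by (smaller id, larger id).
    d = {
        (i, j): dist[i][j]
        for i in range(len(labels))
        for j in range(i + 1, len(labels))
    }
    next_id = len(labels)
    while len(clusters) > 1:
        # Merge the two closest clusters.
        a, b = min(d, key=d.get)
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        del d[(a, b)]
        # Distance from the merged cluster to every remaining cluster is the
        # size-weighted average of the two old distances.
        for c in clusters:
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


# Placeholder data only: four hypothetical languages with invented distances.
labels = ["A", "B", "C", "D"]
dist = [
    [0, 10, 40, 60],
    [10, 0, 40, 60],
    [40, 40, 0, 60],
    [60, 60, 60, 0],
]

print(upgma(labels, dist))  # ('D', ('C', ('A', 'B')))
```

In the printed result, A and B are grouped first because they are closest, C joins them next, and D is the most distant branch. A real run would need a complete matrix of calculator values for every pair of languages.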

Blog author: Vincent Beaufils