Tatsama, Tadbhava, and Beyond

language

An introduction to tatsama and tadbhava vocabulary in Indo-Aryan languages and applications of the concept in non-Indic contexts.

Author

Shaandro Sarkar

Published

February 17, 2023

1 Introduction

The Vedas are among the earliest extant texts in any Indo-Aryan language. Their language, Vedic, is one of a group of related dialects collectively known as Old Indo-Aryan (OIA). The varieties of OIA evolved over the millennia to become Hindi-Urdu, Punjabi, Marathi, Gujarati, Bengali, Nepali, Sinhala, and the various other Indo-Aryan languages spoken today. OIA words gradually changed in pronunciation and meaning over time; these inherited terms form the bulk of the vocabulary of the modern Indo-Aryan languages. For example, OIA agní ‘fire’ evolved into āg in Hindi-Urdu, Marathi, and Gujarati; agg in Punjabi; agun in Bengali; ogun in Kashmiri; and āgo in Nepali through about three thousand years’ worth of sound changes.

Around the fifth century BC, grammarian and scholar Pāṇini compiled the Aṣṭādʰyāyī, a treatise that codified the grammar of Classical Sanskrit. A later form of the Vedic language, Classical Sanskrit became the predominant classical language of much of South and Southeast Asia. Due to the status of Sanskrit as a liturgical, literary, and scholarly language, the vernacular Indo-Aryan languages borrowed words directly from Sanskrit, supplementing their existing vocabulary inherited from OIA. Bengali, for instance, inherited šą̄jh ‘evening’ from OIA saṁdhyā́ and then reborrowed the Sanskrit word as šondha. Indo-Aryan languages are full of such doublets of inherited words and learned borrowings.

Indo-Aryan languages thus have two major types of Indo-Aryan vocabulary: inherited and borrowed. Native words, i.e. words inherited from OIA, are called tadbhava (Skt. tadbhava ‘born from that’). Learned borrowings from Sanskrit are called tatsama (Skt. tatsama ‘same as that’). In some cases, Indo-Aryan languages borrowed Sanskrit words but adapted them to native phonology; these words are called ardhatatsama (Skt. ardʰatatsama ‘half-tatsama’). From the same OIA word raudrá, Bengali has the inherited tadbhava rod, the borrowed tatsama roudro, and the adapted ardhatatsama roddur.

After the western half of the Roman Empire collapsed in the fifth century, the colloquial varieties of Latin, called Vulgar Latin, quickly began to diverge from each other, with the various dialects of Vulgar Latin evolving into Spanish, Portuguese, French, Italian, Romanian, and the various other Romance languages. Vulgar Latin words gradually changed in pronunciation and meaning over time; these inherited terms form the bulk of the vocabulary of the modern Romance languages. For example, Vulgar Latin *pluviam ‘rain’ evolved into Spanish lluvia, French pluie, Portuguese chuva, Italian pioggia, and Romanian ploaie through almost two thousand years’ worth of sound changes.

Despite the vernacular use of Vulgar Latin and its descendants, Latin remained the predominant literary language during the first millennium AD, largely due to Latin’s status as the liturgical language of Roman Catholicism as well as the language of academics and scholars. Starting in the Middle Ages, Romance languages also borrowed words directly from Classical Latin, supplementing the vocabulary inherited from Vulgar Latin. Spanish, for instance, inherited leal ‘loyal’ from Latin lēgālis and then reborrowed the same word as legal ‘legal’. Romance languages are full of such doublets of inherited words and learned borrowings.

The coexistence of native vocabulary inherited from a language’s ancestor with borrowings from the classical or literary form of that ancestor is not unique to the Romance languages!