
Indian Scripts Index for Herbs and Spices

  

The romanizations used on the Spice Pages are very close to both the common scientific transliteration of Sanskrit (IAST) and the extended ISO 15919 standard. Pure-ASCII schemes such as Harvard-Kyoto and ITRANS differ considerably, since they replace the diacritics with capital letters and digraphs.


Additionally, Devanagari transliterations are given for languages that have another native alphabet. These Devanagarizations use a couple of special signs to accommodate sounds alien to most Indo-Aryan languages. In general, there is a 1:1 correspondence between Devanagari and native letters. There are two exceptions: the anusvara has been used in Devanagari throughout, even if the native alphabet expresses nasalization by means of a different character, and a few native signs not representable in Devanagari had to be replaced by near-matches. Since the Devanagari transliterations are produced programmatically by sed, some of them might be systematically flawed.
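The 1:1 correspondence described above can be illustrated with a small sketch. It is not the sed script used for this page. It is a Python stand-in that assumes the parallel layout of the Unicode blocks for the major Indic scripts, so that most letters map to Devanagari by a fixed code point offset. The names `SCRIPT_BASES`, `OVERRIDES` and `to_devanagari` are hypothetical, and the exception table holds only one entry (the Gurmukhi tippi, rendered as anusvara); a real table would need more entries.

```python
# Sketch only: offset-based transliteration into Devanagari.
# Assumption: the Unicode blocks listed below share Devanagari's layout,
# which holds for most letters but not for every sign.

DEVANAGARI_BASE = 0x0900

# First code point of each 128-code-point Unicode block.
SCRIPT_BASES = {
    "bengali": 0x0980,
    "gurmukhi": 0x0A00,
    "gujarati": 0x0A80,
    "oriya": 0x0B00,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}

# Hypothetical exception table for signs whose offset twin would be wrong.
# Gurmukhi tippi (U+0A70) marks nasalization, so it becomes anusvara (U+0902).
OVERRIDES = {
    "gurmukhi": {0x0A70: "\u0902"},
}


def to_devanagari(text: str, script: str) -> str:
    """Map every character of the given script block to Devanagari."""
    base = SCRIPT_BASES[script]
    overrides = OVERRIDES.get(script, {})
    out = []
    for ch in text:
        cp = ord(ch)
        if cp in overrides:
            out.append(overrides[cp])
        elif base <= cp < base + 0x80:
            out.append(chr(cp - base + DEVANAGARI_BASE))
        else:
            out.append(ch)  # spaces, punctuation, Latin text stay as they are
    return "".join(out)


# Gujarati "Gujarati" and Gurmukhi "Punjabi", written with escapes.
gujarati_word = "\u0a97\u0ac1\u0a9c\u0ab0\u0abe\u0aa4\u0ac0"
punjabi_word = "\u0a2a\u0a70\u0a1c\u0a3e\u0a2c\u0a40"
print(to_devanagari(gujarati_word, "gujarati"))
print(to_devanagari(punjabi_word, "gurmukhi"))
```

Sinhala is left out on purpose, because its Unicode block does not follow the same layout and would need a full lookup table instead of an offset.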

Displaying this page correctly is currently a true challenge to all except the most recent computer systems, and even many of those will fail. You will not only need fonts for all scripts used, but you must also make sure that your browser (or the underlying operating system) can handle the complex rules of Indic typography correctly. Malayalam is an especially tricky case, and many current systems fail the following test: മഞ്ഞള്‍ (maññaḷ) and എള്ള്‌ (ĕḷḷə). You should see only one virama (a breve-like mark) over the last letter of the second word, while on some systems up to four viramas are shown, or characters are replaced with question marks or square boxes. Fortunately, Unicode 5.1 offers a new way to encode the first word as മഞ്ഞൾ, which should be easier for renderers to interpret (as soon as the fonts are updated).
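The difference between the two encodings of the first test word can be made concrete. The following sketch assumes that the older spelling ends in the sequence consonant + virama (U+0D4D) + zero width joiner (U+200D), and it rewrites that sequence as the single chillu letter introduced in Unicode 5.1. The function name `to_atomic_chillu` is hypothetical, and only the five common chillus are covered.

```python
# Sketch only: convert legacy chillu sequences to Unicode 5.1 chillu letters.

VIRAMA = "\u0d4d"  # MALAYALAM SIGN VIRAMA
ZWJ = "\u200d"     # ZERO WIDTH JOINER

# Base consonant -> atomic chillu letter.
LEGACY_TO_CHILLU = {
    "\u0d23": "\u0d7a",  # NNA -> CHILLU NN
    "\u0d28": "\u0d7b",  # NA  -> CHILLU N
    "\u0d30": "\u0d7c",  # RA  -> CHILLU RR
    "\u0d32": "\u0d7d",  # LA  -> CHILLU L
    "\u0d33": "\u0d7e",  # LLA -> CHILLU LL
}


def to_atomic_chillu(text: str) -> str:
    """Replace each consonant + virama + ZWJ sequence by its chillu letter."""
    for base, chillu in LEGACY_TO_CHILLU.items():
        text = text.replace(base + VIRAMA + ZWJ, chillu)
    return text


# The first test word in its legacy spelling, written with escapes:
# MA, NYA, virama, NYA, LLA, virama, ZWJ.
legacy = "\u0d2e\u0d1e\u0d4d\u0d1e\u0d33\u0d4d\u200d"
atomic = to_atomic_chillu(legacy)
print(len(legacy), len(atomic))  # 7 5
```

Sequences that end in a zero width non-joiner (U+200C), which asks for a visible virama instead of a chillu form, are left untouched by this function.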

The entries are sorted according to the canonical Devanagari collating sequence, which is mimicked by all the other Indic scripts. Anusvara is sorted as if it were written as a nasal. The handling of the implicit vowel is still somewhat unsystematic; it is often ignored in sorting if it is written but not pronounced.
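The treatment of the anusvara can be sketched as a sort key. This is only an approximation under stated assumptions: it handles Devanagari alone, it rewrites an anusvara only before one of the twenty stop consonants of the five classes, and it then falls back on plain code point order, which follows the traditional sequence only roughly. The names `VARGAS`, `class_nasal` and `nasal_sort_key` are hypothetical, and the handling of the implicit vowel is not modelled at all.

```python
# Sketch only: sort Devanagari words with anusvara treated as a class nasal.

ANUSVARA = "\u0902"
VIRAMA = "\u094d"

# (first stop, last stop, class nasal) for the five consonant classes.
VARGAS = [
    ("\u0915", "\u0918", "\u0919"),  # velar:     KA..GHA   -> NGA
    ("\u091a", "\u091d", "\u091e"),  # palatal:   CA..JHA   -> NYA
    ("\u091f", "\u0922", "\u0923"),  # retroflex: TTA..DDHA -> NNA
    ("\u0924", "\u0927", "\u0928"),  # dental:    TA..DHA   -> NA
    ("\u092a", "\u092d", "\u092e"),  # labial:    PA..BHA   -> MA
]


def class_nasal(consonant: str):
    """Return the nasal of the consonant's class, or None if it has none."""
    for first, last, nasal in VARGAS:
        if first <= consonant <= last:
            return nasal
    return None


def nasal_sort_key(word: str) -> str:
    """Rewrite anusvara + stop as nasal + virama + stop for comparison."""
    out = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        nasal = class_nasal(nxt) if ch == ANUSVARA and nxt else None
        out.append(nasal + VIRAMA if nasal else ch)
    return "".join(out)


# A, anusvara, KA versus A, KA: plain code point order puts the anusvara
# spelling first, while the nasal key puts it after the shorter word.
words = ["\u0905\u0902\u0915", "\u0905\u0915"]
print(sorted(words) == words)                           # True
print(sorted(words, key=nasal_sort_key) == words[::-1])  # True
```

With this key, a word spelled with anusvara and the same word spelled with the explicit nasal conjunct compare as equal, so both spellings land in the same place in the list.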


Depending on definition, several hundred to more than a thousand languages from five families are spoken in India alone. Yet most of these have no literary tradition, and some that had one in the past have now lost it (shockingly often in the course of the 19th and 20th centuries). Some North-Western Indian languages use the Arabic alphabet and must be excluded from this index; the same is true of many minority languages in the far North East, where the Latin alphabet is common (Khasi, Garo, Karbi etc., though some others use Bengali script, and Bodo is unique in using Devanagari). The traditional literary languages of India today have official status in the union states where they are spoken, and all of them are contained in this index with moderate to reasonable accuracy.

The past decades witnessed a trend towards literacy in many languages that were previously mostly oral; some of these are comparatively large languages (Konkani, Dogri), but small minority languages from the North East have also undergone that process. Some of these have now acquired official status, or are approaching this aim. Usually, existing scripts (mostly Devanagari or Bengali) were adapted to new languages, rather than reviving archaic, disused autochthonous scripts. In principle, many of these new literary languages could be included in this index, but information about their spice names is hard to come by on the web, and so I have to rely on fieldwork, which is resource-intensive and often gives poor results. Nevertheless, I can present spice names in a couple of languages of the far North East, and I harbour hope that the number might increase in the future.

The following table is both a status quo and a to-do list. It contains all languages known to me which satisfy two conditions: (a) they are written with a Brahmi-derived script that is structurally close enough to Devanagari and supported in Unicode; and (b) they have official status, or are at the very least taught in school, so that a standard orthography is defined. In India and its neighbouring countries, an estimated 30 to 40 languages from three families (Indo–European, Sino–Tibetan, Dravidian) remain as possible candidates for inclusion in this index. Some cases are really problematic: Rajasthani, although enjoying somewhat official status, has no normalized orthography; Tibetan and its relatives are written in a script that has developed pretty far from the Indian original (thus, Tibetan spice names are more easily found in the Tibetan Index). There are also some cases with no clear consensus on the script used for a language: Konkani (Latin/Devanagari/Kannada, although it is official in Goa with Devanagari script), Kokborok (Bengali/Latin), Kashmiri and Sindhi (Arabic/Devanagari).

The South East Asian Brahmi-derived scripts are perhaps impossible to add to this index, as they have travelled a long, long road from the common ancestor. There is currently a Thai and Lao Index available, and the two remaining scripts (Khmer and Burmese) might follow in the future, once I have enough expertise for this task (or a friendly reader helps me).

| Script | Language | Remarks |
|---|---|---|
| Devanagari | संस्कृत Sanskrit (sa) β | classical tongue of lore, religion and philosophy |
| | हिंदी Hindi (hi) | lingua franca in Northern India; official language of the Indian Union and many Northern union states |
| | मराठी Marathi (mr) | official state language in Maharashtra |
| | कॉशुर Kashmiri (Koshur, ks) | regional language in Kashmir, now mostly (in Pakistan always) written in Arabic alphabet [کٲشر] (Dardic) |
| | कोंकणी Konkani (kok) α | official state language in Goa |
| | नेपाली Nepali (ne) | national language in Nepal; official second language in West Bengal |
| | नेपालभाषा Nepal Bhasa (Newari, new) | regional language in Nepal; the traditional Newari script (Ranjana) enjoys a partial revival in Kathmandu, but Unicode support is still lacking (Sino–Tibetan) |
| | बोड़ो Bodo (brx) | official second language in the Western part of Assam (Sino–Tibetan) |
| | मैथिली Maithili (Bihari, bh) | official (?) regional language in Bihar, with a strong community in Southern Nepal. In the past, it was written in a variety of Bengali script known as Mithilakshar (there is no Unicode support yet) or in the Kaithi script [𑂍𑂶𑂘𑂲], but today Devanagari is used if the language is written at all. |
| | डोगरी Dogri (doi) | regional language in Jammu and Kashmir (co-official in the South of that state); also spoken in Pakistan, where it is written in Arabic alphabet [ڈوگرى]. Culinary vocabulary is very close to Hindi. |
| | सिन्धी Sindhi (sd) | scattered over North Western India and Pakistan (where it is co-official); recognized by the Indian constitution, but no official status in any union state; in India partially and in Pakistan completely written in a variety of Arabic script with many additional characters [سنڌی] |
| | राजस्थानी Rajasthani (raj) | group of rather diverse dialects spoken in Rajasthan |
| | दिवॆहिबस् Dhivehi (dv) | national language of the Maldives, written with a unique script [ދިވެހިބަސް]. A few hundred speakers in India use Devanagari, but I do not know enough about their orthography rules to incorporate that language here. |
| Gurmukhi | ਪੰਜਾਬੀ Punjabi (Panjabi, pa) | official state language in Haryana and Punjab; widely spoken as a vernacular in the eastern part of Pakistan, where it is not official (written in Arabic alphabet [پنجابی]) |
| Gujarati | ગુજરાતી Gujarati (gu) | official state language in Gujarat |
| Bengali (Eastern Nagari) | বাংলা Bengali (Bangla, bn) | official state language in Western Bengal; national language of Bangladesh |
| | অসমীয়া Assamese (Oxomiya, as) | official state language in Assam |
| | মণিপুরি (মৈতৈ লোন) Manipuri (Meitei-lon, mni) | official state language in Manipur; unofficial minority language in Bangladesh. The native Meitei Mayek script [ꯃꯩꯇꯩ ꯃꯌꯦꯛ] was replaced by Bengali in the 18th century and is currently being revived; a separate spice index in that script is available. (Sino–Tibetan) |
| | ককবরক Kokborok (Tripuri, trp) | official regional language in Tripura (also spoken in Bangladesh). Also written in Latin script. (Sino–Tibetan) |
| | বিষ্ণুপ্রিয়া মণিপুরী Bishnupriya Manipuri (bpy) | spoken in scattered communities in North East India and Bangladesh (not official) |
| | সিলটী Sylheti (Siloti) | spoken in Bangladesh and North East India (not official); there is a native alphabet (Siloti Nagori [ꠡꠁꠟꠐꠡ ꠘꠀꠉꠠꠡ]), which is, however, mostly extinct. |
| Oriya | ଓଡ଼ିଆ Oriya (or) | official state language in Orissa |
| Telugu | తెలుగు Telugu (te) | official state language in Andhra Pradesh (Dravidian) |
| Tamil | தமிழ் Tamil (ta) | official state language in Tamil Nadu; second national language of Sri Lanka (Dravidian) |
| Kannada | ಕನ್ನಡ Kannada (kn) | official state language in Karnataka (Dravidian) |
| | ತುಳು ಬಾಸೆ Tulu (tcy) | minority language in Karnataka, formerly written with a specific script but now mostly oral (Dravidian) |
| Malayalam | മലയാളം Malayalam (ml) | official state language in Kerala (Dravidian) |
| Sinhala | සිංහල Sinhala (Singhalese, si) | national language of Sri Lanka |
| Uchen (Tibetan) | བོད་སྐད་ Tibetan (bo) | a group of interrelated languages spoken in Tibet, China, Nepal, India and Pakistan (Sino–Tibetan) |
| | གླེ་སྐད་ Ladakhi (ljb) | sublanguage of Tibetan located mainly on the North Western edge of India; has official status in Jammu & Kashmir state (Sino–Tibetan) |
| | རྫོང་ཁ་ Dzongkha (dz) β | national language of Bhutan. Also part of the Tibetan macrolanguage (Sino–Tibetan) |
| Ajhapat | 𑄌𑄋𑄴𑄟 Chakma (Changma, ccp) | small minority language of North Eastern India and Bangladesh |

It should be noted that I obtained many of the names shown in this index from poorly legible, hand-written lists, and (as anyone with some knowledge of Indic scripts will confirm) it is quite difficult and error-prone to digitize such scribblings. Moreover, one may easily fall victim to orthographic deficiencies of the individuals who wrote them. Whenever possible, I checked on the Internet, but due to a scarcity of sources (especially bilingual ones), this was only partially possible for some languages (notably Gujarati, Bengali and Malayalam). Spellings in Sinhala, Punjabi, Assamese, Nepali and Konkani should therefore be taken with a grain of salt. Comments and corrections are of course welcome.

Some of the more exotic languages with no internet coverage were researched in the field, which poses significant problems given poor literacy rates and a general lack of writing tradition. Thus, spice names in North East Indian minority languages, Maithili, Dogri and Newari are perhaps to be seen as approximations only (the number of different spellings for a word can well equal the number of people asked, which is particularly nasty if only one native speaker is available). Tibetan also proved hard work, although I was finally able to cross-check some of the names with written literature on medicinal herbs.

Sanskrit presents a different problem, as the standard dictionary of Monier Williams contains an unreasonable number of synonyms and polyvalencies (and invalid scientific plant names); neither ancient Indian writers nor modern linguists pay much heed to botany, it seems. For that reason, Sanskrit will probably remain in the β state forever.


