by
Jalal Maleki and Irandokht Salehi
Department of Computer and Information Science
Linköping University
S-581 83 Linköping
Sweden
Email: jma@ida.liu.se
This article introduces the initial version (0.01) of a transliteration scheme, together with an alphabet, for writing Persian text. There are several reasons for introducing it. The main one is that many people can speak Persian but have never had the opportunity to learn the Persian script. Many Farsi speakers also lack Persian fonts on their computers, and anyone who has exchanged email in Persian knows why some sort of standard would be useful. Furthermore, our department needed a large body of parallel text in Persian and English, and after an initial search we realised that we would have to construct it ourselves. Since it is more practical to process text in the Latin script, we decided to use eFarsi.
We hope that this transliteration scheme will facilitate communication between Farsi speakers. It is by no means our intention to suggest it as a replacement for the beautiful Persian script that is deeply rooted in our culture. We consider improvements to the scheme natural, so please write to us and tell us what you think.
We begin by introducing the transliteration alphabet. This alphabet does not introduce any major changes to what people already use. Fundamental changes to the alphabet are not necessary now, but may come in the future. However, the main point of the scheme we are proposing is not the choice of an alphabet but a few conventions that improve on what many people already use.
You may wish to jump to the examples at the end directly and learn the scheme through studying them.
| eFarsi letter | Pronounced as | Persian Letter | Comments |
| ā or aa | bar | ā ye bā kolāh | Vowel |
| a | and, bad | alef, ayn | Vowel |
| b | book | be | Consonant |
| c or ch | choose | che | Consonant |
| d | door | dāl | Consonant |
| e | electric | kasre (+alef), ayn | Vowel |
| f | food | fe | Consonant |
| g | good | gāf | Consonant |
| h | habit | he | Consonant |
| i | beam, dim | ye | Vowel |
| j | jelly | je | Consonant |
| k | keep | kāf | Consonant |
| l | lamp | lām | Consonant |
| m | man | mim | Consonant |
| n | nail | nun | Consonant |
| o | book, orbit | zamme(+alef), ayn, vāv | Vowel |
| p | paper | pe | Consonant |
| q | See later | qāf | Consonant |
| r | radio | re | Consonant |
| s | sad | sin, se, sād | Consonant |
| t | tea | te, tā | Consonant |
| u | noon | vāv | Vowel |
| v | velocity | vāv | Consonant |
| w | bow, powder | vāv | Diphthong, used with o |
| x or kh | Khayyām | xe | Consonant |
| y | yes | ye | Consonant |
| z | zero | ze, zā, zāl, zād | Consonant |
| gh | See later | ghayn | Consonant |
| sh | she | sheen | Consonant |
| zh | visual | zhe | Consonant |
| ' | See later | ayn, hamze | Consonant |
As you can see, some letters have been assigned multiple transliterations. The correct choice between these alternatives can only be made by studying the context.
There is some redundancy in the alphabet:
ā and aa are equivalent.
x and kh are equivalent.
c and ch are equivalent.
q and gh are equivalent.
u and oo are equivalent.
We have not chosen between these alternatives because we would like future use of the scheme to show us which one prevails.
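The equivalences listed above can be applied mechanically when text from different writers has to be compared. The following is only a minimal sketch: the function name normalize_efarsi and the choice of the one-letter forms as the target are our own assumptions, and the code assumes that every "aa", "kh", "ch", "gh" and "oo" in the input is one of the listed digraphs, which may not hold for every word.

```python
import re

# Digraph -> one-letter equivalents, taken from the redundancy list above.
# Assumption: these letter pairs never occur as two separately pronounced
# letters (for example a k followed by a real h).
EQUIVALENTS = {"aa": "ā", "kh": "x", "ch": "c", "gh": "q", "oo": "u"}

_PATTERN = re.compile("|".join(EQUIVALENTS), flags=re.IGNORECASE)


def _replace(match):
    digraph = match.group(0)
    single = EQUIVALENTS[digraph.lower()]
    # Keep an initial capital, as in Khayyām -> Xayyām.
    return single.upper() if digraph[0].isupper() else single


def normalize_efarsi(text):
    """Rewrite the digraph spellings as their one-letter equivalents."""
    return _PATTERN.sub(_replace, text)


if __name__ == "__main__":
    for word in ["khorush", "chehre", "shaane", "Khayyām"]:
        print(word, "->", normalize_efarsi(word))
```

Note that sh and zh are left untouched, since the alphabet has no one-letter alternative for them.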
Some aspects of the traditional Persian script can be characterized as follows:
Here, o, u, v and ow are all used to transliterate single occurrences of the letter vāv. There are similar cases in English: the letter s, for example, is pronounced differently in the words choose (z), vision (zh), mass (s) and tension (sh).
Here is a brief list of the general characteristics of the alphabet we are suggesting.
1. long vowels
ā as the a in father
i as the ea in reach
u as the oo in soon
2. short vowels
a as the a in cat
e as the e in net
o as the oo in good
3. diphthong
ow as in the English word "no".
Mowlānā (Persian poet Rumi)
Nowruz (Iranian new year)
gowdāl (hole in the ground)
gowhar (Precious stone)
Xosrow (Iranian name for boys)
owqāt (plural of vaght (time))
fowri (immediately)
This section provides some examples that indicate the correct pronunciation of the consonants. Three consonants in Farsi (x/kh, q/gh and ', as they appear in the alphabet above) have sounds that are unfamiliar to English speakers.
b as in bird
p as in play
t as in table
s as in sound
j as in jar
c/ch as in child
h as in house
x/kh
this sound no longer exists in English except in Scottish
dialects. It also occurs in Russian (xarasho), German (Achtung),
Dutch (acht, geweldig), Welsh (Llandudno) and other languages.
d as in day
z as in zero
r as in race
zh as in visual
sh as in she
q/gh
The q or gh sound is absent from present-day English. In Arabic and Persian
q (qāf) and gh (ghayn) are different letters with different
sounds. In Persian, however, the pronunciations of qāf and
ghayn are quite similar. Traditionally, they have both been
transliterated as gh, for example, āghā (Sir), gharibe (stranger),
ghorub (sunset).
f as in fish
k as in kind
g as in good
l as in love
m as in moon
n as in note
v as in van
y as in year
In addition to the sounds listed above, there is another sound in Persian, the glottal stop, which in Persian script is written as either "eyn" or "hamze".
You can produce this sound by saying the English words better or bottle without pronouncing the -tt- sound in the middle, as in many local accents in England. It occurs in many dialects of English but has no written form in the Latin alphabet. In phonetics, the glottal stop is denoted by "?", but in eFarsi it is represented by the single quote (apostrophe) character (').
The glottal stop may appear in the middle of a word, as in mas'ale (problem), and at the end of a word, as in shoru' (beginning). It is not necessary to make the glottal stop explicit at the beginning of a word, since a vowel at the beginning of a word has the same effect. For example, the pronunciation of 'āb is no different from that of āb (water).
We have also chosen to make the stop between vowels explicit, for example jāme'e (society), fa'āl (active), so'āl (question). We are not completely sure whether this is necessary, but we will let future practice show us the way. For the moment we believe that including the "'" at these positions facilitates correct pronunciation for non-Farsi speakers. Some more examples follow.
1. hamze ye miyāni: mo'men (man of god), ta'min (been provided for), mas'ale (problem, quiz)
2. hamze ye pāyāni: ajzā' (parts), xala' (vacuum)
3. harf e eyn miyāni: me'mār (architect), ba'dan (later), te'dād (number)
4. harf e eyn pāyāni: majmu' (sum), shoru' (start), qāte' (decisive)
Some occurrences of the letter he in Persian are not pronounced. Usually, h stands for the same sound as the h in the English word house, but in many other cases it is silent. The silent h always appears at the end of a word. Traditional transliteration schemes for Persian include the silent h, for example in padideh (phenomenon), Zohreh (Venus), shāneh (comb) and parandeh (bird).
In our transliteration scheme we have chosen not to include the silent h, again trying to keep the scheme as close to the pronunciation of words as possible. In eFarsi, the above examples are written as padide (phenomenon), Zohre (a female name), shāne or alternatively shaane (comb) and parande (bird).
Here are some examples of words where the h is not silent: havā (air), pahn (wide), rāh (way), beh (quince), bahbah! (wow, expressing enjoyment, appreciation and sometimes sarcasm).
Tashdid in Persian and Arabic is a sign (similar to a small w) which is placed on a consonant to replace a double occurrence of the consonant. Most transliteration schemes indicate this by including two occurrences of the corresponding consonant. eFarsi follows the same principle. In English, for example, many words include double consecutive consonants: little, connect, occurrence. The difference between Persian and English is that, in Persian, both occurrences are pronounced. Here are some example words:
matte (drill) mosallaman (surely) jādde (road) tavallod (birth)
which are pronounced as mat-te, mosal-laman, jād-de and taval-lod respectively.
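The pronunciation hint above, where each doubled consonant is split across a syllable boundary, can be sketched in a few lines. This is an illustration only: the function name split_tashdid is ours, and the pattern covers doubled single-letter consonants, not doubled digraphs such as sh or zh.

```python
import re

# A consonant letter (b-d, f-h, j-n, p-t, v-z) followed by itself.
_DOUBLED = re.compile(r"([b-df-hj-np-tv-z])\1")


def split_tashdid(word):
    """Insert a hyphen between the two halves of a doubled consonant."""
    return _DOUBLED.sub(r"\1-\1", word)


if __name__ == "__main__":
    for word in ["matte", "mosallaman", "jādde", "tavallod"]:
        print(word, "->", split_tashdid(word))
```

Doubled vowel letters such as aa are deliberately ignored, because in eFarsi aa is an alternative spelling of ā and not a tashdid.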
Ezāfe (e or ye) is an inflectional morpheme which is used to indicate some grammatical aspect of the word. It is comparable with 's in the English possessive case, and also with modifiers in English noun combinations.
Since ezāfe occurs so often, we have chosen to write it separately rather than adding it as a postfix. This way some ambiguities in syntactical analysis of words are avoided.
Ezāfe is written in the following forms:
ruz e āftābi (a sunny day): ruz (day), āftāb (sun), āftābi (sunny)
ketāb e man (my book): ketāb (book), man (I)
miz e motāle'e (study table): miz (table), motāle'e (study)
rang e qermez (red color): rang (color), qermez (red or crimson)

sedā ye boland (loud voice): sedā (voice/sound), boland (loud)
ru ye miz (on the table, above the table): ru (on, above), miz (table)
havā ye xub (good air): havā (air), xub (good, well)

chāy e iran (Iranian tea)
ra'y e mardom (people's vote, the wish of the people)
barāy e to (for you)

Finally, some more examples of the use of ezāfe:
ruz e āftābi (sunny day)
qorub e xorshid (sunset)
joz' e kuchak (small part)
farsh e dast bāft (hand-made carpet)
kashti ye Nuh (Noah's ship)
xāne ye mā (our house)
sedā ye boland (loud voice)
kādo ye tavallod (birthday present)
jangju ye dalir (brave fighter)
ketāb e riyāzi ye novin e man (my modern maths book)
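In all of the examples above, ezāfe is written as "ye" after a word that ends in a vowel and as "e" otherwise, including after a final y or glottal stop. The sketch below encodes that observation. The rule is inferred from the examples and is not stated as a definition in this article, and the names ezafe_link and join_with_ezafe are hypothetical.

```python
# Vowel letters of the eFarsi alphabet (aa ends in a, so it is covered too).
VOWELS = "aāeiou"


def ezafe_link(word):
    """Return the form of ezāfe that follows the given word."""
    return "ye" if word[-1].lower() in VOWELS else "e"


def join_with_ezafe(*words):
    """Chain one or more words, writing ezāfe separately between them."""
    parts = []
    for word in words[:-1]:
        parts.extend([word, ezafe_link(word)])
    parts.append(words[-1])
    return " ".join(parts)


if __name__ == "__main__":
    print(join_with_ezafe("ruz", "āftābi"))
    print(join_with_ezafe("sedā", "boland"))
    print(join_with_ezafe("ketāb", "riyāzi", "novin", "man"))
```

Because ezāfe is kept as a separate word, the result can be split on spaces again without any ambiguity about where the head word ends.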
Sometimes, when ezāfe is used to connect a noun and an adjective, one can in principle
switch the position of the noun and the adjective and do without the ezāfe. For example,
ruz e āftābi (sunny day) and āftābi ruz (sunny day) are equivalent. This phenomenon appears quite often in some Persian dialects and also in poetry.
Compound words are formed by joining two or more individual words. Such words are numerous in Farsi. Compounds are made up of different parts of speech, which combine in nine different ways. There is no fixed rule for writing them in Persian, so it seems very hard to find a general rule for transliterating compounds. The main concern is whether to write them connected or disconnected, in particular when words get very long. In some Central and North European languages there is no fear of long compound words. We have mainly chosen the English style, and in most cases we construct compounds by inserting a dash ("-") between individual words.
xosh-raftār (well-behaved)
nāz-parvarde (someone who is brought up with love and affection)
mehmān-navāz (hospitable person)
xerad-pishe (humble person)
gushe-neshin (withdrawn)
faryād-ras (helper)
barādar-zāde (brother's offspring)
zamin-larze (earthquake)
āb-pāsh (watering-can)
farāmush-kār (absent-minded)
ātash-afruz (a person who starts a fire, usually meaning a person who starts heated discussions or loaded situations)
dowlat-mard (political leader)
dast-poxt (cooking)
xosh-lebās (someone that dresses well)
Sometimes compound words are constructed by connecting two closely related words (usually synonyms) with "va" (and), which takes the short form "o":
jost o ju (n. search)
shost o shu (n. washing)
goft o gu (n. conversation)
kār o bār (n. work)
ās o pās (n. hopeless)
seft o saxt (n. solid and hard)
did o bāzdid (n. visiting each other)
raft o āmad (n. commuting)
bi kas o kār (n. person without relatives and job)
bi nām o neshān (n. person without name and address)
Some of these constructions have gradually taken the form of a primitive single word. At the same time the "o" has been transformed to an "e", for example, josteju [jost o ju], shosteshu [shost o shu].
In Persian, plurals are constructed by adding either the postfix "hā" or "ān" to the end of the word. In eFarsi transliteration, these are simply attached to the end of the word and not written separately. Here are some examples:
sib (apple) sibhā (apples)
ketāb (book) ketābhā (books)
zan (woman) zanān (women)
zanhā (women)
shab (night) shabhā (nights)
An alternative would have been to write the plural postfix separately, for example "sib hā" instead of "sibhā", just as we write the ezāfe separately. However, we have decided to join these endings to the words.
This rule also applies to foreign words that have entered Farsi. For example, ketāb (book), which was mentioned above, is an Arabic word. However, the Arabic plurals of many words also occur in Farsi.
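| Letter | Beginning | Middle | End |
| ā | āb | bārān | guyā |
| a | abā | tars | na |
| b | bāmdād | nabid | shab |
| p | parastu | sepās | tup |
| t | teshne | āshti | sokut |
| s | sepide | hasti | pārs |
| j | jahān | vojud | mowj |
| c | cehre | kuce | kuc |
| ch | chehre | kuche | kuch |
| h | hush | nahān | panāh |
| x | xorush | saxt | farāx |
| kh | khorush | sakht | farākh |
| d | dānesh | tadbir | shād |
| z | zāyesh | ruzgār | rāz |
| r | ruyesh | shegarf | bahār |
| zh | zharf | pazhuhesh | kuzh |
| sh | shādi | xoshnud | sorush |
| gh | gham | arghavān | simorgh |
| q | qalb | meqdār | barq |
| f | farhixte | afsāne | gazāf |
| k | kushā | niki | pāk |
| g | gol | hengām | bāng |
| l | larzān | pahlavān | bāl |
| m | masti | peymān | ārām |
| n | nasim | minu | javān |
| v | vojud | mive | sarv |
| y | yazdān | peyvand | ra'y |
Here is a poem by Xayyām: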
tā chand hadis e panj o chār ey sāqi
moshkel che yeki che sad hezār ey sāqi
xākim hame, chang besāz ey sāqi
bādim hame, bāde biyār ey sāqi
Here are some expressions and sentences:
((pardon me) (bebaxshid))
((excuse me) (bebaxshid))
((hi) (salām))
((please) (xāhesh mikonam))
((it doesn't matter) (eybi nadārad))
((it doesn't matter) (mohem nist))
((this is my seat) (in jāy e man ast))
((what is your seat number?) (shomāre ye sandali ye shomā chand ast?))
((my seat number is 5A) (shomāre ye sandali ye man 5A ast))
((thank you) (xeyli mamnun))
((thank you) (mersi))
((thank you) (motashakkeram))
((thank you) (sepās gozāram))
((how long is the flight?) (parvāz cand sā'at ast?))
More words and expressions for your trip to Iran (this list is under construction):
((one) (yek))
((two) (do))
((three) (se))
((four) (cahār))
((five) (panj))
((six) (shish))
((seven) (haft))
((eight) (hasht))
((nine) (noh))
((ten) (dah))
((eleven) (yāzdah))
((twelve) (davāzdah))
((thirteen) (sizdah))
((fourteen) (chahārdah))
((fifteen) (pānzdah))
((sixteen) (shānzdah))
((seventeen) (hevdah))
((eighteen) (hejdah))
((nineteen) (nuzdah))
((twenty) (bist))
((twenty one) (bist o yek))
((twenty two) (bist o do))
((twenty nine) (bist o noh))
((thirty) (si))
((thirty one) (si o yek))
((forty) (cehel))
((fifty) (panjāh))
((sixty) (shast))
((seventy) (haftād))
((eighty) (hashtād))
((ninety) (navad))
((hundred) (sad))
((hundred and one) (sad o yek))
((two hundred) (devist))
((three hundred) (sisad))
((four hundred) (chahārsad))
((five hundred) (pānsad))
((six hundred) (sheshsad))
((seven hundred) (hafsad))
((eight hundred) (hashtsad))
((nine hundred) (nohsad))
((one thousand) (hezār))
((one thousand and one hundred and one) (hezār o sad o yek))
((one thousand and one hundred and one) (yek hezār o yek sad o yek))
((one thousand and one) (yek hezār o yek))
((two thousand) (do hezār))
((ten thousand) (dah hezār))
((hundred thousand) (sad hezār))
((one million) (yek milyon))
((two million) (do milyon))
((quarter) (rob'))
((half) (nim))
((one third) (yek sevvom))
((three fifths) (se panjom))
((first) (avval))
((first) (naxost))
((the first) (naxostin))
((the first) (avvali))
((first book) (ketāb e naxost))
((the first book) (avvalin ketāb))
((the first book) (naxostin ketāb))
((second) (dovvom))
((third) (sevvom)) ; chahārom, panjom, and so on ...
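The number words above combine in a regular way: the parts of a number are joined with "o", as in bist o yek or hezār o sad o yek. The sketch below generates such forms for 1 to 999999. It is an illustration under assumptions: the function name number_to_efarsi is hypothetical, the word lists are copied from the list above with their spellings unchanged, and forms that do not appear in the list (for example 21000) are extrapolated from the pattern.

```python
# Word lists copied from the number list above, spellings unchanged.
UNITS = ["", "yek", "do", "se", "cahār", "panj", "shish", "haft", "hasht", "noh"]
TEENS = ["dah", "yāzdah", "davāzdah", "sizdah", "chahārdah", "pānzdah",
         "shānzdah", "hevdah", "hejdah", "nuzdah"]
TENS = ["", "", "bist", "si", "cehel", "panjāh", "shast", "haftād",
        "hashtād", "navad"]
HUNDREDS = ["", "sad", "devist", "sisad", "chahārsad", "pānsad", "sheshsad",
            "hafsad", "hashtsad", "nohsad"]


def _below_thousand(n):
    """Return the word parts for 0 <= n <= 999 (empty list for 0)."""
    parts = []
    if n >= 100:
        parts.append(HUNDREDS[n // 100])
        n %= 100
    if n >= 20:
        parts.append(TENS[n // 10])
        n %= 10
    if n >= 10:
        parts.append(TEENS[n - 10])
    elif n > 0:
        parts.append(UNITS[n])
    return parts


def number_to_efarsi(n):
    """Spell out 1..999999, joining the parts with "o"."""
    if not 0 < n < 1_000_000:
        raise ValueError("this sketch only covers 1..999999")
    parts = []
    thousands, rest = divmod(n, 1000)
    if thousands == 1:
        parts.append("hezār")  # the shorter of the two forms listed above
    elif thousands > 1:
        parts.append(" o ".join(_below_thousand(thousands)) + " hezār")
    parts.extend(_below_thousand(rest))
    return " o ".join(parts)


if __name__ == "__main__":
    for n in [21, 101, 1101, 2000, 100000]:
        print(n, number_to_efarsi(n))
```

The list also gives the longer alternative yek hezār o yek sad o yek for 1101; the sketch always produces the shorter form.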
((what is your name?) (esm e shomā cist?))
((my name is Ali Pardis) (esm e man Ali Pardis ast))
((are you Ali?) (shomā Ali hastid?))
((are you not Ali?) (shomā Ali nistid?))
((yes, I am) (bale, hastam))
((no, I am not) (na, nistam))
((is Ali at home?) (Ali xāne ast?))
((Ali is at home) (Ali xāne ast))
((is Goli also at home?) (Goli ham xāne ast?))
((yes, she is) (bale, hast))
((Ms. Goli Pardis is out) (xānom e Goli Pardis xāne nist))
((good evening) (asr bexeyr))
((good morning) (sobh bexeyr))
((good night) (shab xosh))
((good night) (shab bexeyr))
((good bye) (xodā hāfez))
((hi) (dorud))
((hi) (salām))
((where is the chair?) (sandali kojā ast?))
((it is here?) (injā ast?))
((table) (miz))
((on the table) (ru ye miz))
((on the table) (ru miz))
((put it on the table!) (ān rā ru ye miz begozār!))
((under the table) (zir e miz))
((left) (chap))
((right) (rāst))
((turn!) (bepich!))
((the house is beautiful) (xāne zibā ast))
((house) (xāne))
((the house is beautiful, but not very big) (xāne zibā ast, vali xeyli bozorg nist))
((is it a long road?) (jādde ye tulāniyi ast?))
((no, it is short) (na, kutāh ast))
((the man is not in the room) (mard e dar xāne nist))
((I have, you have not) (man dāram, shomā nadārid))
((you have, I have not) (shomā dārid, man nadāram))
((I told her) (man be u goftam))
((do you have the letter?) (āyā shomā nāme rā dārid?))
((do they have the book) (āyā ānhā ketāb rā dārand))
((have you written) (āyā neveshte'id))
((I have bought it) (man ān rā xarideam))
((I have not found the address) (man ādres rā peydā nakardeam))
((she has read the book) (u ketāb rā xānde ast))
((my book) (ketāb e man))
((my book) (ketābam))
((your book) (ketāb e to))
((your book) (ketābat))
((her book) (ketāb e u))
((her book) (ketābash))
((his book) (ketāb e u))
((his book) (ketābash))
((our book) (ketāb e mā))
((our book) (ketābemān))
((your book) (ketāb e shomā)) ; plural and polite for singular
((your book) (ketābetān))
((their book) (ketāb e ānhā))
((their book) (ketābeshān))
((your lighter) (fandak e shomā))
((their school) (madrese ye ānhā))
((our bus) (otobus e mā))
((I understand) (man mifahmam)) ; man can be dropped
((I understand) (mifahmam))
((you understand) (to mifahmi)) ; to can be dropped
((you understand) (mifahmi))
((she understands) (u mifahmad)) ; u can be dropped
((we understand) (mā mifahmim)) ; mā can be dropped
((you understand) (shomā mifahmid)) ; plural; shomā can be dropped
((they understand) (ānhā mifahmand)) ; ānhā can be dropped
((welcome) (xosh āmadid))
((the room is not ours, it is theirs) (otāq e māl e mā nist, māl e ānhā ast))
((the chair is not mine, it is his) (sandali e māl e man nist, māl e u ast))
((his friend is your brother) (dust e u barādar e shomā ast))
((how do I get to the bus station?) (cetowri mitunam beram istgāh e otobus?))
((how much does it cost?) (ceqadr mishavad?))
((will it be on time?) (sar e vaqt xāhad bud?))
((my friend's name is Ali) (esm e dust e man Ali ast))
((which house is hers) (kodām xāne māl e u ast))
((my bag is not here) (sāk e man injā nist))
((my bag has not arrived) (sāk e man nareside ast))
((she has not arrived yet) (u hanuz nareside ast))
((do you have any matches?) (shomā kebrit dārid))
((my bag is blue) (sāk e man ābi ast))
((black) (siyāh))
((white) (sefid))
((green) (sabz))
((red) (qermez))
((yellow) (zard))
((when will the bus arrive?) (otobus key mirese?))
((can you help me please?) (mitavānid lotfan be man komak bekonid?))
((can you help me find this place?) (mitavānid be man komak bekonid injā rā peydā bekonam?))
((I do not understand what you are saying) (man nemifahmam shomā ce miguyid))
((what is the name of this street?) (esm e in xiyābān ce ast?))
((how much do I have to pay?) (ce qadr bāyad pardāxt bekonam?))
((how much do I have to pay?) (ce qadr mishe?))
((what is the name of this shop?) (esm e in maghāze ci ast?))
((what time is it?) (sā'at cand ast?))
((what time is it?) (sā'at cande?))
((what day is today?) (emruz cand shanbe ast?))
((what day is today?) (emruz ce ruzi ast?))
((today is a nice day) (emruz ruz e xubi ast))
((where do you live?) (shomā kojā zendegi mikonid?))
((I live in Iran) (man dar irān zendegi mikonam))
((embassy) (sefārat))
((where is the Canadian Embassy?) (sefārat e Kānādā kojā ast?))
((where is the Canadian Embassy?) (sefārat e Kānādā kojāst?))
((I want to make a telephone call) (man mixāham telefon bezanam))
((bank) (bānk))
((money) (pul))
((money exchange office) (sarrāfi))
((I want to change 100 dollars) (man mixāham 100 dolār tabdil bekonam))
((what are the office hours of the bank?) (sā'āt e kār e bānk key ast?))
((do you accept Visa credit cards?) (shomā kārt e Visa qabul mikonid?))
((I want to rent a car) (man mixāham yek māshin kerāye bekonam))
((for one day) (barāy e yek ruz))
((one week) (yek hafte))
((second) (sāniye))
((minute) (daqiqe))
((hour) (sā'at))
((day) (ruz))
((week) (hafte))
((month) (māh))
((year) (sāl))
((decade) (dahe))
((century) (qarn))
((forbidden) (mamnu'))
((open) (bāz))
((closed) (baste))
((way in) (vorudi))
((way out) (xoruji))
((vacant) (āzād))
((engaged) (eshghāl))
((empty) (xāli))
((about) (taqriban))
((day) (ruz))
((morning) (sobh))
((night) (shab))
((noon) (zohr))
((afternoon) (ba'd az zohr))
((before noon) (qabl az zohr))
((dawn) (pegāh))
((early morning) (sobh e zud))
((late afternoon) (asr))
((evening) (ghorub))
((sunrise) (tolu'))
((sunset) (ghorub))
((year) (sāl))
((decade) (dahe))
((century) (sade))
((century) (qarn))
((millennium) (hezāre))
((what time is it?) (sā'at chande?))
((it is 5 o'clock) (sā'at 5 ast)) ; panj = 5
((it is 12) (12 ast)) ; davāzdah = 12
((it is 12:30) (sā'at davāzdah o nim ast)) ; davāzdah o nim = 12:30
((it is 12:30) (sā'at davāzdah o si ast)) ; it is twelve thirty
((it is 12:15) (sā'at davāzdah o rob' ast)) ; rob' = quarter
((it is 12:45) (sā'at yek rob' be yek ast)) ; yek = 1
((it is 10 to 12) (sā'at 10 daqiqe be 12 ast))
((it is 10 past 12) (sā'at 12 va 10 daqiqe ast))
((water) (āb))
((salt) (namak))
((sugar) (shekar))
((bread) (nān))
((drink) (nushābe))
((breakfast) (sobhāne))
((dinner) (shām))
((sandwich) (sāndevich))
((salad) (sālād))
((rice) (berenj))
((butter) (kare))
((meat) (gusht))
((tomato) (gowje))
((cucumber) (xiyār))
((cheese) (panir))
((yogurt) (māst))
((onion) (piyāz))
((jam) (morabbā))
((honey) (asal))
((lemon) (limu))
((lemon juice) (āb e limu))
((tea) (chāy))
((coffee) (qahve))
((cake) (keyk))
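The phrase lists above are written as parenthesised pairs, optionally followed by a comment that starts with a semicolon. For readers who want to process them, for example to build a small parallel word list, here is a minimal parsing sketch. The function name parse_pairs is hypothetical, and the code assumes that a comment runs to the end of its line and that the phrases themselves contain no parentheses.

```python
import re

# A semicolon starts a comment that runs to the end of the line (assumption).
_COMMENT = re.compile(r";[^\n]*")
# ((first phrase) (second phrase)), with no nested parentheses inside.
_PAIR = re.compile(r"\(\(([^()]*)\)\s*\(([^()]*)\)\)")


def parse_pairs(text):
    """Return a list of (first, second) tuples for every pair in text."""
    without_comments = _COMMENT.sub("", text)
    return [(a.strip(), b.strip()) for a, b in _PAIR.findall(without_comments)]


if __name__ == "__main__":
    sample = """
    ((water) (āb))
    ((it is 12:15) (sā'at davāzdah o rob' ast)) ; rob' = quarter
    ((lemon juice) (āb e limu))
    """
    for english, efarsi in parse_pairs(sample):
        print(english, "=", efarsi)
```

The apostrophe used for the glottal stop needs no special treatment here, since only parentheses and semicolons are treated as delimiters.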