Guide to Pronouncing Mandarin
in Romanized Transcription
(Advanced Page)

Table of Contents:

  1. Material for Beginners (previous page)
    1. Introduction (for beginners)
    2. Pinyin Romanization: The Short Story (for beginners)
      Sound file of specimen Chinese Words
  2. Advanced Material (on this page)
    1. Introduction (long version)
    2. Pinyin Romanization: The Long Story (for people who know some Chinese)
    3. Wade-Giles Romanization (for people who know some Chinese)
    4. Gwoyeu Romatzyh Romanization (for people who know some Chinese)
  3. Sound Files of Specimen Syllables
  4. Complete Reference Table of Pinyin, WG, and GR Spellings
  5. Interactive Romanization Data Base of Pinyin, WG, and GR Spellings

Part A: Introduction

Chinese is written with a distinctive orthography composed of a separate sign (called a "character") for each semantically significant syllable. The phonology of Mandarin allows fewer than 1000 syllables that are distinct in sound, and in the spoken language most words are polysyllabic. However, each syllable of a polysyllabic word potentially has its own separate semantic field, and so the 1000 phonetic syllables are typically homonyms; that is to say, each of them labels several unrelated semantic fields. For example, the syllable fu2 can mean "serve" or "bat" or "happiness" or "roll of cloth." In the spoken language, the context or the association with certain added syllables of similar meaning differentiates these meanings. But in the written language each of those meanings has a different sign. Thus it has always been possible to write a thought with fewer signs than the number of syllables it took to speak it. (This is why written Classical Chinese never really corresponded to any spoken form of Chinese.)

The largest Chinese dictionary contains definitions for about fifty thousand characters, although far more have been used in the course of Chinese history, as people have constantly invented new ones. (The opposite process also occurs as people have merged signs for similar characters that have the same sound.) The standard Chinese computer systems define ASCII-like codes for slightly over 13,000 characters in Taiwan (the so-called "Big-5" coding system) and slightly over 7,000 characters in mainland China (the so-called "National Standard" coding system, usually referred to as GB or Guobiao).

When Chinese words are represented in Western languages, the challenge is to represent them in Latin letters. All such systems necessarily reflect the Chinese spoken language (usually the Mandarin of Beijing), not the Chinese written language. When Romanization is used to gloss characters from Classical Chinese, there is often ambiguity about which character is intended. Still, Romanization is the usual form in which Chinese words appear in English books and articles.

Several different systems of Romanization (spelling with Latin letters) have been used for this, of which three are described in this document: Pinyin (PY), Wade-Giles (WG), and Gwoyeu Romatzyh (GR).

Part B of this document describes the Pinyin system, in terms of which the other two systems are described. (A simple overview can be found on the previous page.) I have been casual about describing the sounds of Pinyin as though they were more closely equivalent to American English than they really are, but the equivalencies are close enough for practical purposes, especially on a page designed to describe the spelling, not the real phonology.

The official "Pinyin" ("phonetic") system of Romanization is one of the most practical ever devised for Chinese in that PY spellings (1) are generally shorter than spellings in other systems, (2) are closer to Chinese language intuitions than other systems are, and (3) allow polysyllabic compounds to be written together without ambiguity about syllable boundaries.

But for an outsider PY can be confusing because the same letter does not always correspond to the same sound, and because some of the sounds rarely occur outside of Mandarin Chinese. It is no wonder that people who have not studied Mandarin sometimes make embarrassing mistakes. When American TV journalists, for example, pronounce PY j and zh both like the g in English/French "rouge" (a sound that would be spelt r in PY), they immediately reveal that they speak no Chinese and suggest that they don't know much about China either.

The description on these two pages is intended to help you do better than that. The beginner's page should remove the deepest mysteries. It is complete enough to allow someone who has not studied Mandarin to read PY aloud clearly enough for a reasonably forgiving Chinese to understand it. The description on this page is more detailed and is intended for people who have studied a little Chinese.

Part C provides a comparative overview of the Wade-Giles scheme for spelling Mandarin. Although not official in China, it was very widely used through most of the XXth century by foreigners and some Chinese, and is still used by a few writers who have not bothered to learn PY or whose ties to Taiwan make use of the PY system politically inexpedient.

Part D provides a comparative overview of Gwoyeu Romatzyh, a.k.a. GR or "tonal spelling," the system that was official (but little used) in the ROC from the 1930s to the 1980s and that is still occasionally used in linguistically sophisticated publications. It is the only system in which editors do not disfigure the language by deleting necessary diacritical marks, since it does not use diacritical marks, but shows tone instead by changes in the spelling itself.

Very, very few Chinese are able to spell Mandarin correctly in any Romanized spelling system, even PY, so you will find countless spelling errors, especially in works by Chinese authors or published in China. Since PY is not widely used in Taiwan, spellings used there are rarely systematic in any obvious way, and even specialists are often unable to make sense of them. The situation is improving slightly as PY-centered computer systems gradually force a few Chinese to improve their accuracy with the system.

Part B: Pinyin Romanization (PY): The Long Story

b = English b [footnote 1]
c = ts in English hats
ci (silent i!) = as though English had a word spelt tsz
ch = ch in English church [footnote 2]
chi (silent i!) = as though English had a word spelt chr
d = English d [footnote 1]
f = English f
g = English hard g in get [footnote 1]
h (initial) = English h, only slightly harder, almost like the ch in Scottish loch or German hoch
j = j in English jeans (not like g in rouge!!!) [footnote 2]
k = English k [footnote 1]
l = English l
m = English m
n = English n
-ng (final) = ng in English sing
p = English p [footnote 1]
q = English ch in cheat [footnote 2]
r (initial) = French initial j
-r (final) = a Midwestern r [footnote 3]
ri (silent i!) = a French j followed by a Midwestern r!
s = English s
si (silent i!) = as though English had a word spelt sz
sh = sh in English shame [footnote 2]
shi (silent i!) = as though English had a word spelt shr
t = English t [footnote 1]
w = English w
x = English sh in sheet [footnote 2]
y = English y
z = ds in English heads
zi (silent i!) = as though English had a word spelt dz
zh = j or g in English judge [footnote 2]
zhi (silent i!) = as though English had a word spelt jr

Footnote 1: Footnote for the phonologically sophisticated: English p t k differ from b d g in that (1) p t k are aspirated, whereas b d g are not, and (2) b d g are voiced, whereas p t k are unvoiced. In most European languages you have studied, you will have been told that b d g are voiced and unaspirated as in English, but that p t k are both unaspirated and unvoiced, and that aspirating them is part of an English or American accent. In Chinese neither series is voiced, but p t k are heavily aspirated, while b d g are unaspirated. Thus in some systems of Romanization (including Wade-Giles), the letters b, d, and g are avoided as implying voicing; both series are spelt p t k, and the aspiration is shown with an apostrophe (p', t', k') or an h (ph, th, kh). Using voicing rather than aspiration to distinguish the p t k and b d g series in Chinese is a mark of a disastrous European accent. Using voicing as well as aspiration marks a less confusing English or American accent. Similarly, c, ch, and q are unvoiced and heavily aspirated, while z, zh, and j are unvoiced (or rather start their articulation that way) and unaspirated.

Footnote 2: Footnote for the phonologically sophisticated: j q x are like the sounds in English jeans, cheat, and sheet, except that the Chinese sounds are strongly palatalized (i.e., the tongue is pushed against the front of the roof of the mouth). zh, ch, and sh are pronounced like the sounds in English judge, church, and shame except that they are retroflex, i.e., the tongue is curved up and back slightly to approach the roof of the mouth about in the middle. (This is why the effect is something like a Midwestern r being attached to them.) I chose English examples with front and back vowels for the two series to try to suggest this, but the Chinese difference is stronger than that. The initial r is also retroflex and sounds something like a French j followed by a Midwestern r! Further note for the masochistic: To the Chinese ear, there is a greater difference between the palatalized series (j q x) and either of the other two series (zh ch sh and z c s) than between the two other series themselves. Accordingly MOST non-standard Mandarin dialects pronounce BOTH zh ch sh and z c s identically [like z c s] but still distinguish the j q x series. Thus "ten" (shi2) and "four" (si4) differ only in tone (si2 and si4) for many speakers. The PY use of h to distinguish the two series reflects their similarity in the Chinese mind, while the PY use of entirely separate letters j q x for the palatalized series reflects the clear distinction of that series for the native speaker.
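
The dialect merger described in Footnote 2 can be sketched in a few lines of Python. This is only an illustration: the function name merge_retroflex is made up for the example, and it assumes syllables are written in PY with a trailing tone number, as in shi2.

```python
def merge_retroflex(syllable):
    """Respell initial zh, ch, sh as z, c, s (hypothetical helper).

    Mimics the many non-standard speakers who pronounce both series alike.
    The palatalized series j q x is left untouched.
    """
    if syllable[:2] in ("zh", "ch", "sh"):
        # Dropping the h is all it takes, which is exactly what the
        # PY spelling was designed to suggest.
        return syllable[0] + syllable[2:]
    return syllable


print(merge_retroflex("shi2"))  # si2: "ten" now differs from si4 only in tone
print(merge_retroflex("si4"))   # si4: unchanged
print(merge_retroflex("xi1"))   # xi1: unchanged
```
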

Footnote 3: A final -r wipes out any other final consonant (-n or -ng) and produces nasalization of the preceding vowel, plus the Midwestern-like final -r sound itself. This usually occurs on diminutive nouns or nouns formed from other words, but it is especially associated with the speech of Beijing, where people are said by other Chinese to buzz and rumble a good deal. Thus míngtiān ("tomorrow") is sometimes pronounced míngtiār (or even simply miár). Standard Pinyin transcription retains the full spelling and simply appends the suffix R to show the transformation.

a (in ian or yan) = e in English get (!)
a (elsewhere) = a in English father
e (alone) = usually between the u in English lump and the u in English lurch
e (alone) = occasionally (in exclamations and a few particles) like e in English get
e (before n or ng) = a in English alone
e (after i or y) = e in English get
er = Midwestern English are
i (after h or r) = Midwestern r in hurt [footnote 4]
i (after z, c, s) = English z [footnote 4]
i (before or after another vowel, except u) = English y
i (elsewhere) = ee in English see
iu = English yo as in "Yo ho!" or yeo as in yeoman
o (after b, p, m, or f) = Italian uo
o (after a) = English w
o = Italian o
ou = English owe
u (after j, q, x, i, or y) = French u or German ü
u (before or after another vowel, except i) = English w
u (elsewhere) = Italian u
ui = English way (!)
ü = French u or German ü (written like ordinary u [without the dots] after j, q, x, i, or y, since they never have a regular u-sound after them)

U may never occur as an initial letter of a syllable. It always turns to w in that case. If u is the only sound in the syllable, then it becomes wu.

I may never occur as an initial letter of a syllable. It always turns to y in that case. If i is the only sound in the syllable, then it becomes yi.
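
The two rules above are mechanical enough to sketch in Python. The function name spell_zero_initial is hypothetical, and the sketch assumes the syllable has no initial consonant and is supplied in its full underlying form (uei, iou, uen rather than the abbreviated ui, iu, un). Going slightly beyond the rules as stated, it also keeps the i and adds y when a consonant follows, so that in comes out as yin.

```python
VOWELS = set("aeiouü")


def spell_zero_initial(final):
    """Spell a syllable that has no initial consonant (hypothetical helper)."""
    glide = {"i": "y", "u": "w"}.get(final[:1])
    if glide is None:
        return final  # does not start with i or u: nothing to change
    if len(final) > 1 and final[1] in VOWELS:
        # i or u before another vowel turns into y or w: ia -> ya, uo -> wo
        return glide + final[1:]
    # i or u is the only vowel: keep it and put y or w in front of it
    return glide + final


for final in ["i", "u", "ia", "uo", "in", "uei", "iou"]:
    print(final, "->", spell_zero_initial(final))
```
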

Footnote 4: This is not technically a vowel at all. These syllables conclude with voiced syllabic consonants, and the use of i as dummy vowel here is merely a graphic convention to show that the syllable is complete and to carry the tone mark. The sounds that occur here have no relation to i in standard Mandarin. However some non-standard varieties of Mandarin do in fact use an i-like sound in these locations, and cognate syllables in other Chinese languages usually have a similar vowel in these locations, which is presumably why i was chosen for the dummy-vowel.


(This section gives a workable overview of the tones of modern standard Mandarin. For a cross-dialectical discussion of the properties of the underlying system, check out the page on More Than You Want To Know About Chinese Tone.)

Note: In standard Mandarin, when two third tones occur together, the first of them is spoken as though it were a 2nd tone. (Chinese speakers find it disconcerting when foreigners miss this.) When a 2nd tone follows a 1st or 2nd and is not phrase final, it is spoken as though it were a 1st tone. (Chinese speakers pay little attention when foreigners miss this.) The phenomenon of a tone pronounced differently because of its environment in a longer phrase is referred to as "tone sandhi." It is missing in Cantonese, relatively rare in most dialects of Mandarin, and quite common in most other kinds of Chinese.
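
As a rough illustration, the two sandhi rules in the note above can be written as a small Python function. The name apply_tone_sandhi is invented for this sketch. It assumes that tones are given as numbers 1 to 4 for a single phrase, and that both rules look only at the underlying (dictionary) tones of the neighboring syllables. Real speech groups longer runs of third tones by phrasing, which this sketch does not attempt.

```python
def apply_tone_sandhi(tones):
    """Return the spoken tones for one phrase (hypothetical helper).

    tones: list of underlying tone numbers, one per syllable.
    """
    spoken = list(tones)
    last = len(tones) - 1
    for i, tone in enumerate(tones):
        if tone == 3 and i < last and tones[i + 1] == 3:
            # A 3rd tone directly before another 3rd is spoken as a 2nd.
            spoken[i] = 2
        elif tone == 2 and 0 < i < last and tones[i - 1] in (1, 2):
            # A 2nd tone after a 1st or 2nd, not phrase final, is spoken as a 1st.
            spoken[i] = 1
    return spoken


print(apply_tone_sandhi([3, 3]))     # [2, 3]
print(apply_tone_sandhi([1, 2, 4]))  # [1, 1, 4]
print(apply_tone_sandhi([1, 2]))     # [1, 2], because the 2nd tone is phrase final
```
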

Part C: Wade-Giles Romanization (WG) Explained In Relation To PY

Wade-Giles Consonants

WG = PY & Comments
ch = zh or j
chih = zhi (syllabic)
ch' = ch or q
ch'ih = chi (syllabic)
f = f
h = h
hs = x
-h (final) = unwritten
j = r
jih = ri (syllabic)
k = g
k' = k
l = l
m = m
n = n
-ng = -ng
p = b
p' = p
rh = -r
s = s
sh = sh
shih = shi (syllabic)
ssu (szu) = si (syllabic)
t = d
t' = t
ts = z
ts' = c
tzu = zi (syllabic)
tz'u = ci (syllabic)
w = w
y = y

Wade-Giles Vowels

WG = PY & Comments
a = a
e (in yen or ien) = a in yan or ian
e (elsewhere) = e
i = i, yi
ieh = ie
o (after k k' h l [footnote 5]) = e
o (after p p' m f) = o
o (elsewhere and after l) = uo
u (except as syllabic consonant as listed above) = u
ü = ü (except where dots omitted as redundant. See PY table)
y = y

U and i become w and y in WG according to the same rules as in PY.

Footnote 5: Some writers distinguish two syllables luo and lo (corresponding to PY luo and le); others distinguish the same syllables as lo and le; still others spell them both lo indiscriminately.

Tones: In Wade-Giles spellings, tones are shown by small raised numbers after the syllables to which they refer.

Points to be alert to: WG transcriptions often contain deviations from standard pronunciation, usually for one of the following reasons:

  1. Ignorant editors often remove apostrophes from WG transcriptions or leave the two dots off of the letter ü.
  2. WG transcription was the preferred system for a generation of sinologists who did not actually speak Chinese and who regarded Romanization with disdain. They easily made transcription mistakes.
  3. WG spellings antedate the decision to make Beijing speech a national standard, so some early publications standardized upon other dialects of Mandarin. (Non-Beijing Mandarin spellings are usually preserved for some reason in French even today whenever the French version of WG is used.)
  4. WG is often mixed with non-WG "postal" spellings of place names, which were never consistent but were fossilized in their Romanization when Europeans were in charge of the Chinese post office system. This produced spellings like "Amoy" for Xiamen, "Swatow" for Shantou, "Teochiu" for Chaozhou, and "Canton" for Guangzhou. With the full universalization of Pinyin in the 1970s such idiocies are gradually disappearing.
  5. Some Sinologists used to write a breve (like the lower half of a circle) over the u (ŭ) in the syllable ssu/szu, and some wrote a circumflex (like a small roof) over the e (ê) in some syllables. These were not tone marks and did not affect the pronunciation. (Don't ask!)
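
Point 1 in the list above is easy to demonstrate. The Python sketch below respells only the WG initials that differ by an apostrophe and then shows what happens when an editor strips the apostrophe. The names WG_TO_PY_INITIAL and wg_initial_to_py are made up for the example. It assumes a plain ASCII apostrophe and handles only syllables whose vowels and endings are spelt the same in both systems (ai, an, ing, and so on); syllabic forms like tzu or tz'u are outside its scope.

```python
# WG initials that differ only by the apostrophe, with their PY equivalents
WG_TO_PY_INITIAL = {
    "p": "b", "p'": "p",
    "t": "d", "t'": "t",
    "k": "g", "k'": "k",
    "ts": "z", "ts'": "c",
}


def wg_initial_to_py(syllable):
    """Respell the initial of a WG syllable in PY (hypothetical helper)."""
    # Try the longest initials first so that ts' wins over ts, and ts over t.
    for wg in sorted(WG_TO_PY_INITIAL, key=len, reverse=True):
        if syllable.startswith(wg):
            return WG_TO_PY_INITIAL[wg] + syllable[len(wg):]
    return syllable


for wg in ["t'ai", "k'an", "ts'ai"]:
    stripped = wg.replace("'", "")  # what a careless editor leaves behind
    print(wg, "->", wg_initial_to_py(wg),
          "| without apostrophe:", stripped, "->", wg_initial_to_py(stripped))
```

With the apostrophe, t'ai comes out as PY tai. Without it, the same letters are read as WG tai, which is PY dai, a different syllable altogether.
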

Part D: Gwoyeu Romatzyh (GR) Explained In Relation To PY

Gwoyeu Romatzyh Consonants

GR = PY & Comments
b = b
ch = ch or q
d = d
f = f
g = g
h = h
-h- = (1st tone mark after m, l, n, r)
j = zh or j
k = k
l (initial) = l
-l (final) = -r
m = m
n = n
-ng = -ng
p = p
r = r
s = s
sh = sh or x
t = t
w = w

Gwoyeu Romatzyh Vowels

GR = PY & Comments
a = a
au = ao
e = e
i = i, y, yi
iou = iu
iu = yu, ü (or u when it sounds like ü)
o = o
u = u, w, wu
w = See tones
y = See tones
-y = -i (as a dummy vowel with syllabic consonants)



First tone is shown by
  1. a single vowel
  2. i or u as an initial letter or before a, e, or o
  3. i or u after a in a two-vowel diphthong
  4. h after m, l, n, or r
Second tone is shown by
  1. initial m, l, n, r with no mark to show a different tone
  2. postvocalic r
  3. y or w before a, e, o, or ou
  4. y as the only vowel before n or ng
Third tone is shown by
  1. any double vowel (aa, ii, etc.)
  2. an e or o between an initial consonant and another vowel
  3. e or o after a in a two-vowel compound (i or u in other tones)
Fourth tone is shown by
  1. postvocalic h
  2. final nn
  3. final nq (ng in other tones)
  4. final ll (= PY -r4)
Neutral tone is shown by a dot before the base spelling (usually the first tone spelling)
Neutralized tone is shown by a dot before the spelling with the word's normal tone
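
To give a feel for how "tonal spelling" can be read mechanically, here is a small Python sketch that checks a GR syllable for the four fourth-tone markers listed above. The function name has_fourth_tone_marker is invented for the example, and the sketch covers only those four markers, so a fourth-tone spelling that relies on any other device will be missed. It also treats y as a vowel letter, since GR uses it as the dummy vowel.

```python
GR_VOWELS = set("aeiouy")  # y counts here because GR uses it as a vowel letter


def has_fourth_tone_marker(gr_syllable):
    """True if the spelling ends in one of the four markers listed above."""
    s = gr_syllable.lower()
    # final nn, final nq (ng in other tones), final ll (= PY -r4)
    if s.endswith(("nn", "nq", "ll")):
        return True
    # postvocalic h: a final h that directly follows a vowel letter
    return len(s) >= 2 and s[-1] == "h" and s[-2] in GR_VOWELS


for gr in ["dah", "fann", "shanq", "ell", "da", "mha"]:
    print(gr, has_fourth_tone_marker(gr))
```

Note that mha is not flagged: its h follows m rather than a vowel, which is the first-tone mark, not the fourth.
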

Finally, foreign words retain their original spellings. Hence the Roma in Gwoyeu Romatzyh is the Italian word Roma (Rome); if GR did not make this curious exception to its rules, the word would actually be spelled Luomaa (Luo2ma3 in Pinyin)!

Closing Thought: There now. That won't enable you to write GR, but at least it will let you convert it to Pinyin. Although GR is difficult to teach and learn, those who know it find it the most efficient Chinese Romanization, so it is still used in anthropological and linguistic fieldnotes, Email, and other places where accuracy and efficiency need to be combined and where editors opposed to representing tone have no ability to stop us!

