Notes on Chinese names

The extensively varied "Chinese" language.

A question I often hear when I present this work, on line or off line, is: "Why don't we ever see your Chinese names in the literature we come across?" or "Why is it that most of the Chinese names I come across are different from yours?"

The answer has several parts.

1. Most scientists only come across romanised (romanized) names. If they only saw Chinese characters their eyes would glaze over.

2. The bulk of the China-related literature in the West has been published in Taiwan and Hong Kong. That implies traditional characters and different romanisation systems.

In the database I have adopted the standard combination of simplified characters and Hanyu pinyin romanisation, because they correspond to modern thinking in Mainland China, where the bulk of the Chinese population resides. However, I never forget that, on the one hand, many characters have been simplified and therefore have at least two versions, plus sometimes some regional variations (see for instance the "si" of sigua, ridged gourd, Luffa acutangula). On the other hand, in Mainland China alone the cultures are as diversified as within Europe. Consequently the scripts vary or, if the script is the same, the pronunciations differ. Various romanisation systems cater for all these variations.

In the context of our database we have had to limit ourselves to Chinese as one language (one field), but within the confines of this "field" we do our best to accommodate as many variations as practicable. It should be noted that the scope for expanding this field is far greater while we publish static files / working files / rough drafts. The day we actually publish the database itself, the field "Chinese" will restrict us considerably, even if we split it into 3 or 4 Chinese fields (Chinese-Cantonese, Chinese-Mandarin, Chinese-Taiwanese, Chinese-Hakka for example).

Then again, both technical / academic staff and linguists recommend restricting oneself to 2 or 3 synonyms when publishing a dictionary. This limits confusion, they say. It also allows for fruitful comparison with related languages like Japanese, Vietnamese etc. The names in these languages are mainly based on the main Chinese names, not on the less important or less popular ones.


1. From the KANJI DATABASE, part of Jeffrey Friedl's JAPANESE <-> ENGLISH DICTIONARY

< > (Gateway). I prefer the following settings, but they may not suit everyone: < >

Data for the character "peng", one half of the internationally recognized word "ponkan" (a citrus):

  1. Dictionary search code: ``!5C2E''
  2. Classification: rare
  3. Reading: PON
  4. English tags: `name of a place in India', `Poona'
  5. Radical: 75, Stroke count: 12
  6. Encodings: JIS 5C2E, EUC DCAE, Kuten 6014, Shift-JIS 9EAC, Unicode 692A
  7. SKIP code: 1-4-8, Four-Corner code: 4891.2
  8. Indices: S&H 4a8.34, Morohashi #0
  9. Pinyin: peng4

The "reading" is the Japanese romanisation. It corresponds to the "Sino-Japanese" entry in the example below, which shows the numerous romanisations attached to a single character.
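The encodings listed in item 6 are not independent: EUC, Kuten and Shift-JIS can all be derived from the JIS code by simple byte arithmetic. The following is a minimal Python sketch of those standard JIS X 0208 conversions. The function names are our own illustrations and are not part of the Kanji Database.

```python
def jis_to_euc(jis: int) -> int:
    """EUC-JP sets the high bit of both JIS bytes."""
    return jis | 0x8080


def jis_to_kuten(jis: int) -> str:
    """Kuten is the row and cell number: each JIS byte minus 0x20."""
    row = (jis >> 8) - 0x20
    cell = (jis & 0xFF) - 0x20
    return f"{row:02d}{cell:02d}"


def jis_to_sjis(jis: int) -> int:
    """Usual JIS X 0208 to Shift-JIS byte arithmetic."""
    j1, j2 = jis >> 8, jis & 0xFF
    # Two JIS rows share one Shift-JIS lead byte.
    s1 = ((j1 + 1) >> 1) + (0x70 if j1 < 0x5F else 0xB0)
    if j1 & 1:
        # Odd row: lower half of the trail-byte range, skipping 0x7F.
        s2 = j2 + (0x1F if j2 < 0x60 else 0x20)
    else:
        # Even row: upper half of the trail-byte range.
        s2 = j2 + 0x7E
    return (s1 << 8) | s2


jis = 0x5C2E  # the JIS code quoted for "peng"
print(f"EUC       {jis_to_euc(jis):04X}")
print(f"Kuten     {jis_to_kuten(jis)}")
print(f"Shift-JIS {jis_to_sjis(jis):04X}")

# Cross-check of the Unicode value using Python's built-in euc_jp codec.
char = jis_to_euc(jis).to_bytes(2, "big").decode("euc_jp")
print(f"Unicode   {ord(char):04X}")
```

For the JIS code 5C2E the first three lines print DCAE, 6014 and 9EAC, the values quoted in the list. The last line asks Python's own codec for the code point, as an independent check on the Unicode entry.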

2. From Thomas Chin's Hakka Dictionary, the romanisation of most characters can be found together with the meaning of each character. Ten different Hakka systems are included and cross-referenced with other systems from different languages. It is a valuable Chinese-Hakka-centered resource, complementary to the above Japanese-centered set of resources.

  1. gam1 (Lau Chun-fat (pinfa) Hakka pinyin dictionary)
  2. kam1 (MacIver Kwangtung Hakka dictionary)
  3. kam1 (Rey Meixian Hakka)
  4. kam1 (Dongguan Qingkai Hakka)
  5. kam1 (Bao'an Guanlan Hakka)
  6. kam1 (Lufeng Hakka)
  7. gan1 / qian2 (Pinyin)
  8. gam1 (Cantonese)
  9. kam / kem (Sino-Korean)
  10. kan (Sino-Japanese)


Further complementary resources can be found in our file: On-Line Bibliographical Resources < Bibliography_Electron.html#2 > under "Chinese".

The romanisation issue -
Displaying the sound(s) of the character.

Another question, or rather a criticism, is: "Why don't you show the tone in your romanisation?"

I do not show it directly, true, but if one points to any character, the browser will display somewhere on screen the URL of its gif picture. The first number in this URL is the tone number. The standard code is, for example, ma1; my URLs will be ma11, ma12, ma13 etc. for the characters pronounced "ma" with a flat (first) tone. The second number identifies the character within our internal database. Once this principle is understood, anyone can display a character (by typing in the URL window : for example) or several characters in an html file, provided each character is linked to our server (... <IMG SRC="" WIDTH=24 ... etc.). It is not a rapid method, but it is universally practical when discussing the intricacies of one or two characters.

The following shows, in order, the simplified character, the proper romanisation, its computer-compatible alternative, and our internal, html-compatible database coding. The corresponding tone indication is evident. In the case of "ma" (horse) we show the Traditional character (note our coding for it) as well as the simplified form.

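[character image] = mā = ma1 = ma11.gif (unique to our database)

[character image] = má = ma2 = ma21.gif (unique to our database)

[character images] = mǎ = ma3 = ma31.gif (simplified), Tma31.gif (traditional) (unique to our database)

[character image] = mà = ma4 = ma41.gif (unique to our database)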
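The naming rule described above (romanised syllable, tone number, then the character's number in our database, with a leading "T" for a traditional form) can be sketched in Python as follows. This is only an illustration: the names GlyphFile, parse_glyph_file and standard_code are hypothetical, and the pattern assumes lowercase unaccented syllables and tone numbers 1 to 4, which is all the examples here show.

```python
import re
from typing import NamedTuple


class GlyphFile(NamedTuple):
    syllable: str       # toneless romanised syllable, e.g. "ma"
    tone: int           # first digit: the tone number
    index: int          # remaining digits: character number in the database
    traditional: bool   # True when the file name starts with "T"


# Assumed shape: optional "T", lowercase syllable, one tone digit (1-4),
# one or more index digits, then ".gif".
_PATTERN = re.compile(r"^(T?)([a-z]+)([1-4])(\d+)\.gif$")


def parse_glyph_file(name: str) -> GlyphFile:
    """Split a file name such as 'ma31.gif' into its components."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised glyph file name: {name!r}")
    trad, syllable, tone, index = match.groups()
    return GlyphFile(syllable, int(tone), int(index), trad == "T")


def standard_code(name: str) -> str:
    """Return the standard syllable-plus-tone code, e.g. 'ma3'."""
    glyph = parse_glyph_file(name)
    return f"{glyph.syllable}{glyph.tone}"


for name in ("ma11.gif", "ma21.gif", "ma31.gif", "Tma31.gif", "ma41.gif"):
    print(name, parse_glyph_file(name), standard_code(name))
```

Read this way, ma31.gif and Tma31.gif both give the standard code ma3, and differ only in the traditional flag.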

Convention or common sense in romanising words

In our database we decided from the very start to separate the romanisation of each character from those preceding or following it. This is not the conventional wisdom. We originally did this to accommodate our database and to make it plain to readers where the characters start and end. In composed names of the type Huangrou huluobo or Hongpi malingshu, it adds to the understanding to separate the groups. When the word is displayed in its original script as a name, the romanisation loses much of its status; it becomes an aid to reading, similar to the tiny hiragana characters on top of complex rare kanji characters in Japanese. In that context we feel it is OK to separate each romanised character. The purpose is then to help the reader to "read" the Chinese word and, more importantly, to know instantly where any specific character starts and ends. For Japanese, where we were using more sophisticated equipment and had more experience when we started, we have tried to adopt the convention of separating meaningful "words" (or rather their representation in a romanised form) instead of individual characters. Malaixiya (Malaysia) would gain from being kept in one block.
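The difference between the two conventions can be made concrete with a small Python sketch. It assumes a name is stored as a list of words, each word being a list of romanised characters. That storage layout, the function names and the capitalisation of the first letter only are all assumptions made for illustration, not a description of our database.

```python
from typing import List

Name = List[List[str]]  # a name is a list of words; a word is a list of syllables


def per_character(name: Name) -> str:
    """Romanisation with every character separated."""
    syllables = [syllable for word in name for syllable in word]
    return " ".join(syllables).capitalize()


def per_word(name: Name) -> str:
    """Conventional romanisation with the characters of a word joined."""
    return " ".join("".join(word) for word in name).capitalize()


carrot: Name = [["huang", "rou"], ["hu", "luo", "bo"]]
malaysia: Name = [["ma", "lai", "xi", "ya"]]

print(per_character(carrot))    # Huang rou hu luo bo
print(per_word(carrot))         # Huangrou huluobo
print(per_character(malaysia))  # Ma lai xi ya
print(per_word(malaysia))       # Malaixiya
```

Keeping the word boundaries in the stored data means either form can be produced on demand, whereas a name stored only character by character cannot be regrouped without extra knowledge.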


The site "Child of the World" gives some background on the Chinese language, Mandarin, Cantonese, and Hokkien.

< >




Date created: 16 / 02 / 2000
Authorised by Prof. Snow Barlow
Last modified: 18 / 02 / 2000
Access: No restriction
Copyright © 1995 - 2000, The University of Melbourne.
Maintained by: Michel H. Porcher, E-Mail: