An introduction to Chinese writing (v.0.3)

This text is in its third draft and unedited. To give feedback on this version please mail me: kees@ketmia.net.

A short introduction (4 to 5 minutes read) is here.

This introduction will try to explain the current Chinese writing system to a layperson. Specifically it is aimed at giving necessary background information for the graphical etymologies of Chinese and Japanese characters on this site.

As this text is in English, it will assume knowledge of the Latin alphabet, spelling in English and knowledge of the numerals 0123456789.

Most people will have encountered the signs that make up Chinese writing as part of the shop signage of a Chinese restaurant or Chinese market, like the shop below, which displays the signs 鑫, 滿 and 行 on its facade.

These signs are very different from letters. While letters like a b c d e, or combinations of letters like sh ch ng th represent sounds, signs like 鑫, 滿 and 行 represent words or syllables, and only a subset of those signs give a hint with regard to their pronunciation. Mostly, signs like 鑫, 滿 and 行 have to be memorized directly, as arbitrary signs. In that they are exactly like the numerals 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 in English.

How do you know how to pronounce any of the signs 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 that you know so well? Answer: you simply memorized them as a child. The signs themselves are completely arbitrary. At some point you learned to connect the words for the numbers (which until then you only knew as sounds) with these scribbles.

Note how each sign for a number stands for a complete word. For example, the sign “4” can also be spelled as “four”. Writing “4” and “four” are different ways to write the word {four}. In regular Chinese writing, the word {sì} (which means “four”) is only written with the single sign 四.

When we combine “4” and “5” to “45”, we get a new word {forty-five}. Now “4” has lost its independence and has become part of a multi-syllable word. Likewise, in Chinese writing, signs can combine to create multi-syllable words. The word {sìwǔ} is analogous to “45” and is similarly spelled with two signs: 四五. But in Chinese writing this principle extends to regular words. The Chinese word for “number; digit” is {shùzì}. It consists of two syllables and is spelled with the two signs 數 and 字, as 數字.

The signs in Chinese writing stand directly for words or syllables, just like the numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. In principle all signs in Chinese writing have to be memorized separately. A substantial part of the signs in Chinese are arbitrary, just like the numbers. You have to memorize how they look and which word they stand for.

Here are a few random examples of Chinese signs that (nowadays) are arbitrary with regard to their shape, and the word (or syllable) they stand for:

Sometimes people will see pictures in signs like these and use that as a (temporary) reminder. Perhaps you were told stories about 0, 1, 2, 3, 4, 5, 6, 7, 8, 9? I wasn’t. Some people see in 4 a little car, facing to the left. And a car has four wheels doesn’t it? Some people take 四 (the Chinese sign for the word “four”) as a fist, four fingers closed around the thumb. But stories like that don’t take you very far.

Try building a story around 黃, which is used to write the word for the color “yellow” in Chinese. Or 民, which is used for writing words relating to “citizens” or “the people”. Or look at 凡, which is used for the word “commonplace”. And so on.

***

As it happens, not all signs that are used in Chinese writing are arbitrary. Some are recognisably based on a picture of an object. Although, mostly you have to be told what it should represent, because the signs are very abstract. Below are examples of signs that represent an object as a picture.

For example, 人 and 亻 should be taken as a person seen from the side. 目 is supposed to be an eye (it has been rotated ninety degrees). 魚 is a picture of a fish (scales and all!). 又 should be a hand, and 手 also. 木 represents a tree. 刀 is a knife. Do you recognize 口 as a picture of a mouth? Not instantly of course. It could also be a cube. Or a brick. A brick is actually 瓦, which supposedly is also a picture, but rather hard to make out.

It is interesting to realize that our technological age has created somewhat similar pictorial signs and that the number of these is rapidly growing. Basic examples are signs to indicate whether a bathroom is for men or women. On the computer there are innumerable icons that have a pictorial quality. For example, a picture that represents a loudspeaker is commonly used to indicate a volume button. Street signs often contain pictures as well. Here are examples of modern pictorial signs:

While superficially Chinese pictorial signs are similar to modern pictograms, icons etc., there are important differences between those two categories. I will deal with some of the differences below after I’ve discussed “puzzle signs”.

The absolute number of signs in Chinese writing that are (abstract) pictures of objects is not very large. However, you are more likely to see them anyway. Firstly, because they are used for words and meanings that are basic and frequent, like “person”, “hand”, “woman”, “child”, etc. Secondly, these signs are more frequently used as parts of composite signs (signs that are a combination of different signs, more on this below).

A small number of the signs that are used in written Chinese are recognisably (once you are told) a picture of an object. Some of them are used frequently, both on their own, and as parts of composite signs.

Note that quite a few of the arbitrary signs I described earlier started out as pictures, but are no longer recognisable as such. Originally “brick” 瓦 was a picture of... something. What exactly it was supposed to represent has been lost in the mists of time. As such, it has become an arbitrary sign.

***

A third group of signs can be taken as little puzzles or stories. I will analyse a number of them to explain some of the principles involved.

Take for example 大. It is an abstract picture of a person, standing with arms and legs spread out wide. However, it doesn’t mean “person”. It is used to write the word “large”.

The sign 上 is used for the word “above; on top”. The little horizontal line is “above” the larger horizontal line. Likewise 下 is used for “down; below”.

The sign 刃 consists of 刀 (which should be taken as a picture of a knife) and a small line. The small line emphasises “the blade” of the knife, and is used to write the word “blade”.

The sign 本 consists of 木 (which should be taken as a picture of a tree) and a line at the bottom of the tree, signifying “root” or “base”.

Other signs combine different signs to tell a little story to convey the word they point to.

For example, the sign 休 consists of two components: a variant of the sign for “person” 人 that looks like 亻, and the sign I just mentioned, “tree” 木. Together the two signs are supposed to convey something like “shade” (a person standing in the shade of a tree) or “rest” (a person resting in the shade of a tree).

Or take for example the signs 林 and 森, which respectively consist of a doubling of “tree” 木 and a tripling of “tree” 木, to indicate two different words for “woods” or “forest”.

Unfortunately, the stories can become difficult to decipher. Let’s look at the sign that is used for the word “fresh”: 鮮. Again, this sign consists of two separate signs with separate meanings. At the left 魚, a picture of a fish. And at the right 羊, a picture of (the head of) a sheep. So why do these signs together amount to “fresh”? Apparently because it signifies two kinds of food that need to be eaten fresh.

A bit easier is the sign 武. It consists of a picture of a foot 止 at the bottom left, and a picture of a weapon (probably a “dagger-ax”) 戈 at the right. It seems that “foot” symbolizes “marching”, and combined with “dagger-ax” that should convey “warrior”.

As a final example of this group of signs I will look at 農, a sign that is used for the word “agriculture”. Originally it functioned like the sign 武, which I just analysed. Ostensibly 農 consists of the signs 曲 and 辰. Unfortunately, neither of these signs can be understood as they appear in modern writing, and for that reason 農 is incomprehensible as well. In a way that means that for practical purposes 農 has become one of the signs that have an arbitrary shape.

However, the separate signs 曲 and 辰 are used quite a few times in other complex signs as building blocks as well. As such they have a certain familiarity. While combining 曲 and 辰 to create the compound sign 農 to write the word “agriculture” is arbitrary, 曲 and 辰 have a known shape, and can be “named” by their function as independent signs (曲 is used for a word {qǔ} meaning “bent”; 辰 is for writing a technical term in the traditional calendar, the word {chén}). This relative familiarity can help readers and writers remember 農 as an abstract sign.

A small number of the signs that are used in written Chinese are like a puzzle. Quite often the meaning of the puzzle is difficult to decipher or even incomprehensible, but the sign might still be analysed as consisting of familiar shapes.

Just like there are modern counterparts to the Chinese picture signs (point 2), there also exist modern counterparts to Chinese signs that are like puzzles. I will give a few examples of modern “puzzle signs”.

A simple representation of a loudspeaker can indicate a volume button. However, more commonly a sign is used that adds lines to indicate sound emanating from the speaker. A more abstract example is a sign made up of arrows pointing in different directions. In a computer program such arrows indicate that an object (like an image) can be moved. In assembly instructions they might indicate that parts need to be opened or spread out. The traffic sign that indicates that a road is closed for motorbikes combines a picture of a person on a motorbike with a red slash and a red circle to convey this prohibition. The sign to indicate an emergency exit (originally designed by Yukio Ota in 1979) combines a stylized running person with the frame of a door and the color green to convey its message. And so on.

***

While both the picture signs and the puzzle signs that remain in Chinese writing seem to have modern counterparts, there are at least two important ways in which they differ.

(Ⅰ) The way the Chinese signs have evolved over millennia has made lots of signs that originally represented some object very abstract and often the intended object is unrecognizable.

In addition to that, insofar as the signs of Chinese writing are still being developed, it’s only for ease of writing. This obscures the occasional pictographic origin of signs even more. As a result these signs are not intuitive and have to be learned and memorized.

Our modern pictorial signs on the other hand, are constantly being developed for ease of comprehension. There is no escaping their pictorial quality, and in fact, they convey their meaning through their pictorial quality. That is totally unlike written Chinese.

(Ⅱ) As I already wrote in point 1, Chinese signs normally stand for specific words or syllables (the latter usually morphemes). That some signs might have a pictorial ancestry or still may have some recognisable pictorial features is something that a reader of Chinese normally doesn’t bother to notice. Instead, the reader sees words, or parts of words.

When we see “1000”, we think the word {thousand}. Likewise, when a reader of Chinese sees “大學”, the reader will think the word {dàxué} (“university”). This is different from our modern pictorial signs. Our signs rarely evoke specific words. More often, they convey general concepts. For example, the sign stands loosely for “a place to eat”, which might be a “restaurant”, a “cafeteria”, a “café” etc.

This goes even further than specific words versus loose concepts. The signs in written Chinese can often only be understood by someone who knows Chinese, because the signs are used to write specific Chinese words and expressions, and include grammar and idiom. Our modern pictograms on the other hand, are specifically designed to cross language borders.

***

Firstly, we’ve seen arbitrary signs, that are exactly like 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9 (point 1). Secondly, signs that are simplified pictures, somewhat like the modern icons used in computer software. They look rather abstract, but still look like something. Once you’re told what they are supposed to represent you can remember them more easily. Only about 10% of the signs are like that, but they are used relatively frequently (point 2). Thirdly, there are signs that are like little puzzles. This group of signs is not very large. However, they are composite signs that combine two or more other signs, and this they have in common with the last and largest group of signs, which I will deal with now.

***

Most of the signs in Chinese writing (perhaps 90% in absolute numbers) combine two different signs that have different functions. And below I will try to explain how that works. Or used to work.

Let’s go back to 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. As you surely know, we can use those signs to write other words than the numbers. For example, we can write: 4 U 2. That is for you too. In doing so we use the sign “4” to write the word {for} instead of the word {four}, and the sign “2” to write the word {too} instead of {two}.

This is a very important point, because this principle can be used to build a mature writing system.

When the Chinese started writing, they made signs that looked like pictures (point 2) or consisted of little stories to convey abstract concepts (point 3). However, it’s simply impossible to do that for every object and every concept that exists. Not only is it impossible to come up with comprehensible signs for all the words using that method, it would also lead to an almost infinitely large number of signs, that no mortal person could memorize.

For that reason, the Chinese turned to the so-called “rebus principle”.

In English we use the sign “4” to write the word {four}. The word {four} has a meaning, “number four”—and a pronunciation, a sound, something like /fɔɹ/ (in the USA). If we ignore the meaning, we can use the sign “4” also to write other words that sound like /fɔɹ/, like the word {for}.
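The “4 U 2” example can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how any real writing system is stored: the table and the names in it (`REBUS_READINGS`, `own_word`, `borrowed_for`, `read_rebus`) are made up for this sketch, and the entries come from the example above.

```python
# Hypothetical mini-lexicon for the "4 U 2" example: each sign, the word it
# normally stands for, and a similar sounding word it can be borrowed for.
REBUS_READINGS = {
    "4": {"own_word": "four", "borrowed_for": "for"},
    "U": {"own_word": "the letter U", "borrowed_for": "you"},
    "2": {"own_word": "two", "borrowed_for": "too"},
}


def read_rebus(message):
    """Read each sign for its sound only and return the intended words."""
    words = []
    for sign in message.split():
        entry = REBUS_READINGS.get(sign)
        # Signs that are not in the table are passed through unchanged.
        words.append(entry["borrowed_for"] if entry else sign)
    return " ".join(words)


print(read_rebus("4 U 2"))  # prints: for you too
```

The table ignores the meaning of each sign and keeps only the sound, which is exactly the step the rebus principle takes.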

This method assumes you’ve got a bunch of signs to start with. Which the Chinese had: the picture signs, the puzzle signs, and also a bunch of more or less arbitrary signs that still had an agreed upon usage. All these existing signs were used to write other words that sounded more or less the same. That is step one. I will give some real examples from ancient Chinese writing.

又 This ancient sign is a simplified picture of a hand. It wasn't used to write “hand”, but to write the word “right” (side, direction). Since it sounded a bit like the word for “again” in ancient Chinese, it was used to write that word as well.

其 This ancient sign is a picture of a basket, probably for a word “winnowing basket”. It was also used for a similarly pronounced word with the meaning “that”.

正 This ancient sign is a kind of puzzle, consisting of a picture of a “foot” 止 (symbolizing “marching on” or “attacking”) and a sign representing a destination or perhaps a city. It got to be used for the different word “be straight; correct”.

And so on. For a while this process allowed the scribes to write more (especially more difficult to visualize) words. However, in the end the scribes decided that it was too confusing to use identical signs for very different words. Enter stage two: add additional signs to ambiguous signs to differentiate them. I’ll use the same examples.

又

右 Here scribes added a sign (looking like 口, but its meaning is now uncertain) when they wanted to explicitly write the original word “right”.

其

箕 Here scribes added a sign (probably a variant of “bamboo” 竹) when they wanted to explicitly write the original word “winnowing basket”.

正

征 Here scribes added a sign (彳, which is an abbreviation of “crossroads” 行, symbolizing “to go”, “travel”) when they wanted to explicitly write the original word “marching on” or “attacking”.

To summarize what happened so far: scribes took one sign purely for its sound. Then combined it with another sign for its meaning. With this the scribes struck upon a principle to create new signs.

The principle amounted to this: whenever scribes needed to write a word for which there was no sign yet, they could pick a sign that could indicate the pronunciation of the word they wanted to write, and could combine it with another sign to determine what word was meant specifically. This was stage three. The signs that result from this method can be called phonograms, or phonosemantic compounds. Below are examples from modern Chinese writing.

Examples of composite signs that use 其 to express sound

Remember the winnowing basket I mentioned earlier? Its modern form is 其. On its own it is used for the grammatical word {qí} “his; her; its; their”. Using the sound associated with it, 其 has been used to create other signs. For this it was combined with additional signs to indicate which word is meant specifically.

More signs that combine 其 with other signs exist, but the pattern is hopefully clear by now.

Problems with signs that combine sound and meaning

Below I’ll use the examples with 其, but the problems that I identify affect all signs of this type, not just the ones that use 其.

  1. Sometimes the sound that 其 suggests is spot on, but mostly the vowel has a different tone, and the initial consonant is often different as well. This means that the reader really needs to know the word to be able to pronounce it from a text. A reader who encounters for the first time a specific sign that uses 其 as a hint for the sound of a word or syllable cannot be sure how to pronounce it.
  2. Conversely, when someone wants to write words or syllables with the sound /qí/, the spelling depends on the word in question. The reason is that there are lots of words or syllables (morphemes) that sound like /qí/ and that are written with different signs. Consequently, each of those signs or spellings has to be learned separately.
  3. 其 as a component that lends its sound is often placed at the right. However, it can also be placed at the bottom, or in a corner, or somewhere else still. Sometimes it is difficult to tell which component is supposed to give a hint for the sound.
  4. The hint that the second component gives to determine which word exactly the sign points to may be obscure. For example, it’s unclear why “to deceive” 欺 has “yawn; lack” 欠. In this instance 欺 was originally used to write another word that has fallen out of use. Since then 欺 was adopted to write “to deceive”, purely because this word was pronounced more or less the same. A second reason that the meaningful component can become unhelpful is that its form may have become arbitrarily distorted and unrecognizable, as is the case with “banner; emblem” 旗.
***

In spoken language, words consist of different sounds. Different combinations of different sounds guarantee that we can keep those words apart. They also help create new, unique words. In English, we try to translate the sounds of words into combinations of letters that we can translate back into the original words when we read. We do this by recognizing the word and (optionally) sounding out its pronunciation through its spelling. Some similar sounding words are spelt differently for easy reading. For example, we spell rain and reign, for similar sounding but different words.

While English spelling has quite a few irregularities, its spelling is predictable enough that we can sound out words. Conversely, new words can always be translated into letters that code reliably (enough) for the new words. English uses only a small number of letters and combinations of letters to translate the sounds of words into writing.

When the Chinese discovered the rebus principle, they could have discarded most of their signs and have limited their writing to those signs that represented a unique sound. They could have decided to spell /qí/ always with the sign “其”, just as in English we write /a/ (almost) always with the sign “a”. However, the scribes apparently felt uncomfortable writing different words in the same way. Just as English spells /ɹeɪn/ differently as either rain or reign, the scribes decided that they would rather write /qí/ either as 其 or 旗, depending on the word they meant.

This meant that even though the Chinese discovered the rebus principle, in the end they tried to give almost all monosyllabic words and morphemes a different spelling. This destroyed to a large extent the usefulness of the rebus principle. The scribes kept adding unique signs and thereby increased the total number of signs to the point that regular literate people were unable to memorize them all. Likely as a result of the limits of the human brain, the total number of different signs that were actually being used at any given point never exceeded 6000 signs. Average users are not expected to know much more than about 3000 signs.

So how can written Chinese cope with only 3000 to 6000 signs, given that there are tens of thousands more words in use at any given time? Briefly, there seem to be two reasons. One is that a lot of signs are still being used to write more than one word. Often these words have the same pronunciation, but they can have different pronunciations as well. The other reason is that Chinese, just like English, uses a lot of compound words. For example:

English word    Chinese word    Chinese spelling
keyboard        jiànpán         鍵盤
playground      cāochǎng        操場
part-time       jiānzhí         兼職
web page        wǎngyè          網頁
table cloth     zhuōbù          桌布

While most compound words consist of two parts, more parts are possible as well. Some of those compounds are compounds of compounds.

English word    Chinese word    Chinese spelling
notepad         jìshìběn        記事本
the internet    hùliánwǎng      互聯網
voice mail      yǔyīn xìnxī     語音信息

In conclusion

  1. Most monosyllabic words and morphemes are written with unique signs, like the numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
  2. Some signs can be used to write different words with the same pronunciation, sometimes also to write words with different pronunciations.
  3. A small number of frequently used signs are abstract pictures.
  4. Most signs consist of two or more familiar components.
  5. Most signs contain one component that (might) give a hint for the pronunciation of a word or syllable, and one component that (might) hint at the meaning.
  6. Compound words are written using existing signs and take care of the bulk of newly created words.
***

Appendix 1: An example sentence in Chinese

Xīnxiān de shuǐguǒ bùduàn de cóng guówài sòng jìn nóngmào shìchǎng.
新鮮的水果不斷地從國外送進農貿市場。
Fresh fruit is constantly being sent to the farmers market from abroad.

Breakdown into words and phrases

***

Appendix 2: Examples of further analyses of signs

Not properly part of the introduction as such, the following will look more closely at the graphical make-up of separate signs, and introduce a bit of terminology. It is meant as background information for the graphical etymologies that are being added to the site.

The signs of written Chinese are often called characters (“character” in its meaning of “written or printed symbol or letter”). Below I will introduce a few more terms that are often used when discussing this topic.

Taking the example sentence of appendix 1, I will give an extensive analysis of the first two words, and categorize the rest briefly. Note that the graphical breakdown and categorization of the signs is almost always rather speculative.

The first two words (xīnxiān-de 新鮮的) are one phrase that consists of the word xīnxiān, followed by the grammatical particle de. The two-syllable word xīnxiān is spelled with two signs, one for each syllable.

The first sign is 新. It is a phonogram (a sign that consists of a component that hints at the sound and a component that determines the meaning). The component that gives the sound (often called a phonetic) in 新 is 辛. This phonetic 辛 is used in about a dozen other signs, with varying pronunciations: /qīn/ (親, 寴), /chèn/ (儭, 襯, 櫬, 嚫), /qìn/ (親), /shēn/ (莘), /xīng/ (騂, 垶, 觪) and /xīn/ (辛, 薪, 新).

The component that determines the meaning (which can be called a determiner, or a signific or sometimes a radical) in 新 is 斤, a picture sign of an “ax” (picture signs are often called pictograms). Apparently the sign 新 was originally used for the word “firewood” (which explains the ax) but 新 was loaned to write the word “new; newly” as well.

As I’ve explained earlier, signs (like our numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, 9) can be used to write other words than the word they were created for, as in 4 U 2 ({for} {you} {too}). The Chinese used this principle all through history, not only using (often called loaning) simple picture signs to write other words, but also using complex composite signs to write other words. 新 is such an instance. It can be called a loangraph.

Because 新 was “loaned”, the meaningful component in 新, the so-called determiner (here the picture sign “ax” 斤), is not meaningful at all.

Additionally, the hint for the sound in 新 (the phonetic 辛) is of limited value, its range being /qīn/ /chèn/ /qìn/ /shēn/ /xīng/ /xīn/. This is fairly typical. And while breaking down 新 into components makes remembering it easier, it still has to be memorized as a unit. This goes for most of the signs in Chinese.

The second sign we’ve seen before. 鮮 on its own is for a (shorter) word “fresh”. As I explained earlier, 鮮 is a somewhat cryptic puzzle: left and right a picture of a fish and a sheep, suggesting “fresh” because both are perishable foods that need to be eaten fresh.

鮮 is also used as a phonetic in three other signs (廯, 癬, 蘚, pronounced either /xiān/ or /xuǎn/). In the example sentence 鮮 connects with 新 for the two-syllable word xīnxiān. Because 鮮 consists of the relatively familiar separate signs “fish” 魚 and “sheep” 羊 it is easier to remember how to write it, but it still has to be learned as a unit. It’s not frequent as a phonetic, so in order to know its pronunciation, you need to recognize the word first.

Then the third sign, 的. Ostensibly 的 is a phonogram, consisting of the phonetic “spoon” 勺 on the right, and the meaningful component (the determiner) “white” 白 on the left.

The phonetic is pretty useless. 的 is here used for a grammatical word pronounced /de/. The range in the phonetic is /sháo/ /bào/ /biāo/ /liào/ /yào/ /yuē/ /yuè/ /diǎo/ /diào/ /zhuó/ /shuò/ /bó/ /dí/ /dì/. Only 的 can be read /de/. Note that 的 is also used to write other words, for example “aim; clear” (pronounced /dì/), “really and truly” (pronounced /dí/) and more.

It’s not certain why 的 uses the sign “white” as determiner. One theory is that the word {dì} “aim (for a target)” connects with practice targets which were white. However, here 的 is used for a grammatical attributive particle, so there’s no connection with its meaning anyway. With regard to meaning or sound the components of the sign 的 are arbitrary and not helpful, but at least the components “white” 白 and “spoon” 勺 are familiar components, and as such easier to remember.

Here’s a list that categorizes all the signs in the example sentence:

There are 17 signs in all, ten (新 的 地 從 國 外 送 貿 市 場) of which may have been phonograms at some time. However, three of the phonograms (的 外 市) have completely useless phonetics for the word they point to, while the other phonetics are of varying usefulness since they hint at a range of possible pronunciations. Four signs (鮮 斷 進 農) are puzzle signs (with 農 now an unsolvable puzzle). Three signs (水 果 不) were originally picture signs (不 completely abstract, 果 looks more like a puzzle sign now).
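For readers who like to check such counts, the tally can be reproduced with a short Python sketch. The three groups are copied from the paragraph above; the variable names and the category labels are my own shorthand and carry no technical meaning beyond this example.

```python
from collections import Counter

# The example sentence from appendix 1, without the final full stop.
sentence = "新鮮的水果不斷地從國外送進農貿市場"

# Groups as given in the text; the labels are shorthand for this sketch only.
categories = {
    "phonogram": "新的地從國外送貿市場",
    "puzzle": "鮮斷進農",
    "picture": "水果不",
}

# Map every sign to its category, then count the signs in sentence order.
lookup = {sign: name for name, signs in categories.items() for sign in signs}
tally = Counter(lookup[sign] for sign in sentence)

print(len(sentence), "signs in all")
for name in categories:
    print(name, tally[name])
```

Every sign of the sentence falls into exactly one of the three groups, so the three counts add up to the total number of signs.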

***

Please note that for someone who was taught how to read and write Chinese as a child, or has become proficient at a later age, an analysis like the one above is totally unnecessary and probably quite awkward. This is because this person will have internalized all the signs and will use them without thinking, in the same way that we write letters or numbers without thinking. Insofar as a person who is used to writing Chinese will hesitate when reading or writing signs, it will be with the same attitude with which we approach words in English whose spelling and pronunciation are somewhat arbitrary, and may temporarily slip from our minds.