漢字
blog

Terminology

This list is a longer version of the list of terms that I use for the etymology sections of the entries of my database of Chinese and Japanese characters.


Allofam

(plural allofams) (usually in Sino-Tibetan linguistics) a linguistic form (attested or reconstructed word) which belongs to a particular word family. For example, Proto-Sino-Tibetan *r-na would be the allofam from *r/g-na which gave rise to Tibetan rna ("ear").¹

Allograph

Allographs are characters that look different but are used to write the same word. Strictly speaking, only characters that are completely identical in the way they are used are allographs. However, the term is often used in a wider sense that includes characters that show some differences in usage compared to their variant form. (Qiú, 2000, p. 297)

Arbitrary sign

Qiú defines sign as follows: “Graphic symbols ... without any relationship either on the semantic or phonetic level are signs.” (Qiú, 2000, p. 15) For clarity I use arbitrary sign. (See also Sign (1) and Sign (2) below.)

Note that what counts as an arbitrary sign sometimes depends on the knowledge of the reader. A reader who knows how a graph has been standardized might still be able to recognize the original shapes. Likewise, a reader who knows classical Chinese might be able to recognize that a component is a phonetic due to knowledge of ancient graphs that share this phonetic. (Qiú, 2000, p. 57.)

Bronze inscriptions

Wikipedia has a decent introduction to the timespan and varieties of these character forms: Chinese bronze inscriptions.

Clerical script

Chinese lìshū 隷書 (simplified 隶书), Japanese reisho 隷書, English “clerical script” is a style of writing that was in common use during the Han dynasty (206 BCE–220 CE). It was a much more efficient way of writing than Seal script and shares a rectilinear structure with modern script. In contrast with modern script it “tends to be square to wide, and often has a pronounced, wavelike flaring of isolated major strokes, especially a dominant rightward or downward diagonal stroke.” The bureaucracy adopted it from a popular cursive script that was already in use. While much more efficient, it often obscured the original structure of characters. (Wikipedia, accessed 2019-07-05; Qiú, 2000, pp. 126-130.)

Composite graph

Also composite character. Symbols can be used to stand for words. When those symbols are made up of several graphic components they can be called composite graphs. Most Chinese and Japanese graphs are composite graphs. For example, the Chinese graph “花” is a symbol for the Chinese word “flower” (pronounced in Mandarin Chinese as huā). It can be analysed as consisting of two graphic components. On top is what looks like two plus signs (艹), an abbreviation of the symbol that signifies “grass”. The bottom part, 化, is known to have the pronunciation huà. Together the components make it easy to identify the word “flower”. This type of structure is called a phonogram, and most Chinese characters are of this type. Other kinds of composite graphs are semantic compounds and deictic graphs.

Since the Qin and Han dynasties (ending ca. 220 CE), all new composite characters have been built from already existing graphic components. Most of those components are adopted from simpler graphs (like pictograms) and from other composite graphs. Very frequently used components were abbreviated, as for example “grass” mentioned earlier. Another example is “water” (水), which is usually abbreviated as 氵. (After Qiú, 2000, pp. 13-14.)

Code point in Unicode

A Unicode code point is an abstract entity that describes a character and assigns it a unique number for its use in software. Some characters only exist as a combination of code points, or take a special form in combination with other code points. A few characters exist both as a unique code point and as a combination of code points. Most code points describe characters that differ from other characters in their meaning, but some differ only in their representation or style (see also variant in Unicode).
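The point about characters that exist both as a single code point and as a combination of code points can be illustrated with a small sketch. It uses Python's standard `unicodedata` module; the helper name `code_points` and the choice of “é” as the example character are my own assumptions, not something taken from the sources cited here.

```python
import unicodedata


def code_points(text):
    """Return the code points of `text` in the usual U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The same visible character "é", stored in two different ways.
precomposed = "\u00e9"   # one code point: LATIN SMALL LETTER E WITH ACUTE
combined = "e\u0301"     # two code points: "e" + COMBINING ACUTE ACCENT

print(code_points(precomposed))  # ['U+00E9']
print(code_points(combined))     # ['U+0065', 'U+0301']

# Compared naively, the two strings are different ...
print(precomposed == combined)   # False

# ... but they are equal after normalization to the composed form (NFC).
print(unicodedata.normalize("NFC", combined) == precomposed)  # True
```

Software that compares or searches text therefore usually normalizes it first, so that both spellings of the same character are treated alike.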

Deictic graph

Also “deictograph.” A small category of characters that look like pictorial graphs, but have a mark or symbol added to them to convey a modified meaning. For example, 木 “tree” is derived from a representation of a tree, with branches on top and roots at the bottom. The graph 本 adds a mark to the bottom of the graph to signify “roots.” Another example takes 刀 “knife” and adds a line on or near the blade of the knife, giving 刃, to indicate “blade” (Qiú, 2000, p. 183). In Chinese and Japanese the term is 指事.

Determinative

See signific.

Empty component

A “component [that] provides an empty form, which is not related to its full form, meaning, or sound” (Outlier). For example, 美 “beautiful” originally depicted a person wearing a headdress. The modern form consists of two elements: on top it has 羊 “sheep” and at the bottom 大 “large”. While these components have meaning on their own, in 美 their meanings are irrelevant. See also standardization.

Free or bound

Almost all Chinese characters point to separate morphemes. Morphemes are the smallest meaningful units in a language. A morpheme can either be a bound morpheme, that can appear only as part of a larger expression, or a free morpheme that can stand alone. A free morpheme corresponds to a word, while a bound morpheme can only be part of a word or expression.

Exceptions to the rule that most Chinese characters point to morphemes are for example rare disyllabic words like 蝴蝶 húdié “butterfly” and foreign loan words (see Wikipedia, here and here, accessed 2019-08-01).

Graphs and words

The difference between graphs and words. TODO (See Bottéro for a start).

Huìyì jiān xíngshēng

Traditional Chinese: 會意兼形聲 huìyì jiān xíngshēng, Japanese 会意兼形声 kaii-ken-keisei. It means “both semantic compound and phonogram”. It refers to a graph that can be analyzed both as a semantic compound and as a phonogram, usually because the graph contains a component that expresses meaning through its form and also lends its sound. See yìshēng.

Hanbun

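Ochiai explains hanbun (繁文) as: “When the structure of a character changed in later periods, this book calls the character form after the change hanbun.” (Ochiai, 2016, 用語解説 p. 95)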

Iconic

“Writing in which we can recognize things in the everyday world.” (Powell, p. 258) A longer explanation is here.

Linearization

According to Qiú, “the phenomenon of changing thick strokes to fine and replacing squared or rounded solid elements with lines” (Qiú, 2000, p. 70). See also standardization.

Loangraph

“A loangraph is a homophonous or nearly homophonous graph borrowed to write another word.” (Qiú, 2000, p. 261). A somewhat analogous example in English would be to use the sign “2” to write “to” (“will go 2 U”).

Oracle bone inscriptions

Oracle bone inscriptions (OBI for short), also called oracle bone script, are inscriptions on animal bones that were used in pyromantic divination (the bones were prepared, heated, and the resulting cracks were interpreted). The content of the inscriptions amounted to a kind of administration: date, who performed the divination, for whom, what question, what answer. They have been found for the later part of the Shang dynasty, probably between 1200 BCE and 1046 BCE. A few OBI have different content. OBI represent the oldest stage of the Chinese writing system for which there is proof. Some bronze inscriptions (BI) date back to the same period. The style of OBI and BI is usually different, because of the different materials used and the different purpose of the inscriptions. Regular writing at the time was done with a brush, and was most likely different in style as well.

Phonogram

Also phonosemantic compound. A graph that combines phonetic and semantic components. The phonetic should give a strong hint for the pronunciation of the word that the graph points to, while semantic components convey meaning(s) relevant for the word. Sometimes a component conveys both relevant sound and meaning (see yìshēng). Due to linguistic processes that cause changes in the pronunciation of words a phonetic may no longer predict the pronunciation of the word that its graph points to. In that case the designation of “phonogram” to characters has to be taken in a historical sense, since they have become semi-signs or (if the signific is unhelpful as well) arbitrary signs in practice.

Phonetic element

Qiú introduces phonetics that are part of compound characters as follows:

The phonetics of phonograms are also phonetic symbols. There are two kinds of phonetics. One kind is borrowed purely for the purpose of expressing sound, e.g., the phonetic “化” huà of the character “花” huā. Another kind also has a semantic relationship with the word represented by the phonogram. For example, a type of ear ornament made from jade is called {珥} ěr (homophonous with the character “耳” ěr “ear”); the character 珥 ěr consists of “玉” “jade” plus “耳” ěr (玉, when used as a component on the left side of a character, is written “王”). The component “耳” ěr is a phonetic bearing a semantic relationship to 珥 ěr. Phonetics of this type can be viewed as phonetic symbols which are concurrently semantic symbols. (Qiú, 2000, p. 17)

Phonetic elements in Chinese are almost always loaned phonetics (ibid.). Perhaps an example in English could be the following. One could write tonight as 2night. The graph “2” (normally used to write the word “two”) is in that case used as a loaned phonetic. That “2” also has a meaning is ignored in spelling 2night (just like the meaning of “化” is ignored in “花” above). On the other hand, the alternative way to write threesome, using the graph “3” in 3some, would be an example of the use of “3” as a loaned phonetic that gives both a pronunciation (its phonetic aspect) and meaning (its semantic aspect).

Phonetic with associated sense

As can be seen from the section on the phonetic, a phonetic element can suggest relevant meaning besides lending its sound. This often happens when the phonetic is also the protoform of the graph. The word being referenced used to be written with the phonetic on its own, but later a signific was added for clarification (often because the original graph was loaned for yet another word). In this case the phonetic can be said to have an “associated sense” (see also the section on Yìshēng).

Another way a phonetic can have such an “associated sense” is when the word that the graph signifies is linguistically related to the word the phonetic on its own was used to write. However, some scholars go very far in trying to ascertain whether a phonetic may have had a relevant sense, and their theories may become very hypothetical and disputed by other scholars. The reason for this is simple: it is very difficult to reconstruct the original pronunciations that graphs had when they were created, and to determine whether certain words could possibly have an “associated sense”.

Pictogram

Also pictograph or (in Qiú) pictorial graph. It refers to a graph that through its graphic form resembles some physical object.

Note that while a graph may recognizably depict some object, as a sign it does not stand for this object. Instead, it points to a word in a given language, and this word may not be obvious at all. To give an example, a pictogram of a human being with arms and legs spread may point to a word “giant”, “man”, “person”, “large”, etc. Additionally, in Chinese writing, most so-called “pictograms” have been modified and simplified in their form to the extent that one first has to learn what the graph is supposed to represent. (See Qiú, 2000, pp. 175-183; Powell, p. 3.)

Protoform

Ochiai explains protoform (初文) as: “When the structure of a graph later on has changed, [we] call the earlier form of the character the protoform.” (Ochiai, 2016, 用語解説 p. 95)

Quasi-pictorial graph

Qiú uses the term “quasi-pictorial graphs” for graphs that represent concrete things but point not to the words for those things but to words for “attributes, states or actions” (Qiú, 2000, p. 185). An example is writing 大 (a depiction of a person standing with arms and legs wide) to denote the word “large”.

Seal script

From Wikipedia:

Seal script (Chinese: 篆書; pinyin: zhuànshū) is an ancient style of writing Chinese characters that was common throughout the latter half of the 1st millennium BCE. It evolved organically out of the Zhou dynasty script. The Qin variant of seal script eventually became the standard, and was adopted as the formal script for all of China during the Qin dynasty. It was still widely used for decorative engraving and seals (name chops, or signets) in the Han dynasty. The literal translation of the Chinese name for seal script, 篆書 (zhuànshū), is decorative engraving script, a name coined during the Han dynasty, which reflects the then-reduced role of the script for the writing of ceremonial inscriptions.
(Accessed Thu Jun 20 18:39:07 CEST 2019, page: <en.wikipedia.org/wiki/Seal_script>.)

Semantic compound

A semantic compound combines meaningful elements that are not being used for their phonetic value.

This term I use as the equivalent for Japanese kaii moji 会意文字, Chinese huìyìzì 会意字. Qiú uses syssemantograph, which I find a horrible word. Elsewhere it’s often compound ideograph, but I agree with those scholars that think that the terms “ideograph” and “ideogram” can be confusing.³

Semantograph

A cover term for characters that use only semantic symbols and therefore have only a semantic relationship with the words they represent. Traditional subcategories are “pictographs”: xiàngxíngzì 象形字 (Japanese: shoukei moji, 象形文字), “deictic symbols” zhǐshìzì 指事字 (Japanese: shiji moji, 指事文字), “syssemantographs” huìyìzì 会意字 (Japanese: kaii moji, 会意文字). (Qiú, p. 15.)

Semantographic symbol

(Chinese: yìfú 義符) Qiú makes a distinction between pictographic symbols and semantographic symbols, but admits that some symbols can be seen as both. Pictographic symbols express meaning “by means of their shape” and “frequently they cannot function independently as graphs.” As an example Qiú writes about 步, which originally consisted of two representations, a left foot and a right foot. The left foot was also used as an independent graph (now 止) but the right foot was never used independently. A semantographic symbol has meaning because its shape is the same as (or derived from) an independent graph, and has a known meaning because of that. Only when its shape is still recognizably pictographic may it be interpreted as either. (See Qiú pp. 53-54.)

Semasiography

Following Powell. “Writing in which the signs are not attached to necessary forms of speech.” This includes signs that are abstract, like the slash and border in 🚭 (no smoking), or representational (like the cigarette in the no smoking sign) or representational but symbolizing, for example using metaphor or metonymy (an hourglass for time, hoofprints for a horse). “[I]n semasiography the marks on the material basis always communicate information without the necessary intercession of forms of speech. Such signs are a form of writing [...] because they communicate information by means of material marks with a conventional reference. However, they represent concepts and meanings directly and may themselves to some extent constitute a language independent of speech, though such signs can sometimes be interpreted in speech.” Other examples are: musical notation, mathematical notation, computer icons. Powell defines a sematogram as an element of semasiography.

Semi-sign graph

Graphs whose form has changed, as a result of which one component ended up as an arbitrary sign. Most of these characters were originally phonograms. For example, in the graph 春, the top element is a fusion of 艸 and 屯. Originally 屯 acted as phonetic. Now the top element has an arbitrary shape that has neither a phonetic nor a semantic function. However, since 日 at the bottom is still recognizably the graph “sun”, it still has a semantic value. Hence, the graph can be categorized as a “semi-sign graph” (Qiú, 2000, p. 20).

Sign (1)

Qiú uses this term as follows: “Graphic symbols which have a semantic relationship with words represented in the script are semantic symbols; those which have a phonetic relationship are phonetic symbols; those without any relationship either on the semantic or phonetic level are signs. Alphabetic writing systems use only phonetic symbols; the Chinese writing system uses all three types of symbols.” (Qiú, p. 15; my emphasis.) Note that Qiú contrasts the “Chinese writing system” with “alphabetic writing,” which are dissimilar categories. The writing systems of the West that use alphabets also use signs: most importantly numbers, but also symbols like & or $. Classically there are also semantic signs like arrows → and such, and since the turn of the century there is a growing number of new semantic symbols (for example emoticons and emoji like ;-) or 😀). That is excluding all the semantic signs that live outside of books, in the street or on equipment. Note also that Qiú’s use of the term “sign” is much more limited than that of Powell (see semasiography). For Powell a sign can be representational. For Qiú a sign is unintelligible on its own; its meaning needs to be agreed upon. The example of “no smoking,” which combines both a representation of a cigarette and abstract symbols (the slash and the border), is for Qiú a semi-sign (Qiú, p. 20). To avoid confusion I will try to use arbitrary sign for Qiú’s meaning.

Sign (2)

Powell defines “sign” as “something that stands for something else; same as symbol except a sign is part of a system. A collection of certain kinds of signs makes up a system of writing; loosely, any graphic character.”

Signific

The part of the graph that is specifically added to classify a graph semantically, to determine approximately what kind of meaning the word might have that the graph points to. Same term as determinative.

For example “亻” (derived from 人 as “person”) is often added to words that have to do with human affairs. The graph 他 pertains to other people, or the personal pronoun he/she/it; the graph points to an official; the graph points to a servant; etc.

In the past the term “radical” might have been used in this sense. That usage was unfortunate, because strictly the term “radical” has to do with the “root” or “origin” of a word, which is not what a signific is about. In fact, the other part of the graph is much more informative in that regard, because it points directly to the word in question, while the signific merely puts the word in a category. For that reason it’s preferable to use the term signific or the term determinative, and avoid using “radical.”

Simplified characters

Specifically, this refers to characters that were introduced in Japan after 1945 and in the PRC after 1950 and 1960. Almost without exception, they replaced more complex character forms. Often the new forms adopted cursive forms that were already in use in handwriting. The simplified characters introduced in the PRC overlap partly with those of Japan. The so-called “traditional characters” that they replaced are still in use in Hong Kong, Macau, and Taiwan. More broadly it may refer to all simplifications and abbreviations, mostly (but not exclusively) in handwriting. See Wikipedia: Simplified Chinese characters and Shinjitai.

Standardization

The process in which unique or difficult-to-draw shapes are redrawn with easier or familiar shapes, often regardless of their meaning. For example, in the lines that curve around the vertical line () have been replaced with bow , ignoring its meaning. See also linearization and empty component.

Style of writing

The earliest still existing examples of Chinese writing are oracle bone inscriptions and some early writing on bronze objects. There are reasons to believe that most writing at the time was actually done with brush on bamboo or silk, but for the earliest writings no examples written with brush remain. This means that the most common style of early writing is unknown.

Style of writing changed over time, but it also differed within a given period. Factors that influenced the style included the target audience (which determined the level of formality) and the material that was being written on.

Formal writing was written with more care, making the writing more beautiful according to the aesthetics of the time. In early writing on bronze (which often was very formal, made to impress the reader or viewer) this resulted in characters that had rounder shapes, higher iconicity (more effort was made to show clearly what real world objects, if any, graphs referred to), and sometimes ornamental features that were lacking in regular writing. In writing on animal bones, the hardness of the material made it very difficult to create flowing round shapes, hence its style was much more rectangular and economical. Still, these writings were also quite formal, given the context of divination at the court, usually involving the king himself. Occasionally bits of more informal writings were preserved.

In later times styles of writing changed due to the demands of efficiency of writing with the brush, and changing aesthetics. Formal writing usually lagged behind the new developments. A somewhat rough classification can be made between “regular script” used in official or formal contexts and script forms that showed everyday forms of writing or artistic forms of writing. Regular script forms are known by designations like Seal script, Clerical script, Standard script. An example of another script form is cursive script.

Syssemantograph

This term is used in Qiú for huìyìzì 會意字. I will use semantic compound.

Variants in Unicode

Some variants of characters have separate code points in Unicode. A well-known example is the difference between 說 and 説, which have a slightly different form. In print these characters have exactly the same meaning. However, because each has been assigned its own code point in Unicode, software attuned to some geographical areas may recognize the one but not the other. For example, in Japanese online stores the book title 說文解字 may be unknown, and only the variant 説文解字 will be recognized. On the other hand, the algorithm of a global search engine may recognize that the graphs are more or less identical and return results for both.

In recent years the number of variants that have their own unique Unicode code point seems to be rising. Some of these characters may only be of historical interest, and not be well supported. See this section of UAX, specifically the remarks on the kZVariant field.
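The book title example can be checked with a few lines of Python. This is only a rough sketch: it assumes the two variant graphs are U+8AAA and U+8AAC, and the table `VARIANT_FOLD` and the function `fold_variants` are hypothetical, hand-made stand-ins for whatever variant data a real search engine might use.

```python
import unicodedata

# The two variant graphs, written as escapes so they cannot be confused.
traditional = "\u8aaa"   # form assumed for the title as printed in Taiwan
japanese = "\u8aac"      # form assumed for the title as printed in Japan

title_a = traditional + "文解字"
title_b = japanese + "文解字"

# A plain string comparison treats the two titles as different.
print(title_a == title_b)  # False

# Unicode normalization does not help here: the two graphs are separate
# characters, not two encodings of the same character.
nfkc_a = unicodedata.normalize("NFKC", title_a)
nfkc_b = unicodedata.normalize("NFKC", title_b)
print(nfkc_a == nfkc_b)  # False

# Hypothetical folding table: map one variant onto the other before comparing.
VARIANT_FOLD = {traditional: japanese}


def fold_variants(text):
    """Replace every character listed in VARIANT_FOLD by its chosen variant."""
    return "".join(VARIANT_FOLD.get(ch, ch) for ch in text)


print(fold_variants(title_a) == fold_variants(title_b))  # True
```

A store that compares titles literally behaves like the first comparison, while a search engine that folds variants behaves like the last one.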

Yìshēng

Chinese yìshēng 亦聲, Japanese ekisei 亦声, literally “also sound”. This term refers to components of graphs that indicate a word both through their form (often because the component on its own is a pictogram) and through their sound (they act as a phonetic).

Ochiai describes it as: “When in a semantic compound a part not only expresses meaning but also sound, that part is called ekisei. Also, when in a phono-semantic compound the phonetic not only expresses a sound but also meaning, that is also treated as an ekisei.” (Ochiai, 2016, p. 95 in section 用語解説)

Sometimes scholars seem to use this term to classify a graph as a whole that contains such a component. Alternatively, such a graph may be classified as 會意兼形聲 (traditional Chinese, huìyì jiān xíngshēng) or 会意兼形声 (Japanese, kaii-ken-keisei), meaning “both semantic compound and phonogram” (see huìyì jiān xíngshēng).


2. qiú
3. I’m certainly not the only person to use “semantic compound” as equivalent for 会意 graphs. For example, it’s also used in Hsuan-Chih Chen, Jian Wang, Ralph Radach, Albrecht Inhoff, Reading Chinese Script: A Cognitive Analysis. Psychology Press, 1999.

References