Meaningful Learning System of Chinese Characters Based on the Chinese Character Network

We are a group of researchers applying network analysis to Chinese characters, with the aim of helping teachers and learners teach and learn Chinese characters more efficiently. Most of us have training and research experience in physics, mathematics, or computer science. Our work builds on the meaningful structural connections among Chinese characters: a more complicated character can often be decomposed into a few simpler sub characters, and this decomposition often reveals the connection between the pronunciation and/or meaning of the character and those of its sub characters. We present these connections as a network of Chinese characters, which we refer to as the Chinese Character Network (CCNet) or Chinese Character Map (CCMap).

Using this map, we ask the following two questions:

  1. Learning Order Algorithm (LOA): what is the personalized optimal learning order for a given learner?
  2. Diagnosis Algorithm (DA): how can we efficiently test/diagnose which characters are known to the learner?

Basically, the idea behind the first question is to make use of the CCMap so that by the time a more complicated character is learned, its sub characters have likely been learned already, and so that characters with higher usage frequency are likely learned before less frequently used ones. The idea behind the second question is that when a character is tested and found to be, for instance, unknown to a learner, it is very likely that the more complicated characters built on this character are also unknown to the learner.
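The two ideas above can be sketched in a few lines of Python. This is only a simplified illustration, not our published LOA or DA: it treats "sub characters first" as a strict rule rather than a likelihood, it ignores personalization, and the names COMPONENTS, FREQ, learning_order and likely_unknown, as well as the frequency numbers, are invented for this example.

```python
import heapq

# Toy data, invented for illustration (not taken from the CCMap dataset):
# each character maps to its direct components; FREQ holds made-up usage counts.
# Assumption: every component also appears as a key of COMPONENTS.
COMPONENTS = {
    "木": [], "日": [], "召": [], "灬": [],
    "林": ["木"], "昭": ["日", "召"], "照": ["昭", "灬"],
}
FREQ = {"日": 90, "木": 60, "照": 50, "林": 40, "召": 20, "昭": 10, "灬": 1}


def composites_of(components):
    """Reverse the map: component -> characters that directly contain it."""
    result = {char: [] for char in components}
    for char, parts in components.items():
        for part in set(parts):
            result[part].append(char)
    return result


def learning_order(components, freq):
    """Order characters so that components come first, preferring frequent ones."""
    built_on = composites_of(components)
    # Number of components of each character that have not been learned yet.
    remaining = {char: len(set(parts)) for char, parts in components.items()}
    # Max-heap on frequency (negated because heapq is a min-heap).
    ready = [(-freq[char], char) for char, n in remaining.items() if n == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, char = heapq.heappop(ready)
        order.append(char)
        for composite in built_on[char]:
            remaining[composite] -= 1
            if remaining[composite] == 0:
                heapq.heappush(ready, (-freq[composite], composite))
    return order


def likely_unknown(components, unknown_char):
    """Characters built directly or indirectly on a character found to be unknown."""
    built_on = composites_of(components)
    found, stack = set(), [unknown_char]
    while stack:
        for char in built_on[stack.pop()]:
            if char not in found:
                found.add(char)
                stack.append(char)
    return found


print(learning_order(COMPONENTS, FREQ))
print(likely_unknown(COMPONENTS, "召"))
```

In this sketch a frequent character such as “照” still waits for all of its components. A softer rule that trades frequency against structure is closer to what "likely" means in the paragraph above.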

We have now made some progress on the CCMap and on the two algorithms, LOA and DA. While we are working hard to further improve all three, we think it is time to share them with teachers, learners and researchers of Chinese characters, to make the learning and teaching of Chinese easier. Furthermore, the same principle can be applied to the teaching and learning of other disciplines: if one first constructs the map of concepts of a discipline, then it is possible that our LOA and DA (with some minor adjustments) can also help the teachers and learners of that discipline.

Besides us, many other researchers and teachers have noted that meaningful structural connections can be helpful in learning Chinese characters. For example, throughout Chinese history there have been quite a few Chinese character books (Zishu, 字书), such as the most famous and still-in-use ShuoWenJieZi (说文解字). The task of such etymological Zishu is to decompose characters into sub characters and to explain their sound and meaning from those sub characters, with the aim of helping teachers and learners teach and learn Chinese characters more efficiently and meaningfully. More recently, experts such as Ning Wang, Bisong Lv, Pengpeng Zhang and Joël Bellassen have also urged that Chinese characters be taught meaningfully by taking their connections into account, instead of primarily through rote learning of each individual character (many learners are told that to learn Chinese characters, they have to go home and write each character tens or hundreds of times). A straightforward example is the connection between “木” (tree), “林” (woods) and “森” (forest): a single symbol “木” stands for a tree (to a certain degree, the symbol does look like a tree), two tree symbols together (“林”) mean many trees, such as an area of woods, and thus, naturally, stacking three tree symbols together (“森”) means a large area of trees, hence a forest. One does not need to learn the three characters individually by pure repeated memorization and recall.

Furthermore, if one learns the structure and original meaning of characters meaningfully via decomposition and recombination, it is possible that later on the meanings of words can be learned meaningfully with the same decomposition-recombination approach, from the connection between words and their characters (note that Chinese words often consist of several characters). The same may hold between the meanings of sentences and words, between paragraphs and sentences, between articles and paragraphs, and so on.

With this decomposition-recombination approach and meaningful learning in mind, our task becomes to identify the decomposition of all commonly used Chinese characters and, given such a decomposition, to work out how it can help teachers and learners teach and learn Chinese characters better. Therefore, we first established the Chinese Character Map (CCMap) and then studied the Learning Order Algorithm (LOA) and the Diagnosis Algorithm (DA).

We constructed the CCMap by collecting, comparing and evaluating the decompositions found in various resources, which are listed under “List of sources and references” in this document. We have to emphasize that for most characters we do not come up with our own decomposition. Instead, we choose one or sometimes two decompositions and etymological explanations from the resources, and we cite the corresponding resources in our data.

The basic ideas of LOA and DA have been discussed above. More detailed information can be found in our research papers, which are listed in the “How to cite us” section at the end of this document.

We thus set up this website based on these three things (CCMap, LOA and DA), and we hope it will help teachers and learners from all over the world make their teaching and learning of Chinese much simpler and much more efficient.

Figure 1: The full map of Chinese characters, with their meaningful connections.

On this CCMap, each node corresponds to a Chinese character. If two Chinese characters (say A and B) are combined to form a more complicated character (say C), then a directed line is drawn between A and C and also between B and C, meaning that A is part of C and B is also part of C. To keep the lines meaningful, we decompose the characters hierarchically. For example, when A is itself formed by A1 and A2, so that A1 and A2 are also part of C, we do not link A1 and A2 directly to C; instead we link A1 and A2 to A, and then A to C. The reason is that the connection between A1, A2 and C is often less meaningful than the connection between A and C and the connections between A1, A2 and A. This is called hierarchical decomposition. Only when the connection between A1, A2 and C is more meaningful than that between A and C (that is, when the pronunciation and/or meaning of C can be well understood from A1 and A2) do we connect A1 and A2 directly to C. The character “照” (C; meaning that light “shines” on something, pronounced “zhao”) is an example of hierarchical decomposition: “昭” (meaning “bright” light, or “reveal the truth”, pronounced “zhao”) is A and “灬” (meaning “fire”) is B, while “日” (meaning “the sun”) and “召” (pronounced “zhao”) are A1 and A2. Instead of linking “日”, “召” and “灬” (A1, A2 and B) directly to “照” (C), it clearly makes more sense to connect “日” (A1) and “召” (A2) to form “昭” (A), and then combine “昭” (A) and “灬” (B) to form “照” (C). This is illustrated in Figure 2.
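As a small illustration of the “照” example, the hierarchical decomposition can be written down as a mapping from each character to its direct components. This is a sketch only: the names DECOMPOSITION, build_edges and basic_components are invented here, the choice to store each line as a (component, character) pair is an assumption, and the layout does not describe the format of our released dataset.

```python
# Direct (hierarchical) decomposition of the "照" example from the text.
# Characters that are not keys are treated as not decomposed further.
DECOMPOSITION = {
    "昭": ["日", "召"],  # A1 + A2 -> A
    "照": ["昭", "灬"],  # A + B -> C
}


def build_edges(decomposition):
    """Return directed pairs (component, character): component is part of character."""
    return {
        (part, char)
        for char, parts in decomposition.items()
        for part in parts
    }


def basic_components(decomposition, char):
    """Follow the hierarchy down to components that are not decomposed further."""
    parts = decomposition.get(char)
    if not parts:
        return [char]
    result = []
    for part in parts:
        result.extend(basic_components(decomposition, part))
    return result


edges = build_edges(DECOMPOSITION)
print(("日", "昭") in edges)  # direct line: "日" is part of "昭"
print(("日", "照") in edges)  # no direct line: "日" reaches "照" only through "昭"
print(basic_components(DECOMPOSITION, "照"))
```

Storing only the direct lines keeps the map hierarchical, while the indirect relation between “日” and “照” can still be recovered by following the lines.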

Also, our decomposition stops at characters, instead of strokes, which we do not regard as meaningful components.

Figure 2: Meaningful and hierarchical decomposition, which stops at characters instead of strokes.

Our decomposition of characters may be improved further, and our etymological explanation might be wrong in some cases. Some of the ancient fonts of characters may also need further improvement. For that we are designing our user system to collect user feedback. With your love and help, we can make an even bigger difference, and make the world even better.

At the current stage, our CCMap is complete to a reasonable degree and is therefore fully released on this website. LOA and DA, and our user system based on LOA and DA, are still in development. When they are ready, we expect this website to have the following functions.

How can our data and website be used?

We provide:

  • CCMap and Etymological explanation
    1. Simplified Chinese character
    2. Traditional Chinese character
    3. Pronunciation (Pinyin)
    4. Etymological Explanation mostly for the original meaning
    5. Translated Etymological Explanation (TBA)
    6. English Meaning
    7. Usage frequency
    8. Components (sub characters)
    9. Ancient Fonts (Oracle-bone script, Bronze script, Seal script)
    10. Connections to other characters (Local CCMap centered at the targeted character), including
      • The targeted character (at the center)
      • Its components
      • The more complicated characters composed directly by the targeted character (20 characters at most, chosen from highest usage frequencies)
  • User system (Under development)
    1. To record users’ learning history if authorized by the users
    2. To calculate users’ personalized optimal learning order of Chinese characters
    3. To generate (also printable) learning materials based on the personalized learning order
    4. To diagnose users’ known/unknown characters efficiently
    5. To collect users’ feedback on decomposition, etymological explanation and users’ experiences
  • Chinese Character Map in PDF and JPG format
  • Chinese Character Map dataset, including decomposition, usage frequency, optimal learning order (Excel file, CSV file, Readme)
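The local CCMap described in item 10 above (the targeted character, its components, and at most 20 of the characters composed directly from it, chosen by highest usage frequency) could be assembled roughly as follows. This is a hedged sketch: the function name local_map, the dictionary layout, the example decompositions and the frequency numbers are all invented for illustration and do not come from our dataset.

```python
# Toy data, invented for illustration only.
COMPONENTS = {
    "日": [],
    "月": [],
    "召": [],
    "昭": ["日", "召"],
    "明": ["日", "月"],
}
FREQ = {"日": 90, "月": 70, "明": 80, "召": 20, "昭": 10}


def local_map(components, freq, target, limit=20):
    """Collect the target, its components, and its most frequent direct composites."""
    composites = [char for char, parts in components.items() if target in parts]
    # Highest usage frequency first; unknown frequencies count as 0.
    composites.sort(key=lambda char: freq.get(char, 0), reverse=True)
    return {
        "center": target,
        "components": list(components.get(target, [])),
        "composites": composites[:limit],
    }


print(local_map(COMPONENTS, FREQ, "日"))
print(local_map(COMPONENTS, FREQ, "昭"))
```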

List of sources and references:

  1. ShuoWen (Analytical Dictionary of Chinese Characters), Xu Shen
  2. Zi Yuan (Chinese characters Etymology), Xueqin Li, Tianjin Press for Classic Books, 2012
  3. Han Dian (Chinese characters dictionary):
  4. Multi-function Chinese Character Database:
  5. A Study of the Chinese Characters Recommended for the subject of Chinese Language in Primary Schools:
  6. Chinese Text Project:
  7. Guo Xue Da Shi (Master of Chinese culture):
  8. Xiao Xue Tang (Small school):
  9. Xiang Xing Zi Dian (Hieroglyphic dictionary):
  10. Chinese Etymology:
  11. Corpus Online (YuLiaoKuZaiXian):
  12. Chinese text computing:

Key ideas behind this work

First, learning things meaningfully (asking why, beyond what and how) helps one learn more efficiently. Second, one way to learn meaningfully is to make use of connections, which for Chinese characters means the connections among characters in their formation, pronunciation and meaning. Third, both direct and indirect connections can be useful, and mathematics helps to figure out indirect connections from direct ones. For more information on the last point, please read our papers listed in the “How to cite us” section.

Call for collaboration

If you have usage frequency data on Chinese characters for kids and are willing to share it, please contact us. Together, let us make it easier for kids all over the world to learn Chinese!

How to cite us

Xiaoyong Yan, Ying Fan, Zengru Di, Shlomo Havlin, Jinshan Wu, Efficient learning strategy of Chinese characters based on network approach, PLoS ONE, 8, e69745 (2013) DOI: 10.1371/journal.pone.0069745 Or this website (Meaningfully Learning Chinese Characters

Members of this group: Xiaoyong Yan (闫小勇), Yukun Song (宋玉鲲), Zhesi Shen (沈哲思), Jianzhang Bao (鲍建樟), Jinshan Wu (吴金闪)

Acknowledgement: The project is supported by CATL (Contemporary Amperex Technology Co., Limited).

Contact us:,,

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. If you would like to make commercial or other developments based on our data but beyond this agreement, please contact us.