A brief introduction to SCUT-COUCH2009

  The SCUT-COUCH2009 database is a comprehensive database consisting of 12 datasets: GB1, GB2, TradGB1, Big5, Pinyin, Letters, Digit, Symbol, Word8888, Word17366, Word44208 and online textline_NU. In particular, SCUT-COUCH2009 contains handwritten samples of 6,763 single Chinese characters in the GB2312-80 standard, 5,401 traditional Chinese characters in the Big5 standard, 1,384 traditional Chinese characters corresponding to the level 1 characters of the GB2312-80 standard, 8,888 frequently used Chinese words, 17,366 daily-used Chinese words, 44,208 complete words from “The Contemporary Chinese Dictionary (the fourth edition)”, 2,010 Pinyin, 184 daily-used symbols and 8,809 online text lines. The current version of SCUT-COUCH2009 was collected with PDAs (Personal Digital Assistants) and smart phones with touch screens. More than 190 different people contributed, resulting in more than 3.6 million handwritten samples.
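  For readers who want the figures above in machine-readable form, the following is a minimal Python sketch that tabulates the vocabulary sizes quoted in this introduction. The dictionary keys are descriptive labels chosen for this sketch, not official file or subset names, and the Letters and Digit datasets are left out because no counts are given for them here.

```python
# Vocabulary sizes as quoted in the introduction above.
# Keys are illustrative labels (an assumption of this sketch),
# not the official dataset or file names.
VOCABULARY_SIZES = {
    "GB2312-80 characters": 6763,
    "Big5 traditional characters": 5401,
    "traditional forms of GB2312-80 level 1 characters": 1384,
    "frequently used words": 8888,
    "daily-used words": 17366,
    "dictionary words": 44208,
    "Pinyin": 2010,
    "symbols": 184,
    "online text lines": 8809,
}


def total_entries(sizes):
    """Return the sum of all listed vocabulary sizes."""
    return sum(sizes.values())


if __name__ == "__main__":
    # Print one line per category, then the overall sum.
    for label, count in VOCABULARY_SIZES.items():
        print(f"{label:<52}{count:>8,}")
    print(f"{'total listed entries':<52}{total_entries(VOCABULARY_SIZES):>8,}")
```

  The sum counts distinct vocabulary entries only. It is not the number of handwritten samples, since each entry was written by many contributors.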

  The SCUT-COUCH2009 database has several important characteristics that other databases lack. It is the first publicly available large-vocabulary online Chinese handwriting database, and it contains multiple types of character and word corpus material, including various symbol, word and Pinyin samples. It provides basic datasets for new research topics such as online handwritten Chinese word and Pinyin recognition.

  For more details on the SCUT-COUCH2009 database, please refer to our paper: Lianwen Jin, Yan Gao, Gang Liu, Yunyang Li, Kai Ding. SCUT-COUCH2009: A Comprehensive Online Unconstrained Chinese Handwriting Database and Benchmark Evaluation. International Journal on Document Analysis and Recognition (IJDAR), vol. 14, no. 1, pp. 53-56, 2011.