
(31) Generality for Multi-language of Word Segmentation Method Using Inductive Learning
     Proceedings of Pacific Association for Computational Linguistics 2001, pp. 298-306, 2001-9

  We have proposed a word segmentation method for non-segmented languages that uses inductive learning. Because the method uses only the surface information of a character string, it has the advantage of not depending on any specific language. We have confirmed its effectiveness for Japanese and for Chinese word segmentation separately. In this paper, we discuss the generality of the proposed method across multiple languages. We carried out evaluation experiments on a large amount of data from the Japanese EDR corpus and the Chinese Sinica corpus, applying the same algorithm to both languages even though they differ considerably in grammar, structure, and morphology. The experimental results show that the proposed method is effective for both Japanese and Chinese word segmentation, and they suggest that it can be applied to multiple languages.