
(42) Inductive Learning of Rules for Information Extraction
        Proceedings of FIRST INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY &
        APPLICATIONS (ICITA 2002), pp.230-235, 2002-11

  Many information extraction systems exist to save readers the time of going through large numbers of documents. Information extraction is the task of extracting important information from a document. Conventional methods generally require many hand-prepared extraction rules, and the pattern of the extracted information must be fixed in advance. They are therefore effective in limited fields where it is clear what kind of information the user wants, but not when the user reads documents from various fields. In this paper, we propose an information extraction method for Japanese documents that uses Inductive Learning. The system learns what kind of information the user needs by acquiring extraction rules from correct answers given by the user. It uses two kinds of rules: one decides which sentences are important, and the other extracts the important words. Using these rules, the system adapts to the user dynamically: when the user's interest shifts to another topic, it can still extract the information the user wants. The system can thus extract important information from documents in various fields. We explain how important information is extracted, describe the two kinds of rules in detail, and evaluate the effectiveness of the proposed method. The recall and precision of the rules for deciding important sentences both exceed 80% once learning has progressed, so these rules are effective across various fields. However, the rules for extracting important words still have problems, namely the variety of output patterns and the method of applying the rules. We consider the causes and describe a solution.