Repository Universitas Lampung

Multi-Inductive Learning Approach for Information Extraction

Muludi, Kurnia and Widyantoro, Dwi H and Kuspriyanto, Kuspriyanto and Santoso, Oerip S (2011) Multi-Inductive Learning Approach for Information Extraction. Proceeding of 2011 International Conference on Electrical Engineering and Informatics. G1-1. ISBN 978-1-4577-0751-3


    Abstract

    The vast amount of information on the Internet is not easy to find and use. Information Extraction technology is one alternative that can solve this problem. The conventional Natural Language Processing approach is hampered by limited portability, scalability, and adaptability. Introducing Machine Learning into Information Extraction is one solution, and Inductive Learning needs only annotated training examples. The problem is that no single algorithm performs consistently across different information domains. Automatic, intelligent selection of a classifier from among various machine learning algorithms is one of the best ways to handle this problem. The goal of this paper is to propose a method for an Information Extraction system, based on Inductive Learning and Meta Learning, that performs well. To address this problem, Multi-Inductive Learning is developed. Multi-Inductive Learning consists of several Inductive Learning algorithms whose mechanisms differ significantly, which ensures variation in learning bias within the method. Through k-fold cross validation on the training documents, the Multi-Inductive Learning algorithm chooses the best classifier for each slot in a given domain. These best classifiers are then employed to perform full extraction on the test documents. The experiments show that Multi-Inductive Learning performs better than Information Extraction systems based on a single Inductive Learning algorithm. On the Reuters Corporate Acquisition dataset, Multi-Inductive Learning achieves a score of 46.3%, the best performance among state-of-the-art Information Extraction systems. Of the nine slots to be extracted, six are extracted with the best performance. Multi-Inductive Learning also performs better on the Job Posting dataset. Its average performance is 82.1%, the best among state-of-the-art Information Extraction systems. Of the 17 slots tested, nine are extracted with the best performance.
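
The per-slot selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two toy learners, the slot names `purchaser` and `seller`, the toy data, and the use of plain accuracy as the cross-validation score are all assumptions made for illustration. The record does not state which learners or scoring measure the paper uses.

```python
from statistics import mean


class MajorityLearner:
    """Hypothetical toy learner: always predicts the most frequent training label."""

    def fit(self, examples, labels):
        # sorted() makes tie-breaking deterministic (smallest label wins a tie).
        self.label = max(sorted(set(labels)), key=labels.count)

    def predict(self, example):
        return self.label


class NearestNeighbourLearner:
    """Hypothetical toy learner: 1-nearest-neighbour on a single numeric feature."""

    def fit(self, examples, labels):
        self.examples, self.labels = list(examples), list(labels)

    def predict(self, example):
        nearest = min(range(len(self.examples)),
                      key=lambda i: abs(self.examples[i] - example))
        return self.labels[nearest]


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over range(n)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_val_score(make_learner, examples, labels, k):
    """Mean accuracy over k folds (accuracy is an assumed scoring measure)."""
    scores = []
    for train, test in k_fold_indices(len(examples), k):
        model = make_learner()
        model.fit([examples[i] for i in train], [labels[i] for i in train])
        correct = sum(model.predict(examples[i]) == labels[i] for i in test)
        scores.append(correct / len(test))
    return mean(scores)


def select_best_per_slot(learners, slot_data, k=5):
    """For each slot, keep the name of the learner with the best k-fold score."""
    best = {}
    for slot, (examples, labels) in slot_data.items():
        scores = {name: cross_val_score(factory, examples, labels, k)
                  for name, factory in learners.items()}
        best[slot] = max(scores, key=scores.get)
    return best


# Hypothetical learners and toy training data, one (examples, labels) pair per slot.
LEARNERS = {"majority": MajorityLearner, "nearest_neighbour": NearestNeighbourLearner}
SLOT_DATA = {
    "purchaser": ([1, 9, 2, 8, 1, 9, 3, 7, 2, 8], [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]),
    "seller": ([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]),
}

best = select_best_per_slot(LEARNERS, SLOT_DATA, k=5)
print(best)
```

In this toy setup a different learner wins for each slot, which mirrors the motivation stated in the abstract: no single algorithm is consistently best, so the choice is made slot by slot on the training documents and the selected classifiers are then applied to the test documents.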

    Item Type: Article
    Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
    Divisions: Fakultas Matematika dan Ilmu Pengetahuan Alam (MIPA) > S1 Ilmu Komputer
    Depositing User: Kurnia Muludi
    Date Deposited: 26 Nov 2015 15:27
    Last Modified: 26 Nov 2015 15:27
    URI: http://repository.unila.ac.id/id/eprint/825
