AFAAN OROMOO INFORMATION EXTRACTION SYSTEM


dc.contributor.author AMEHA GERO BERSISA
dc.date.accessioned 2019-11-14T11:32:13Z
dc.date.available 2019-11-14T11:32:13Z
dc.date.issued 2019-04
dc.identifier.uri http://hdl.handle.net/123456789/1280
dc.description.abstract In today's digital world, handling the overload of electronic sports news documents is a critical issue. These text documents are an essential source of the most significant information about a selected sports news article. Hence, automated text analysis is an essential strategy when searching for this information in this domain. Information extraction (IE) is a systematic method for analyzing a given news text document and capturing the significant information it contains. As some recent studies have noted, the volume of Afaan Oromoo sports news text available in electronic form is increasing over time. Reading through these documents to capture and access the most relevant information on a football news topic is a time-consuming, tedious, and difficult task for users. The main objective of this study is to develop an automated information extraction system for Afaan Oromoo text documents using a supervised machine learning classification approach. The system extracts the most relevant football news information from Afaan Oromoo sports news text documents, and its core consists of a training phase and a prediction phase. To implement the Afaan Oromoo information extraction system (AOIES), Afaan Oromoo sports news documents collected from the Radio Fana Share Company Afaan Oromoo broadcasting service are used as the training and testing corpus. Tokenization, normalization, stop word removal, and regular expression methods are applied, together with the Naïve Bayes machine learning classification algorithm, to learn patterns from the training data. The standard precision, recall, and F-score metrics are used to evaluate the text classification and IE models of the developed system prototype. The proposed model is evaluated on the training and testing dataset using 10-fold cross-validation.
The classification module of the developed system achieved an F-score of 91.7%, and the IE model achieved an F-score of 94.6%. These results indicate that the prototype, using the Naïve Bayes classification algorithm, performs promisingly. Overall, the evaluation demonstrates that a machine learning classification algorithm can be adopted as an information extraction method for Afaan Oromoo text documents. en_US
dc.language.iso en en_US
dc.publisher Arba Minch University en_US
dc.subject Afaan Oromoo, Machine Learning, Naïve Bayes, Information Extraction en_US
dc.title AFAAN OROMOO INFORMATION EXTRACTION SYSTEM en_US
dc.type Thesis en_US
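
The pipeline named in the abstract (tokenization, normalization, stop word removal, Naïve Bayes classification, and precision/recall/F-score evaluation) can be illustrated with a minimal sketch. This is not the thesis implementation: the stop word list, the toy English sentences, the class labels, and all function and class names below are hypothetical stand-ins, and add-one (Laplace) smoothing is an assumption, since the record does not say which smoothing was used.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical placeholder list; a real system would use Afaan Oromoo stop words.
STOP_WORDS = {"the", "a", "in", "of"}


def preprocess(text):
    """Tokenize on word characters, lowercase (normalize), drop stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed, not from the thesis)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.word_counts = defaultdict(Counter)      # token counts per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(preprocess(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        tokens = [t for t in preprocess(doc) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F-score for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy training data (hypothetical, English stand-in for the news corpus).
train_docs = [
    "the team scored a late goal in the match",
    "the striker scored twice in the football final",
    "the runner won the marathon race",
    "the sprinter broke a record in the race",
]
train_labels = ["football", "football", "other", "other"]

model = NaiveBayes().fit(train_docs, train_labels)
```

With the model trained, `model.predict(text)` returns the most probable class label, and `precision_recall_f1` scores a list of predictions against the true labels. In a 10-fold cross-validation setup such as the one the abstract describes, the corpus would be split into ten parts, with each part held out once for testing while the model is fitted on the other nine.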

