INTERNATIONAL JOURNAL
Journal of Theoretical and Applied Information Technology
Arabic noun compound extraction has become a challenging issue in the field of NLP. Several approaches have been proposed for extracting Arabic noun compounds. Some of them use a linguistic-based approach, others use statistical methods, and the rest use a hybrid of the two. This research proposes a hybrid of a linguistic-based approach and a statistical method to address the extraction of nested Arabic noun compounds. The dataset was collected from the online Arabic newspaper archives of Aljazeera.net and Almotamar.net. Several pre-processing steps were carried out on the data, including transformation, normalization, stemming, and POS tagging. After that, an n-gram model generated bi-gram, tri-gram, 4-gram, and 5-gram noun compound candidates. Then three association measures, namely NC-value, pointwise mutual information (PMI), and log-likelihood ratio (LLR), were used to rank the candidates. The evaluation was performed using the n-best method against human annotation (manual selection by experts). NC-value outperformed PMI and LLR in extracting nested noun compounds.
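The candidate generation and ranking steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`ngrams`, `pmi_scores`, `n_best`) and the toy token list are hypothetical, only bi-grams are scored, only PMI is shown (not NC-value or LLR), and the pre-processing steps (normalization, stemming, POS tagging) are assumed to have been applied already.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def pmi_scores(tokens):
    """Score each distinct bi-gram with pointwise mutual information (base 2).

    Probabilities are simple relative frequencies; this is an assumption of
    the sketch, not a detail taken from the paper.
    """
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(ngrams(tokens, 2))
    total_unigrams = len(tokens)
    total_bigrams = sum(bigram_counts.values())

    scores = {}
    for (w1, w2), count in bigram_counts.items():
        p_xy = count / total_bigrams
        p_x = unigram_counts[w1] / total_unigrams
        p_y = unigram_counts[w2] / total_unigrams
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


def n_best(scores, n):
    """Return the n highest-scoring candidates, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Toy placeholder tokens, not taken from the paper's dataset.
tokens = "a b a b c d".split()
scores = pmi_scores(tokens)
top_candidates = n_best(scores, 2)
```

In an n-best evaluation, the top-ranked candidates returned by `n_best` would then be compared against the compounds selected by human annotators. Longer candidates (tri-grams up to 5-grams) can be produced with the same `ngrams` helper by changing `n`.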
| JI03220009 | 004.0285 JATIT J JurnalInternasional | STMIK Library (International Journal) | Available |
No other versions available