Journal of Theoretical and Applied Information Technology

INTERNATIONAL JOURNAL

Advances in digital technology and the World Wide Web have led to an increase in digital documents that are used for various purposes such as publishing and digital libraries. This phenomenon raises awareness of the need for effective techniques that can help during the search and retrieval of text. One of the most needed tasks is clustering, which categorizes documents automatically into meaningful groups. Clustering is an important task in data mining and machine learning. The accuracy of clustering depends tightly on the selection of the text representation method. Traditional methods of text representation model documents as bags of words using term frequency-inverse document frequency (TF-IDF). This method ignores the relationships and meanings of words in the document. As a result, the sparsity and semantic problems that are prevalent in textual documents are not resolved. In this study, the problems of sparsity and semantics are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation scheme is created through a combination of syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing in order to identify the sentence structure. Then the semantics of words are modeled using the dependency graph. The resulting dependency graph is then used in the process of cluster analysis. The K-means clustering technique was used in this study. The dependency graph-based clustering results were compared with those of popular text representation methods, i.e. TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both TF-IDF and ontology-based text representation. The findings show that the proposed text representation method leads to more accurate document clustering results.
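
The limitation that the abstract attributes to bag-of-words TF-IDF can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `tfidf_vectors` and `cosine`, the whitespace tokenizer, the `tf * log(N / df)` weighting, and the three toy sentences are all assumptions chosen for illustration. The sketch shows that two sentences with the same meaning but different vocabulary share no weighted terms, so their similarity is zero.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df).

    This weighting is an assumed, common TF-IDF variant; the study may use another.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus: documents 0 and 1 are paraphrases with different words.
docs = [
    "the car engine failed",
    "the automobile motor broke",
    "the car engine overheated",
]
vecs = tfidf_vectors(docs)
sim_synonyms = cosine(vecs[0], vecs[1])  # paraphrases, no shared content words
sim_shared = cosine(vecs[0], vecs[2])    # different events, shared surface words
print(sim_synonyms, sim_shared)
```

In this toy corpus the paraphrased pair scores lower than the pair that merely shares surface words, which is the kind of semantic gap a representation built on word relationships is meant to reduce.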

Availability

JI03220017 004.0285 JATIT J JurnalInternasional STMIK Library (International Journal) Available

Detailed Information

Series Title: -
Call Number: 004.0285 JATIT J JurnalInternasional
Publisher: USA: JATIT, 2015
Physical Description: -
Language: English
ISBN/ISSN: 1992-8645
Classification: 004.0285
Content Type: -
Media Type: -
Carrier Type: -
Edition: Volume 76, Number 1, 2015
Subject: -
Specific Detail Info: -
Statement of Responsibility: -
