
Classification of web documents based on key phrases in the informative part

Jurnal Keuangan dan Perbankan, Volume 6, Number 3, August 2008
Journal from JIPTUNMERPP / 2011-12-22 01:44:07
By : Amak Yunus E.P; Arif Djunaidy, Diploma 3 of Finance and Banking, Merdeka University Malang (jurkubank@yahoo.com)
Created : 2008-08-01, with file

Keywords : Web documents, Informative part

As the web grows, documents must be managed and classified in order to improve web performance. According to Pierre (2000), about one quintillion web pages are available, with about 1.5 million pages added each day. As a result, web pages vary widely in content, information, and quality. If the data are poorly organized, users will find it difficult to locate the information they want. An efficient means of classification is therefore needed to improve the quality of the information retrieved. A web page also contains parts that users are not actually looking for, such as advertisements, logos, and copyright notices. If classification is performed directly on the whole page, without first isolating the important part, the classification of web documents becomes inaccurate. This research therefore addresses how to classify web documents based on key phrases in the informative part of the document. The research proceeds in stages. The first stage extracts the informative part of a web document using the Feature Extractor method, the second extracts key phrases using the tf-idf method, and the last classifies the web page using the Bayesian method, which is known as a reasonably good text classifier. In the experiments conducted, the Feature Extractor method proposed by Zhang was found to extract the informative part of a web page, and it could be integrated into a program that classifies web pages based on key phrases from that informative part. Using the holdout method, the integration of the three modules (Feature Extractor, TF-IDF, and Naïve Bayesian) gave reasonably good results. Training data amounting to 25% of the total data gave a classification accuracy of 79%, whereas training data of 70% gave an accuracy of 89%.
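The last two stages described in the abstract (tf-idf key term selection followed by Bayesian classification, evaluated with a holdout split) can be illustrated with a minimal sketch. This is not the authors' program: it assumes each text has already been reduced to its informative part (the Feature Extractor stage is not reproduced), it ranks single words rather than multi-word key phrases, and the function names, the `top_k` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: whitespace tokens stand in for real key phrase candidates.
    return text.lower().split()

def tfidf_keywords(docs, top_k):
    """For each tokenized document, keep its top_k terms ranked by tf-idf."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    selected = []
    for doc in docs:
        tf = Counter(doc)
        scores = {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        selected.append(ranked[:top_k])
    return selected

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(doc)
        self.vocab = {t for c in self.term_counts.values() for t in c}
        return self

    def predict(self, doc):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.class_counts[label] / total)
            for t in doc:
                if t in self.vocab:  # terms unseen in training are ignored
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best

def holdout_accuracy(texts, labels, train_fraction, top_k=3):
    """Train on the first train_fraction of the data, test on the rest."""
    split = int(len(texts) * train_fraction)
    docs = [tokenize(t) for t in texts]
    train_keys = tfidf_keywords(docs[:split], top_k)
    model = NaiveBayes().fit(train_keys, labels[:split])
    test = zip(docs[split:], labels[split:])
    hits = sum(model.predict(d) == y for d, y in test)
    return hits / (len(texts) - split)

# Hypothetical toy data standing in for informative parts of web pages.
texts = [
    "bank loan interest rate credit",
    "football match goal team score",
    "credit bank deposit interest",
    "team player goal league",
    "loan credit bank rate",
    "match team score player",
]
labels = ["finance", "sport", "finance", "sport", "finance", "sport"]

print(holdout_accuracy(texts, labels, train_fraction=0.67))
```

Varying `train_fraction` mirrors the holdout experiment in the abstract, where different training shares are compared. A toy set of this size says nothing about the accuracies the authors report.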



Property : Property Value
Publisher ID : JIPTUNMERPP
Organization : D
Contact Name : Dra. Wiwik Supriyanti, SS
Address : Jl. Terusan Halimun 11 B
City : Malang
Region : Jawa Timur
Country : Indonesia
Telephone : 0341-563504
Fax : 0341-563504
Administrator E-mail : perpus@unmer.ac.id
CKO E-mail : wsupriyanti@yahoo.com


Contributors :

  • Editor: Wiwik Supriyanti, Dra. SS.