Pengelompokan (Clustering) berita berbahasa Indonesia dengan menggunakan K-Mean Clustering [Clustering of Indonesian-language news using k-means clustering]

Nahdliyah, Amiroh (2012) Pengelompokan (Clustering) berita berbahasa Indonesia dengan menggunakan K-Mean Clustering. Undergraduate thesis, Universitas Islam Negeri Maulana Malik Ibrahim.

Text (Full text)
05550027.pdf - Accepted Version
Restricted to Repository staff only
Available under License Creative Commons Attribution Non-commercial No Derivatives.
(1MB)

Abstract

ABSTRAK

Dengan semakin berkembangnya teknologi di sekitar kita mengakibatkan meningkatnya aliran informasi. Terdapat berbagai macam sarana untuk mendapatkan suatu informasi, salah satunya dengan melalui situs online berita berbahasa Indonesia. Maka dengan adanya pengelompokan dokumen teks secara otomatis ini diharapkan akan mempermudah pencarian informasi berita berbahasa Indonesia berdasarkan pengelompokan berita yang diinginkan.

Penelitian ini diawali dengan proses text preprocessing, yaitu pemrosesan teks untuk mendapatkan term kata. Metode ini terdiri dari case folding, tokenizing, dan filtering. Hasil dari proses ini kemudian dihitung bobot tf-idf dan bobot similarity antar dokumen dengan menggunakan cosine similarity. Hasil pembobotan antar dokumen itulah yang kemudian diproses menggunakan algoritma k-mean clustering untuk mendapatkan pengelompokan berita.

Dokumen berita untuk pengujian algoritma akan diambil dari surat kabar berbahasa Indonesia online. Adapun hasil uji coba menunjukkan bahwa pengelompokan berita berbahasa Indonesia dengan menggunakan algoritma ini memiliki nilai keakuratan yang cukup relevan. Nilai akurasi sistem akan meningkat dengan meningkatnya data input query. Terbukti dari hasil uji coba sistem dengan menggunakan 5 cluster dan 7 query serta 100 dokumen berita uji coba dihasilkan nilai akurasi pengelompokan dokumen maksimal 43% dan nilai akurasi pengelompokan dokumen minimal sebesar 8%.

ABSTRACT

The growth of technology around us has increased the flow of information. There are many ways to obtain information, one of which is through Indonesian-language online news sites. Automatic grouping of text documents is therefore expected to make it easier to find Indonesian-language news according to the desired news grouping.

This study begins with text preprocessing, that is, processing the text to obtain its terms. This stage consists of case folding, tokenizing, and filtering. The resulting terms are then weighted using tf-idf, and the similarity between documents is computed using cosine similarity. These inter-document weights are then processed by the k-means clustering algorithm to group the news.
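The stages named above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the thesis: the stopword list, the function names (`preprocess`, `tfidf_vectors`, `cosine`, `kmeans`), the smoothed idf formula, and the sample documents are all hypothetical, and the clustering step assigns each document to the centroid with the highest cosine similarity, which is one common way to combine cosine similarity with k-means.

```python
import math
import random
import re
from collections import Counter

# Hypothetical, very small stopword list used only for this illustration.
STOPWORDS = {"yang", "di", "dan", "ke", "dari", "ini", "itu"}


def preprocess(text):
    """Case folding, tokenizing, and filtering (stopword removal)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf_vectors(docs_tokens):
    """Return one unit-length sparse tf-idf vector (term -> weight) per document."""
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    vectors = []
    for tokens in docs_tokens:
        tf = Counter(tokens)
        # Assumption: idf = log(n / df) + 1, so shared terms keep a nonzero weight.
        vec = {t: tf[t] * (math.log(n / df[t]) + 1.0) for t in tf}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans(vectors, k, iterations=20, seed=0):
    """Cluster unit-length vectors; returns one cluster label per vector."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)  # requires k <= len(vectors)
    labels = None
    for _ in range(iterations):
        # Assignment step: pick the most similar centroid for each document.
        new_labels = [
            max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the normalised mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if not members:
                continue  # keep the previous centroid for an empty cluster
            mean = Counter()
            for v in members:
                for t, w in v.items():
                    mean[t] += w / len(members)
            norm = math.sqrt(sum(w * w for w in mean.values())) or 1.0
            centroids[c] = {t: w / norm for t, w in mean.items()}
    return labels


# Hypothetical sample headlines, not data from the thesis.
docs = [
    "Timnas Indonesia menang di pertandingan sepak bola",
    "Pertandingan sepak bola liga Indonesia ditunda",
    "Harga saham dan nilai rupiah naik",
    "Nilai rupiah turun, harga saham ikut melemah",
]
vectors = tfidf_vectors([preprocess(d) for d in docs])
labels = kmeans(vectors, k=2)
```

Normalising every vector to unit length keeps the cosine computation to a single dot product, and the fixed `seed` makes the random choice of initial centroids repeatable between runs.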

The news documents used to test the algorithm are taken from online Indonesian-language newspapers. The trial results show that grouping Indonesian-language news with this algorithm achieves fairly relevant accuracy. The system's accuracy increases as the amount of input query data increases. In a trial with 5 clusters, 7 queries, and 100 test news documents, the document-grouping accuracy reached a maximum of 43% and a minimum of 8%.

Item Type:

Thesis (Undergraduate)

Supervisor:

Abidin, Zainal and Nashichuddin, Achmad

Contributors:

Contribution	Name	Email
UNSPECIFIED	Abidin, Zainal	UNSPECIFIED
UNSPECIFIED	Nashichuddin, Achmad	UNSPECIFIED

Keywords:

news; information retrieval; text preprocessing; tf-idf; cosine similarity; k-means clustering

Department:

Fakultas Sains dan Teknologi > Jurusan Teknik Informatika

Depositing User:

Nada Auliya Sarasawitri

Date Deposited:

12 May 2023 13:24

Last Modified:

12 May 2023 13:24

URI:

http://etheses.uin-malang.ac.id/id/eprint/49896

