Multi Document Summarization menggunakan Algoritma Recurrent Neural Network

Alfin, Moh (2023) Multi Document Summarization menggunakan Algoritma Recurrent Neural Network. Undergraduate thesis, Universitas Islam Negeri Maulana Malik Ibrahim.

Text (Fulltext)
19650024 - Moh Alfin.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
(2MB)

Abstract

INDONESIAN (English translation) :

News document summarization is an important aspect of natural language processing, and this paper aims to describe recent developments in the field. With the explosion of information and the steadily growing volume of news, news document summarization is key to meeting the challenge of accessing relevant and valuable information. This paper performs multi-document summarization of Indonesian text using the Recurrent Neural Network (RNN) algorithm. The model used is Long Short-Term Memory (LSTM), and feature extraction uses Word2Vec with two models, Continuous Bag of Words (CBOW) and Skip-gram. The results show higher precision, recall, and F-measure with the CBOW model. The CBOW model obtained a recall of 0.487, a precision of 0.704, and an F-measure of 0.550. The Skip-gram model obtained a recall of 0.414, a precision of 0.687, and an F-measure of 0.504.

ENGLISH :

News document summarization is an important aspect of natural language processing, and this work aims to describe the latest developments in the field. With the explosion of information and the ever-increasing volume of news, news document summarization is key to meeting the challenge of accessing relevant and valuable information. In this work, multi-document summarization of Indonesian text is performed using a Recurrent Neural Network (RNN), specifically the Long Short-Term Memory (LSTM) variant, with feature extraction using two Word2Vec models: Continuous Bag of Words (CBOW) and Skip-gram. The results show that CBOW yields higher recall, precision, and F-measure than Skip-gram. For the CBOW model, the recall, precision, and F-measure values are 0.487, 0.704, and 0.550, respectively. For the Skip-gram model, the test results show a recall of 0.414, a precision of 0.687, and an F-measure of 0.504.

ARABIC (English translation) :

Summarizing news documents is an important part of natural language processing, and this paper aims to describe the latest developments in the field. With the explosion of information and the constantly growing number of news items, summarizing news documents is essential for meeting the challenge of reaching relevant and valuable information. In this research, multiple documents in Indonesian are summarized using the Recurrent Neural Network (RNN) method in its Long Short-Term Memory (LSTM) variant, with feature extraction using two different Word2Vec models, Continuous Bag of Words (CBOW) and Skip-gram. For the CBOW model, the recall, precision, and F-measure values are 0.487, 0.704, and 0.550. For the Skip-gram model, the test results show a recall of 0.414, a precision of 0.687, and an F-measure of 0.504.
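
The abstract reports recall, precision, and F-measure but does not say how they are computed. As a minimal sketch, assuming a ROUGE-1-style unigram overlap between a generated summary and a reference summary (the function name `overlap_scores` and the whitespace tokenization are illustrative assumptions, not taken from the thesis), the three metrics could be computed like this:

```python
from collections import Counter


def overlap_scores(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F-measure.

    Assumption: lowercased whitespace tokens stand in for whatever
    tokenization the thesis actually uses.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0

    return {"precision": precision, "recall": recall, "f_measure": f_measure}


if __name__ == "__main__":
    scores = overlap_scores("a b c d", "a b e")
    print(scores)
```

Under this definition, the harmonic mean of the reported CBOW precision (0.704) and recall (0.487) is about 0.576 rather than the reported 0.550. One possible explanation is that the F-measure was averaged per document instead of being derived from the averaged precision and recall, but the abstract does not state this.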

Item Type:	Thesis (Undergraduate)
Supervisor:	Abidin, Zainal and Basid, Puspa Miladin Nuraida Safitri A
Keywords:	Peringkasan; Summarization; Recurrent Neural Network; Long Short-Term Memory; Word2Vec; تلخيص; الشبكة العصبية المتكررة; الذاكرة طويلة وقصيرة المدى
Subjects:	08 INFORMATION AND COMPUTING SCIENCES > 0801 Artificial Intelligence and Image Processing > 080107 Natural Language Processing
Department:	Fakultas Sains dan Teknologi > Jurusan Teknik Informatika
Depositing User:	Moh Alfin
Date Deposited:	01 Apr 2024 13:32
Last Modified:	01 Apr 2024 13:32
URI:	http://etheses.uin-malang.ac.id/id/eprint/59927
