Early Detection of Phishing, Disinformation, and Extreme Opinions in Digital Text Using a Transformer

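Wibowo, Munif (2026) Deteksi dini Phishing, Disinformasi, dan opini ekstrem berbasis teks digital menggunakan Transformer [Early detection of phishing, disinformation, and extreme opinions in digital text using a Transformer]. Masters thesis, Universitas Islam Negeri Maulana Malik Ibrahim.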

Text (Fulltext)
240605220017.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
(1MB)

Abstract

ABSTRACT

The growth of digital technology has increased the complexity of text-based cyber threats such as phishing, disinformation, and extreme opinions, which can undermine social stability and information security. This study develops an early detection system for text-based cyber threats built on a hybrid CNN–BiLSTM–Transformer architecture. The model combines the strengths of a Convolutional Neural Network (CNN) in extracting local features, a Bidirectional Long Short-Term Memory (BiLSTM) network in modelling temporal dynamics, and a Transformer in capturing global semantic context through self-attention. The datasets comprise phishing emails, social media sentiment corpora, and public opinion texts, all of which were preprocessed and relabelled. Evaluation used accuracy, precision, recall, F1-score, and AUC-ROC under 5-fold cross-validation. The hybrid model consistently outperformed the single-architecture baseline models on all evaluation metrics. Stress tests on noisy and adversarial data also showed that the model is resilient to linguistic manipulation. The research contributes theoretically to an integrative NLP-based paradigm for cyber threat detection, and practically as a basis for an adaptive, proactive text-based Cyber Early Warning System (CEWS).
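The evaluation protocol named in the abstract (accuracy, precision, recall, F1-score, and AUC-ROC under 5-fold cross-validation) can be illustrated with a minimal, standard-library Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the binary "threat vs. benign" framing, the 0.5 decision threshold, the unshuffled contiguous folds, the function names, and the toy labels and scores, which stand in for model outputs.

```python
from statistics import fmean
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs over contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    folds, start = [], 0
    for i in range(k):
        # The first (n_samples % k) folds take one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        test = list(range(start, start + size))
        train = [j for j in range(n_samples) if j < start or j >= start + size]
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for labels in {0, 1} (1 = threat)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(y_true: Sequence[int], scores: Sequence[float],
                   k: int = 5, threshold: float = 0.5) -> Dict[str, float]:
    """Average each metric over the k held-out folds."""
    per_fold = []
    for _train_idx, test_idx in k_fold_indices(len(y_true), k):
        # A real run would fit the model on _train_idx here; this sketch
        # scores held-out samples with precomputed placeholder outputs.
        labels = [y_true[i] for i in test_idx]
        fold_scores = [scores[i] for i in test_idx]
        preds = [1 if s >= threshold else 0 for s in fold_scores]
        metrics = binary_metrics(labels, preds)
        metrics["auc_roc"] = auc_roc(labels, fold_scores)
        per_fold.append(metrics)
    return {name: fmean([m[name] for m in per_fold]) for name in per_fold[0]}


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = threat text, 0 = benign text.
    toy_labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
    toy_scores = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3, 0.55, 0.45]
    for name, value in cross_validate(toy_labels, toy_scores).items():
        print(f"{name}: {value:.3f}")
```

In practice the folds would usually be shuffled and stratified by class, and a multi-class setting (for example phishing, disinformation, and extreme opinion as separate labels) would need macro- or micro-averaged versions of these metrics; the abstract does not say which variant was used.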

Item Type:	Thesis (Masters)
Supervisor:	Faisal, Muhammad and Nugroho, Fresy
Keywords:	Cyber Early Warning; Phishing; Disinformation; Extreme Opinions; CNN–BiLSTM–Transformer
Subjects:	08 INFORMATION AND COMPUTING SCIENCES > 0801 Artificial Intelligence and Image Processing > 080107 Natural Language Processing
Department:	Fakultas Sains dan Teknologi > Jurusan Magister Tehnik Informatika
Depositing User:	Munif Wibowo
Date Deposited:	17 Jun 2026 14:37
Last Modified:	17 Jun 2026 14:37
URI:	http://etheses.uin-malang.ac.id/id/eprint/85480
