Penentuan fitur yang relevan terhadap prediksi cacat perangkat lunak berdasarkan seleksi fitur Gain Ratio

Vidian, Diko Andri (2018) Penentuan fitur yang relevan terhadap prediksi cacat perangkat lunak berdasarkan seleksi fitur Gain Ratio. Undergraduate thesis, Universitas Islam Negeri Maulana Malik Ibrahim.

Text (Fulltext)
14650072.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
(2MB)

Abstract

ABSTRAK

Software development life cycle (SDLC) merupakan rangkaian tahap untuk menghasilkan produk perangkat lunak. Tahap paling penting dalam SDLC ada pada tahap pengujian, tahap ini melakukan pengujian perangkat lunak dan pemeriksaan validasi, setelah produk dinyatakan tidak cacat maka persetujuan diberikan dan perangkat lunak bisa digunakan pada kebutuhan. Salah satu cara yang bisa dilakukan untuk pengujian adalah teknik prediksi cacat perangkat lunak, teknik tersebut melakukan prediksi menggunakan dataset Metrics Data Program (MDP). Perlu diketahui bahwa dalam dataset tidak semua fitur yang ada memiliki pengaruh besar terhadap prediksi cacat perangkat lunak karena dataset yang digunakan dibuat tidak khusus untuk prediksi cacat perangkat lunak. Oleh karena itu, pemilihan fitur diperlukan untuk mendapatkan fitur yang berpengaruh terhadap prediksi cacat perangkat lunak.

Penelitian ini melakukan pemilihan fitur menggunakan seleksi fitur Gain Ratio dengan jumlah pengambilan fitur yang berbeda-beda dengan tujuan mendapatkan fitur yang paling berpengaruh atau relevan. Hasil dari penelitian ini menyimpulkan bahwa fitur yang relevan terhadap prediksi cacat perangkat lunak berdasarkan seleksi fitur Gain Ratio pada dataset CM1 adalah fitur Time to write program (t). Dataset JM1 adalah fitur Error estimate (b), Count of Statement Lines (lOCode), Line of code (loc), Unique operands (uniq_Opnd), Unique operator (uniq_Op), Count of Code and Comments Lines (lOCodeAndComment), Cyclomatic complexity (v(g)), Branch count (branchCount), Time to write program (t) dan Effort to write program (e). Dataset KC1 adalah fitur Count of lines of comments (lOComment), Error estimate (b) dan Count of blank lines (lOBlank). Dataset KC2 adalah fitur Count of Code and Comments Lines (lOCodeAndComment), Count of blank lines (lOBlank), Unique operands (uniq_Opnd) dan Time to write program (t). Dataset PC1 adalah fitur Unique operands (uniq_Opnd). Fitur-fitur tersebut dikatakan sebagai fitur yang relevan terhadap prediksi cacat perangkat lunak karena berdasarkan uji coba klasifikasi yang menghasilkan akurasi terbaik dari uji coba yang telah dilakukan.

ABSTRACT

The software development life cycle (SDLC) is a series of stages for producing a software product. The most important stage in the SDLC is testing, in which the software is tested and validated; once the product is declared free of defects, approval is given and the software can be put to use. One approach to testing is software defect prediction, which makes its predictions using the Metrics Data Program (MDP) dataset. Not every feature in the dataset has a large effect on software defect prediction, because the dataset was not created specifically for that purpose. Feature selection is therefore needed to obtain the features that do affect software defect prediction.

This study selects features using Gain Ratio feature selection, varying the number of features retained, with the goal of finding the most influential, or relevant, features. It concludes that, based on Gain Ratio feature selection, the features relevant to software defect prediction differ by dataset. For the CM1 dataset, the relevant feature is Time to write program (t). For the JM1 dataset, the relevant features are Error estimate (b), Count of Statement Lines (lOCode), Line of code (loc), Unique operands (uniq_Opnd), Unique operators (uniq_Op), Count of Code and Comments Lines (lOCodeAndComment), Cyclomatic complexity (v(g)), Branch count (branchCount), Time to write program (t) and Effort to write program (e). For the KC1 dataset, they are Count of lines of comments (lOComment), Error estimate (b) and Count of blank lines (lOBlank). For the KC2 dataset, they are Count of Code and Comments Lines (lOCodeAndComment), Count of blank lines (lOBlank), Unique operands (uniq_Opnd) and Time to write program (t). For the PC1 dataset, the relevant feature is Unique operands (uniq_Opnd). These features are considered relevant to software defect prediction because, among the classification experiments conducted, they produced the best accuracy.
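As an illustration of the ranking step described in the abstract, the sketch below scores features by gain ratio, that is, the information gain of a feature with respect to the class label divided by the feature's own split information (its entropy). It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names (`entropy`, `gain_ratio`, `rank_features`) are hypothetical, the data are made up, and the features are assumed to be already discretized, whereas the MDP metrics are numeric and would need binning first.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by split information."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A constant feature has zero split information; score it as 0.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels):
    """Return feature names sorted by descending gain ratio."""
    scores = {name: gain_ratio(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy, made-up data (not taken from the MDP datasets): two discretized
# metrics and a defect label for four hypothetical modules.
columns = {
    "t": ["high", "high", "low", "low"],
    "loc": ["high", "low", "high", "low"],
}
defective = [True, True, False, False]

ranking = rank_features(columns, defective)
top_k = ranking[:1]  # keep the k best-ranked features; k is the value being varied
print(ranking, top_k)
```

In this toy data `t` separates the two classes perfectly and `loc` carries no information, so `t` ranks first. Repeating the selection for several values of k and comparing classifier accuracy on each subset would mirror the experiment the abstract describes; the classifier itself is not named in the abstract, so it is left out here.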

Item Type:

Thesis (Undergraduate)

Supervisor:

Fatchurrochman, Fatchurrochman and Hanani, Ajib

Contributors:

Contribution	Name	Email
UNSPECIFIED	Fatchurrochman, Fatchurrochman	UNSPECIFIED
UNSPECIFIED	Hanani, Ajib	UNSPECIFIED

Keywords:

prediksi cacat perangkat lunak; seleksi fitur Gain Ratio; software defect prediction; Gain Ratio feature selection

Department:

Faculty of Science and Technology > Department of Informatics Engineering

Depositing User:

Durrotun Nafisah

Date Deposited:

14 Nov 2018 10:11

Last Modified:

14 Nov 2018 10:11

URI:

http://etheses.uin-malang.ac.id/id/eprint/12283
