PENERAPAN METODE PRINCIPAL COMPONENT ANALYSIS (PCA) DAN K-NEAREST NEIGHBORS (KNN) UNTUK KLASIFIKASI DATA KANKER PARU-PARU
IMPLEMENTATION OF PRINCIPAL COMPONENT ANALYSIS AND K-NEAREST NEIGHBORS METHODS FOR CLASSIFICATION OF LUNG CANCER DATA
Abstract
Respiratory diseases can arise in the throat or in the main organs of the airway, namely the lungs. Lung cancer is caused by uncontrolled cell growth and the spread of abnormal cells in the lungs. Because lung nodules cannot be detected quickly on chest X-ray images, interpreting these diagnostic images is a repetitive and highly complicated task. Image pre-processing, candidate nodule detection, and malignancy classification can be used as parameters in classification. By analyzing CT scan images with an artificial intelligence approach, machine learning techniques support the early diagnosis and evaluation of lung nodules. Such a system is referred to as a decision support system; it examines the image through pre-processing, segmentation, feature extraction, and classification. This study processes a lung cancer dataset to classify cancer and non-cancer samples using Principal Component Analysis (PCA) for feature extraction and K-Nearest Neighbors (KNN) for classification, with K = 1, 3, 5, 7, and 9. The best accuracy obtained is 98%, at K = 9. These results indicate that PCA feature extraction combined with KNN classification is suitable for processing lung cancer datasets.
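The pipeline summarized in the abstract (PCA feature extraction followed by KNN classification over K = 1, 3, 5, 7, 9) can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation. The data are synthetic two-class Gaussian clusters, not the lung cancer dataset, and the number of retained components (2), the 70/30 split, and the helper names `pca_fit`, `pca_transform`, `knn_predict`, `make_synthetic_data`, and `run_demo` are all assumptions made for illustration. PCA is approximated here by power iteration with deflation; a library routine would normally be used instead.

```python
import math
import random
from collections import Counter


def pca_fit(rows, n_components, iters=200, seed=0):
    """Estimate the top principal components by power iteration with deflation."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(n_components):
        # Random unit start vector.
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for the unit vector v.
        eigval = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        # Deflation: remove the found direction so the next one can emerge.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return mean, components


def pca_transform(rows, mean, components):
    """Project mean-centered rows onto the principal components."""
    d = len(mean)
    return [[sum((r[j] - mean[j]) * c[j] for j in range(d)) for c in components]
            for r in rows]


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def make_synthetic_data(rng, n_per_class=100, d=5):
    """Two Gaussian clusters standing in for non-cancer / cancer samples."""
    data = []
    for label, mu in (("non-cancer", 0.0), ("cancer", 3.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(mu, 1.0) for _ in range(d)], label))
    rng.shuffle(data)
    return data


def run_demo(ks=(1, 3, 5, 7, 9), seed=42):
    rng = random.Random(seed)
    data = make_synthetic_data(rng)
    split = int(0.7 * len(data))  # assumed 70/30 train/test split
    train, test = data[:split], data[split:]
    train_X, train_y = [x for x, _ in train], [y for _, y in train]
    test_X, test_y = [x for x, _ in test], [y for _, y in test]
    # Fit PCA on the training set only, then apply it to both sets.
    mean, comps = pca_fit(train_X, n_components=2)
    train_Z = pca_transform(train_X, mean, comps)
    test_Z = pca_transform(test_X, mean, comps)
    accuracies = {}
    for k in ks:
        correct = sum(knn_predict(train_Z, train_y, z, k) == y
                      for z, y in zip(test_Z, test_y))
        accuracies[k] = correct / len(test_y)
    return accuracies


accuracies = run_demo()
for k, acc in accuracies.items():
    print(f"K = {k}: accuracy = {acc:.2f}")
```

Fitting the projection on the training portion only keeps information from the test samples out of the feature extraction step. The accuracies printed by this sketch describe the synthetic clusters and say nothing about the 98% figure reported for the actual dataset.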