0172/2013 - A DATA MINING METHOD FOR BREAST CANCER IDENTIFICATION BASED ON SELECTED FEATURES - RESUBMISSION
MÉTODO DE MINERAÇÃO DE DADOS PARA IDENTIFICAÇÃO DE CÂNCER DE MAMA BASEADO NA SELEÇÃO DE VARIÁVEIS - REAPRESENTAÇÃO

Author:

• Nicole Holsbach - HOLSBACH, N. - Porto Alegre, RS - Universidade Federal do Rio Grande do Sul - <nicole.holsbach@bol.com.br>

Co-author(s):

• Flavio Sanson Fogliatto - FOGLIATTO, F. S. - Universidade Federal do Rio Grande do Sul - <ffogliatto@producao.ufrgs.br>
• Michel José Anzanello - ANZANELLO, M. J. - Universidade Federal do Rio Grande do Sul - <anzanello@producao.ufrgs.br>

Thematic Area:

Health and Gender

Abstract:

In most countries, breast cancer is the predominant cancer among women. When diagnosed at an early stage, it has a high cure rate. Several statistically based approaches have been developed to assist early breast cancer detection. This paper presents a feature selection method for classifying cases into two classes, benign or malignant, based on cytopathologic analysis of patients' breast cell samples. Features are ranked according to a new feature importance index that combines Principal Component Analysis weights with the variance explained by each retained component. Observations in a training set are classified into the two classes using the k-Nearest Neighbor technique and Discriminant Analysis; the feature with the smallest importance index is then eliminated and the classification is repeated on the remaining features. The feature subset that yields the maximum accuracy is used to classify observations in the testing set. When applied to the Wisconsin Breast Cancer Database, the proposed method achieved an average classification accuracy of 97.77% while retaining an average of 5.8 features.
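The procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula for the importance index, so the version below is an assumption: each feature's index is the sum, over the retained components, of the absolute PCA loading multiplied by that component's share of explained variance. The 95% retention threshold, the choice of k = 3, the leave-one-out accuracy estimate, the single up-front ranking, and the synthetic data are likewise illustrative assumptions, and only the k-Nearest Neighbor classifier is shown (Discriminant Analysis would take the same place in the loop). All function names are hypothetical.

```python
import numpy as np

def importance_index(X, var_threshold=0.95):
    """Assumed index: sum over retained PCs of |loading| * explained-variance share."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize features
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]                 # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    share = eigvals / eigvals.sum()
    # Retain the fewest components whose cumulative share reaches the threshold.
    n_keep = int(np.searchsorted(np.cumsum(share), var_threshold)) + 1
    return np.abs(eigvecs[:, :n_keep]) @ share[:n_keep]

def knn_predict(X_train, y_train, X_query, k=3, leave_one_out=False):
    """Majority-vote kNN for 0/1 labels using squared Euclidean distance."""
    d = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    if leave_one_out:                                 # query set is the training set
        np.fill_diagonal(d, np.inf)
    nearest = np.argsort(d, axis=1)[:, :k]
    return (y_train[nearest].mean(axis=1) > 0.5).astype(int)

def knn_accuracy(X, y, k=3):
    """Leave-one-out accuracy on a single data set."""
    return float((knn_predict(X, y, X, k, leave_one_out=True) == y).mean())

def select_features(X, y, k=3):
    """Backward elimination: drop the lowest-index feature, keep the best subset."""
    drop_order = [int(j) for j in np.argsort(importance_index(X))]
    remaining = list(range(X.shape[1]))
    best_acc, best_subset = -1.0, list(remaining)
    for drop in drop_order:
        acc = knn_accuracy(X[:, remaining], y, k)
        if acc >= best_acc:                           # ties favor the smaller subset
            best_acc, best_subset = acc, list(remaining)
        if len(remaining) == 1:
            break
        remaining.remove(drop)
    return best_subset, best_acc

# Synthetic stand-in data (NOT the Wisconsin Breast Cancer Database).
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, n)
informative = 3.0 * y[:, None] + rng.normal(size=(n, 2))
noise = rng.normal(size=(n, 4))
X = np.hstack([informative, noise])

X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

subset, train_acc = select_features(X_train, y_train)
test_pred = knn_predict(X_train[:, subset], y_train, X_test[:, subset])
test_acc = float((test_pred == y_test).mean())
print("retained features:", subset)
print("training accuracy:", train_acc, "testing accuracy:", test_acc)
```

In this sketch the ranking is computed once on the full training set; recomputing the index after each elimination would be an equally plausible reading of the abstract.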

Keywords:

Feature selection; Breast cancer identification; k-Nearest Neighbor; Discriminant analysis

How to Cite:

HOLSBACH, N., FOGLIATTO, F. S., ANZANELLO, M. J. A DATA MINING METHOD FOR BREAST CANCER IDENTIFICATION BASED ON SELECTED FEATURES - RESUBMISSION. Cien Saude Colet [online journal] (2013/Mar). [Cited 22/01/2025]. Available at: http://cienciaesaudecoletiva.com.br/en/articles/a-data-mining-method-for-breast-cancer-identification-based-on-selected-features-resubmission/12295?id=12295&id=12295


