

0030/2025 - Natural Language Processing applied to electronic records: monitoring and detection of health events

Author:

• Gabriel Campos Vieira - Vieira, G.C - <camposvieiragabriel@gmail.com>
ORCID: https://orcid.org/0009-0008-5022-1932

Co-author(s):

• João Henrique de Araújo Morais - Morais, JHA - <joao.tlp@gmail.com>
ORCID: https://orcid.org/0000-0003-3258-1498
• Débora Medeiros de Oliveira e Cruz - Cruz, DMO - <debora.sanitarista@gmail.com>
ORCID: https://orcid.org/0000-0002-8325-6866
• Caroline Dias Ferreira - Ferreira, CD - <carolineferreira.smsrio@gmail.com>
ORCID: https://orcid.org/0000-0001-9631-8571
• Wagner Tassinari - Tassinari, W. - <wtassinari@gmail.com>
ORCID: https://orcid.org/0000-0002-3799-1261
• Valeria Saraceni - Saraceni, V - <valsaraceni@gmail.com>
ORCID: https://orcid.org/0000-0001-7360-6490
• Gislani Mateus Oliveira Aguilar - Aguilar, GMO - <gislanimateus@gmail.com>
ORCID: https://orcid.org/0000-0001-9103-9864
• Oswaldo Gonçalves Cruz - Cruz, OG - <ogcruz@gmail.com>
ORCID: https://orcid.org/0000-0002-3289-3195


Abstract:

Text fields in medical records are a valuable source for Public Health Surveillance but remain underutilized. This study describes the use of natural language processing (NLP) to enhance the identification of suspected cases and to monitor disease trends in electronic records from urgency and emergency visits (RUE) in the municipality of Rio de Janeiro (MRJ). Texts were pre-processed, and rules were applied to identify individual events (measles and rubella) and collective ones (diarrhea and influenza-like syndrome); the results were compared with ICD-10 data from January 2023 to September 2024. A total of 28 suspected measles cases and 33 suspected rubella cases were identified through ICD codes, while the NLP technique detected an additional 30 suspected measles cases and 17 suspected rubella cases based on patient complaints. Time series of diarrhea and influenza-like syndrome (SG) built from ICD codes and from complaints showed a cross-correlation above 0.93 at lag 0. Complaint analysis, particularly after RUE management discontinued nonspecific SG ICD codes, showed greater stability and expanded detection of suspected cases, demonstrating the potential of NLP for epidemiological surveillance in MRJ.
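As an illustration only, the two steps the abstract names (rule-based detection on pre-processed complaint text, and comparison of time series at lag 0) might be sketched as follows. The abstract does not give the study's actual rules or pre-processing steps, so the Portuguese keywords, the `RULES` table, and the function names below are all hypothetical assumptions, not the authors' method.

```python
import re
import unicodedata
from math import sqrt

# Hypothetical keyword rules; the study's real rules are not given in the abstract.
RULES = {
    "measles": re.compile(r"\bsarampo\b"),
    "rubella": re.compile(r"\brubeola\b"),
    "diarrhea": re.compile(r"\bdiarreia\b"),
}


def preprocess(text):
    """Lowercase, strip accents, and collapse whitespace (assumed pre-processing)."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"\s+", " ", no_accents).strip()


def flag_events(complaint):
    """Return the set of event names whose rule matches the cleaned complaint."""
    clean = preprocess(complaint)
    return {event for event, pattern in RULES.items() if pattern.search(clean)}


def correlation_at_lag0(x, y):
    """Pearson correlation of two equal-length series, i.e. cross-correlation at lag 0."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)
```

In this sketch, `flag_events` would be applied to each complaint field, and weekly counts of flagged visits could then be compared with weekly ICD-based counts through `correlation_at_lag0`. A plain keyword match like this one does not handle negation (for example, a complaint that denies diarrhea would still be flagged), which a production rule set would need to address.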

Keywords:

natural language processing; public health surveillance; electronic health records; epidemiological monitoring

How to cite:

Vieira, G.C, Morais, JHA, Cruz, DMO, Ferreira, CD, Tassinari, W., Saraceni, V, Aguilar, GMO, Cruz, OG. Natural Language Processing applied to electronic records: monitoring and detection of health events. Cien Saude Colet [online journal] (2025/Feb). [Cited 05/12/2025]. Available at: http://cienciaesaudecoletiva.com.br/en/articles/natural-language-processing-applied-to-electronic-records-monitoring-and-detection-of-health-events/19506