Browsing by Keyword "Vocabulário De Saúde Do Consumidor"
- Item, Open Access: Vocabulário de saúde do consumidor em idioma português (Consumer health vocabulary in the Portuguese language) (Universidade Federal de São Paulo (UNIFESP), 2019-11-13). Tenorio, Josceli Maria [UNIFESP]; Pisa, Ivan Torres [UNIFESP]; http://lattes.cnpq.br/2841925497526792; http://lattes.cnpq.br/6362966376837352; Universidade Federal de São Paulo (UNIFESP)

  Introduction: Research shows a wide language gap between the common terms used by laypersons and the technical terms used by healthcare professionals. One proposed solution is a consumer health vocabulary (CHV), which can incorporate technologies that make the data available and integrated and that express the semantic relationships among terms.

  Objective: To develop a Brazilian Portuguese CHV model based on web data sources and structured according to semantic web vocabulary principles and technologies.

  Method: The study was divided into three phases. In Phase 1, we collected and extracted terms from structured web data sources, such as the Unified Medical Language System (UMLS) controlled vocabularies and the DBpedia knowledge base. These terms and their semantic relationships were represented as a complex network. Network centrality measures were computed to characterise it, and the terms that could compose the CHV were selected using network clustering techniques. Phase 2 obtained new terms from unstructured web data sources written by and/or for health consumers, in two steps: recognition of UMLS terms, and automatic term recognition to identify candidate terms. A human validation process approved candidate terms before they were inserted into the CHV. In Phase 3, the CHV data were formalised and represented in the Resource Description Framework (RDF) web data model. We also designed and developed a layout for user access to the dataset.

  Results: Phase 1 produced a complex network of 146,956 terms linked by semantic relationships such as synonymy, hypernymy, and relatedness; of these, 31,439 are UMLS concepts represented by preferred terms and 83,279 are synonyms. Adding DBpedia raised the synonym-per-concept rate from 1.6 to 2.5. Centrality measures helped characterise the network and reveal its most important terms. Phase 2 automatically recognised 5,916 UMLS terms. The automatic term recognition algorithm identified 9,674 candidate n-grams. Human validation approved 6,245 terms, around 66.24% of the candidate terms assessed. The precision-recall curve of the automatic term recognition algorithm gave values in the range [0.732, ~0.900], higher than those found by similar studies. In Phase 3, we formalised the data using the Simple Knowledge Organization System (SKOS) data model and the Provenance, Authoring and Versioning (PAV) ontology, both suitable for a CHV and compatible with the RDF data model. The CHV-RDF contains 150,995 terms, of which 66,992 are preferred terms and 84,003 are synonyms, along with mappings of other semantic relationships between terms based on hierarchy and association.

  Conclusion: It was possible to build a CHV model automatically, through computational techniques, from data sources available on the web. The complex network model made it possible to link and match terms from controlled and consumer vocabularies and to represent their semantic relationships, and it supported the CHV-RDF data model. Previously unpublished synonyms, terms, and relationships were identified. The study presents a data infrastructure that could be used to develop consumer-oriented applications, and proposes a method for developing health vocabularies in other languages and for updating existing vocabularies.
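To make the Phase 3 formalisation more concrete, the following is a minimal sketch of how a single CHV concept with a preferred term, consumer synonyms, and a hierarchical link might be serialised as SKOS in Turtle syntax. It is not the thesis's implementation: the `http://example.org/chv/` namespace, the concept identifiers, the function name `concept_to_turtle`, and the sample terms are all assumptions made for illustration, and the PAV provenance metadata mentioned in the abstract is omitted.

```python
# Illustrative sketch only: serialise one CHV concept as SKOS (Turtle syntax).
# The base namespace and concept IDs are hypothetical; the abstract does not
# state the URIs used by the actual CHV-RDF dataset.
PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix chv: <http://example.org/chv/> .\n"
)


def escape(literal: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return literal.replace("\\", "\\\\").replace('"', '\\"')


def concept_to_turtle(concept_id, pref_label, alt_labels, broader=None, lang="pt-BR"):
    """Return Turtle text for one concept.

    pref_label -> skos:prefLabel (the preferred term)
    alt_labels -> skos:altLabel  (synonyms, e.g. lay terms)
    broader    -> skos:broader   (optional hierarchical parent concept ID)
    """
    lines = [f"chv:{concept_id} a skos:Concept ;"]
    lines.append(f'    skos:prefLabel "{escape(pref_label)}"@{lang} ;')
    for alt in alt_labels:
        lines.append(f'    skos:altLabel "{escape(alt)}"@{lang} ;')
    if broader:
        lines.append(f"    skos:broader chv:{broader} ;")
    # Turtle closes the last predicate of a subject with '.' rather than ';'.
    lines[-1] = lines[-1][:-2] + " ."
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample terms invented for illustration, not taken from the dataset.
    print(PREFIXES)
    print(concept_to_turtle("C001", "hipertensão", ["pressão alta"], broader="C000"))
```

In this sketch the professional term is the `skos:prefLabel` and the lay expression is a `skos:altLabel`; which of the two the CHV actually treats as preferred is not stated in the abstract.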