Abstract:
Tailoring personalized treatments demands the analysis of a patient's characteristics, which may be scattered over a wide variety of sources. These features include family history, life habits, comorbidities, and potential treatment side effects. Moreover, the analysis of the services visited the most by a patient before a new diagnosis and the type of requested tests, may uncover patterns that contribute to earlier disease detection and treatment effectiveness.
Built on the concept of knowledge-driven ecosystems, we devise DE4LungCancer, a data ecosystem of health data sources for lung cancer.
Knowledge extracted from heterogeneous sources, e.g., clinical records, scientific publications, and pharmacologic data, is integrated into knowledge graphs. Ontologies describe the meaning of the combined data, and mapping rules enable the declarative definition of the transformation and integration processes. Moreover, DE4LungCancer is assessed in terms of the methods followed for data quality assessment and curation. Lastly, the role of controlled vocabularies and ontologies in health data management is discussed and their impact on transparent knowledge extraction and analytics.
This paper presents the lesson learned in the DE4LungCancer development and demonstrates the transparency level supported by the proposed knowledge-driven ecosystem
in the context of the lung cancer pilots in the EU H2020 funded project BigMedilytic, the ERA PerMed funded project P4-LUCAT, and the EU H2020 projects CLARIFY and iASiS.