Research Topics
[FP] NLP for Sparse Data Mapping
During clinical studies, adverse events are mainly reported as free text in digital patient
files. Different applications are used for this process, and there is no common standard.
At 59 centers of the Study Alliance Leukemia, important terms are manually extracted from
these reports and sent to the study center. For statistical evaluation, the extracted terms
must be mapped to MedDRA terms. Because large parts of this process are done manually, it is
time-consuming and requires staff with very good knowledge of MedDRA. An automatic mapping
would speed up the process. The overall goal is a fully functioning application that performs
this mapping. Within the scope of the project, a prototype of the application will be
developed that maps original terms read from a CSV file to concrete MedDRA terms.
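
As a rough illustration of the CSV-to-term mapping described above, the following minimal sketch reads original terms from a CSV source and maps each one to the closest entry of a target vocabulary by plain string similarity. The column name `original_term`, the function names, the similarity cutoff, and the four-entry vocabulary are assumptions made for this example; a real prototype would load the licensed MedDRA dictionary and would likely need more than fuzzy matching.

```python
import csv
import difflib
import io

# Hypothetical stand-in vocabulary for illustration only; a real system
# would load the terms from the licensed MedDRA dictionary.
TARGET_TERMS = ["Nausea", "Headache", "Febrile neutropenia", "Pyrexia"]


def map_term(original, targets=TARGET_TERMS, cutoff=0.6):
    """Return the most similar target term, or None if none reaches the cutoff."""
    lookup = {t.lower(): t for t in targets}
    hits = difflib.get_close_matches(
        original.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[hits[0]] if hits else None


def map_csv(stream, column="original_term"):
    """Read original terms from a CSV stream and pair each with its mapping."""
    reader = csv.DictReader(stream)
    return [(row[column], map_term(row[column])) for row in reader]


# An in-memory CSV stands in for a real file; the column name is an assumption.
sample = io.StringIO("original_term\nnausea\nhead ache\nfebrile neutropenia \nxyz\n")
results = map_csv(sample)
for original, mapped in results:
    print(f"{original!r} -> {mapped}")
```

Terms that fall below the cutoff are returned as `None`, so they can be handed to a person with MedDRA knowledge instead of being mapped incorrectly.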
Development will be split into a frontend and a backend. The backend will be implemented
as a pipeline of exchangeable modules, such as data preprocessing, string matching, and
classification using word embeddings and machine learning. The functionality of the backend
will be accessible through the frontend. To ensure that the individual modules, as well as
the frontend and the backend, work together, common interfaces and data exchange formats
will be defined.
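
One possible shape for such a pipeline is sketched below, assuming that every module is a function that takes a record and returns a record. The `Record` dataclass, the module names, and the two-term vocabulary are hypothetical and only illustrate the idea of a common interface and exchange format. A classifier based on word embeddings could be added as another module with the same signature; it is not implemented here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional
import difflib


@dataclass
class Record:
    """Hypothetical exchange format passed from module to module."""
    original: str
    normalized: str = ""
    mapped: Optional[str] = None
    method: Optional[str] = None


# Common interface (an assumption): a module takes a Record and returns a Record.
Module = Callable[[Record], Record]


def preprocess(record: Record) -> Record:
    """Lowercase the original term and collapse whitespace."""
    record.normalized = " ".join(record.original.lower().split())
    return record


def make_exact_matcher(vocabulary: List[str]) -> Module:
    lookup = {term.lower(): term for term in vocabulary}

    def exact(record: Record) -> Record:
        if record.mapped is None and record.normalized in lookup:
            record.mapped, record.method = lookup[record.normalized], "exact"
        return record

    return exact


def make_fuzzy_matcher(vocabulary: List[str], cutoff: float = 0.8) -> Module:
    lookup = {term.lower(): term for term in vocabulary}

    def fuzzy(record: Record) -> Record:
        if record.mapped is None:  # only runs if earlier modules found nothing
            hits = difflib.get_close_matches(
                record.normalized, list(lookup), n=1, cutoff=cutoff
            )
            if hits:
                record.mapped, record.method = lookup[hits[0]], "fuzzy"
        return record

    return fuzzy


def run_pipeline(record: Record, modules: List[Module]) -> Record:
    """Apply the modules in order; any module can be swapped or removed."""
    for module in modules:
        record = module(record)
    return record


vocabulary = ["Nausea", "Headache"]  # illustrative placeholder terms
pipeline = [preprocess, make_exact_matcher(vocabulary), make_fuzzy_matcher(vocabulary)]
first = run_pipeline(Record("  NAUSEA "), pipeline)
second = run_pipeline(Record("headach"), pipeline)
print(first.mapped, first.method)
print(second.mapped, second.method)
```

Because each module only depends on the shared record type, the matching strategy can be exchanged without changing the frontend or the other modules.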
Supervisor: Karsten Wendt