Knowledge Management of Legacy Documents in Science, Administration and Industry
This project has ended.
The project was carried out in cooperation with the Helmholtz Research Centre for Environmental Health in Munich and an industry partner. It focused on improving electronic and content-based (“semantic”) access to paper-based archival documents relating to a radioactive waste deposit facility in Germany. Leipzig University had two tasks in this project: to improve OCR accuracy by means of rule-based or statistical models, and to perform document separation, document classification and automated content analysis (relation extraction, network analysis, topic threads, sentiment analysis) using statistical NLP and advanced text mining.
Contact: Dr. Gregor Wiedemann, Prof. Dr. Gerhard Heyer