Dr. Dirk Goldhahn
Phone: +49-341-97-32203
Dr. Dirk Goldhahn is no longer a staff member of the NLP department.
Current projects
Completed projects
- Algorithmic corpus-based approaches to the typological comparison of languages
- The Billion Words Library
Publications
2020-
[QHKEGB20]
Uwe Quasthoff, Lars Hellan, Erik Körner, Thomas Eckart, Dirk Goldhahn, and Dorothee Beermann: Typical Sentences as a Resource for Valence. In: Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), Marseille (France), 2020
BibTeX
-
[EBQKGK20]
Thomas Eckart, Sonja Bosch, Uwe Quasthoff, Erik Körner, Dirk Goldhahn, and Simon Kaleschke: Usability and Accessibility of Bantu Language Dictionaries in the Digital Age: Mobile Access in an Open Environment. In: First workshop on Resources for African Indigenous Languages (RAIL) at the 12th Language Resources and Evaluation Conference (LREC 2020), Marseille (France), 2020
BibTeX
-
[GEB20]
Dirk Goldhahn, Thomas Eckart, and Sonja Bosch: Enriching and Increasing the Usability of Lexicographical Data for Less-Resourced Languages. In: Selected Papers from the CLARIN Annual Conference 2019, 2020
BibTeX
-
[GEGQ19]
Dirk Goldhahn, Thomas Eckart, Rufus Gouws, and Uwe Quasthoff: Frekwensiewoordeboek van Afrikaans - A new Frequency Dictionary for Afrikaans. In: Workshop of the African Association for Lexicography (AFRILEX), Windhoek, Namibia, 2019
BibTeX
-
[EBGQK19]
Thomas Eckart, Sonja Bosch, Dirk Goldhahn, Uwe Quasthoff, and Bettina Klimek: Translation-based Dictionary Alignment for Under-resourced Bantu Languages. In: OpenAccess Series in Informatics (OASIcs), Vol. 70: Language Data and Knowledge LDK 2019, 2019
BibTeX
-
[EGQG19]
Thomas Eckart, Dirk Goldhahn, Uwe Quasthoff, and Rufus Gouws: Corpus-based Extraction of Word Relations from an Afrikaans Corpus. In: Workshop of the African Association for Lexicography (AFRILEX), Windhoek, Namibia, 2019
BibTeX
-
[ZGEL19]
Imad Zeroual, Dirk Goldhahn, Thomas Eckart, and Abdelhak Lakhouaja: OSIAN: Open Source International Arabic News Corpus - Preparation and Integration into the CLARIN-infrastructure. In: Proceedings of The Fourth Arabic Natural Language Processing Workshop (WANLP 2019), co-located with ACL 2019, 2019
BibTeX
-
[GEB19]
Dirk Goldhahn, Thomas Eckart, and Sonja Bosch: Enriching Lexicographical Data for Lesser Resourced Languages: A Use Case. In: Proceedings of CLARIN Annual Conference 2019. Eds. K. Simov and M. Eskevich. Leipzig, Germany: CLARIN, 2019
BibTeX
-
[MGH19]
Lydia Müller, Dirk Goldhahn and Gerhard Heyer: The Null Result Portal. In: Garoufallou E., Fallucchi F., William De Luca E. (eds) Metadata and Semantic Research. MTSR 2019. Communications in Computer and Information Science, vol 1057. Springer, Cham. 2019
BibTeX
-
[BEKGQ18]
Sonja Bosch, Thomas Eckart, Bettina Klimek, Dirk Goldhahn, and Uwe Quasthoff: Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment. In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki (Japan), 2018
BibTeX
-
[KEQG18]
Christoph Kuras, Thomas Eckart, Uwe Quasthoff, and Dirk Goldhahn: Automation, Management and Improvement of Text Corpus Production. In: 6th Workshop on the Challenges in the Management of Large Corpora at the 11th Language Resources and Evaluation Conference (LREC 2018), Miyazaki (Japan), 2018
BibTeX
-
[BGEHQSH18]
Dorothee Beermann, Dirk Goldhahn, Thomas Eckart, Lars Hellan, Uwe Quasthoff, Medadi Ssentanda, and Tormod Haugland: Digital Infrastructure for Morpho-syntactic Analysis of Under-Resourced Languages - A Case Study for Luganda. In: Comparative Corpus Linguistics: New Perspectives and Applications at the 51st Annual Meeting of the Societas Linguistica Europaea (SLE 2018), Tallinn, Estonia, 2018
BibTeX
-
[EGQB18]
Thomas Eckart, Dirk Goldhahn, Uwe Quasthoff, and Sonja Bosch: Cross-Language Dictionary Alignment for Bantu Languages. In: Workshop of the African Association for Lexicography (AFRILEX), 20th International Congress of Linguists (ICL20), Cape Town, South Africa, 2018
BibTeX
-
[EKGB18]
Thomas Eckart, Bettina Klimek, Dirk Goldhahn, and Sonja Bosch: Using Linked Data Techniques for Creating an IsiXhosa Lexical Resource - a Collaborative Approach. In: CLARIN Annual Conference 2018 in Pisa, Italy, 2018
BibTeX
-
[TGQ18]
Dieu-Tu Le, Dirk Goldhahn and Uwe Quasthoff: Frequency Dictionary Vietnamese - Từ điển tần số xuất hiện các từ trong tiếng Việt. Uwe Quasthoff, Sabine Fiedler and Erla Hallsteinsdóttir (eds.). Leipziger Universitätsverlag, 2018
BibTeX
-
[TEGK2017]
Jochen Tiepmar, Thomas Eckart, Dirk Goldhahn, and Christoph Kuras: Integrating Canonical Text Services into CLARIN's Search Infrastructure. In: Linguistics and Literature Studies, 5(2), 99-104, 2017
BibTeX
-
[SGQ17]
Serge Sharoff, Dirk Goldhahn and Uwe Quasthoff: Frequency Dictionary Russian - Частотный словарь русского языка. Uwe Quasthoff, Sabine Fiedler and Erla Hallsteinsdóttir (eds.). Leipziger Universitätsverlag, 2017
BibTeX
-
[GEQ17]
Dirk Goldhahn, Thomas Eckart, and Uwe Quasthoff: A Portal for Corpus Collection for Under-Resourced Languages. In: Workshop of the African Association for Lexicography (AFRILEX), CLASA 2017, Grahamstown, 2017
BibTeX
-
[EGQ17]
Thomas Eckart, Dirk Goldhahn, and Uwe Quasthoff: Using Corpus Query Engines for Facilitating Lexicographical Analysis of African Languages. In: Workshop of the African Association for Lexicography (AFRILEX), CLASA 2017, Grahamstown, South Africa, 2017
BibTeX
-
[GTEGK17]
Till Grallert, Jochen Tiepmar, Thomas Eckart, Dirk Goldhahn, and Christoph Kuras: Digital Muqtabas CTS Integration in CLARIN. In: CLARIN Annual Conference 2017 in Budapest, Hungary, 2017
BibTeX
-
[TEGK16]
Jochen Tiepmar, Thomas Eckart, Dirk Goldhahn, and Christoph Kuras: Canonical Text Services in CLARIN - Reaching out to the Digital Classics and beyond. In: CLARIN Annual Conference 2016, 2016
BibTeX |
Download
-
[GSQ2016]
Dirk Goldhahn, Maciej Sumalvico and Uwe Quasthoff: Corpus collection for under-resourced languages with more than one million speakers. In: Workshop on Collaboration and Computing for Under-Resourced Languages (CCURL), LREC, Portorož, 2016
BibTeX |
Download
-
[BGQR16]
Solomija Buk, Dirk Goldhahn, Uwe Quasthoff and Andrij Rovenchak: Frequency Dictionary Ukrainian - Частотний словник української мови. Uwe Quasthoff, Sabine Fiedler and Erla Hallsteinsdóttir (eds.). Leipziger Universitätsverlag, 2016
BibTeX
-
[QGE15]
Uwe Quasthoff, Dirk Goldhahn and Thomas Eckart: Building Large Resources for Text Mining: The Leipzig Corpora Collection. In: Text Mining - From Ontology Learning to Automated Text Processing Applications, Springer, 2015
BibTeX
-
[DDGQ15]
Martine Dalmas, Dmitrij Dobrovol'skij, Dirk Goldhahn and Uwe Quasthoff: Bewertung durch Adjektive. Ansätze einer korpusgestützten Untersuchung zur Synonymie. In: LiLi - Zeitschrift für Literaturwissenschaft und Linguistik: Bewerten im Wandel, 2015
BibTeX
-
[GEGDH15]
Dirk Goldhahn, Thomas Eckart, Thomas Gloning, Kevin Dreßler, and Gerhard Heyer: Operationalisation of Research Questions of the Humanities within the CLARIN Infrastructure – An Ernst Jünger Use Case. In: CLARIN Annual Conference 2015 in Wroclaw, Poland, 2015
BibTeX |
Download
-
[KGQ15]
Deny A. Kwary, Dirk Goldhahn and Uwe Quasthoff: Frequency Dictionary Indonesian - Kamus Frekuensi Bahasa Indonesia. Uwe Quasthoff, Sabine Fiedler and Erla Hallsteinsdóttir (eds.). Leipziger Universitätsverlag, 2015
BibTeX
-
[HEG15]
Gerhard Heyer, Thomas Eckart, and Dirk Goldhahn: Was sind IT-basierte Forschungsinfrastrukturen für die Geistes- und Sozialwissenschaften und wie können sie genutzt werden? In: Information - Wissenschaft & Praxis, De Gruyter, 2015
BibTeX
-
[EAQG14]
Thomas Eckart, Faisal Alshargi, Uwe Quasthoff, and Dirk Goldhahn: Large Arabic Web Corpora of High Quality: The Dimensions Time and Origin. In: Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools, LREC, Reykjavík, 2014
BibTeX |
Download
-
[EHHQG14]
Thomas Eckart, Erla Hallsteinsdóttir, Sigrún Helgadóttir, Uwe Quasthoff, and Dirk Goldhahn: A 500 Million Word POS-Tagged Icelandic Corpus. In: Proceedings of the International Conference on Language Resources and Evaluation (LREC), 2014
BibTeX |
Download
-
[GQ14]
Dirk Goldhahn and Uwe Quasthoff: Vocabulary-Based Language Similarity using Web Corpora. In: Proceedings of the International Conference on Language Resources and Evaluation (LREC), 2014
BibTeX |
Download
-
[QGEHF14]
Uwe Quasthoff, Dirk Goldhahn, Thomas Eckart, Erla Hallsteinsdóttir and Sabine Fiedler: High Quality Word Lists as a Resource for Multiple Purposes. In: Proceedings of the International Conference on Language Resources and Evaluation (LREC), 2014
BibTeX |
Download
-
[QBG14]
Uwe Quasthoff, Sonja Bosch and Dirk Goldhahn: Morphological analysis for less-resourced languages: Maximum Affix Overlap applied to Zulu. In: Workshop on Collaboration and Computing for Under-Resourced Languages in the Linked Open Data Era, LREC, Reykjavík, 2014
BibTeX |
Download
-
[GRQB14]
Dirk Goldhahn, Steffen Remus, Uwe Quasthoff and Chris Biemann: Top-Level Domain Crawling for Producing Comprehensive Monolingual Corpora from the Web. In: Workshop on Challenges in the Management of Large Corpora (CMLC-2), LREC, Reykjavík, 2014
BibTeX |
Download
-
[QMMEGGM14]
Uwe Quasthoff, Ritwik Mitra, Sunny Mitra, Thomas Eckart, Dirk Goldhahn, Pawan Goyal and Animesh Mukherjee: Large Web Corpora of High Quality for Indian Languages. In: 2nd Workshop on Indian Language Data: Resources and Evaluation, LREC, Reykjavík, 2014
BibTeX |
Download
-
[FGKREGQ14]
Rico Feist, Daniel Gerighausen, Manuel Konrad, Georg Richter, Thomas Eckart, Dirk Goldhahn, and Uwe Quasthoff: Using Significant Word Co-occurences for the Lexical Access Problem. In: Workshop on Cognitive Aspects of the Lexicon (CogALex-IV) at COLING 2014, Dublin, Ireland, 2014
BibTeX
-
[GQH14]
Dirk Goldhahn, Uwe Quasthoff and Gerhard Heyer: Corpus-Based Linguistic Typology: A Comprehensive Approach. In: Proceedings of the 12th Edition of the Konvens Conference, Hildesheim, Germany, 2014
BibTeX |
Download
-
[FGQ14]
Sabine Fiedler, Dirk Goldhahn and Uwe Quasthoff: Frequency Dictionary Esperanto - Oftecvortaro de Esperanto. Uwe Quasthoff, Sabine Fiedler and Erla Hallsteinsdóttir (eds.). Leipziger Universitätsverlag, 2014
BibTeX
-
[BBEGQSSSZ13]
Chris Biemann, Felix Bildhauer, Stefan Evert, Dirk Goldhahn, Uwe Quasthoff, Roland Schäfer, Johannes Simon, Leonard Swiezinski, and Torsten Zesch: Scalable Construction of High-Quality Web Corpora. In: Special Issue of the Journal for Language Technology and Computational Linguistics (JLCL), Gesellschaft für Sprachtechnologie und Computerlinguistik, 2013
BibTeX |
Download
-
[G13]
Dirk Goldhahn: Quantitative Methoden in der Sprachtypologie: Nutzung korpusbasierter Statistiken. 2013
BibTeX |
Download
-
[QGH13]
Uwe Quasthoff, Dirk Goldhahn and Gerhard Heyer: Technical Report Series on Corpus Building. 2013
BibTeX
-
[GHQ13]
Dirk Goldhahn, Zita Hollós and Uwe Quasthoff: Frequency Dictionary Hungarian - Magyar gyakorisági szótár. Uwe Quasthoff, Sabine Fiedler and Erla Hallsteinsdóttir (eds.). Leipziger Universitätsverlag, 2013
BibTeX
-
[EQG12]
Thomas Eckart, Uwe Quasthoff, and Dirk Goldhahn: Language Statistics-Based Quality Assurance for Large Corpora. In: Proceedings of Asia Pacific Corpus Linguistics Conference 2012, Auckland, New Zealand, 2012
BibTeX
-
[GEQ12]
Dirk Goldhahn, Thomas Eckart, and Uwe Quasthoff: Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), 2012
BibTeX |
Download
-
[EQG12a]
Thomas Eckart, Uwe Quasthoff, and Dirk Goldhahn: The Influence of Corpus Quality on Statistical Measurements on Language Resources. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), 2012
BibTeX
-
[GEQ12a]
Dirk Goldhahn, Thomas Eckart and Uwe Quasthoff: Finding Language Universals: Multivariate Analysis of Language Statistics using the Leipzig Corpora Collection. In: Leuven Statistics Days 2012, Leuven, Belgium, 2012
BibTeX
-
[MSBGLTF12]
Julia Merrill, Daniela Sammler, Marc Bangert, Dirk Goldhahn, Gabriele Lohmann, Robert Turner and Angela D Friederici: Perception of words and pitch patterns in song and speech. 2012
BibTeX
-
[FGQ12]
Sabine Fiedler, Dirk Goldhahn and Uwe Quasthoff: Frequency Dictionary English. Uwe Quasthoff, Sabine Fiedler and Erla Hallsteinsdóttir (eds.). Leipziger Universitätsverlag, 2012
BibTeX
-
[GCLT11]
Dirk Goldhahn, Daniel Callan, Gabriele Lohmann and Robert Turner: Song and speech - perception and covert production: New findings using multi-voxel pattern analysis. In: 19th Scientific Meeting & Exhibition of the International Society for Magnetic Resonance in Medicine (ISMRM), Montreal, Canada, 2011
BibTeX
-
[GCSLT11]
Dirk Goldhahn, Daniel Callan, Johannes Stelzer, Gabriele Lohmann and Robert Turner: Perception and covert production of song and speech: New findings using multi-voxel pattern analysis. In: 17th Annual Meeting of the Organization for Human Brain Mapping, Quebec City, Canada, 2011
BibTeX
-
[GQ10]
Dirk Goldhahn and Uwe Quasthoff: Automatic Annotation of Co-Occurrence Relations. In: Proceedings of LREC 2010, Valletta, Malta, 2010
BibTeX
-
[LMHPLGSSVT10]
Gabriele Lohmann, Daniel S. Margulies, Annette Horstmann, Burkhard Pleger, Joeran Lepsien, Dirk Goldhahn, Haiko Schloegl, Michael Stumvoll, Arno Villringer and Robert Turner: Eigenvector centrality mapping for analyzing connectivity patterns in fMRI data of the human brain. In: PLoS ONE 5(4). 2010
BibTeX
-
[MBLLKSGAMLV]
Daniel S. Margulies, Joachim Böttger, Xiangyu Long, Yating Lv, Clare Kelly, Alexander Schäfer, Dirk Goldhahn, Alexander Abbushi, Michael P. Milham, Gabriele Lohmann and Arno Villringer: Resting developments: a review of fMRI post-processing methodologies for spontaneous brain activity. In: Magnetic Resonance Materials in Physics, Biology and Medicine December 2010, Volume 23, Issue 5-6, pp 289-307. 2010
BibTeX