My research combines methods for text retrieval, extraction, machine learning and analytics (TREMA).
Currently, I am working on methods that automatically, and in a query-driven manner, retrieve materials from the Web and compose Wikipedia-like articles. For information needs about which the user has very little prior knowledge, the web search paradigm of ten blue hyperlinks is not sufficient. Instead, I envision providing a synthesis of Web materials that gives a comprehensive overview (TREC CAR).
My goal is to develop algorithms that find what users are looking for based on text content only. In contrast, most Web-search algorithms rely on interaction data such as query logs, clicks, or session information, which is not available when searching private document collections. Consequently, I aim to maximize the utility of information retrieval models in combination with methods from natural language processing.
A particular emphasis of my work is to utilize information from structured knowledge bases such as Wikipedia, Freebase, or DBpedia together with text-based reasoning on general document and Web corpora (KG4IR). In my work on "Entity Query Feature Expansion" (SIGIR 2014), I demonstrate that significantly better search results are obtained when using entity linking and knowledge bases in the retrieval algorithm.
Ph.D., Computer Science, Max Planck Institute
M.S., Goethe University, Germany
B.S., Goethe University, Germany
SCIENCE & TECHNOLOGY/MATHEMATICS/COMPUTER SCIENCE
CS 696W: Independent Study
CS 753/853: Information Retrieval
CS 758/858: Algorithms
CS 780/880: Top/Information Retrieval
CS 953: DS - Knowledge Graphs and Text
CS 980: Adv Top/Data Sci w/ KnowGraphs
CS 999: Doctoral Research
Dietz, L., Xiong, C., Dalton, J., & Meij, E. (2019). Special issue on knowledge graphs and semantics in text analysis and retrieval [Special issue]. Information Retrieval Journal, 1-3.
Weiland, L., Hulpuş, I., Ponzetto, S. P., Effelsberg, W., & Dietz, L. (2018). Knowledge-rich image gist understanding beyond literal meaning. Data & Knowledge Engineering, 117, 114-132. doi:10.1016/j.datak.2018.07.006
Nanni, F., Dietz, L., & Ponzetto, S. P. (2018). Toward a computational history of universities: Evaluating text mining methods for interdisciplinarity detection from PhD dissertation abstracts. Digital Scholarship in the Humanities, 33(3), 612-620. doi:10.1093/llc/fqx062
Weiland, L., Ponzetto, S. P., Effelsberg, W., & Dietz, L. (2018). Understanding the Gist of Images-Ranking of Concepts for Multimedia Indexing. arXiv preprint arXiv:1809.08593.
Nanni, F., Ponzetto, S. P., & Dietz, L. (2018). Toward comprehensive event collections. International Journal on Digital Libraries, 1-15.
Dietz, L., Xiong, C., & Meij, E. (2018). Overview of The First Workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis (KG4IR). ACM SIGIR Forum, 51, 139-144.
Aliannejadi, M., Hasanain, M., Mao, J., Singh, J., Trippas, J. R., Zamani, H., & Dietz, L. (2018). ACM SIGIR Student Liaison Program. ACM SIGIR Forum, 51, 42-45.
Nanni, F., Dietz, L., & Ponzetto, S. P. (2017). Data from the paper: Towards a Computational History of Universities: Evaluating Text Mining Methods for Interdisciplinarity Detection from Ph. D. Dissertation Abstracts. Digital Scholarship in the Humanities.
Nanni, F., Dietz, L., Faralli, S., Glavaš, G., & Ponzetto, S. P. (2016). Capturing interdisciplinarity in academic abstracts. D-Lib Magazine, 22, 9.
Nanni, F., Zhao, Y., Ponzetto, S. P., & Dietz, L. (2016). Enhancing domain-specific entity linking in DH. computational linguistics, 2, 67-88.