My research combines methods for text retrieval, extraction, machine learning and analytics (TREMA).
Currently, I am working on methods that automatically, in a query-driven manner, retrieve materials from the Web and compose Wikipedia-like articles. For information needs about which the user has very little prior knowledge, the web search paradigm of ten blue hyperlinks is not sufficient. Instead, I envision providing a synthesis of the Web materials that gives a comprehensive overview (TREC CAR).
My goal is to develop algorithms that find what users are looking for based on text content only. In contrast, most Web search algorithms rely on interaction data such as query logs, clicks, or session information, which is not available when searching private document collections. Consequently, we aim to maximize the utility of information retrieval models in combination with methods from natural language processing.
A particular emphasis of my work is combining information from structured knowledge bases such as Wikipedia, Freebase, or DBpedia with text-based reasoning over general document and Web corpora (KG4IR). In my work on "Entity Query Feature Expansion" (SIGIR 2014), I demonstrate that using entity linking and knowledge bases in the retrieval algorithm yields significantly better search results.
Ph.D., Computer Science, Max Planck Institute
SCIENCE & TECHNOLOGY/MATHEMATICS/COMPUTER SCIENCE
CS 696: Independent Study
CS 753: Information Retrieval
CS 780: Top/Information Retrieval
CS 980: Adv Top/Data Sci w/ KnowGraphs
Aliannejadi, M., Hasanain, M., Mao, J., Singh, J., Trippas, J. R., Zamani, H., & Dietz, L. (2018). ACM SIGIR Student Liaison Program. ACM SIGIR Forum, 51, 42-45.
Dietz, L., Xiong, C., & Meij, E. (2018). Overview of The First Workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis (KG4IR). ACM SIGIR Forum, 51, 139-144.
Nanni, F., Dietz, L., & Ponzetto, S. P. (2017). Data from the paper: Towards a Computational History of Universities: Evaluating Text Mining Methods for Interdisciplinarity Detection from Ph. D. Dissertation Abstracts. Digital Scholarship in the Humanities.
Nanni, F., Dietz, L., & Ponzetto, S. P. (2017). Toward a computational history of universities: Evaluating text mining methods for interdisciplinarity detection from PhD dissertation abstracts. Digital Scholarship in the Humanities.
Nanni, F., Zhao, Y., Ponzetto, S. P., & Dietz, L. (2016). Enhancing Domain-Specific Entity Linking in DH. computational linguistics, 2, 67-88.
Nanni, F., Dietz, L., Faralli, S., Glavaš, G., & Ponzetto, S. P. (2016). Capturing interdisciplinarity in academic abstracts. D-Lib Magazine, 22.
Konietzny, S. G. A., Dietz, L., & McHardy, A. C. (2011). Inferring functional modules of protein families with probabilistic topic models. BMC Bioinformatics, 12, 141.
Dietz, L. (2010). Directed factor graph notation for generative models. Max Planck Institute for Informatics, Tech. Rep.
Dietz, L. (2006). Exploring Social Topic Networks with the Author-Topic Model. Proceedings of ESWC’06, 54-60.
Tandler, P., & Dietz, L. (2005). Cooperation in ubiquitous computing: an extended view on sharing. In From Integrated Publication and Information Systems to Information and Knowledge Environments (pp. 241-250). Springer Berlin Heidelberg.