Laura Dietz

Office: Computer Science, Kingsbury Hall, Durham, NH 03824

My research combines methods for text retrieval, extraction, machine learning and analytics (TREMA).

Currently, I am working on methods that automatically, and in a query-driven manner, retrieve materials from the Web and compose Wikipedia-like articles. For information needs about which the user has very little prior knowledge, the Web search paradigm of ten blue links is not sufficient. Instead, I envision providing a synthesis of Web materials that gives a comprehensive overview (TREC CAR).

My goal is to develop algorithms that find what users are looking for based on text content only. In contrast, most Web search algorithms rely on interaction data such as query logs, clicks, or session information, which is not available when searching private document collections. Consequently, we aim to maximize the utility of information retrieval models in combination with methods from natural language processing.

A particular emphasis of my work is to utilize information from structured knowledge bases such as Wikipedia, Freebase, or DBpedia together with text-based reasoning on general document and Web corpora (KG4IR). In my work on "Entity Query Feature Expansion" (SIGIR 2014), I demonstrate that significantly better search results are obtained when using entity linking and knowledge bases in the retrieval algorithm.


Education

  • Ph.D., Computer Science, Max Planck Institute
  • M.S., Goethe University, Germany
  • B.S., Goethe University, Germany

Research Interests

  • Computer Science

Courses Taught

  • CS 696W: Independent Study
  • CS 753/853: Information Retrieval
  • CS 758/858: Algorithms
  • CS 780/880: Top/Machine Learn for Sequences
  • CS 953: DS - Knowledge Graphs and Text
  • CS 980: Adv Top/Data Sci w/ KnowGraphs
  • CS 999: Doctoral Research

Selected Publications

Dietz, L., Xiong, C., Dalton, J., & Meij, E. (2019). Special issue on knowledge graphs and semantics in text analysis and retrieval. Information Retrieval Journal, 22(3-4), 229-231. doi:10.1007/s10791-019-09354-z

Nanni, F., Dietz, L., & Ponzetto, S. P. (2018). Toward a computational history of universities: Evaluating text mining methods for interdisciplinarity detection from PhD dissertation abstracts. Digital Scholarship in the Humanities, 33(3), 612-620. doi:10.1093/llc/fqx062

Weiland, L., Hulpuş, I., Ponzetto, S. P., Effelsberg, W., & Dietz, L. (2018). Knowledge-rich image gist understanding beyond literal meaning. Data & Knowledge Engineering, 117, 114-132. doi:10.1016/j.datak.2018.07.006

Nanni, F., Ponzetto, S. P., & Dietz, L. (2018). Toward comprehensive event collections. International Journal on Digital Libraries, 1-15.

Weiland, L., Ponzetto, S. P., Effelsberg, W., & Dietz, L. (2018). Understanding the Gist of Images-Ranking of Concepts for Multimedia Indexing. arXiv preprint arXiv:1809.08593.

Dietz, L., Xiong, C., & Meij, E. (2018). Overview of The First Workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis (KG4IR). ACM SIGIR Forum, 51, 139-144.

Aliannejadi, M., Hasanain, M., Mao, J., Singh, J., Trippas, J. R., Zamani, H., & Dietz, L. (2018). ACM SIGIR Student Liaison Program. ACM SIGIR Forum, 51, 42-45.

Nanni, F., Dietz, L., & Ponzetto, S. P. (2017). Data from the paper: Towards a Computational History of Universities: Evaluating Text Mining Methods for Interdisciplinarity Detection from Ph. D. Dissertation Abstracts. Digital Scholarship in the Humanities.

Nanni, F., Zhao, Y., Ponzetto, S. P., & Dietz, L. (2016). Enhancing domain-specific entity linking in DH. computational linguistics, 2, 67-88.

Most Cited Publications