Mining document, concept, and term associations for effective biomedical retrieval - Introducing MeSH-enhanced retrieval models
Abstract
Manually assigned subject terms, such as Medical Subject Headings (MeSH) in the health domain, describe the concepts or topics of a document. Existing information retrieval models do not take full advantage of such information. In this paper, we propose two MeSH-enhanced (ME) retrieval models that integrate the concept layer (i.e., MeSH) into the language modeling framework to improve retrieval performance. The new models quantify associations between documents and their assigned concepts to construct conceptual representations for the documents, and mine associations between concepts and terms to construct generative concept models. The two ME models reconstruct two essential estimation processes of the relevance model (Lavrenko and Croft 2001) by incorporating the document-concept and the concept-term associations. More specifically, in Model 1, language models of the pseudo-feedback documents are enriched by their assigned concepts. In Model 2, concepts that are related to users’ queries are first identified, and then used to reweight the pseudo-feedback documents according to the document-concept associations. Experiments carried out on two standard test collections show that the ME models outperformed the query likelihood model, the relevance model (RM3), and an earlier ME model. A detailed case analysis provides insight into how and why the new models improve or worsen retrieval performance. Implications and limitations of the study are discussed. This study provides new ways to formally incorporate semantic annotations, such as subject terms, into retrieval models. The findings suggest that integrating the concept layer into retrieval models can improve performance beyond that of current state-of-the-art models.
Citation
Mao, J., Lu, K., Mu, X., & Li, G. (2015). Mining document, concept and term associations for effective biomedical retrieval - Introducing MeSH-enhanced retrieval models. Information Retrieval Journal, 18(5), 413-444.
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States
Related items
Showing items related by title, author, creator and subject.
- 4DVAR retrieval of prognostic land surface model variables.
  Ren, Diandong (2004). The major findings of the first type of retrieval are: Initial soil moisture contents as well as deep soil temperature can all be successfully retrieved, for realistic initial guess errors; the relative difficulty in ...
- Effects of semantic relatedness on the retrieval of words from long-term memory
  Jastrzembski, James Edward (1976-07)
- Performance-cost-value decision parameters of reference retrieval systems
  Pederson, John Alvin (The University of Oklahoma, 1973)