Mining document, concept, and term associations for effective biomedical retrieval - Introducing MeSH-enhanced retrieval models

dc.contributor.author: Mao, Jin
dc.contributor.author: Mu, Xiangming
dc.contributor.author: Li, Gang
dc.date.accessioned: 2016-11-05T16:13:47Z
dc.date.available: 2016-11-05T16:13:47Z
dc.date.issued: 2015
dc.description.abstract: Manually assigned subject terms, such as Medical Subject Headings (MeSH) in the health domain, describe the concepts or topics of a document. Existing information retrieval models do not take full advantage of such information. In this paper, we propose two MeSH-enhanced (ME) retrieval models that integrate the concept layer (i.e. MeSH) into the language modeling framework to improve retrieval performance. The new models quantify associations between documents and their assigned concepts to construct conceptual representations for the documents, and mine associations between concepts and terms to construct generative concept models. The two ME models reconstruct two essential estimation processes of the relevance model (Lavrenko and Croft 2001) by incorporating the document-concept and the concept-term associations. More specifically, in Model 1, language models of the pseudo-feedback documents are enriched by their assigned concepts. In Model 2, concepts that are related to users’ queries are first identified, and then used to reweight the pseudo-feedback documents according to the document-concept associations. Experiments carried out on two standard test collections show that the ME models outperformed the query likelihood model, the relevance model (RM3), and an earlier ME model. A detailed case analysis provides insight into how and why the new models improve/worsen retrieval performance. Implications and limitations of the study are discussed. This study provides new ways to formally incorporate semantic annotations, such as subject terms, into retrieval models. The findings of this study suggest that integrating the concept layer into retrieval models can further improve the performance over the current state-of-the-art models.
dc.description.peerreview: Yes
dc.identifier.citation: Mao, J., Lu, K., Mu, X., & Li, G. (2015). Mining document, concept and term associations for effective biomedical retrieval - Introducing MeSH-enhanced retrieval models. Information Retrieval Journal, 18(5), 413-444.
dc.identifier.doi: 10.1007/s10791-015-9264-0
dc.identifier.uri: http://hdl.handle.net/11244/45540
dc.language: en_US
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/
dc.subject: Information retrieval
dc.subject: MeSH
dc.subject: Concept-based retrieval
dc.subject: Retrieval models
dc.subject: Text mining
dc.title: Mining document, concept, and term associations for effective biomedical retrieval - Introducing MeSH-enhanced retrieval models
dc.type: Article
ou.group: College of Arts and Sciences

Files

Original bundle
Name: IR_pre-print.pdf
Size: 2.15 MB
Format: Adobe Portable Document Format
Description: Pre-print main article