Similarity Estimation for Identification of Biomedical Articles with Similar Research Focuses
In this project, we plan to extend our previous studies on the retrieval and mining of highly related biomedical articles by developing a method that improves similarity estimation for biomedical articles. Given a biomedical article r, the similarity measure should identify those articles that focus on research issues similar to those of r. Such a measure can support the analysis and curation of biomedical evidence already published in the biomedical literature. Previous inter-article similarity measures often estimated the extent to which two articles cite similar sets of references. They were developed on the expectation that two articles citing similar references may focus on similar research topics, and hence similarity estimation for these references is essential. However, two articles with similar research focuses may still cite different sets of references, which leaves previous similarity measures unable to identify highly related articles properly. We thus plan to improve these similarity measures. The project's contributions are significant both for the development of information retrieval technology and, in practice, for the cross-validation and curation of biomedical evidence published in the huge and ever-growing biomedical literature.
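
The limitation of reference-based measures described above can be illustrated with a minimal sketch. It assumes a plain Jaccard overlap between reference sets as a stand-in for the earlier measures; the abstract does not specify their exact form, and the function name and reference identifiers below are hypothetical.

```python
def reference_overlap_similarity(refs_a, refs_b):
    """Jaccard overlap between two articles' reference sets.

    Assumption: an illustrative stand-in for a citation-based
    similarity measure, not the measure used in the project.
    """
    set_a, set_b = set(refs_a), set(refs_b)
    union = set_a | set_b
    if not union:
        # Neither article cites anything: treat similarity as zero.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical reference identifiers, for illustration only.
article_r = ["ref1", "ref2", "ref3", "ref4"]
article_x = ["ref2", "ref3", "ref4", "ref5"]  # shares three references with r
article_y = ["ref6", "ref7", "ref8"]          # assumed similar focus, disjoint references

sim_rx = reference_overlap_similarity(article_r, article_x)  # 3 shared / 5 distinct
sim_ry = reference_overlap_similarity(article_r, article_y)  # no shared references

print(f"sim(r, x) = {sim_rx:.2f}")
print(f"sim(r, y) = {sim_ry:.2f}")
```

Under this sketch, article y receives a similarity of zero with respect to r regardless of how close its research focus is, because the score depends only on which references the two articles share.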