Data Mining of Citations in Doctoral Dissertations: Tool for Collection Development and Instructional Services

Abstract
Usage statistics, such as access and download data, are widely used by collection development librarians to assess how relevant and useful a library’s collection is to its patrons. Citation analysis of students’ theses and dissertations adds another dimension to this evidence-based, user-centered approach to assessing the library’s collection development activities.

In this project, a liaison librarian and a systems specialist teamed up to take a systems approach to analyzing the citations in doctoral dissertations from the Biological and Environmental Science and Engineering (BESE) Division of a graduate research institution. Using KNIME, an open-source data-mining tool, we created a workflow that examines citation data to discover resource usage and citation patterns in student dissertations across the different programs within the BESE Division. The citation data are then matched against current library holdings and compared with usage statistics obtained from JUSP.
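The matching step described above can be illustrated with a minimal Python sketch. This is not the KNIME workflow used in the project; it only shows the general idea of counting cited journals per program and checking them against a holdings list. The function names, the title normalization rule, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def normalize(title):
    """Lowercase and collapse whitespace so simple title variants compare equal."""
    return " ".join(title.lower().split())


def citation_summary(citations, holdings):
    """Count cited journals per program and check them against holdings.

    citations: iterable of (program, journal_title) pairs
    holdings:  iterable of journal titles the library holds
    """
    held = {normalize(t) for t in holdings}

    # Tally how often each (normalized) journal is cited within each program.
    counts = defaultdict(Counter)
    for program, journal in citations:
        counts[program][normalize(journal)] += 1

    summary = {}
    for program, counter in counts.items():
        total = sum(counter.values())
        covered = sum(n for j, n in counter.items() if j in held)
        summary[program] = {
            "total_citations": total,
            "held_share": covered / total,  # fraction of citations to held titles
            "not_held": sorted(j for j in counter if j not in held),
        }
    return summary


# Invented sample data, for illustration only (not from the study).
sample_citations = [
    ("Program A", "Journal X"),
    ("Program A", "Journal Y"),
    ("Program A", "journal  x"),
    ("Program B", "Journal Z"),
    ("Program B", "Journal X"),
]
sample_holdings = ["Journal X", "Journal Y"]

result = citation_summary(sample_citations, sample_holdings)
```

Titles cited but not held surface in the `not_held` list for each program, which is the kind of signal a collection development review might start from. Real citation data would need far more robust title matching (abbreviations, ISSNs) than this sketch assumes.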

Results suggest that the BESE Division is not homogeneous as an academic division: citation patterns differ across its programs. What references are cited, and how they are cited, is also valuable information that can inform, direct, and focus our collection development and information literacy program.

The use of open-source data-mining software helps automate the citation analysis process and provides an efficient, replicable framework for analyzing citation data to supplement usage statistics. This framework would be useful for academic libraries planning similar studies to assess the usefulness of their collections with respect to the research activities of graduate students.

Conference/Event Name
Library Assessment Conference
