Data mining of Citations in Theses: a workflow for automated analysis of Open Access and library holdings coverage (2019-06-19) [Poster] A systems specialist and a liaison librarian collaborated on this project to analyze resource usage, Open Access coverage, library holdings coverage, and citation patterns in a collection of doctoral theses at a graduate research university. The extracted citations, together with basic metadata about the theses and their authors, were processed using a workflow built in KNIME, an open-source data-mining platform. The workflow queries the Summon and Crossref APIs for library holdings coverage and the CORE and Unpaywall APIs for Open Access coverage, and stores the output in an SQLite database to enable detailed analysis. The tool provides insight into the resources actually used to produce doctoral theses. It would be useful for academic libraries that want to evaluate the impact of Open Access resources on their institution's scholarly output, and to assess how well their holdings cover the institution's research activity, beyond the usual usage reports provided by publishers or third parties.
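The Open Access lookup step of such a workflow can be sketched in a few lines. The following is a minimal illustration (not the actual KNIME workflow) of checking a cited DOI against the Unpaywall v2 REST API and recording the result in SQLite; the contact email and the `citations` table layout are assumptions for the example.

```python
"""Illustrative sketch: check a cited DOI's Open Access status via the
Unpaywall v2 API and record it in an SQLite output database.
EMAIL and the table schema are hypothetical, not the project's own."""
import json
import sqlite3
import urllib.request

EMAIL = "librarian@example.edu"  # Unpaywall requires a contact email


def unpaywall_url(doi: str) -> str:
    """Build the Unpaywall v2 request URL for a DOI."""
    return f"https://api.unpaywall.org/v2/{doi}?email={EMAIL}"


def parse_oa_status(payload: str) -> tuple:
    """Extract (doi, is_oa) from an Unpaywall JSON response body."""
    record = json.loads(payload)
    return record["doi"], bool(record.get("is_oa"))


def store_citation(conn: sqlite3.Connection, doi: str, is_oa: bool) -> None:
    """Insert one citation's OA status into the output database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS citations (doi TEXT PRIMARY KEY, is_oa INTEGER)"
    )
    conn.execute("INSERT OR REPLACE INTO citations VALUES (?, ?)", (doi, int(is_oa)))


def check_doi(conn: sqlite3.Connection, doi: str) -> bool:
    """Fetch OA status from Unpaywall and store it (performs a network call)."""
    with urllib.request.urlopen(unpaywall_url(doi)) as resp:
        doi, is_oa = parse_oa_status(resp.read().decode())
    store_citation(conn, doi, is_oa)
    return is_oa
```

Keeping the URL building, JSON parsing, and database writes in separate functions mirrors how a KNIME workflow chains discrete nodes, and makes each step testable offline.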
Data mining of Citations in Doctoral Dissertations: Tool for Collection Development and Instructional Services (2018-12) [Poster] Usage statistics, such as access and download data, are a standard tool in a collection development librarian's toolkit for assessing the relevance and usefulness of a library's collection to its patrons. Citation analysis of students' theses and dissertations adds another dimension to this evidence-based, user-centered approach to assessing the library's collection development activities. In this project, a liaison librarian and a systems specialist teamed up to analyze the citations of doctoral dissertations from the Biological and Environmental Science and Engineering (BESE) Division at a graduate research institution. Using KNIME, an open-source data-mining platform, we created a workflow to examine citation data and uncover resource usage and citation patterns across the programs within the BESE Division. The results were matched against current library holdings and compared with usage statistics obtained from JUSP. They suggest that the BESE Division is not homogeneous: citation patterns differ across its programs. What is cited, and how, is also valuable information for informing, directing, and focusing our collection development and information literacy programs. Open-source data-mining software automates the citation analysis process and provides an efficient, replicable framework for analyzing citation data to supplement usage statistics. This would be useful for academic libraries planning similar studies to assess how well their collections support the research activities of graduate students.
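The per-program comparison described above boils down to grouping extracted citations by academic program and ranking the cited sources. A minimal sketch of that aggregation step, with hypothetical program names and sample data (not the project's actual results):

```python
"""Hedged sketch of per-program citation aggregation: group extracted
citations by program and rank cited journals. The sample data and
program names are hypothetical illustrations only."""
from collections import Counter, defaultdict


def citation_patterns(citations):
    """Given (program, journal) pairs, return journals ranked by
    citation count for each program."""
    by_program = defaultdict(Counter)
    for program, journal in citations:
        by_program[program][journal] += 1
    return {program: counts.most_common() for program, counts in by_program.items()}


# Hypothetical extracted citations: (program, cited journal)
sample = [
    ("Bioscience", "Nature"),
    ("Bioscience", "Cell"),
    ("Bioscience", "Nature"),
    ("Environmental Science", "Water Research"),
]
patterns = citation_patterns(sample)
# e.g. patterns["Bioscience"] ranks "Nature" (2 citations) above "Cell" (1)
```

Comparing these ranked lists across programs is what surfaces the kind of heterogeneity the poster reports within the BESE Division.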
Using the IR as a Research Data Registry (2018-05-30) [Poster] As data and software become increasingly common research outputs, universities have an opportunity to expand their existing efforts to record affiliated publications so that they also capture information about research data releases. At KAUST we have taken several steps to put our repository on a path towards becoming a reliable registry of information about the existence and location of research data released by affiliated researchers. These included developing a process to retrospectively retrieve and register information about datasets with machine-readable relationships to publications already in the repository, and updating our active publications tracking procedures so that data availability statements are retrieved at the time of harvesting and checked for references to research data. The presentation concludes by discussing how these efforts position the repository to provide expanded services in support of improved research data management, including access to and preservation of research data not explicitly linked to a formal publication.
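Checking a data availability statement for references to research data can be as simple as scanning the text for dataset DOIs. The sketch below uses a generic DOI regex; it is an assumption for illustration, not KAUST's actual harvesting rule, and the sample statement is invented.

```python
"""Hedged sketch: scan a data availability statement for dataset DOIs.
The regex is a generic DOI pattern, not the repository's actual rule."""
import re

# Generic DOI shape: "10." + 4-9 digit registrant code + "/" + suffix
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:\w]+")


def find_dataset_dois(statement: str):
    """Return DOIs mentioned in a data availability statement,
    stripping trailing sentence punctuation from each match."""
    return [m.rstrip(".,;") for m in DOI_PATTERN.findall(statement)]


# Hypothetical data availability statement
stmt = ("The datasets generated during this study are available in the "
        "Zenodo repository, https://doi.org/10.5281/zenodo.1234567.")
```

Statements that yield no matches would then be flagged for manual review, since data may be referenced by repository name or accession number rather than DOI.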
Capturing from the start: Managing grey literature in a brand new university and enriching the institutional repository (2014-12-09) [Poster] The institutional repository has become a popular solution for capturing, preserving, and sharing a university's intellectual output, and it is of great value in advancing scholarly communication. For published materials such as peer-reviewed articles and conference papers, it serves as an additional, and at best unrestricted (Open Access), dissemination channel; for grey literature, it may be the only sharing method, which makes grey literature the unique material that adds significant value to a university's repository. This poster describes the phases of establishing an institutional repository at a brand new university. We discuss early consideration of grey literature, success stories, difficulties, establishing workflows, plagiarism checks, approvals, stakeholder involvement, and establishing policies and service level agreements. Future challenges associated with grey literature include the increasing demand for different material types such as audio, video, and research datasets. The skill sets required to describe grey literature materials, along with expertise in preserving datasets, managing research data, and ensuring continued access, will challenge any ETD program in the near future. Finally, we wrap up with statistics on downloads and views of the shared grey literature in the repository and how its availability has led to increased item citations.