The objective of this study is to create tools for analyzing topic lifecycles across heterogeneous corpora. While the growth of large-scale datasets has enabled examination within individual scientific datasets, there is a lack of research that looks across datasets to examine how different scientific activities enable or propel scientific discovery. This project will examine the development of topics in four domains: history of science, social network analysis, cognitive science, and digital humanities. Examining the lifecycle of topics in these domains should provide insights into how scholarship evolves across genres in the social sciences and humanities. Four methods (word analysis, topic modeling, burst detection, and survival analysis) will be triangulated to strengthen the validity of the findings.
This project is innovative in its combination of datasets: it will combine data from formal sources (dissertations, conference proceedings, journal articles, and grant proposals) and informal communication channels (listservs, blogs, and Twitter) in order to provide a more holistic lens on scientific communication. For years, our knowledge of the scholarly landscape and, consequently, our understanding of innovation, productivity, and impact have been largely informed by data from a single source. However, the growth of diverse datasets that reflect unique areas of scholarly activity has altered the research landscape and provides an opportunity to create more accurate understandings of the nature of science. The results of this work will have implications for policy makers as they seek to identify emergent areas of research. The work will also provide an indicator of the importance of particular communication channels for identifying emerging areas of knowledge: it will show which scholarly activities are most indicative of emerging areas and thereby identify datasets that should no longer be marginalized but built into our understandings and measurements of scholarship.
This grant was made as part of the Digging Into Data Challenge, an international competition designed to foster research collaboration across countries and to encourage innovative approaches to analyzing large data sets in the social sciences and humanities.
Principal Investigators: Cassidy R. Sugimoto, Ying Ding, and Staša Milojević (Indiana University, Bloomington; NSF); Vincent Larivière (Université de Montréal; SSHRC); Mike Thelwall (University of Wolverhampton; AHRC/ESRC/JISC).