Jeffrey Engler and Julia Kent, Council of Graduate Schools
The challenge of Big Data in graduate student research
Large datasets present exciting new opportunities for the U.S. and global research enterprise. Indeed, “big data” approaches to research have the potential to develop new knowledge and innovations across nearly every broad field of study, particularly in the biomedical sciences, computer science, engineering, and the social sciences. Yet the methods used to assemble large datasets, and their applications in decision-making contexts, challenge existing ethical paradigms for data management, data integrity, human subject protections, and data use. In many fields, for example, aggregating data from different sources can make privacy protections for human subjects more complex, and raise questions about data ownership. In others, the use of algorithms and predictive analytics may lead researchers to influence—not simply predict—human behaviors. Unfortunately, current attempts to identify and address these challenges are often focused within specific disciplines or corporate settings and offer little opportunity to integrate these evolving ethical concerns within graduate programs preparing the next generation of researchers.
The graduate dean’s role in training in academic integrity
Graduate deans often oversee professional development and Responsible Conduct of Research (RCR) training curricula. They are uniquely positioned to present the ethical concerns of big data research to their university communities and to bridge the silos that impede the sharing of best practices for addressing these evolving challenges. To address this gap in graduate student preparation, the Council of Graduate Schools (CGS) and PERVADE (Pervasive Data Ethics for Computational Systems) embarked on a project to better understand the challenges and opportunities universities face in preparing graduate students in the ethical use of big data. Our goals were to identify both broad and specific ethical challenges that arise from the use of big data resources in graduate student research; to discuss and evaluate existing resources for training in the ethical use of big data; to identify potential levers for introducing and discussing these challenges, and for engaging Principal Investigators (PIs) and advisors in helping students prepare for them; and to formulate potential strategies for deploying and embedding resources for big data ethics within academic programs, professional development opportunities, and RCR training.
Workshop on expanding graduate training in big data ethics
With generous funding from the Office of Research Integrity (ORI) and Elsevier, CGS and PERVADE convened a diverse group of graduate education leaders around these topics. The virtual event, held in April 2021, brought together graduate deans, experts in the ethics of big data research, and representatives from disciplinary societies and other organizations. This report synthesizes lessons learned from this event with the goal of informing and strengthening efforts to prepare graduate students for the challenges of big data research.
The five major conclusions and recommendations from this collective work are intended to stimulate further action and reflection in the research and graduate education communities.
Conclusions and Recommendations:
All participants at the workshop agreed that discussion of this developing ethical concern should continue, in order to support graduate deans and other institutional leaders in expanding their efforts to promote the ethical use of these research methods and databases. CGS will continue to provide updated resources on this developing area of research integrity training, as well as a forthcoming report, Preparing Graduate Students for the Ethical Challenges of Big Data.