Between land and sea: divergent data stewardship practices in deep-sea biosphere research
AGU Fall Meeting (American Geophysical Union) (2013)
Abstract
Data in deep-sea biosphere research often live a double life. While the original data generated on IODP expeditions are highly structured, professionally curated, and widely shared, the downstream data practices of deep-sea biosphere laboratories are far more localized and ad hoc. These divergent data practices make it difficult to track the provenance of datasets from the cruise ships to the laboratory or to integrate IODP data with laboratory data. An in-depth study of the divergent data practices in deep-sea biosphere research allows us to:
- Better understand the social and technical forces that shape data stewardship throughout the data lifecycle;
- Develop policy, infrastructure, and best practices to improve data stewardship in small labs;
- Track provenance of datasets from IODP cruises to labs and publications;
- Create linkages between laboratory findings, cruise data, and IODP samples.
We present findings from the first year of a case study of the Center for Dark Energy Biosphere Investigations (C-DEBI), an NSF Science and Technology Center that studies life beneath the seafloor. Our methods include observation in laboratories, interviews, document analysis, and participation in scientific meetings. Our research uncovers the data stewardship norms of geologists, biologists, chemists, and hydrologists conducting multi-disciplinary research. Our research team found that data stewardship on cruises is a clearly defined task performed by an IODP curator, while downstream it is a distributed task that develops in response to local need and to the extent necessary for the immediate research team. IODP data are expensive to collect and challenging to obtain, often costing $50,000/day and requiring researchers to work twelve hours a day onboard the ships.
To maximize this research investment, a highly trained IODP data curator controls data stewardship on the cruise and applies best practices such as standardized formats, proper labeling, and centralized storage. In the laboratory, a scientist is his or her own curator. In contrast to the IODP research parties, laboratory research teams analyze diverse datasets, share them internally, implement ad hoc data management practices, optimize methods for their specific research questions, and release data on request through personal transactions. We discovered that while these workflows help small research teams retain flexibility and local control -- crucial in exploratory deep-sea biosphere research -- they also hinder data interoperability, discoverability, and consistency of methods from one research team to the next. Additional consequences of this contrast between IODP and lab practices are that it is difficult to track the provenance of data and to create linkages between laboratory findings, cruise data, and archived IODP samples. The ability to track provenance would add value to datasets and provide a clearer picture of the decisions made throughout the data lifecycle. Better linkages between the original data, laboratory data, and samples would allow secondary researchers to locate IODP data that may be useful to their research after laboratory findings are published. Our case study is funded by the Sloan Foundation and NSF.
Keywords
- data curation
- data management
- case study
Publication Date
December 2013
Citation Information
Rebekah Cummings and Peter Darch. "Between land and sea: divergent data stewardship practices in deep-sea biosphere research" AGU Fall Meeting (American Geophysical Union) (2013)
Available at: http://works.bepress.com/rebekah_cummings/2/