Article
The opportunities and shortcomings of using big data and national databases for sarcoma research
Cancer
  • Heather Lyu, Harvard Medical School, Boston, Massachusetts
  • Adil H Haider, Harvard Medical School, Boston, Massachusetts
  • Adam B Landman, Harvard Medical School, Boston, Massachusetts
  • Chandrajit P Raut, Harvard Medical School, Boston, Massachusetts
Publication Date
9-1-2019
Document Type
Review Article
Abstract

The rarity and heterogeneity of sarcomas make performing appropriately powered studies challenging and magnify the significance of large databases in sarcoma research. Established large tumor registries and population-based databases have become increasingly relevant for answering clinical questions regarding sarcoma incidence, treatment patterns, and outcomes. However, the validity of large databases has been questioned and scrutinized because of the inaccuracy and wide variability of coding practices and the absence of clinically relevant variables. In addition, the utilization of large databases for the study of rare cancers such as sarcoma may be particularly challenging because of the known limitations of administrative data and poor overall data quality. Currently, there are several large national cancer databases, including the Surveillance, Epidemiology, and End Results database, the National Cancer Data Base of the American College of Surgeons and the American Cancer Society, and the National Program of Cancer Registries of the Centers for Disease Control and Prevention. These databases are often used for sarcoma research, but they are limited by their dependence on administrative or billing data, the lack of agreement between chart abstractors on diagnosis codes, and the use of preexisting documented hospital diagnosis codes for tumor registries, which lead to a significant underestimation of sarcomas in large data sets. Current and future initiatives to improve databases and big data applications for sarcoma research include increasing the utilization of sarcoma-specific registries and encouraging national initiatives to expand on real-world, evidence-based data sets.

Citation Information
Heather Lyu, Adil H Haider, Adam B Landman and Chandrajit P Raut. "The opportunities and shortcomings of using big data and national databases for sarcoma research" Cancer Vol. 125 Iss. 17 (2019) pp. 2926-2934
Available at: http://works.bepress.com/adil_haider/247/