BACKGROUND: Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of healthcare information for both professionals and the lay public.
OBJECTIVE: This document quantifies (1) the amount of medical content on Wikipedia, (2) the citations supporting Wikipedia’s medical content, (3) the readership of medical content, and (4) the quantity and characteristics of Wikipedia’s medical contributors.
METHODS: Using a well-defined categorization infrastructure, we identify medically pertinent English Wikipedia articles and links to their foreign language equivalents (Objective 1). With these, Wikipedia’s API can be queried to produce metadata and full texts for entire article histories (Objectives 1-2). Wikipedia also makes available hourly reports that aggregate reader traffic at per-article granularity (Objective 3). An online survey was used to determine the background of contributors (Objective 4). Standard mining and visualization techniques (e.g., aggregation queries, cumulative distribution functions, and correlation metrics) are applied to each of these datasets. Analysis focuses on year-end 2013, but historical data permits some longitudinal analysis.
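The traffic aggregation and cumulative distribution steps mentioned above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `aggregate_views` and `empirical_cdf` are hypothetical, and the input is assumed to be whitespace-separated lines of the form `project title count bytes`, one per article per hour.

```python
from collections import Counter


def aggregate_views(hourly_lines, medical_titles, project="en"):
    """Sum view counts per article over many hourly report lines.

    Assumption: each line looks like 'project title count bytes'.
    Lines that are malformed, belong to another project, or name an
    article outside medical_titles are ignored.
    """
    totals = Counter()
    for line in hourly_lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip malformed lines
        proj, title, count = parts[0], parts[1], parts[2]
        if proj == project and title in medical_titles and count.isdigit():
            totals[title] += int(count)
    return totals


def empirical_cdf(values):
    """Return sorted (value, fraction of observations <= value) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    cdf = []
    for i, v in enumerate(ordered, start=1):
        if cdf and cdf[-1][0] == v:
            cdf[-1] = (v, i / n)  # tie: keep the highest rank for this value
        else:
            cdf.append((v, i / n))
    return cdf


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "en Asthma 10 100",
        "en Asthma 5 50",
        "en Diabetes 20 200",
        "de Asthma 7 70",
    ]
    totals = aggregate_views(sample, {"Asthma", "Diabetes"})
    for value, fraction in empirical_cdf(totals.values()):
        print(value, fraction)
```

Plotting the resulting pairs shows how unevenly readership is distributed across articles, which is the kind of question a cumulative distribution function answers for per-article traffic.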
RESULTS: Wikipedia’s medical content (at the end of 2013) comprises more than 155,000 articles and 1 billion bytes of text across more than 255 languages. This content is supported by more than 950,000 references. Content was viewed more than 4.88 billion times in 2013. This makes it one of the most viewed medical resources globally, if not the most viewed. The core editor community numbers fewer than 300 and has declined over the past 5 years. Half of the members of this community are health care providers, and 85% have a university education.
CONCLUSIONS: While Wikipedia has a considerable volume of multilingual medical content that is extensively read and well-referenced, the core group of editors who contribute and maintain that content is small and shrinking.
- Web 2.0
- collaborative applications
- user generated content
- traffic analysis
- multilingual content