Data Lake: an opportunity for Big Data?
  • Ahmed Banafa, San Jose State University
A “data lake” is a massive, easily accessible data repository for storing “big data”. Unlike traditional data warehouses, which are optimized for data analysis by storing only some attributes and dropping data below the level of aggregation, a data lake is designed to retain all attributes, especially when you do not yet know what the scope of the data or its use will be. Currently, Hadoop is the most common technology used to create a data lake. It is important to distinguish between Hadoop and a data lake: a data lake is a concept, and Hadoop is a technology for implementing that concept.
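
The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not taken from the publication and does not use Hadoop; the event records, field names, and the functions load_warehouse, load_lake, and read_lake are all hypothetical, chosen only to show that a warehouse-style load keeps selected attributes in aggregated form while a lake-style load retains every attribute for uses not yet known.

```python
import json
from collections import defaultdict

# Hypothetical raw events; the field names are illustrative assumptions.
raw_events = [
    {"user": "a", "amount": 10.0, "device": "mobile", "referrer": "ad-1"},
    {"user": "a", "amount": 5.0, "device": "desktop", "referrer": None},
    {"user": "b", "amount": 7.5, "device": "mobile", "referrer": "ad-2"},
]


def load_warehouse(events):
    """Warehouse-style load: keep only chosen attributes and aggregate them.

    Everything below the level of aggregation (device, referrer,
    individual amounts) is dropped at load time.
    """
    totals = defaultdict(float)
    for event in events:
        totals[event["user"]] += event["amount"]
    return dict(totals)


def load_lake(events):
    """Lake-style load: retain every record with all attributes, as JSON lines."""
    return [json.dumps(event) for event in events]


def read_lake(lines, attribute):
    """Count records per value of an attribute chosen only at read time."""
    counts = defaultdict(int)
    for line in lines:
        counts[json.loads(line).get(attribute)] += 1
    return dict(counts)


warehouse = load_warehouse(raw_events)
lake = load_lake(raw_events)
device_counts = read_lake(lake, "device")
```

In this sketch, a question about devices can still be answered from the lake, whereas the warehouse copy no longer holds that attribute.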

Publication Date
August 12, 2014
Citation Information
Ahmed Banafa. "Data Lake: an opportunity for Big Data?" (2014)
Available at: