Answered By: Bill Corey
Last Updated: Nov 18, 2015

Everyone has a different definition of Big Data; it often depends on your discipline.

Probably the most widely used definition comes from the NSF and NIH, which, in a joint program solicitation in 2012, defined it as follows: "big data...refers to large, diverse, complex, longitudinal, and/or distributed data sets generated from instruments, sensors, Internet transactions, email, video, click streams, and/or all other digital sources available today and in the future."

Michelle Chen describes it as "...the amassing of huge amounts of statistical information on social and economic trends and human behavior..."

Michael Jordan, a machine learning researcher at UC-Berkeley, says that "The issue is not just size—we’ve always had big data sets—the issue is granularity." For examples, look to disciplines such as astronomy, physics, and medicine (specifically genomic and protein sequencing) for big datasets that have been around for a while.

Hannah Wallach, a researcher in computational social science and machine learning, goes a bit further. She says that "not only do these data sets document social phenomena, they do so at the granularity of individual people and their activities." (ibid.)

Another important aspect of defining big data is the set of tools used to work with and analyze it, because most existing systems are inadequate. Gartner Research defines big data as "...high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation." Volume refers to the amount of data, velocity to the speed with which it can be obtained or processed, and variety to the types and sources of the data.


