What is Big Data?

Photo Credit: Jim Kaskade via Compfight cc


“Without Big Data, you are just blind and deaf person in the middle of a freeway”

-Geoffrey Moore

The term Big Data comes up constantly when people talk about IT trends, and as it has grown more popular, many companies have been trying to get into this space. Before we go any further, we should define what Big Data is. Our friends at Wikipedia define it as:

Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, and information privacy. The term often refers simply to the use of predictive analytics or other certain advanced methods to extract value from data, and seldom to a particular size of data set. Accuracy in big data may lead to more confident decision making. And better decisions can mean greater operational efficiency, cost reduction and reduced risk.

This definition covers a lot of key points. First off, the term Big Data is very broad and means different things to different organizations and technologies. Let’s break it down into three key components: first, predictive analysis; second, how Big Data affects management; and last, the technologies behind the Big Data revolution.

Predictive Analysis

As we gather all this data, the hope is that we can spot trends faster than was previously thought possible. In healthcare, for instance, gathering patient data and analyzing it for trends could potentially save lives and reduce the cost of treating patients.

Other industries are looking to predictive and real-time analysis to help drive management decisions. I recently spoke with a corporate executive who described a major financial services company that implemented real-time analysis as part of its Big Data initiative. The new system alerted the company that a major revenue source was trending down over the course of the day. By noon that day, they had discovered that a competitor had changed its pricing, and they matched the change to stay competitive. Because this was a large company, the executive said, the quick response saved billions of dollars. The old system would not have flagged the issue for days or weeks.


Management

This massive amount of data is causing management to change its processes so it can react to the data at its disposal. As applications and systems encompass more data, we must make everyone and everything “data-aware” so that data is not siloed off. Giving the data to management in many forms facilitates real-time decision making and helps predict market movements.


Technologies

The foundational technologies that enable the Big Data revolution allow us to search across unstructured and semi-structured data sources spread over multiple servers. As processing power has increased, so has the amount of data stored on our systems. This has created large datasets that we can now process using techniques such as MapReduce. MapReduce splits a job into small tasks that run in parallel on many nodes (the map step) and then combines their partial results into a final answer (the reduce step). One tool that uses the MapReduce technique is Hadoop, an open-source Apache project.
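To make the MapReduce idea a little more concrete, here is a minimal sketch of a word count in plain Python. It is not Hadoop code and does not use any Hadoop API. The function names (map_phase, shuffle, reduce_phase, word_count) are made up for this illustration, and a local thread pool stands in for the cluster nodes that a real system would use.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by key across every chunk."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks, workers=4):
    # Assumption: worker threads play the role of cluster nodes here.
    # Each chunk is mapped independently, which is what lets a real
    # system spread the work over many machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    # Two hypothetical "documents", each handled as a separate map task.
    chunks = ["big data is big", "data about data"]
    print(word_count(chunks))
```

The key design point is that map_phase never needs to see more than its own chunk, so the chunks can live on different servers. Only the grouped, much smaller intermediate results have to be brought together for the reduce step.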

5 Vs

Big Data is often described, in part, by the 5 Vs used to classify it:

Volume is the quantity of data generated

Variety is the range of different types and content the data can take

Velocity is the speed at which the data is generated

Variability is the inconsistency the data can show

Veracity is the quality of the data

Looking at these different dimensions of data, you can see why Big Data will change our world. This technology revolution will continue to take root in more industries over time and will change many businesses and organizations.




