Hadoop Overview
Every Big Data solution has to deal with the 3V’s: Volume, Velocity and Variety. Hadoop is designed to tackle all three.
a) Hadoop can handle large volumes of data with its distributed file system (HDFS), and the MapReduce programming model processes this data in parallel.
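b) Hadoop keeps up with fast-growing data by spreading work across many machines for high-throughput batch processing. MapReduce itself writes intermediate results to disk; in-memory computation comes from engines such as Apache Spark that can run on a Hadoop cluster.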
c) Hadoop supports a variety of data: structured, semi-structured and unstructured.
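The MapReduce model from point a) can be illustrated with a small word count. This is a minimal single-machine sketch, not Hadoop's actual API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made for illustration, and a thread pool stands in for the cluster nodes that would each process one split.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle(mapped_outputs):
    """Shuffle: group all emitted values by key across the mappers."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(splits, workers=2):
    # Each split is mapped independently of the others; that independence
    # is what lets a real cluster run the map tasks on different machines.
    # (The thread pool here only illustrates the idea, not real speedup.)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, splits))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Hypothetical input: three "splits" of a larger file.
splits = ["big data big clusters", "data moves fast", "big fast data"]
counts = word_count(splits)
print(counts["big"], counts["data"], counts["fast"])  # prints: 3 3 2
```

In a real Hadoop job the splits would be blocks of a file stored in the distributed file system, and the framework would handle the shuffle step between the map and reduce tasks.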
The Three V’s of Big Data
Big data is more than just a lot of data. It is data so large and complex that it is difficult to process using traditional techniques. The three V’s of big data are volume, velocity, and variety.
Volume
Volume refers to the sheer amount of data that businesses must sift through. In 2012, an estimated 2.5 quintillion bytes of data were generated per day, and that number has only grown since. Data at this scale can quickly overwhelm traditional business intelligence (BI) tools and systems, making it difficult for organizations to glean useful insights.
The next V, velocity, captures the speed at which data is generated and processed. Today’s digital economy moves at lightning speed, and customers expect real-time responses from the companies they do business with. To keep up, businesses must be able to quickly collect and analyze large volumes of data in order to make informed decisions about their products, services, and customers.
The last V, variety, refers to the many different types of data that organizations must now deal with. In addition to structured data (data that fits neatly into the rows and columns of a database), there is unstructured data such as text documents, images, videos, social media posts, and logs. To fully unlock the value of big data, businesses need tools and systems that can effectively handle all types of data, not just structured data.
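To make the difference between these data types concrete, here is a small sketch that reads one structured, one semi-structured, and one unstructured sample using only the Python standard library. All of the sample records and field names (order_id, amount, user, geo, and so on) are invented for illustration.

```python
import csv
import io
import json
import re

# Hypothetical samples, one per data type.
structured = "order_id,amount\n1001,25.50\n1002,14.00\n"
semi_structured = '{"user": "ana", "tags": ["sale", "mobile"], "geo": {"country": "PT"}}'
unstructured = "Loved the fast delivery, but the box arrived damaged. #feedback"

# Structured: fixed columns known in advance, so fields are read by name.
rows = list(csv.DictReader(io.StringIO(structured)))
total = sum(float(row["amount"]) for row in rows)

# Semi-structured: self-describing and nested; keys may be missing,
# so lookups use defaults instead of assuming a fixed schema.
doc = json.loads(semi_structured)
country = doc.get("geo", {}).get("country")

# Unstructured: no schema at all; any structure has to be extracted,
# here with a simple pattern that pulls out hashtags.
hashtags = re.findall(r"#(\w+)", unstructured)

print(total, country, hashtags)  # prints: 39.5 PT ['feedback']
```

The structured sample can be queried directly, while the other two need progressively more interpretation before they yield anything to analyze.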
Velocity
DATA IN MOTION
“More data has been created in the past two years than in the rest of human history combined.” – Eric Schmidt, Executive Chairman of Google
This is only half of the story. Not only is Big Data growing at an unprecedented rate, but it is also moving at an unprecedented velocity. In other words, data is being generated faster than ever before. This has a number of consequences:
First, it becomes more difficult to collect all of the relevant data. Consider trying to collect data from social media: by the time you have collected and processed it, the data is effectively out of date, because the conversation has moved on. Second, this pace of change means that historical data becomes less useful for understanding current trends. Third, real-time analysis becomes more important: if you can analyse data as it is being generated, you can make decisions based on up-to-the-minute information.
All of these consequences require new approaches to dealing with data. Traditional methods simply cannot keep up with the pace of change.
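One common way to analyse data as it arrives is to keep a running aggregate over a recent time window rather than storing everything and processing it later. The sketch below counts events seen in the last few seconds. The class name SlidingWindowCounter, the 10-second window, and the sample timestamps are assumptions for illustration, and the code assumes events arrive in timestamp order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events whose timestamps fall within the last window_seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def add(self, timestamp):
        """Record one event and return the current count in the window."""
        self.timestamps.append(timestamp)
        self._evict(timestamp)
        return len(self.timestamps)

    def _evict(self, now):
        # Drop events that are window_seconds old or older.
        # Assumes timestamps were added in non-decreasing order.
        while self.timestamps and self.timestamps[0] <= now - self.window_seconds:
            self.timestamps.popleft()


# Hypothetical event times in seconds.
counter = SlidingWindowCounter(window_seconds=10)
running_counts = [counter.add(t) for t in [0, 1, 2, 11, 12, 30]]
print(running_counts)  # prints: [1, 2, 3, 2, 2, 1]
```

Each new event updates the answer immediately, and old events are discarded as they age out, so memory use depends on the window size rather than on the total amount of data seen.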
Variety
To recap, the three V’s of big data are volume, velocity, and variety.
Volume: the amount of data that is generated.
Velocity: the speed at which the data is generated.
Variety: the different types of data that are generated.