There has been a tremendous surge of data from sensors and devices, in various formats, produced by standalone or connected applications. This growth has made the data harder to process, analyze, store, and understand. Some striking statistics:
Google indexed approximately one million web pages in 1998. That figure rose to 1 billion in 2000 and crossed 1 trillion in 2008, and it is still growing as you read this. Keeping pace with the data being gathered is becoming much more difficult.
Social networking applications such as Facebook and Twitter allow users to create and share content spontaneously, adding to the already enormous volume of the web.
Last but not least is the medium through which much of this data is transferred: the mobile phone. The coming years will witness exponential growth in data driven by Internet of Things (IoT) applications.
Hence, there is a need for tools that can mine all this data efficiently. This is where big data comes into the picture. Big data and data mining go hand in hand, and data mining for business intelligence has its own advantages. In this write-up we discuss what the future holds for big data and data mining.
The term big data first appeared in 1998, in a Silicon Graphics (SGI) slide deck by John Mashey titled “Big Data and the Next Wave of InfraStress”. It has gained a lot of prominence since then.
The first academic paper with the words “Big Data” in its title was published by Diebold in 2000. Big data emerged because a wide array of data was being processed daily. To mine that data properly, data mining had to be aligned with the needs of business intelligence, and this is the context in which big data took hold.
Some startling internet usage figures presented at the BigMine workshop:
- Google receives more than 1 billion queries every day.
- Twitter sees more than 250 million tweets every day.
- Facebook receives roughly 800 million updates every day.
- YouTube has more than 4 billion views every day.
The data being processed nowadays is measured in zettabytes, and the figure is growing by more than 40% each year. Mobile companies are analyzing data from their devices, and large companies including Google, Apple, Facebook, Yahoo, and Twitter are looking for ways to exploit this data for their businesses.
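To put the 40% annual growth figure in perspective, here is a minimal Python sketch of the compounding arithmetic. The starting volume of 1 zettabyte and the function names are assumptions chosen for illustration, not figures from any study.

```python
import math


def project_volume(initial_zb: float, annual_growth: float, years: int) -> float:
    """Compound a starting volume by a fixed annual growth rate."""
    return initial_zb * (1.0 + annual_growth) ** years


def doubling_time(annual_growth: float) -> float:
    """Years needed for the volume to double at a fixed annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


if __name__ == "__main__":
    start_zb = 1.0  # hypothetical starting volume in zettabytes
    for year in range(6):
        print(year, round(project_volume(start_zb, 0.40, year), 2))
    print("doubling time in years:", round(doubling_time(0.40), 2))
```

At a steady 40% per year, the volume doubles roughly every two years and grows more than fivefold in five years.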
Let’s look at some of the suggested solutions for big data and data mining:
It is still unclear what an ideal architecture for handling historical and real-time data simultaneously looks like. However, one proposed solution under active review divides a big data system into three layers: the batch layer, the serving layer, and the speed layer.
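This three-layer split is commonly known as the Lambda Architecture. The following is a minimal in-memory sketch of how the layers could cooperate, assuming a simple event-counting workload. The class and method names are hypothetical, and a real deployment would use distributed storage and stream processing rather than Python lists and counters.

```python
from collections import Counter
from typing import Iterable, List


class BatchLayer:
    """Stores all historical events and recomputes a complete view from them."""

    def __init__(self) -> None:
        self.master_data: List[str] = []

    def append(self, events: Iterable[str]) -> None:
        self.master_data.extend(events)

    def compute_view(self) -> Counter:
        # Full recomputation over the entire dataset (slow but simple).
        return Counter(self.master_data)


class SpeedLayer:
    """Keeps an incremental view of events not yet covered by a batch run."""

    def __init__(self) -> None:
        self.realtime_view: Counter = Counter()

    def ingest(self, event: str) -> None:
        self.realtime_view[event] += 1

    def reset(self) -> None:
        # Called once a new batch view has absorbed these events.
        self.realtime_view.clear()


class ServingLayer:
    """Answers queries by merging the batch view with the real-time view."""

    def __init__(self) -> None:
        self.batch_view: Counter = Counter()

    def load(self, view: Counter) -> None:
        self.batch_view = view

    def query(self, key: str, speed: SpeedLayer) -> int:
        return self.batch_view[key] + speed.realtime_view[key]


if __name__ == "__main__":
    batch, speed, serving = BatchLayer(), SpeedLayer(), ServingLayer()
    batch.append(["home", "home", "cart"])  # historical events
    serving.load(batch.compute_view())      # publish the batch view
    speed.ingest("home")                    # a new event arrives in real time
    print(serving.query("home", speed))     # 2 from batch + 1 from speed = 3
```

The design choice illustrated here is that the batch layer stays simple by always recomputing from scratch, while the speed layer only has to cover the short gap since the last batch run.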
Time-Evolving Data:
Since data evolves every day, big data techniques must be able to adapt to changes and, where possible, detect them early.
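One simple way to picture change detection on evolving data is to compare a recent window of values against an older reference window. The sketch below is a deliberately basic stand-in for the more sophisticated methods used in practice. The class name, window size, and threshold are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean


class WindowDriftDetector:
    """Flags a change when the recent window's mean departs from the reference mean."""

    def __init__(self, window_size: int = 50, threshold: float = 1.0) -> None:
        self.window_size = window_size
        self.threshold = threshold
        self.reference: deque = deque(maxlen=window_size)
        self.recent: deque = deque(maxlen=window_size)

    def update(self, value: float) -> bool:
        # Fill the reference window first; nothing can be detected yet.
        if len(self.reference) < self.window_size:
            self.reference.append(value)
            return False

        # Slide the recent window forward (oldest value drops out when full).
        self.recent.append(value)
        if len(self.recent) < self.window_size:
            return False

        drifted = abs(mean(self.recent) - mean(self.reference)) > self.threshold
        if drifted:
            # Adapt: the recent data becomes the new reference.
            self.reference = deque(self.recent, maxlen=self.window_size)
            self.recent.clear()
        return drifted


if __name__ == "__main__":
    detector = WindowDriftDetector(window_size=5, threshold=1.5)
    stream = [0.0] * 10 + [5.0] * 10  # the level shifts at index 10
    flagged = [i for i, v in enumerate(stream) if detector.update(v)]
    print("change flagged at indices:", flagged)
```

After a change is flagged, the detector adopts the recent window as its new reference, which is one simple way for a technique to adapt rather than keep raising the same alarm.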
Hidden Big Data:
According to an IDC study on big data conducted in 2012, approximately 23% of the digital universe would be valuable for big data if it were properly tagged and analyzed. Today, however, only about 3% of the data is tagged, and even less is analyzed.
The practicality of big data mining can be gauged from the work of Global Pulse, a United Nations initiative that uses big data to improve life in developing countries. It was launched in 2009 and functions as an innovation lab that mines big data for developing countries.
Its strategy is to research innovative methods and techniques for analyzing real-time digital data to detect emerging vulnerabilities early, to assemble a free and open source technology toolkit for analyzing that data and sharing hypotheses, and to establish an integrated global network of Pulse Labs to pilot the approach at country level.
Market Acceptance and Replication Probabilities
Thomas H. Davenport and Jinho Kim (2013) write: “Big data for business intelligence is set to change every business and industry in the next ten years. We are going to see that those who adapt to the big data marvel early will have momentous competitive advantage.”