Friday, December 21, 2012

The business impact of Big Data

As a company that was working with big data before the term became as common as it is today, we constantly have conversations about problems that call for a big data solution. Naturally, a lot of the talk revolves around Hadoop, NoSQL, and other such technologies.

But what we notice is a pattern in how this is impacting business. One of our clients caters to researchers working with petabytes of data, and we helped them implement HSearch, our real-time big data search engine for Hadoop. Before this, the norm was to wait up to 3 days at times to receive a report for a query spanning petabytes of distributed information characterized by huge volume and a lot of variety. Today, with a big data solution in place, the norm is sub-second response times.

Similarly, in a conversation with a telecom industry veteran, we were told that the health of a telecom business has always depended on its networks. These are monitored across a large number of transmission towers, which together generate over 1 terabyte of data each day as machine logs, sensor data, and so on. The norm here was to receive health reports compiled weekly. Now, some players are no longer satisfied with that and want these reports daily, possibly hourly, or even in real time.

Not stopping at reporting as it happens, or in near real time, the next question business is asking is: if you can tell us this fast, can you predict what will happen? This matters especially in the world of IT systems monitoring and machine-generated data. We will leave prediction based on human-generated data (read: social feed analysis) out of the story for the moment. Predictive analysis could mean predicting that a large polo neck in a certain shade of purple is soon going to run out of stock in a certain market region, given other events. Or, more feasibly, it could mean predicting that a machine serving a rising number of site visitors is likely to go down soon because its current sensor data matches a historical pattern, and therefore alerting the administrator or, better still, bringing up a new node on demand and keeping it warm and ready.
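To make the machine-health scenario a little more concrete, here is a minimal sketch in Python. It is only an illustration under simple assumptions: the function names, the 90% load limit, and the five-sample horizon are all hypothetical, and a straight-line trend stands in for whatever historical pattern matching a real system would use.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of the readings against their sample index."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def should_scale_out(cpu_load, limit=90.0, horizon=5):
    """Return True if a rising linear trend projects the load to reach
    `limit` within `horizon` further samples.

    `limit` and `horizon` are hypothetical tuning values, not figures
    taken from any real deployment.
    """
    if len(cpu_load) < 2:
        return False  # not enough history to fit a trend
    trend = slope(cpu_load)
    projected = cpu_load[-1] + trend * horizon
    return trend > 0 and projected >= limit


# Hypothetical CPU load readings (percent) from one machine.
readings = [52.0, 58.0, 63.0, 71.0, 78.0]
if should_scale_out(readings):
    print("alert: bring up a warm standby node")
```

The point of the sketch is the shape of the decision, not the model: a reading that is still below the limit can trigger action today because the trend says where it is heading.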

So it seems the value of big data lies in its freshness and actionability and, at the most basic level, in simply getting the analysis or insight out many times faster!

V Power and big data

Even a decade ago, the mention of the letter V made one think of powerful diesel engines. Today, some say data is the new oil, and big data is best understood through the V words: Volume, Velocity, Variety. According to this IBM page, these are not enough. Apparently, businesses still don't trust the output of any appliance or architecture that deals with these three Vs. So they introduce a new V word: Veracity. With Veracity, you can frame what you want from big data and, more importantly, decide to believe it. Fair enough! Now Evan Quinn of ESG Global, an IT research and advisory company, has added 2 new Vs to the string: Visualization and Value. Now that is a V6 engine!
Image courtesy: Educational Technology Clearinghouse