Monday, April 28, 2014

Data Fracking

In 2006, Clive Humby drew the analogy between crude oil and data in a blog piece titled "Data is the new Oil" which since then has captured imagination of several commentators on big data. No one doubts the value of the 'resources' that varies in the effort required to extract. During a discussion with a billion dollar company CIO, he indicated that there is a lot of data but can you make it "analyzeable."

Perhaps, he was inferring to the challenges of dealing with unstructured data in a company's communication and information systems, besides the structured data silos that are also teeming with data. In our work with a few analytics companies, we found validation of this premise. Data in log files, PDFs, Images, etc. is one part of it. There is also the deep web, that part of data not readily accessible by googling, etc. or as this Howstuffworks article refers to it as 'hidden from plain site.'

Bizosys's HSearch is a Hadoop based search and analytics engine that has been adapted to deal with this challenge faced by data analysts, referred to commonly as Data Preparation or Data Harvesting. If indeed finding value in data poses these challlenges, then Clive's analogy to crude oil is valid. Take a look at our take on this. If today, Shale gas extraction represents the next frontier in oil extraction employing a process known as Hydraulic Fracturing, or Fracking, then our take on that is 'data fracking' as a process of making data accessible.