Post written by Wade W., BI Consultant at Ideaca. Read more about BI on his blog: Pragmatic Business Intelligence.
If nothing else, IT is all about buzzwords, and “Big Data” is one of the new arrivals to the party.
It is, however, a descriptive one. “Big Data” evokes images of enormous relational databases, providing analytical (or operational) reporting.
Big Data is not only about size, however. Rather, it refers to attributes of the data that together strain the ability of a business or system to respond to it. Those attributes can include any or all of size/volume (of data), speed (of generation), and the number and variety of systems or applications that simultaneously generate data. Big Data is also distinctive in how much it varies in structure. Elements of “structure” include the diversity of its generation (e.g. social media, video, images, manually entered text, and automatically generated data such as a weather forecast), information interconnectedness and interactivity.
I once heard a rule-of-thumb statistic that 80% of data in companies is unstructured or semi-structured. To clarify those terms, an unstructured data artifact would be a document, an email, or a video or audio clip. A semi-structured data artifact does not conform to the norms of structured data but contains markers or tags that enforce some kind of loose (or not so loose) structure. XML documents are an example of semi-structured data. Tagged documents in a Knowledge Management system also fit this definition.
Structured data is what we would find in any relational database: a Data Model has been defined and the data is physically arranged within this model into tables. The data in these tables is described with metadata, i.e. data types (such as “character”) and the maximum length of that data (number of bytes).
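To make the three categories concrete, here is a minimal sketch in Python using only the standard library. The table, the XML note and the email text are invented for illustration and do not come from any real system; note also that SQLite records the declared column type but does not itself enforce the maximum length.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: a data model is defined up front, and each column carries
# metadata (a data type and, here, a declared maximum length).
# The "customer" table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name VARCHAR(50))")
conn.execute("INSERT INTO customer (id, name) VALUES (?, ?)", (1, "Acme Ltd"))
structured_row = conn.execute("SELECT id, name FROM customer").fetchone()
# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
column_metadata = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]
conn.close()

# Semi-structured: no fixed schema, but tags and attributes impose a loose structure.
xml_doc = "<note priority='high'><to>Sales</to><body>Call the client back.</body></note>"
root = ET.fromstring(xml_doc)
semi_structured = {
    "tag": root.tag,
    "priority": root.get("priority"),
    "to": root.findtext("to"),
}

# Unstructured: free text with no markers at all; any structure has to be inferred.
email_body = "Hi team, the client called again about the late shipment. Can someone follow up?"

print(structured_row)
print(column_metadata)
print(semi_structured)
print(len(email_body.split()), "words of free text")
```

The structured row can be queried by column name and type, the XML note can be navigated by its tags, and the email body offers nothing to hold on to beyond the raw words.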
The methods of data creation are multiplying and the velocity of its creation is increasing. That, in itself, is a complicating factor. Some analysts (IDC, for example) predict that the Digital Universe, that is, the world’s data, will increase by 50x by 2020. Over the same period there will be a growing shortage of storage, which will drive investment in the cloud as both individuals and corporations look for scalable, ubiquitously accessible, lower-cost and environmentally friendly data storage options. The same study also predicts that unstructured data, especially video, will account for 90% of all that data.
There is also an important historical dimension to Big Data. For decades, companies have been hoarding structured, semi-structured and unstructured data in the hope of one day being able to extract value from it.
A large percentage of all this data will come with a wrapper of automatically generated metadata, that is (as indicated above), data that describes that data. A practical example is the holiday photo you take with your mobile phone: those GPS-enabled, media-rich, socially linked devices we all carry transparently capture GPS coordinates, time, weather conditions and a plethora of other data elements along with the image. IDC predicts that such metadata is growing twice as fast as data.
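As a rough illustration of data arriving wrapped in metadata, the sketch below models a phone photo in Python. The `PhotoArtifact` class, its field names and all of the values are hypothetical, chosen only to mirror the elements mentioned above; real devices use their own formats.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PhotoArtifact:
    """Hypothetical data artifact: raw content plus its metadata wrapper."""
    payload: bytes                                 # the unstructured data itself
    metadata: dict = field(default_factory=dict)   # data that describes the data


photo = PhotoArtifact(
    payload=b"\xff\xd8",  # stand-in bytes, not a real image
    metadata={
        "latitude": 51.0447,      # made-up GPS coordinates
        "longitude": -114.0719,
        "captured_at": datetime(2012, 7, 1, 14, 30, tzinfo=timezone.utc),
        "weather": "sunny",
    },
)

# The metadata can be filtered and analyzed without ever decoding the image.
taken_in_july = photo.metadata["captured_at"].month == 7
print(taken_in_july, sorted(photo.metadata))
```

The point of the sketch is that the wrapper is structured even when the payload is not, which is part of what makes such data tractable to analyze.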
It is clear from the last three paragraphs that Big Data describes explosive growth in data and metadata and an equally explosive opportunity to capture, tame and corral that data to extract value from it.
So the case has been made that we have a lot of data today and we will have far more tomorrow, but should your organization be investing in Big Data today?
In a sense, you probably already are. Enterprise Business Intelligence (EBI) environments lay a solid foundation for the next phase of Big Data. EBI is an earlier iteration of Big Data and, married to tools such as Hadoop and NoSQL databases, enables a natural evolutionary growth curve toward mastery of your information ecosystem.
Big Data requires a new way of thinking, new tools, clustered commodity hardware and, probably, substantial investment. It comes down to your business, and whether there is a clear value-based case for presenting that data to your company’s brainpower. The actual needs will differ radically depending on your industry. Oil and Gas may be interested in leveraging real-time alerts on wellhead data or analyzing petabyte-scale seismic datasets. Packaged Goods multinationals may be interested in monitoring and engaging advocates, detractors and influencers across multiple Social Media platforms, mining and understanding sentiment, and identifying problem areas in real time in order to spot opportunities or avert potential brand-damaging events. Financial institutions may be interested in monitoring international money traffic to identify fraud or illegal activity. Government entities may mine extremist forums or other unstructured data traffic to identify national threats.
Big Data can serve these needs in real time, enabling rapid (or even automated) response to flagged events. Whether it is a fit for your organization today should be determined by viewing your industry and business through a critical lens: an appraisal of your current Information Intelligence maturity, a strategic assessment of the data and information assets currently owned by or available to your organization, and a prioritization of potential initiatives. How much data you harness and convert into information should be a key outcome of this exercise. The opportunities are legion, but initiatives should have clear objectives and success metrics understood prior to project kickoff.
Whether it is today or tomorrow, Big Data is becoming mainstream through necessity. Whether that is a road your organization wants or needs to drive today is something all medium and large organizations should be considering now.
What are your thoughts on Big Data? Is your organization currently considering Big Data as a strategic initiative or Proof of Concept?