Data Anarchy – A real threat to Self-Service BI (Part 2)


The possibility of Data Anarchy is real. It can creep up on you slowly and easily overwhelm an IT department. While getting out of that mess is a good idea, it is far better to avoid getting into it in the first place. That, of course, presumes that we can recognize the early signs. So, how and why does data get out of control?

Industry dynamics are contributing to data craziness – are you surprised?

Companies are becoming more BI and analytics savvy and are collecting more data because it is cheap to store. They are turning to their day-to-day business data to glean insights that will help them stay competitive, that is, to better understand their own business in terms of product performance, customer behavior, demographics, etc. For example, in an effort to improve how it does business, an Austin-based hotel scheduling company is collecting large volumes of web click data daily so that it can start performing historical trend analyses and plan its future ad campaigns.

As hardware costs continue to spiral down, commoditized storage continues to spark data hoarding. Companies are realizing that it is very economical to store and retain data over a longer period of time. Today’s data retention solutions not only store multiple varieties of data (structured, semi-structured and unstructured) efficiently but also provide front-end tools to mine that data in the future. For example, the IT manager at a large biomedical testing lab recently decided to start storing multiple TBs of semi-structured data logged by 7,000 sensors worldwide. Previously, that data was flushed away on a daily basis.

Another phenomenon driving the explosion of data is the use of social media. Businesses are already looking at ways to build sentiment analysis applications to analyze social conversations and, in the process, are starting to capture social content on a regular basis.

Business Intelligence (BI) tool evolution – shiny new tools are tempting

BI tools have come a long way. Traditional BI tools were extremely good at tracking raw transactional numbers like sales figures and profit margins but failed to adequately address the root causes, or drivers, of trends in those numbers. Moreover, they were typically able to tell what happened (backward reporting) – but not explain why (unless it was evident in some other numeric data) let alone alert the business as a change emerges. The tools were complicated to deploy and operate. Users wanted self-service BI.

Over time, BI tools have evolved to support features like auto-modeling techniques, rich visualizations, metrics and auto-calculations on the fly, as well as “What if” analysis. Tools now boast new in-memory technologies that let users load data sets into memory and crank out insights quickly, thus enabling self-service BI.

End user evolution – we change, we demand more, we want it faster

The user dynamics are changing from IT-controlled reporting to end-user-driven, self-service analytics. (In this time of the i-everything, BI users demand iBI: the easy, cheap and fast magic answer box.)

Traditionally, IT managers were responsible for adopting the right reporting tools and giving end users access to consume the reports. Typically, 80% of the people in an organization were consumers of data [1], while the remaining 20% were actually creators of ad-hoc reports and custom dashboards. That model worked for a while, but the balance of information consumers and information creators has shifted significantly. The effects of this shift manifest themselves differently for enterprises and SMBs.

Most SMB customers fall into the category of casual data access, using simple tools like Excel for their day-to-day analyses, and are in dire need of self-service BI tools to help them move to the next level of analytics maturity. Typical SMB customers have limited IT resources and tight budgets, which drives them toward self-service tools that are easy to use and fast to deploy.

Departmental IT groups within traditional enterprises are responsible for disrupting the BI ecosystem already put in place by corporate IT. The complexity and inertia of the current BI situation for end users have led to an increasing need for self-service BI tools. Users simply demand the democratization of BI tools to gain quick and meaningful insights.

Changing IT demands – they want to help us. Really!

Democratization of BI is a thorn in the side of IT. Per the IDC Digital Universe study (2011), the amount of data being stored is more than doubling every two years and could grow by 50X by 2020, while IT staff is estimated to grow by only 1.5X! This shocking statistic in itself should be a cause for concern for today’s IT managers. In addition to designing the next generation of data architectures, IT managers will also need to make sure that they can disseminate this information to business users in an easily digestible manner.
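To put those two numbers side by side, here is a quick back-of-the-envelope sketch. It uses only the 50X and 1.5X figures quoted above; the variable names are illustrative, and the assumption that both figures cover the same period is mine.

```python
# Figures quoted above from the IDC Digital Universe study (2011).
data_growth = 50.0   # projected growth factor for stored data by 2020
staff_growth = 1.5   # projected growth factor for IT staff

# Assumption: both factors cover the same period, so their ratio shows
# how much more data each IT staff member ends up responsible for.
data_per_staff = data_growth / staff_growth

print(f"Data per IT staff member grows roughly {data_per_staff:.0f}x")
```

In other words, if the projections hold, each IT staff member would be looking after roughly 33 times as much data as before.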

IT is still challenged with maintaining a “single version of truth” while supporting day-to-day BI needs. Today, most IT departments within traditional enterprises have already started defining a master data framework for maintaining an authoritative, reliable, sustainable, accurate, and secure data environment that represents a “single and holistic version of the truth”. IT managers recognize the following components as the critical pieces in architecting a robust Master Data Management (MDM) framework: Customer Information File (CIF), Product Masters (BOM), Extract, Transform, and Load (ETL) architectures, Enterprise Data Warehouse (EDW), Operational Data Store (ODS), Data Quality (DQ) technologies and Enterprise Information aggregators. What is missing from this framework is any acknowledgment of the new, evolving self-service, in-memory BI data stores.

Next time, let’s see what we can do about this… Stay tuned!


[1] Wayne Eckerson, “The Myth of Self-service BI,” TDWI What Works Enterprise Business Intelligence, v24.