For Business Intelligence (BI) projects to be successful, they must meet user requirements, be designed with strong user involvement, include a manageable scope, provide fast query response times, deliver the required reporting and analytic capabilities, and provide value. Also, as nearly all experienced BI practitioners will attest, for BI programs to be successful, they must be built iteratively. In the trenches, that means sourcing, integrating, and developing front-end capabilities that target a subset of the enterprise data in each project. Sometimes the scope of an iteration is limited to a specific organizational process, such as invoicing, and other times by system, such as order-to-cash. Either way, each subsequent project adds to the scope and volume of data that is integrated and available for analysis. For initial BI projects and subsequent iterations to be successful, the data architecture must be scalable, flexible, understandable and maintainable. However, most organizations put more thought and effort into their selection of BI tools than into their data architecture. The end result is that they implement a data architecture that crumbles under the growing requirements of their users and data.
What is a BI Data Architecture?
A data architecture, as defined by Wikipedia, is the "design of data for use in defining the target state and the subsequent planning needed to achieve the target state". For BI, this means designing enterprise data so that the business can efficiently transform data into information. The two key words in this definition are "facilitate" and "efficiently". The vast majority of BI projects are initiated because the business cannot efficiently access and consume enterprise data. Users must navigate application databases that were never designed for them, manually integrate data from disparate systems, and/or fight to find and gain access to the data they require. If BI is to be successful, it must deliver at the most basic level: the data. If it takes three weeks to add five attributes to the data warehouse or twenty minutes to return a simple query, users will go it alone.
To be more specific, the BI data architecture includes:
- Data store architecture (database (RDBMS), unstructured data store (NoSQL), in-memory, OLAP)
- The data warehouse process design (e.g. staging, ODS, base, presentation layers)
- The data warehouse data model (for each process layer)
- Metadata data model
- MDM process design
- MDM data model
- BI tool semantic layer models
- OLAP cube models
- Data virtualization models
- Data archiving strategy
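To make the "process design" item in the list above a little more concrete, here is a minimal sketch of data moving through staging, base and presentation layers. It uses Python's built-in sqlite3 module purely for illustration; the table names (stg_invoice, base_invoice, pres_customer_sales), the columns and the cleansing rules are all assumptions of mine, not a recommended design.

```python
import sqlite3

def build_layers(conn, raw_invoices):
    """Load raw rows into staging, cleanse into base, aggregate into presentation.

    All table and column names here are hypothetical.
    """
    cur = conn.cursor()
    cur.executescript("""
        -- Staging: land the source data as-is, everything as text.
        CREATE TABLE stg_invoice (invoice_id TEXT, customer TEXT, amount TEXT);
        -- Base: typed, cleansed, integrated data.
        CREATE TABLE base_invoice (
            invoice_id INTEGER PRIMARY KEY,
            customer   TEXT NOT NULL,
            amount     REAL NOT NULL
        );
        -- Presentation: shaped for the questions users actually ask.
        CREATE TABLE pres_customer_sales (
            customer      TEXT PRIMARY KEY,
            invoice_count INTEGER,
            total_amount  REAL
        );
    """)
    cur.executemany("INSERT INTO stg_invoice VALUES (?, ?, ?)", raw_invoices)

    # Staging -> base: cast types, standardize names, drop unusable rows.
    cur.execute("""
        INSERT INTO base_invoice (invoice_id, customer, amount)
        SELECT CAST(invoice_id AS INTEGER), UPPER(TRIM(customer)), CAST(amount AS REAL)
        FROM stg_invoice
        WHERE TRIM(customer) <> '' AND amount IS NOT NULL
    """)

    # Base -> presentation: pre-aggregate per customer.
    cur.execute("""
        INSERT INTO pres_customer_sales (customer, invoice_count, total_amount)
        SELECT customer, COUNT(*), SUM(amount)
        FROM base_invoice
        GROUP BY customer
    """)
    conn.commit()

raw = [
    ("1", " acme ", "100.50"),
    ("2", "Acme", "50"),
    ("3", "", "10"),          # no customer name: filtered out before base
    ("4", "Globex", "75.25"),
]
conn = sqlite3.connect(":memory:")
build_layers(conn, raw)
rows = conn.execute(
    "SELECT customer, invoice_count, total_amount "
    "FROM pres_customer_sales ORDER BY customer"
).fetchall()
print(rows)
```

The point of the separation is that each layer has one job, so a change to a cleansing rule or a new report touches one layer rather than the whole pipeline.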
What are the Signs of a Poor BI Data Architecture?
Sometimes the signs of a poor BI data architecture are obvious before the first project is rolled out; other times they only appear further downstream. The signs include, but are not limited to:
- Inability to meet DW load windows (batch, mini-batch, streaming, aggregates, etc.)
- Persistent slow query response
- Non-integrated data (see: http://www.b-eye-network.com/blogs/dine/archives/2010/03/are_you_deliver.php)
- Takes too long to add new attributes to the data architecture
- Users don't understand presentation layers (DW, semantic layers, etc.)
- Poor data quality
- Doesn't meet compliance standards
- Inability to adapt to business rule changes
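A few of the signs above are measurable, so they can be tracked rather than argued about. The sketch below is one hypothetical way to flag three of them from basic operational numbers. The HealthSnapshot fields, the function name and the default thresholds (30 seconds, 5 days) are my own assumptions for illustration; pick values that fit your environment.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List

@dataclass
class HealthSnapshot:
    """Hypothetical operational measurements for one review period."""
    load_duration: timedelta        # how long the DW load actually took
    load_window: timedelta          # how long it is allowed to take
    query_seconds: List[float]      # response times of representative queries
    days_to_add_attribute: float    # typical lead time for a new attribute

def warning_signs(snapshot, slow_query_seconds=30.0, max_days_per_attribute=5.0):
    """Return the measurable warning signs present in the snapshot.

    The thresholds are illustrative assumptions, not industry standards.
    """
    signs = []
    if snapshot.load_duration > snapshot.load_window:
        signs.append("load window missed")
    if snapshot.query_seconds:
        ordered = sorted(snapshot.query_seconds)
        # Upper median, so a single outlier does not count as "persistent".
        median = ordered[len(ordered) // 2]
        if median > slow_query_seconds:
            signs.append("persistent slow query response")
    if snapshot.days_to_add_attribute > max_days_per_attribute:
        signs.append("too slow to add new attributes")
    return signs

snapshot = HealthSnapshot(
    load_duration=timedelta(hours=7),
    load_window=timedelta(hours=6),
    query_seconds=[5.0, 45.0, 1200.0],
    days_to_add_attribute=21,
)
print(warning_signs(snapshot))
```

Signs such as poor data quality or users not understanding the presentation layers need a different kind of assessment and are deliberately left out of this sketch.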
Why is Designing a Solid BI Data Architecture so Challenging?
The main challenge with data architectures is that there is no one right architecture for every organization. There are differences in data volumes, data latency requirements, number of source systems, types of data, network capacity, number of users, analytical needs, and many other variables. We must also design not only for the requirements of the first project, but also for future iterations. This requires highly experienced data architects who have a background designing and modeling for BI, are flexible in their design approach, can communicate effectively with the business, and have strong business acumen and a technical understanding of data and databases. These resources are usually expensive and can be difficult to find, especially for more specialized verticals.
It also takes time to design and model a scalable, flexible, understandable and maintainable data architecture, and that time must be allocated in the project plan, whether waterfall or agile. The business must be involved to communicate requirements, data definitions and business rules, especially during the design of the logical data model. Many data architects skip the logical modeling process and proceed directly to creating a physical model based on reverse engineering the source systems and/or their own understanding of the systems. Other times, companies look to pre-packaged solutions (see The Allure of the Silver Bullet) but later find that their architecture is both non-scalable and inflexible.
What's the Solution?
This issue reminds me of a home improvement program that was aired on TV a few months ago. The people wanted to add a second story on their house but were concerned since they had noticed cracks in their walls and uneven floors. After inspecting the foundation, they discovered that it wasn't designed correctly and would crumble under the additional load. Their only choice was to redesign and rebuild their foundation. I find that many organizations discover the same issues with their data architectures. The data architecture is the foundation of the BI program, and very few succeed with a poor one. At some point it will crumble under the load. For new programs, the solution is to understand the challenge and ensure that you're putting the right amount of focus and proper resource(s) on the task. For existing programs, look for the signs mentioned above and assess your existing architecture. You may still have time to recover and avoid a failed BI initiative.
Posted September 18, 2011 4:37 PM