
Trends in SAP Business Warehousing

Originally published May 7, 2009


SAP NetWeaver Business Intelligence – now rebranded as SAP NetWeaver Business Warehouse (SAP BW) – is a logical choice for those enterprises operating the underlying SAP enterprise resource planning (ERP) modules. SAP BW has a sophisticated architecture, developed for large enterprise clients over the past ten years. SAP’s Layered Scalable Architecture (LSA) includes technology for data acquisition, propagation, quality, master data (“corporate memory”) and data archiving. Just as a historical footnote, the acquisitions of Acta Works for ETL and Firstlogic for data quality (both made by Business Objects before it became part of SAP) have been fully digested and enhanced. An LSA methodology is provided along with the architecture. Given the existence of diverse and heterogeneous data in the enterprise, design and development work is sometimes useful to accommodate specific business requirements.

If an enterprise were a startup, developing its systems from scratch for the first time, then the LSA would be a good approach to enterprise architecture for the entire information technology function. That is rarely the case. The issue is how LSA merges with the existing architecture to form, so to speak, a meta-architecture that encompasses SAP BW along with existing business intelligence applications, data marts and even an enterprise data warehouse (EDW). Yes, some of those systems can be stuffed back down behind the acquisition layer, but that is hard to do with an EDW, especially if the latter has grown up with the firm and provides competitive advantage, custom services and distinct business value.

In short, what about those firms with highly heterogeneous data, only a fraction of which is in SAP? What about global firms with dozens and dozens of SAP instances? What about the data warehouse appliance opportunities? How these three trends play out in an SAP context is the subject of this article.

The situation of those enterprises that operate an enterprise data warehouse (EDW) in addition to SAP BW is a source of ongoing debate. Where is the famous “single source of truth” that is the reason for building a data warehouse in the first place? The issue with a peer-to-peer architecture is precisely that the single version of the truth keeps shifting back and forth. In the long run, that is not workable for most enterprises.

The short version of SAP’s tactical answer to heterogeneous data is “SAP OpenHub.” This is the function that maps and enables extraction of a given SAP BW InfoCube to the underlying SAP data sources that may include dozens and dozens of individual tables. However, the longer version is a piece of design work and custom architecture for those enterprises that decide to operate a separate EDW alongside SAP BW. For those SAP clients that, for whatever reason, choose not to make LSA the total, overall architecture of the enterprise, the requirement arises for a meta-architecture. Here “architecture” does not refer to SAP’s LSA. Rather, the question that some large SAP customers have been constrained to consider is how to integrate LSA with an existing EDW and custom BI framework.

One suggestion is to choose a master-servant approach as the meta-architecture. Pick one – whether SAP BW or the existing EDW – as the master and the other as the servant (“slave”). Pick one and plan on sticking to it! Remember, the goal here is to end up with a “single version of the truth,” rather than out-of-sync, dysfunctional and distinct data stores. The master is the proverbial “single version of the truth,” of course. Decision criteria for picking a master versus a servant include how much data is located now (and in the future) in the EDW and SAP BW, respectively; the existing interfaces between the data warehouse (in either form) and the targeted business users; the organization’s tolerance for complexity, latency and customization; and the ability to support these over the long term.
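One way to make the master-versus-servant decision explicit is a simple weighted scoring of the criteria just listed. The sketch below is only an illustration: the criteria names paraphrase the article, while every weight and rating is an invented placeholder that a real enterprise would replace with its own assessment.

```python
# Hypothetical weighted scoring of master candidates. The criteria follow the
# article; every weight and rating below is an invented placeholder.
CRITERIA_WEIGHTS = {
    "share_of_data": 0.4,         # how much data lives there now and in the future
    "user_interfaces": 0.3,       # existing interfaces to the business users
    "complexity_tolerance": 0.2,  # tolerance for complexity, latency, customization
    "long_term_support": 0.1,     # ability to support the choice over the long term
}


def score(ratings):
    """Return the weighted sum of 0-10 ratings for one candidate."""
    return sum(CRITERIA_WEIGHTS[name] * ratings[name] for name in CRITERIA_WEIGHTS)


def pick_master(candidates):
    """Return the name of the candidate with the highest weighted score."""
    return max(candidates, key=lambda name: score(candidates[name]))


# Placeholder ratings for the two candidates; not an assessment of either product.
candidates = {
    "SAP BW": {"share_of_data": 4, "user_interfaces": 6,
               "complexity_tolerance": 7, "long_term_support": 8},
    "EDW": {"share_of_data": 8, "user_interfaces": 7,
            "complexity_tolerance": 5, "long_term_support": 6},
}

master = pick_master(candidates)
servant = next(name for name in candidates if name != master)
print(f"master={master}, servant={servant}")
```

The value of such an exercise lies less in the number than in forcing the criteria and their relative importance into the open before the choice is made and, as recommended, stuck to.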

As usual, trade-offs will need to be assessed and managed. If a replication approach is chosen, then the master will drive the servant by means of synchronization technologies such as message brokering, batch updates or workflow through OpenHub. If a federated approach is chosen, the master will drive the servant by means of SAP Universal Data Connection (UDC) for remote SAP BW InfoCubes. UDC enables federated connectivity between SAP BW and a diversity of J2EE-compatible databases such as DB2, Oracle and Microsoft SQL Server. Strictly speaking, Teradata is not a native J2EE database, but it was certified by SAP with UDC in June 2005 and has high-profile success stories of operating remote InfoCubes using a prepackaged integration approach.
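The replication variant, in which the master drives the servant through batch updates, can be sketched in a few lines. This is a minimal illustration and not SAP code: two in-memory SQLite databases stand in for the master and servant stores, and the `sales` table, its `version` column and the `batch_sync` helper are all hypothetical names invented for the example. It does not use OpenHub, UDC or any SAP interface.

```python
import sqlite3


def make_store():
    """Create an in-memory store with one fact table (stand-in for a warehouse)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (order_id INTEGER PRIMARY KEY, amount REAL, version INTEGER)"
    )
    return conn


def batch_sync(master, servant, last_version):
    """Copy rows changed on the master since last_version into the servant.

    Data flows one way only: the servant is never read as a source.
    Returns the highest version seen so the next batch can resume from it.
    """
    rows = master.execute(
        "SELECT order_id, amount, version FROM sales WHERE version > ? ORDER BY version",
        (last_version,),
    ).fetchall()
    servant.executemany(
        "INSERT OR REPLACE INTO sales (order_id, amount, version) VALUES (?, ?, ?)",
        rows,
    )
    servant.commit()
    return max((row[2] for row in rows), default=last_version)


master, servant = make_store(), make_store()
master.executemany("INSERT INTO sales VALUES (?, ?, ?)", [(1, 100.0, 1), (2, 250.0, 2)])
checkpoint = batch_sync(master, servant, last_version=0)

# A later correction on the master overwrites the servant's copy in the next batch.
master.execute("UPDATE sales SET amount = 120.0, version = 3 WHERE order_id = 1")
checkpoint = batch_sync(master, servant, checkpoint)
print(servant.execute("SELECT order_id, amount FROM sales ORDER BY order_id").fetchall())
```

Because changes only ever travel from master to servant, the servant can lag but cannot disagree with the master once a batch has run, which is the point of picking one store as the single version of the truth.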

As the debate raged over whether SAP “really gets heterogeneous data,” SAP proposed a strategic answer by acquiring Business Objects (BObj). BObj “gets” heterogeneous data, and, in many ways, has been the tail wagging the dog at Walldorf. This is especially so in a dynamic (and difficult) economy, in which selling front-end and data integration technologies is relatively easier than selling complex ERP and BI installations. Expect additional leadership from the BObj franchise going forward.

In the meantime, reports from the field indicate the integration of BObj with the underlying InfoCube technology has advanced dramatically since the acquisition in 2007, but at least one large client known to this analyst still has issues. In one instance, in an example of a perfect storm of bad timing, the enterprise found Business Explorer (BEx), SAP’s Excel plug-in, to be clumsy and lacking in “curb appeal.” This enterprise then chose Cognos because it worked with the underlying BW InfoCubes whereas BObj did not. Immediately afterward, SAP announced the acquisition of BObj. Fast forward a couple of years and the experience of marching and counter-marching has been frustrating. Tools such as BObj, Crystal Reports (also a part of BObj), and Cognos seem like good alternatives – but only if they can connect and interact smoothly. SAP will want to be proactive in making the road map for integration better known and in undertaking the necessary remedial action to give customers access to their data with the ease, options and sophistication that enterprise information access requires. See Figure 1.

 
[Figure 1: image not available]

Multiple instances of SAP R/3 do not represent heterogeneous data as such. But as a general rule, system costs are driven by the number of system interfaces. The more interfaces to operate and maintain, the greater the cost. The result can be functionally similar to heterogeneous data in the sense that coordination costs at the system interfaces produce diseconomies of scale. Therefore, global enterprises – some of which report eighty or more instances of SAP – are on a trajectory to consolidate down to one large image or, in some cases, one per continent. Obviously this is a years-long effort that spans economic cycles, in good times and in bad. This represents a kind of annuity that generates revenue for SAP in terms of software upgrades and architecture enhancements even if SAP consultants are not performing all the work themselves.
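A rough piece of arithmetic shows why the interface count matters. The sketch below assumes the worst case, in which every instance may need an interface to every other instance; that assumption, and the figure of six regional images standing in for "one per continent," are illustrative and do not come from any client's actual landscape.

```python
def max_point_to_point_interfaces(instances):
    """Upper bound on pairwise interfaces among n systems: n * (n - 1) / 2."""
    return instances * (instances - 1) // 2


# 80 instances, an assumed 6 regional images, and a single global image.
for n in (80, 6, 1):
    print(n, max_point_to_point_interfaces(n))
```

Under that assumption the bound grows roughly with the square of the number of instances, so consolidation shrinks the number of potential interfaces far faster than it shrinks the number of systems.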

This is a good place at which to make an observation on architecture and branding. SAP NetWeaver started out as a message broker enabling near-real-time updates between the underlying SAP R/3 system and the SAP BW InfoCubes. However, it has gradually morphed into an all-encompassing brand including developer services for service-oriented architecture (SOA), ETL-like capabilities, data quality and metadata, as well as the initial messaging capabilities. Although completely different from IBM’s WebSphere brand, it functions similarly in that it has become a “catch all” for architectural features that are valuable and that may usefully be integrated some day, but for the time being do not fit anywhere else.

This leads to the third and final trend in the market(s) in which SAP is operating – the data warehousing appliance. Once again the short answer is – SAP has one of those too! It is called the SAP NetWeaver Business Intelligence Accelerator (BIA). In SAP’s case, the hardware comes from premium partners: optionally HP, Fujitsu Siemens, IBM or Sun blades with vast capacity for caching InfoCubes that require a high-performance approach or that, for whatever reason, are under-performing in their native SAP BW context. And therein lies a crucial distinction between the BIA and data warehouse appliances as previously defined in the market. The BIA is an addition to the existing InfoCubes, not a replacement for them. Oracle has followed a similar innovative definition of an appliance with its Exadata solution. Can IBM be far behind?

The unspoken criticism here, which will have occurred to many DBAs, is that if the performance of the original application and the standard relational database on which it is implemented were adequate in the first place, then a substantial “hardware assist” at the back end should not be needed. Duly noted. One proper answer is that such back-end appliances provide investment protection. Whether or not a given data warehousing investment is worth protecting is a tough decision that the client’s executive management is empowered to make. Throwing hardware at a problem was a tried and true approach to performance long before the invention of the data warehousing appliance. An even more innovative spin on the BIA is an offering called SAP Pole Star. If a firm has a BObj front end that requires a billion-row scan, Pole Star provides its own special-purpose appliance accelerator that caches the billion rows in memory, applies the BObj Inxight (text mining) technology to formulating queries in plain English and handles heterogeneous back-end data (from Oracle, Hyperion, PowerPlay cubes) into the bargain. (And, yes, NetWeaver is required.) Cognos users, please note well that while SAP provides assurances about ongoing and continuing support, Pole Star will work only with a BObj front end, not Cognos.

The recommendations: If you are a large user of SAP ERP who is consolidating onto fewer instances, stay the course. SAP’s Layered Scalable Architecture has been developed and hardened in the molten cauldron of real-world client experience over the past ten years. However, for those enterprises with business intelligence and data warehousing investments that have grown up alongside SAP ERP, it is useful and indeed essential to consider meta-architecture. How do all the pieces fit together in terms of the big picture? In a context of multiple data sources and diverse updates, meta-architecture says which database is the driver (master) and which is the driven (servant). After consideration, pick one and stick to it. If size and scalability become an issue as SAP BW instances are consolidated, consider the BI Accelerator as a way of meeting service level agreements (SLAs). If you are a heavy user of Cognos with SAP, understand that SAP “gets” heterogeneous data to the extent that it can be expected to continue to support Cognos for the foreseeable future. However, the most innovative enhancements will occur with and for the benefit of Business Objects, as exemplified by the Pole Star initiative. If you are a relational database partner of SAP, understand that you have never looked more like a “bit bucket.” While the relational database is expected to be the dominant design for managing structured commercial business data for the foreseeable future, such innovations as appliances, back-end accelerators and column-oriented databases of diverse kinds are gnawing on the periphery of applications in interesting and unexpected ways. That they will chew their way to the center is becoming a distinct possibility, but by no means a certainty.


  • Lou Agosta
    Lou Agosta is an independent industry analyst, specializing in data warehousing, data mining and data quality. A former industry analyst at Giga Information Group, Agosta has published extensively on industry trends in data warehousing, business and information technology. He is currently focusing on the challenge of transforming America’s healthcare system using information technology (HIT). He can be reached at LAgosta@acm.org.



 

Comments


Posted May 7, 2009 by Richard Hackathorn

Lou - Great concise assessment of SAP architecture. Way to go! It is tough to parse the usual SAP literature to surface the important stuff.

You mentioned Teradata only once, as not being a native J2EE database. More! In particular, I would like to see an assessment for SAP customers of how Teradata can fit into their enterprise architecture. What does this new partnership bring to the table for them? Is it anything deeper than what Oracle, IBM and Microsoft are already doing?


 

Copyright 2004 — 2019. Powell Media, LLC. All rights reserved.
BeyeNETWORK™ is a trademark of Powell Media, LLC