I have noticed a pattern in my recent recommendations to clients: they increasingly point toward a more real-time information architecture. This is not a “push” recommendation. It’s a pull. End clients are becoming much more adept at dealing with real-time information and are demanding more of it. After years of learning what to do with clean information once it is accessed, companies are being forced by competitive pressure to access that information earlier in its cycle.
For some enterprise data warehouse (EDW) oriented shops, the EDW is the first consideration in a real-time strategy. That is, they selectively move some of the ETL from batch to real-time. This is certainly a viable strategy, especially when data access investment has occurred predominantly in the data warehouse environment, real-time feeds are feasible, and data access in the operational environment (with EII, for example) is simply not a better option.
To even consider the real-time EDW option for real-time information needs, there are key requirements that must be met. Both sides of the equation (the operational environment and the data warehouse) must be co-conspirators in the success of real-time sourcing. The burden falls mainly on the operational environments, which are often old, fragile, expensive to maintain and short on in-house expertise. Ironically, these are often the reasons data is extracted from these systems to begin with, yet they can also make real-time sourcing prohibitive. By the same token, I have also experienced real-time attempts where an underpowered data warehouse could not manage both loading and high query activity simultaneously. Sometimes partitioning strategies alleviate this concern, but not always.
There is no standard way of achieving a real-time data warehouse. Some shops reschedule their “batch” ETL to run more frequently. This is feasible when the ETL was designed well in the first place to “pick up where it left off” with the operational extracts, regardless of wall clock time. Others use EAI and EII technologies to perform the extracts within the operational infrastructure.