Blog: Wayne Eckerson
Welcome to Wayne's World, my blog that illuminates the latest thinking about how to deliver insights from business data and celebrates out-of-the-box thinkers and doers in the business intelligence (BI), performance management and data warehousing (DW) fields. Tune in here if you want to keep abreast of the latest trends, techniques, and technologies in this dynamic industry.
Copyright 2014
Thu, 10 Apr 2014 14:41:00 -0700

Big Data Part III: Hadoop Will Not Kill the Data Warehouse
Hadoop advocates know they've struck gold. They've got new technology that promises to transform the way organizations capture, access, analyze, and act on information. (See "Big Data Part II: Hadoop 2 Changes Everything.") Market watchers estimate the potential revenue from big data software and systems to be in the tens of billions of dollars. So, it's not surprising that Hadoop advocates are eager to discard the old to make way for the new. But in their haste, some Hadoop advocates have peddled a lot of misinformation about so-called "traditional" systems, especially the data warehouse. They seem to think that by bashing the data warehouse, they'll accelerate the pace at which people adopt Hadoop and the "data lake." (See "Big Data Part I: Beware of the Alligators in the Data Lake.") This is a counterproductive strategy for a couple of reasons.
Thu, 10 Apr 2014 14:41:00 -0700

Big Data Part II: Hadoop 2 Changes Everything
Last year, we could write off Hadoop as a giant, low-cost data processing pump for refining multi-structured data and delivering it to the data warehouse. No more. Hadoop 2, and in particular the Apache YARN project, changes everything. Released in October of 2013, Hadoop 2 turns the open source data management platform into a multipurpose operating system for big data. Rather than supporting just one type of data processing, Hadoop 2 supports any data processing application written to the YARN interface.
As such, Hadoop 2 can support not only batch processing (i.e., MapReduce) but also real-time queries, search, in-memory computing and whatever else anyone dreams up and writes to YARN.
Wed, 26 Mar 2014 12:55:00 -0700

To Explore Hadoop, Find a Hunk
Say you have a ton of data in Hadoop and you want to explore it. But you don't want to move it into another system. (After all, it's big data so why move it?) And you don't want to go through the hassle and expense of creating table schemas in Hadoop to support fast queries. (After all, this is not supposed to be a data warehouse.) So what do you do? You Hunk it. That is, you search it using Splunk software that creates virtual indexes in Hadoop. With Hunk, you don't have to move the data out of Hadoop and into an outboard analytical engine (including Splunk Enterprise). And you don't need to create table schemas in advance or at run time to guide (and limit) queries along predefined pathways. With Hunk, you point and go. It's search for Hadoop, but more scalable and manageable than open source search engines, such as Solr, according to Splunk officials.
Mon, 17 Mar 2014 19:22:36 -0700

Big Data Part I: Beware of the Alligators in the Data Lake
As silver bullets go, the "data lake" is a good one. Pitched by big data advocates, the data lake promises to speed the delivery of information and insights to the business community without the hassles imposed by IT-centric data warehousing processes. It almost seems too good to be true, and it is. With a data lake, you simply dump all your data, both structured and unstructured, into the lake (i.e., Hadoop) and then let business people "distill" their own parochial views within it using whatever technologies they feel are best suited to the task (e.g., SQL or NoSQL, disk-based or in-memory databases, MPP or SMP). And you create enterprise views by compiling and aggregating data from multiple local views. The mantra of the data lake is think global, act local. Not bad!
Wed, 12 Mar 2014 16:49:04 -0700

RedRock BI Rocks the Cloud
The cloud eliminates the need to buy, install, and manage hardware and software, significantly reducing the cost of implementing BI solutions while speeding delivery times. One new company hoping to cash in on the movement to run BI in the cloud is RedRock BI, which offers a complete BI stack in the cloud starting at $2,500 a month for up to 2TB of data. The service runs on Amazon EC2, leverages Amazon Redshift, and comes with a single-premise cloud upload utility, 120 hours of Syncsort's ETL service, a five-user license to the Yellowfin BI tools, and five hours of RedRock BI support.
Tue, 04 Mar 2014 17:21:44 -0700

SiSense Prism: The New Kid on the Visual Discovery Block
Data visualization vendor Tableau Software is the darling of the BI industry these days, giving daily doses of heartburn to established BI vendors. Yet, Tableau is being chased by newer vendors with innovative technologies that offer the promise of even faster, better, and cheaper BI for business users and analysts. One such vendor is SiSense, an Israeli firm now based in New York City, which launched in 2010 after six years of stealth development. Like Tableau, SiSense Prism is a Windows desktop tool that users can download and install from the internet. But unlike Tableau, SiSense was designed from scratch with a scalable, memory-optimized columnar database that can comfortably handle terabytes of data and dozens of concurrent queries.
Thu, 20 Feb 2014 11:49:53 -0700

Rollup and Rollout: Actian Eyes Big Data
Every database vendor of note has shipped a big data platform trying to capitalize on the current market opportunity and buzz surrounding big data.
This includes not only industry heavyweights, such as IBM, Oracle, SAP, Teradata, Hewlett-Packard and Microsoft, but also many lesser-known vendors who are hoping to carve out a profitable niche in the emerging market that combines Hadoop with traditional database management systems and data integration and analytics software.
Tue, 11 Feb 2014 10:47:25 -0700

Newcomer Decisyon Delivers Next-Generation BI
In 2013, you wouldn't think the market (or your organization) could possibly make room for another business intelligence (BI) vendor, but think again. I just took a look at Decisyon, Inc., whose flagship product, Decisyon 360, offers an integrated BI and planning tool with a modern architecture (e.g., all Java and HTML5) and integrated functionality built from the ground up for collaboration, transactions, the cloud and mobile. What more could you ask for?
Mon, 10 Feb 2014 08:25:21 -0700

The Battle for the Future of Hadoop: Cloudera versus Hortonworks
Big data begs a big question: does Hadoop replace your enterprise data warehouse or augment it? The two leading vendors of Hadoop distributions offer very different answers. For Cloudera, the first vendor to commercialize a Hadoop distribution, the answer is unequivocal: Hadoop replaces it. Last November, Cloudera finally exposed its true sentiments by introducing the Enterprise Data Hub, in which Hadoop replaces the data warehouse as the center of an organization's data management strategy. In contrast, Hortonworks takes a hybrid approach, partnering with leading commercial database vendors, such as Microsoft, Teradata, Red Hat, HP, CSC, NetApp, and SAP, to create a data environment that blends the best of Hadoop and commercial software supporting existing data warehouses, among other things. In short, Cloudera offers revolution, Hortonworks evolution.
Thu, 06 Feb 2014 17:24:26 -0700

Integrated Visual Discovery: MicroStrategy Analytics Desktop
Visual discovery tools have surged in popularity because they are quick to deploy, easy to use, and don't require upfront IT involvement or a sizable capital investment. Upstart BI vendors, such as Tableau, QlikTech and Tibco Spotfire, have led the visual discovery charge, but enterprise BI vendors are now jumping into the game, offering comparable tools that have the added benefit of integrating with their enterprise BI platforms. MicroStrategy was one of the first enterprise BI vendors to offer visual discovery capabilities two years ago. But unlike pure-play offerings, MicroStrategy Visual Insight was not a standalone desktop product that customers could buy and download via the internet; they had to first purchase and install MicroStrategy's Enterprise suite, not an inexpensive or speedy proposition. Or they could use MicroStrategy Express, a cloud-based front-end to MicroStrategy Enterprise.
Sun, 02 Feb 2014 12:22:23 -0700

Time to Watch Datawatch
While a number of established BI vendors are showing signs of duress due to the market shift from reporting to analytics and the onslaught of sexy, new visual discovery vendors, one old-time BI vendor that is bucking the trend is Datawatch, a public company whose stock price has skyrocketed from 11 to 33 in the past nine months. Founded in 1985, Datawatch for many years has toiled out of the limelight selling decidedly unsexy "report mining" tools, which enable customers to extract and integrate data from print spools, ASCII files, EDI feeds, XBRL and PDF documents and just about any other formatted document or report, as well as relational sources, and output the results into BI tools, Excel, HTML, and other interactive environments.
Mon, 27 Jan 2014 16:10:06 -0700

Adding Value to Big Data: Infochimps Raises the Bar
Big data, by itself, is useless. Think about it. Data is simply a stream of 1s and 0s.
It's not until you analyze data or use it to support or automate a business function that you gain any value. To date, the big data movement has focused on technology and components and how to make them easy to use. The big software and systems providers--IBM, Oracle, SAP, EMC, Hewlett-Packard, Microsoft, and Teradata--have all recognized the changing tide and raced to acquire, assemble, and integrate various big data technology and components. But the next wave of competition is delivering real business value with these newly architected big data platforms.
Wed, 22 Jan 2014 13:33:20 -0700

Oxdata Makes Tools for Data Scientists
What is a sexy statistician? A data scientist! That is, of course, until the statistician has to learn how to write Java and parallel code to unearth patterns and outliers in "big data," a.k.a. Hadoop. Then, the fledgling data scientist reverts to homely statistician. Fortunately, startups are rushing to fill the void with big data tools for statisticians that enable them to skip the coding and focus on what they do best: massage and model data. One of the recent newcomers is Oxdata, whose H2O prediction engine supports the complete analytical workflow, from preparing to modeling to scoring data. Its secret ingredients are an in-memory distributed key value store (think memcache) and superfast algorithms and scoring routines, including classification, regression, clustering, and ensemble modeling.
Tue, 21 Jan 2014 13:17:04 -0700

Jinfonet Goes Upstream
Through a recent consulting assignment, I discovered that there are many Java-based reporting tools on the market, each carving out a more-or-less profitable niche out of the industry spotlight. One of the more promising Java-based reporting vendors I've met is Jinfonet Software. Founded in 1998, Jinfonet, until recently, sold its flagship product, JReport, as embedded software to other software vendors.
Today, Jinfonet is taking JReport upstream, selling it more aggressively as a stand-alone product with new modules for dashboarding, self-service reporting, visual analysis, and Web and mobile delivery. It also provides access to relational, NoSQL, Hive, and Amazon Redshift data sources, and it's working on modules for predictive analytics and collaboration. JReport is no longer just a Java reporting tool; it's a full-featured BI platform.
Tue, 21 Jan 2014 12:13:47 -0700

Birst Goes Visual
Visual discovery tools have breathed new life into the business intelligence (BI) market. They are not only fun to use, quick to deploy, and easy on the eyes, they are also the fastest-growing segment of the BI market, thanks to pioneers QlikTech, Tableau, and Spotfire (now owned by Tibco), whose growth has curbed the market share of traditional enterprise BI players. In response, most other BI vendors have recently introduced their own visual analysis products to widen their product capabilities and protect their installed bases. Birst is the latest BI vendor to enter the visual discovery fray with its Birst Visualizer tool. While most enterprise BI players have simply shipped Tableau look-alike products whose only value-add comes from offering the software free of charge to enterprise customers, Birst offers some interesting new twists. By leveraging its semantic layer, search capabilities, and some pseudo-artificial intelligence, Birst Visualizer aims to bring the world of ad hoc visual analysis to non-analysts. For example, a product manager could type "sales for Mass..." and Birst Visualizer automatically displays a line chart with the past four quarters of sales for Massachusetts. From there, users can filter and fine-tune the chart with a few clicks of the mouse to detect a trend or follow a hunch. This search-driven interface, which makes educated guesses about what data users want and how they want to see it, takes ease of use to the next level.
Tue, 21 Jan 2014 12:04:29 -0700