BeyeNETWORK Spotlights focus on news, events and products in the business intelligence ecosystem that are poised to have a significant impact on the industry as a whole; on the enterprises that rely on business intelligence, analytics, performance management, data warehousing and/or data governance products to understand and act on the vital information that can be gleaned from their data; or on the providers of these mission-critical products.
Presented as Q&A-style articles, these interviews conducted by the BeyeNETWORK present the behind-the-scenes view that you won’t read in press releases.
This BeyeNETWORK spotlight features Ron Powell's interview with Chuck Berger, CEO of ParAccel. Ron and Chuck talk about “big data” and how it can be used effectively in analytics-driven organizations.
Chuck, it seems that everywhere you turn these days somebody's talking about “big data.” There's so much data now, in many different formats and from many different sources. Organizations are facing challenges as they try to analyze these tremendous data volumes. Could you tell us why data volumes are growing and how that is impacting organizations from your perspective?
Chuck Berger:
Sure, Ron, I'd be happy to do that. Your statement that data volume is exploding couldn’t be more true, and it's happening at an incredible rate. I've been CEO here at ParAccel for a little over a year now, and when I joined, the term big data was just beginning to catch on. This explosion is happening at such a rate that a whole new terminology and industry have grown up in a very short period of time.
What's driving this? The first driver is the explosion in mobile devices. Smartphone growth is going up at astronomical rates. The iPad is now a $12 to $13 billion a year business. There are people spending lots of time attached to their iPads, most notably my two youngest children who seem to be inseparable from them. While they're mostly playing games, they're also creating data.
The other driver is the explosion of social media. We have all heard on the news that the Facebook IPO is scheduled to go out at a $100 billion valuation, and the reason for that is the hundreds of millions of subscribers or friends that they have and all the data that those friends generate every day when they interact on Facebook.
Beyond that, I think we often get caught up in the "glamour" of new things. There are also enormous amounts of data that grow with retail volume from point-of-sale information, shop floor information that comes in through companies' normal shop floor management systems, and sensing devices that didn't exist before. For example, I get a report every month from my car via the OnStar service telling me that all my key systems are in great shape – and it even reports how much air pressure I have in each of my four tires. Additionally, it’s constantly beaming location data of where I am and where I've been. When I go through a tollbooth now, that data is logged into my EZ Pass file. Machine-generated data and the increasing number of sensors around the world are causing the volume of data to explode. It seems that while it used to take ten years for the volume of data to double, it now happens in ten months.
I've heard similar statistics, and you're absolutely right. I think the data explosion is an understatement. And I can identify with your comment about the iPad. That's what my 10-year-old son wants for his birthday! It is amazing where data and mobile are taking us. Obviously we're generating all this data, but it requires a tremendous amount of horsepower to analyze that data. And because there is so much data, how does an analyst even know what data he or she needs, let alone how big it will be? It also seems, in many cases, that the analysis of the data has to be done in real time in order to provide the most value. Are you seeing the same things from your customers?
Chuck Berger:
We are. One of the things that is driving our business is the fact that people are just beginning to learn what they can do with the volume of data and the variety of data that is out there. Therefore, they're doing a lot of exploring with that data. Even before you get to the need for a conclusion in real time, you need the ability to do ad hoc analysis to see, for example, if the change in weather in the Chicago market affects Campbell Soup sales. If it's a cold day, do I sell more? How do I price against that, and how do I stock my shelves against that? Are there particular flavors of soup that people prefer more than others in the cold weather?
As people started to move beyond their initial question and build queries against traditional databases, they’d often build queries that would take days to run and some that were impossible to run. So one of the first things that analysts are realizing is that legacy databases from the traditional data warehouse vendors like Teradata or the OLTP vendors like Microsoft and Oracle were not designed to be up to the task of doing these complex analytics against very large data sets. As a result, people are realizing they need a new type of database, like a ParAccel analytic database or others that we compete against in this space, to do the analysis because it's not only the speed of the final result, but it's the ability to iterate and modify the query so that they can make it an even better predictor of behavior, which is ultimately what they're all trying to do.
When I look at analyzing this data, the cloud will start to play a major role in this type of analysis. What does ParAccel offer customers that want to perform their analytics in the cloud?
Chuck Berger:
Well, as you may be aware, we recently announced our ParAccel Analytic Platform Cloud Edition, which is our database with the extra features required to run in a cloud configuration. Essentially, we're seeing two types of demand from our customers for cloud implementations. The largest, frankly, in our space is private clouds where a large enterprise, Citibank or someone like that, wants to maximize the use of their server and storage assets and so they virtualize them in a cloud environment. That allows them to move resources around on an as-needed basis. For example, during the Christmas season they may need more compute power on credit card clearing than they need for market analysis data so they can have fluid application of their server resources. We allow that now. Particularly important is that because we're a massively parallel database, adding nodes to a cluster in a cloud is relatively seamless and gives them the ability to maximize the use of our database.
There's certainly another category where we also participate, which is a more public cloud. We've announced a relationship with MicroStrategy – and as you know Amazon invested in us and they are the largest public cloud provider. As an example, suppose somebody has an ad hoc project and they're not sure it's going to become a permanent effort, but they need to run some analytics. They can very quickly spin up the compute resources they need to do that and run the analysis they need to run. If it works, they can either leave it in the cloud, or move it into an on-premise implementation. Or maybe as some companies are doing, they find out that the cloud is a preferred option for them. So again, giving the ability to very quickly spin up and spin down the resources you need to do timely analysis of the data is where the cloud comes in.
When you talk about these large public and private clouds, obviously it takes a lot of hardware to make these things happen. Is there an expensive hardware component associated with ParAccel?
Chuck Berger:
Well, one of our key differentiators is we are a software-only solution, which allows our customers to run the ParAccel database on any standard server hardware. If you are an HP shop and you've negotiated a great deal with HP, you can just add to that at a very effective price and then buy the software from us. With an appliance vendor, like Netezza or Teradata, in addition to paying markup on the software, you pay markup on the appliance that you need to buy from them, plus you don't have the flexibility to spin up and spin down once you've made a commitment to the amount of the appliance you're going to buy. So we're often as little as 10% of the hardware cost of our competitors who have chosen the appliance model.
That's a tremendous advantage especially when you can use commodity off-the-shelf hardware that you probably already have in your enterprise.
Chuck Berger:
Absolutely, and an added benefit is that the system administrators don't need to learn a new environment.
Could you give us some examples of the benefits of an analytics-driven enterprise and how ParAccel can help organizations achieve that goal?
Chuck Berger:
Sure, I can give you two or three. We have one very large global banking customer who was analyzing market data related to mortgage-backed securities every month as the data was published. They were using legacy technology. Once they loaded the queries, it took them 4½ days to run those queries. They moved the queries over to our analytic platform and they now process them in 7½ minutes. You can imagine the very tangible benefit of being in the market with fresh data almost a full business week ahead of what it had been. And in fact, both their trading volume and their profits on their trading volume have gone up dramatically.
We have a retail customer who is doing their market basket analysis on ParAccel. Their prior legacy database could only use three months of data. Even with that, they had to reduce the data set to a 10% sample of the data. They really wanted to do 12 months of analysis with all the data. Even with the three months and the sample, the queries were taking several hours to run, sometimes longer. We enabled them to use a full 12 months and 100% of the data, and they did it in 3½ minutes. They believe that the accuracy of their market basket analysis has improved dramatically.
Another customer is using ParAccel for credit card fraud detection, and they have seen about a 30% increase in their detection of fraud. Particularly important to those of us who aren't committing fraud on our credit cards, they are also seeing about a 25% reduction in false positives, which are very costly to vendors and credit card issuers who have to respond to them.
Well, that's excellent. To me, the timing alone increases accuracy because if it takes you a week, things can change in a week. But if it takes just minutes, things are not going to change that quickly. So it makes a lot of sense that your customers would be achieving such great results.
Chuck Berger:
That brings us back to your first premise, which is that the volume of data has exploded and the only value besides storing that data is the ability to analyze it and do it very, very quickly. Mortgage-backed securities are trading every second, and millions of customers are shopping and checking out at stores every day. If you can analyze the data very fast, you can potentially alter their behavior while they're in the store or while they’re checking out. It's just a huge benefit. At the store of the future – which is modeled, I'm sure, at every major retailer – the shopping cart is going to have a sensor in it so you know where each person is in the store. The pricing will be variable and controlled by a large database. If you see a person following a particular pattern through the store, you might flash specials in front of them right as they shop. I might see a different special than you'll see because I exhibited behavior that shows I'm more likely to buy that particular item than you are – so they don't need to discount it for me, but maybe they need to discount it for you to get you to buy it.
The other thing is that databases and specialized hardware are incredibly expensive. If I need to increase my capacity due to increased analytic workloads, I can offload that to a database like ParAccel and reduce my costs significantly. That seems like a definite win-win situation.
Chuck Berger:
Yes. Earlier I mentioned the customer who had only been able to use three months of data and we enabled them to use 12. We had replaced the legacy vendor that's still their principal data warehouse. We enabled them to do things they weren't able to do before at about a tenth of the hardware cost of what their legacy vendor charged, but at 20 times better performance.
That’s certainly a great result. Thank you, Chuck, for providing our readers with very useful information about big data, analytic databases and the success achieved by ParAccel customers.