Originally published July 24, 2013
This is my first article for the BeyeNETWORK Expert Channel that focuses on big data, so I want to share some of my personal history, insights and predictions about big data and its evolution.
I’ve been in IT for more than 30 years – I wrote my first application on a Radio Shack TRS-80 and stored the application and data on a cassette tape. I had a “big data” problem when I ran out of room on the cassette tape to store the test scores for the Math Department of my high school in 1980.
I know this is a silly analogy, but the point is that we’ve always been able to create more data than we could store and process. Before the early ’90s, we had to craft our applications’ consumption of storage and memory to fit the limitations of the memory and storage devices of the day. For example, many of you will remember that we used to limit dates to five digits using Julian dates (YYDDD) or, if we had the luxury, six digits (YYMMDD). But what happens to a two-digit year at the turn of the century? Yikes!
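To make the turn-of-the-century problem concrete, here is a minimal sketch in Python. The function name `parse_yyddd` and the hard-coded 1900 century are assumptions made for illustration, not code from any particular legacy system; they simply mimic the common practice of assuming every two-digit year belongs to the 1900s.

```python
from datetime import date, timedelta


def parse_yyddd(value: str, century: int = 1900) -> date:
    """Decode a five-digit YYDDD date, assuming a fixed century (illustrative)."""
    yy = int(value[:2])    # two-digit year
    ddd = int(value[2:])   # day of the year, 1-based
    return date(century + yy, 1, 1) + timedelta(days=ddd - 1)


last_day_of_1999 = parse_yyddd("99365")   # decodes to December 31, 1999
first_day_of_2000 = parse_yyddd("00001")  # intended as January 1, 2000

# With the century assumed, the later date decodes to 1900 and sorts first.
print(last_day_of_1999, first_day_of_2000)
print(first_day_of_2000 < last_day_of_1999)
```

Saving two digits per date was a sensible trade when storage was scarce, but the ordering of records breaks as soon as the year rolls over from 99 to 00.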
Fast forward to today. We’re creating and consuming enormous amounts of data with our always-on, always-connected, information-sensing devices. Those devices include more than smartphones, tablets and laptops. We’re also engaging with companies like Progressive Insurance, whose SnapShot device plugs into your vehicle’s diagnostic port under the dash to provide driver telematics, such as how often a driver slams on the brakes, how many miles they drive and how often they drive between midnight and 4 a.m. These are all data points streaming into storage devices, where they have to be processed to produce the risk assessment that determines a driver’s auto insurance coverage and rate.
GM also recently announced that a majority of its 2015 lineup within the U.S. and Canada will include 4G LTE connectivity provided through a partnership with AT&T. With this bandwidth in millions of vehicles over the next few years, vehicles will not only provide consumer infotainment services but also report even more telemetry on their internal operation (e.g., engine temperature, fuel economy, time to next service). That will allow GM to be proactive in providing service and support to vehicle owners and to provide user operation information much as the SnapShot does.
The examples above are consumer-focused examples of devices that generate big data, but for decades we’ve had commercial applications like SCADA (supervisory control and data acquisition) industrial control systems generating large streams of data from systems like power generation, oil and gas wells, pipelines and fabrication systems. These systems traditionally generated large volumes of data that for years could only be used for near real-time monitoring, because the storage was not available to warehouse the data and enable predictive analytics on the operation and maintenance of devices.
Today we’re able to collect this data through ever-expanding bandwidth on ubiquitous networks and store it on commodity hardware using an ever-maturing set of tools and technologies like the Hadoop Distributed File System (HDFS). We can then create applications using the MapReduce programming model, which allows us to work with thousands of computational nodes and petabytes of data, giving us the ability to explore these vast arrays of data and discover insights we never knew existed.
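For readers who have not seen the MapReduce programming model, the following is a minimal single-machine sketch of the idea in plain Python. It does not use Hadoop or any real cluster API; the sample records, the event names and the functions `map_phase`, `shuffle`, `reduce_phase` and `run_job` are all hypothetical and exist only to show the shape of a job that counts hard-braking events per driver.

```python
from collections import defaultdict

# Hypothetical telematics records: (driver_id, event_type)
EVENTS = [
    ("driver_a", "hard_brake"),
    ("driver_b", "night_drive"),
    ("driver_a", "hard_brake"),
    ("driver_b", "hard_brake"),
    ("driver_a", "night_drive"),
]


def map_phase(record):
    """Emit a (key, value) pair for each record of interest."""
    driver_id, event_type = record
    if event_type == "hard_brake":
        yield driver_id, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    mapped = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


hard_brakes = run_job(EVENTS)
print(hard_brakes)
```

In a real cluster the map and reduce steps would run in parallel on many nodes, each working on the slice of data stored nearest to it. The point of the model is that the application author writes only the two small functions and leaves distribution to the framework.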
My future articles will be focused on the business trends in big data that lead to big analytics, as well as the technology supporting the evolution – all through a perspective that seeks to separate the reality from the hype. Along this journey, I will also enlist the views and perspectives of my colleagues in the industry who are working with customers to solve big data problems and provide business insights through big data. I look forward to your feedback and comments.
SOURCE: Big Data: Is the Hype Over Yet?