Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University, where I participate on an academic advisory board for Masters students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank you, and all the best. Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMi Level 5, and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog on http://www.b-eye-network.com/blogs/linstedt/.

Let's face it: cloud computing, grid computing, and ubiquitous computing platforms are here to stay.  More and more data, mind you, will make its way onto these platforms, and enterprises will continue to find themselves in a world of hurt if they suffer security breaches.  If we think today's hackers are bad, just wait...  they're after the mother lode: all customer data, massive identity theft, etc...  I'm not usually one for doom and gloom; after all, we have good resources, excellent security, VPNs and firewalls, right?  In this entry we'll explore the notion of what it *might* take to protect your data in a cloud, distributed, or hosted environment.  It's a thought-provoking future experiment - maybe it would take a black swan?

Imagine your data on a cloud environment, or a hosted BI solutions vendor, or any other number of hosted environments. 

* How do you protect your data from getting stolen?

* How do you trace it if it is stolen?

* How do you track down the invader/hackers?

* How do you (or can you) stall the hackers long enough to take evasive action?

First off: we all know that NO security solution will ever be 100% fail-safe; it simply will not exist.  When someone creates a more secure solution, someone else comes up with a way to break it. That's just the way it goes.  BUT, there is such a thing as "thinking ahead", thinking outside the box, thinking about what you CAN do to prevent and deter these types of problems - which may cost you your business, may cost you money (in lawsuits), may tangle you up in legalities with governments, and on and on.

There's no question as to the issues that can arise if you don't prove that you've "done everything in your power" to protect the consumers/customers.  So what can you do?

Cloud Computing and Hosted Environments (that is, if they host your data) present unique challenges.  These cover everything from shared data on shared machines, to shared data on dedicated machines, to VPN access with non-shared, non-public machines, etc...

Well, let's blow away the fog in the cloud for a minute and take a look at a simple case:

Suppose you outsource your BI, and your data for that BI, to an analytics-as-a-service firm...  So far so good.  The BEST possible protection you can have (and they should tell you this up front) is to NOT release any personally identifiable data to the hosted service or cloud; only release aggregate data - rolled-up trends, trends of trends, and so on...  Then add to that all the standard and well-known security that you can buy, and it's fairly decent.
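To make the "only release aggregate data" idea concrete, here is a minimal sketch in Python of rolling row-level records up before they ever leave your premises. The record layout, the field names, and the roll_up function are all hypothetical, made up for illustration; they are not from any particular product.

```python
from collections import defaultdict

# Hypothetical row-level records; every field name here is illustrative only.
ORDERS = [
    {"customer": "Alice Smith", "email": "alice@example.com",
     "region": "West", "month": "2010-01", "amount": 120.0},
    {"customer": "Bob Jones", "email": "bob@example.com",
     "region": "West", "month": "2010-01", "amount": 80.0},
    {"customer": "Carol Wu", "email": "carol@example.com",
     "region": "East", "month": "2010-01", "amount": 50.0},
    {"customer": "Alice Smith", "email": "alice@example.com",
     "region": "West", "month": "2010-02", "amount": 30.0},
]


def roll_up(rows, keys=("region", "month"), measure="amount"):
    """Return one summary row per key combination.

    Only the grouping keys and the aggregates are copied into the output,
    so identifying fields such as customer and email never leave this function.
    """
    totals = defaultdict(lambda: {"count": 0, "total": 0.0})
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group]["count"] += 1
        totals[group]["total"] += row[measure]
    return [
        {**dict(zip(keys, group)),
         "order_count": totals[group]["count"],
         "total_amount": totals[group]["total"]}
        for group in sorted(totals)
    ]


if __name__ == "__main__":
    for summary in roll_up(ORDERS):
        print(summary)
```

Only the output of roll_up would be uploaded to the hosted service. Keep in mind that very small groups (a region with a single customer, say) can still point at an individual, so a real implementation would probably also want to suppress or merge them.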

Now let's say, for whatever reason, you've outsourced the CLOUD environment, and you've uploaded sensitive data...  What can you do?  What kinds of questions should you ask?  What should the vendors be willing to help with?

First, there's not much you can do - other than ask the vendors for new features... which leads me to answer the question: what can / should the vendors do in this new computing arena?

Cloud is interesting: it follows the notion that if you need more computing power, it becomes available on demand.  Ok, cool.  What if the extra computing power needed was for encryption/decryption of the data?  In other words, I believe that ALL outsourced data should be stored in encrypted format on disk - well, that's a good start, especially if each DISK array carries its own encryption/decryption hardware to perform the task on the fly.  This prevents someone from "hot-swapping" a RAID 5 disk with a copy of sensitive data and taking it home to crack it.  Well - not really; they can still hot-swap it and take it home, but cracking it (if the right algorithm was used to encrypt it) can be another story.

The next step is encryption/decryption at the database level.  I'm assuming that since you're running BI / Analytics in a cloud, you'll also need a database engine, right?  Ok, so the database engine should encrypt/decrypt the data as it works with it.  The only place the sensitive data should "be visible" is in the RAM of the database engine, or of the machine on which it is currently running.

Am I saying to encrypt ALL data?  No - that would be ludicrous, and would cost way too much money...  Only encrypt the data that your organization deems sensitive or private in nature.  However, the Database engine should make it EASY with little to no performance hit (due to cloud resource availability).

Next comes the out-of-the-box part...  What if data needed a DONGLE to be utilized properly?  What if the data was sent to your machine and exported to Excel, but to SEE the data it had to be decrypted on your desktop with a public or private key?  Now, this is interesting...  A database manufacturer working with a cloud-based hosted service to produce DONGLES instead of SEAT licenses; instead of selling "software clients" they would sell "decryption" dongles for USB...  Hmmm - interesting.  The dongles would then talk to the cloud and report who viewed what, and when - you don't have a choice (sorry, big brother has been watching for a VERY long time).
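As a rough sketch of the "who, when, and what" report such a dongle might send home, here is a small Python example. Everything in it is an assumption for illustration: the AccessEvent fields, the build_access_event and to_report names, and the JSON format are made up, and no real dongle API is implied.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessEvent:
    """One hypothetical 'who, when, what' report from a decryption dongle."""
    dongle_id: str   # which dongle unlocked the data
    user: str        # who was authenticated on it
    dataset: str     # what was viewed
    viewed_at: str   # when, as an ISO 8601 timestamp in UTC


def build_access_event(dongle_id, user, dataset, now=None):
    """Create an event, stamping it with the current UTC time by default."""
    now = now or datetime.now(timezone.utc)
    return AccessEvent(dongle_id, user, dataset, now.isoformat())


def to_report(event):
    """Serialize the event as the JSON payload that would be sent to the cloud."""
    return json.dumps(asdict(event), sort_keys=True)


if __name__ == "__main__":
    event = build_access_event("DGL-0001", "danl", "customer_trends_q1")
    print(to_report(event))
```

How the report actually reaches the hosted service, and how it is protected in transit, is left out on purpose; the point is only that each decryption can leave a record behind.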

Let's do one better... what if, just what if, the data itself could be SIGNED - just like a certificate is signed - and this "watermark" went wherever the data went?  It couldn't be removed, and it couldn't be seen, but it would be used as part of the key to decrypt the data and make sense of it.  Now, if the data itself was stolen (even in encrypted format), it would be traceable.  No - it wouldn't call home, as there's no "application" for that, but if it shows up somewhere, there would be forensic evidence (like a fingerprint) on the data that would point to its origin...  Now that's some cool science fiction stuff (I wish it were so)...
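The irremovable, invisible watermark is still science fiction, but a weaker cousin exists today: a keyed signature (HMAC) that travels alongside a record and proves which key holder produced it. Here is a minimal sketch using only Python's standard library. The sign_record and verify_record names, the sample record, and the key are hypothetical, and unlike the imagined watermark, a detached tag like this can simply be stripped off by a thief.

```python
import hashlib
import hmac
import json


def sign_record(record, key):
    """Return a hex HMAC-SHA256 tag over a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_record(record, tag, key):
    """Check the tag in constant time; False means altered data or a different key."""
    return hmac.compare_digest(sign_record(record, key), tag)


if __name__ == "__main__":
    origin_key = b"hypothetical-secret-held-by-the-data-owner"
    record = {"customer_id": 1001, "segment": "gold"}
    tag = sign_record(record, origin_key)
    print(verify_record(record, tag, origin_key))                          # untouched record
    print(verify_record({**record, "segment": "tin"}, tag, origin_key))    # altered record
```

If a signed record turns up somewhere it shouldn't, the owner of the key can show that the tag matches, which is the "fingerprint" part of the idea. Making the mark impossible to remove is the part that remains unsolved here.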

Anyhow, back to reality...  Dongles provide really good protection mechanisms today, and in fact can also be embedded with a fingerprint reader as part of the authentication mechanism.  This technology exists, and could be put to good use.

In some cases your data is worth more than your gold or money in the bank, because it represents tomorrow's profitability.  Don't you have the right to ask vendors to help you protect it?  Of course, they have the right to ask you to pay for this service...

Just a thought.  If you have some other cool thoughts, reply in the comments to this blog - I'd love to hear them.

Thanks,
Dan Linstedt
DanL@DanLinstedt.com
http://www.DanLinstedt.com


Posted March 30, 2010 3:05 PM
