Blog: Wayne Eckerson

Wayne Eckerson

Welcome to Wayne's World, my blog that illuminates the latest thinking about how to deliver insights from business data and celebrates out-of-the-box thinkers and doers in the business intelligence (BI), performance management and data warehousing (DW) fields. Tune in here if you want to keep abreast of the latest trends, techniques, and technologies in this dynamic industry.


Wayne has more than 15 years’ experience in data warehousing, business intelligence (BI) and performance management. He has conducted numerous in-depth research studies and written the best-selling book Performance Dashboards: Measuring, Monitoring, and Managing Your Business. He is a keynote speaker and blogger and conducts workshops on business analytics, performance dashboards and business intelligence. Eckerson served as director of education and research at The Data Warehousing Institute, where he oversaw the company’s content and training programs and chaired its BI Executive Summit.

Wayne is director of research at TechTarget, where he writes a weekly blog called Wayne’s World, which focuses on industry trends and examines best practices in the application of BI. He is also president of BI Leader Consulting and founder of BI Leadership Forum, a network of BI directors who exchange ideas about best practices in BI and educate the larger BI community. He can be reached at weckerson@techtarget.com.

My blog last week, "QlikTech Goes Enterprise," created quite a stir from all quarters, much to my surprise. I presented an even-handed portrait of QlikTech, and I stand by everything I wrote. However, I'd like to elaborate on some issues that came under scrutiny from readers.

First, I never claimed QlikView (QlikTech's product) is an enterprise BI product. Today, QlikView is an extremely successful departmental BI tool. The problem with all good departmental BI products is that customers push them upstream into enterprise deployments. And that's exactly what's happening with QlikView, for better or worse. The same thing happened with today's vanguard of enterprise BI players--namely, MicroStrategy, SAP BusinessObjects, and IBM Cognos--all of which started out as desktop BI products in the 1990s.

The fact that a small but increasing number of QlikView customers are purchasing and deploying thousands or, in some cases, tens of thousands of seats doesn't mean that QlikView is a bona fide enterprise BI product. At least not yet. QlikView customers and partners are currently doing somersaults to work around product limitations that I mentioned last week. I have no doubt that QlikTech will address these deficiencies in the near future. So enterprise BI players need to stay alert lest QlikView ambush them from behind. The bigger question is whether QlikView will lose some of its appeal--or more specifically, its ease of use, performance, agility, and affordability--in making the transition from a departmental to an enterprise BI product.

The BI Triumvirate
Many people last week commented on my notion of a BI triumvirate consisting of MicroStrategy for reporting, QlikView for interactive dashboarding, and Tableau for visual discovery.

First, I view these capabilities as distinct and separate categories of BI, each of which addresses different information requirements and groups of users. (In truth, there are two additional BI categories--OLAP cubes and data mining--but these are smaller niches.)

Second, the vendors I referenced are examples only. I could have substituted any number of vendors in that list. For example, Oracle BI Enterprise Edition is an enterprise dashboard tool and IBM Cognos recently released a visual discovery tool, called Insight. However, I chose the three I did because I view them as leaders in their respective categories.

But just because I reference a vendor in one category doesn't exclude it from other categories. For example, MicroStrategy also provides exceptionally good dashboards, and last year, it unveiled a Tableau-like product called Visual Insight. So, if you are a MicroStrategy customer, your triumvirate could easily be: MicroStrategy Report Services for reporting, MicroStrategy Report Services for dashboarding, and MicroStrategy Visual Insight for visual discovery. (And to boot, you can also use MicroStrategy OLAP Services for OLAP cubes and MicroStrategy Data Mining Services for data mining.) Obviously, one of the benefits of going with a BI platform vendor like MicroStrategy is that you get all the BI functionality you need in a single, integrated environment.

Interactive Dashboards

The real question is whether MicroStrategy and other comparable products are best of breed in each category. In terms of dashboards, MicroStrategy and QlikView both offer significant value but in different ways. MicroStrategy dashboards are pixel-perfect reports delivered via a Flash/DHTML interface that downloads data to the user's desktop or mobile device, while QlikView dashboards are delivered via a Web/AJAX interface powered by an in-memory database on the server. Filters in QlikView expose relationships (or lack thereof) among all elements displayed on a dashboard screen, while filters in MicroStrategy constrain views of data to support drill-down and drill-across navigation. Obviously, these are different interfaces and architectures powered by different database structures. Broadly generalizing, QlikView dashboards are more horizontally interactive (via its associative model of data), while MicroStrategy dashboards are more vertically interactive (via its dimensional data structure). The best product is in the eye of the beholder.
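To make the distinction between the two filter styles concrete, here is a toy sketch in Python. It is not how either product is implemented; the sample rows, field names, and function names are all made up for illustration. An "associative" filter reports which values in every other field are related to the selection and which are not, while a "constraining" filter simply narrows the rows.

```python
from collections import defaultdict

# Hypothetical sample rows; field names and values are invented for illustration.
ROWS = [
    {"region": "East", "product": "Widget", "rep": "Ann"},
    {"region": "East", "product": "Gadget", "rep": "Bob"},
    {"region": "West", "product": "Widget", "rep": "Cara"},
]


def associative_filter(rows, field, value):
    """For every other field, split its values into those that co-occur
    with the selection ("associated") and those that do not ("excluded")."""
    all_values = defaultdict(set)
    associated = defaultdict(set)
    for row in rows:
        for key, val in row.items():
            all_values[key].add(val)
            if row[field] == value:
                associated[key].add(val)
    return {
        key: {
            "associated": sorted(associated[key]),
            "excluded": sorted(all_values[key] - associated[key]),
        }
        for key in all_values
        if key != field
    }


def constraining_filter(rows, field, value):
    """Conventional filter: return only the rows that match the selection."""
    return [row for row in rows if row[field] == value]


result = associative_filter(ROWS, "region", "East")
narrowed = constraining_filter(ROWS, "region", "East")
```

Selecting the East region in the associative version shows that Cara is unrelated to the selection, which is the "lack thereof" a plain row filter would silently hide.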

Visual Discovery

In terms of visual discovery, MicroStrategy Visual Insight is a first-generation product that currently lacks many of the features in Tableau. For instance, today MicroStrategy Visual Insight only accesses one data source at a time and displays one visualization per page. Customers also need to purchase and implement the entire MicroStrategy stack (version 9.2) to use Visual Insight. Thus, it's not a downloadable product like Tableau, which you can install and start using within minutes. To compensate, MicroStrategy now offers a free cloud-based version of Visual Insight, called Cloud Personal, that lets users upload and manipulate Excel spreadsheets without having to install any software. Touché!

MicroStrategy plans to release a new version of Visual Insight later this year that will move the 1.0 product closer to the current version of Tableau. Of course, Tableau isn't sitting still, either. It's working on a new version slated for a fall delivery and continues to raise the bar for what's possible in a visual discovery environment.

Dashboard Development Environments

Although Tableau is a market-leading visual discovery tool, it can do other things as well. I've run into many customers that use Tableau as a development environment for building departmental dashboards. As such, Tableau often butts heads with QlikView for these types of accounts. In the past year, Tableau has added many features, including an in-memory database, server-side data storage, and data blending of multiple sources that transform it from just a very good desktop analyst tool to a departmental dashboard development environment that competes with QlikView.

Summary

Clearly, vendors watch each other carefully and mirror each other's moves. If one succeeds in the marketplace, then others quickly adopt similar functionality to staunch real or potential losses in market share and mindshare. As a result, BI innovations spread quickly across vendors and products. The key is to understand whether new functionality is more a marketing makeover than a bona fide product extension.

At some point, all customers face an "all-in-one" or "best-of-breed" decision. Enterprise BI customers have to decide whether to go with an upstart that offers market-leading innovations or wait for their BI vendor to catch up. Conversely, departmental BI customers need to decide whether to jump ship for an integrated BI platform or wait for their pet BI vendor to embrace enterprise-scale computing.

This is when it pays to know your vendor. If you have confidence in its direction and ability to execute, then it might be wise to stay put. Otherwise, it's probably time to shake the dice and look at alternatives.


Posted May 17, 2012 11:07 AM

I was fortunate to attend QlikTech's annual Partner conference in Miami Beach in April, and I discovered a few things about the fast-growing in-memory visualization vendor.

Historically, QlikTech has sold QlikView to departmental business leaders and then used a "land and expand" strategy to spread its reach within an organization and grow revenues. This strategy catapulted the company to a successful 2010 initial public offering and 40+% annual growth.
With $320 million in annual revenues, QlikTech now is determined to eclipse the $1 billion revenue mark. To do this, it's pushing hard and fast into the enterprise BI market, which has been the province of industry BI heavyweights, such as SAP, Oracle, IBM, and MicroStrategy.

Enterprise Deployments. The good news is that QlikTech's customers are leading the way into enterprise territory. QlikTech is increasingly signing six- and seven-figure deals to support tens of thousands of users. As a result, QlikTech is working hard to transform what started out as a desktop and departmental tool into a bona fide enterprise platform.

The company took its first enterprise steps in QlikView 10 when it moved security and administration from individual applications to a shared server environment. QlikView 11, released last fall, makes incremental improvements to performance, administration, team development, metadata management, clustering, and security capabilities. But QlikTech still has much work left to do. In particular, QlikView needs more granular clustering, a bona fide semantic layer, graphical data design and mapping tools, native change control and version management, an improved administrative console, and rationalized global licensing.

Courting IT. As part of its push into the enterprise, QlikTech is trying to make nice with information technology (IT) departments, which tend to view QlikView as an invasive species that threatens to undermine information consistency and their control over corporate data. The truth is that QlikView is an IT professional's best friend if it sources data from the data warehouse. Thanks to its intuitive visual interface, QlikView can help liberate data locked in the data warehouse and offload development from besieged IT staffers. There are 1,500 QlikView partner companies that have the technical expertise and project management skills to implement QlikView and can serve as extensions to the IT department. It would behoove IT departments to embrace QlikView and its partners if they want to stay a step ahead of the QlikView tsunami.

Here are a few other insights I picked up from the Partners event:

Mobility. Last year, QlikTech did an about-face with its mobile strategy, converting from native applications for iOS to generic Web-based mobile applications. A Web-based approach to mobile applications aligns better with QlikView's in-memory architecture, which requires keeping large volumes of data in memory. Since memory is limited on mobile devices, native applications effectively turn QlikView into a static dashboard viewer, which cheapens its value. However, many QlikView users are upset with the new Web-based applications because they lack native iOS features that aren't yet baked into HTML5 and because Web-based mobile applications can't be used in disconnected mode. It will be interesting to see what fixes QlikTech makes, if any, to its mobile applications.

QlikView versus Tableau. The two darlings of the BI space these days are QlikView and Tableau, which many people lump together as visual analysis tools. In reality, these two tools are quite different, serving different users and purposes. In fact, the tools are complementary, making a nice one-two combination in any BI toolkit.

QlikView is an application development platform that requires an IT team (or QlikView partners) to set up, build, and maintain the applications. Companies use QlikView to build small, purpose-built, interactive dashboards for casual users. Architecturally, the tool creates in-memory data marts to ensure fast performance. Dashboards query these in-memory data sets rather than source data directly. IT administrators generally update the data marts in batch at night, although the tool supports incremental updates in near real-time as well.

Tableau, on the other hand, is a visual exploration tool designed for power users. It's primarily a desktop tool that users can download, install, and start visualizing data in minutes. Tableau recently added an in-memory cache to improve performance when querying large databases, but in-memory processing is an adjunct to its architecture, not the core piece, unlike QlikView. Although Tableau can be used to build departmental dashboards, it is a better exploration tool than an application development platform.

Basically, companies should purchase QlikView to drive their interactive dashboards, Tableau to support visual exploration, and a standard BI tool, like MicroStrategy or IBM Cognos, to handle scheduled reporting. That's a nice, modern-day BI tool triumvirate. This standardization strategy also lowers total cost of ownership and puts pressure on standard BI tool vendors to lower prices.

Future. QlikTech announced nice collaboration and comparative analysis features in version 11, and I expect more enhancements in these areas going forward. I also expect QlikTech to fill in some holes in its product lineup. These include things like predictive analysis, support for unstructured data, location intelligence, graphical ETL tools, a visual semantic layer, better support for near-real time data delivery, and printing. Software partners provide some of these capabilities today, but QlikTech will need to acquire or build such capabilities if it's going to assume the mantle of a true enterprise BI vendor.


Posted May 3, 2012 9:30 AM

Imagine this: Would Google have built the predecessor to Hadoop in the mid-2000s if IBM's InfoSphere Streams (a.k.a. "IBM Streams") had been available? Since IBM Streams can ingest tens of thousands to millions of discrete events a second and perform complex transformations and analytics on those events with sub-second latency, why would Google have bothered to invest the man-hours into building a home-grown, distributed system to build its Web indexes?

Like Hadoop, IBM Streams runs on a cluster of commodity servers and parallelizes programming logic and handles node outages, relieving developers from having to worry about many of these low-level tasks. Moreover, contrary to what some may think about Big Blue products, Streams works on just about any data, including text, images, audio, voice, VoIP, video, web traffic, email, GPS data, financial transaction data, satellite data, and sensors.

Ok. I suspect Google has a "not invented here" mentality and needs to put its oodles of Java, Python, and C programmers to work doing something innovative to harness the massive reams of data that it collects daily from its sprawling Web and communications empire. And since it was an internet startup at the time, Google probably didn't want to pay for commercial software, and probably still doesn't. (Google was only six years old when it began developing Big Table, a predecessor to Hadoop, in 2004.) An entry-level implementation of Streams will set you back about $300,000, according to Roger Rea, Streams Product Manager at IBM.

Origins of CEP

I suppose you could argue that Hadoop inspired Streams. But that's probably not true. Streams emanates from a rather esoteric domain of computing, known as Complex Event Processing (CEP), which has been around for more than two decades and has been a major focus of a sizable amount of academic research. In fact, renowned database guru and MIT professor Michael Stonebraker threw his hat into the CEP ring in 2003 when he commercialized technology that he had been working on with colleagues from several other universities. His company, StreamBase Systems, was actually a latecomer to the CEP landscape, preceded by companies such as Tibco, Sybase, Progress Software, Oracle, Microsoft, and Informatica, all of which have developed CEP software in-house or acquired it from startups.

CEP technology is about to move from the backwaters of data management technology to center stage. The primary driver is Big Data, which is at the height of the hype cycle today. Open source technologies, such as Hadoop, have finally made it cost-effective for companies not only to amass mountains of Web and other data, but also to do something valuable with it. And no data is off limits: Twitter feeds, smart meter data, sensor feeds, video surveillance, systems logs, as well as voluminous transaction data. Much of this data is so big that you have to process it in real time or you can never catch up.

Unfortunately, Hadoop is a very young technology at this stage. It's also batch-oriented. That means you have to dump big, static files of data into a Hadoop cluster and then launch another big job to process or query the data. (Apache does support a project called Flume that is supposed to stream Web log data into Hadoop, but early users report it doesn't work very well.)

Data in Motion

But what if you could process data as events happen? In other words, analyze data in motion instead of data at rest? This is where CEP technologies come into play.

Say you are a manager at a telecommunications company who wants to count the number of dropped calls each day, track customer calling patterns, and identify individuals with pre-paid calling plans who might churn. Every night, you could dedupe and dump all six billion of your company's call detail records (CDRs) into your Hadoop cluster. Then you could issue queries against that entire data set to calculate the summaries and run the churn model. Given the volume of data, it might take more than 12 hours to process everything, and by then it would be two days since the calls were made.

But if our telecommunications manager had a CEP system, he wouldn't have to load anything or run massive queries to get the answers he wants. He would create some rules and point his CEP engine at the CDR event stream and let it work its magic. The CEP system would first dedupe the data as it comes in by checking each incoming CDR against billions of existing CDRs in a data warehouse. It would then calculate a running summary of dropped calls, summarize call activity by customer, compute the churn model, and deposit the summaries into a SQL database. And it would do all that work in a fraction of a second per event record. A marketing manager could monitor the data on a real-time dashboard and send promotional offers to customers on a prepaid plan who are likely to churn within minutes of making their final call.
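The per-event flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a real CEP engine: the record fields (`call_id`, `customer`, `dropped`) and the class name are hypothetical, an in-memory set stands in for the data warehouse lookup, and the churn model and SQL writes are left out.

```python
from collections import Counter


class CdrStreamProcessor:
    """Toy per-event processor: dedupe each record, then update running summaries."""

    def __init__(self):
        self.seen_ids = set()               # stand-in for the warehouse dedupe lookup
        self.dropped_calls = 0              # running count of dropped calls
        self.calls_by_customer = Counter()  # running call activity per customer

    def on_event(self, cdr):
        """Process one call detail record; return False if it is a duplicate."""
        if cdr["call_id"] in self.seen_ids:
            return False
        self.seen_ids.add(cdr["call_id"])
        self.calls_by_customer[cdr["customer"]] += 1
        if cdr["dropped"]:
            self.dropped_calls += 1
        return True


# Hypothetical event stream; the third record repeats the second.
events = [
    {"call_id": 1, "customer": "A", "dropped": False},
    {"call_id": 2, "customer": "A", "dropped": True},
    {"call_id": 2, "customer": "A", "dropped": True},
    {"call_id": 3, "customer": "B", "dropped": True},
]

processor = CdrStreamProcessor()
accepted = [processor.on_event(event) for event in events]
```

The point of the design is that the summaries are always current after each event, so nothing has to be loaded and queried in bulk afterward.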

Now, if that isn't powerful computing, I'm not sure what is. That's certainly worth $300,000 or even ten times that amount for an enterprise deployment like I've just described. Google be damned!

CEP Use Cases

In a broad sense, CEP software creates a sophisticated notification system that works on high-volume streams of incoming data. You use it to detect anomalies and issue alerts. Fraud detection systems are a classic example of CEP systems in action. But, in reality, CEP offers more value than just pure notification. In fact, in the age of Big Data, other use cases may come to the forefront and even give Hadoop a run for its money.

According to Neil McGovern who heads worldwide strategy at Sybase, CEP has four use cases:


  1. Situational Detection. This is the traditional use case in which CEP applies calculations and rules to streams of incoming data and identifies exceptions or anomalies.

  2. Automated Response. This is an extension of situational detection in which CEP automatically takes predefined actions in response to an event or combination of events that exceeds thresholds.

  3. Stream Transformation. Here, CEP transforms incoming events to offload the processing burden from Hadoop, ETL tools, or data warehouses. In essence, CEP becomes the transformation layer in an enterprise data environment. It can filter, dedupe, and calculate data, including running data mining algorithms on a record by record basis.

  4. Continuous Intelligence. Here, CEP powers a real-time dashboard environment that enables managers or administrators to keep their fingers on the pulse of an organization or mission-critical process.
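The first two use cases in the list above can be sketched together as a threshold rule over a sliding time window. This is an illustrative assumption about how such a rule might look, not the API of any CEP product: the class name, parameters, and callback are hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque


class ThresholdRule:
    """Fire an action when more than `limit` events arrive within `window` seconds."""

    def __init__(self, limit, window, action):
        self.limit = limit
        self.window = window
        self.action = action        # callback standing in for the automated response
        self.timestamps = deque()   # timestamps of events still inside the window

    def on_event(self, timestamp):
        self.timestamps.append(timestamp)
        # Evict events that have fallen out of the time window.
        while self.timestamps[0] <= timestamp - self.window:
            self.timestamps.popleft()
        # Situational detection: the count in the window exceeds the threshold.
        if len(self.timestamps) > self.limit:
            self.action(timestamp, len(self.timestamps))


alerts = []
rule = ThresholdRule(limit=3, window=60,
                     action=lambda ts, count: alerts.append((ts, count)))
for ts in [0, 10, 20, 30, 100, 110]:
    rule.on_event(ts)
```

Here the fourth event arrives within 60 seconds of the first three, so the rule fires once; by the time the later events arrive, the earlier ones have aged out of the window.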

In many applications, CEP supports all four use cases at once. Certainly, in the era of Big Data, companies would be wise to implement CEP technology as a stream transformation engine that minimizes the size of data they have to land in Hadoop or a data warehouse. This would reduce their hardware footprint and software execution costs. Even though it's commercial software, CEP products would provide high ROI in a Big Data environment.

CEP is a technology that offers many valuable uses and is currently being adopted by leading-edge companies. SAP plans to embed Sybase's CEP engine in all of its applications. So, if you are an SAP user, you'll be benefiting from CEP whether you know it or not. If you are a BI architect, it's time that you gave it a look to see how it can streamline your existing data processing and analytical operations.


Posted March 26, 2012 12:42 PM

The social media behemoth, Facebook, is expected to be worth $100 billion when it goes public this spring, making it the largest initial public offering (IPO) for an internet company in history. Not bad for a company projected to make about $3 billion in 2011.

The hullabaloo surrounding Facebook's IPO underscores the two sides of being the world's biggest social network. On one hand, by concentrating hundreds of millions of people on a single social media platform, Facebook offers a tantalizing opportunity for advertisers to deliver highly targeted marketing campaigns through a bevy of rich, social applications. On the other, by giving advertisers unparalleled access to people's personal and activity data, Facebook has become the lightning rod in the debate about the proper balance between openness and privacy on the social internet.

A Marketer's Dream

Facebook is a marketer's dream come true. With more than 850 million monthly active users who generate more than 2.7 billion likes and comments a day, Facebook is a treasure trove of continuously updated, highly personalized customer data. Why would a company spend $100 million or more on a customer relationship management (CRM) system, whose data has a half-life of 36 months, if it can tap Facebook's rich set of demographic, psychographic, activity, location, and social network data? Why should it build custom campaigns via email, direct mail, or traditional media if it can use Facebook as a delivery channel for highly targeted offers? This is a no-brainer!

To date, Facebook's efforts to make this incredible information asset accessible to advertising partners have been somewhat disappointing. Currently, marketers can set up their own Facebook pages and communicate with people who friend them, which provides interactivity but is not very targeted. Or they can purchase Facebook display ads, which are targeted but not very interactive.

Facebook Applications. However, the newest Facebook channel for advertisers is the most promising: custom applications built on Facebook's open application programming interfaces (APIs). Many companies have already built Facebook applications and games that provide people with highly personalized content in exchange for their "tokens."

Tokens are the keys to unlocking people's Facebook data. A token is a user's permission to access their data. It's the ultimate opt-in mechanism, and the key to making Facebook applications work. Once a marketer has your token, it can collect everything about you and your friends. To be fair, applications must explicitly request permission to access your data, specifying the content they want to extract. (See figure 1.) As long as marketers have your token, they can extract your data indefinitely and build a rich, historical profile about you.

Figure 1. Facebook Application Token
This is a typical opt-in screen that people see when they activate a Facebook application.

With a token in hand, marketers can request to collect, store, and use any of the user's information held by Facebook. And that's quite a lot of stuff. The available data includes:

  • Demographic and psychographic information users write about themselves in their profile:
    • This includes name, user name, gender, birthday, relationship status, friends, family members, religious and political views, hometown, current location, schools attended, current and past occupations, contact information (phone, address, and email), and IP address.
  • Activity data about what you do on the site:
    • This includes likes/dislikes, status updates, music, photos, videos, links, notes, Facebook applications you've opted into, places you've visited, events you've attended, and basically everything you've posted, linked to, or responded to on Facebook.
  • Demographic and activity data about your friends

This rich set of information is far more descriptive and useful than what exists in most CRM databases today. It's tremendously valuable to marketers, especially those who work in large consumer-oriented organizations who want and need to deliver highly targeted messages to customers and offer better customer service. The best part about the data is that Facebook users keep it current themselves. And if they don't, the social dynamic on Facebook often shames them into correcting inaccurate or intentionally misleading data. With Facebook, marketers can collect customer data without having to pay millions of dollars to cleanse, scrub, and update that data on a regular basis.

Why Share? The socially paranoid might ask why Facebook users willingly hand over so many personal tidbits to Facebook and its application partners. The upside is pretty obvious. For one, they enjoy the social experience on Facebook and want to replicate it on other sites. Second, they want these sites to leverage information they've already entered into Facebook, including their log-on information, so they don't have to re-educate each new site about themselves and their preferences. And last, and most important, Facebook and its partners give them stuff they want.

For instance, Hallmark has a Facebook application called Social Calendar that collects your friends' dates of birth so it can remind you to send them personalized greetings and virtual goods on their birthdays. American Express has an application called "Link>Like>Love" which delivers couponless offers from its partners tailored to your interests gleaned from Facebook that you can redeem online with your American Express card and share with your friends. (See figure 1.) This is social computing at its best. Companies tailor services to you and your friends based on your personal profile, interests, and ongoing activities.

Privacy Concerns

But not everyone thinks that personalized offers are worth sacrificing your personal privacy. With most Facebook applications, the information exchange is an all or nothing proposition. People must cede all their information to the provider or they can't use the application. In a marketer's calculus, this is a rational exchange. People provide their personal information and marketers give them highly tailored products and services. Hundreds of millions of Facebook users seem to agree.

But it's unclear how many of these people truly comprehend the amount of data that marketers collect about them and the frequency with which they collect it. Moreover, it's a fair bet that most people don't understand that opting into a Facebook application gives marketers instant access to detailed, personal information about their Facebook friends. All of them.

The Multiplier Effect. Since the average Facebook user has 130 friends, each token that a marketer receives gets magnified a hundredfold or more. Some savvy, consumer-oriented companies have already amassed detailed personal information about millions of people with just tens of thousands of tokens. Some of these companies use statistical techniques to enrich Facebook data with salary and psychographic information and then combine it with existing customer data in CRM systems. The result is that corporations can now gather detailed information about large numbers of their customers and prospects. This is a primary reason for Facebook's gravity-defying IPO valuation.
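The arithmetic behind the multiplier effect is easy to check. The sketch below uses the 130-friend average cited above; the function name, the token count of 20,000, and the `overlap` parameter (the share of friends already reached through another token) are assumptions added for illustration, not figures from any company.

```python
AVERAGE_FRIENDS = 130  # average friend count cited in the text


def estimated_reach(tokens, avg_friends=AVERAGE_FRIENDS, overlap=0.0):
    """Rough estimate of profiles reachable from a given number of tokens.

    `overlap` is a hypothetical fraction of friends already counted through
    another token; 0.0 means no two token holders share any friends.
    """
    unique_friends = tokens * avg_friends * (1.0 - overlap)
    return int(tokens + unique_friends)


reach_no_overlap = estimated_reach(20_000)                  # 2,620,000 profiles
reach_half_overlap = estimated_reach(20_000, overlap=0.5)   # 1,320,000 profiles
```

Even if half of all friends were shared between token holders, 20,000 tokens would still reach well over a million profiles, which is consistent with the "millions of people with just tens of thousands of tokens" claim.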

Although the socially paranoid are horrified by this wanton aggregation of personal data in the name of commerce, I'm a bit more sanguine. Currently, it takes a lot of technical sophistication to collect and analyze these vast amounts of customer data points, let alone use them effectively in corporate marketing campaigns. And, truth be told, we want companies to excel at using our data so they can deliver personalized offers of interest to us. Why blanket the market with irrelevant appeals that we tune out?

But privacy advocates counter that governments, insurance companies, and hackers might be able to access this information, exposing the minute details of our lives to people we'd rather not have nosing around in our affairs. They have a point. But you can't have perfect privacy within the context of social media. People engage with social media because they want to share information with others. Those who wish to remain private should not participate. But this doesn't mean we have to jettison privacy entirely. The market clearly wants Facebook and its partners to strike a balance: people want a social experience that gives them an assurance of privacy and a degree of control.

Facebook Privacy Controls. In the past, Facebook has taken a public whipping for its lack of privacy controls. Today, Facebook still comes under attack, but it does a much better job managing privacy than most of its internet peers, such as Google, which is the undisputed king of activity tracking. Google recently changed its privacy policy so that it can consolidate customer information and activity across its sprawling set of internet domains, including Google Search, Google+, YouTube, Gmail, Google Maps, and Google Apps. And since Google provides the operating system on Android devices, it can now track our every movement and conversation via our smartphones. (To learn how Google tracks your online behavior, read Patricia Seybold's excellent report titled, "How Does Google's Privacy Policy Affect You?") Other internet, media, and communications companies offer fewer privacy controls than Facebook, yet paradoxically have largely escaped unwanted attention about their use of personal information, although Google is starting to feel the heat, as it should.

For its part, Facebook gives users minute control over every aspect of their privacy. If I'm a savvy Facebook user, I can uncheck all the items I want to keep out of the hands of Facebook marketers when my friends opt in to their applications. (See figure 2.) But unfortunately, the fine print reads, "If you don't want apps and Web sites to access other categories of information (like your friends list, gender, or info you've made public) you can turn off all Platform apps." Huh? To really prevent application marketers from getting your information through friends, you can't use Facebook applications at all. That seems a little draconian, an example of a binary privacy policy--either on or off. People should be able to block individual applications from accessing their data via their friends' tokens. If you can do this, I've missed it.

Figure 2. Facebook Privacy Settings for Applications
This overlay dialogue box shows how people can control the information applications can access through their friends. The fine print at the bottom says that you need to turn off the application Platform entirely to prevent public information, including your friend list, from being captured.

Tacit versus Explicit Approval. Although Facebook's privacy controls give users the ability to determine what personal data Facebook partners can access through a friend's token, this does not amount to explicit consent. People aren't notified at the moment a marketer gains access to their data. Rather, users give blanket permission to all marketers based on the settings configured in Facebook's privacy pages. But for most people, this approval is a default setting--they never consciously configure the controls. In other words, Facebook users give tacit, not explicit, approval to marketers to mine their information. As a result, most people don't realize that their friends are giving away their personal information.

Facebook should bite the bullet and require partner applications to explicitly request friends' permission to gather their data at the time they acquire a token. It should also require partners to indicate that they can collect this data perpetually. This will take courage because explicit approvals disrupt the free flow of information and make the applications less appealing. People might get annoyed with repeated requests for access; marketers won't get as much data about people's friends; and companies will have to work harder to code and manage the applications. But some partners have already stepped up to the plate and do this voluntarily. For example, when you subscribe to Hallmark's Social Calendar application, Hallmark sends each of your friends an email requesting permission to access their dates of birth.

Simplify Privacy. Facebook can also make its privacy settings easier to access and use. Currently, people have to hit a small down arrow on the home page to access account and privacy settings. Since the arrow doesn't have a label, it almost seems as if Facebook doesn't want people to find these settings. Furthermore, the privacy tab contains 40 checkboxes spread across 10 different screens, half of which deal with Facebook applications. Although the layout and text of these screens are simple and easy to understand, asking people to navigate 10 screens and pick the right settings is too much. And not all settings are intuitive, especially for new and less active Facebook users. Did Facebook intentionally make its privacy pages complex to discourage people from changing the default settings?

If it is just poor design, there's an easy fix. For instance, I'd like to see Facebook create a small graphical privacy widget that runs on people's home pages and lets them choose from three privacy settings, ranging from "Most Private" to "Most Public." The widget would let people move a graphical slider up or down to see what personal information gets blocked or made public in each setting. This is what Internet Explorer does to help people define their Web security settings, and I think it's effective. The widget would also link to Facebook's current privacy controls so people can customize the settings further.
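To make the slider idea concrete, here is a minimal sketch in Python of how three presets could map to what gets shared and what gets blocked. Everything in it is an illustrative assumption on my part: the category names, the middle "Balanced" preset, and the preview function are hypothetical and do not reflect Facebook's actual settings or any Facebook API.

```python
# Hypothetical sketch: category names and presets are illustrative
# assumptions, not Facebook's actual privacy settings or API.
CATEGORIES = ["friends list", "gender", "birthday", "photos", "status updates"]

# Each preset lists the categories that would be shared with applications;
# anything not listed is blocked. "Balanced" is an assumed middle setting.
PRESETS = {
    "Most Private": [],
    "Balanced": ["gender", "status updates"],
    "Most Public": list(CATEGORIES),
}


def preview(preset):
    """Return (shared, blocked) category lists for one slider position."""
    shared = PRESETS[preset]
    blocked = [c for c in CATEGORIES if c not in shared]
    return shared, blocked


# Show what each slider position would share and block.
for name in PRESETS:
    shared, blocked = preview(name)
    print(f"{name}: shares {shared or 'nothing'}; blocks {blocked or 'nothing'}")
```

The point of the design is that a user picks one of three positions and immediately sees the consequences, rather than working through dozens of checkboxes; the detailed controls would remain available for anyone who wants to customize further.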

Summary.

Facebook has revolutionized how we use the internet to interact with each other and corporate entities. By consolidating hundreds of millions of people on a single social media platform, Facebook has enormous potential to make money as a medium for advertising and targeted marketing. But Facebook also has a responsibility to protect users from the over-exuberant use of personal information by advertisers and marketers. Balancing the demands of marketers with the rights of consumers will be a major challenge for Facebook as it strives to achieve its lofty IPO valuation in the coming years.


Posted March 19, 2012 12:21 PM

Informatica this week carved another notch in its Big Data belt by inking a partnership agreement with MapR, provider of one of the leading Hadoop distributions in the marketplace. The partnership further opens Hadoop to the sizable market of Informatica developers and provides a visual development environment for creating and running MapReduce jobs.

The partnership is fairly standard by Hadoop terms. Informatica can connect to MapR via PowerExchange and apply PowerCenter functions to the extracted data, such as data quality rules, profiling functions, and transformations. Informatica also provides HParser, a visual development environment for parsing and transforming Hadoop data, such as logs, call detail records, and JSON documents. Informatica has already signed similar agreements with Cloudera and Hortonworks.

Deeper Integration. But Informatica and MapR have gone two steps beyond the norm. Because MapR's unique architecture bundles an alternate file system (Network File System) behind industry-standard Hadoop interfaces, Informatica has integrated two additional products with MapR: Ultra Messaging and Fast Clone. Ultra Messaging enables Informatica customers to stream data into MapR, while Fast Clone enables them to replicate data in bulk. In addition, MapR will bundle the community edition of Informatica's HParser, making it the first Hadoop distribution to do so.

The upshot is that Informatica developers can now leverage a good portion of Informatica's data integration platform with MapR's distribution of Hadoop. Informatica is expected to announce the integration of additional Informatica products with MapR later this spring.

The two companies are currently certifying the integration work, which will be finalized by the end of Q1 2012.


Posted March 6, 2012 12:51 PM