Blog: William McKnight

William McKnight

Hello and welcome to my blog!

I will periodically be sharing my thoughts and observations on information management here in the blog. I am passionate about the effective creation, management and distribution of information for the benefit of company goals, and I'm thrilled to be a part of my clients' growth plans and connect what the industry provides to those goals. I have played many roles, but the perspective I come from is benefit to the end client. I hope the entries can be of some modest benefit to that goal. Please share your thoughts and input to the topics.


William is President of McKnight Consulting Group. His practice focuses on delivering business value and solving business problems utilizing proven, streamlined approaches in data warehousing, master data management and business intelligence, all with a focus on data quality and scalable architectures.

William has more than 20 years of information management experience, nearly half of which was gained in IT leadership positions, dealing firsthand with the challenging issues his clients now face.  His IT and consulting teams have won best practice competitions for their implementations. In 11 years of consulting, he has been a part of 150 client programs worldwide, has over 300 articles, white papers and tips in publication and is a frequent international speaker. William is the author of 90 Days to Success in Consulting. Contact William at william@williammcknight.com or (214) 514-1444.

Editor's Note: More articles, news and resources are available in William's BeyeNETWORK expert channel. Be sure to visit today. Meet, discuss and receive advice from William McKnight by visiting his BeyeCONNECT community.


Santa brought many a package this year - a packaged data warehouse.  There are a host of these offerings now available which represent the consultant's dream of repackaging custom work for more customers.  Of course, the popular ones come from the large vendors who own the systems which originate the data that is challenging to warehouse.  For many, these approaches yield the following practical benefits:

 

  • Enterprise Data Model using relatively acceptable best-practices and standards
  • Views for the Business Model metadata
  • ETL framework containing mappings/routines to load and refresh the data warehouse
  • 100s of reports and dashboard content
  • ETL routines have built-in Change Data Capture logic
  • Open Interfaces to  other data sources
  • Industry standard Metadata for Common Business Functions (e.g., AP, AR, GL, Sales Forecasting, Pipeline)
  • Faster Time to some value
  • Small team to deploy initially
  • New features available via upgrades

 

It IS the fastest route to SOME value.  Is it the right value for you?  Is the time savings worth the extra cost?  Will it pay off in the long run?  It's not something to get into lightly.  Unfortunately, packaged data warehousing tends to come with many misconceptions that make it seem more attractive.  These are the most common:

 

  1. It's a total plug-and-play
  2. Data modeling is not important
  3. The client does not have to staff data warehouse expertise
  4. The business does not have to be involved
  5. Data quality is not important
  6. Your individual enhancements will merge seamlessly with vendor enhancements in their releases
  7. The vendor will continue to upgrade its solution in real-time with the corresponding source system
  8. There are no other costs, i.e., Hardware costs, Employee costs, Consulting costs

 

Be sure to take a complete and multi-faceted view of this important decision.


Posted February 2, 2010 11:39 AM

I'm preparing my content for Enterprise Data World 2010, where I will speak on March 17.  My topic is "Comparison of Enterprise Data Platforms".  It's mostly based on my popular service helping get end clients into the right platforms for them.  I'm covering the DBMS market (where we've come, where we're going), data warehouse appliances, columnar data storage, open source, on-demand and virtualization.  In 1 hour! 

I'll do my best, but one thing that struck me as I reviewed the material was that the future of data management will hold many opportunities as well as much confusion.  About 5 years ago, the enterprise data warehouse was not only hitting its stride, it was considered a "holy grail" and an end-game.  Many data management resources went into building data warehouses (with single platforms, strictly row-based structures, on standard hardware and in-house).

This is not so true any more.  We've re-entered chaos.  A columnar orientation is working its way into enterprise architectures and popular database offerings (Microsoft, Oracle).  You need to know how it works and when to use it.  Appliances and on-demand services are pushing function out of the enterprise.  You need to know how to manage the changing roles and responsibilities.  With departmental budgets buying software-as-a-service business intelligence, low-end appliances, and open source BI tools, shops need to have integration capabilities now more than ever.  And, something I've been saying for a while, master data management is leading the charge of business intelligence back into the operational world.

The economies and performance abilities of data management are changing and the benefits need to go to your bottom line.  Most unhelpful are the terminology wars surrounding all of this.

And finally what to start doing about the MapReduce approach, and syndicated data, and the data cloud.... 

Sorry, no answers here, but sometimes it's helpful to identify the questions.


Posted January 25, 2010 11:24 AM

There are several new features that will be in SQL Server 2008 R2, which is due to be GA in May.  As someone who implements on multiple platforms, I'm constantly comparing platform capabilities and consequently have a hard time getting too excited about releases, but R2 is giving me some reason to be excited.

Like many of SQL Server's R2 releases, it builds on its corresponding R1.  SQL Server 2008 has been commercially available since mid 2008.  From a data warehousing perspective, SQL Server has long been a choice for data marts, regardless of the data warehouse platform.  It has also been a data warehouse platform for the midmarket and occasionally a Fortune 100 company.  Some of the scalability concerns that have limited SQL Server's reach may be answered in R2 with the added support for 256 processors.  This is quite a move up from 64 processors.  Also improving scalability will be the acquired DataAllegro technology, rebranded Parallel Data Warehouse, Microsoft's data warehouse appliance.

What I am most interested in and excited about is Master Data Services, Microsoft's entry into the crowding master data management market.  Microsoft is the first to make it a part of the DBMS package.  They are clearly targeting the Microsoft shops that are having master data issues.  I've had the CTP for some time (this is a worked over form of the product from Stratature, which Microsoft acquired in 2007) and have been able to exercise it and even see it implemented at a client.  While its capabilities are limited compared to its more mature competition, it has a lot of potential. Microsoft will be putting strong development effort behind Master Data Services.  Even today, it can certainly, with some effort, play the technical role in a MDM program.

And then there's Gemini or PowerPivot for Excel. Gemini is the new generation of Analysis Services.  Those chagrined by the notion that Microsoft Excel is the #1 business intelligence tool (it is) will have a lot more to be concerned about now.  Bye-bye ProClarity interface.  We must embrace Excel.  I'm increasingly crafting procedures for IT's role with Excel and this need will only increase as Gemini will cause an even more fluid spreadsheet environment in shops.  Data security strategies are imperative. 

Gemini will also be a collaborative environment.  Everyone in a workgroup can cooperate in managing Excel.  Excel is certainly already mission critical and R2 will create even more possibilities to depend on Excel.  The Gemini server is SharePoint, another element gaining traction in the SQL Server family.

I am also completely impressed with the addition of the columnar, in-memory storage option for this downloaded data, called VertiPaq.  Some data just belongs in columnar, though most would not want to put all their data in this structure.  It's great for data that will have a lot of column-wise functions applied to it, as well as generally for those queries that don't return a lot of columns.  It's also great for compressing data.  I have a seminar on columnar and expect to be helping more clients effectively tier their data to this format now that SQL Server will be utilizing the option.
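To make the columnar point a little more concrete, here is a minimal sketch in Python.  It is a toy illustration of the general idea only, not a description of how SQL Server's storage engine works; the sample data, the function names and the choice of run-length encoding as the compression method are all assumptions made for the example.

```python
from itertools import groupby

# Hypothetical sample data: (state, product, quantity)
rows = [
    ("TX", "widget", 10),
    ("TX", "gadget", 7),
    ("TX", "widget", 3),
    ("CA", "widget", 5),
    ("CA", "gadget", 8),
]

# The same data in a columnar layout: one list per column.
columns = {
    "state": [row[0] for row in rows],
    "product": [row[1] for row in rows],
    "qty": [row[2] for row in rows],
}


def total_qty_row_store(rows):
    """Sum quantity by walking whole rows, even though only one field is needed."""
    return sum(row[2] for row in rows)


def total_qty_column_store(columns):
    """Sum quantity by reading only the one column the query needs."""
    return sum(columns["qty"])


def run_length_encode(values):
    """Collapse runs of repeated adjacent values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Repeated values sit next to each other in a column, so they compress well.
state_rle = run_length_encode(columns["state"])
```

Both functions return the same total, but the columnar version never touches the state or product values.  The five state entries also collapse into two runs, which is the kind of saving that is hard to get when the values are interleaved with other fields in a row.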

So, are you ready?  Are you on SQL Server 2008?  Are you ready to upgrade to Office 2010 to take advantage of Gemini?  SQL Server 2008 R2 will be one of the big BI stories of 2010.


Posted December 28, 2009 8:55 AM

If a little bit of something is good, more must be better.  This is true of some things - exercise, community service, patience, etc.  Well, it's true to a degree.  What about business intelligence?  Almost all businesses have the proverbial business intelligence user community now.  Though sometimes fragmented and informal, the IT organizations supporting these communities are planning to expand their reach and their community.  In most cases, this is completely warranted.  Most organizations are not at the point of diminishing returns with their user community rollout.

Remember the first Inmon definition of data warehousing? "a subject-oriented, integrated, non-volatile, time-variant collection of data, organized to support management needs."  It's a solid definition that has stood the test of time.  I know it's for data warehousing, not business intelligence, but you can see the mentality of the management user there and no mention of the rest of the organization.  Of course, BI has all kinds of users - people and systems - now.  And organizations are better off for it.

However, the so-called Pervasive BI movement goes beyond any of this.  It's "BI for everyone."  Unlike my thoughts on self-service BI, I am less optimistic about pervasive BI in the short term.  I'm all for smarter organizations, but not many workforces are structured such that everyone, or even most, employees have impressive key business decisions to make that incorporate the need for interaction with data well beyond their immediate environment.  I can think of a few exceptions - financial research firms, pharmaceutical research, etc. 

Probably the key thing to find palatable ground here is to define user.  Everyone in an organization can be the beneficiary of BI in the organization.  However, the tiered nature of the user community will remain true for the next decade.  Some will:

1.       Mine the data

2.       Interact with the data

3.       Consume and interact with reports

4.       Consume reports for decision making

5.       Make tactical decisions for others based on information seen in a report

6.       Make small adjustments to their team's workdays based on information seen in a report

7.       Make tactical decisions based on information seen in a report

8.       Make small adjustments to their workday based on information seen in a report

At some point in this hierarchy, the need for actual BI tools stops.  So, if you have a project that will entail a large number of new BI users - and they're going to productively benefit the business through the knowledge they gain from being a user - that is great.  If you are displaying corporate KPIs throughout the organization, I can see that.  If the culture that arises from this sharing, and possibly decomposing the KPIs to multiple levels, is empowering everyone to do their best job, that too is great.  But that's not BI for everyone.

Let your organization benefit from BI.  However, a project to blanket the business with BI tools in an untargeted fashion because it is thought that [pervasive BI is good, it means everyone needs BI and BI means tools] is not the best use of resources.


Posted December 1, 2009 8:14 PM

Right now, as you read this, many business decision makers are waiting.  They are waiting on IT to deliver business intelligence so they can make an informed business decision.  They may have filled out the IT-required form or maybe just lobbed a call in to the most responsive BI developer in IT.  They have asked for a simple report with A LOT of data on it because they know the cycle time on these things is days or weeks so they're trying to think ahead a little bit.  Also, it's clearly a report and not an interactive tool because those "IT tools" are difficult to understand without some training, for which they don't have time, or they don't think the tools are meant for them. 

 

As time passes, many of these reports will lose their value to the requestor.  They won't break off the request, however, because it's not their time that's invested in building the report and the report MAY still have some value anyway.  Whether the decision maker moved on without it or not, she will have to sort through some voluminous data to get to some value, and that's also time consuming and value-reducing.

 

So it goes in many environments that have not made the leap to empowering the end user.  It seems every shop sees the value in self-service BI and minimizing or eliminating the IT bottleneck, yet few have made it happen.  So should we accept this reality and down-license the business intelligence software we're buying and forsake user empowerment?  No way.

 

Self-service BI takes work.  It takes focused energy until an organization makes some breakthroughs in process and the culture change necessary.  This is important enough that if most users are dependent on IT, this is probably one of the most important strategic activities to work on for the BI program, and the business as a whole.  So why the delay?

 

Change is hard.  Even small investments in changing habits are hard to do.  Despite comments to the contrary, using IT and treating BI as "just reports" is the devil many know.  Here are some tips (directed at IT) for getting past this point.  Remember it is IT that bears the brunt of the failure of user empowerment.

 

1.       Promote the data that is available.  Sure, BI is built now based on known requirements.  However, there are still those unknown value propositions for the data that can be unearthed quickly and provide more value than the requirements for which BI was built.  Users are often reluctant to articulate requirements if the turnaround is multi-months, but if the data is already accessible, well-performing and clean, usage ideas will take form.  More usage, and getting  will mean more user profiles.

2.       Promote the various methods of accessing data.  Early in a BI cycle, I will profile the users to the style of access that is most suitable for them and passive report receiving is not the default.  Dashboards are a step up from reporting and ad-hoc, interactive environments  are beyond that.  Furthermore, it may be possible to automatically enact the business change that is desired through operational business intelligence.   Put it all on the table with the users.

3.       Invest in training the users to interact with the data.  Real training - classrooms, materials, breaks, snacks, etc.  Half technical-general, half company/data model-specific.  The first few trainings will be learning experiences for everyone involved.  There may be serious feedback or some serious deskside component to the training until it is learned what works to make the user feel empowered.  And, while the training is serious, it can't be LONG, boring or irrelevant.

4.       Invest early extra energy in a few key users.  Instill in them the notion of the iterative nature of BI and other things they need to know, and get them sharing the good word among their peers.  Self-service BI is iterative like everything else.  Start small.

5.       Don't, for a second, believe it's an all-or-nothing proposition.  This is where many shops go wrong.  They think they can flip a switch and users will do it all.  There should be a healthy split of responsibilities between IT and the users for data access.  There is still a data access role in IT in self-service BI environments.


Posted November 15, 2009 5:54 PM

My book, 90 Days to Success in Consulting, published by Course Technology PTR, is now available at Amazon, Barnes & Noble, Borders, etc. 

 

Consulting is a passion and I've seen and learned a lot in my 15 years of doing it.  What I do with the book is to provide an understandable and implementable action plan in consulting, whether you're looking to self-consult, are already working for a consulting firm, or are contemplating either route.  It's also for those who know they need to treat their non-consultant position like a consultant.  In terms of fields of application of consulting, this book is about the business of consulting and applies to all fields.

 

I hope this sounds intriguing and that you'll get a copy!!! 

  


 

Here are the chapters:

 

What is consulting?

The Traits of a Professional Consultant

The Top Consultant Image Building Blocks

The Bottom Line

How to Stay Current: Technology and Skills

Service Planning

Establishing Fees

The Role of the Consultant

Client Contracts

Acquiring People

Requests for Information/Requests for Proposals

Client Communications

Writing and Speaking

Managing Capital

Partnerships

Getting the Word Out

Marketable Value and Exit Strategies

Parting Words


Posted October 3, 2009 10:37 AM

I'm engaged with a number of clients who need the benefits of a streamlined data infrastructure with multiple systems utilizing the same customer list, product list and so on.  Master data management (MDM) is a term that comes to mind to solve the problems.  Sometimes I'm engaged to organize master data management in the environment and sometimes to look at an organization's serious data issues - the ones driving up TCO, keeping the staff busy with repetitive, error-prone tasks and keeping them from getting a full view of customers and products - which leads to MDM as an area to look at for solutions.

I think of MDM as a term that addresses a number of real business problems.  It has to do with data quality, sharing data across systems, an enterprise data model, and governance processes and workflows as well as sometimes being a direct data layer to a data-intensive application like CRM.  It's not a product, although some products bearing the label provide these benefits.

So, why do MDM?  If and when it alleviates real business pain, solves business problems and enables new ROI-producing functionality.  MDM happens to be a discipline sitting squarely in the sweet spot of modern competitive advantages.

 


Posted September 16, 2009 12:16 PM

Netezza's big technology news this week came with an unexpected price fall for the technology.  Whereas Netezza customers to-date have paid around $60,000 per terabyte of storage, Netezza's new TwinFin appliance will go for $20,000 per terabyte of storage.  This storage assumes a 2.25 X compression ratio, which Netezza says is typical and will improve, so figure you are actually storing a little less than half of that in terms of real storage, but the practical application of the price points remains.
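The arithmetic behind those figures can be sketched in a few lines of Python.  The prices and the 2.25 X ratio are the ones quoted above; the function names are made up for illustration, and the sketch assumes the quoted price applies to compressed user data capacity.  It is not stated whether the earlier $60,000 figure was quoted on the same basis, so the price drop below compares the quoted numbers only.

```python
def physical_tb(user_tb, compression_ratio):
    """Physical disk needed to hold a given amount of user data after compression."""
    return user_tb / compression_ratio


def price_per_physical_tb(price_per_user_tb, compression_ratio):
    """Implied price per terabyte of physical disk, given a price per user terabyte."""
    return price_per_user_tb * compression_ratio


OLD_PRICE = 60_000   # dollars per terabyte, as quoted for customers to date
NEW_PRICE = 20_000   # dollars per terabyte, as quoted for TwinFin
COMPRESSION = 2.25   # compression ratio Netezza describes as typical

# One terabyte of user data lands on a little less than half a terabyte of disk.
disk_per_user_tb = physical_tb(1.0, COMPRESSION)

# Implied price of the underlying disk if the quoted price is per compressed terabyte.
implied_disk_price = price_per_physical_tb(NEW_PRICE, COMPRESSION)

# Fractional drop in the quoted price per terabyte.
price_drop = 1 - NEW_PRICE / OLD_PRICE
```

On these numbers each user terabyte occupies about 0.44 TB of disk, the implied price per physical terabyte is $45,000, and the quoted price has fallen by about two thirds.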

In addition to the price drop, the upper limit has been expanded to, depending on who you speak with, 700 terabytes or 1 petabyte.  Either way, it's a big leap and a huge amount of storage that's now possible with Netezza.

Making this all possible are some forklifts and tweaks to the underlying technology.  First and foremost is the switch from the Hitachi drives with 2- or 4-way HP/Intel host CPUs to Intel-based IBM blade servers.  Netezza is taking advantage of the faster chips, bigger disks and better interconnects that have come to market in recent years.  It has also introduced a cache, which will improve the access performance of commonly accessed tables and sections of tables.

The field programmable gate array (FPGA) remains very important in the architecture.  However, the disk controlling function has been removed from the FPGA in favor of an actual disk controller. 

I wrote a description of Netezza technology some time ago that may be worth refreshing on regarding the FPGA:

"The architecture is a shared nothing but there is a major twist.  The I/O module is placed adjacent to the CPU. The disk is directly attached to the SPU processing module.  More importantly, logic is added to the CPU with a Field Programmable Gate Array (FPGA)  that performs record selection and projection, processes usually reserved for relatively much later in a query cycle for other systems.  The FPGA and CPU are physically connected to the disk drive.  This is the real key to Netezza query performance success - filtering at the disk level.  This logic, combined with the physical proximity, creates an environment that will move data the least distance to satisfy a query.  The SMP host will perform final aggregation and any merge sort required.   Enough logic is currently in the FPGA to make a real difference in the performance of most queries."
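The "filtering at the disk level" idea in that description can be sketched roughly as follows.  This is a toy Python model of the concept only, not Netezza's implementation; the record layout and the names spu_scan, host_aggregate and is_east are hypothetical.

```python
def spu_scan(disk_records, predicate, wanted_fields):
    """Apply record selection and projection as records stream off one disk."""
    for record in disk_records:
        if predicate(record):  # selection: drop rows the query does not need
            # projection: keep only the columns the query asked for
            yield {field: record[field] for field in wanted_fields}


def host_aggregate(spu_streams, field):
    """Final aggregation on the host over the already-filtered streams."""
    return sum(record[field] for stream in spu_streams for record in stream)


def is_east(record):
    """Hypothetical query predicate: region = 'east'."""
    return record["region"] == "east"


# Two hypothetical disks, each attached to its own processing unit.
disk_a = [
    {"region": "east", "amount": 100, "note": "a"},
    {"region": "west", "amount": 40, "note": "b"},
]
disk_b = [
    {"region": "east", "amount": 25, "note": "c"},
    {"region": "east", "amount": 5, "note": "d"},
]

# Each unit filters its own disk; only the surviving fields travel to the host.
streams = [spu_scan(disk, is_east, ["amount"]) for disk in (disk_a, disk_b)]
total = host_aggregate(streams, "amount")
```

The point of the sketch is that the west row and every note value are discarded before anything is handed to the host, so the least possible data moves to satisfy the query.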

The Intel adoption, as well as going all-Linux, makes other software more compatible with Netezza.  Obviously one of Netezza's aims is to bring over some of those other DBMS applications - appliance and non-appliance. 

The lowered price point is actually quite important in this rapidly commoditizing field.  And data size is actually a good barometer for measuring price since once you get into the terabytes with an enterprise data warehouse, the workload tends to mix in some similar ways across enterprises.  For those high-data, but specific-use workloads, Netezza will have a high capacity model available soon.  As well, Netezza intends to deliver entry-level and "memory intensive" models.  This strategy is not dissimilar to Teradata's appliance line, already available and at around Netezza's new price points.

This is a very good signal from Netezza - that it is still investing and intends to pursue price/performance for its customers.  At a time when major players like Teradata, with a longer pedigree and half the Global 2000 as customers, have entered the appliance market, and with Microsoft's looming Madison, something was necessary from Netezza.  The question is will Netezza be able to make up for the price drop with significantly more volume in this space they essentially pioneered?

 


Posted August 5, 2009 9:54 PM

 


 

 

1.       The last time I checked, the NBA All-Star teams were stocked with players from 20 or so teams.  Kobe Bryant, Dwight Howard, LeBron James, Steve Nash and Dwyane Wade all play for different teams.  If a consultancy puts forward its team as the all-league all-star team, with no deficiencies whatsoever, that is a red flag.  All teams have deficiencies.  Both sides should understand this and strive for a best fit, given the reality that talent gets spread around naturally.

2.       However, consulting teams need a winning formula.  Do they know what it is?  Will that work in your environment?  For the Lakers, it was Kobe and a solid supporting cast.  For the Magic, it was Howard, Lewis, Turkoglu and a solid rotation.  Other teams put all shooters on the floor or play defense first.   

3.       I did not notice an NBA team, in an effort to save money, put the cheapest, most inexperienced player they could find on the court this season.  Heck, there are people who would pay for the glory of playing.  No, I think every team tried their best to win as many games as possible.  If your consulting team consists of 3 solid players that you are presented with, with the rest to be named later, make sure they are not filling it out with the cheapest players they can find.  Of course, that is misguided on their part as well, but sometimes you need to save the consultancies from doing the wrong thing for both of you.

4.       Scores and game clocks are not kept in the referee's head.  He does not suddenly blow the whistle and say "game over, Suns win 104-99, goodbye."  The time and the score are kept on large scoreboards for all to see throughout the game.  Do you have a scoreboard?  Does your consultancy?  It is important to know how much progress is being made throughout the game.

5.       Beyond the starting 5, NBA benches are filled with world-class athletes, many of whom get as much or more playing time as starters.  What is your consultancy's bench?  I'm not referring, necessarily, to their employees not on billing, but just what is their contingency plan in case of injury, sudden and unexpected poor performance or if a player were to leave in the middle of the game?  Is the consultancy plugged into the culture of the discipline they are engaged in?  Do they have a warm network?  Do they scout?

6.       NBA teams come to expect certain things from the places they play - things like fans, referees, locker rooms, food, transportation, hoops, lights, a marked court and basketballs to play with.  What is your consulting team expecting from you?  Software?  Hardware?  Requirements?  Access to certain individuals?  Physical space?  The ability to network their laptops?  It would be a drag to see the game try to start without a basketball or to have the lights go out in the 3rd quarter.  Clear up expectations ahead of time with your consultancy.

7.       When the Pistons show up to the American Airlines Arena in Miami, they expect the Heat to come out of the dressing room to play them.  Imagine their surprise should the Warriors come out, or should they have to play against 6 players on the court!  They have game-planned for one team (5 players at a time) and now get to play an entirely different team.  This bit of surprise will not help the Pistons be successful that night.  Is there information the consultancy is not asking for that they should be in order to know what they are up against?

8.       Sure, playing basketball is fun.  However, it's also work.  Players dive after loose balls, flying into the stands if necessary, and are expected to go all out with little consequence to their body.  They need to be skilled at avoiding injury, but cannot play overly concerned with it.  There are many moments in a consulting project where it's less fun and more work.  Are you hiring a consultancy that is prepared for the potential hard work ahead?

9.       NBA teams shoot about 80 field goals per game, hitting less than half.  Actually, only a handful of players in the league hit over 50 percent of their field goals.  However, you can't score or win if you don't shoot.  The Harlem Globetrotters are entertaining when they go into their circle and keep passing the ball, but you don't see that in a real game.  Is your consultancy willing to shoot, and are you willing to let them, even though half of the shots aren't going in, or is the consultancy interested in making entertaining passes, perhaps back to you? 

10.   Finally, experience counts.  At the NBA draft last week, I was alarmed when the announcers said that some of the second round picks would not even make the NBA.  Only 60 players are drafted each year, all with eye-popping highlights from college and European leagues, and some won't make it?!  That's how tough it is.  Is your consultancy circumventing this rule and passing along the inexperienced to you?

 


Posted June 29, 2009 11:55 AM

ParAccel is a columnar, MPP, TPC-H-submitted data warehouse appliance.  I received an update from ParAccel yesterday and in the wake of the public challenges at some other small firms filling some similar market needs, I was curious about their customer wins and continued development.  I'm happy to report there has been some of both.  Some of the information is embargoed, but keep an eye on them for announcements soon.

 

What I can say is I received a sneak peek at PADB (ParAccel Analytic Database) V2.0 as well as a customer review.  Their new deal was sold by EMC, and this is, in my opinion, perhaps a forerunner of a new outlet for ParAccel as well as where EMC might participate in the large data appliance market.  EMC is a technology partner to ParAccel.

 

I have otherwise talked about these columnar appliances and when to use them.

 

PADB Version 2 innovations include SAN-Based optimizations which will find ParAccel claiming to be the fastest data processing machine in the world.

 

ParAccel uses a leader node, similar to Netezza, but different from Vertica and others, which initiates, coordinates, and collects results for the distributed queries and parallel load operations supported by the system.   The SAN utilization in Version 2 may further differentiate ParAccel and Vertica.  ParAccel is one of the recent market entrants with the potential to be "game changing".  

 



Posted June 18, 2009 11:33 AM