Blog: Lou Agosta Copyright 2016 Wed, 14 Sep 2011 16:06:57 -0700
Can an Expert in the Game Show Jeopardy! Solve Healthcare Problems in America?

The diagnosis of a disease is part science, part intuition and artistry. The medical model trains doctors and healthcare specialists using an apprentice system (in addition, of course, to long schooling and lab work). The hierarchical nature of disease diagnosis has long invited automation using computers and databases. Early expert medical systems such as MYCIN at Stanford or CADUCEUS at the University of Pittsburgh were initially modest-sized arrays of if-then rules or semantic networks that grew explosively in resource consumption, time-to-manage, cost, and complexity of use. They were compared in terms of accuracy and speed with the results generated by real-world physicians. The matter of accountability and error was left to be worked out later. Early results were such that automated diagnosis was as much work, slower, and not significantly better - though the automation would occasionally be surprisingly "out of the box" with something no one else had imagined. One lesson learned? Any computer system is better managed and deployed as an automated co-pilot than as a primary locus of decision making or responsibility.

Work has been ongoing at universities and research labs over the decades, and new results are starting to emerge based on orders-of-magnitude improvements in computing power, reduced storage costs, ease of administration, and usability enhancements. The case in point is IBM's Watson, which has been programmed to handle significant aspects of natural language processing, play Jeopardy! (it beat the humans), and, as they say in the corporate world, other duties as assigned.

Watson generates and prunes back hypotheses in a way that simulates what human beings do in formulating a differential diagnosis. However, the computer system does so in an explicit, verbose, and even clunky way using massively parallel processing, whereas the human expert distills the result out of experience, years of training, and unconscious pattern matching. Watson requires about eight refrigerator-sized cabinets for its hardware. The human brain still occupies a space about the size of a shoe box.

Still, the accomplishment is substantial. An initial application being considered is having Watson scan the vast medical literature on treatments and procedures to match evidence-based outcomes to individual persons or cohorts with the disease in question. This is where Watson's strengths in natural language processing, formulating hypotheses, and pruning them back based on confidence level calculations - the same strengths that enabled it to win at Jeopardy! - come into play. In addition, oncology is a key initial target area because of the complexity of the underlying disorder as well as the sheer number of individual variables. Be ready for some surprises as Watson percolates up innovative approaches to treatment that are expensive and do not necessarily satisfy anyone's cost containment algorithm. Meanwhile, there are literally a million new medical articles published each year, though only a tiny fraction of them are relevant to any particular case. M.D.s are human beings and have been unable to "know everything" there is to know about a specialty for at least thirty years. In short, Watson just could be the optimal technology for finding that elusive needle in a haystack - and doing so cost-effectively.

A differential diagnosis in medicine is a set of hypotheses that first have to be expanded, then pruned, and finally combined based on confidence and prior probability to yield an answer. This corresponds to the so-called Deep Question Answering (DeepQA) architecture implemented in Watson. Within five years, similar technologies will have been licensed and migrated to clinical decision support systems from standard EMR/EHR vendors.
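
To make the generate, prune, and combine idea concrete, here is a minimal sketch in Python. It is not Watson's actual method: the `Hypothesis` class, the `rank_differential` function, the pruning threshold, and all of the numbers are assumptions made up for illustration, and the "combination" is a simple prior-times-confidence weighting.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    diagnosis: str
    prior: float       # assumed prior probability of the condition
    confidence: float  # assumed evidence score in [0, 1] for this patient


def rank_differential(hypotheses, prune_below=0.05):
    """Prune weak hypotheses, then combine prior and confidence into a ranking."""
    # Prune: drop candidates whose evidence score is too low
    kept = [h for h in hypotheses if h.confidence >= prune_below]
    # Combine: weight each surviving candidate by prior * confidence
    weights = [h.prior * h.confidence for h in kept]
    total = sum(weights)
    if total == 0:
        return []
    # Normalize so the scores of the surviving candidates sum to 1
    scored = [(h.diagnosis, w / total) for h, w in zip(kept, weights)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative candidates only; the names and numbers are hypothetical
candidates = [
    Hypothesis("condition A", prior=0.10, confidence=0.60),
    Hypothesis("condition B", prior=0.02, confidence=0.90),
    Hypothesis("condition C", prior=0.30, confidence=0.01),
]
for name, score in rank_differential(candidates):
    print(f"{name}: {score:.2f}")
```

Note that the sketch returns the whole ranked list rather than a single winner, in keeping with the idea of showing the differential to a physician for review.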

While your clinical data warehouse may not be running 3,000 Power 750 cores and terabytes of self-contained data in a physical footprint about the size of eight refrigerators, some key lessons learned are available even for a modest implementation of clinical data warehousing decision support:

  • Position the clinical data warehouse as a physician's assistant (think: co-pilot) to answer questions, provide a "sanity check," and fill in the gaps created by explosively growing treatments.
  • Plan on significant data preparation (and attention to data quality) to get data down to the level of granularity required to make a differential diagnosis. ICD-10 (currently mandated for 10/2013 but likely to slip) will help a lot, but may still have gaps.
  • Plan on significant data preparation (and more attention to data quality) to get data down to the level of granularity required to make a meaningful financial decision about the effectiveness of a given treatment or procedure. Pricing and cost data is dynamic, changing over time. New treatments start out expensive and become less costly. Time series pricing data will be on the critical path. ICD-10 (currently mandated for 10/2013 but likely to slip) will help but will need to be augmented significantly with new pricing data structures, and even then may still have gaps.
  • Often there is no one right answer in medicine - it is called a "differential diagnosis" - prefer systems that show the differential (few of them today do, though reportedly Watson can be so configured) and trace the logic at a high level for medical review.
  • Continue to lobby for tort and liability reform as computers are made part of the health care team, even in an assistant role. Legal issues may delay, but will not stop implementation in the service of better quality care.
  • Look to natural language interfaces to make the computing system a part of the health care team, but be prepared to work with a printout or a screen until then.
  • Advanced clinical decision support, rare in the market at this time, is like a resident in psychiatry, in that it learns from its right and wrong answers using machine learning technologies as well as "hard coded" answers from a database or semantic network.
  • This will take "before Google (BG)" and "after Google (AG)" in medical training to a new level. Watson-like systems will be available on a smart phone or tablet to residents and attendings at the bedside.
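
The point about time series pricing data in the list above can be sketched as an effective-dated price history. This is only one possible shape for such a structure: the `PriceHistory` class, its method names, and the dollar figures are hypothetical and are not drawn from any particular warehouse product.

```python
from bisect import bisect_right
from datetime import date


class PriceHistory:
    """Effective-dated prices for one treatment code (hypothetical structure)."""

    def __init__(self):
        self._dates = []   # effective dates, kept sorted
        self._prices = []  # price in force from the matching date onward

    def add(self, effective, price):
        # Insert while keeping both lists ordered by effective date
        i = bisect_right(self._dates, effective)
        self._dates.insert(i, effective)
        self._prices.insert(i, price)

    def price_on(self, day):
        # Find the last price whose effective date is on or before `day`
        i = bisect_right(self._dates, day)
        if i == 0:
            return None  # no price had been recorded yet on that day
        return self._prices[i - 1]


history = PriceHistory()
history.add(date(2011, 1, 1), 1200.00)  # illustrative figures, not real prices
history.add(date(2012, 1, 1), 950.00)
print(history.price_on(date(2011, 6, 15)))
print(history.price_on(date(2012, 3, 1)))
```

Keeping the full history, rather than overwriting the current price, is what allows the cost of a treatment to be evaluated as of the date it was actually delivered.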

Finally, for the curious: for the hardware and customized software for some 3,000 Power 750 cores (commercially available "off the shelf"), terabytes of data, and the time and effort of a development team of some 25 people with Ph.D.s working for four years (the latter being the real expense), my back-of-the-envelope pricing (after all, this is a blog post!) weighs in at least in the ballpark of $100 million. This is probably low, but I am embarrassed to price it higher. This does not include the cost of preparing the videos and marketing. One final thought. The four-year development time of this project is about the length of time needed to train a psychiatrist in a standard residency program.


  1. "Wellpoint's New Hire. What is Watson?" The Wall Street Journal. September 13, 2011.

  2. IBM: "The Science Behind an Answer":


comparative effectiveness research Wed, 14 Sep 2011 16:06:57 -0700
Live from Pervasive Software IntegratioNEXT 2010 Conference: Pervasive Software Innovates Both Below and Above the Waterline!

Amid its 39th consecutive quarter of profitability, Pervasive has launched a new SaaS version of its flagship Data Integrator (DI) product called DI "Cloud Edition". In short, as part of a process described by CTO Mike Hoskins as "innovating below the water line," the Pervasive software development team has service-enabled the platform for SOA. This enables the DI product to bring its extensive connectivity to the cloud, either on the Pervasive DataCloud on Amazon's EC2 or on any other cloud.


At the same time the architecture was undergoing a major upgrade, innovation was also occurring above the waterline. Significant functionality was added in the areas of data profiling and data matching. Both profiling and matching are now exploiting the DataRush parallel processing engine. According to Jim Falgout (DataRush Chief Technologist), DataRush continues to set records for high performance. The latest record was broken on September 27, 2010, when it delivered an order of magnitude greater throughput than earlier biological algorithm results, performing 986 billion cell updates/second on 10 million protein sequences using a 384-core SGI Altix UV 1000 in 81.1 seconds.[1] Wow! This enables the entire platform to deliver the basics for a complete data governance infrastructure to those enterprises that know the answer to the question "What are your data quality policies and procedures?" When combined with the data mining open source offering called "KNIME" (Konstanz Information Miner), which featured prominently in numerous use cases at the conference, Pervasive is now a triple threat across data integration, quality, and predictive analytics.


In addition to software innovation, Pervasive is also delivering pricing innovations. Client (end user) enterprises will be unambiguously pleased to hear about this one, unlike some uses of the phrase "pricing innovations" that have meant gaming the system to implement price increases. Instead of per-connector pricing, the classic approach for data integration vendors, according to which each individual connector costs an additional fee, Pervasive provides all of its available connectors in DI v10 Cloud Edition for a single reasonable monthly charge. The figure that was given to me was $1,000 a month for the software, including all the available connectors. And remember, Pervasive has been famous since its days doing business as Data Junction for the diversity of its connectors, across the entire spectrum from xbase class to high end enterprise adapters. For those enterprises confronting the dual challenges of a short timeline to results and a large number of heterogeneous data sources, the recommendation is clear: check this out.

[1] For further details see "Pervasive DataRush on SGI® Altix® UV 1000 Shatters Smith-Waterman Throughput Record by 43 Percent":

data integration Thu, 04 Nov 2010 10:23:27 -0700
Live from Pervasive Software IntegratioNEXT 2010 Conference: Envisioning Healthcare Interoperability

Healthcare and healthcare information technology (HIT) continue to be a data integration challenge for payers, providers, and consumers of healthcare services. This post will explore the role of Pervasive's Data Integrator software in the solution envisioned by one of the partners presenting at the iNExt, Misys Open Source Solutions (MOSS). However, first we need to take a step back and get oriented to the opportunity and the challenge.

Enabling legislation by the federal government in the form of the American Recovery and Reinvestment Act (ARRA) of 2009 provides financial incentives to healthcare providers (hospitals, clinics, group practices) that install and demonstrate the meaningful use of software to (1) capture and share healthcare data by the year 2011, (2) enable clinical decision support using the data captured in (1) by 2013, and (3) improve healthcare outcomes (i.e., cause patients to get well) against the track record laid down in (1) by 2015. Although the rules are complex and subject to further refinement by government regulators, one thing is clear. Meaningful use requires the ability of the electronic medical record (EMR/EHR) systems to exchange significant volumes of data and do so in a way that preserves the semantics (i.e., the meaning of the data).

Misys Open Source Solutions (MOSS) is jumping with both feet into the maelstrom of competing requirements, standards, and technologies with a plan to make a difference. MOSS first merged with Allscripts and then separated from them as Allscripts merged with Eclipsys. MOSS is now a standalone enterprise on a mission to harmonize medical records. Riding on Apache 2.0 and leveraging the Open Health Tools (OHT), MOSS is inviting would-be operators and participants in health information exchanges (HIE) to download its components and make the integrated healthcare enterprise (IHE) a reality. As of this writing, the need is great and so is the vision. However, the challenges are also formidable and extend across the requirements for patient identification, document sharing, record location, audit trail production and authentication, subscription services, and clinical care documentation and persistence in a repository. Since the software is open source and comes at no additional cost, MOSS's revenue model relies on fees earned for providing maintenance and support.

Whether the HIE is public or private, the operator confronts the challenge of translating between a dizzying array of healthcare standards - HIPAA, HL7, X12, HCFA, NCPDP, and so on. With literally hundreds of data formats in legacy as well as modern systems out there, those HIEs that are able to provide a platform for interoperability are the ones that will survive and prosper by value-added services rather than just forwarding data. The connection with Pervasive is direct since the incoming data formats may be in HL7 version 2.3 and the outbound format in version 2.7. Pervasive is cloud-enabled and, not to put too fine a point on it, has more data conversion formats than you can shake a stick at. Pervasive is a contributor to the solution at the data integration, data quality, and data mining levels of the technology stack being envisioned by MOSS. Now and in the future, an important differentiator between winners and runners-up amongst HIEs will be the ability to translate between diverse data formats. In addition, this work must be done with velocity and in significant volume, while preserving the semantics. This approach will add value to the content rather than just acting as a delivery service.
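
As a small illustration of the first step in such a translation, the sketch below reads the version field from the header of a pipe-delimited HL7 v2 message to decide whether conversion is needed. It is a hand-rolled example and not Pervasive's API; the function names and the sample message are made up, and a real 2.3-to-2.7 conversion would involve segment-level and field-level mapping well beyond this check.

```python
def hl7_version(message):
    """Return the version ID (field MSH-12) of a pipe-delimited HL7 v2 message."""
    header = message.split("\r")[0]  # segments are separated by carriage returns
    fields = header.split("|")
    if fields[0] != "MSH" or len(fields) < 12:
        raise ValueError("not a recognizable HL7 v2 message header")
    # MSH-1 is the field separator itself, so MSH-12 sits at index 11
    return fields[11]


def needs_translation(message, target="2.7"):
    # Hypothetical helper: flag messages whose version differs from the target
    return hl7_version(message) != target


# Made-up sample message; application and facility names are placeholders
sample = "MSH|^~\\&|APP|FAC|RAPP|RFAC|20101104||ADT^A01|123|P|2.3\rPID|1||12345"
print(hl7_version(sample))
print(needs_translation(sample))
```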

As noted in the companion post to this one, Pervasive has a cloud-enabled version of its data integration product, Version 10, Cloud Edition. DataRush, the parallel processing engine, continues to set new records for high performance parallel processing across large numbers of cores. Significant new functionality in data profiling and data matching is now available as well, making Pervasive a triple threat across data integration, data quality, and, when open source data mining from KNIME is included, data mining.

[Abbreviations: NCPDP = National Council for Prescription Drug Programs; HCFA = Health Care Financing Administration (the precursor to the Centers for Medicare & Medicaid Services); HIPAA = Health Insurance Portability and Accountability Act; HL7 = Health Level Seven; X12 = a national insurance standard format for electronic data exchange of health data.]

data integration Thu, 04 Nov 2010 10:04:25 -0700
Innovations in Predictive Analytics: Introducing Predixion Insight

Predictive analytics has been Page One news for many years in the business and popular press, and I caught up with Predixion Software principals Jamie McLennan (CTO) and Simon Arkell (CEO) shortly prior to the launch earlier this month. Marketers, managers, and analysts alike continue to lust after golden nuggets of information hidden amidst the voluminous river of data flowing across business systems in the interest of revenue opportunities. Competitive opportunities require fast time to results, and competition for clients remains intense when they can be enticed to engage.


Predixion represents yet another decisive step in the march of predictive analytics in two tension-laden but related directions. First, the march down market towards wider deployment of predictive analytics is enabled by deployment in the cloud for a modest monthly fee. Second, the march up market towards engaging and solving sophisticated predictive problems is enabled by innovations in implementing advanced algorithms from Predixion in readily usable contexts such as Excel and PowerPivot. Predixion Insight offers a wizard-driven plug-in for Excel and PowerPivot that uses a cloud-based predictive engine and storage.


The intersection of predictive analytics and cloud computing forms a compelling value proposition. The initial effort on the part of the business analyst is modest in comparison with provisioning an in-house predictive analytics platform. The starting price is reportedly $99 per seat per month - though be sure to ask about additional fees that may apply. The business need has never been greater. In survey after survey, nearly 80% of enterprise clients acknowledge having a data warehouse in production. What is the obvious next step, given a repository of clean, consistent data about customers, products, markets, and promotions? Create a monetary opportunity using advanced analytics without having to hire a cadre of statistical PhDs.


Predixion Insight advances the use of cloud computing to deliver innovations in algorithms in advanced analytics. Wizards are provided to analyze key influencers, detect categories, fill from example, forecast, highlight exceptions, run a prediction calculator, and perform shopping basket analysis. While the Microsoft BI platform is not the only choice for advances in usability, those marching to the beat of the usability drum acknowledge its leadership in ease-of-deployment wizards and implementation practices.


Finally, notwithstanding the authentic advances being delivered by Predixion and its competitors, a few words of caution about predictive analytics. Discovering and determining meaning is a business task, not a statistical one. There are no spurious relationships between variables, only spurious interpretations. How to guard against misinterpretation? Deep experience and knowledge of human behavior (the customer), the business (the service), and the market (the intersection of the latter two). The more challenging (competitive, immature, commodity-like, etc.) the business environment, the more worthwhile are the advantages attainable through predictive analytics. The more challenging the business environment, the more important are planning, prototyping, and perseverance. The more challenging the business environment, the more valuable are management perspective and depth of understanding. Finally, prospective clients will still need to be rigorous about data preparation and quality. The narrative about ease of use and lowering the implementation bar should not obscure the back-story that many of the challenges in predictive analytics relate to the accuracy, consistency, and timeliness of the data. You cannot shrink wrap knowledge of your business. However, the market basket analysis algorithm and wizards that I saw demoed by Predixion come as close to doing so as is humanly possible so far.

Business Intelligence Tue, 05 Oct 2010 07:02:08 -0700
Healthcare Patient Safety in the Cross-Hairs: The Uses of IT Delivered Checklists

Consider three different scenarios that place healthcare patient safety at risk. The first is an individual hazard, the second human behavior, and the third a system issue in the broad sense of "system" as distinct from information technology (IT).

The first consists in placing concentrated potassium alongside diluted solutions of potassium-based electrolytes. Now you need to know that intravenous administration of the former (concentrated potassium) results in stopping the heart almost instantaneously. In one tragic case, for example, an individual provider mistakenly selected a vial of potassium chloride instead of furosemide, both of which were kept on nearby shelves just above the floor. A mental slip - an erroneous association of potassium on the label with the potassium-excreting diuretic - likely resulted in the failure to recognize the error until she went back to the pharmacy to document removal of the drug.[1] By then it was too late.

Second, a pharmacist supervising a technician, working under stress and tight time deadlines due to short staffing, does not notice that the sodium chloride in a chemotherapy solution is not 0.9% as it should be but is over 23%. After the solution is administered, the patient, a child, experiences a severe headache and thirst, lapses into a coma, and dies. This results in legislation in the state of Ohio called Emily's Law. The supervisor loses his license and is reportedly sentenced to six months in jail.[2]

Finally, in a patient quality assurance session, the psychiatric residents on call at a major urban teaching hospital dedicated to community service express concern that patients are being forwarded to the psychiatric ward without proper medical (physical) screening. People with psychiatric symptoms can be sick too with life-threatening physical disorders. In most cases, it was 3 AM and the attending physician was either not responsive or dismissive of the issue. In one instance, the patient had a heart rate of 25 (where 80+ would be expected) and a Code had to be declared. The nurses on the psychiatric unit were not allowed to push a mainline insertion into the artery to administer the atropine, and the harried resident had to perform the procedure himself. Fortunately, this individual knew what he was doing and likely saved a life. In another case, the patient was delirious, and a routine neurological exam - performed on the psychiatric unit, not in the emergency room where it ought to have been done - resulted in his being rushed into the operating room to save his life.

In all three cases, training is more than adequate. The delivery of additional training would not have made a difference. The individual knew concentrated potassium was toxic but grabbed the wrong container, the pharmacist knew the proper mixture, and the emergency room knew how to conduct basic physical (neurological) exams for medical well-being. What then is the recommendation?

One timely suggestion is to manage quality and extreme complexity by means of checklists. A checklist of high alert chemicals can be assembled and referenced. Wherever a patient is being delivered a life-saving therapy, sign off on a checklist of steps in preparing the medication can [should] be mandatory. The review of physical status of patients in the emergency room is perhaps the easiest of all to be accommodated, since vital signs and status are readily definable. Note that such an approach should contain a "safe harbor" for the acknowledgment of human and system errors, as is routinely provided in the case of failures of airplane safety, including crashes. Otherwise, people will be people and try to hide the problem, making a recurrence inevitable.

The connection with healthcare information technology (HIT) is now at hand. IT professionals have always been friends of checklists. Computer systems are notoriously complex and often are far from intuitive. Hence, the importance of asking the right questions at the right time in the process of troubleshooting the IT system. Healthcare professionals are also longtime friends of checklists for similar reasons, both by training and experience. Sometimes symptoms loudly proclaim what they are; but often they can be misleading or anomalous. The differential diagnosis separates the amateurs from the veterans. Finally, we arrive at a wide area of agreement between these two professions, eager as they are to find some common ground.

Naturally, a three ring binder on a shelf with hard copy is always a handy backup; however, the computer is a ready-made medium for delivering advice, including top ten things to watch in the form of a checklist, in real time to a stressed provider. In the case of emergency rooms and clinics, the hospital information system (HIS) is the choice platform to install, update, and maintain the checklist electronically. However, this means that the performance of the system needs to be consistent with delivery of the information in real time or near real time mode. It also means that the provider should be trained in the expert fast path to the information and should not need to hunt and peck through too many screens. The latter, of course, would be equivalent to not having a functioning list at all.
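
A minimal sketch of what an electronically maintained sign-off checklist might look like is shown below. The `Checklist` class and the step names are hypothetical and for illustration only; a real hospital information system would add authentication, timestamps, and an audit trail.

```python
from dataclasses import dataclass, field


@dataclass
class Checklist:
    """A hypothetical sign-off checklist; step names are illustrative only."""
    title: str
    steps: list
    signed: dict = field(default_factory=dict)  # step -> who signed off

    def sign_off(self, step, who):
        # Reject steps that are not on the list rather than silently adding them
        if step not in self.steps:
            raise ValueError(f"unknown step: {step}")
        self.signed[step] = who

    def outstanding(self):
        # Steps still awaiting sign-off, in their original order
        return [s for s in self.steps if s not in self.signed]

    def is_complete(self):
        return not self.outstanding()


prep = Checklist(
    title="Medication preparation",
    steps=["verify drug label", "verify concentration", "second-person check"],
)
prep.sign_off("verify drug label", "technician")
prep.sign_off("verify concentration", "pharmacist")
print(prep.outstanding())
print(prep.is_complete())
```

The design choice worth noting is that the list reports what is still outstanding in one call, so the provider does not have to page through screens to find the missing step.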

And this is where a dose of training in information technology will make a difference. The prognosis is especially favorable if the staff already have a friendly - or at least accepting - relationship with the HIS. It reduces paper work, improves workflow, and allows information sharing to coordinate care of patients.

This is also a rich area for further development and growth as systems provide support to the physician in making sure all of the options have been checked. The system does not replace the doctor, but acts like a co-pilot or navigator to perform computationally intense tasks that would otherwise take too much time in situations of high stress and time pressure. Obviously issues of high performance response on the part of the IT system and usability (from the perspective of the professional staff) loom large here. Look forward to further discussion on these points. Meanwhile, we now have another item to add to the vendor selection checklist in choosing a HIS: must be able to provide templates (and where applicable, content) for clinical checklists by subject matter area.

It should be noted that "the checklist manifesto" is the recommendation in a book of the same title by the physician Atul Gawande, MD, who is also the author of an engaging column in the New Yorker on all manner of things medical and has been a celebrity since being quoted by President Obama in the healthcare debate. However, make no mistake. The checklist is not a silver bullet. By all means, marshal checklists and use them to advantage. Still, there are some scenarios where long training, experience, a steady hand, and concentration are indispensable. In flight school, take-off and landing checklists are essential. This is the normal checklist. Non-normal checklists are also available on how to restart your engines from 30,000 feet. But a double bird strike on take-off at 5,000 feet, disabling both engines, is not on anyone's list. You cannot even get the binder down off the shelf in the amount of available time, the latter generally being the critical variable in short supply. Then there is no substitute for experience, leavened with a certain grace. But for the rest of us humans, and in the meantime, assemble, update, and use your checklists.

[1] "Potassium may no longer be stocked on patient care units, but serious threats still exist"  Oct 4 2007,

[2] "An Injustice has been done,"

Mon, 16 Aug 2010 13:08:27 -0700
Life Imitates Reality TV (Again): Mark Hurd Out at H-P

The cruel economy strikes again, and Mark Hurd is out as CEO at Hewlett-Packard Co. (H-P). First lesson learned here? Always be sure that your expense reports are accurate. If the participant in the event was "Jodie Fisher," then the documentation should say "Jodie Fisher". And, of course, while $20K - the alleged amount of overpayment for marketing services reportedly not performed - is a tad north of dinner for 25 people at Charlie Trotter's (Chicago), we are not talking Enron (fortunately). H-P said its investigation showed that Mr. Hurd did not violate H-P's sexual harassment policy, but other anomalies were uncovered.[1] There is no excuse for it, nor will any be suggested here. Zero tolerance strikes again - as well it should. And yet ...


Lesson number two, and this is a tough one. As Presidential candidate Jimmy Carter said in a notorious interview with Playboy magazine (in 1976), "I've looked on a lot of women with lust. I've committed adultery in my heart many times." Much as the tabloids might miss the details of a juicy sexual adventure, this is not one. If this is not the case of a sexual phantasy gone astray, then I would not know one. Make no mistake. Mark Hurd is no Bill Clinton, who after all got to keep his job (on a strict party-line vote). This is a severe penalty to pay for the "crime" of having a sexual phantasy. The cynic in me momentarily says Mr. Hurd ought to have been fired for failing to score, but that is an idea for which I apologize in advance to him, knowing that he is a family man. Leave it to a numbers guy to try to impress a girl with ... well, numbers. No doubt he will schedule some time off to repair the damage done, and that is as it should be.


The third lesson? Life imitates art? Not quite. Life imitates Reality TV. And this is the sad part. Not to dump on the would-be femme fatale in this case, who, after all, was literally a Hollywood star and a contestant in a Reality TV show, but she anticipated a settlement and reportedly has one. Not having a single fact in the matter, I infer that this is the governance paradigm in reality TV. Write a letter; get even more publicity (and maybe a settlement or at least another TV show); no harm done. Not quite. Now she says she's sorry that Mr. Hurd lost his job because he turned over her letter alleging sexual harassment to the H-P Board. I believe it. Rumors that she has reportedly invited him to appear with her on the next episode of The Apprentice have been debunked. While this has rich comic possibilities, I fear that I have crossed the line between reality and phantasy (again), just like this entire episode.


And that is the final lesson here. Strict corporate governance is appropriate. Corporate governance is different than The Lives of the Rich and Famous. The pendulum swings back-and-forth between wild parties and zero tolerance. Mr. Hurd's mistake was momentarily to confuse the non-reality of Reality TV with trust and integrity between human beings. Yet he ought to have known that, in many contexts, a mere conversation about sex (or anything remotely sexual) is a method of gaining power and maximizing revenue (through a settlement), especially among those who traditionally have lacked power. Poor judgment in choosing friends? You bet! The unintended result? Corporate America has lost a powerful and articulate leader, albeit an imperfect one, at a time when leadership is in short supply. We eagerly await his reflections (next book?) on life in the corporate jungle, business leadership, and the cruel economy; and if Mr. Hurd requires an editor or a foil for his ideas, then I hope he will reach out. He has a listening here.

[1] August 8, 2010, "Mark Hurd Failed to Follow HP Code," BEN WORTHEN and JOANN S. LUBLIN,

Data Warehousing Mon, 09 Aug 2010 09:54:16 -0700
BMMsoft Raises the Bar on Data Warehousing: Integrating Structured and Unstructured Data

What happens when an irresistible force meets an immovable object? We are about to find out. The irresistible force of BI, eDiscovery, compliance, fraud detection, governance, risk management, and other analytic and regulatory mandates is heading straight toward the immovable rock of year-to-year 10% reductions in information technology budgets.


The convergence of the markets for structured and unstructured data has been heralded many times, but maybe the time has come. We think that the new generation of solutions with increasing overlap of structured and unstructured data and multi-functionality will emerge and that BMMsoft EDMT® Server is the pioneer in that space. Looking into the crystal ball, what will happen is that an increasing overlap already underway will disrupt incumbents across these diverse markets.

The worldwide BI market as defined by Gartner is sized at $8.8B.[1] Realistically that includes a lot of Business Objects (SAP), SAS applications, and IBM solutions, so the database part of that is probably closer to $6 billion.[2] The document management software market is estimated at nearly $3 billion.[3] While email archiving is relatively new and growing rapidly due to new federal regulations, it has now reached the $1 billion "take off" point. In short, at nearly $10 billion total, a product that addressed requirements across all three of these markets, with a reasonable prospect of response from even one third of the enterprises, would have an outside boundary of over $3 billion. This is a substantial market under any interpretation.
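
The back-of-the-envelope sizing above can be checked in a few lines of Python. The figures are the ones quoted in the paragraph; the variable names are mine, and the one-third response rate is the paragraph's own assumption rather than a measured value.

```python
# Figures in billions of US dollars, taken from the paragraph above
database_bi = 6.0          # database share of the BI market
document_management = 3.0  # "nearly $3 billion"
email_archiving = 1.0      # the "take off" point

combined = database_bi + document_management + email_archiving
response_share = 1 / 3     # "even one third of the enterprises"
addressable = combined * response_share

print(f"combined: ${combined:.1f}B, addressable: ${addressable:.1f}B")
```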


In the meantime, the existing market for these three classes of products is fragmented into silos of the traditional data warehousing vendors, email archiving, and document management, the latter sometimes including compliance and governance software. The first are well known in the market - extending from such stalwarts as HP, IBM, Oracle, Microsoft, SAP, to data warehousing appliances and column-oriented databases - and will not be repeated here (though one new development will be noted below). Document management systems include IBM FileNet Business Process Manager, EMC Documentum, OpenText LiveLink ECM, and Autonomy Cardiff Liquid Office. Strictly speaking, risk management is considered a separate market from document management. Risk management and compliance offerings include Aventis, BWise, Cura, Protiviti, Compliance 360, and IBM, which has at least two offerings, one based on Lotus Notes and one based on FileNet. This list is partial and could easily be expanded with many best of breed offerings. The result? Fragmentation. Diversity, though not in a positive sense. Many offerings instead of a comprehensive approach to unified access and unified analysis.


Five years from now data will be as heterogeneous as ever and the uses of data even more so, but individual products - single instance products, not solutions - will characterize a transformed market for database management that traverses the boundaries between email archiving, document management, and data warehousing with an agility that is only dreamt about in today's world. Video clips are now common on social networking sites such as Facebook and YouTube. Corporate sponsorship of such opportunities for viral marketing is becoming more common. The requirement to track and manage product brands and images will necessitate the archiving of such material, so multi-media (image/video/audio) is being added to the mix.


This future is being driven and realized by the imperative for business transparency, risk management and compliance, and growing regulatory requirements layered on top of existing business intelligence and document management requirements. Still, document management is distinct from workflow. If an enterprise needs workflow, then it will continue to require a special purpose document management system. Workflow was invented in 1985 by FileNet, which was acquired by IBM in 2006 and continues to lead the pack where detailed step-by-step process engineering is required. Elaborate rules engines for enterprise decision management are different than compliance. If an enterprise requires a rules engine for compliance and governance, then it will need a special purpose compliance, risk management, and governance system. Such solutions would be overkill for those enterprises that require email archiving for eDiscovery, document management for first order compliance, and cross references to transactional data in the data warehouse. While the future is uncertain, one of the vendors to watch is BMMsoft.


Innovation Happens

BMMsoft has put together a product delivering functionality across these three previously unrelated silos - data warehousing, eDiscovery (e-mail), and document management - that can be purchased as an EDMT® Server - a single part number from BMMsoft (EDMT stands for "E-Mail, Documents, Media, Transactions"). The database "under the hood" is Sybase IQ, a column-oriented data store with a proven track record and several large objectively audited benchmarks. The latest of these weighs in at 1000 terabytes - a petabyte - and was audited by Francois Raab, the same professional who audits the benchmarks.[4] The business need is real and based on customer acceptance. So is the product.


The three keys to connect and make intelligible the data from the three different sources are:


1) extreme scalability to handle the data volumes - this is where a column-oriented database would come in handy since the storage compaction is intrinsic and prior to the additional compression that could be applied;

2) parallel, real-time high performance ETL functionality to load all the data; and finally

3) search capabilities that enable high performance inquiries against the data.

Such unified access to diverse data types, intelligently connected by metadata, is also sometimes described as a "data mashup."


A part of the challenge that a start up - and up start - such as BMMsoft will face is building credibility, which BMMsoft has already solved with numerous client installations in production and success stories. In the case of BMMsoft EDMT® Server there is another consideration: metadata is an underestimated and underdeveloped opportunity. Innovations in metadata make possible many applications that require cross-referencing emails, documents, and transactional data. For example, fraud detection, threat identification, enhanced customer relations - all require navigating across the different data types. Metadata makes that possible. That is not an easy problem to solve; and BMMsoft has demonstrated significant progress with it. Second, the column-oriented database is intrinsically skinny in terms of data storage in comparison with the standard relational database, which continues to be challenged by database obesity. As data warehouses scale up, the cost of storage technology becomes a disproportionately large part of the price of the entire system. Note that for the column-oriented approach proportional cost savings come into view and are realized. Third, this also has significant performance implications, since if there is less data - in terms of volume points - to manage, then it is faster to do so. So when all the reasons are considered, the claims are quite modest, or at least in line with common sense. The wonder is that no one thought of it sooner.


When you think about it for a minute, there is every reason that an underlying database should be capable of storing a variety of different data types and doing so intelligently. The latter intelligence is the "secret sauce" that differentiates BMMsoft. The relationships between the different types of data are built as the data is being loaded by BMMsoft using multiple patented software technologies. The column-orientation of the underlying data store - Sybase IQ - intrinsically condenses the amount of space required to persist the information, yielding up to an order of magnitude - more typically a factor of two or three - in storage savings, even prior to the application of formal compression algorithms. This fights database obesity across all segments - email, document, media, transactional (structured) data warehousing information. This means that the application that lives off of the underlying data is able to take advantage of performance improvements, since less data is being stored and more is being fetched with every data retrieval. For those enterprises with a commitment to installed Oracle or MySQL infrastructure, BMMsoft provides investment protection. The EDMT® Server also runs on Oracle, Netezza and MySQL and can be easily ported to any other relational database.
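To make the storage argument concrete, here is a minimal Python sketch of why a column layout tends to compact well. It uses run-length encoding on a toy table; the table, the column names, and the choice of run-length encoding are all illustrative assumptions, and nothing here describes how Sybase IQ actually stores or encodes data.

```python
from itertools import groupby


def run_length_encode(column):
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


# Hypothetical rows: (trade_id, ticker, exchange)
rows = [
    (1, "IBM", "NYSE"),
    (2, "IBM", "NYSE"),
    (3, "IBM", "NYSE"),
    (4, "HPQ", "NYSE"),
    (5, "HPQ", "NYSE"),
    (6, "ORCL", "NASDAQ"),
]

# Pivot the row-oriented table so that each attribute is stored contiguously.
trade_ids, tickers, exchanges = zip(*rows)

# Repeated values now sit next to each other and collapse into short runs.
encoded_tickers = run_length_encode(tickers)
encoded_exchanges = run_length_encode(exchanges)

print(encoded_tickers)
print(encoded_exchanges)
```

Six stored values per column shrink to three runs for the ticker and two for the exchange. In a row layout the same repeated values are interleaved with other fields, so there are no long runs to collapse.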


Thus, BMMsoft is a triple threat and is able to function as a standalone product addressing data warehousing, email archiving, and document management requirements as separate silos. But just as importantly, for those enterprises that need or want to compete with advanced applications in fraud detection, security threat assessment, and customer data mining beyond structured data, BMMsoft offers the infrastructure and application to do so. For example, the ability to perform cross-analysis between securities traded on the stock market and those companies named in email and voice mail (remember multimedia handling) will immediately provide a short list for follow up detection of on-going insider trading or other fraudulent schemes. While hindsight is 20-20, a similar method of identifying emerging patterns through cross-analysis would have been useful in surfacing the 8 billion dollar Société Générale fraud, Madoff's nonexistent options plays at the basis of the pyramid, the Georgia Tech shooter, and relevant chatter that shows up prior to terrorist attacks. Going forward, this technology is distinct in that it can be deployed on a small, medium, or large scale to highlight emerging hot spots that require attention.
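The cross-analysis idea can be sketched in Python as a simple join between transactional records and metadata extracted from messages. All of the data, field layouts, and function names below are hypothetical and serve only to illustrate the kind of cross-referencing described; this is not BMMsoft's implementation.

```python
from collections import defaultdict

# Hypothetical transactional data: (trader, ticker, trade_date)
trades = [
    ("alice", "ACME", "2010-08-02"),
    ("bob", "GLOBX", "2010-08-03"),
    ("carol", "ACME", "2010-08-04"),
]

# Hypothetical message metadata captured at load time: (sender, tickers mentioned)
messages = [
    ("alice", {"ACME", "INITECH"}),
    ("bob", {"INITECH"}),
    ("dave", {"GLOBX"}),
]


def build_mention_index(messages):
    """Map each sender to the set of tickers named in that sender's messages."""
    index = defaultdict(set)
    for sender, tickers in messages:
        index[sender].update(tickers)
    return index


def short_list(trades, messages):
    """Return trades whose ticker was also named in the trader's own messages."""
    mentions = build_mention_index(messages)
    return [
        (trader, ticker, trade_date)
        for trader, ticker, trade_date in trades
        if ticker in mentions.get(trader, set())
    ]


flagged = short_list(trades, messages)
print(flagged)
```

The output is a short list for human follow-up, not a finding of wrongdoing. A real system would add time windows, entity resolution, and much richer metadata.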


One may object - but won't the competition be able to reverse engineer the functionality and provide something similar using different methods? Of course, eventually every innovation will be competitively attacked by some more-or-less effective "work around." Read the prospectus - new start ups and existing software laboratories at HP, IBM, etc. will eventually produce innovations that challenge the contender. However, that could require three to five years. Then there is the matter of bringing it to market. IBM provides an example, based on publicly available news reports. IBM went out and purchased FileNet for about $1.6 billion. FileNet is a great company, which virtually invented workflow, and if one requires advanced workflow capabilities, it is hands down a good choice. However, it does not do data warehousing or email archiving. Because FileNet, as a subsidiary of IBM, delivers substantial revenue to the "mother ship," the executives in charge will set a high bar on any IBM innovations which combine email archiving and structured data warehousing with document management. In short, IBM is faced with the classic innovator's dilemma.[5] The price points that interest it - both internally and externally - are further up on the curve than the deals that BMMsoft will be able to complete. Given that BMMsoft has established a presence in the market, it has a good chance of marching up market, displacing the installed, legacy solutions as it goes. This happened before in the client server revolution when IBM mainframe deals at the several million dollar price point were undercut by a copy of PowerBuilder and a copy of Sybase, albeit a different version of the database. Given that BMMsoft has a head-start, it is exploiting first mover advantages and building an installed base that will be challenged only with great difficulty. The relevance of such technology in the context of healthcare information technology (HIT) will be explored in a pending post.
Please stand by for an update - and keep in touch!

[2] For an alternative point of view see an IDC forecast (published 2007) that pegs the Data Warehouse management/platforms market as approx $8.97B in 2010

[3] Cited in "Document Management Systems Market: 2007 - 2010,":



[5] Clayton Christensen, The Innovator's Dilemma. Cambridge, MA: Harvard Business School Press, 1998.

]]> Data Warehousing Mon, 09 Aug 2010 08:31:11 -0700
Datawatch is One to Watch in Healthcare Information Technology (HIT)

Datawatch provides an ingenious solution to information management, integration, and synthesis by working from the outside inwards. Datawatch's Monarch technology reverse engineers the information in the text files that would otherwise be sent to be printed as a hardcopy, using the text file as input to drive further processing, aggregation, calculation, and transformation of data into usable information. The text files, PDFs, spreadsheets, and related printer input become new data sources. With no rekeying of data and no programming, business analysts have a new data source to build bridges between silos of data in previously disparate systems and attain new levels of data integration and cohesion.
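The general idea of report mining can be illustrated with a short Python sketch that turns a print-style text report back into structured records. The report layout, field names, and regular expressions are invented for illustration; this is not Monarch's mechanism or interface, which requires no programming.

```python
import re
from collections import defaultdict

# Hypothetical print-image report, as it might be spooled to a printer.
REPORT = """\
SURGERY SCHEDULE REPORT                 PAGE 1
LOCATION: OR-1
SURGEON        PROCEDURE        MINUTES
Smith          Appendectomy          60
Jones          Knee Replacement     120
LOCATION: OR-2
SURGEON        PROCEDURE        MINUTES
Smith          Hernia Repair         45
"""

# A group-header line carries context that applies to the detail lines below it.
LOCATION_LINE = re.compile(r"^LOCATION:\s+(\S+)")
# A detail line: surgeon, procedure, and a trailing integer, separated by 2+ spaces.
DETAIL_LINE = re.compile(r"^(\S+)\s{2,}(.+?)\s{2,}(\d+)\s*$")


def mine_report(text):
    """Extract one record per detail line, tagged with its enclosing location."""
    records = []
    location = None
    for line in text.splitlines():
        header = LOCATION_LINE.match(line)
        if header:
            location = header.group(1)
            continue
        detail = DETAIL_LINE.match(line)
        if detail:
            surgeon, procedure, minutes = detail.groups()
            records.append({
                "location": location,
                "surgeon": surgeon,
                "procedure": procedure,
                "minutes": int(minutes),
            })
    return records


records = mine_report(REPORT)

# Once the data is structured, aggregation is straightforward.
minutes_by_location = defaultdict(int)
for record in records:
    minutes_by_location[record["location"]] += record["minutes"]

print(dict(minutes_by_location))
```

Page titles and column headings fall through both patterns and are ignored, which is the essence of treating the report as a data source rather than as a printout.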



For those enterprises running an ERP system for back office billing such as SAP or a hospital information system (HIS) such as Meditech, the task of getting the data out of the system using proprietary SAP coding or native MUMPS data store can be a high bar, requiring custom coding. Datawatch intelligently zooms through the existing externalization of the data in the reports, making short work of opening up otherwise proprietary systems.


Note that a trade-off is implied here. If your reporting is a strong point, Datawatch can take an installation to the next level, enabling coordination and collaboration, breaking down barriers between reporting silos that were previously impossible to bridge and doing so with velocity. Programming is not needed, and the level of difficulty, pitched at a smart business analyst, is comparable to that of managing an Excel spreadsheet. However, if the reports are inaccurate or even junk, even Datawatch cannot spin the straw into gold. You will still have to fix the data at its source.


Naturally, cross functional report mining works well in most verticals extending from finance to retail, from manufacturing to media, from the public sector to not for profit organizations. However, what makes healthcare a particularly inviting target is the relatively late and still on-going adoption of data warehousing combined with the immediate need to report on numerous clinical, quality and financial metrics such as the pending "Meaningful Use" metrics created via the HITECH Act. This is not a tutorial on meaningful use; however, further details can be found in a related article entitled "Game on! Healthcare IT Proposed Criteria on 'Meaningful Use' Weigh in at 556 Pages." One of the goals of "meaningful use" in HIT is to combine clinical information with financial data in order to drive improvements in quality care, patient safety and operational efficiency while simultaneously optimizing cost control and reduction. The use of report mining and integration of disparate sources also allows the healthcare industry to migrate towards a pay-for-performance (P4P) model, whereby providers will be reimbursed based on the quality and efficiency of care provided. However, financial, quality, clinical metrics and the evolving P4P models all require cross functional reporting from multiple systems. Even for many modern hospital information systems (HIS) that is a high bar. For those enterprises without an enterprise-wide data warehousing solution, no one is proposing to wait three to five years for a multi-step installation, only to learn that the needed data still requires customization. In the interim, Datawatch has a feasible approach worth investigating.


In conversations with Datawatch executives John Kitchen (SVP Marketing) and Tom Callahan (Healthcare Product Manager), I learned that Datawatch has more than 1,000 organizations in the healthcare sector using Datawatch technology. Datawatch is surely a well kept secret, at least up until now. This is a substantial resource for best practices, methods and models, and lessons learned in the healthcare area. Datawatch can leverage these resources to its advantage and the benefit of its clients. While this is not a recommendation to buy or sell any security (or product), as a publicly traded firm, Datawatch is well positioned to benefit as the healthcare market continues its expansion. Datawatch provides a compelling business case with favorable ROI from the time of installation to the delivery of problem-solving value for the end user client. The level of IT support required by Datawatch is minimal, and sophisticated client departments have sometimes gone directly to Datawatch to get the job done.


Let's end with a client success story in HIT. Michele Clark, Hospital Revenue Business Analyst, Los Angeles-based Good Samaritan Hospital, comments on the application of Datawatch's Monarch Pro: "We simply run certain reports from MEDITECH's scheduling module, containing data for surgeries already scheduled, by location, by surgeon. We then bring those reports into Monarch Pro. Then, in conjunction with its powerful calculated fields, Monarch allows us to report on room utilization, block time usage and estimated times for various surgical procedures. The flexibility of Monarch to integrate data from other sources results in a customized, consolidated dataset in Monarch. We can then analyze, filter and summarize the data in a variety of ways to maximize the efficiency of our operating room resources. Thanks to Monarch, we have dramatically improved the utilization of our operating rooms, can more easily match available surgeons with required upcoming procedures, and better manage surgeon time and resources. Our patients are receiving the outstanding standard of care they expect, while we make the most of our surgical resources. This kind of resource efficiency is talked about a lot in the healthcare community. With Monarch, we are achieving it." This makes Datawatch one to watch.

]]> Healthcare Information Technology (HIT) Thu, 01 Jul 2010 09:59:17 -0700
Datameer Launches - an Option for Big Data - Really Big Data

Datameer takes its name from the sea - the sea of data - as in the French la mer or German, das Meer.


I caught up with Ajay Anand, CEO, and Stefan Groschupf, CTO. Ajay earned his stripes as Director of Cloud Computing and Hadoop at Yahoo. Stefan is a long-time open source consultant and advocate, and a cloud computing architect from EMI Music.


Datameer is aligning with the two trends of Big Data and Open Source. You do not need an industry analyst to tell you that data volumes continue to grow, with unstructured data growing at a rate of almost 62% CAGR and structured less, but a still substantial 22% (according to IDC). Meanwhile, open source has never looked better as a cost effective enabler of infrastructure.


The product beta launched in April with McAfee, nurago, a leading financial services company, and a major telecommunications service provider. The summer promises to bring early adopters, with the gold product shipping in the autumn. (Schedule is subject to change without notice.)


The value proposition of Datameer Analytics Solution (DAS) is helping users perform advanced analytics and data mining with the same level of expertise required for a reasonably competent user of an Excel spreadsheet.


As is often the case, the back story is the story. The underlying technology is Hadoop. Hadoop is an open source framework for storing and processing highly distributed data. It includes both storage technology and execution capabilities, making it a kind of distributed operating system, providing a high level of virtualization. Unlike a relational database, where search requires chasing up and down a B-tree index, Hadoop performs some of the work upfront, sorting the data and performing streaming data manipulation. This is definitely not efficient for small gigabyte volumes of data. But when the data gets big - really big - like multiple terabytes and petabytes, then the search and data manipulation functions enjoy an order of magnitude performance improvement. The search and manipulation are enabled by the MapReduce algorithm. MapReduce has been made famous by the Google implementation as well as the Aster Data implementation of it. Of course, Hadoop is open source. MapReduce takes a user defined mapping function and a user defined reduce function and exchanges key-value pairs between them, executing a process of grouping, reducing, and aggregation at a low level that you do not want to have to code yourself. Hence, the need for and value in a tool such as DAS. It generates the low-level code required to answer the business and data mining questions that the business wants to ask of the data. In this regard, DAS functions rather like a Cognos or BusinessObjects front-end in that it presents a simple interface in comparison to all the work being done "under the hood". Clients who have to deal with a sea of data now have another option for boiling the ocean without getting steamed up over it.
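For readers who want to see the shape of the map, group, and reduce steps, here is a single-process Python sketch using the classic word count. It imitates the programming model only; it is not Hadoop's API, it does no distribution across machines, and the function names are illustrative assumptions.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the user-defined mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, which is the step the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the user-defined reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """User-defined map function: emit (word, 1) for each word in a line."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(word, counts):
    """User-defined reduce function: total the counts for one word."""
    return sum(counts)


lines = ["big data", "really big data", "a sea of data"]
word_counts = reduce_phase(shuffle(map_phase(lines, word_mapper)), count_reducer)
print(word_counts)
```

Only `word_mapper` and `count_reducer` express the business question. Everything else is plumbing, which is the part a distributed framework, and a tool layered on top of it, takes off the user's hands.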

]]> Agile Methods Thu, 15 Apr 2010 09:21:54 -0700
Greenplum Launches Major Release, Gets MAD

I caught up with Ben Werther, Director of Product Marketing, for a conversation about business developments at Greenplum and Greenplum's major new release.


According to Ben, Greenplum has now surpassed 100 enterprise customers and is enjoying revenue growth of about 100%, albeit from a revenue base that befits a company of relatively modest size. They also claim to be adding new enterprise customers faster than either Teradata or Netezza.


What is particularly interesting to me is that with its MAD methodology Greenplum is building an agile approach to development that directly addresses the high performance of its massively parallel processing capabilities. This is an emerging trend in high end parallel databases that is receiving new impetus. More on this shortly. Meanwhile, release 4.0 includes enterprise class DBMS functionality such as -

-        Complex query optimization

-        Data loading

-        Workload Management

-        Fault-Tolerance

-        Embedded languages/analytics

-        3rd Party ISV certification

-        Administration and Monitoring

From the perspective of real world data center operations, the workload management features are often neglected but are critical path for successful operations and growth. Dynamic query balancing is a method used on mainframes for the most demanding workloads, and Greenplum has innovated in this area, with its solution now being "patent pending".


Just in case scheduling does not turn you on, a more sexy initiative is to be found in fault tolerance. Given that Greenplum is an elephant hunter, favoring large and high end installations, this is news you can use. Greenplum Database 4.0 enhances fault tolerance using a self-healing physical block replication architecture. Key benefits of this architecture are:

-        Automatic failure detection and failover to mirror segments

-        Fast differential recovery and catchup (while fully online / read-write)

-        Improved write performance and reduced network load

Greenplum has also made it easier to update single rows against on-going queries. While data warehouses are mostly inquiry-intensive, it has been a well known secret that update activity is common in many data warehousing scenarios, driven by business changes to dimensions and hierarchies.


At the same time, Greenplum is announcing a new product - Chorus - aimed at the enterprise data cloud market. Public cloud computing has the buzz. What is less well appreciated is that much of the growth is in enterprise cloud computing - clouds of networked data stores with (relatively) user friendly frontends within the (virtual) four walls of a global enterprise such as a telecommunications company, bank, or related firm.



Figure: Enterprise Data Cloud

This shows the Enterprise Data Cloud schematically with the Greenplum database on top of the virtualized commodity hardware, operating system, public Internet tunnel, and Chorus abstraction layer. Chorus aims at being the source of all the raw data (often 10X the size of the EDW); providing a self-service infrastructure to support multiple marts and sandboxes; and, finally, furnishing rapid analytic iteration and a business led solution. Chorus enables security, providing extensive, granular access control over who is authorized to view and subscribe to data within Chorus, and collaboration, facilitating the publishing, discovery, and sharing of data and insight using a social computing model that appears familiar and easy-to-use. Chorus takes a data-centric approach, focusing on the necessary tooling to manage the flow and provenance of data sets as they are created and shared within a company.

One more thing. Even given the blazingly fast performance of massively parallel processing data warehousing, heterogeneous data requires management. It is becoming an increasingly critical skill to surround one's data and make it accessible with a useable, flexible method of data management. Without a logical, rational method of organizing data, the result is just more proliferating, disconnected  islands of information. Greenplum's solution to this challenge? Get MAD!

Of course, this is a pun, standing for a platform capable of supporting the magnetic, agile, and deep principles of MAD Skills. "Magnetic" does not refer to disk, though there is plenty of that. This conforms to data warehousing orthodoxy in one respect only - it agrees to get all the data into one repository; but it does not subscribe to the view that it must all be conformed or rendered consistent. This is where the "agile" comes in - deploying a flexible, stage-by-stage process and in parallel. A laboratory approach to data analysis is encouraged with cleansing and structuring being staged within the same repository. Analysts are given their own "sandbox" in which to explore and test out hypotheses about buying behavior, trends, and so on. Successful solutions are generalized as best practices. In effect, given the advances in technology, the operational data store is a kludge that is no longer required. Regarding the "deep," advanced statistical methods are driven close to the data. For example, one Greenplum customer had to calculate an ordinary least squares regression (OLS is a method of fitting a curve to data) by exporting the data into the statistical language R for calculation and then importing it back, a process that required several hours. This regression was moved into the database thanks to the capability of Greenplum and ran significantly faster due to much less data movement. In another example involving highly distributed data assembled by Chorus, T-Mobile assembled data from a number of large untapped sources (cell phone towers, etc), as well as data in the EDW and other source systems, to build a new analytic sandbox; ran a number of analyses including generating a social graph from call detail records and subscriber data; and discovered behavior where T-Mobile subscribers were seven times more likely to churn if someone in their immediate network left for another service provider. This work would ordinarily require months of effort just to provision databases and discover and assemble the data sources, but was completed within two weeks while deploying a one petabyte production instance of Greenplum Database and Greenplum Chorus. As the performance bar goes up, methodologies and architectures (such as Chorus) are required to sprint ahead in order to keep up. As already noted and in summary, with its MAD methodology, Greenplum is building an agile approach to development that promises to keep up with the high performance bar of its massively parallel processing capabilities. An easy prediction to make is that the competitors already know about it and are already doing it. Really!?
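To see why the OLS example is a natural candidate for running close to the data, consider a minimal Python sketch of a one-variable fit. It is an illustration of the calculation only, with made-up sample data and an assumed function name; it is not Greenplum's in-database implementation.

```python
def ols_fit(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = ols_fit(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The fit reduces to a handful of sums over the rows. Sums are exactly what a parallel database computes well where the data already lives, so only a few numbers need to leave the system instead of the whole data set.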


]]> Business Intelligence Wed, 14 Apr 2010 07:46:21 -0700
Provider Performance Management Comes to Healthcare Information Technology

There are so many challenges that it is hard to know where to begin. For those providers (hospitals and large physician practices) that have already attained a basic degree of automation there is an obvious next step - performance improvement. For example, if an enterprise is operating eClinicalWorks (ECW) or a similar run-your-provider EHR system, then it makes sense to take the next step and get one's hands on the actual levers and dials that drive revenues and costs.

Hospitals (and physician practices) often do not understand their actual costs, so they struggle to control and reduce the costs of providing care. They are unable to say with assurance what services are the most profitable, so they are unable to concentrate on increasing market share in those services. Often, when the billing system drives provider performance management, the data, which is adequate for collecting payments, is totally unsatisfactory for improving the cost-effective delivery of clinical services. If the billing system codes the admitting doctor as responsible for the revenue, and it is the attending physician or some other doctor who performs the surgery, then accurately tracking costs will be a hopeless data mess. The amount of revenue collected by the hospital may indeed be accurate overall; but the medical, clinical side of the house will have no idea how to manage the process or improve the actual delivery of medical procedures.


Into this dynamic enters River Logic's Integrated Delivery System (IDS) Planner. The really innovative thing about the offering is that it models the causal relationship between activities, resources, costs, revenues, and profits in the healthcare context. It takes what-if analyses to new levels, using its custom algorithms in the theory of constraints, delivering forecasts and analyses that show how to improve performance (i.e., revenue, as well as other key outcomes such as quality) based on the trade-offs between relevant system constraints. For example, at one hospital, the operating room was showing up as a constraint, limiting procedures and related revenues; however, careful examination of the data showed that the operating room was not being utilized between 1 PM and 3 PM. The way to bust through this constraint was to charge less for the facility, thereby incenting physicians to use it at what was for them not an optimal time in comparison with golf or late lunches or siesta time. Of course, this is just an over-simplified tip of the iceberg.


IDS Planner enables physician-centric coordination, where costs, resources, and activities are tracked and assessed in terms of the workflow of the entire, integrated system. This creates a context of physician decision-making and its relationship to costs and revenues. Doctors appreciate the requirement to control costs, consistent with sustaining and improving quality, and they are eager to do so when shown the facts. When properly configured and implemented, IDS Planner delivers the facts. According to River Logic, this enabled the Institute for Musculoskeletal Health and Wellness at the Greenville Hospital System to improve profit by more than $10M a year by identifying operational discrepancies, increase physician-generated revenue by over $1,700 a month, and reduce accounts receivable from 62 down to 44 days (and still falling), which represents the top 1% of the industry. Full disclosure: this success was made possible through a template approach with some upfront services that integrated the software with the upstream EHR system, solved rampant data quality issues, and obtained physician "buy in" by showing this constituency that the effort was win-win.

The underlying technology for IDS Planner is based on the Microsoft SQL Server (2008) database and SharePoint for web-enabled information delivery.

In my opinion, there is no tool on the market today that does exactly what IDS Planner does in the area of optimizing provider performance. River Logic's IDS Planner has marched ahead of the competition, including successfully getting the word out about its capabilities. The obvious question is for how long? The evidence is that this is a growth area based on the real and urgent needs of hospitals and large provider practices. There is no market unless there is competition; and an overview of the market indicates offerings such as Mediware's InSight, Dimensional Insight with a suite of the same name, and Vantage Point HIS, once again with a product of the same name. It is easy to predict that sleeping giants such as Cognos (IBM), Business Objects (SAP), and Hyperion (Oracle) are about to reposition the existing performance management capabilities of these products in the direction of healthcare providers. Microsoft is participating, though mostly from a data integration perspective (but that is another story), with its Amalga Life Science offering with a ProClarity frontend. It is a buyer talking point whether and how these offerings are able to furnish useable software algorithms that implement a robust approach to identifying and busting through performance constraints. In every case, all the usual disclaimers apply. Software is a proven method of improving productivity, but only if properly deployed and integrated into the enterprise so that professionals can work smarter. Finally, given market dynamics in this anemic economic recovery, for those end-user enterprises with budget, it is a buyer's market. Drive a hard bargain. Many sellers are hungry for it and are willing to go the extra mile in terms of extra training, services, or payment terms.

]]> comparative effectiveness research Mon, 05 Apr 2010 11:33:34 -0700
Reason # 4 Against Certification of EHR technology: Validates -1 Generation Architecture, Stifles Innovation

Just as it is often true that some generals prepare to fight the last war, so too certification is getting ready to validate the last generation's client-server technology. The famous example is the construction by France after World War I of the Maginot Line, a series of trench-like fortifications built in anticipation of trench warfare. Equally famous is (German) General Heinz Guderian's World War II blitzkrieg that went around (and over) the wall using tanks and aircraft, rendering the Maginot Line obsolete. Innovations in cloud computing, the provisioning of web-based EHRs (EMRs), and database appliances are the equivalent of an emerging technology blitzkrieg in HIT. While we should not represent different players in the market as enemies, large and established HIT vendors (see separately published research here for an overview) are likely to confront a looming innovator's dilemma.[1] In the 1980s IBM would not look at any sale under $1 million and was nearly driven out of business by upstarts building systems with a copy of SQL Server and a copy of PowerBuilder priced at one tenth the cost. The dominant HIS and PPMS vendors are positioning themselves to be the next candidates for the unwelcome category of "too big to fail," though this time in HIT, not banking. The recommendation? Stop them before they hurt themselves. Stop them by deleting the word 'certified'.


The prospective buyers of EHR technology are physician practices (eligible professionals (EP)) and hospitals. They are best protected from the unprofessional practices of a small - very small - minority of system sales people through their own professional procurement practices: diligently reviewing written proposals, specifications, and contracts. The fact that a software system or its modules were certified is unlikely to be the basis of a legal claim against a vendor whose software, installed at a given site, failed to satisfy the meaningful use criteria for the provider. The provider receives the reimbursement (if any is forthcoming) and is responsible for attaining and demonstrating a use of the system that delivers healthcare services in a more effective way according to the criteria of meaningful use. From such a perspective, certification may be of value to the vendor, but it furnishes a false sense of security to the prospective buyers (such as hospitals and eligible physicians).


On the one hand, some of the meaningful use criteria, including those that capture basic clinical transactions, are best delivered by an underlying system consisting of a standard relational database, a frontend query and reporting tool, and a data model that represents the patient demographics, diagnoses and procedures, and related clinical nomenclature. The user interface at which the meaningful use of the technology occurs, including the capturing of data electronically, the reporting of data electronically, and the manipulation of quality measures, is the contract at which meaningful use is delivered. Multimillion dollar HIS systems are candidates for "overkill" from a transactional perspective in addressing such use, yet at the same time they lack the necessary infrastructure to support quality metrics through aggregation and inquiry via a data warehouse function.


On the other hand, some of the meaningful use criteria - in particular the requirement to report about 100 quality measures - are poorly accommodated by the vast majority of HIS systems on the market today. I do not know of a single exception where the HIS system provides for the aggregation and analysis of metrics in a way similar to what business intelligence (now 'clinical intelligence') does in business verticals such as retail, finance, and telecommunications. Once again, the user interface at which the quality measures are surfaced is (in effect) the contract, but it will prove impractical to generate such a result. If these systems propose to perform the aggregations of a year's worth of data based on transactional detail, the predictable result will be a long, slow process. Reinventing the wheel is hard work, but that is the path on which existing HIS and PPMS (EHR) systems have embarked. I hasten to add that the quality metrics are critical path and required. However, the path to the efficient and effective production of them does not lie through certification.


Those providers that are currently reporting quality metrics from a data warehouse that gets its data from the transactional EHR are concerned that they are out of compliance. It is widely reported that Intermountain Healthcare is one such enterprise.[2] Are they now supposed to certify their data warehouse? Where is the value-added in that?


A long list of clinical quality measures requires reporting a percentage of a total aggregate for a given diagnostic condition; for example, the percentage of patients aged 50 years and older who received an influenza immunization during the flu season (PQRI 110 / NQF 0041). These are what other business operations such as retail or finance call "business intelligence" questions. In the context of healthcare, we might call them "clinical intelligence" - or just plain quality measures - but the function is similar: to make possible the tracking of improvements in the delivery of care and to manage enhancements by measuring outcomes in aggregate. The periodic calculation of some 94 percentages requires scanning the entire database and performing an analysis, review, and aggregation of the totality of the diagnostic data. Even though most records will not satisfy a given diagnosis - whether coronary artery disease or diabetes mellitus - they will have to be examined. The lessons of some two decades of computing in finance, telecommunications, retail, manufacturing, and fast food are clear, but not always obvious to those medical professionals who have spent their time in healthcare IT. When you attempt to perform business intelligence (BI) against the operational, transactional system, then something has got to give. Either the performance of the transactional system is brought to its knees or the BI process has to wait. In fact, systems designed for high-performance transaction processing are poorly adapted to generate the results required to perform business intelligence. The difference is between update-intensive operations and query-intensive ones, between updating and scanning. That is why the data warehouse was invented - in order to collect and aggregate the data required for reporting in optimal format, allowing the transactional system to do its job supporting clinical process while clinical intelligence is used to guide process improvements.
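To make the shape of such a "clinical intelligence" question concrete, here is a minimal sketch of the influenza measure computed with Python's built-in sqlite3 module. The table names, column names, sample rows, and the flu season window are all assumptions made up for illustration; they are not taken from the PQRI or NQF specifications, and a real measure definition carries many more inclusion and exclusion rules. Note that the query has to visit every patient row to decide eligibility, which is exactly the scanning workload described above.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, birth_date TEXT);
        CREATE TABLE immunization (patient_id INTEGER, vaccine TEXT, given_on TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO patient VALUES (?, ?)",
        [(1, "1940-05-01"), (2, "1955-01-15"), (3, "1958-11-30"),
         (4, "1959-06-01"), (5, "1985-07-04")],
    )
    conn.executemany(
        "INSERT INTO immunization VALUES (?, ?, ?)",
        [(1, "influenza", "2009-10-12"),
         (2, "influenza", "2008-10-10"),   # previous season, does not count
         (3, "influenza", "2009-11-02"),
         (4, "tetanus", "2009-10-20"),     # wrong vaccine, does not count
         (5, "influenza", "2009-10-05")],  # patient is under 50, not eligible
    )
    return conn


def flu_immunization_rate(conn, season_start, season_end, as_of):
    """Percentage of patients aged 50+ on `as_of` immunized within the season window.

    Dates are ISO strings (YYYY-MM-DD). Returns None when no patient is eligible.
    """
    eligible, immunized = conn.execute(
        """
        SELECT COUNT(*),
               SUM(CASE WHEN EXISTS (
                       SELECT 1 FROM immunization i
                       WHERE i.patient_id = p.patient_id
                         AND i.vaccine = 'influenza'
                         AND i.given_on BETWEEN ? AND ?
                   ) THEN 1 ELSE 0 END)
        FROM patient p
        WHERE p.birth_date <= date(?, '-50 years')
        """,
        (season_start, season_end, as_of),
    ).fetchone()
    if not eligible:
        return None
    return 100.0 * immunized / eligible


if __name__ == "__main__":
    rate = flu_immunization_rate(build_demo_db(), "2009-09-01", "2010-03-31", "2010-03-31")
    print(f"Influenza immunization rate, age 50+: {rate:.1f}%")
```

With five sample rows this runs instantly. Run the same scan across some 94 measures and a year of transactional detail on the live clinical system, and the contention between updating and scanning becomes the problem the data warehouse was invented to solve.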


The counter-argument that certification of the functionality around clinical data warehousing ought to be rolled up into the certification process is the reduction to absurdity of the process itself. There will always be some such functionality - if not data warehousing, then cloud computing, database appliances, artificial intelligence in clinical decision support, and so on. The list is limited only by our imagination and that of our fellow innovators. The Meaningful Use Proposal has got it basically right (though one can (and should) argue about details). The reporting of quality measures, based on aggregated clinical detail, is a proven method in other disciplines of driving improvements in outcomes by surfacing, managing, and reducing variability from the quality norm ('If you can't measure it, you can't manage it'). However, this value is provided by implementing systems that directly address the superset of meaningful use criteria captured in the proposal, not by driving the process upstream into one particular system architecture that happens to have emerged in the early 1990s and gotten some significant traction.

To see the full Comment submitted to on the rule-making for reimbursable EHR investments go here: Certification _Lou Agosta's Feedback to Centers for Medicare and Medicaid Services, HHS.pdf

[1] Clayton Christensen, The Innovator's Dilemma. Boston: Harvard Business School Press, 1999. Professor Christensen tells the story of how successful companies - across incredibly diverse industries - are single-mindedly motivated to listen only to their biggest and best customers (in general, not a bad thing to do), and are, thereby, blind to innovations that get a foothold at a price point down market and subsequently disrupt the standard market dynamics, leading to the demise of the 'big guys.' Note that IBM, mentioned in the next sentence, is one of the few companies to have succeeded in turning itself around, admittedly in a painful and costly process led by the now legendary Lou Gerstner, and to have brought itself back from the brink. Watch for this dynamic to continue in HIT, albeit at a slower pace due to the friction caused by large government participation and with results such as that at IBM being the rare exception.

[2] For example -

]]> Certification Thu, 18 Feb 2010 07:07:04 -0700
Reason # 3 Against Certification of EHR technology: The Buck Stops with the Provider

Healthcare providers are in the business of fighting disease, not installing and maintaining software. Still, software is now a part of every business process; and it is on the horizon to become a part of every process of diagnosing, managing, and treating diseases that occurs in a modern first world setting. The healthcare provider (whether eligible physician (EP) or hospital) has to demonstrate meaningful use. That burden cannot be shifted onto a certifying organization. Nor would such a certifying entity be willing to warrant that its certified EHR system is used in a meaningful way by the provider merely based on its test scripts. Whose job is it to demonstrate that meaningful use is occurring? It is the provider's job.


The proverbial buck stops with the EP and hospital. The ONC Certification document cautions: 'Eligible professionals and eligible hospitals that elect to adopt and implement certified EHR Modules should take care to ensure that the certified EHR Modules they select are interoperable and can properly perform in their expected operational environment.' All the usual disclaimers apply - the certified product is not warranted for any particular purpose and your mileage may vary. In other words, the adoption of a 'certified' EHR (EMR) does not create a safe harbor for a provider who purchases, installs, and uses one. Nor is the recommendation of this post that such a safe harbor be implemented, given the requirement for provider accountability and the need to improve quality and efficiency. The provider is still responsible for outcomes, demonstrating quality metrics, and showing that the provider makes meaningful use of whatever technology is acquired. Presumably the 'certification' is supposed to give the provider guidance (in the market) and 'a leg up' on getting over the bar to reimbursement and quality improvement. However, the result is to privilege a certain class of large, existing, dominant players in the market and to complicate the EHR technology acquisition process.


In general, the criteria of meaningful use represent a contract - an agreement - to which the underlying system (whether certified or not) has to conform in terms of producing usable, interoperable output. The provider will work backwards from the interface to assure that the underlying functionality conforms to its attestation - the provider's sworn testimony ('attestation') to the reimbursement office - that the system functions as required. By applying the principles of object-oriented system design and implementation to the challenge represented by meaningful use, the diligent and conscientious provider eliminates the extra step of certification.
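The object-oriented idea of an interface as a contract can be sketched in a few lines of Python. Everything named here is hypothetical and invented for illustration: MeaningfulUseContract, InHouseEHR, can_attest, and the use of a diagnosis code as a stand-in for a measure identifier are not part of any rule-making or vendor API. The point is only that the check works backwards from the interface and is indifferent to what sits behind it, certified or not.

```python
from typing import Iterable, Protocol, runtime_checkable


@runtime_checkable
class MeaningfulUseContract(Protocol):
    """Hypothetical contract: what any EHR must expose to the provider."""

    def record_encounter(self, patient_id: str, diagnosis_code: str) -> None: ...

    def quality_measure(self, measure_id: str) -> float: ...


class InHouseEHR:
    """One possible implementation; how it works inside is irrelevant to the contract."""

    def __init__(self) -> None:
        self._encounters = []  # list of (patient_id, diagnosis_code) pairs

    def record_encounter(self, patient_id: str, diagnosis_code: str) -> None:
        self._encounters.append((patient_id, diagnosis_code))

    def quality_measure(self, measure_id: str) -> float:
        # Placeholder measure: percentage of distinct patients having an
        # encounter whose diagnosis code equals measure_id.
        patients = {pid for pid, _ in self._encounters}
        if not patients:
            return 0.0
        matching = {pid for pid, code in self._encounters if code == measure_id}
        return 100.0 * len(matching) / len(patients)


def can_attest(system: object, required_measures: Iterable[str]) -> bool:
    """Check the system from the interface inward: right methods, sane percentages."""
    if not isinstance(system, MeaningfulUseContract):
        return False
    return all(0.0 <= system.quality_measure(m) <= 100.0 for m in required_measures)


if __name__ == "__main__":
    ehr = InHouseEHR()
    ehr.record_encounter("p1", "250.00")
    ehr.record_encounter("p2", "401.9")
    print(can_attest(ehr, ["250.00", "401.9"]))
```

The runtime check only confirms that the required methods exist; the provider still has to verify, by running real cases, that the numbers coming out are correct. That verification is the provider's attestation, and no third-party stamp substitutes for it.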


The situation is analogous to that encountered in the so-called ERP (enterprise resource planning) revolution in computing. In the run up to Y2K (the year 2000 event), ERP vendors such as SAP, Oracle, JD Edwards, PeopleSoft, and many others insisted their architectures were robust and flexible enough to support both transaction processing and BI from the same system architecture and data stores. Fast forward a few years, and by 2004 those of these companies that still survived were announcing and delivering revised architectures to support business intelligence. We are on track for an entirely analogous round of architectural trial and error. Companies such as Cerner, Epic, Eclipsys, McKesson, GE Centricity, MedSphere, Meditech, and so on predictably claim that their transactional engines based on MUMPS and Caché (those are data stores) can handle the job. But the job is complex and is inherently at odds with itself. It is a tad like building a car that both gets good gas mileage and has a lot of horsepower. In the software world, the solution is to feed the transactional data into a data warehouse or other business intelligence subsystem and use the warehouse to optimize the transactional one. (The automotive problem has still not been solved, nor will it be in this post.)
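As a rough illustration of what "feeding the transactional data into a data warehouse" means, here is a minimal sketch in Python. The record layout (a dict with seen_on and diagnosis keys) and the function names are assumptions for the example only; a real warehouse load involves staging, cleansing, and dimension tables that are omitted here. The idea is that the expensive pass over transactional detail happens once, in a batch, and later queries hit the small pre-aggregated result instead of the live system.

```python
from collections import Counter
from datetime import date


def load_warehouse(encounters):
    """Batch step: roll transactional encounter rows up into monthly counts.

    `encounters` is an iterable of dicts with hypothetical keys
    'seen_on' (datetime.date) and 'diagnosis' (str).
    Returns a Counter keyed by (year, month, diagnosis).
    """
    facts = Counter()
    for enc in encounters:
        seen_on = enc["seen_on"]
        facts[(seen_on.year, seen_on.month, enc["diagnosis"])] += 1
    return facts


def monthly_count(facts, year, month, diagnosis):
    """Query step: a dictionary lookup, with no scan of transactional detail."""
    return facts[(year, month, diagnosis)]


if __name__ == "__main__":
    transactional_rows = [
        {"seen_on": date(2010, 1, 5), "diagnosis": "diabetes mellitus"},
        {"seen_on": date(2010, 1, 19), "diagnosis": "diabetes mellitus"},
        {"seen_on": date(2010, 1, 20), "diagnosis": "coronary artery disease"},
        {"seen_on": date(2010, 2, 2), "diagnosis": "diabetes mellitus"},
    ]
    warehouse = load_warehouse(transactional_rows)
    print(monthly_count(warehouse, 2010, 1, "diabetes mellitus"))
```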


However, what about the innocent one-to-five-physician practices that deliver much of the healthcare in the country? Who is going to prevent these practices, smart but totally naïve from an information technology perspective, from buying a 'pig in a poke'?


First, the need for an assistance program to support such practices with the IT process is well taken. The Regional Extension Centers, where such practices can go to get a leg up on the process, are a step in the right direction - when they actually come into existence. There is also merit in something like a 'Good Housekeeping Seal of Approval' that interoperability standards are actually implemented by the software in question. The availability of 'Geek Squad Class' professional services at an entry level price point using graduates of community college-based HIT programs also belongs on the list - see more details here. However, none of these mandates certification of the EHR software, the latter adding an unprecedented layer of overhead to an already complex process.


If a small physician practice is savvy enough to know its requirements and buy out of the box or configure a custom solution, it makes no sense to require such a practice to limit its choices to those approved by a certification authority, on penalty of being excluded from the reimbursement process. For those that are the opposite of savvy - and require additional support - the certification process will be as misleading as it is incomplete. The practices are still required to demonstrate meaningful use whether or not the software is certified to provide certain functionality that allegedly lines up with these criteria of meaningful use.


There is some controversy about whether small practices will be diligent in interviewing several competing vendors and undertaking at least a mini-RFP process prior to paying some $44K for software. Given that such physicians are alone responsible for demonstrating meaningful use, regardless of the certification status, the recommendation is to err on the side of caution and be a diligent buyer. Such a buyer will require the vendor to explain in detail, step-by-step, how its software will address the requirements for reimbursement. That is a de facto, virtual RFP. The push back from small physician practices is looming. How can such small players be expected to acquire a system on their own? Better to partner with a regional hospital that operates an EHR? Of course, the rules and potential loss of independence for doing so are [sometimes] substantial. So here is the rock and here is the hard place. Is there value in buying from a list of 'certified' EHR providers? Yes, but ... The 'but ...' is that such a list will not in itself demonstrate 'meaningful use'. The physician and her or his practice are alone responsible for that.

Update: To see the full Comment submitted to on the rule-making for reimbursable EHR investments go here: Certification _Lou Agosta's Feedback to Centers for Medicare and Medicaid Services, HHS.pdf   ]]> Certification Wed, 17 Feb 2010 07:58:12 -0700
Reason # 2 Against Certification of EHR Technology: Anti-Competitive Technology Lock In

The certification process is anti-competitive in that it advantages existing HIS and PPMS systems architectures, configurations, and whatever standards are thereby implemented. This is not bad in itself; however, it does not promote whatever innovations are occurring in a given market - for example, those that are occurring in cloud computing and database appliances. Of course, this implies the dynamics of 'technology lock in' as applied to HIT. 'Technology lock in' refers to the situation in which there is a cost associated with switching from one technology to another. There is a common approach to market innovation in software: discount the software deeply at the start of the sales and implementation cycle and recoup some of the discounts through subsequent price increases that are proportional to switching costs plus the quality cost advantage.[1] This is a fair approach and results in total revenue to the developer and seller of the software that covers the development costs plus a profit roughly proportional to its marketing success based on the perceived and delivered value. One of the consequences - entirely independent of any certification process - is a standards war - where 'war' means 'competition' between alternative technology standards. Let's be clear on this one point. Technology lock in is a fact of life in technology and is not unfair when it is the result of the market voting with its dollars to prefer one standard or architecture over another. What is questionable and arguably unfair is to implement technology lock in by means of a certification process.


If the US Congress and the Office of the National Coordinator of Healthcare Information Technology (ONC HIT) want to benefit large established software vendors in the existing EHR technology market - see separately published research here for an overview - then proceed with the rule-making in its current form. The result will be that the big will get bigger. The powerful, more powerful. Depending on one's perspective, this is not necessarily bad: a certain amount of consolidation and centralization offers economies of scale in the acquisition and installation of hospital information systems (HIS) at large hospital systems and research institutions. What it does not offer is an opportunity to build out the HIT infrastructure at the community hospital level by innovating in affordable, usable, interoperable systems that address the requirements of the struggling constituencies at the community level. Of course, these two scenarios are not mutually exclusive. Just to pick one example, 'big' is required if you are HCA (formerly Hospital Corporation of America). My point is merely that the latter arguably requires less of a financial subsidy in priming the pump (and at least at the level of an intuitive sense of fairness is less deserving of one) than the areas of the economy that lack HIS infrastructure altogether.


The situation in the case of so-called ambulatory systems is different in that there are literally tens of thousands of small physician practices that will not even look up from their stethoscopes to consider a certified EHR technology, with the bar being so far above what they can reasonably contemplate proposing, purchasing, and implementing. Once again the large practices such as Mayo, Intermountain, and Geisinger will require big systems that scale to accommodate hundreds of practitioners. Even if they prefer to report quality metrics from existing data warehousing architecture rather than rip-and-replace in favor of a new EHR with redundant functionality, they have resources to manage the alternatives. On the other hand, maybe not - Intermountain is reportedly complaining that its meaningful use process hits a dead end since its own data warehousing is not 'certified' and it most efficiently and effectively reports its reimbursable quality metrics from it - see story here.


In conclusion, note that a case can be made to implement policies that create a practice environment such as that at Mayo where doctors are on salary, are compensated to perform coordination of care, and are incented to talk about patients for the benefit of patients rather than turning the crank on tests and visits. This is certainly the wave of the future in large urban settings. I hasten to add that such an alternative point of view is not being advocated here. In any case, we do not want to get there by driving individual and small group practitioners into the ground. These practices require a PPMS (EHR) that resembles what Quicken has done for small scale finance - usable, runs on a PC or small server, requires zero (or minimal) maintenance, and has a correspondingly low price point. Alternatively, the delivery of such a set of capabilities over the Internet as an application service provider (ASP) within a medical cloud infrastructure is a compelling value proposition. Going forward, the possibility of such an initiative flourishing in a 'certification' environment is constrained. Still, it is such a compelling value proposition that it may be unstoppable, regardless of certification or reimbursement.  Once such a vendor has succeeded in empowering a client to obtain reimbursement with its EHR, the world of prospective buyers will beat a path to its door. The certification process drops out as so much overhead.

[1] The dynamics around this process are nicely presented and explained in Carl Shapiro and Hal R. Varian, Information Rules: A Strategic Guide to the Network Economy. Boston: Harvard Business Review Press, 1999: p. 114. Update: this just in: To see the full Comment submitted to on the rule-making for reimbursable EHR investments go here: Certification _Lou Agosta's Feedback to Centers for Medicare and Medicaid Services, HHS.pdf

]]> Certification Tue, 16 Feb 2010 07:48:42 -0700
Reason # 1 Against Certification of EHR technology: Anti-Competitive Pricing

Information technology (IT) is a key enabler in the transformation of the delivery of high quality, affordable, accessible healthcare in the USA. This post contains a single recommendation: delete the word 'certified' from 'certified electronic healthcare record' (EHR) as a requirement for reimbursement under the HITECH portion of the American Recovery and Reinvestment Act (ARRA). In short, implement EHR technology pure-and-simple instead of 'certified EHR' technology. Although easy to state, this is a radical proposal. Thus, a discussion of the pros and cons of a certification approach to transforming the HIT infrastructure using EHR is useful. This acts as a lightning rod to constellate diverse issues and challenges in the market. The advantages have been called out in a previous post here. This post begins a conversation about the disadvantages of certification.


The certification process introduces a substantial distortion into a robust and dynamic market for such EHR software systems at multiple levels. Certification is an anti-competitive and administrative ('bureaucratic') barrier to market entry by innovative cloud computing applications, database appliance providers, and those high end hospitals and physician practices that command sufficient IT resources to implement their own in-house solutions using standard off-the-shelf components. In addition to satisfying the criteria of meaningful use and properly reporting quality metrics, all three of these types of organization must now go to a certification authority (whether or its successor organization(s)) and navigate the process of certifying a particular implementation and understanding of how the vendor EHR technology maps to meaningful use. I hasten to add that a given interpretation of the certification criteria (for example, at may be quite legitimate and proper without being the only possible one, the optimal one in a given context, or one on which a healthcare provider decides to 'bet the farm'. In short, choosing a solution that is not certified results in the provider being excluded from the reimbursement process and the market whose outer boundary is defined by the reimbursement process.


It is important to note that 'certification' does not just advantage the selected large, existing HIS and PPMS systems at the expense of would-be innovators at the low end of the market. At the risk of sounding cynical, it is one thing to throw over the side smaller players who have no constituency - whether in the market or amongst regulatory administrators. However, certification also disadvantages such software, hardware, and system sellers as HP, IBM, Oracle, Microsoft, SAS, and SAP as well as a host of large but lesser known players in the enterprise and corporate markets such as Informatica, Pervasive, ParAccel, and Teradata. Certification disadvantages these companies and, even more importantly, the buyers that they might benefit, whether they realize it or not. Perhaps IBM, HP, and so on are earning plenty of revenue and the sales staff does not believe they are disadvantaged. These sleeping giants are largely still just issuing press releases and marketing brochures - admittedly while continuing to perform consulting services - while pondering real investments in HIT software and technology. Well and good - but the real losers are the hospitals and physician practices, paying for reduced competition by means of what going forward I propose to call a 'certification markup' in price.


Eliminating the word 'certified' from the criteria for reimbursement will drive down prices of EHR systems and, in addition, has the potential to approximately double the size of this growth by 2015, though not directly through the actual reimbursement process itself. How then? By allowing increased software competition not just from new, small innovative players, but by encouraging the entry of large, established IT players, the looming market shake up in EHR technology will be accelerated and deepened. For further background on the existing market see my research 'The Healthcare Information Technology Market is Poised for Growth' (here).


Prospective buyers still need the products offered in a full, robust market and the cost savings that would be created by full participation. Many of the criteria of meaningful use are precisely what the basic technology does best by means of a standard relational database, a frontend query and reporting tool, and a data model that represents the intersection of such basic dimensions as patient data, provider data, diagnosis code, cost, and so on. It strains credulity that well established software products from vendors such as IBM, Oracle, and Microsoft, or well-known brands such as Cognos, Business Objects, and Hyperion, require 'certification'. Of course, under the proposed rule-making, the healthcare providers that buy such products are still properly required to demonstrate meaningful use according to the proposed criteria, and that may require a significant effort (or not); but having these products 'certified' adds no value from a functional perspective. The buck starts and stops with the healthcare provider's demonstration of meaningful use of the software to deliver quality care. The certification step falls out of the process of acquiring software and demonstrating its meaningful use as an idle wheel. Case closed.


One clarification is useful. Unless otherwise noted, this post takes issue with the point of view that 'certification' is required ('mandatory') for reimbursement. An alternative point of view, one that allows 'certification' as a voluntary process, is worth calling out. Such an approach would capture some of the benefits of standardization and testing without the disadvantages of stifling innovation, distorting the market, and privileging one part of the market at the expense of another. Although such a compromise is feasible, it is anyone's guess what will emerge from the rule-making until the pro and con positions have been clearly articulated.


The enabling legislation in the HITECH portion of the ARRA is clear - a certification process is authorized but is unambiguously marked as 'voluntary' in at least two key passages. How mandatory certification got 'built into' the rule-making process is the equivalent of a typographical error, which may usefully be corrected. In short, there is no language in the underlying, enabling legislation that would prevent this recommendation [to delete 'certification'] from being accepted and implemented. The exact language of the HITECH portion of the ARRA, as cited in a previous post about 'voluntary certification', is here. As an alternative point of view, acknowledging that certification is voluntary would have many of the same practical results as this proposal to delete the word 'certified' from 'certified EHR' in the rule-making.

Update: To see the full Comment submitted to on the rule-making for reimbursable EHR investments go here: Certification _Lou Agosta's Feedback to Centers for Medicare and Medicaid Services, HHS.pdf ]]> Certification Mon, 15 Feb 2010 07:43:33 -0700