Blog: Lou Agosta

Greetings and welcome to my blog focusing on reengineering healthcare using information technology. The commitment is to provide an engaging mixture of brainstorming, blue-sky speculation and business intelligence vision with real-world experiences – including those reported by you, the reader-participant – about what works and what doesn't in using healthcare information technology (HIT) to optimize consumer, provider and payer processes in healthcare. Keeping in mind that sometimes a scalpel, not a hammer, is the tool of choice, the approach is to be a stand for new possibilities in the face of entrenched mediocrity, to do so without tilting at windmills, and to follow the line of least resistance to getting the job done – a healthcare system that works for us all. So let me invite you to HIT me with your best shot.

About the author >

Lou Agosta is an independent industry analyst, specializing in data warehousing, data mining and data quality. A former industry analyst at Giga Information Group, Agosta has published extensively on industry trends in data warehousing, business and information technology. He is currently focusing on the challenge of transforming America's healthcare system using information technology (HIT). He can be reached at

Editor's Note: More articles, resources, and events are available in Lou's BeyeNETWORK Expert Channel. Be sure to visit today!

The answer is clinical data warehousing, decision support, and analytics. What's the question? WellPoint (one of the leading Blue Cross-branded health insurance companies) is reportedly contracting to use IBM's grand challenge computing system nicknamed "Watson" (after IBM's founder) to address a list of clinical issues in medical diagnosis, treatment, and (potentially) cost. In the spirit of Jeopardy!, the question is: will it advance in the direction of enabling comparative effectiveness research (CER) and pay for performance (P4P) while enhancing the quality of medical outcomes? Healthcare consumers tend to get a tad nervous when they suspect that insurance companies are going to deploy a new computer system as part of the physician payment approval process, though (let us be clear) no one has actually said that will happen in this case.

The diagnosis of a disease is part science, part intuition and artistry. The medical model trains doctors and healthcare specialists using an apprentice system (in addition, of course, to long schooling and lab work). The hierarchical nature of disease diagnosis has long invited automation using computers and databases. Early expert medical systems such as MYCIN at Stanford or CADUCEUS at the University of Pittsburgh were initially modest-sized arrays of if-then rules or semantic networks that grew explosively in resource consumption, time-to-manage, and cost and complexity of usability. They were compared in terms of accuracy and speed with the results generated by real-world physicians. The matter of accountability and error was left to be worked out later. Early results were such that automated diagnosis was as much work, slower, and not significantly better - though the automation would occasionally be surprisingly "out of the box" with something no one else had imagined. One lesson learned? Any computer system is better managed and deployed as an automated co-pilot rather than as the primary locus of decision making or responsibility.

Work has been ongoing at universities and research labs over the decades, and new results are starting to emerge based on orders-of-magnitude improvements in computing power, reduced storage costs, ease of administration, and usability enhancements. The case in point is IBM's Watson, which has been programmed to handle significant aspects of natural language processing, play Jeopardy! (it beat the humans), and, as they say in the corporate world, other duties as assigned.

Watson generates and prunes back hypotheses in a way that simulates what human beings do in formulating a differential diagnosis. However, the computer system does so in an explicit, verbose, and even clunky way using massively parallel processing, whereas the human expert distills the result out of experience, years of training, and unconscious pattern matching. Watson requires about eight refrigerator-sized cabinets for its hardware. The human brain still occupies a space about the size of a shoebox.

Still, the accomplishment is substantial. An initial application being considered is having Watson scan the vast medical literature on treatments and procedures to match evidence-based outcomes to individual persons or cohorts with the disease in question. This is where Watson's strengths in natural language processing, formulating hypotheses, and pruning them back based on confidence-level calculations - the same strengths that enabled it to win at Jeopardy! - come into play. In addition, oncology is a key initial target area because of the complexity of the underlying disorder as well as the sheer number of individual variables. Be ready for some surprises as Watson percolates up innovative approaches to treatment that are expensive and do not necessarily satisfy anyone's cost containment algorithm. Meanwhile, literally a million new medical articles are published each year, though only a tiny fraction of them are relevant to any particular case. M.D.s are human beings and have been unable to "know everything" there is to know about a specialty for at least thirty years. In short, Watson just could be the optimal technology for finding that elusive needle in a haystack - and doing so cost effectively.

A differential diagnosis in medicine is a set of hypotheses that first have to be exploded, then pruned, and finally combined based on confidence and prior probability to yield an answer. This corresponds to the Deep Question Answering (DeepQA) architecture implemented in Watson. Within five years, similar technologies will have been licensed and migrated to clinical decision support systems from standard EMR/EHR vendors.
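The generate-score-prune cycle described above can be sketched in a few lines. This is only an illustration of the pattern, not Watson's DeepQA implementation: the toy knowledge base, condition names, and confidence cutoff are all invented for the example.

```python
# Illustrative sketch of the generate-score-prune pattern behind a
# differential diagnosis. Names, evidence model, and thresholds are
# invented for the example.

def generate_hypotheses(findings, knowledge_base):
    """Explode the findings into every candidate diagnosis that
    explains at least one finding."""
    return {dx for finding in findings
               for dx in knowledge_base.get(finding, set())}

def score(dx, findings, knowledge_base):
    """Confidence = fraction of the patient's findings this diagnosis explains."""
    explained = sum(1 for f in findings if dx in knowledge_base.get(f, set()))
    return explained / len(findings)

def differential(findings, knowledge_base, cutoff=0.5):
    """Prune candidates below the confidence cutoff and rank the rest."""
    candidates = generate_hypotheses(findings, knowledge_base)
    ranked = sorted(((score(dx, findings, knowledge_base), dx)
                     for dx in candidates), reverse=True)
    return [(dx, s) for s, dx in ranked if s >= cutoff]

# Toy knowledge base: finding -> diagnoses that can explain it.
KB = {
    "fever":   {"influenza", "pneumonia"},
    "cough":   {"influenza", "pneumonia", "asthma"},
    "dyspnea": {"pneumonia", "asthma"},
}

print(differential(["fever", "cough", "dyspnea"], KB))
# pneumonia explains 3 of 3 findings; influenza and asthma explain 2 of 3
```

The point of the sketch is the shape of the computation - explode, score, prune, rank - which Watson performs with massive parallelism over evidence drawn from natural language sources rather than a hand-built table.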

While your clinical data warehouse may not be running 3,000 Power 750 cores and terabytes of self-contained data in a physical footprint about the size of eight refrigerators, some key lessons learned are available even for a modest implementation of clinical data warehousing and decision support:

  • Position the clinical data warehouse as a physician's assistant (think: co-pilot) to answer questions, provide a "sanity check," and fill in the gaps created by explosively growing treatments.
  • Plan on significant data preparation (and attention to data quality) to get data down to the level of granularity required to make a differential diagnosis. ICD-10 (currently mandated for 10/2013 but likely to slip) will help a lot, but may still have gaps.
  • Plan on significant data preparation (and more attention to data quality) to get data down to the level of granularity required to make a meaningful financial decision about the effectiveness of a given treatment or procedure. Pricing and cost data is dynamic, changing over time. New treatments start out expensive and become less costly. Time-series pricing data will be critical path. ICD-10 (currently mandated for 10/2013 but likely to slip) will help but will need to be augmented significantly with new pricing data structures, and even then may still have gaps.
  • Often there is no one right answer in medicine - it is called a "differential diagnosis" - so prefer systems that show the differential (few of them do today, though reportedly Watson can be so configured) and trace the logic at a high level for medical review.
  • Continue to lobby for tort and liability reform as computers are made part of the health care team, even in an assistant role. Legal issues may delay, but will not stop implementation in the service of better quality care.
  • Look to natural language interfaces to make the computing system a part of the health care team, but be prepared to work with a printout or a screen until then.
  • Advanced clinical decision support, rare in the market at this time, is like a resident in psychiatry in that it learns from its right and wrong answers using machine learning technologies as well as "hard coded" answers from a database or semantic network.
  • This will take "before Google (BG)" and "after Google (AG)" in medical training to a new level. Watson-like systems will be available on a smart phone or tablet to residents and attendings at the bedside.
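As a small illustration of why time-series pricing is critical path (per the bullets above), a cost lookup has to return the price in effect on the date of service, not the current price. The procedure code and prices below are invented for the example:

```python
# Illustrative time-series pricing lookup: the cost attributed to a
# procedure must be the price in effect on the date of service.
# Codes and prices are invented for the example.
import bisect
from datetime import date

# procedure code -> list of (effective_date, price), kept sorted by date
PRICE_HISTORY = {
    "47.01": [  # hypothetical procedure code
        (date(2009, 1, 1), 12000.0),
        (date(2010, 1, 1), 9500.0),   # new treatments become less costly
        (date(2011, 1, 1), 7800.0),
    ],
}

def price_on(code, service_date):
    """Return the price in effect on service_date, i.e., the entry with
    the latest effective_date on or before it."""
    history = PRICE_HISTORY[code]
    dates = [d for d, _ in history]
    i = bisect.bisect_right(dates, service_date) - 1
    if i < 0:
        raise ValueError(f"no price for {code} on {service_date}")
    return history[i][1]

print(price_on("47.01", date(2010, 6, 15)))  # 9500.0, not the 2011 price
```

Without this effective-dated structure, a treatment delivered in 2009 would be costed at 2011 prices and any comparative effectiveness calculation would be skewed from the start.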

Finally, for the curious, consider the cost of the hardware and customized software: some 3,000 Power 750 cores (commercially available "off the shelf"), terabytes of data, and the time and effort of a development team of some 25 people with Ph.D.s working for four years (the latter being the real expense). My back-of-the-envelope pricing (after all, this is a blog post!) weighs in at least in the ballpark of $100 million. This is probably low, but I am embarrassed to price it higher. This does not include the cost of preparing the videos and marketing. One final thought. The four-year development time of this project is about the length of time to train a psychiatrist in a standard residency program.


  1. "Wellpoint's New Hire. What is Watson?" The Wall Street Journal. September 13, 2011.

  2. IBM: "The Science Behind an Answer":


Posted September 14, 2011 4:06 PM

Amid its 39th consecutive quarter of profitability, Pervasive has launched a new SaaS version of its flagship Data Integrator (DI) product called DI "Cloud Edition". In short, as part of a process described by CTO Mike Hoskins as "innovating below the water line," the Pervasive software development team has service-enabled the platform for SOA. This enables the DI product to bring its extensive connectivity to the cloud, either on the Pervasive DataCloud on Amazon's EC2, or on any cloud.


At the same time the architecture was undergoing a major upgrade, innovation was also occurring above the waterline. Significant functionality was added in the areas of data profiling and data matching. Both profiling and matching now exploit the DataRush parallel processing engine. According to Jim Falgout (DataRush Chief Technologist), DataRush continues to set records for high performance. The latest record was broken on September 27, 2010, when it delivered an order of magnitude greater throughput than earlier biological algorithm results, performing 986 billion cell updates/second on 10 million protein sequences using a 384-core SGI Altix UV 1000 in 81.1 seconds.[1] Wow! This enables the entire platform to deliver the basics for a complete data governance infrastructure to those enterprises that know the answer to the question "What are your data quality policies and procedures?" When combined with the open source data mining offering KNIME (Konstanz Information Miner), which featured prominently in numerous use cases at the conference, Pervasive is now a triple threat across data integration, quality, and predictive analytics.


In addition to software innovation, end-user enterprises will be interested to learn that Pervasive is also delivering pricing innovations. Client (end user) enterprises will be unambiguously pleased to hear about this one, unlike some uses of the phrase "pricing innovations" that have meant gaming the system to implement price increases. Instead of per connector pricing, the classic approach for data integration vendors, according to which each individual connector costs an additional fee, Pervasive provides all of its available connectors in DI v10 Cloud Edition for a single reasonable monthly charge. The figure that was given to me was $1000 a month for the software, including all the available connectors. And remember, Pervasive is famous since its days doing business as Data Junction for the diversity of connectors, across the entire spectrum from xbase class to high end enterprise adapters. For those enterprises confronting the dual challenges of a short timeline to results and a large number of heterogeneous data sources, the recommendation is clear: check this out.
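A quick back-of-the-envelope sketch shows how the flat fee compares with classic per-connector pricing. The $1,000 monthly figure is the one quoted above; the per-connector fee is purely a hypothetical for illustration:

```python
# Back-of-the-envelope comparison of the two pricing models discussed
# above. The $1,000/month flat fee is the figure quoted in the post;
# the per-connector fee is a hypothetical for illustration only.

FLAT_MONTHLY = 1000.0          # all connectors included (quoted figure)
PER_CONNECTOR_MONTHLY = 250.0  # hypothetical per-connector charge

def monthly_cost_per_connector(n_connectors):
    return n_connectors * PER_CONNECTOR_MONTHLY

def breakeven_connectors():
    """Smallest number of connectors at which the flat fee matches or
    beats per-connector pricing."""
    n = 1
    while monthly_cost_per_connector(n) < FLAT_MONTHLY:
        n += 1
    return n

print(breakeven_connectors())  # 4: at four or more connectors, flat wins
```

The arithmetic is trivial, but it makes the point: for a shop with a large number of heterogeneous sources, the all-inclusive model removes the per-source cost penalty entirely.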

[1] For further details see "Pervasive DataRush on SGI® Altix® UV 1000 Shatters Smith-Waterman Throughput Record by 43 Percent":

Posted November 4, 2010 10:23 AM

Healthcare and healthcare information technology (HIT) continue to be a data integration challenge for payers, providers, and consumers of healthcare services. This post will explore the role of Pervasive's Data Integrator software in the solution envisioned by one of the partners presenting at the iNExt, Misys Open Source Solutions (MOSS). However, first we need to take a step back and get oriented to the opportunity and the challenge.

Enabling legislation by the federal government in the form of the American Recovery and Reinvestment Act (ARRA) of 2009 provides financial incentives to healthcare providers (hospitals, clinics, group practices) that install and demonstrate the meaningful use of software to (1) capture and share healthcare data by the year 2011, (2) enable clinical decision support using the data captured in (1) by 2013, and (3) improve healthcare outcomes (i.e., cause patients to get well) against the track record laid down in (1) by 2015. Although the rules are complex and subject to further refinement by government regulators, one thing is clear. Meaningful use requires the ability of electronic medical record (EMR/EHR) systems to exchange significant volumes of data and to do so in a way that preserves the semantics (i.e., the meaning) of the data.

Misys Open Source Solutions (MOSS) is jumping with both feet into the maelstrom of competing requirements, standards, and technologies with a plan to make a difference. MOSS first merged with Allscripts and then separated from them as Allscripts merged with Eclipsys. MOSS is now a standalone enterprise on a mission to harmonize medical records. Riding on Apache 2.0 and leveraging the Open Health Tools (OHT), MOSS is inviting would-be operators and participants in health information exchanges (HIE) to download its components and make the integrated healthcare enterprise (IHE) a reality. As of this writing, the need is great and so is the vision. However, the challenges are also formidable, extending from patient identification, document sharing, record location, audit trail production and authentication, and subscription services to clinical care documentation and persistence in a repository. Since the software is open source and comes at no additional cost, MOSS's revenue model relies on fees earned for providing maintenance and support.

Whether the HIE is public or private, the operator confronts the challenge of translating between a dizzying array of healthcare standards - HIPAA, HL7, X12, HCFA, NCPDP, and so on. With literally hundreds of data formats in legacy as well as modern systems out there, those HIEs that are able to provide a platform for interoperability are the ones that will survive and prosper through value-added services rather than just forwarding data. The connection with Pervasive is direct, since the incoming data format may be HL7 version 2.3 and the outbound format version 2.7. Pervasive is cloud-enabled and, not to put too fine a point on it, has more data conversion formats than you can shake a stick at. Pervasive is a contributor to the solution at the data integration, data quality, and data mining levels of the technology stack being envisioned by MOSS. Now and in the future, an important differentiator between winners and runners-up amongst HIEs will be the ability to translate between diverse data formats. In addition, this work must be done with velocity and in significant volume, while preserving the semantics. This approach adds value to the content rather than just acting as a delivery service.
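As a minimal sketch of the translation work an HIE performs, consider the parse-transform-serialize shape of handling an HL7 v2 message (pipe-delimited segments). Real version migration from 2.3 to 2.7 involves far more than rewriting the version field; the message content below is invented for the example:

```python
# Minimal sketch of field-level work on an HL7 v2 message header (MSH
# segment). Real HL7 version migration rewrites message structures, not
# just the version field; this only illustrates the parse-transform-
# serialize shape of the task. Message content is invented.

def parse_segment(segment: str) -> list[str]:
    return segment.split("|")

def serialize_segment(fields: list[str]) -> str:
    return "|".join(fields)

def set_version(msh_segment: str, version: str) -> str:
    """MSH-12 is the version ID field. With the segment name at index 0
    and the field separator itself counted as MSH-1, it lands at split
    index 11."""
    fields = parse_segment(msh_segment)
    fields[11] = version
    return serialize_segment(fields)

msh = "MSH|^~\\&|SENDAPP|SENDFAC|RECVAPP|RECVFAC|20100901120000||ADT^A01|MSG00001|P|2.3"
print(set_version(msh, "2.7"))
```

A real integration engine applies hundreds of such transformations per message, per format pair, which is exactly the connector breadth Pervasive brings to the table.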

As noted in the companion post to this one, Pervasive has a cloud-enabled version of its data integration product, Version 10, Cloud Edition. DataRush, the parallel processing engine, continues to set new records for high performance parallel processing across large numbers of cores. Significant new functionality in data profiling and data matching is now available as well, making Pervasive a triple threat across data integration, data quality, and, when open source data mining from KNIME is included, data mining. [For reference: NCPDP = National Council for Prescription Drug Programs; HCFA = Health Care Financing Administration (the precursor to the Centers for Medicare and Medicaid Services); HIPAA = Health Insurance Portability and Accountability Act; HL7 = Health Level Seven; X12 = a national insurance standard format for electronic data exchange of health data.]

Posted November 4, 2010 10:04 AM

Predictive analytics has been Page One news for many years in the business and popular press, and I caught up with Predixion Software principals Jamie MacLennan (CTO) and Simon Arkell (CEO) shortly prior to the launch earlier this month. Marketers, managers, and analysts alike continue to lust after golden nuggets of information hidden amidst the voluminous river of data flowing across business systems in the interest of revenue opportunities. Competitive opportunities require fast time to results, and competition for clients remains intense when they can be enticed to engage.


Predixion represents yet another decisive step in the march of predictive analytics in two tension-laden but related directions. First, the march down market towards wider deployment of predictive analytics is enabled by deployment in the cloud for a modest monthly fee. Second, the march up market towards engaging and solving sophisticated predictive problems is enabled by innovations in implementing advanced algorithms from Predixion in readily usable contexts such as Excel and PowerPivot. Predixion Insight offers a wizard-driven plug-in for Excel and PowerPivot that uses a cloud-based predictive engine and storage.


The intersection of predictive analytics and cloud computing forms a compelling value proposition. The initial effort on the part of the business analyst is modest in comparison with provisioning an in-house predictive analytics platform. The starting price is reportedly $99 per seat per month - though be sure to ask about additional fees that may apply. The business need has never been greater. In survey after survey, nearly 80% of enterprise clients acknowledge having a data warehouse in production. What is the obvious next step, given a repository of clean, consistent data about customers, products, markets, and promotions? Create a monetary opportunity using advanced analytics without having to hire a cadre of statistical PhDs.


Predixion Insight advances the use of cloud computing to deliver innovations in advanced analytics algorithms. Wizards are provided to analyze key influencers, detect categories, fill from example, forecast, highlight exceptions, build a prediction calculator, and perform shopping basket analysis. While the Microsoft BI platform is not the only choice for advances in usability, those marching to the beat of the usability drum acknowledge its leadership in ease-of-deployment wizards and implementation practices.


Finally, notwithstanding the authentic advances being delivered by Predixion and its competitors, a few words of caution about predictive analytics. Discovering and determining meaning is a business task, not a statistical one. There are no spurious relationships between variables, only spurious interpretations. How to guard against misinterpretation? Deep experience and knowledge of human behavior (the customer), the business (the service), and the market (the intersection of the latter two). The more challenging (competitive, immature, commodity-like, etc.) the business environment, the more worthwhile are the advantages attainable through predictive analytics. The more challenging the business environment, the more important are planning, prototyping, and perseverance. The more challenging the business environment, the more valuable are management perspective and depth of understanding. Finally, prospective clients will still need to be rigorous about data preparation and quality. The narrative about ease of use and lowering the implementation bar should not obscure the back-story that many of the challenges in predictive analytics relate to the accuracy, consistency, and timeliness of the data. You cannot shrink-wrap knowledge of your business. However, the market basket analysis algorithm and wizards that I saw demoed by Predixion come as close to doing so as is humanly possible so far.
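For the curious, the core of a market basket analysis like the one demoed can be expressed in a few lines: association rules scored by support and confidence. The transactions below are invented for the example, and a production wizard does far more (candidate generation, pruning, lift scoring):

```python
# Minimal market (shopping) basket analysis: association rules scored
# by support and confidence. Transactions are invented for the example.

transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

def support(itemset):
    """Fraction of baskets containing every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent):
    """P(consequent in basket | antecedent in basket)."""
    return support(antecedent | consequent) / support(antecedent)

# e.g., how often does beer accompany diapers?
print(confidence({"diapers"}, {"beer"}))  # 2 of 3 diaper baskets also have beer
```

The statistics are straightforward; the caution above stands, because deciding whether the diapers-and-beer rule means anything for the business is an act of interpretation, not computation.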

Posted October 5, 2010 7:02 AM

Consider three different scenarios that place healthcare patient safety at risk. The first is an individual hazard, the second human behavior, and the third a system issue in the broad sense of "system" as distinct from information technology (IT).

The first consists in placing concentrated potassium alongside diluted solutions of potassium-based electrolytes. Now you need to know that intravenous administration of the former (concentrated potassium) stops the heart almost instantaneously. In one tragic case, for example, an individual provider mistakenly selected a vial of potassium chloride instead of furosemide, both of which were kept on nearby shelves just above the floor. A mental slip, an erroneous association of the potassium on the label with the potassium-excreting diuretic, likely resulted in the failure to recognize the error until she went back to the pharmacy to document removal of the drug.[1] By then it was too late.

Second, a pharmacist supervising a technician, working under stress and tight time deadlines due to short staffing, does not notice that the sodium chloride in a chemotherapy solution is not 0.9% as it should be but over 23%. After the solution is administered, the patient, a child, experiences a severe headache and thirst, lapses into a coma, and dies. This results in legislation in the state of Ohio called Emily's Law. The supervisor loses his license and is reportedly sentenced to six months in jail.[2]

Finally, in a patient quality assurance session, the psychiatric residents on call at a major urban teaching hospital dedicated to community service express concern that patients are being forwarded to the psychiatric ward without proper medical (physical) screening. People with psychiatric symptoms can be sick too, with life-threatening physical disorders. In most cases, it was 3 AM and the attending physician was either not responsive or dismissive of the issue. In one instance, the patient had a heart rate of 25 (where 80+ would be expected) and a Code had to be declared. The nurses on the psychiatric unit were not allowed to push a mainline insertion into the artery to administer the atropine, and the harried resident had to perform the procedure himself. Fortunately, this individual knew what he was doing and likely saved a life. In another case, the patient was delirious, and a routine neurological exam (performed on the psychiatric unit, not in the emergency room where it ought to have been done) resulted in his being rushed into the operating room to save his life.

In all three cases, training was more than adequate. The delivery of additional training would not have made a difference. The individual knew concentrated potassium was toxic but grabbed the wrong container, the pharmacist knew the proper mixture, and the emergency room knew how to conduct basic physical (neurological) exams for medical well-being. What then is the recommendation?

One timely suggestion is to manage quality and extreme complexity by means of checklists. A checklist of high-alert chemicals can be assembled and referenced. Wherever a patient is being delivered a life-saving therapy, sign-off on a checklist of steps in preparing the medication can [should] be mandatory. The review of the physical status of patients in the emergency room is perhaps the easiest of all to accommodate, since vital signs and status are readily definable. Note that such an approach should contain a "safe harbor" for the acknowledgment of human and system errors, as is routinely provided in the case of failures of airplane safety, including crashes. Otherwise, people will be people and try to hide the problem, making a recurrence inevitable.
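The mandatory sign-off idea above can be sketched as a simple data structure: a high-alert preparation cannot be marked complete until every step carries a signature. The drug names and steps are illustrative, not a clinical protocol:

```python
# Sketch of the mandatory sign-off idea: a high-alert preparation is not
# complete until every checklist step carries a signature. The drug list
# and steps are illustrative, not a clinical protocol.

HIGH_ALERT = {"concentrated potassium chloride", "23.4% sodium chloride"}

class PreparationChecklist:
    def __init__(self, medication, steps):
        self.medication = medication
        self.steps = steps
        self.signoffs = {}  # step -> initials of the person signing

    def sign_off(self, step, initials):
        if step not in self.steps:
            raise ValueError(f"unknown step: {step}")
        self.signoffs[step] = initials

    def complete(self):
        """High-alert preparations require a signature on every step."""
        if self.medication in HIGH_ALERT:
            return all(s in self.signoffs for s in self.steps)
        return True

chk = PreparationChecklist(
    "23.4% sodium chloride",
    ["verify concentration", "verify dilution", "second-person check"],
)
chk.sign_off("verify concentration", "LA")
print(chk.complete())  # False: two steps remain unsigned
```

The enforcement is trivial to implement; the hard parts, as discussed below, are delivering the checklist in real time at the point of care and maintaining the safe harbor that keeps people honest about errors.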

The connection with healthcare information technology (HIT) is now at hand. IT professionals have always been friends of checklists. Computer systems are notoriously complex and often far from intuitive. Hence the importance of asking the right questions at the right time when troubleshooting the IT system. Healthcare professionals are also longtime friends of checklists for similar reasons, both by training and experience. Sometimes symptoms loudly proclaim what they are; but often they can be misleading or anomalous. The differential diagnosis separates the amateurs from the veterans. Finally, we arrive at a wide area of agreement between these two professions, eager as they are to find some common ground.

Naturally, a three-ring binder on a shelf with hard copy is always a handy backup; however, the computer is a ready-made medium for delivering advice, including a top-ten list of things to watch in the form of a checklist, in real time to a stressed provider. In the case of emergency rooms and clinics, the hospital information system (HIS) is the platform of choice on which to install, update, and maintain the checklist electronically. However, this means that the performance of the system needs to be consistent with delivery of the information in real-time or near-real-time mode. It also means that the provider should be trained in the expert fast path to the information and not need to hunt and peck through too many screens. The latter, of course, would be equivalent to not having a functioning list at all.

And this is where a dose of training in information technology will make a difference. The prognosis is especially favorable if the staff already have a friendly - or at least accepting - relationship with the HIS. It reduces paper work, improves workflow, and allows information sharing to coordinate care of patients.

This is also a rich area for further development and growth as systems provide support to the physician in making sure all of the options have been checked. The system does not replace the doctor, but acts like a co-pilot or navigator to perform computationally intense tasks that would otherwise take too much time in situations of high stress and time pressure. Obviously, issues of high-performance response on the part of the IT system and usability (from the perspective of the professional staff) loom large here. Look forward to further discussion on these points. Meanwhile, we now add another item to the vendor selection checklist in choosing a HIS: it must be able to provide templates (and where applicable, content) for clinical checklists by subject matter area.

It should be noted that "the checklist manifesto" is the recommendation of a book of the same title by Atul Gawande, MD, author of an engaging column in the New Yorker on all manner of things medical and a celebrity since being quoted by President Obama in the healthcare debate. However, make no mistake. The checklist is not a silver bullet. By all means, marshal checklists and use them to advantage. Still, there are some scenarios where long training, experience, a steady hand, and concentration are indispensable. In flight school, take-off and landing checklists are essential. These are the normal checklists. Non-normal checklists are also available on how to restart your engines from 30,000 feet. But a double bird strike on take-off at 5000 feet, disabling both engines, is not on anyone's list. You cannot even get the binder down off the shelf in the amount of available time, the latter generally being the critical variable in short supply. Then there is no substitute for experience, leavened with a certain grace. But for the rest of us, and in the meantime, assemble, update, and use your checklists.

[1] "Potassium may no longer be stocked on patient care units, but serious threats still exist"  Oct 4 2007,

[2] "An Injustice has been done,"

Posted August 16, 2010 1:08 PM

