Seven Things Every Government Agency Should Know about Taxonomies and Business Intelligence

Originally published October 18, 2011

It happens more often than not – at the first mention of the word “taxonomy,” eyes glaze over and those seated in the less conspicuous parts of the room start to exit. Why? Does it have to do with any term having Greek roots, or just with one that elicits memories of long hours attempting to memorize a lengthy classification schema of living things during high school biology class? Taxonomy often has that effect on us, and yet it is both a truly simple concept and an essential one for the effective operation of any enterprise in the information age. Without a method of arrangement, a logical classification of items as we collect or store them, it becomes very difficult to find anything at a later date.

That is the key to taxonomy: it enables an organization to find the specific objects it needs to accomplish its objectives, whether a file, a letter, a book, an invoice, a truck or a piece of information.

Specifically relevant to an enterprise is the concept of the business taxonomy, or the art of describing and organizing information consistently using a controlled vocabulary. Similar to the folder structure you use to organize documents on your computer, a business taxonomy strives to create a common categorization structure that can be used across the enterprise. This categorization structure must be based on simple and intuitive labels so that regardless of who you are – an employee, a customer, a researcher – you can classify, maintain and search information easily. Business taxonomies can guide the design of organizational systems, ensuring that a common vocabulary is shared by all. Business taxonomies, along with metadata sets, can be implemented in areas such as data organization, document tagging, system navigation menus and search to help improve information storage, analysis, usability and “findability.”
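
To make the idea of a controlled vocabulary concrete, here is a minimal sketch in Python. It assumes a taxonomy can be held as a small tree of labels; the class name, the function names and the sample labels are all hypothetical and chosen only for illustration. It shows how a tagging step might accept only terms that exist in the taxonomy and return their full path.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TaxonomyNode:
    """One label in the taxonomy, with optional narrower terms beneath it."""
    label: str
    children: list[TaxonomyNode] = field(default_factory=list)

    def walk(self, path=()):
        """Yield (path, node) pairs for this node and every descendant."""
        current = path + (self.label,)
        yield current, self
        for child in self.children:
            yield from child.walk(current)


# Hypothetical starter taxonomy; the labels are illustrative only.
DOCUMENT_TYPES = TaxonomyNode("Document type", [
    TaxonomyNode("Correspondence", [TaxonomyNode("Letter"), TaxonomyNode("Memo")]),
    TaxonomyNode("Financial", [TaxonomyNode("Invoice"), TaxonomyNode("Budget")]),
])


def controlled_vocabulary(root):
    """Map each allowed label (lowercased) to its full path in the tree."""
    return {path[-1].lower(): path for path, _ in root.walk()}


def tag(term, vocabulary):
    """Return the canonical path for a term, or raise if it is not allowed."""
    try:
        return vocabulary[term.strip().lower()]
    except KeyError:
        raise ValueError(f"'{term}' is not in the controlled vocabulary") from None


VOCAB = controlled_vocabulary(DOCUMENT_TYPES)
print(" > ".join(tag("invoice", VOCAB)))  # Document type > Financial > Invoice
```

This sketch assumes every label is unique across the tree. A real taxonomy with repeated labels in different branches would need to key on the full path instead.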

Ultimately, it is about finding the things we need. Librarians were arguably the pioneers of taxonomy. From the earliest times, they devised ways to classify their books or papyrus documents. How else were they going to find anything at the great library in Alexandria? Anyone familiar with library card catalogs before they were automated remembers the three dimensions for search: title, author and subject. Title and author searches were always simple since they were direct, specific and factual. But a search on subject was always an adventure since there were so many nuances upon leaving a mainstream topic like China, the French Revolution or the Phoenicians. It was the need for books to be “findable” that led librarians to develop the Dewey Decimal System taxonomy and later the Library of Congress Classification taxonomy.

As we move on to the world of Web 2.0, specifically in the public sector, we have to contend with electronic content. The payoff for the work a government agency puts into creating and managing business taxonomies comes when its users – citizens, employees, researchers – can find and use content efficiently. Searching and browsing are the two most common methods people use to find the information they need. By building intuitive business taxonomies and metadata, an agency will provide the structured classification required to enhance both the searching and browsing experience.

There are multiple ways in which business taxonomies can be leveraged to improve content “findability.” Faceted search, auto-complete, targeted content, security and personalization are some of them. These features help content managers restrict search results to certain areas of the taxonomy and present content depending on the user profiles. Additionally, these features help users discover content based on topic, location, document type or intended audience, and ultimately find the content they are looking for in a simple, consistent and intuitive way.
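
As a rough illustration of two of these features, the Python sketch below filters a handful of tagged items by facet, counts the values left in another facet, and offers a simple prefix-based auto-complete. The item list, the facet names (topic, doc_type, audience) and the function names are assumptions made up for this example; they do not describe any particular search product.

```python
from collections import Counter

# Hypothetical content items, each tagged with metadata facets.
ITEMS = [
    {"title": "Travel policy", "topic": "HR", "doc_type": "Policy", "audience": "Employees"},
    {"title": "Leave request form", "topic": "HR", "doc_type": "Form", "audience": "Employees"},
    {"title": "Grant guidelines", "topic": "Funding", "doc_type": "Policy", "audience": "Public"},
    {"title": "Grant application", "topic": "Funding", "doc_type": "Form", "audience": "Public"},
]


def filter_items(items, **selected):
    """Keep only the items whose facets match every selected value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items carry each value of the given facet."""
    return Counter(item[facet] for item in items)


def autocomplete(prefix, labels, limit=5):
    """Suggest labels that start with the typed prefix, ignoring case."""
    needle = prefix.casefold()
    matches = sorted(label for label in labels if label.casefold().startswith(needle))
    return matches[:limit]


hr_items = filter_items(ITEMS, topic="HR")
print([item["title"] for item in hr_items])       # ['Travel policy', 'Leave request form']
print(dict(facet_counts(hr_items, "doc_type")))   # {'Policy': 1, 'Form': 1}
print(autocomplete("gr", [item["title"] for item in ITEMS]))
# ['Grant application', 'Grant guidelines']
```

The counts are what a faceted interface would typically show next to each filter, so users can see how many results remain before they narrow further.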

For example, a United States intelligence agency previously managed all its intranet content through a complex publishing process, leaving the content on the intranet outdated and underutilized. By designing a business taxonomy and metadata strategy, the agency was able to develop a solution that provided updated content, improved site navigation and more accurate search results. SharePoint was used to support metadata tagging through library columns in the documents, images and pages libraries. These metadata fields enabled faceted searching and filtering as an additional navigation option for the agency’s intranet, significantly improving search and findability in support of its critical mission.

Another example involves a United States federal agency that protects the public from the risks of injury or death from certain types of products. The agency designed a business taxonomy and metadata strategy for its public website. This strategy has helped the agency receive, classify and route electronic incident reports from the public about those products. The strategy includes a product taxonomy, company taxonomy, hazard taxonomy and injury taxonomy, among others. These taxonomies are now used to process thousands of reports for thousands of companies about hundreds of products, and have allowed the agency to implement faceted search and auto-complete, considerably enhancing the public’s search experience.

Can you imagine the horrors of a world where you can’t find anything? One of the areas that has most clearly illustrated the need for business taxonomies is the legal process of eDiscovery. As part of “discovery” it is fairly common for attorneys to request a large amount of archival documentation, and with electronic discovery now embedded in the new Federal Rules of Civil Procedure, the stage is set for some monumental problems without a truly robust taxonomy to assist in the review process. In their landmark 2007 article titled “Information Inflation: Can the Legal System Adapt?” George Paul and Jason Baron argue that finding anything in the billions of electronic documents involved in eDiscovery in current cases has become unmanageable. Furthermore, the authors estimate that a case with one billion email records, of which one quarter have attachments, would require 100 attorneys to work full time for 54 years in order to complete an adequate discovery process. Hence, “review” – finding something – is almost impossible in terms of time and cost, and the main reason is the lack of an adequate business taxonomy to classify the documents and find the needed content.
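
The scale of that estimate is easier to appreciate as back-of-the-envelope arithmetic. In the Python sketch below, only the one billion emails, the 100 reviewers and the 54-year result come from the text above. The review rate and working schedule are assumptions picked for illustration, and attachments are ignored, so treat the output as an order-of-magnitude check rather than a reconstruction of the authors' calculation.

```python
# Figures taken from the article.
EMAILS = 1_000_000_000
REVIEWERS = 100

# Assumptions for illustration only (not stated in the article).
DOCS_PER_HOUR = 50        # documents one reviewer gets through per hour
HOURS_PER_DAY = 10        # hours each reviewer works per day
DAYS_PER_YEAR = 7 * 52    # every day of the week, every week of the year

total_review_hours = EMAILS / DOCS_PER_HOUR
team_hours_per_year = REVIEWERS * HOURS_PER_DAY * DAYS_PER_YEAR
years = total_review_hours / team_hours_per_year

print(f"{total_review_hours:,.0f} reviewer-hours")  # 20,000,000 reviewer-hours
print(f"{years:.1f} years")                         # 54.9 years
```

Under these assumed rates the result lands in the same range as the 54 years cited, which shows how quickly linear, document-by-document review becomes impractical at this volume.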

So if taxonomy is so important, how do we address it? Here are the primary steps in the design of a best-in-class business taxonomy: review relevant documentation, interview users and key stakeholders, conduct taxonomy and metadata workshops, convene focus groups to refine the taxonomy, and conduct card-sorting and usability tests.

A taxonomy and metadata workshop is the most important activity when designing a business taxonomy. These workshops, which are typically held over a two- to three-day period, convene a cross-organizational group of users with the objective of identifying a starter taxonomy based on the needs of end users and content owners. They also help set expectations for long-term processes and reduce content categorization requirements to the most basic level.

Creating a business taxonomy is iterative and evolutionary in nature. Beyond some of the suggestions above, it is particularly important for a government agency to consider the following seven key items regarding taxonomy development and business intelligence (BI):

  1. Ensure that content is consistent and compliant with federal and state statutes, acts and regulations.
  2. Request input from external users or from the public affairs office, if the taxonomy is for public use.
  3. Request input from various agency areas and various profiles – multiple roles, responsibilities, or hierarchical positions – to ensure a cross-organizational taxonomy that is intuitive to all.
  4. Select a technology tool that allows for a complete implementation of the desired taxonomy and search features.
  5. Engage upper management to overcome users’ resistance to changing content categorizations.
  6. Analyze old taxonomies and map them to the new taxonomies, so the search engine can also find historic content that is still tagged to old taxonomies.
  7. Communicate the evolving nature of the taxonomy to make sure stakeholders understand the taxonomy will continue to evolve during the course of a project.
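
Item 6 in the list above can be sketched as a simple crosswalk between retired and current labels. In this Python example the mapping, the document list and the function names are all hypothetical. The idea is only that a query for a current term is expanded to include the legacy labels that map to it, so content still carrying old tags is not lost.

```python
# Hypothetical crosswalk from retired labels to their replacements.
LEGACY_TO_CURRENT = {
    "Personnel": "Human Resources",
    "Staffing": "Human Resources",
    "Procurement": "Acquisition",
}

# Hypothetical documents; older ones still carry legacy tags.
DOCUMENTS = [
    {"title": "2004 hiring memo", "tag": "Personnel"},
    {"title": "2011 onboarding guide", "tag": "Human Resources"},
    {"title": "2006 vendor list", "tag": "Procurement"},
]


def expand_query_terms(term, crosswalk):
    """Return the current label for a term plus every legacy label mapped to it."""
    current = crosswalk.get(term, term)
    legacy = sorted(old for old, new in crosswalk.items() if new == current)
    return [current] + legacy


def search(term, documents, crosswalk):
    """Find documents tagged with the term or with any equivalent legacy label."""
    terms = set(expand_query_terms(term, crosswalk))
    return [doc["title"] for doc in documents if doc["tag"] in terms]


print(expand_query_terms("Human Resources", LEGACY_TO_CURRENT))
# ['Human Resources', 'Personnel', 'Staffing']
print(search("Human Resources", DOCUMENTS, LEGACY_TO_CURRENT))
# ['2004 hiring memo', '2011 onboarding guide']
```

Expanding the query leaves historic content untouched. The alternative, retagging old documents in bulk, gives cleaner data but is usually a larger project.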


There are a number of sources that can assist an agency in need of taxonomy expertise. A classic textbook on the subject is The Accidental Taxonomist, by Heather Hedden, which is the most comprehensive guide available to the art and science of building information taxonomies. Beyond that, practitioners gather every year at the Taxonomy Boot Camp of the KMWorld Conference in Washington, D.C. The authors would be delighted to provide additional guidance to anyone requiring further information.
 
The key to finding content is a well-defined taxonomy and the metadata that supports it. This has been true for any organization at any time in history, and it is particularly critical for any agency attempting to serve the public in the virtual age. For BI practitioners increasingly intent on extracting business intelligence from unstructured data, a robust taxonomy is an essential prerequisite.

  • Tatiana Baquero
    Tatiana is a Principal Knowledge Management Analyst at Project Performance Corporation (PPC), part of AEA Group, with experience in the development and implementation of Web-based content management systems and other corporate information systems for the banking, health care, telecom and micro-finance industries, as well as for the government and multilateral organizations. She has extensive experience with knowledge management strategy, portal strategy and taxonomy design for multiple applications, including Microsoft Office SharePoint Server (MOSS).

    Tatiana excels at designing and conducting taxonomy and metadata workshops to effectively manage corporate content. Her project management and bilingual skills have allowed her to successfully implement organization development projects in the U.S. and Latin America. She was a board member at the Knowledge Management Institute in Virginia for four years. She holds an M.S. in Engineering Management, with a concentration in Knowledge Management, from The George Washington University in Washington, DC, and a B.S. in Industrial Engineering, with a concentration in Organizations Management, from Los Andes University in Bogota, Colombia. She is also an ELI graduate from the Leadership Fairfax Institute in Fairfax, VA. She can be reached at tatiana.baquero@ppc.com.


  • Dr. Ramon Barquin

    Dr. Barquin has been the President of Barquin International, a consulting firm, since 1994. He specializes in developing information systems strategies, particularly data warehousing, customer relationship management, business intelligence and knowledge management, for public and private sector enterprises. He has consulted for the U.S. Military, many government agencies and international governments and corporations.

    Dr. Barquin is a member of the E-Gov (Electronic Government) Advisory Board, and chair of its knowledge management conference series; member of the Digital Government Institute Advisory Board; and has been the Program Chair for E-Government and Knowledge Management programs at the Brookings Institution. He was also the co-founder and first president of The Data Warehousing Institute, and president of the Computer Ethics Institute. His PhD is from MIT. Dr. Barquin can be reached at rbarquin@barquin.com.

    Editor's note: More government articles, resources, news and events are available in the BeyeNETWORK's Government Channel. Be sure to visit today!

     
