Originally published October 18, 2011
It happens more often than not – at the first mention of the word “taxonomy,” eyes glaze over and those seated in the less conspicuous parts of the room start to exit. Why? Does it have to do with any term having Greek roots, or just with one that elicits memories of long hours attempting to memorize a lengthy classification schema of living things during high school biology class? Taxonomy often has that effect on us, and yet it is both a truly simple concept and an essential one for the effective operation of any enterprise in the information age. Without a method of arrangement, a logical classification of items as we collect or store them, it becomes very difficult to find anything at a later date.
That is the key to taxonomy: It enables an organization to successfully search for specific objects that it may need to accomplish its objectives, whether it be a file, a letter, a book, an invoice, a truck or a piece of information.
Specifically relevant to an enterprise is the concept of the business taxonomy, or the art of describing and organizing information consistently using a controlled vocabulary. Similar to the folder structure you use to organize documents in your computer, a business taxonomy strives to create a common categorization structure that can be used across the enterprise. This categorization structure must be based on simple and intuitive labels so that regardless of who you are – an employee, a customer, a researcher – you can classify, maintain and search information easily and intuitively. Business taxonomies can guide the design of organizational systems, ensuring that a common vocabulary is shared by all. Business taxonomies, along with metadata sets, can be implemented in areas such as data organization, document tagging, system navigation menus and search to help improve information storage, analysis, usability and “findability.”
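To make the idea of a controlled vocabulary concrete, here is a minimal sketch in Python. The terms, labels and categories are hypothetical, invented only for illustration; a real business taxonomy would come out of the design process described later in this article.

```python
# Hypothetical controlled vocabulary: every variant term maps to one preferred label.
PREFERRED_LABEL = {
    "invoice": "Invoice",
    "bill": "Invoice",
    "letter": "Correspondence",
    "memo": "Correspondence",
}

# Hypothetical hierarchy: each preferred label sits under one parent category.
PARENT = {
    "Invoice": "Finance",
    "Correspondence": "Administration",
}


def classify(term):
    """Return the taxonomy path for a term, or None if the term is not in the vocabulary."""
    label = PREFERRED_LABEL.get(term.strip().lower())
    if label is None:
        return None
    return f"{PARENT[label]} / {label}"


print(classify("Bill"))     # same path as classify("invoice")
print(classify("Memo"))
print(classify("truck"))    # not in this small vocabulary, so None
```

Because "bill" and "invoice" resolve to the same preferred label, two people who describe the same document differently still file it, and later find it, in the same place.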
Ultimately, it is about finding the things we need. Librarians were arguably the pioneers of taxonomy. From the earliest times, they came up with ways to classify their books or papyrus documents. How else were they going to find anything at the great library in Alexandria? Any one of us familiar with card catalogs in libraries prior to their electronic automation remembers the three dimensions for search: title, author and subject. Title and author searches were always simple since they were direct, specific and factual. But a search on subject was always an adventure since there were so many nuances upon leaving a mainstream topic like China, the French Revolution or the Phoenicians. It was because books needed to be “findable” that librarians developed the Dewey Decimal System taxonomy and later the Library of Congress Classification taxonomy.
As we move on to the world of Web 2.0, specifically in the public sector, we have to contend with electronic content. The payoff for the work a government agency puts into creating and managing business taxonomies comes specifically when the users – citizens, employees, researchers – can find and use content efficiently. Searching and browsing are the two most common methods people use to find the information they need. By building intuitive business taxonomies and metadata, an agency will provide the structured classification required to enhance both the searching and browsing experience.
There are multiple ways in which business taxonomies can be leveraged to improve content “findability.” Faceted search, auto-complete, targeted content, security and personalization are some of them. These features help content managers restrict search results to certain areas of the taxonomy and present content depending on the user profiles. Additionally, these features help users discover content based on topic, location, document type or intended audience, and ultimately find the content they are looking for in a simple, consistent and intuitive way.
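As a rough illustration of how faceted search builds on taxonomy-based metadata, the sketch below filters a handful of tagged documents by facet and counts the values remaining in another facet. The documents, facet names and values are hypothetical, and a production system would rely on a search engine rather than in-memory lists.

```python
from collections import Counter

# Hypothetical documents, each tagged with metadata drawn from a taxonomy.
DOCUMENTS = [
    {"title": "Travel policy", "topic": "Finance", "doc_type": "Policy", "audience": "Employees"},
    {"title": "Grant application form", "topic": "Finance", "doc_type": "Form", "audience": "Citizens"},
    {"title": "Records retention policy", "topic": "Records", "doc_type": "Policy", "audience": "Employees"},
]


def filter_by_facets(documents, **selected):
    """Return the documents whose metadata matches every selected facet value."""
    return [
        doc for doc in documents
        if all(doc.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(documents, facet):
    """Count how many documents carry each value of one facet."""
    return Counter(doc[facet] for doc in documents if facet in doc)


policies = filter_by_facets(DOCUMENTS, doc_type="Policy")
print([doc["title"] for doc in policies])
print(facet_counts(policies, "topic"))
```

The counts are what a search page would show next to each facet value, letting a user narrow results by topic, document type or intended audience one step at a time.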
For example, a United States intelligence agency previously managed all its intranet content through a complex publishing process, leaving the content on the intranet outdated and underutilized. By designing a business taxonomy and metadata strategy, the agency was able to develop a solution that provided updated, improved site navigation and more accurate search results. SharePoint was used for metadata tagging, with library columns defined for the document, image and page libraries. These metadata fields enabled faceted searching and filtering as an additional navigation option for the agency’s intranet, significantly improving search and findability in support of its critical mission.
Another example involves a United States federal agency that protects the public from the risks of injury or death from certain types of products. The agency designed a business taxonomy and metadata strategy for its public website. This strategy has helped the agency receive, classify, and route electronic incident reports from the public about the agency’s products. The strategy includes a product taxonomy, company taxonomy, hazard taxonomy and injury taxonomy, among others. These taxonomies are now used to process thousands of reports for thousands of companies about hundreds of products, and have allowed the agency to implement faceted taxonomy and auto-complete, considerably enhancing the public’s search experience.
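Auto-complete of this kind can be sketched as a prefix match against the terms of a taxonomy. The hazard terms below are hypothetical stand-ins, not the agency’s actual vocabulary, and the matching rule (any word in a term may start with the typed text) is just one reasonable assumption.

```python
# Hypothetical hazard taxonomy terms; the agency's real vocabulary is not given in this article.
HAZARD_TERMS = ["Burn", "Choking", "Electric shock", "Fall", "Fire", "Laceration"]


def autocomplete(prefix, terms, limit=5):
    """Suggest taxonomy terms in which any word starts with the typed prefix."""
    needle = prefix.strip().lower()
    if not needle:
        return []
    matches = [
        term for term in terms
        if any(word.startswith(needle) for word in term.lower().split())
    ]
    return sorted(matches)[:limit]


print(autocomplete("f", HAZARD_TERMS))
print(autocomplete("sho", HAZARD_TERMS))
```

Suggesting only terms that exist in the taxonomy steers the public toward the same labels the agency uses to classify and route reports.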
Can you imagine the horrors of a world where you can’t find anything? One of the areas that has most clearly illustrated the need for business taxonomies is the legal process of eDiscovery. As part of “discovery” it is fairly common for attorneys to request a large amount of archival documentation, and with electronic discovery now embedded in the new Federal Rules of Civil Procedure, the stage is set for some monumental problems without a truly robust taxonomy to assist in the review process. In their landmark 2007 article titled “Information Inflation: Can the Legal System Adapt?”, George Paul and Jason Baron argue that finding anything in the billions of electronic documents involved in eDiscovery in current cases has become unmanageable. Furthermore, the authors estimate that a case with one billion email records, of which one quarter have attachments, would require 100 attorneys to work full time for 54 years in order to complete an adequate discovery process. Hence, “review” – finding something – is almost impossible in terms of time and cost, and the main reason is the lack of an adequate business taxonomy to classify the documents and find the needed content.
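The scale of that estimate is easy to check with back-of-the-envelope arithmetic. In the sketch below, the number of emails and the number of reviewers come from the figures above; the review rate and the working schedule are assumptions made here for illustration, and attachments are ignored.

```python
EMAILS = 1_000_000_000        # from the estimate cited above
REVIEWERS = 100               # from the estimate cited above
EMAILS_PER_HOUR = 50          # assumption: review rate per attorney
HOURS_PER_YEAR = 10 * 7 * 52  # assumption: 10 hours a day, 7 days a week, 52 weeks a year


def review_years(documents, reviewers, docs_per_hour, hours_per_year):
    """Years needed for a team to review every document once."""
    total_hours = documents / docs_per_hour
    return total_hours / reviewers / hours_per_year


years = review_years(EMAILS, REVIEWERS, EMAILS_PER_HOUR, HOURS_PER_YEAR)
print(round(years, 1))
```

Under these assumptions the result is a little under 55 years, in line with the figure quoted above. Changing the assumed review rate or schedule shifts the number, but not the conclusion that manual review at this scale is impractical.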
So if taxonomy is so important, how do we address it? Here are the primary steps in the design of a best-in-class business taxonomy: review relevant documentation, interview users and key stakeholders, conduct taxonomy and metadata workshops, convene focus groups to refine the taxonomy, and conduct card-sorting and usability tests.
A taxonomy and metadata workshop is the most important activity when designing a business taxonomy. These workshops, which are typically held over a two- to three-day period, convene a cross-organizational group of users with the objective of identifying a starter taxonomy based on the needs of end users and content owners. These workshops help set expectations for long-term processes and simplify content categorization requirements to the most basic level.
Creating a business taxonomy is iterative and evolutionary in nature. Beyond some of the suggestions above, it is particularly important for a government agency to consider the following seven key items regarding taxonomy development and business intelligence (BI):
There are a number of sources that can assist an agency in need of taxonomy expertise. A classic textbook on the subject is The Accidental Taxonomist, by Heather Hedden, which is the most comprehensive guide available to the art and science of building information taxonomies. Beyond that, practitioners gather every year at the Taxonomy Boot Camp of the KMWorld Conference in Washington, D.C. The authors would be delighted to provide additional guidance to anyone requiring further information.
The key to finding content is a well-defined taxonomy and the metadata that supports it. This has been true for any organization at any time in history, and it is particularly critical for any agency attempting to serve the public in the virtual age. For BI practitioners increasingly intent on extracting business intelligence from unstructured data, a robust taxonomy is an essential prerequisite.
Recent articles by Tatiana Baquero, Dr. Ramon Barquin