Taxonomy – it’s not just for Librarians
On the way to implementing a more effective search portal for my client, a Medical Publisher, I found myself in unfamiliar territory: the region of thesauri and taxonomies. What did these terms have to do with search? I needed to find out.
Let’s start by defining some terms used loosely in the business-related literature.
A controlled vocabulary is a list of terms that have been enumerated explicitly, have unambiguous definitions, and are not redundant.
A thesaurus is a networked collection of controlled vocabulary terms that includes associative as well as hierarchical relationships. These are often used to connect words of similar or related meanings.
A taxonomy is “a hierarchical classification scheme that has structure and content. It allows you to categorize things at the proper level of specificity and guides you from broad classes to more narrow ones.” A map of Phyla, Orders and Species of living organisms is a classic Taxonomy.
So, based on the above, I was interested in two features:
A Thesaurus feature would allow the search engine to match a search term to terms with similar meaning. The classic example in medical literature is the equivalence between the terms “MI”, “Myocardial Infarction”, and “Heart Attack”. A query on any one of these terms should return hits on all three.
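To make the idea concrete, here is a minimal sketch of thesaurus-style query expansion in Python. It is not how any particular search product works. The names SYNONYM_RINGS, expand_query and search are my own illustrations, and the single synonym group is hard-coded where a real system would load it from a maintained thesaurus.

```python
import re

# Hypothetical synonym groups ("rings"). A real system would load these
# from a maintained thesaurus instead of hard-coding them.
SYNONYM_RINGS = [
    {"mi", "myocardial infarction", "heart attack"},
]


def expand_query(term):
    """Return the term plus every synonym in its ring, lower-cased and sorted."""
    key = term.strip().lower()
    for ring in SYNONYM_RINGS:
        if key in ring:
            return sorted(ring)
    return [key]


def search(term, documents):
    """Return the documents that contain the term or any of its synonyms."""
    variants = expand_query(term)
    # Word boundaries keep the short form "MI" from matching inside "Mild".
    patterns = [
        re.compile(r"\b" + re.escape(v) + r"\b", re.IGNORECASE) for v in variants
    ]
    return [doc for doc in documents if any(p.search(doc) for p in patterns)]
```

With this in place, a search for “MI” over a handful of article titles also picks up the ones that only mention “heart attack” or “myocardial infarction”. The word-boundary match is a deliberate choice, since short abbreviations otherwise match inside unrelated words.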
A Taxonomy feature would allow articles to be matched and mapped into a hierarchical arrangement of terms based on their relevance. This mapping could be used as a way to identify like articles, and also as a way to present results to the user.
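The mapping can be sketched in a few lines of Python as well. This is an illustration under stated assumptions, not a vendor implementation: the PARENT table is a simplified, hypothetical fragment of a hierarchy (not the actual MeSH tree), and path_from_root and articles_under are names I made up for the example. Each article is assumed to have been tagged with a single taxonomy term already.

```python
# Hypothetical, simplified fragment of a taxonomy stored as child -> parent.
# It is NOT the real MeSH tree; it only illustrates the shape of the data.
PARENT = {
    "Cardiovascular Diseases": None,
    "Heart Diseases": "Cardiovascular Diseases",
    "Myocardial Infarction": "Heart Diseases",
    "Arrhythmias": "Heart Diseases",
}


def path_from_root(term):
    """Return the chain of terms from the broadest class down to `term`."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return list(reversed(path))


def articles_under(category, tagged_articles):
    """Return titles whose term is `category` or one of its descendants.

    `tagged_articles` is a list of (title, taxonomy_term) pairs.
    """
    return [
        title
        for title, term in tagged_articles
        if category in path_from_root(term)
    ]
```

Here path_from_root gives the broad-to-narrow trail that could be shown to the user alongside a result, and articles_under finds like articles by collecting everything filed at or below a given category.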
Sounds good. The other good news here is that widely used taxonomies already exist in the medical field. Generations of Librarians, Semanticists, and Information Scientists have labored over the organization of medical terms and concepts. Their results are in the public domain, and include MeSH (Medical Subject Headings) and UMLS (Unified Medical Language System).
So, a useful vendor feature would be the ability to use the existing public domain medical taxonomies to categorize and present search results.

