Blog: David Loshin

Metadata for Really Unstructured Stuff

I have been tinkering with some of the blogging tools out there (so far I like WordPress a lot). One nice aspect of the blogging framework is that it expects you to meta-tag your content, which helps with organization and presentation. That suits me, because the system does some of the work that I have always been loath to do (that is, "organizing things"). One way to do this is by categorizing your entries and adding tags. I was pondering this, thinking that it should be possible by now to use text mining tools to scan your content and pull out the "statistically improbable" phrases (as our friends at Amazon like to say) to be used as tags.

But what about non-text content? I can think of three commonly used content types that are growing in popularity yet require some extra thought for assigning meta-tags: pictures, voice recordings, and video recordings. As more of this unstructured stuff comes down the pike, we metadata folks should think hard about how to assess and capture the semantics associated with these objects for the purposes of organization.

A few years back my friend Greg Elin put together a system for selectively annotating pictures. Check out his fotonotes web site. Perhaps there is some future in this for video?
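
As a footnote to the text-tagging idea above, here is a minimal sketch of what pulling out "statistically improbable" phrases might look like. It is an illustration under stated assumptions, not Amazon's method or any particular text mining tool: the function names (`bigrams`, `suggest_tags`) are hypothetical, the "phrases" are simply two-word sequences, and the score is a crude smoothed log-ratio of how often a phrase shows up in the post versus in some background text you supply.

```python
import math
import re
from collections import Counter


def bigrams(text):
    """Lowercase the text and return its adjacent word pairs as strings."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(pair) for pair in zip(words, words[1:])]


def suggest_tags(post, background, top_n=3):
    """Rank the post's bigrams by how over-represented they are
    relative to the background text (hypothetical scoring, see lead-in)."""
    post_counts = Counter(bigrams(post))
    bg_counts = Counter(bigrams(background))
    post_total = sum(post_counts.values()) or 1
    bg_total = sum(bg_counts.values()) or 1

    def score(phrase):
        # Assumption: add-one smoothing, so phrases never seen in the
        # background get a small nonzero probability instead of zero.
        p_post = post_counts[phrase] / post_total
        p_bg = (bg_counts[phrase] + 1) / (bg_total + 1)
        return post_counts[phrase] * math.log(p_post / p_bg)

    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(post_counts, key=lambda p: (-score(p), p))
    return ranked[:top_n]
```

To try it, pass the text of an entry as `post` and a pile of your other entries as `background`; the phrases that come back are candidate tags to review, not a finished taxonomy. None of this helps with pictures, voice, or video, of course, which is the harder question.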