The Emperor’s complex clothes

A few weeks back, I was lucky enough to have That Damn Canadian(tm) as a houseguest. Karl Fast recently finished his master’s in library and information science and this fall he’ll be starting his PhD (his research area will be information visualization). Dude!

As part of his master’s he took a class on thesaurus construction using facet analysis, in which he had to develop a small thesaurus. So while he was in Palo Alto I took the opportunity to pump him on facets, the hot IA topic these days. I make everyone sing for their supper.

While I stirred some risotto, we talked about why facets are harder than most folks think.

Since we were there in my kitchen, Karl used the example of cooking equipment. “That pot you’re fussing over: what are its characteristics?”

“Well, it’s a pan. Flat-bottomed. Non-stick. Calphalon. Metal handle.”

“So you have type, material, shape, brand….those are potential facets. What about knives?”

“What have knives got to do with it?”

“Well, a knife isn’t a pan so it might have different facets.”

“Like sharpness or length.”

“Yeah, those are potential facets for the knife that aren’t facets for the pan. So some facets would be shared by most cooking items, like material and brand, and other facets would be unique to certain items, like sharpness was to the knife.”

“I think I got it.”

“Do you? All the knives in a kitchen store are sharp, but they all have different handles. It’s an important distinguishing characteristic. So how do you distinguish between the blade of the knife and the handle?”

“I don’t know. Maybe material, but also color, edge and…hmmm. I don’t know.”

“Neither do I. It’s starting to get fuzzy here isn’t it? How *do* you describe the handle of a knife? I’m sure it can be done but it’ll require some research and analysis. And we should remember that most cookware has handles but handles aren’t always an important characteristic. It’s probably important for describing a knife, but probably not important for a blender.”

“Blenders! Shoot! What about strainers, toasters, lemon zesters…our classification needs to describe facets for those too, right?”

“Yeah.”

“This is getting a bit harder.”

“It could be worse. What if we decided to tackle not just cookware, but the whole subject of cooking?”

“Well, we’d want to hit techniques, history, recipes.. um (looking around) interior design? Counters and shelving? And ingredients.. oh my god! ingredients. Canned, fresh, dried, fruits, vegetables.. vegetables! peas, beans, root veggies…”

“And what about something like the history of cooking? Famous chefs like Julia Child? Geographic differences? How do we handle that?”

“Got me.”

“I don’t know either. Not yet. But with enough time one could develop a faceted scheme to handle all of this. That would take a lot of work.”

“I begin to see what you mean about facets. Not being simple.” (I opened the fridge for a beer at this point. God, the fridge was full of facets too; a French husband means a shelf dedicated to cheese: there was French, Italian, Spanish; goat, sheep and cow’s milk; soft and hard; herbed and plain; and what about the crème fraîche! Where the hell would that go?… My reverie was broken by Karl reaching past the cheese for an iced tea.)

“Most writers use simple examples to describe facets. Like the cheese there.”
(Simple?!?! I thought to myself.)
“This is effective at introducing the concept but there is a dark side. Simple examples mislead readers into thinking facets are simple, or worse, that they understand facets. Life in facet-land *is* relatively simple when you’re dealing with narrow subject areas or physical objects. Life is far more complicated when you expand your scope or when you start dealing with concepts (like history) instead of just physical characteristics. This is true of any sort of classification or indexing scheme, not just facets. And library and information scientists have done a lot of investigation into these things.”

“So coming up with a faceted scheme to describe cooking in general would take years of work, but even doing facets for cookware would take months.”

“Not necessarily. It depends on the scope of the project. The broader the subject area, the more work. It also depends on how exhaustive you want to be.”

“Exhaustive?”

“Detailed. Cooking would take a lot of time, probably months depending on how many people are involved and how exhaustive you want to be. Cookware would be a lot easier, but not necessarily months.”

So, mind reeling, I put facets in the back of my head. Use cautiously. In a limited way. Watch out for scope.
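
For readers who think in code, the kitchen conversation can be sketched roughly as follows. This is a minimal Python illustration, not a real thesaurus: the three items, the facet names and every value (apart from the Calphalon pan) are invented for the example, and `facet_coverage`, `shared_facets` and `item_specific_facets` are hypothetical helper names.

```python
# Hypothetical kitchen inventory. Items, facet names and values are made up
# for illustration; only the Calphalon pan comes from the conversation.
# The pan has a handle too, but this sketch records "handle" only where it
# was judged an important distinguishing characteristic (the knife).
ITEMS = {
    "saute pan": {
        "type": "pan", "material": "aluminum",
        "brand": "Calphalon", "shape": "flat-bottomed",
    },
    "chef's knife": {
        "type": "knife", "material": "steel",
        "brand": "ExampleBrand", "length": "8 in", "handle": "wood",
    },
    "blender": {
        "type": "appliance", "material": "plastic", "brand": "ExampleBrand",
    },
}


def facet_coverage(items):
    """Map each facet name to the sorted list of item names that carry it."""
    coverage = {}
    for name, facets in items.items():
        for facet in facets:
            coverage.setdefault(facet, []).append(name)
    return {facet: sorted(names) for facet, names in coverage.items()}


def shared_facets(items):
    """Facets carried by every item in the inventory."""
    coverage = facet_coverage(items)
    return sorted(f for f, names in coverage.items() if len(names) == len(items))


def item_specific_facets(items):
    """Facets carried by exactly one item."""
    coverage = facet_coverage(items)
    return sorted(f for f, names in coverage.items() if len(names) == 1)


print(shared_facets(ITEMS))         # facets common to all three items
print(item_specific_facets(ITEMS))  # facets that describe only one item
```

With three items the split is tidy: brand, material and type are shared, while handle, length and shape each belong to a single item. Add strainers, toasters and lemon zesters and the tidy split is exactly what stops being obvious.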

Some months later, I saw Adaptive Path’s article on facets, and immediately forwarded it to Karl to tease him.

“After reading this article, I’m going to put in my book how easy and fun facets are, and how everyone should do them.”

“I see you have chosen the way of pain. (I just saw Lord of the Rings again–Christopher Lee is soooo evil.)

Anyhow, this is not — IMHO — a particularly helpful article about facets.”

“What’s wrong with it? Is it the problem of scale we discussed?”

“It’s a tease. It tells you what facets are, sort of, but not really. Check closely and you’ll see there is only one paragraph that describes what facets are. One paragraph? Not enough.

It has other problems too:

1. It covers what facets are, not how to develop your own faceted classification scheme.

2. It doesn’t tell you how (a) difficult and (b) time consuming it is to design a faceted thesaurus, which is to say how expensive it is.

3. The terms “thesaurus” and “classification” are never used, but that’s what you’re building. No mention of controlled vocabularies either.

4. A faceted thesaurus is a dynamic thing and requires a lot of time and energy to *maintain*. This costs more money.

The best thing about the article is the discussion about the interface issues. It correctly points out that these issues are enormous and much harder than they first appear.

Jeff also makes the excellent point that browsers aren’t well suited to an iterative query interface, which is the direction most faceted interfaces are headed. The idea is to use a point-n-click interface and make lots of little adjustments to your query until you’ve whittled the dataset down. Each iteration involves a request to the server and the relatively slow network response time makes this problematic (information visualization, my research area starting this fall, faces a similar problem).”

“So it’s not inaccurate, its sin is that of omission? Personally, I think any kind of thesaurus or even controlled vocabulary design is incredibly difficult and time consuming.”

“There might be a few niggles, accuracy wise, but it seems basically correct.”

“So what’s the problem?”

“The article is too thin for my liking. In my view this is beyond omission. It’s like saying that the engine in your car is simply a metal container into which you inject gasoline vapor and then light it on fire. It’s far more complicated than that and anyone trying to duplicate an engine with this information is going to fail miserably.

Now the article isn’t going to cause anyone to set themselves on fire, and its purpose is not to teach anyone how to develop a faceted classification scheme. That should be clear to anyone who reads it. Nothing wrong with that.

My complaint is that there is a lot of talk about facets, but little of any substance. Most of it won’t help you build your own faceted classification scheme. It amounts to saying the grass is greener on the other (faceted) side, but fails to give you a map explaining how to get there and what obstacles you’ll face along the way. And the academic literature doesn’t help much either. It’s too dense and I can’t recommend it to the practitioner (not the stuff I’ve seen).”

“So where’s the article that will explain all of this in a language we can understand?”

“Well, I’d thought about writing it this summer but things have happened and I think I’m too busy (I’m going backpacking in the mountains for four weeks). More importantly, the answer is probably a series of articles. We need something to fill the gap between the enthusiastic but simplified articles we’ve been getting and the rigorous, dense explanations in the academic LIS literature.”

“So, B&A is waiting, Karl….”

Or Amy! or anyone!

Make your own conclusions. But I didn’t want to wait until an article was written to get the word out– it’s complex. So Karl agreed to let me put up our conversations.

Amy Warner’s talk at the summit (http://www.asis.org/Conferences/Summit2002/IA_Summit_031602.ppt) made many wonderful points about when and why to choose a given degree of controlled vocabulary. (Faceted thesauri are the Cadillacs of controlled vocabularies; see slide 6.)

She also pointed out that often a company cannot technologically support a thesaurus, and designing one would be a waste of time and money (which Karl agrees with 100%). In fact, if you are excited by Jeff’s article, definitely go through Amy’s slides. They illustrate what it takes to make a thesaurus.

Personally, while I fear faceted classification in all its majesty, I think adding limited facets to your navigation is just fine. There is nothing bad about “shop by occasion, recipient, lifestyle and shops (brand)” as seen on www.redenvelope.com. But it’s important to be careful when you open that genie’s bottle. Facets are like wishing: they may seem simple, but the consequences can be unpleasantly surprising.
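
To make "limited facets in your navigation" concrete, here is a minimal Python sketch of the point-and-click narrowing Karl describes: pick a facet value, see what is left, pick another. It is an illustration under stated assumptions, not how any real site works. The catalog is invented, only three of the "shop by" facets are modeled, and `narrow` and `facet_counts` are hypothetical names. Everything runs in memory, so it sidesteps the server round trip per click that Jeff's article worries about.

```python
from collections import Counter

# Hypothetical gift catalog. The facet names echo the "shop by" labels
# quoted above; every product and facet value is invented.
CATALOG = [
    {"name": "picnic basket", "occasion": "wedding",
     "recipient": "couple", "lifestyle": "outdoors"},
    {"name": "silver frame", "occasion": "wedding",
     "recipient": "couple", "lifestyle": "home"},
    {"name": "travel clock", "occasion": "birthday",
     "recipient": "him", "lifestyle": "travel"},
    {"name": "garden tools", "occasion": "birthday",
     "recipient": "her", "lifestyle": "outdoors"},
]


def narrow(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count the remaining items under each value of one facet."""
    return Counter(item[facet] for item in items if facet in item)


# One small adjustment at a time, as in a point-and-click interface.
step1 = narrow(CATALOG, occasion="wedding")
print([item["name"] for item in step1])
print(dict(facet_counts(step1, "lifestyle")))  # what the next click offers

step2 = narrow(step1, lifestyle="outdoors")
print([item["name"] for item in step2])
```

The filtering is the easy part, which is rather the point: deciding that occasion, recipient and lifestyle are the right facets, and keeping their values consistent across a real catalog, is where the time and money go.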

19 Comments

  1. 1
    Victor

    RE>”It depends on the scope of the project. The broader the subject area, the more work. It also depends on how exhaustive you want to be.”

    That bears repeating. Developing an exhaustive, generic set of facets is like boiling the ocean. And maybe that makes for an interesting research pursuit, but most of the time we’re designing for a particular audience who have particular goals, so we only have to boil enough water for them. When doing IA for the web we can balance our (more focused) bottom-up organization with more top-down design. I’m not scared 😉

    In defense of the Adaptive Path article, they were probably writing for their customers, not for IAs. For that audience, an engine *is* a metal box you pour gasoline into, then you push the pedal and it goes. When you need work done on your car, you hire AP, which is ultimately (presumably) the goal of the article.

    I agree that we definitely need that rigorous article for practitioners.

  2. 3
    veen

    The car engine analogy is a good one. Remember those articles a month ago about the “new” interface to the BMW I7? It’s got that knob on the center console for navigating a screen-based interface for accessing the features of the car. Most articles ridiculed the complexity of the UI. They were focused on a user experience problem, and never really got into internal combustion or the inner workings of the vehicle.

    Likewise, the article I wrote for the Adaptive Path site does the same. My intended audience was not only our customers, but the IA community at large. You all know about facets — we’ve been discussing them to death on the various community lists and Web sites. What I wanted to do was illustrate how those emerging architectures were being communicated to users through new UI conventions. That’s why we headlined it “Best Practices in Faceted Interfaces” and not “The Comprehensive Guide to Developing and Maintaining Faceted Architectures”.

    It’s a scoping issue…

  3. 4
    christina

    God yes. And I think it points very well at the heart of the problem, which is “Where is the practitioner’s article on facets?”

    Every LIS person I corner to ask about facets runs away screaming (except Karl, who was hungry). Meanwhile I see faceted classification all over the web. I think I am the person Karl worries about understanding the articles. 🙂

    Your article was terrific, but I sure can’t make facets from it. Is making a faceted thesaurus as hard as building an engine?

    I had lunch with MadonnaLisa the other day, and she’s taking a wine class and has suddenly realized that wine.com’s facets are woefully incomplete for describing wine. She no longer finds them useful. Then again, I am a wine enthusiast and I find them sufficient for browsing for shopping purposes.

    I really don’t know what to think about facets. And people are begging me to put something about them in the book, and honestly, I don’t think I want to go there. I talk about things I am not an expert in by interviewing experts, learning what they learn and then having them review what I’ve written for accuracy. But facets seem to be the one subject scaring the experts in facets. And that makes me really wonder what it all means.

  4. 5
    ML

    Great post on faceted thesaurus…glad that Karl and you had the conversation!

    Yes, the biggest thing to remember is scope. You can have a faceted thesaurus of 1000 terms or 100,000 terms and depending on your content either would be sufficient. Another thing to remember is how sophisticated your intended audience for the thesaurus is. The last thing: be realistic about time, resources, and business context. Yes, a 10,000-term thesaurus is great, but if you only have 500 items, and content growth of less than 10% per year, you’re probably better off with existing methods/thesauri.

    To add to CW’s comments above, my new awareness of wine makes me want to redo the wine.com facets. Now that I’m learning more about what I like and don’t like I would like to have facets that suit what I know I like in wine. This is my summer project as I get through my wine studies class.

    As for the book, CW, don’t do anything you don’t feel comfortable doing…a whole book on faceted thesaurus is bound to turn up, hopefully from the academic and practical point of view… 😉

  5. 6
    michael

    To get a good idea about how to create meaningful facets, I think you have to look at what the large thesauri and controlled vocabularies have done. Thesauri like the Art & Architecture Thesaurus take a faceted approach but they are developed over time in the slow moving fashion of committees and working groups. I have mused a little on the topic of how AAT and MLA use facets on iaslash in the past. I think there is value in describing things using facets, but doing so, as with any classification system that attempts to identify all ways of describing things a priori, is a huge undertaking. Huge.

  6. 7
    Karl Fast

    Let me be clear: It’s pretty obvious that Jeff’s article wasn’t intended to explain facets to the world. I made sure to mention that.

    But at the same time this is what bugged me about the piece. It says facets are great but ignores the fact that few people in the IA community understand them (even though we’ve been “discussing them to death” on SIGIA-L). Hell, most people in my thesaurus class didn’t understand them.

    So the article doesn’t solve the known problem: explaining facets and facet analysis. Now the dialogue Christina & I had doesn’t solve the problem either. Our intention was to show that facets are powerful, yes, but also a lot more difficult than anyone writing articles about them seems willing to admit.

    I’ve read articles and attended talks which have, in my opinion, done more harm than good when trying to explain facets (Jeff’s piece doesn’t do that, thank heavens; for what it does cover it’s quite good).

  7. 8
    ralph

    Since, as Karl says, the IA community is “discussing facets to death” on SIGIA-L, I think you almost have to mention them in the book, Christina. That doesn’t mean you have to explain them. I’ve been looking at books about the dreaded Flash lately (long story), and the XML section of most of them amounts to “XML is a powerful way of describing data, and Flash can integrate with it.” You could say that it’s emerging as an important but complex and poorly understood aspect of IA, and that a proper understanding of it would take an entire book, which you expect someone other-than-yourself will write.

  8. 10
    Mike Steckel

    In rereading some of the SIGIA comments on facets, I think people get confused because they don’t keep the two sides of the fence distinct: creating and indexing/accessing. When you are creating, you want to look at your content and break it down into small, mutually exclusive pieces. These pieces are generally in some sort of hierarchy with very broad categories at the top (materials, processes, equipment, etc.) down to the more specific (aluminum, cooking, spatula, etc.). You can obviously take specificity to very deep levels or very shallow levels, as you have the need or resources. The other side of the fence is indexing or accessing the material. Here you look at the specific piece of content that you are looking to classify or retrieve. You assemble the pieces from the thesaurus to create a complete description of the object. You don’t put an item into the thesaurus with facets, you start with the object and pull facets from the thesaurus to describe it.

    The idea originally occurred to Ranganathan while watching someone use an erector set. The makers had decided what the pieces should be, but the user was free to assemble them as necessary to create a representation of the object he or she meant to make.

  9. 11
    Dan

    I first got a taste of facets on SIGIA-L. Appetite sufficiently whetted, I attended Louise Gruenberg’s session at the IA Summit. I’ve been looking for good information on the subject since. Emphasis on good. Christina’s conversation with Karl helped clarify my understanding, but did nothing for my appetite.

    Karl expressed concern about the bad information out there. Somehow, I stumbled upon this article, which qualifies. While the definition and discussion around facets seems to hold with my (limited) understanding, I do not think the example they provide for business use meets the definition of faceted classification.

    As for the Damn Book, perhaps you can provide examples of good and bad, but offer a reading list for the “how.”

  10. 12
    cookware

    Very interesting post! I’m glad he came over to your kitchen and had that chat with ya. More important than that, thanks for posting it here so we can all learn from it!

  11. 13
    getluky

    “The Emperor’s Complex Clothes”

    “My complaint is that there is a lot of talk about facets, but little of any substance. Most of it won’t help you build your own faceted classification scheme. It amounts to saying the grass is greener on the other (faceted) side, but fails to give you…

  12. 14
    getluky

    “The Emperor’s Complex Clothes”

    I’m updating this original post, because I found that this quote was accidentally misattributed to myself. I was primarily interested in faceted classification for complex access control systems, and I agree that there is a total lack of actual open so…

Comments are closed.