"Cataloging works of art with the web in mind"
by Sherman Clarke, Head of Original Cataloging, New York University
(ARLIS/NA Annual Conference, Pittsburgh, March 2000)

Good morning.

We speak of subject cataloging as the other half of the cataloging record, the complement of the description. This leads us to think that perhaps we can have a cataloging code like AACR2 for subjects, one that will inform us as catalogers and lead to consistent subject analysis and implementation, just as AACR and ISBD lead to descriptions which can be read at least by a cataloger and converted into cataloging records that can be read by our users. When cataloging-in-publication, or CIP, data appears on the title page verso in a language you don’t read well, a cataloger who knows ISBD punctuation can still determine what elements the publisher and/or national cataloging agency thought were relevant to the description. Nonetheless, the nature of subject cataloging is far more diverse and contextual than descriptive cataloging. In a word, it's subjective.

We often speak of subject cataloging as if it were monolithic. It is actually a means of providing access to different aspects of an item -- to its topical content, to its form or genre, to its iconography, to its physical characteristics. For books, doing subject analysis usually means determining the topical content though we may add subdivisions or additional headings to address form. This is not usually the case for works of art. Let us consider for example the Arnolfini wedding portrait by Jan van Eyck. What if anything is this painting about? Is it about the Arnolfinis? Not really, it IS the Arnolfinis. Certainly, much has been written about the iconography of fidelity, based on such details as the shoes and the dog, but is this painting ABOUT marriage the same way a study of marital customs in late medieval Flanders is about marriage? There are also the aspects of interior views, double portraits, oil painting, panel painting, as well as details like the mirror, the burning candle, and the rosary beads.

Does it really matter if we differentiate between these meanings of "subject"? If you used a web search engine to search on "wedding," would you be satisfied with finding the Arnolfinis along with a Martha Stewart guide to planning your wedding?

Searching for books on the web may be more difficult than searching in library catalogs, because a controlled vocabulary applied by a cataloger makes retrieval predictable. If a user can determine the terminology used for a particular topic, the results are likely to satisfy a subject search.

While clearly one is not well-served with a million hits in Altavista, are you better off with a specific response or something with a bit of serendipity? Subject searching is different from author or title searching in that usually the latter are done for known items, though one might do a title keyword search in a library catalog to find out what sort of controlled terms are used for a certain concept. In my opinion, serendipity plays a bigger role in subject searching.

When cataloging was done for card catalogs, precoordinating the terms into a string was the norm in library catalogs. The string held together the terms that belonged together and prevented our retrieving false hits that combined a mismatched topic and qualifier. If we made references, they were usually at a fairly broad level. That is, we didn’t try to get the user from "contemporary French painting" to "Painting, Modern -- 20th century -- France" though we might refer from "Modern painting" to "Painting, Modern" or from "French painting" to "Painting, French." Our strings were constructed according to examples we saw on LC copy with subdivisions in the perceived order.

Since the Airlie House conference in 1991, we catalogers primarily of book materials have been more aware of the character of subdivisions. If the order of subdivisions was to follow as much as possible the prescribed order topical-geographic-chronological-form, then we needed to determine what role a subdivision played in the subject string. When LC staff was divided into descriptive and subject catalogers, the cataloging variations between subject areas were sustainable. As catalogers there moved to whole book cataloging and more copy cataloging, the distinctions between practices have become more difficult to maintain. As LC goes, so goes the rest of the cataloging world. For example, though fine art subject headings have for many years not used the subdivision "-- History," this subdivision appears with some frequency now on LC copy, usually when it is a secondary subject heading and presumably done by non-art cataloging specialists. Of course, the interdisciplinary nature of current scholarship invites subject access across areas of cataloging practice. It seems to me that this argues pretty strongly for consistent practice across disciplines and presumably makes it easier on the user.

While it may seem pretty straightforward to determine the character of subdivisions, especially when you are using MARC subfielding, I am not sure we really thought about that as we supplied an LC subject heading. For example, we applied "Art, Modern -- 20th century -- France -- Paris" and "Art, French -- France -- Paris" to a work because those were the patterns that LCSH prescribed. At the same time, we’d use "City planning -- United States -- History -- 20th century" with the geographical and chronological subdivisions in a different sequence. With the recent proposals for revising art subject heading practice from the Library of Congress, this pair of headings may shift to "Art, French -- France -- Paris -- 20th century." We in the art cataloging world have become accustomed to doubling our subject access points, one with a geographic emphasis and the other with chronological. In their discussion of the LC proposals, the members of the ARLIS/NA Cataloging Advisory Committee expressed significant discomfort with the idea of not considering "Ancient" and "Medieval" as stylistic terms with geographic subdivision. In the art literature, the terms are used to express more than mere chronology. On one hand, the catalog is easier to use if subdivisions are applied consistently across disciplines. One can argue, on the other hand, that an art cataloger will reflect the way that an art researcher will use the terminology.

As library systems and bibliographic utilities have developed more sophisticated indexing and retrieval, the order of the subdivisions seems to carry less importance than in a card catalog or listing that relies on phrase searching from left to right. If you can ask for the keywords "art" and "Paris" and "20th century", what difference does the order make?
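To make the contrast concrete, here is a minimal sketch in Python of the two kinds of matching. It is only an illustration: the function names and the tokenizing rule are my own assumptions and do not describe how any particular library system indexes its headings.

```python
import re

def keywords(text):
    """Lowercase a subject string or query and split it into keyword tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_match(query, heading):
    """True when every query keyword occurs somewhere in the heading, in any order."""
    return keywords(query) <= keywords(heading)

def phrase_match(phrase, heading):
    """True when the heading begins with the phrase, as in a left-to-right card file."""
    return heading.lower().startswith(phrase.lower())

# Two orderings of the same subdivisions, taken from the headings discussed earlier.
chronological_first = "Art, Modern -- 20th century -- France -- Paris"
geographic_first = "Art, French -- France -- Paris -- 20th century"

query = "art Paris 20th century"
print(keyword_match(query, chronological_first))   # order does not matter
print(keyword_match(query, geographic_first))      # order does not matter
print(phrase_match("Art, Modern -- France", chronological_first))  # order matters
```

The keyword test treats both headings alike, while the left-anchored phrase test succeeds only when the searcher guesses the order the cataloger used.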

Sometimes of course, the order of the subdivisions affects the meaning. For example, “Panel painting -- Germany -- Conservation and restoration” and “Panel painting -- Conservation and restoration -- Germany” mean different things. Karen Drabenstott and others have done research on how users interpret subject strings, with wide variance found in interpretation, even though catalogers and other librarians might agree on the interpretation.

Local places are usually expressed as a noun. Paris and New York, for example. Some local place names have been used stylistically, for example, "Venetian." The expression of nation is much more ambiguous. That is, does "Italian" mean art done in Italy or in an Italian style? Currently, LC uses nouns to express places in architecture headings unless the style is outside its native place, as in English architecture in India or Chinese architecture in San Francisco. Similarly, a lot of decorative art mediums are only divided by place, without recourse to a heading like "Furniture, English."

There are two principal characteristics of the expression of subject analysis. That is, there is the vocabulary being used and the way the vocabulary is used. For many years, Toni Petersen and the staff of the Art & Architecture Thesaurus stated that their goal was building the vocabulary and not implementing it. Nonetheless, field 654 was added to the MARC format to accommodate faceted headings so that a cataloger could code the hierarchy from which each descriptor in a string is drawn. Most of the occurrences of 654 in databases like RLIN are for AAT headings. In contrast, LC vocabulary often seemed driven by the construction of the subject headings. Does this argue for a subject cataloging code?

There has been some discussion of the development of an AACR2 for subject cataloging. In my opinion, there are basic concepts of subject analysis and much has been written about them, but the actual application of a subject vocabulary is often so tied to the context that a general code like AACR2 is probably impossible. Rather, I anticipate that such guidelines as the Subject cataloging manual: subject headings for LCSH and the AAT Guide to indexing and cataloging will provide vocabulary-specific assistance to the subject cataloger. Some recent developments in metadata do include subject fields as well as format or genre fields, but there is little guidance on HOW to express the topic or genre. This has led many database builders to merely use a term in a discrete field and not to build strings. Most metadata schemes have the potential of expressing the subject vocabulary with no guarantee that the vocabulary is being used in a particular manner.

OCLC has been a leader in metadata developments in recent years, serving as the secretariat for Dublin Core. Their INTERCAT database of electronic resources is a subset of the main OCLC database that includes all items with a Uniform Resource Locator or 856 field. They are now developing FAST, which stands for Faceted Application of Subject Terminology. My ears perked up when I heard "faceted" which had been mostly heard in reference to the AAT. FAST is not however really faceting as much as it is deconstruction of subject strings. That is, the topical, geographic, chronological, and form aspects of a heading are broken into separate headings. For example, the LCSH string "Painting, Modern -- 20th century -- United States -- Exhibitions" would become a topical heading "Painting, Modern"; a chronological heading "20th century"; a geographic heading "United States"; and a form heading "Exhibitions." This deconstruction does recognize that most web-based searching is likely to be in keywords, but it also loses any precoordination of the terms. That is, if you had a record with headings for 20th-century cubist paintings from Paris and for African sculpture, you might get a false hit if you searched on 20th-century sculpture or African painting.
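The loss of precoordination can be sketched in a few lines of Python. This is a toy model under stated assumptions, not OCLC's FAST implementation: the role labels stand in for MARC subfield coding and are assigned by hand, the function names are hypothetical, and the two sample headings for the cubist-painting record are invented for illustration.

```python
from collections import defaultdict

def deconstruct(headings):
    """Break precoordinated strings into separate sets of terms, one set per role."""
    facets = defaultdict(set)
    for heading in headings:
        for role, term in heading:
            facets[role].add(term)
    return dict(facets)

def hit_deconstructed(facets, wanted):
    """True when each wanted keyword occurs in some term of the matching role."""
    return all(
        any(word.lower() in term.lower() for term in facets.get(role, ()))
        for role, word in wanted.items()
    )

def hit_precoordinated(headings, wanted):
    """True only when a single subject string satisfies the whole query."""
    return any(hit_deconstructed(deconstruct([h]), wanted) for h in headings)

# The example string from the talk, with each part labeled by role.
exhibition = [("topical", "Painting, Modern"), ("chronological", "20th century"),
              ("geographic", "United States"), ("form", "Exhibitions")]
print(deconstruct([exhibition]))

# One record with two unrelated headings (hypothetical strings).
record = [
    [("topical", "Painting, French"), ("geographic", "France"),
     ("geographic", "Paris"), ("chronological", "20th century")],
    [("topical", "Sculpture, African")],
]
query = {"topical": "sculpture", "chronological": "20th century"}
print(hit_precoordinated(record, query))            # no single string has both
print(hit_deconstructed(deconstruct(record), query))  # the false hit
```

The second test succeeds only because the chronological term from the painting heading is free to pair with the topical term from the sculpture heading once the strings are taken apart.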

Subject cataloging is also more closely tied to institutional policies and needs. Catalogers in most institutions would describe an item in similar ways. Some books might get a shortened entry, leaving out the subtitle or some of the notes. Some works of art might get a different title, depending on the source of data. But the basic descriptive information is universal. Subject analysis for all sorts of items -- books and visual resources and works of art -- is less concrete. There have been many discussions on the VRA list about corollary images, such as maps or popular culture, with many collections having some sort of catchall category. In a collection of material culture however, this would be too general. In a drawings collection database, you might not think it necessary to say "drawing" in the data record. This reliance on context, of course, becomes more problematic when the access to the database is remote or when you are searching across databases. Will the single subject heading "Drawing" get the searcher to the database and then she or he can do a more detailed search within the database? I don’t think our systems are there yet. Maybe our systems cannot or should not go there. If we want to search across databases, there will always be variety in the depth of cataloging or indexing.

Also, some vocabularies are very good in one area but are not comprehensive. Most of the people in this room would probably agree that AAT is a strong vocabulary for a database of museum objects or visual images. AAT, however, has no iconographic headings which are also critical for describing art. To return to the Arnolfini wedding portrait, one could use AAT to talk about the form and materials of the painting. One could use AAT to provide access to some of the objects in the painting like the mirror, candle, and bedstead. One could even refer to class of person (merchant) or to some of the abstract concepts like marriage or events like weddings.

Some of the vocabularies you might use in providing access to an item do not overlap -- that is, a name authority for Giovanni Arnolfini would not conflict with a topical heading. If however you want to use two vocabularies for comprehensiveness, conflict is almost certain. LCSH is an old subject language with many headings in inverted or rotated form. For example: Art, French; Mural painting and decoration, Medieval. Most new headings are entered in direct order, following current thesaural guidelines. Some older headings have been revised. AAT, as a much newer language, does not use inverted or rotated headings. I do not know of any system that logically combines multiple thesauri, though some systems may allow for different vocabularies to coexist. When Mary Woodley reported on her research to the CC:DA Metadata Task Force at January's ALA Midwinter meeting, she had not found any systems that really searched across vocabularies. What she found was gateways that provided parallel searching and retrieval across databases.

I went to the web site for ROADS, a British project which uses Dublin Core as a common language to provide access to seven resources which are themselves amalgams of data. Among the resources is ADAM, a gateway for art, architecture and media. I did a search on "Palladio" and got three hits. Two were highly relevant -- the Palladio Museum and a site devoted to Palladio's Italian villas. The third hit was an essay on Erik Gunnar Asplund, the Swedish architect. The only "palladio" I could find on the site was in the web address. For those of you who have done much searching on general web search engines, two out of three isn't too bad.

Some experiments have been done to use thesauri to inform one’s searching of a database. The Getty had a prototype called aka which used the AAT and other Getty vocabularies to filter the entry vocabulary of a search. The prototype has been discontinued and did not provide for a dynamic database. Nonetheless, one can imagine an implementation where a fully-developed thesaurus like AAT could be used by a search gateway to lead a user to helpful information even if a keyword search would not have found hits. For example, a search on "seaside cottages" might not find you anything but if AAT were assisting you, you would find cottages in its hierarchical context among houses by form. If your gateway were smart enough, it could show you which terms in the hierarchy had hits.
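One way to picture such a gateway is the following Python sketch. Everything in it is hypothetical: the little hierarchy is a toy that borrows only the terms "cottages" and "houses by form" from the example above and is not the actual AAT structure, the record identifiers are invented, and the matching is a crude substring test.

```python
# Toy thesaurus: each term points to its broader term (hypothetical, not real AAT data).
BROADER = {
    "cottages": "houses by form",
    "bungalows": "houses by form",
    "row houses": "houses by form",
    "houses by form": "houses",
}

def find_term(query):
    """Return the longest thesaurus term contained in the query, or None."""
    known = set(BROADER) | set(BROADER.values())
    candidates = [term for term in known if term in query.lower()]
    return max(candidates, key=len) if candidates else None

def broader_chain(term):
    """Walk upward from a term to the top of its hierarchy."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain

def siblings(term):
    """Other terms that share the same broader term."""
    parent = BROADER.get(term)
    return sorted(t for t, p in BROADER.items() if p == parent and t != term)

def suggest(query, index):
    """List (term, hit count) for the matched term's neighbors that have records."""
    term = find_term(query)
    if term is None:
        return []
    related = [term] + siblings(term) + broader_chain(term)
    return [(t, len(index[t])) for t in related if index.get(t)]

# Hypothetical database index: descriptor -> record identifiers.
index = {"bungalows": ["rec 12", "rec 40"], "houses": ["rec 3"]}
print(suggest("seaside cottages", index))
```

A keyword search on "seaside cottages" finds nothing in this toy index, but the hierarchy lets the gateway point the user to neighboring and broader terms that do have records.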

It is in areas like this that I think the web is most promising. Its great advantage is to provide links within and between resources. Our role as catalogers and, if you’ll pardon the expression, information managers is not really changed however. Our catalogs and databases are best served when we provide consistent and informed access. Consistent access can be obtained through controlled vocabularies and authority files, and we can go much further by being sure that this control of the words is supported by database structures that allow us to dream of linking the terms and names to their thesaural context.

sherman.clarke@nyu.edu