Photography Media Journal
ISSN 1918-8153



Metadata for Image Resources

by: Tomasz Neugebauer

July 2005


Literature Review

Day (1999) presents the diverse nature of research in metadata for images, along with a detailed description of Dublin Core. This diversity has resulted in proprietary image metadata formats maintained by the many communities that create them, as well as more structured but simple generic formats such as Dublin Core, complex domain-specific structures such as MARC, and components of “larger semantic frameworks” such as EAD (Day 1999, 2). His description of Dublin Core emphasizes the role of this schema as a resource discovery tool. Day describes a number of projects relevant to integrating access to distributed collections: Consortium for the Computer Interchange of Museum Information (CIMI), Museum Educational Site Licensing Project (MESL), Electronic Library Image Service for Europe (ELISE), and Arts and Humanities Data Service (AHDS) (Day 1999, 4-6). As a testament to the rapidly changing and complex nature of this field, an online search for these initiatives revealed that only AHDS[13] has not ceased operations.

Greenberg performs a quantitative analysis of metadata schemas that can be used for the various domains of image retrieval, including Dublin Core, VRA Core, REACH, and EAD (2001, 917-918). Greenberg’s analysis begins with a novel definition of metadata: “structured data about data that supports discovery, use, authentication, and administration of information objects” (2001, 917). The schemas were compared with regard to granularity and the distribution of their elements across the four identified classes: discovery, use, authentication and administration (Greenberg 2001, 918). The discovery class includes elements such as creator, title and subject and “assists in the identification and retrieval of an object” (Greenberg 2001, 919). The use class “permits the technical and intellectual exploitation of an information object” and includes elements such as format, location, property rights and terms and conditions (Greenberg 2001, 919). Authentication elements relate to legitimacy and integrity and include elements such as source, relationship and version (Greenberg 2001, 919). Administration elements assist “with the management and custodial care of an object” and include provenance and acquisition-related information, including ownership (Greenberg 2001, 919). Greenberg finds that at least 90% of the elements in Dublin Core and in REACH fall in the discovery class, and that EAD is the only scheme not centered on discovery, favoring administrative elements instead (2001, 919).
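The kind of class-distribution comparison described above can be illustrated with a minimal sketch. It is not Greenberg's procedure or data: the element-to-class table only reuses the example elements quoted in the paragraph, each element is assumed to belong to exactly one class (a simplification), and `toy_schema` is a hypothetical element set invented for the illustration.

```python
from collections import Counter

# Example elements per class, taken from the examples quoted in the text.
# Assumption: every element belongs to exactly one class.
ELEMENT_CLASS = {
    "creator": "discovery",
    "title": "discovery",
    "subject": "discovery",
    "format": "use",
    "location": "use",
    "property rights": "use",
    "terms and conditions": "use",
    "source": "authentication",
    "relationship": "authentication",
    "version": "authentication",
    "provenance": "administration",
    "acquisition": "administration",
    "ownership": "administration",
}

CLASSES = ("discovery", "use", "authentication", "administration")


def class_distribution(elements):
    """Return the percentage of a schema's elements that fall in each class."""
    if not elements:
        raise ValueError("schema has no elements")
    counts = Counter(ELEMENT_CLASS[name] for name in elements)
    total = len(elements)
    return {cls: 100.0 * counts[cls] / total for cls in CLASSES}


# Hypothetical schema, used only to exercise the function.
toy_schema = ["creator", "title", "subject", "format", "provenance"]
distribution = class_distribution(toy_schema)
```

Under these assumptions, three of the five elements in `toy_schema` are discovery elements, so the schema would count as discovery-centered in the sense used above.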

Marcia Lei Zeng applies USMARC, VRA Core and Dublin Core to three-dimensional realia and compares them in the context of a museum fashion collection in particular and 3-D object descriptions in general. Among the challenges found is the following: the values for the creator element (VRA Core and DC), and for the 1XX field (authorship) and $c of 24X in USMARC, are often uncertain, or only partial information about the manufacturer is available (e.g. deduced from the language of an inscription on the item). USMARC has fields 260 and 500 to capture manufacturer information, whereas VRA Core and DC do not (Zeng 1999, 1199). Like Vercoustre & Paradis (1999), Zeng found that title information was often difficult to determine because there was no textual description from which it could be taken, so that the title element of VRA Core and DC and USMARC field 24X end up holding a generic term (Zeng 1999, 1199). For a theoretical foundation of the difference between generic and specific levels of description see Shatford (1986).

Chen (2001) uses various image retrieval tasks (e.g., describing, searching, sorting), including content-based retrieval (visual queries), to study the search behaviour of art history students, and finds significant relationships between textual queries (e.g., number of keywords participants planned to use) and search drawings (Chen 2001, 715). Jorgensen (1999) and Choi & Rasmussen (2003) offer good reviews of the literature on information needs and image queries. Jorgensen (1999) summarizes the relationship among types of queries and their associated attributes in the domains of: art history (creator, title, size, material, type, nationality, time period, technique and genre), visual content search (color, size, location, texture, shape, orientation), topical search (time, location, event or activity), event search (time, setting, activity), affective search (emotion or atmosphere), and conceptual search (abstract, symbolic, thematic, political, social, interpretive, state) (Jorgensen 1999, 311). As is evident from this list, users search for both content-based attributes (e.g., color, size, texture, shape) and description-based attributes (e.g., time, setting, title). This suggests a need for closer collaboration between the “two distinctive research groups employing the content-based and description-based approaches, respectively” (Chu 2001, 1017).
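The query types and attributes summarized from Jorgensen above can be written down as a simple lookup table, which makes the overlap between search types easy to inspect. This is only an illustrative sketch: the dictionary restates the list in the paragraph, while the short labels and the helper `search_types_using` are hypothetical names introduced here.

```python
# Query types and their associated attributes, as listed in the text.
QUERY_ATTRIBUTES = {
    "art history": ["creator", "title", "size", "material", "type",
                    "nationality", "time period", "technique", "genre"],
    "visual content": ["color", "size", "location", "texture", "shape",
                       "orientation"],
    "topical": ["time", "location", "event or activity"],
    "event": ["time", "setting", "activity"],
    "affective": ["emotion or atmosphere"],
    "conceptual": ["abstract", "symbolic", "thematic", "political",
                   "social", "interpretive", "state"],
}


def search_types_using(attribute):
    """Return, in alphabetical order, the query types that list the attribute."""
    return sorted(
        query_type
        for query_type, attributes in QUERY_ATTRIBUTES.items()
        if attribute in attributes
    )


size_types = search_types_using("size")
time_types = search_types_using("time")
```

Here `size` turns up under both art history and visual content search, which is one small instance of the same attribute serving description-based and content-based queries.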

Kramer & Sesink (2003) report on the use of the XML-based Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) along with the Safeguarding European Photographic Images for Access Data Element Set (SEPIADES). They note that although standards such as the International Standard for Bibliographic Description (ISBD) and MARC exist, “photographic archives are often not satisfied with the level of support these standards offer” (Kramer & Sesink 2003, 135). SEPIADES is the result of collaboration among the European Commission on Preservation and Access and various archives with the aim of producing “a complete metadata element set for the description of photographic objects” (Kramer & Sesink 2003, 135). The reasoning for creating a distinct format for photographic materials includes the fact that, according to Kramer & Sesink (2003), all of the other formats are too general, and so “a lot of descriptive information about the photograph and its content cannot be specified, or need to be specified in elements that were not defined for use with photographic information” (Kramer & Sesink 2003, 137).
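For readers unfamiliar with OAI-PMH, a harvesting request is an ordinary HTTP request whose query string names a protocol verb and a metadata format. The sketch below only builds such a URL and makes no network call. The base URL and the function name are hypothetical; `ListRecords`, `metadataPrefix`, `set`, `from`, `until` and the `oai_dc` prefix come from the OAI-PMH specification. The prefix under which a repository exposes SEPIADES records is repository-specific, so it is left as a parameter rather than guessed.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           set_spec=None, from_date=None, until_date=None):
    """Build an OAI-PMH ListRecords request URL (no request is sent)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Optional selective-harvesting arguments defined by the protocol.
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    return base_url + "?" + urlencode(params)


# Hypothetical repository endpoint, for illustration only.
BASE_URL = "https://archive.example.org/oai"
request_url = build_list_records_url(BASE_URL, from_date="2003-01-01")
```

A harvester would send this request, parse the XML response, and repeat with the resumption token the protocol provides until the repository reports no further records.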

Addis et al. (2003) describe another large-scale interoperability project (ARTISTE) among four major European galleries: the Uffizi in Florence, the National Gallery and the Victoria and Albert Museum in London, and the Centre de Recherche et de Restauration des Musées de France (C2RMF) (Addis et al. 2003, 91). The advantages of using the Resource Description Framework (RDF) over older protocols such as Z39.50 [14] include the ability to specify image content metadata (as opposed to textual metadata only) using operators related to image content and methods “that result in the execution of image processing algorithms” (Addis et al. 2003, 94).
