Interdisciplinary Approaches to Metadata – Tom Lombardi

Modern research has produced a stunning amount of data in virtually every field of study. Biologists, for example, have grappled with this volume of data for about two decades in the age of gene technologies and bioinformatics. New approaches to evaluating metadata have been developed to address the growing need to analyze such data. This talk explored the possibility of applying techniques developed for analyzing metadata in disciplines like biology to comparable metadata in art history. In particular, the presentation outlined the application of such techniques to the wealth of data stored in the Index of Christian Art (https://ica.princeton.edu/). The early results of this work suggest that such techniques could be repurposed to provide advanced search features for large metadata collections in art history.

Close observation and verbal description play an important part in the research traditions of both biology and art history. In biology, genomic research requires annotations to document metadata such as gene function, gene location, and experimental details. The Gene Ontology Consortium (http://geneontology.org/) and similar institutions develop formal ontologies to facilitate the consistent integration of annotated data across species and experiments. Similarly, the field of art history has developed its own approaches to the cultivation of metadata. The Index of Christian Art annotates religiously themed artwork from the earliest Christian era to the High Middle Ages. The Index collects a wide variety of information for each artifact in its extensive collection, including medium, object type, dimensions, location, description, notes, subject, date, style, and school. Despite these similarities of outlook and impressive collections of metadata, biologists widely recognize quantitative methods as essential for processing metadata, while many art historians do not. To understand this difference better, the presentation discussed the results of applying differential expression analysis to a subset of the data in the Index of Christian Art.

Differential expression is a conceptually straightforward way to identify the differences in gene expression between two groups of subjects. Without too much scientific jargon, let’s say that we wish to study the reaction of two different groups of mice to a specific experimental condition. The control group is a collection of untreated mice. The test group is a group of mice treated with a drug to inhibit the expression of a certain set of genes. Biotechnologies exist that let us count, to simplify just a bit, the expression levels associated with each gene in each sample from each group. Statistical tests such as Fisher’s Exact Test can then be used to identify the genes that are expressed at significantly different levels in our control and test groups. Moreover, similar tests can be used to identify which biological functions, as defined in gene ontologies, are significantly represented by these different levels of gene expression. To apply this approach to art-historical metadata, we need an experimental condition that provides a basis for comparing two distinct groups.
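As a hedged illustration of the test named above, consider a single gene whose counts are arranged in a 2×2 table. The symbols are notation introduced here, not taken from the talk: a and b are assumed to be the counts for the gene of interest in the control and test groups, c and d the counts for all other genes in those groups, and n the grand total. With the row and column totals held fixed, the probability of seeing x in the top-left cell is hypergeometric, and the two-sided p-value adds up the probabilities of every table that is no more likely than the observed one:

```latex
\[
P(x) = \frac{\binom{a+b}{x}\,\binom{c+d}{a+c-x}}{\binom{n}{a+c}},
\qquad n = a + b + c + d
\]
\[
p_{\text{two-sided}} = \sum_{x \,:\, P(x) \le P(a)} P(x),
\qquad \max(0,\; a - d) \le x \le \min(a+b,\; a+c)
\]
```

A small p-value indicates that the gene's split between control and test differs from the split of the remaining genes by more than chance alone would readily explain.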

Art historians studying the medieval Tuscan tradition have debated a case particularly suitable for testing differential expression with art-historical metadata. Many art historians have discussed the role of the Black Death in the development of iconography in the mid-fourteenth century. The size and scope of the demographic changes caused by the Black Death suggest several reasons for positing significant changes in artistic style. For example, several well-known Tuscan artists, including Ambrogio and Pietro Lorenzetti, are known to have died as a result of the Black Death. Some art historians have cited disruptions in workshop culture as a reasonable explanation for stylistic changes. The changes are also thought to have affected different regions in different ways: some areas, such as San Gimignano, were hit particularly hard, and some art historians have argued that Florence and Siena differed in their stylistic responses to the Black Death. These scholarly viewpoints suggest two ways of applying differential expression to data from the period: identifying significant differences in Tuscan art before and after 1350, and identifying significant differences between Sienese and Florentine art in the period.

The preliminary research aggregated data for just over 1,500 Tuscan paintings documented in the Index of Christian Art. For each painting, the indexed subjects were recorded. These subject counts were then aggregated into four categories: Florence pre-1350, Florence post-1350, Siena pre-1350, and Siena post-1350. Table 1 presents an example of some of these counts for subjects expressed at significantly different levels before and after 1350. Many of the results support changes documented in the literature. For example, Millard Meiss dedicates an entire chapter of his monograph, Painting in Florence and Siena after the Black Death, to the Madonna of Humility. When Fisher’s Exact Test is run against the aggregated metadata derived from the Index of Christian Art, several subjects related to the Madonna prove to be significant, including one labeled “type, humility”. The results also show a significant increase in images of Anthony Abbot after the Black Death.

Table 1: Subjects expressed at significantly different levels before and after the Black Death.

Subject	Florence pre-1350	Florence post-1350	Siena pre-1350	Siena post-1350
anthony abbot the great 1. portrait	3	35	4	27
clergy, abbot: anthony abbot the great	0	15	3	21
virgin mary and christ child	74	52	94	41
virgin mary and christ child: type, humility	2	17	0	8
virgin mary and christ child: type, suckling	4	19	1	11
clergy, deacon: lawrence of rome	2	17	2	8
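
The aggregation step that produces counts shaped like those in Table 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the workflow used in the talk: the record layout, the field names (`city`, `year`, `subjects`), the helper names, and the four sample records are all hypothetical, since the text does not describe how data was exported from the Index. The 1350 cutoff follows the 1300-1350 and 1351-1400 grouping used in the talk.

```python
from collections import Counter

# Hypothetical records, invented purely for illustration. The real export
# format and field names of the Index of Christian Art are not described
# in the text.
records = [
    {"city": "Florence", "year": 1340,
     "subjects": ["virgin mary and christ child"]},
    {"city": "Florence", "year": 1365,
     "subjects": ["virgin mary and christ child",
                  "virgin mary and christ child: type, humility"]},
    {"city": "Siena", "year": 1350,
     "subjects": ["virgin mary and christ child"]},
    {"city": "Siena", "year": 1372,
     "subjects": ["anthony abbot the great 1. portrait"]},
]


def period(year, cutoff=1350):
    """Label a year as pre- or post-cutoff (1350 itself counts as 'pre')."""
    return "pre-1350" if year <= cutoff else "post-1350"


def aggregate(records):
    """Count paintings per (subject, city-and-period group)."""
    counts = Counter()
    for rec in records:
        group = f"{rec['city']} {period(rec['year'])}"
        # set() ensures a subject is counted once per painting, even if it
        # is listed twice in the record.
        for subject in set(rec["subjects"]):
            counts[(subject, group)] += 1
    return counts


counts = aggregate(records)
for (subject, group), n in sorted(counts.items()):
    print(f"{subject}\t{group}\t{n}")
```

Keying the counter on a (subject, group) pair keeps the four categories in one structure, so a row of a table like Table 1 is simply the four group counts for one subject.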

While the preliminary results are encouraging, their interpretation requires restraint. First, unlike the biological data used in differential expression, the art-historical data considered here was neither randomly sampled nor randomly assigned to the groups of interest. Therefore, the results do not provide a basis for inferring causal explanations or the general characteristics of the population. In other words, these results should be interpreted in purely descriptive terms. The differential expression analysis of this kind of data is best thought of as a way to sort or filter advanced search results. The results in Table 1 represent the most significant subjects given a specific set of search terms. From the perspective of search, the results show the subjects that differ most between our two groups of interest: 1300-1350 and 1351-1400. Second, the results must be checked against the research record and the metadata collection standards to ensure proper interpretation. For example, given the statistical limitations cited above, it cannot be known whether the differences identified are due to bona fide artistic trends, survival bias, collection bias, or some other confounding variable. Despite these interpretive limitations, differential expression and techniques like it might enable creative new exploratory analyses of the valuable metadata art historians collect.
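
The idea of using the test as a way to sort search results can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reproduction of the analysis in the talk. The before/after comparison behind Table 1 would need the total number of paintings in each group, which the text does not give, so the sketch asks a different question that the table alone can answer: within one subject, does the pre/post split differ between Florence and Siena? The function names are hypothetical, and the test is implemented directly from the hypergeometric distribution using only the standard library.

```python
from math import comb

# Counts copied from Table 1, in the order:
# (Florence pre-1350, Florence post-1350, Siena pre-1350, Siena post-1350)
TABLE_1 = {
    "anthony abbot the great 1. portrait": (3, 35, 4, 27),
    "clergy, abbot: anthony abbot the great": (0, 15, 3, 21),
    "virgin mary and christ child": (74, 52, 94, 41),
    "virgin mary and christ child: type, humility": (2, 17, 0, 8),
    "virgin mary and christ child: type, suckling": (4, 19, 1, 11),
    "clergy, deacon: lawrence of rome": (2, 17, 2, 8),
}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one; the
    # small tolerance guards against floating-point ties.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


def rank_subjects(table):
    """Return (p_value, subject) pairs, smallest p-value first."""
    return sorted(
        (fisher_exact_two_sided(*counts), subject)
        for subject, counts in table.items()
    )


for p_value, subject in rank_subjects(TABLE_1):
    print(f"{p_value:.4f}  {subject}")
```

Sorting by p-value is what turns the test into a search aid: the subjects at the top of the list are the ones whose counts differ most between the groups being compared, and, as noted above, the ordering is descriptive rather than evidence of a cause.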