Scanning for Pleasure

I am thinking through Jo Guldi’s article about “critical search” and bringing in my memories from her talk here at Pitt in January titled “A Distant Reading of Property: Topic Models, Divergence, Collocation, and Other Text-Mining Strategies to Understand a Modern Intellectual Revolution in the Archives,” which dove further into her research about British Parliamentary papers and tenant issues. For my research, I am reading the newspaper Lampião da Esquina, a monthly publication in Brazil from 1978-1981 produced for and by gay people. An NGO, Grupo Dignidade, an advocacy group for LGBTQ Brazilians, scanned the individual editions of Lampião in Brazil (date unknown). I mention this to say that I do not have the physical copies of Lampião and did not scan them myself – I am working with only what I found online.

The corpus consists of 35 documents and, according to Voyant, has just over 1.1 million words. The scanned PDFs were run through an OCR program, which allows me to search for keywords. Similar to Guldi’s search for the term “tenant” and its usage, I am interested in how the text in Lampião utilizes “pleasure” (prazer).[1] Performing a keyword search for prazer throughout the entire corpus allows me to see how frequent the term is over the span of the newspaper’s publication, and which issues have a particularly high frequency. For example, running a keyword search in Adobe results in 304 instances of the word prazer. Those are, however, only the instances that the program can read; certainly there are usages of prazer that escape the search due to poor scanning, poor image definition, or non-standard fonts.
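To compare issues of different lengths, the raw count of prazer can be turned into a rate per 10,000 words. What follows is only a minimal sketch, assuming the OCR text of each issue has already been saved as plain text; the issue names and sample sentences are hypothetical stand-ins, and the function names are my own. It matches the exact word only, so plurals such as prazeres would need their own search.

```python
import re
import unicodedata
from typing import Tuple


def normalize(text: str) -> str:
    """Lowercase and strip accents so OCR variants with stray accents still match."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def keyword_rate(text: str, keyword: str) -> Tuple[int, float]:
    """Return (raw count, occurrences per 10,000 words) for one issue."""
    words = re.findall(r"[^\W\d_]+", normalize(text))
    target = normalize(keyword)
    hits = sum(1 for word in words if word == target)
    rate = 10_000 * hits / len(words) if words else 0.0
    return hits, rate


# Hypothetical stand-ins for the OCR text of two issues.
issues = {
    "issue_01": "O prazer de ler. Um jornal sobre prazer e política.",
    "issue_02": "Cartas dos leitores sobre o jornal.",
}

for name, text in issues.items():
    hits, rate = keyword_rate(text, "prazer")
    print(name, hits, round(rate, 1))
```

A rate rather than a raw count keeps a long issue from looking more interested in prazer simply because it has more pages.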

How can I incorporate Guldi’s “Critical Search” in my research of gay identity and publications like Lampião? Regarding “seeding,” I came to Lampião after conducting a broad internet keyword search for “gay rights Brazil” (or something similar). Several results indicated that Lampião was the first nationally distributed publication and was foundational in establishing a national movement. Indeed, many monographs on the topic also argue for Lampião’s importance. I may be able to “broadly winnow” the corpus by identifying which editions more frequently engage with the term prazer. Later, through “guided reading,” I hope to begin considering ways to make contributions to the field in general.

Conducting preliminary “Critical Searches” on prazer in Lampião has led me to further questions. Why was there such a large spike in the use of the word in late 1980? When is prazer evoked, in what context, and by whom? What do the writers mean by prazer? What about other similar words like desire (desejo), happiness (alegria), satisfaction (satisfação), or enjoyment (gozo) – why specifically prazer? How, if at all, do the publications for other contemporaneous social movements (like the Black consciousness movement, the labor/socialist movement, or environmentalists) use prazer? I anticipate that applying methods addressed in Guldi’s and others’ publications from the semester will help me identify key moments and actors for further research.
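One way to begin on the question of context is a keyword-in-context listing, which shows the words on either side of every hit. This is a rough sketch rather than a finished tool: the sample sentence is invented for illustration, and the function name and the window of three words are my own assumptions.

```python
import re


def kwic(text, keyword, width=3):
    """Return each hit of `keyword` with `width` words of context on either side."""
    words = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    lines = []
    for i, word in enumerate(words):
        if word == target:
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append((left, word, right))
    return lines


# Invented sample sentence, not a quotation from Lampião.
sample = "Escrevo para falar do prazer de ler este jornal."

for left, word, right in kwic(sample, "prazer"):
    print(f"{left} [{word}] {right}")
```

Running the same function for desejo, alegria, satisfação, and gozo would give side-by-side listings for comparing how each word is used.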

[1] In November of 1978 Lampião da Esquina (Lamp on the Street Corner) introduced a new subtitle – “Lampião discusses the only topic still taboo in Brazil: pleasure.”

Future Historians’ Data…

For this post, I would like to focus on Mary Gray’s video on the hidden cost of ghost work in algorithms. As she points out, even with machine learning techniques, there are still humans who perform the new work upon which artificial intelligence algorithms rely. This takes on the form of data entry, data labeling, and content review. Gray’s main topic is what she labels the “human-in-the-loop” services that require humans to work on the back end of the algorithms to ensure that they run smoothly.

She goes on to describe how the process works: “requesters” on the left interact with the “platform” through the internet, and any number of human workers with accounts on the platform then supply the labor required to complete the initial request. It is through this process, Gray argues, that workers are devalued and isolated due to an over-reliance on code.

Around the 6-minute mark, Gray introduces an “online-to-online” process in which companies access data online and contextualize it with other data sets to maximize profit. Thinking about the process(es) companies go through to obtain such data and contextualize it led me to wonder how future historians will grapple with this same data.

For future historians working on economic or labor issues of the early 21st century, what digital sources and data might they discover in their research? How will this information be archived, organized, and preserved over time? How will they be able to link the human experience (whether as worker or consumer) with the different processes that Gray addresses in her lecture?

In Lara Putnam’s piece that we read earlier this semester, she explored the possible shadows that digitized sources cast over certain historical subjects. What can we say about the current digital processes that inherently cast shadows over the human laborers? Like historians now, perhaps future historians will get a glimpse into the conditions through ethnographies, personal testimonies, and second-hand accounts. Or perhaps the data collected, analyzed, and contextualized will be stored in a way that it will be accessible decades and centuries from now.

Mapping Neighborhoods

Hello All,

I was unable to figure out the process for uploading data (even the sample data) to the World Historical Gazetteer site. This is due to human error and ignorance on my part, and I look forward to learning more about how to successfully do it. Because of the error messages I received I was unable to browse the contents of the data or practice reconciling them in the system.

I had more success with Recogito.

For my research, I am reading scanned PDFs of newspapers written for and by gay people in Brazil during the military dictatorship of the 1970s and 1980s. I am interested in the “letters to the editor” section of the newspaper, in which readers wrote in to the editors expressing their compliments, contempt, or concerns for the newspaper. Unfortunately, Recogito does not accept PDFs as a file type to upload. I first tried to convert the PDF to a txt file, but the result was a useless mess of letters and special characters. Instead, I created my own dataset in Excel. I recorded the Issue Number, Title (of the letter), the Name provided, and the Location(s) listed.

In all, I recorded over 150 submissions, resulting in more than 46 unique locations. Some letter-writers identified only their city, others provided both city and state, while yet others named their neighborhood as their location. I kept the original location indicators as specified by the writers in the sources because how people define where they are from speaks to their identities. For example, many letter-writers were from Rio de Janeiro (the city), some were from other cities within the state of Rio de Janeiro (Belford Roxo, RJ, for example), and others identified the neighborhood (Copacabana) within the city of Rio de Janeiro. Certain neighborhoods carry socioeconomic, lifestyle, and political associations, and it is important to capture such information in the mapping and gazetteer-ing process.

I converted the Excel file to a CSV and successfully uploaded it to Recogito. It seems that because my data is in the form of a table, Recogito will only allow me to select rows in their entirety, and not specific words or the text itself. Although I am able to assign more than one place, person, or event to the row, I am not able to distinguish between the three categories within a single row.
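Before uploading, the CSV can also be checked with a few lines of code. The sketch below assumes a column named “Location” alongside the other fields I recorded; the four rows are hypothetical placeholders rather than real letters. It counts letters per location while keeping each location exactly as the writer gave it, so a neighborhood is never folded into its city.

```python
import csv
import io
from collections import Counter

# Hypothetical rows; the column names follow the fields described above.
SAMPLE = """Issue Number,Title,Name,Location
1,Carta A,Leitor 1,Rio de Janeiro
2,Carta B,Leitor 2,"Niterói, RJ"
2,Carta C,Leitor 3,Copacabana
3,Carta D,Leitor 4,Rio de Janeiro
"""


def count_locations(csv_text):
    """Count letters per location, keeping each location as the writer wrote it."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["Location"].strip() for row in reader)


counts = count_locations(SAMPLE)
print(len(counts), "unique locations")
for place, n in counts.most_common():
    print(place, n)
```

With a real file, `io.StringIO(csv_text)` would be replaced by an opened file object, preferably opened with `encoding="utf-8"` so that accented place names survive.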

While playing around with both World Historical Gazetteer and Recogito, I came across the same issue – scope. As mentioned above, several Letters to the Editor identify their place as a neighborhood, beach, or university. Neither the WHG nor Recogito was able to capture this information. Considering how human experiences and historical happenstances define a “place,” it is important that mapping programs reflect how people identify their places. In times of oppression, people may choose to identify themselves with specific, confined, and inconspicuous places (theaters, back alleys, or bars). Neighborhoods, I argue, take on important social meanings and facilitate community identity. How have other historians/anthropologists mapped specific neighborhoods and other important yet clandestine places? I will have to contend with this as I move forward with my research.

Academic Journals’ Word Networks

Hello All,

I originally struggled to get Web of Science to give me articles related to my interests. Entering terms like “Brazilian history” produced over 5,500 hits with articles concerning topics like the prevalence of syphilis in female prisons, or the spatial niche modelling of five endemic cacti from the Brazilian Caatinga. Instead of “topic” searches, then, I chose “publication name” and limited my results to those of the top academic journals in my field. I was interested in seeing if there were notable differences between the journals’ word networks. For this post, I will present five related journals: Hispanic American Historical Review (HAHR), Journal of Latin American Studies (JLAS), Latin American Research Review (LARR), Revista de Indias, and Luso-Brazilian Review (LBR).

First is the Latin American Research Review (3,037 search results; 16,414 terms with 318 meeting the threshold = 1.94%):

“The Latin American Research Review (LARR) publishes original research and review essays on Latin America, the Caribbean, and Latina/Latino studies. LARR covers the social sciences and the humanities, including the fields of anthropology, economics, history, literature and cultural studies, political science, and sociology. The journal reviews and publishes papers in English, Spanish, and Portuguese. All papers, except for book and documentary film review essays, are subject to double-blind peer review. LARR, the academic journal of the Latin American Studies Association, has been in continuous publication since 1965.”

The first time I ran the program with LARR’s articles, the network was dominated by the word “America.”

I found this to be unhelpful because the journal was already limiting its publications to the Americas, and so I could assume it was the default common denominator in all the articles. I wanted to know what other words would dominate if I removed “America.” I ran the program again and removed some of the more frequent words including “america,” “forward,” “editor,” and “vol.” These last three words, I assumed, were related to the journal’s text format and did not contribute to the articles’ themes or contents.

By removing “America,” I was able to more easily see the relationships between other key words.

In general, the networks remained similar with only slight changes. Main nodes such as “culture” and “world” changed clusters. Likewise, “violence” was originally grouped with “revolution,” but after I removed “America,” “violence” moved to the cluster that included “women.” What in the program’s algorithm would shift these words and clusters? These two images may lead to different assumptions/conclusions about the journal’s subject matter.

Next is the Hispanic American Historical Review (2,692 search results; 6,680 terms with 103 meeting the threshold = 1.5%):

“Published in cooperation with the Conference on Latin American History of the American Historical Association. Hispanic American Historical Review pioneered the study of Latin American history and culture in the United States and remains the most widely respected journal in the field. HAHR’s comprehensive book review section provides commentary, ranging from brief notices to review essays, on every facet of scholarship on Latin American history and culture.”

Why were there far fewer items for this journal according to VOSviewer? Above, LARR had over 16,000 terms to calculate while HAHR had fewer than 7,000. How does the program determine the terms it will use? As with LARR, I removed the term “america” from HAHR’s calculation. How does the program differentiate between individual words like “central,” “spanish,” and “america,” and phrases like “central america,” or “spanish america?”

Another journal I researched was the Revista de Indias (1,452 search results; 9,580 terms with 202 meeting the threshold = 2.1%):

“Since 1940, Revista de Indias is a well-known forum for debates in the History of America targeted to specialized readers. It publishes original articles aimed at improving knowledge, encouraging scientific debates among researchers, and promoting the development and diffusion of state-of-the-art investigation in the field of the History of America. The contents are open to different topics and study areas such as social, cultural, political and economical, encompassing from the Pre-Hispanic world to the present Ibero-American issues. The Journal publishes articles in Spanish, English and Portuguese. Besides the regular issues, one monographical issue is published every year.”

You can see that this journal has several publications focused on Cuba and Peru with other locations like Argentina, Puerto Rico, New Spain, and Quito on the peripheries.

The journal Luso-Brazilian Review is a smaller, more specific journal (534 search results; 2,733 terms with 17 meeting the threshold = 0.6%):

“Luso-Brazilian Review publishes interdisciplinary scholarship on Portuguese, Brazilian, and Lusophone African cultures, with special emphasis on scholarly works in literature, history, and the social sciences. Each issue of the Luso-Brazilian Review includes articles and book reviews, which may be written in either English or Portuguese.”

I’m unsure why the spacing on the right side is so wide. If you were to zoom in on the blue cluster it reads from left to right, “study, time, Portugal,” and the red, “identity, history, politic, and Brazil.” Oddly enough, the overlapping green cluster on the far right consists of two nodes, “Assis” and “Machado.” Joaquim Maria Machado de Assis is one individual, an author from the 19th century who is often referred to as “Machado de Assis” – why would the program split his name? As a smaller journal with a more specific focus, it makes sense that there are fewer search results, fewer terms, and fewer still that reached the threshold. LBR has the lowest percentage of terms that met the threshold at only 0.6%.

Lastly, I investigated the Journal of Latin American Studies (3,733 search results; 16,217 terms with 330 meeting the threshold = 2%):

“Journal of Latin American Studies presents recent research in the field of Latin American studies in development studies, economics, geography, history, politics and international relations, public policy, sociology and social anthropology. Regular features include articles on contemporary themes, short thematic commentaries on key issues, and an extensive section of book reviews.”

You can see that, although the journal advertises a plethora of fields of study, History sits in the center. JLAS is similar in size and terms to LARR (the first journal above). “History,” “revolution,” “Peru,” “violence,” and “Cuba” are some terms that stand out to me. It would be interesting to speculate as to why that is… it could relate to the common scholarly interests of researchers, or their training, or the availability of funds to study these terms, or accessibility of archives, or the sexiness of the topic and location… Comparing the two journals may yield interesting findings about the field and its publications.

I was also interested in the JLAS’s change over time.

So here we have the same map as above, but viewed through the “Overlay Visualization,” which, according to the manual, indicates impact factor. You can see the shift from “economy” on the right, to “revolution” in the center, to “memory” on the left. What I don’t understand about this map’s key of 2004 – 2010 is whether, for example, “economy” was at its peak impact in 2004 and slowly decreased in relation to the other terms (and that’s why it is purple), or whether “economy” remained impactful through 2010 and was joined by other terms.

Overall, I found this exercise entertaining. I was able to see the frequency of certain terms and their relationship with other terms, while comparing the different journals. I find that the visual representation of each journal corresponds with its description. The differences in subject matter among the journals may be implicitly understood by those in the field. The networks provide a visualization of those differences.
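The threshold percentages quoted for each journal are simply the number of terms meeting the threshold divided by the total number of terms. A short sketch, using only the figures reported in this post (the function and variable names are my own), recomputes them and sorts the journals from the lowest share to the highest:

```python
# (total terms, terms meeting the threshold), as reported for each journal in this post
journals = {
    "LARR": (16414, 318),
    "HAHR": (6680, 103),
    "Revista de Indias": (9580, 202),
    "LBR": (2733, 17),
    "JLAS": (16217, 330),
}


def threshold_share(total_terms, terms_kept):
    """Percentage of a journal's terms that met the threshold."""
    return 100 * terms_kept / total_terms


shares = {name: threshold_share(*counts) for name, counts in journals.items()}

for name, pct in sorted(shares.items(), key=lambda item: item[1]):
    print(f"{name}: {pct:.2f}%")
```

Putting the five shares next to each other makes it easier to see that LBR is the outlier while the other four journals sit between roughly 1.5% and 2.1%.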

“[…] queries must always be in English.”

Hello All,

I’m sorry I didn’t realize we were to have published our experiences on Web of Science on here… Mine is as follows…

Green, James N. “The Emergence of the Brazilian Gay Liberation Movement, 1977-1981.” Latin American Perspectives 21, no. 1 (1994): 38-55.

What is the total number of citations?

10

What can you learn about the number of citations to this article per year since it was published?

There is a spike in 2017 for some reason… ?

What can you learn about who cites this article?  What are their disciplinary identifications?

Mostly historians or political scientists who are outside of the US. Yet all of the references are in the English language, why?

 

AUTHOR: GREEN, JN

What is the total number of publications?

38

What is the H-index?

3

What are the average citations per item?

1.24

Which of these numbers would you prefer to have used in evaluations for hiring and tenure?  Why?

I suppose you would want to highlight the times other scholars cite your works because it speaks to how others value your publications in relation to their own work. I’m not sure I see the value in the h-index.
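For reference, the h-index is the largest number h such that h of an author’s publications have each been cited at least h times. The sketch below works through that definition next to the average citations per item; the list of citation counts is hypothetical and is not the Web of Science record discussed here.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for six publications.
example = [10, 5, 3, 1, 0, 0]

print("h-index:", h_index(example))
print("average citations per item:", round(sum(example) / len(example), 2))
```

The two numbers respond differently to the same record: one heavily cited article raises the average but can move the h-index by at most one.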

Is this kind of analysis appropriate for all academic fields? Why or why not?

No. If only a few individuals are interested in the same subject matter that you study, then the raw number of possible citations will be lower. Basing success on how many times others cite your work would create a feedback loop wherein researchers publish only material they think others will want to cite. Additionally, this site only references articles, which are important to the field of history, but most academic historians are judged on their monographs.

Why does the site only include English language research? While most academic journals are monolingual, many (especially international and specialized area-study fields) are multilingual and publish articles in various languages.

Research is Hard; or “Unknown”

In the 1960s and 1970s, on the back of Cold War global politics, several South American countries experienced right-wing, military coups d’état in response to perceived internal and external threats from communism (and other “subversions”). During this tumultuous time, the state violated many individuals’ human rights because of their association with specific social groups – homosexuals, Blacks, Indians, and women, to name a few. While returning to civilian rule during the 1980s and the 1990s, ten of the twelve countries south of Panama rewrote their constitutions (the outliers being Uruguay’s 1967 and Bolivia’s 2008 constitutions). The authors of these new constitutions wrote into them protections based on social groups such as the ones above – including explicit equality between men and women. For the sake of this assignment, I asked: Since the constitutions affected women’s legal status, to what extent did women affect the creation of the constitutions?

To answer this question, I looked at how many women signed the constitutions of the twelve South American nations. The assumption is that if an individual signed a document, they would have had a hand in its creation. Finding the various countries’ constitutions was an exercise in researching documents and archives. Navigating each country’s government website afforded insight into its priorities and organization. How deeply must one go into the site to access the constitution (often in the form of a downloadable PDF)? Some had a link directly on the homepage, while for others I received a crash course in government structure. Additionally, I learned that Brazil offers its constitution in audio form, and Argentina provides translated videos in sign language.

Immediately, I encountered inconsistencies both between different countries’ documents, and within individual countries. Most frustrating when comparing different countries’ documents were their inconsistencies in listing people involved. Some did not provide a list of signatories and instead opted to sign collectively as “the Assembly” in the case of Colombia, or “the Congress” for Perú, while others used only their titles, “President, Secretaries, and Conventionals” (what is a Conventional?) like Paraguay’s constitution states. Other constitutions simply stopped after their amendments and did not provide any signatories. To know the persons behind the writing and legalization of these countries’ constitutions would require further in-depth archival research. I would need to learn who was part of the Assemblies or Congresses, or held public offices (legislative, executive, and/or judicial) during the years of ratification.

There were also discrepancies within individual countries’ documents. While signatures accompany most of the typed names listed as part of the National Assembly of Venezuela, there are a handful of missing signatures. How do I interpret this? Could I assume they took part in the debates and discussions leading up to the constitution’s creation and simply chose not to ratify it in the end? Or were they absent the entire time and thus had no input in the document? To what degree would we consider them “decision-makers?” Another interesting digitized document is Brazil’s 1988 Constitution. Click on the “Updated Text” button, and you can read the entirety of the constitution on the site. The site lists thirteen individuals (along with their titles) as signatories at the bottom, one of whom is presumably female. However, on the PDF version these same thirteen are joined by more than 542 other names. Below these additions, there are 29 more listed as “participants,” followed by 5 people grouped under “in memory.” It would take substantial time to research everyone’s contribution to creating the text.

Even for those countries that publish the signatories’ names, distinguishing between male and female is problematic. My tallies are assumptions of each person’s sex based solely on their name. Additionally, there are many names of Indigenous or African origin that are impossible for me to interpret. This problem stems from the fact that the signatories’ sex was not recorded in the text. To know the sex of everyone who signed the twelve constitutions in South America, and thus speculate about the power women had in their creation, would require considerable archival research. Furthermore, to gain a sense of change over time and consequently the impact these constitutions had on women, researchers would need to examine the role women played in the various military dictatorships and compare that to their role after the new constitutions were ratified. Until I complete such detailed research, the data for the names and sex of those signatories will remain unknown.

Country   | Year | Total Signatories | Women Signatories | Mentions of “Women” | Mentions of “Men”
Argentina | 1994 | 4                 | 1                 | 3                   | 3
Bolivia   | 2008 | unknown           | unknown           | 18                  | 10
Brazil    | 1988 | 556               | unknown           | 12                  | 10
Chile     | 1980 | 17                | 2                 | 1                   | 2
Colombia  | 1991 | unknown           | unknown           | 7                   | 2
Ecuador   | 1998 | 71                | 6                 | 14                  | 6
Guyana    | 1980 | unknown           | unknown           | 6                   | 3
Paraguay  | 1992 | unknown           | unknown           | 16                  | 8
Peru      | 1993 | unknown           | unknown           | 1                   | 1
Suriname  | 1987 | unknown           | unknown           | 2                   | 1
Uruguay   | 1967 | unknown           | unknown           | 8                   | 5
Venezuela | 1999 | 165               | unknown           | 4                   | 5
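Because so many cells are “unknown,” any summary has to skip the missing values rather than treat them as zero. The following sketch uses the figures from the table, with None standing in for “unknown” (that representation and the function name are my own choices), and computes the share of women signatories only where both numbers exist:

```python
# (total signatories, women signatories); None marks a value recorded as "unknown"
signatories = {
    "Argentina": (4, 1),
    "Bolivia": (None, None),
    "Brazil": (556, None),
    "Chile": (17, 2),
    "Colombia": (None, None),
    "Ecuador": (71, 6),
    "Guyana": (None, None),
    "Paraguay": (None, None),
    "Peru": (None, None),
    "Suriname": (None, None),
    "Uruguay": (None, None),
    "Venezuela": (165, None),
}


def women_share(total, women):
    """Percentage of signatories who are women, or None if either count is unknown."""
    if total is None or women is None or total == 0:
        return None
    return 100 * women / total


for country, (total, women) in signatories.items():
    share = women_share(total, women)
    label = "unknown" if share is None else f"{share:.1f}%"
    print(country, label)

known = [c for c, counts in signatories.items() if women_share(*counts) is not None]
print(len(known), "of", len(signatories), "countries have enough data")
```

Even where a share can be computed, it rests on my name-based guesses, so it should be read as a rough indication rather than a finding.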

People in Quantitative Sociological Studies

The most unfamiliar aspects of last week’s readings were the style, structure, and language in the articles. I am far more familiar with narrative writing and using local examples to illustrate characteristics of macro discussions and theories. This genre of writing was overtly structured, formulaic, and distant, and I therefore found the arguments more difficult to follow. Grouped together as they were, the three articles provide examples of how, over time (2006, 2012, 2017), critiques of the formulas and systems expanded as other social, cultural, and political qualities were taken into consideration.

Because quantitative sociology is not my expertise, I gave great flexibility to the terminology with which I was unfamiliar. For example, terms like “decision-making” and “development,” as well as conflated ideas of gender and sex, seemed to go undefined and/or unchallenged in the pieces we read. My assumption is that these ideas and concepts have been or are continuing to be contested in other arenas of quantitative sociological discourse. While reading, I tried to consider the audience for whom these pieces were written (policy makers at the UN, fellow practitioners, or students) and what kinds of contributions and interventions were being made: conceptual, theoretical, or methodological?

As I mentioned in class, I had the most difficulty reading these because, although they discuss societies and how to measure them, there were very few people in the articles’ discussions. I felt most comfortable reading Weber’s piece, Politics of ‘Leaving No One Behind,’ because it gives readers a glimpse of what the neo-liberal Sustainable Development Goals and other policies look like on the ground. Providing the example of the Bolivian ‘water wars’ and the “more than 70,000 people who took to the streets in protests [of the policy to privatize water],” Weber illustrates the local implementation of such broad policies.[1] What other examples could highlight the impact of programs like the United Nations Development Programme, Millennium Development Goals, and Sustainable Development Goals?

Large-scale quantitative studies like those discussed in the articles could aid other social scientists and researchers in the humanities when identifying points of interest for further study. One may ask, for example, why a certain country has a higher or lower gender empowerment and equality ratio and develop projects to address the inconsistencies.

[1] Weber, 403

Reflections on the Overview

Aronova, von Oertzen, and Sepkoski’s “Introduction” provides a comprehensive foundation from which to discuss Big Data, computers, and science. In their writing, they reflect critically on the legal, ethical, and political implications of today’s information technologies and the high value they place on data. They ask, then, “what is the source of the new value?”[1] By illustrating how the collection of large data was not invented by computers but in fact has a long epistemological history, the authors argue for a more encompassing historiography of data, science, and computers that includes the natural, social, and human sciences.

In their volume, Aronova et al. also question the emergence of a “new elite.” Safiya Noble’s article, “The Future of Knowledge in the Public,” takes issue with these new elites and argues for studying the social context of those who organize information online. As an example, Noble discusses how systems of organization inherit their creators’ assumptions, like the classification of people as “illegal aliens” in a library system. D’Ignazio & Klein also wrestle with the difficulties of classifications in their work, “What Gets Counted Counts.” In it, they examine the online classification of gender, using Facebook as a particularly strong example.

In her “Conclusion” to Programmed Inequality, Marie Hicks also investigates gender inequalities in the technology field. Using gender as a category of historical analysis, Hicks shows the absence of women who defied technological change and who shaped key technologies. The piece concludes by stating that “the process of rendering invisible certain categories of workers” aligned with the nation-building project.[2] Such unequal relationships of power are also evident in Bailey & Gossett’s chapter “Analog Girls in Digital Worlds.” While similarly concerned with gender, their chapter renders visible the intersectionalities of race, class, and sexuality within the digital humanities. Bailey’s section, especially, explores the relationship between academia and non-academic digital spaces, including the value and usefulness of both spaces.

The power imbalance in the digital sphere is also evident in the pieces by Kimberly Christen, Joanna Radin, and Roopika Risam, who all examine the legacies of colonialism and indigeneity in the digital world. Risam questions how the digital humanities have contributed to the epistemic violence of colonialism and neo-colonialism, and suggests some methods of decolonizing, for example, by focusing on the local context. When researchers take data out of context, as we see in Radin’s piece, it can lead to profound negative consequences. Additionally, Christen shows how the utopian ideal of the digital “openness” disregards the cultural, social, and historical conditions of oppression that native peoples have endured.

Lara Putnam’s “The Transnational and the Text-Searchable” provides further insight into historical and digital research praxis. Rather than focusing on data mining, Putnam highlights how historians use digital methods for “finding and finding out,”[3] and the consequences of decoupling research from the physical and geographic spaces of archives. This change has affected the “peripheral vision” and social interactions scholars experience in the physical archive. Although digitization has weakened some traditional barriers, Putnam concludes, the benefits may be canceled out by superficiality and new blind spots.

In all, the readings for the past two weeks illustrate the subjectivity of socio-technical systems, which are often touted as egalitarian, neutral, and liberating. Though the authors provide a wide range of considerations, the literature revolves around the North American and European experiences. What new insights might we encounter concerning data, the digital, gender, and race with voices trained in and hailing from South America, Africa, or South or East Asia?

[1] Aronova, et al., 4.

[2] Hicks, 238.

[3] Putnam, 378.

Jim’s Intro

Hello All,

I’m a first-year in History, and have a Master’s in Latin American Studies from UofI Urbana-Champaign. My background is in anthropology (ethnography) and I have done research in southern Brazil. Here at Pitt, however, I’ve shifted to history and archival research, and am currently working with the gay rights movement in Brazil during the military dictatorship (1964-85). My data now come from digitized newspapers, especially Lampião da Esquina – the first monthly publication with a national distribution by and for a gay audience.

From this course I am hoping to gain vocabulary and praxis. Until now, “the digital” has been a peripheral conversation in my courses, and I am excited to tackle it head on as the subject of study. I found Lara Putnam’s article from this past week most interesting and am looking forward to thinking about the implications of digitization for research and teaching in the Humanities and Social Sciences.

See you soon,

Jim