The readings that we have done so far have demonstrated that data is very much subject to both present and historical biases and as such cannot be taken at face value, nor even be considered reliable and ethical. Lara Putnam, in her article “The Transnational and the Text-Searchable: Digitized Sources and the Shadows They Cast,” details the advantages of digitization for researchers, who were previously constrained by archives and their accessibility. However, Putnam notes that despite the convenience that digital research offers, it is still imperative to interpret one’s findings. Putnam references E. H. Carr’s argument that historians often unintentionally select their facts, comparing historical research to a fisherman’s tackle. Putnam notes that this is exasperated by digital methods, stating that “…if the fact is out there anywhere, it will be on your hook in a nanosecond” (Putnam 390).
This is further complicated by the categorization of data by companies controlling search engines and the political implications of certain identities. The chapter “The Future of Knowledge in the Public” from Safiya Umoja Noble’s Algorithms of Oppression details the ways in which corporations and government institutions often categorize information based on white, Anglo-American male hegemony, leading to racialized categorizations in the Library of Congress, as well as the specific example of Google autocorrecting “herself” to “himself” as late as 2016 (Noble 6).
The complicated aspects of data and categorization are elaborated upon even further by Catherine D’Ignazio and Lauren Klein in “What Gets Counted Counts,” chapter three of Data Feminism. The authors note that something as simple as a user account can be anything but, as such systems, which demand that users categorize themselves, often disregard the identities of non-binary and trans people. Furthermore, D’Ignazio and Klein note that in the case of Facebook, which permits users to write their own identity, users are often categorized as male or female in order to appease potential advertisers. The authors also provide an example of a case in which data cannot be transmitted at all, and the implications of such a refusal. The O’odham Nation of the Southwestern United States was unable to provide the United States government with details about the locations of burial grounds, as such information constituted sacred knowledge. As a result, the United States destroyed many burial grounds in order to construct a border fence.
Joanna Radin, in “‘Digital Natives’: How Medical and Indigenous Histories Matter for Big Data,” demonstrates that the people of the Pima Gila River Indian Community, while they have assisted in and furnished the data for medical studies since the early twentieth century, did not retain any control over the data they provided. Kimberly Christen, however, shows a way in which this could be corrected in her article “Relationships, Not Records: Digital Heritage and the Ethics of Sharing Indigenous Knowledge Online.” She demonstrates that several Indigenous nations, while generating their own digital archives, often include specific conditions on the access and use of the data, thereby retaining control over their own information. In this light, while data and its categorization may be inherently problematic, it is possible that they may be adapted to better reflect the people who actually provide the data.
One thought on “The Dilemma of Ethics and Accuracy”
Alison Langmead
January 23, 2020 — 12:59 pm
Thanks, John. All good, all good. Let me offer you a series of comments that may all seem like pushback, but that are truly intended more as sparks for thought. That data cannot be, “considered reliable and ethical,” is a pretty strong statement, no? How might you moderate and/or contextualize that with a bit more care? This sentence of yours, “The authors note that something as simple as a user account can be anything but, as such systems, which demand that users categorize themselves, often disregard the identities of non-binary and trans people,” sparked another question for me: Is it just in the disregard for difference that the complexities of a user account lie? What about those who fully inhabit the norms represented by Facebook’s system? Are they any more or less dismissed in the process of creating the data model that is a “user?” Next…you suggest that Kimberly Christen has a way that the issues surrounding the inappropriate decontextualization of information can be “corrected.” Can this situation actually _be_ corrected? Is there a normative “correct” towards which we can all constructively move? I’m not trying to be hopelessly relativist here, but I do believe my point stands. One could argue that there is neither a “one true correct” nor a “never-available correct.” How would you consider these articles in the light of such a reality? Finally, I believe you meant “exacerbated” rather than “exasperated” in paragraph one, and I wonder if you might also consider your use of “appease” in paragraph three, as I’m not sure it’s doing the work you want that word to do.