I appreciated thinking about data from a different point of view, that of the historian. Specifically, I appreciated the way the readings pushed us to think about how things that might not normally be seen as “data” (the way a historian combs through an archive, takes notes, organizes quotes, makes analytical leaps, and writes it all up) are indeed forms of data collection, aggregation, and dissemination. This matters to me because people generally privilege certain forms of data over others. Numbers are more convincing than words because, as the thinking goes, numbers are “truth” while words are constructed. Opening up the definition of data shows that all of it is, in some sense, constructed. Making informed and careful choices about how data is collected, aggregated, and disseminated matters whether you are dealing with words or numbers.

I was recently writing up some findings from a study I did a few years back in which we looked at first-year students’ information behaviors. We showed students a variety of articles and asked them to judge how credible the information in each one was. Students were very convinced by articles that had graphs, statistics, or any other numerical underpinning, whether or not those articles were corroborated, well researched, or well written. This, to me, shows the alarming degree to which young people (well, lots of people, not just young people) trust numbers without much criticality.

Another vein I enjoyed was the discussion of different ways to envision searching environments. As someone who comes from a library background, I was always trying to get students to adopt a more open, exploratory posture toward their information consumption and their research methods. The Guldi reading was especially interesting because it offers a counter-narrative to so many of the student database searches I’ve seen over the years. Those searches generally go something like this: a student comes in with an argument already in mind and wants to find “data” (read: quotes) that support it. There is no discovery, browsing, or curiosity in that research method. Guldi’s critical search model, by contrast, has the researcher interacting with research material in a much more iterative manner. That kind of discovery is, I think, something we need to cultivate in today’s information age.

Finally, I’ve been thinking a great deal about Drucker’s concept of “capta” as I work on my final project. I am looking at the publicly available Covid-19 Open Research Dataset (CORD-19), hosted on Semantic Scholar and created by the Allen Institute for AI, and at how Kaggle has gamified the dataset in an effort to crowdsource NLP solutions for exploring such a large body of academic research. I’m trying to figure out how the data was aggregated, and I’m having a very hard time doing so. To me, cutting out the human hand in all of this makes the data seem sterile, as if it simply exists. But the critical thinker in me knows that someone’s hand was there, and I want to find out what decisions that hand made when it was putting the set together. How was the data taken, molded, and created for data scientists to work with?
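A minimal sketch of one way to start that kind of digging, assuming the release you download includes a metadata.csv with source_x and license columns (the file and column names vary by release, and the local path here is hypothetical, not part of the dataset's documentation):

```python
import pandas as pd

# Hypothetical path to a locally downloaded CORD-19 release.
meta = pd.read_csv("cord19/metadata.csv", low_memory=False)

# Which upstream sources fed the aggregation, and under what licenses?
# These counts hint at the human decisions behind what got included.
print(meta["source_x"].value_counts())
print(meta["license"].value_counts())
```

Even a rough tally like this makes visible that the set was assembled from particular sources under particular terms, rather than simply existing.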
