National Communication Association Pre-Conference on Agency in Honor of Carolyn R. Miller. Las Vegas, Nevada, November 18, 2015.
Panel: Automation and Agency
Author: Candice Lanius, Rensselaer Polytechnic Institute, PhD Student in Dept of Communication & Media
Title: Finding Agency in the Data Science Machine: Understanding Emerging Climate Change Arguments from Automated Data Modeling
Abstract: Carolyn Miller’s piece “What can automation tell us about agency?” is groundbreaking for its contribution to understanding agency and responsibility when humans rely on automated systems. Miller’s insights are increasingly relevant in the context of data science, a new field that has expanded rapidly over the course of five years. In data science, particularly “big data”, much of the analytical process is beyond the conceptual power of human agents, so interpretation and processing have become automated. Miller’s conceptualization of agency as a property of the event (analytic process), not something found exclusively in human analysts, opens a door to important questions about the role of algorithms and code in constructing arguments about human behavior in conjunction with the analyst. For climate change, one of the greatest challenges facing humanity today, modeling the interaction between human behavior and the environment is foundational to understanding and intervening. I will use Miller’s contribution for understanding agency to investigate the ideology and rhetorical impact of a series of big data projects: Google’s Earth Engine, Microsoft Research’s Madingley Model, and Data.gov’s Climate data resource. Each of these projects automates its inquiries in distinct ways to address the climate change crisis, and it is important to understand what the rhetorical and political implications of automation are for the global community.
Carolyn Miller’s piece “What can automation tell us about agency?”, published in 2007, was groundbreaking for its contribution to understanding how technology complicates agency. Her piece offered a thought experiment involving an automated system that can grade college students’ speeches. By interviewing teachers and discussing their intuitions about the efficacy of such a system, Miller reveals important, modern issues with rhetoricians’ concepts of agency and responsibility. In particular, the reliance on automated systems for grading a student’s speech or writing “denaturalizes” traditional understandings of the audience as active participants in rhetorical processes (Miller 2007, 140). Audiences are important in the development of persuasive communication because they provide constraints for the rhetor and, during performance, provide active and complex feedback. Miller’s respondents were skeptical that machines could “take into account communicative complexities such as creativity, appropriateness to context, the expression of emotion, and individual and cultural differences” (2007, 140). Beyond this concern with the machine’s capacity to fully or fairly judge human communication, teachers also had “qualms about abandoning their own pedagogical responsibilities to a machine” (Miller 2007, 140). Miller set out to explain where this skepticism and disbelief come from by contrasting respondents’ views on written assessment versus spoken assessment. For one, almost all teachers thought that speaking is more complex than writing due to its performative nature (142). To further delineate the differences, Miller introduces three dimensions needed to discuss rhetorical agency: as mentioned previously, the first is performance, the second is audience, and the third is interaction. With the introduction of a machine as audience, the interactivity and audience fall apart.
Kenneth Burke’s distinction between symbolic action and rote motion is a useful way to discuss this failure of rhetoric to work. However, by reconsidering agency as a property of the event itself, not something tied to the capacity of the speaker or the effects upon the audience, rhetoric can once again work.
To introduce the complexities of this new position on agency, Miller turns to Pickering and Latour, two sociologists who include non-human and material agents in discussions of how events unfold. Everyone and everything is included as “actants” in their networks, and this explains the ways that information moves and is controlled in a complex system. Miller’s concept of kinetic energy uses the decentering of the rhetor to explain how performances can work regardless of the technical mediation of either the rhetor or the audience. She describes the kinetic (moving) energy of rhetorical performance as “generated through a process of mutual attribution between rhetor and audience” (139). Miller acknowledges that this position carries some inherent dangers from theoretical, ideological, and practical angles. For one, poststructural scholars have described agency as removed from single individuals (the “death of the author”), a position which has been in direct conflict with rhetoric’s placement of the author/orator/writer/rhetor as a creative force which crafts persuasive products for an eventual audience. In ideological terms, a change in how agency is conceptualized modifies the obligations for rhetors in negotiating economic and political power, civic participation, and the fight for social justice. Teaching, the most pragmatic of concerns, is also affected by the change in agency. What should remain the purview of humans? What can be safely done by machines? In order for agency as kinetic energy to work, the “performance requires a relationship between two entities who will attribute agency to each other” (Miller 2007, 149).
Miller concludes her piece by arguing that rhetorical agency as kinetic energy leads to more effective and morally appropriate machines because systems will be designed with a higher degree of interactivity for the audience and stronger attention to the performance of the rhetor, and will enable discussions of “how and where to draw the line– between the human and the nonhuman, between the symbolic and the material– and how to make our case to others” (Miller 2007, 152).
Miller’s insights are increasingly relevant in the context of data science, a new field that has expanded rapidly over the last decade. A good working definition of data science is “an individual, organization or application that performs statistical analysis, data mining and retrieval processes on a large amount of data to identify trends, figures and other relevant information” (Techopedia.com 2015). By looking first at the data scientist’s final product– actionable information– it is easy to see data science as a rhetorical process. Whether that information is predicting future product sales or describing patterns in human populations, data science is used in human decision-making processes and its results are employed in persuasive contexts. Data science also benefits from a rhetorical perspective to better understand the responsibilities and changing agency of a highly automated analytic environment. In data science, particularly “big data”, much of the analytical process is beyond the conceptual power of human agents, so interpretation and processing have become automated. Additionally, the dissemination of results and the interface with the public are increasingly automated on web platforms. Miller’s conceptualization of agency as a property of the event, in this case the analytic process, not something found exclusively in human analysts, opens a door to important questions within data science about algorithmic decision making and the construction of arguments.
Agency is a complicated term which points to the ability of an individual to act, and it also points towards their responsibility for that action. Rhetorical agency in the instance of data science helps expand rhetoric (the crafting of persuasive speech broadly defined) beyond traditional boundaries and allows us to ask: where does rhetoric start? In the case of the individual data scientist, their prior training and expertise guide their analysis and interaction with the machine. A broad view of rhetorical agency allows us to consider the role of machines, technology, mathematics, education, ideology, material conditions, etc. as actants that exert pressure on the data scientist’s “insights”, which are a rhetorical product meant to convince agents to act in a particular way. The data science process becomes a space for new rhetorical inquiries.
In this piece, I will focus on one small subset of contemporary data science to elucidate and explore these issues. Anthropogenic climate change is one of the greatest challenges facing humanity today, and modeling the interaction between human behavior and the environment is foundational to understanding and intervening. I will use Miller’s contribution for understanding agency as kinetic energy to investigate the ideology and rhetorical impact of a series of public climate change models: Google’s Earth Engine, Microsoft Research’s Madingley Model, and Data.gov’s Climate Resilience Toolkit. Each of the three projects/platforms automates its inquiries in distinct ways to address the climate change crisis. They also rely on computer mediation to share those models within the scientific community and with the public, and it is important to understand what the rhetorical and political implications of automation are for the global community. After offering a few comments on how public climate change models differ from Miller’s speech assessment program, I will provide a brief summary of each project, then offer comments on how agency as kinetic energy could improve each platform’s design and facilitate more persuasive public communication. Ultimately, I will explain why this is an important moral obligation if climate change is going to be handled in a responsible way.
Miller’s speech and writing assessment technology differs in several ways from the exigency posed by climate change modeling. First, climate change models are absolutely necessary to do the work, not just a matter of expediency. Human beings do not have the conceptual power to simultaneously remember and find patterns across large climate datasets. Data scientists must rely on parallel computing (e.g., Hadoop) and cutting-edge mathematical and algorithmic manipulations to process the deluge of data resources. This first observation leads to the second primary difference: modeling techniques are embedded throughout the process, producing different types of agency, situations, and roles. For the sake of clarity in this piece, these types of complex agency will be partitioned into technical agency and rhetorical agency. Technical agency points towards the work that data scientists do with automated tools in the lab while producing information. Rhetorical agency is closer to what Miller discusses in her piece, and it captures how results are shared with the public. In describing the first as technical, however, it is important to note that both types of agency are rhetorical– they involve crafting data into information that will be persuasive and useful for an eventual audience. The first situation, technical agency, works with the three components of performance, audience, and interaction, but the performance and eventual audience are minimized as the data scientist interacts with the available technologies to work out an appropriate answer. For the rhetorical agency in a data scientist’s output, performance and audience are paramount, but the roles are inverted from the positions held in Miller’s example. The software performs for the user (audience), and the user and machine interact to varying degrees depending on the affordances of the public platform.
Rhetorical agency as kinetic energy is a useful concept to apply to climate change modeling because it addresses the interaction among the analyst and machine, the scientific community, and the broader public.
Now, I will turn to the three projects mentioned earlier. After explaining what they do, I will address: 1) how the inquiry and presentation of information are automated, 2) the rhetorical and political implications of that automation, and 3) how to make climate change models more persuasive and responsive to the general public. One limitation of my analysis is a lack of access to the process which informs these public climate change models. Calling upon Latour once again, these modeling systems are examples of “black boxes”: technologies that obscure their internal functions. Despite this barrier to in-depth analysis, I will work with the hints and clues that are publicly available in documentation and tutorials to infer what analysis and automation occur, in order to address technical agency. The public, rhetorical agency is accessible to me as a member of the target audience.
Google’s Earth Engine
Google’s Earth Engine is available at: earthengine.google.org. The tutorial describes Earth Engine as a “planetary-scale platform for environmental data analysis” that “brings together over 40 years of historical and current global satellite imagery, and provides the tools and computational power necessary to analyze and mine that vast data warehouse.” It is currently being used by scientists around the world to understand and prevent massive rainforest deforestation and to study areas that are primed for wildfires, human carbon production, and changing ecosystem boundaries, e.g. the changing area between a forest and desert or receding coastlines. The tool has even been used by a medical team to assist with predicting malaria outbreaks. The platform functions by using existing Google infrastructure (parallel computing on Google’s cloud servers and petabytes of public data resources); it will look familiar to users of Google Maps or Google Earth since the cloud-free base map for those services was generated using this tool. Earth Engine can be accessed either through a GUI website, with structured preselected options that demonstrate others’ analyses, or through an integrated development environment that runs on JavaScript. This environment is available for free, non-commercial use after the user signs an evaluation agreement. The public-facing GUI allows anyone from the general public to view the results that are produced in the development environment. The development environment is composed of four panels: a code sample and documentation library, a window to run Earth Engine scripts, a compiler and settings view, and the base map to view results. To illustrate how this works, a developer in the environment can select a sub-section of light pollution data and chart how light levels change over time by comparing them to a global mean. The engine supports both simple descriptive mathematical operations of this type and more complex comparative analyses.
The Google Earth Engine has two distinct experiences. First, the development environment allows the analyst to explore and play with different mathematical operations, datasets, and graphical displays. This process of exploration is where interesting results and information are generated and then shared with the public. The importance of the machine to the generation of rhetorically persuasive results cannot be overstated, as the programmer/analyst interacts with the machine using scripts and aesthetic markers to manipulate and display results. However, a great deal of work is automated. The analyst is not required to fully understand the mathematical statistics that inform the analysis, and they never view the datasets that are fed into the engine.
In the second, public-facing view, the analyst and programmer are hidden. Like most Google services, Earth Engine is treated as a utility that is seamless in its mediation of data into information. The most obvious and yet unremarked example of this is the use of a cloud-free base map of the earth in Google Earth, Google Maps, and Earth Engine. The earth has never been without atmospheric cover of some sort, yet this view goes unremarked as users engage with Google’s utilities. The interaction and responsiveness between the user and the GUI is a form of kinetic energy, where the user is persuaded that the engine is a facsimile of reality.
The automation in Google’s Earth Engine has several rhetorical implications. As Miller mentions in her piece on automation, teachers will either produce more interactive and effective assessment technologies, or they will become accustomed to the existing systems. In the same way, users of Google’s Earth Engine have two potential routes: change the system or become accustomed to its use. The engine is exemplary for its level of interaction and performance; however, it has limitations with regard to audience. Google’s system currently does not adapt to a diverse audience, especially the non-sympathetic viewer. Many of the very things that make Google’s services the best in the world, such as seamless integration and homogeneity, actually feed into the suspicions of those who distrust climate change science as a vast conspiracy. Another rhetorical agency dilemma emerges from the presentation of results in the GUI as the exclusive output of individuals or teams of researchers OR as the exclusive output of the model. The authorial label places responsibility for errors squarely with either the researcher or the machine, not recognizing the interplay between the two. Miller’s conception of agency as kinetic energy provides a useful corrective to this problem of responsibility and allows for the expansion of good will to correct errors in climate change models.
So how could Google’s Earth Engine be a better actant (in the words of Latour) in the high-stakes rhetorical situation of convincing more of the general public that anthropogenic climate change is real? For the analyst and machine, Google’s utility appears to be above question. It fully supports the interaction, performance, and audience adaptation that Miller discussed. However, for the public-facing GUI, the level of transparency and interaction is not high enough to engage the full spectrum of American audience members.
Microsoft Research’s Madingley Model
The Madingley Model, available at www.madingleymodel.org, is a collaborative project between Microsoft Research and the United Nations and is the world’s first “General Ecosystem Model” (GEM). The GEM simulates all life on earth, including key biological processes such as photosynthesis, food-chains, and life cycles on both land and sea. The key questions for the model are “what will happen to these ecosystems in the future in response to various human pressures, and how can we mitigate or reverse any damage?” As a “virtual biosphere”, the model will predict the “effects of invasive rats on islands,” and simulate the “removal of keystone species” and “‘rewilding’ scenarios that attempt to restore ecosystem function.” The Madingley Model is available as an open source code base for the scientific community, but there is no usable GUI for external audiences at this time.
Despite there being no system to explore, a major limitation for analysis, I selected the Madingley Model for its prototypical position in the ecological community. As the collaborators explain, “what we have begun to do is build the equivalent of the climate models that are used to predict the future of the earth’s atmosphere and oceans, but for ecosystems.” While they have not reached the high levels of complexity and sophistication of Google’s Earth Engine, they have had early successes with proof-of-concept models of small-scale ecological processes that are observable without the use of automated tools. Despite these successes, they admit that the adoption of the model is controversial: “Many ecologists consider that nature is simply too complex to model in this way. Others believe that the outputs from the models are too uncertain to be the basis of important decisions, such as where to place protected areas or how best to run fisheries.” The collaborators argue, however, that even imperfect modeling is better than no modeling since the system will offer “decision-makers a tool to explore the potential effects of their decisions on the environment, in a computer, before the decisions are rolled out in the real world.” The policies are then open to modification rather than being implemented without reflection because their effects are considered unknowable.
The Madingley Model offers a clear look into the aspirations of ecological scientists who strive to provide a rhetorically persuasive tool for their own community that is overtly political. The model is not simply for generalizing and testing within a small community; eventually, the collaborators want the model to be a totalizing representation of the earth’s ecosystem to test out potential fixes and impending problems. To make this model persuasive, however, the team should hire a rhetorician who understands the complexities of rhetorical agency and decision making. Pulling from Miller’s experience with teachers and speech assessment, this model will be most effective if it provides a space for audience engagement and also recognizes the collaborative relationship between the collaborators and their technology. This is hinted at when they argue “the model may be crucial in guiding science, e.g. which processes to study, in which ecosystems, which data to gather, which experiments to do etc. The model is likely to throw up new scientific questions that were not evident before it was built.” This “guidance” is an emerging form of agency as kinetic energy with potentially far-reaching effects on the ecological sciences, and scientists would benefit from a conversation about both its positive effects and its dangerous consequences.
Data.gov’s Climate Resilience Toolkit
Data.gov’s Climate Resilience Toolkit is a government supported, open source database viewable at www.data.gov/climate. The toolkit includes datasets, use cases, and Climate Explorer, a mapping and graphing tool for public climate data. Each tool is directed at a specific problem: it compiles and analyzes data related to that problem and provides policy suggestions for improvement. For example, the “Adaptation Tool Kit: Sea-Level Rise and Coastal Land Use” has information about the coast receding along with how to facilitate adaptation of the solutions (whether regulatory, legal, or market-based, etc.) and a list of sites that have already used the solution. Along with the more technical descriptions, there is also a “top-level analysis of the economic, environmental, and social costs and benefits of each tool”. The current data and tool kits cover coasts, food, water, ecosystem health, human health, energy infrastructure, transportation, and changes in glacial ice. The Climate Explorer service has two views: a base map where layers of information are toggled on and off from the right side selection screen.
This site is interesting as a comparison to the other two because it is supported by the federal government; it is overtly political and responsive to a diverse civil population. Of interest even before opening the toolkit is the title, “resilience.” This is an intentional naming choice, along with “climate-related impacts”, that uses euphemisms for “climate change” to soften their impact. While the term climate change is used on occasion, it is less common. Even more interesting is the statement: “Decision makers across the nation are learning to use data and tools to manage their climate-related risks and opportunities” (Site Overview, toolkit.climate.gov). While the other models are framed mainly in terms of risk and damage control, this site uses terms like opportunity, growth, and benefits in relation to its policy recommendations.
The tools from the Climate Resilience Toolkit provide limited agency to both the audience and the non-human actants. The policy outputs are fully formed and supported by evidence. Climate Explorer does offer a superficial sense of interactivity to the public, since it is fully available to anyone online and is easy to use for a lay audience. The data, however, are canned, and the “explore” option is simply a toggle of existing graphics and the ability to zoom in and out on a map. This is perhaps interesting as an introduction to the issue, but it does not have the rhetorical energy and agency that would make it compelling. Climate Explorer is not responsive to the audience, and as Miller argued, being responsive is an important part of accepting the rhetorical agency of machines. This existing system is not using the potential of the tools. They are simply digital approximations of older media: everything available here could be easily viewed in a physical book.
Technical agency as kinetic energy facilitates a more accurate understanding of the scientific process. There has been a great deal of public pushback against the technical and exploratory nature of model creation. That is, members of the public are frustrated, saying ‘If climate change models are objective, then why do they keep changing?’ This frustration is due to a fundamental misunderstanding of the scientific process involved in climate change research. If one algorithmic approach doesn’t work, scientists work to invent or try a new approach until an appropriate answer emerges. The answer is a result of the interaction between the scientist and their technological tools. Rhetorical agency as kinetic energy reconciles this misunderstanding of the process by including the technology as an important actant. Information does not arise fully formed from the scientist’s analytic prowess; it is an attribute that occurs as a result of energy produced during the interaction. In this sense, it has a great deal to do with the modified understanding of ethos that Miller introduced in her piece. Rather than ethos simply being a speaker’s prior reputation, with kinetic energy, ethos becomes an attribute of trustworthiness and reputability based on the performance. In a similar way, the outputs from climate change science are not valuable as theories alone; theories gain their validity and credibility by being vetted, improved, and modeled interactively by technology and the analyst.
Rhetorical agency as kinetic energy also points towards another important improvement to communicating climate change models. Interaction is absolutely necessary. At the most basic level, the rhetor adapts to the audience and thereby creates the kinetic energy which makes the rhetoric work. In order for climate change models to be effective, they must likewise be interactive and adaptive to the user (audience).
Latour, B. (1988). Mixing humans and nonhumans together: The sociology of a door-closer. Social Problems, 35(3), 296-310.
Miller, C. R. (2007). What can automation tell us about agency? Rhetoric Society Quarterly, 37, 137-157.
Techopedia.com. (2015). Data science. Accessed: November 12, 2015.
*Objects for analysis referenced in text.