Examples of Research Design Charts for Big Social Data Projects

Below are additional examples for my 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining paper “Arguments and Interpretation in Big Social Data Analysis: A Survey of the ASONAM Community.”

B. Argumentation Theory and Research Design

The proposed framework for evaluation looks at research design, the strategic plan built by the researcher to coherently and logically organize the research process. An ideal research design provides a technical roadmap for the researcher to collect and analyze their data, and it also ensures that the research addresses the problem successfully [7]. To put this another way, the research design is the planned route to keep human errors from affecting the results. Components of most research designs include performing a literature review of others’ work on the topic, the proposal of research questions, the identification of data, a plan for collection and processing, and a method for analysis.

Big social data analysis complicates traditional notions of research design because the data exist independently of the research project and prior to the formulation of a research question. Due to this obstacle, I propose that we consider research designs as more than technical roadmaps: research designs are also arguments. By treating them as arguments, we can create standards for evaluation of components of the plan as propositions. The evaluation of research plans as arguments allows for the production of the best work possible by facilitating the explicit consideration of alternative explanations.

During the research process, there are numerous moments of interpretation where the researcher selects from a range of appropriate alternatives [8]. In these moments, framing the choice as a right or wrong answer oversimplifies the situation. The survey of ASONAM participants uncovers interpretive moments to evaluate them as arguments: while there is not a right or wrong answer, there are better answers that more completely or accurately address the problem space.

Argumentation theory provides a structure to understand how research designs function as arguments [9]. Toulmin’s model for addressing formal arguments is composed of data, warrant, claim, ground, backing, and qualifier. Claims are the final conclusions, and warrants are what link data and the ground to a claim. The ground, which can often overlap with the data, is the basis for using a specific type of data. The ground is the definitions and theory where most arguments begin. The backing is additional support for an argument that bolsters unexpected or counter-intuitive claims. Finally, qualifiers condition when the claim should be accepted (e.g. “if x, then y”) or provide the strength of belief in its veracity (e.g. “sometimes x occurs”). These constituent parts can be found in big social data research designs, and by charting the arguments using this model, they can be evaluated and improved.
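To make the charting step concrete, here is a minimal sketch in Python of how the six components could be recorded for one research design and checked for gaps. The class name `ToulminChart`, its methods, and the example entries are all hypothetical illustrations; they are not taken from the paper or from any of the charted studies.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ToulminChart:
    """One research design charted with the six components named above.

    Hypothetical helper for illustration; field order follows the flow
    from data to claim.
    """
    data: str = ""
    ground: str = ""
    warrant: str = ""
    backing: str = ""
    qualifier: str = ""
    claim: str = ""

    def missing_components(self) -> List[str]:
        """Return the names of components left blank in the chart."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        """Return a plain-text chart, one component per line."""
        lines = []
        for f in fields(self):
            value = getattr(self, f.name).strip() or "(not stated)"
            lines.append(f"{f.name.capitalize()}: {value}")
        return "\n".join(lines)


# Placeholder entries (assumptions), not a chart of any real study.
chart = ToulminChart(
    data="public posts collected from a social media platform",
    ground="a theory defining what the posts measure",
    warrant="a statistical model linking the posts to the outcome",
    qualifier="only for the period and population sampled",
    claim="the posts help predict the outcome",
)

print(chart.render())
print("Missing:", chart.missing_components())
```

A blank component is not necessarily a flaw in the design; the check only flags where the argument is left implicit so that the reader can ask whether it needs to be stated.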

Fig. 1 is an example of the argumentation framework applied to the research design of Bollen, J., Mao, H., & Zeng, X. (2011). Twitter mood predicts the stock market. Journal of Computational Science, 2(1), 1-8.

The backing for the ground, that collective mood states may impact systemic decisions, is italicized because it is a proposition that only logically supports the research plan after the technical demonstration of the model. The ground emerges from behavioral economics, borrowing strength from a well-established observational discipline. Ultimately, the technical aspects are sophisticated and performed without error, and the qualifier maintains reasonable expectations for the results. In this case, the research as an argument is very persuasive to the community: it has been implemented in numerous real-world applications and cited over 2,400 times.

Fig. 2: Chatterjee, A., & Perrizo, W. (2015, August). Classifying stocks using P-Trees and investor sentiment. In 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) (pp. 1362-1367). IEEE.

Topoi of Mathematical Statistics: The Creation of Argumentative Facts by Data Scientists

Rhetoric Society of America Biennial Meeting in Atlanta, Georgia.

Paper Title:
Topoi of Mathematical Statistics: The Creation of Argumentative Facts by Data Scientists

Presentation: RSA16 Topics of mathematical statistics, Lanius

Abstract: Data scientists, who frequently rely on massive amounts and varied types of social media data, create insight and value using complex mathematics and engineering methods. This paper introduces a new, rhetorically motivated, approach to understanding results as argumentative facts: results which are both technical and interpretive objects. Building from Leff’s (1983) work on systems of topical invention and Dreyfus & Eisenberg’s (1986) article On the Aesthetics of Mathematical Thought, I argue that the mathematical aesthetic principles of conciseness, simplicity, clarity, structure, power, cleverness, and surprise are not simply secondary considerations employed after a statistical analysis is performed. These principles are used by data scientists as procedural topics that connect their data to compelling claims (Toulmin, 1958). The system is demonstrated using contemporary examples from big social data analysis. Ultimately, a system of topics increases awareness of different governing logics so that data scientists can violate disciplinary norms during the inferential process to make stronger arguments.

Keywords: Data science, big social data, argumentation, interpretation, topical invention, mathematical aesthetics, procedural rhetoric

“It’s common sense!” The unseen role of psychological theory in big social data analysis.

Today, I had the opportunity to present a part of my dissertation research to the Department of Psychiatry at Albany Medical Center. The journal club was a good place to present to a different audience (in this case, domain experts in psychology!) and discuss ramifications and potential outcomes. I look forward to presenting a full grand round at Albany Medical Center in September.

First Place “Graduate Student Essay” in the McKinney Contest!

The winners of RPI’s writing contest were announced last night, and I placed first in the graduate student essay category, winning $300, with my essay “Finding Rhetorical Agency in the Data Science Machine: Understanding Emerging Climate Change Arguments from Automated Data Modeling.”

Last year I came in second with my essay “The Path of Least Resistance: An exploration of non-human agency in a workplace survey.”

I am noticing a trend… despite having different contest judges both years, they appear to like papers that discuss technology and agency.

Congratulations to the other winners!

Dissertation Survey: Arguments and Interpretation in Big Social Data Analysis

After passing my Prospectus Defense in December, my PhD project was suddenly real: I must now actually do primary research and write a several-hundred-page document. The best advice I have gotten so far is to take it one step at a time. So here is a progress report on Stage 1: The Survey.

I submitted my survey text to Rensselaer’s Institutional Review Board in January and received an exemption: “45CFR46.101(b)(2): Anonymous Surveys – No Risk”. Since my survey is anonymous and does not harm any of the respondents, I am cleared for action.

My target community is big social data researchers in the United States working out of academic institutions. For the first round of invitations, I have been using the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining proceedings to solicit respondents. Everyone has been very polite and interested in my work, and I have had a solid 20% response rate. My goal is to get at least 20 responses, but I will continue collecting through the end of March.

For this initial survey, I am interested in how data scientists use interpretation to complete their projects and how they communicate their results to their audience. My survey questions focus on a few key themes. First, I was interested in how respondents understood their disciplinary role and why they became interested in big social data. Next, I asked about interpretation: how they decided on research questions and generated explanations for their results. If they changed their research questions midway through the analysis, I also wanted to know what steps they took to ensure accuracy. Then, I turned to technical aspects of the process, asking what steps they took and how they handled false negatives and false positives. Finally, I asked about communicating results persuasively and to a target audience. The preliminary results look promising, and I personally find them fascinating!

In case anyone is particularly interested, here are the exact questions. The bulk of them are directed at the researcher’s specific project they submitted to the ASONAM conference.

The Hidden Anxieties of Self-Tracking

Radio Interview with Nora Young, Spark, CBC Radio

Many of us willingly collect data about ourselves through wearable trackers or apps with the hope that through measuring and charting our life, we can actually control it. But sometimes the very effort of trying to control it causes anxiety.

Researcher Candice Lanius talks about what she calls the “hidden anxieties” of the quantified self movement.

Finding Agency in the Data Science Machine: Understanding Emerging Climate Change Arguments from Automated Data Modeling

National Communication Association Pre-Conference on Agency in Honor of Carolyn R. Miller. Las Vegas, Nevada, November 18, 2015.

Panel: Automation and Agency

Author: Candice Lanius, Rensselaer Polytechnic Institute, PhD Student in Dept of Communication & Media

Title: Finding Agency in the Data Science Machine: Understanding Emerging Climate Change Arguments from Automated Data Modeling

Abstract: Carolyn Miller’s piece “What can automation tell us about agency?” is groundbreaking for its contribution to understanding agency and responsibility when humans rely on automated systems. Miller’s insights are increasingly relevant in the context of data science, a new field that has expanded rapidly over the course of five years. In data science, particularly “big data”, much of the analytical process is beyond the conceptual power of human agents, so interpretation and processing have become automated. Miller’s conceptualization of agency as a property of the event (analytic process), not something found exclusively in human analysts, opens a door to important questions about the role of algorithms and code in constructing arguments about human behavior in conjunction with the analyst. In one of the greatest challenges facing humanity today, climate change, modeling the interaction between human behavior and the environment is foundational to understanding and intervening. I will use Miller’s contribution for understanding agency to investigate the ideology and rhetorical impact of a series of big data projects: Google’s Earth Engine, Microsoft Research’s Madingley Model, and Data.gov’s Climate data resource. Each of these projects automates its inquiries in distinct ways to address the climate change crisis, and it is important to understand what the rhetorical and political implications of automation are for the global community.

[PDF Version] Finding Agency in the Data Science Machine

ACM Chapter Seminar on E-Learning and Technical Communication

Today, I joined my colleagues at Rensselaer and Dr. Debo Roy from the University of Aizu, Japan to participate in an ACM seminar on e-learning and technical communication. It was a strong and interesting mixture of technical and pedagogical discussions covering Legos in an ESL context, classroom assessment with new technologies, information design using CAD software, and usability testing of and classroom uses for 3D printing.

My talk was “Using CAD Software to Break from Photorealism in the Classroom: A Case Study of Build with Chrome and GIS Analytics.” More information is available here.

Hopefully our discussions will lead to an edited volume or special journal issue in the coming year.

The recording of my talk is available in 4 parts:
