Counterfactuals and counterfactual reasoning underpin numerous techniques for auditing and understanding artificial intelligence (AI) systems. The traditional paradigm for counterfactual reasoning in this literature is the interventional counterfactual, where hypothetical interventions are imagined and simulated. For this reason, the starting point for causal reasoning about legal protections and demographic data in AI is an imagined intervention on a legally-protected characteristic, such as ethnicity, race, gender, disability, age, etc. We ask, for example, what would have happened had your race been different? An inherent limitation of this paradigm is that some demographic interventions – like interventions on race – may not translate into the formalisms of interventional counterfactuals. In this work, we explore a new paradigm based instead on the backtracking counterfactual, where rather than imagine hypothetical interventions on legally-protected characteristics, we imagine alternate initial conditions while holding these characteristics fixed. We ask instead, what would explain a counterfactual outcome for you as you actually are or could be? This alternate framework allows us to address many of the same social concerns, but to do so while asking fundamentally different questions that do not rely on demographic interventions.
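A toy illustration of the contrast between the two paradigms (the SCM, variable names, and functional forms below are ours, chosen only for exposition, not taken from the paper):

```python
# Minimal sketch of interventional vs. backtracking counterfactuals on a toy
# structural causal model (SCM). Variable names and functional forms are
# illustrative assumptions.
def scm(u_a, u_x, u_y, a=None):
    """Toy SCM: A -> X -> Y, with A also affecting Y directly."""
    A = u_a if a is None else a       # protected characteristic
    X = 2.0 * A + u_x                 # downstream covariate
    Y = X + 0.5 * A + u_y             # outcome
    return A, X, Y

# Observed unit: factual exogenous conditions.
u_a, u_x, u_y = 1.0, 0.3, -0.1
A, X, Y = scm(u_a, u_x, u_y)

# Interventional counterfactual: "what if A had been 0?"
# Intervene on A while keeping the unit's exogenous noise fixed.
_, X_int, Y_int = scm(u_a, u_x, u_y, a=0.0)

# Backtracking counterfactual: "what alternate initial conditions would
# explain Y' = 0 for this unit, holding A fixed at its actual value?"
# Here we backtrack over u_x (a simple one-dimensional solve).
target_y = 0.0
u_x_alt = target_y - 2.5 * A - u_y    # solve 2*A + u_x + 0.5*A + u_y = target
_, X_back, Y_back = scm(u_a, u_x_alt, u_y)

print(f"factual:        A={A}, Y={Y:.2f}")
print(f"interventional: A=0, Y={Y_int:.2f}")
print(f"backtracking:   A={A}, Y={Y_back:.2f} (alternate u_x={u_x_alt:.2f})")
```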
We present a framework for large language model (LLM) based data generation with controllable causal structure. In particular, we define a procedure for turning any language model and any directed acyclic graph (DAG) into a sequence-driven structural causal model (SD-SCM). Broadly speaking, an SD-SCM is a causal model with user-defined structure and LLM-defined structural equations. We characterize how an SD-SCM allows sampling from observational, interventional, and counterfactual distributions according to the desired causal structure. We then leverage this procedure to propose a new type of benchmark for causal inference methods, generating individual-level counterfactual data without needing to manually specify functional relationships between variables. We create an example benchmark consisting of thousands of datasets, and test a suite of popular estimation methods on these datasets for average, conditional average, and individual treatment effect estimation, both with and without hidden confounding. Apart from generating data, the same procedure also allows us to test for the presence of a causal effect that might be encoded in an LLM. This procedure can underpin auditing LLMs for misinformation, discrimination, or otherwise undesirable behavior. We believe SD-SCMs can serve as a useful tool in any application that would benefit from sequential data with controllable causal structure.
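As a rough sketch of the sampling procedure (the `llm_complete` helper, prompts, and example DAG below are hypothetical placeholders, not the paper's implementation):

```python
# Minimal sketch of sampling from a sequence-driven SCM (SD-SCM): the user
# supplies a DAG, and each variable's "structural equation" is an LLM call
# conditioned on its parents' sampled values.
from graphlib import TopologicalSorter

def llm_complete(prompt: str) -> str:
    """Hypothetical stand-in for whatever text-completion interface is used."""
    raise NotImplementedError

def sample_sd_scm(dag: dict[str, list[str]],
                  interventions: dict[str, str] | None = None) -> dict[str, str]:
    """dag maps each variable to its list of parents."""
    interventions = interventions or {}
    values: dict[str, str] = {}
    # Visit variables in topological order so parents are sampled first.
    for var in TopologicalSorter(dag).static_order():
        if var in interventions:
            values[var] = interventions[var]      # do(var := value)
            continue
        context = ", ".join(f"{p} = {values[p]}" for p in dag[var]) or "no other information"
        values[var] = llm_complete(f"Given {context}, the value of {var} is")
    return values

# Observational sample vs. a sample under do(treatment := "yes").
dag = {"age": [], "treatment": ["age"], "outcome": ["age", "treatment"]}
# sample_sd_scm(dag)
# sample_sd_scm(dag, interventions={"treatment": "yes"})
```

Counterfactual sampling would additionally hold fixed the randomness used for each variable's factual completion while re-generating only the descendants of the intervened variable; the sketch omits this step.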
2023
Counterfactuals for the Future
Lucius EJ Bynum, Joshua R Loftus, and Julia Stoyanovich
Proceedings of the AAAI Conference on Artificial Intelligence, Jun 2023
Counterfactuals are often described as "retrospective," focusing on hypothetical alternatives to a realized past. This description relates to an often implicit assumption about the structure and stability of exogenous variables in the system being modeled — an assumption that is reasonable in many settings where counterfactuals are used. In this work, we consider cases where we might reasonably make a different assumption about exogenous variables; namely, that the exogenous noise terms of each unit do exhibit some unit-specific structure and/or stability. This leads us to a different use of counterfactuals — a forward-looking rather than retrospective counterfactual. We introduce "counterfactual treatment choice," a type of treatment choice problem that motivates using forward-looking counterfactuals. We then explore how mismatches between interventional and forward-looking counterfactual approaches to treatment choice, consistent with different assumptions about exogenous noise, can lead to counterintuitive results.
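A toy numerical contrast between the two choices (the structural equation and the unit's noise value below are illustrative assumptions, not taken from the paper):

```python
# Minimal sketch of "counterfactual treatment choice" on a toy SCM, assuming
# each unit's exogenous noise is stable across the decision horizon.
import numpy as np

rng = np.random.default_rng(1)

def outcome(t: int, u) -> float:
    """Toy structural equation: treatment helps on average,
    but hurts units with large positive noise."""
    return 1.0 + (2.0 - 3.0 * u) * t + u

u_population = rng.normal(0.0, 1.0, size=100_000)

# Interventional treatment choice: pick t maximizing E[Y | do(T = t)],
# averaging over the population's exogenous noise.
avg = {t: outcome(t, u_population).mean() for t in (0, 1)}
best_interventional = max(avg, key=avg.get)

# Forward-looking counterfactual choice for a specific unit whose noise we
# believe is stable (here u = 1.2): pick t maximizing that unit's Y_t(u).
u_unit = 1.2
unit = {t: outcome(t, u_unit) for t in (0, 1)}
best_counterfactual = max(unit, key=unit.get)

print("population-optimal treatment:", best_interventional)  # 1
print("unit-optimal treatment:      ", best_counterfactual)  # 0
```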
The Possibility of Fairness: Revisiting the Impossibility Theorem in Practice
Andrew Bell, Lucius EJ Bynum, Nazarii Drushchak, and 3 more authors
In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, Jun 2023
The “impossibility theorem” — which is considered foundational in algorithmic fairness literature — asserts that there must be trade-offs between common notions of fairness and performance when fitting statistical models, except in two special cases: when the prevalence of the outcome being predicted is equal across groups, or when a perfectly accurate predictor is used. However, theory does not always translate to practice. In this work, we challenge the implications of the impossibility theorem in practical settings. First, we show analytically that, by slightly relaxing the impossibility theorem (to accommodate a practitioner’s perspective of fairness), it becomes possible to identify abundant sets of models that satisfy seemingly incompatible fairness constraints. Second, we demonstrate the existence of these models through extensive experiments on five real-world datasets. We conclude by offering tools and guidance for practitioners to understand when — and to what degree — fairness along multiple criteria can be achieved. This work has an important implication for the community: achieving fairness along multiple metrics for multiple groups (and their intersections) is much more possible than was previously believed.
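One standard way to see where the tension comes from is the well-known identity linking a group's false positive rate (FPR), false negative rate (FNR), positive predictive value (PPV), and outcome prevalence p (a textbook relation, not a result from this paper):

```latex
\mathrm{FPR} \;=\; \frac{p}{1-p}\cdot\frac{1-\mathrm{PPV}}{\mathrm{PPV}}\cdot\bigl(1-\mathrm{FNR}\bigr)
```

When two groups have different prevalences p, equalizing PPV across groups forces FPR and FNR to differ unless the classifier is perfect, which is exactly the kind of strict equality the paper relaxes to a practitioner-oriented notion of approximate fairness.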
Causal Dependence Plots
Joshua R Loftus, Lucius EJ Bynum, and Sakina Hansen
Explaining artificial intelligence or machine learning models is increasingly important. To use such data-driven systems wisely, we must understand how they interact with the world, including how they depend causally on data inputs. In this work we develop Causal Dependence Plots (CDPs) to visualize how one variable (a predicted outcome) depends on changes in another variable (a predictor), along with consequent causal changes in other predictor variables. Crucially, this may differ from standard methods based on holding other predictors constant or assuming they are independent, such as regression coefficients or Partial Dependence Plots (PDPs). CDPs use an auxiliary causal model to produce explanations because causal conclusions require causal assumptions. Our explanatory framework generalizes PDPs, including them as a special case, and enables a variety of other custom interpretive plots to show, for example, the total, direct, and indirect effects of causal mediation. We demonstrate with simulations and real-data experiments how CDPs can be combined in a modular way with methods for causal learning or sensitivity analysis. Since people often think causally about input-output dependence, CDPs can be powerful tools in the explainable AI (XAI) or interpretable machine learning toolkit and contribute to applications like scientific machine learning and algorithmic fairness.
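A minimal sketch of the distinction, assuming a known auxiliary causal model and an illustrative fitted model (neither taken from the paper):

```python
# Contrast a Partial Dependence Plot (PDP) with a Causal Dependence Plot
# (CDP) for a fitted model f(x1, x2), assuming an auxiliary causal model in
# which x1 causes x2. The structural equation and predictor are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Auxiliary causal model: x1 -> x2, with x2 = 2*x1 + noise.
n = 5_000
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + rng.normal(scale=0.1, size=n)

def f(x1, x2):
    """Stand-in for a fitted black-box predictor."""
    return 3.0 * x2 - x1

grid = np.linspace(-2, 2, 9)

# PDP: vary x1 on the grid while holding the observed x2 values fixed.
pdp = [f(np.full(n, v), x2).mean() for v in grid]

# CDP: vary x1 and propagate the change through the causal model, so x2 is
# re-generated from its structural equation at the new x1 value.
cdp = [f(np.full(n, v), 2.0 * v + rng.normal(scale=0.1, size=n)).mean()
       for v in grid]

for v, p, c in zip(grid, pdp, cdp):
    print(f"x1={v:+.1f}  PDP={p:+.2f}  CDP={c:+.2f}")
```

In this toy setup the PDP slopes downward in x1 (holding x2 fixed) while the CDP slopes upward, because the propagated causal change in x2 dominates the model's output.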
2022
An Interactive Introduction to Causal Inference
Lucius EJ Bynum, Falaah Arif Khan, Oleksandra Konopatska, and 2 more authors
IEEE VIS Workshop on Visualization for AI Explainability (VISxAI), Jun 2022
This work is a deep dive into the foundations of causal inference in the style of an interactive story. Learn all about randomization, causal graphical models, estimating treatment effects, and the assumptions behind causal inference.
2021
Disaggregated Interventions to Reduce Inequality
Lucius EJ Bynum, Joshua R Loftus, and Julia Stoyanovich
In Proceedings of the 1st ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, Jun 2021
A significant body of research in the data sciences considers unfair discrimination against social categories such as race or gender that could occur or be amplified as a result of algorithmic decisions. Simultaneously, real-world disparities continue to exist, even before algorithmic decisions are made. In this work, we draw on insights from the social sciences brought into the realm of causal modeling and constrained optimization, and develop a novel algorithmic framework for tackling pre-existing real-world disparities. The purpose of our framework, which we call the “impact remediation framework,” is to measure real-world disparities and discover the optimal intervention policies that could help improve equity or access to opportunity for those who are underserved with respect to an outcome of interest. We develop a disaggregated approach to tackling pre-existing disparities that relaxes the typical set of assumptions required for the use of social categories in structural causal models. Our approach flexibly incorporates counterfactuals and is compatible with various ontological assumptions about the nature of social categories. We demonstrate impact remediation with a hypothetical case study and compare our disaggregated approach to an existing state-of-the-art approach, examining differences in structure and the resulting policy recommendations. In contrast to most work on optimal policy learning, we explore disparity reduction itself as an objective, explicitly focusing the power of algorithms on reducing inequality.
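As a rough sketch (with notation that is ours, not necessarily the paper's), the kind of problem the impact remediation framework targets can be written as a budgeted intervention-assignment problem:

```latex
\min_{z \in \{0,1\}^m} \; D\bigl(y_1(z_1), \ldots, y_m(z_m)\bigr)
\quad \text{subject to} \quad \sum_{i=1}^{m} z_i \le b,
```

where z_i indicates whether group or site i receives the intervention, y_i(z_i) is that site's (counterfactual) outcome under the chosen assignment, D is a measure of disparity across sites, and b is an intervention budget.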
2020
Rotational Equivariance for Object Classification using xView
Lucius EJ Bynum, Timothy Doster, Tegan H Emerson, and 1 more author
In IGARSS 2020 - 2020 IEEE International Geoscience and Remote Sensing Symposium, Jun 2020
With the recent addition of large, curated and labeled data sets to the remote sensing discipline, deep learning models have largely surpassed the performance of classical techniques. These deep models, typically Convolutional Neural Networks, are built from successive convolution layers, which are themselves equivariant to translation; combined with multiple pooling layers, this means that in practice the model is also approximately invariant to translation. However, until recently these models could only approach rotational invariance through data augmentation. Here we propose using a new model formulation which achieves rotational equivariance without data augmentation for overhead imagery classification. We utilize the popular xView data set to compare the rotationally equivariant formulation against a regular CNN and a CNN with rotational data augmentation for the task of image classification.
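A generic sketch of the kind of construction involved, using plain convolutions at the four 90-degree rotations of a shared kernel (this illustrates rotation-equivariant design in general, not the specific formulation evaluated in the paper):

```python
# Apply a shared kernel at the four C4 rotations and pool over orientations.
# The resulting feature map is equivariant to 90-degree rotations of the
# input; global spatial pooling then yields a rotation-invariant descriptor,
# with no rotational data augmentation required.
import torch
import torch.nn.functional as F

class C4InvariantConv(torch.nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Convolve with the kernel rotated by 0, 90, 180, 270 degrees.
        responses = [
            F.conv2d(x, torch.rot90(self.weight, r, dims=(-2, -1)), padding=1)
            for r in range(4)
        ]
        # Max over the orientation axis: rotating the input by 90 degrees
        # permutes the responses, so this map rotates with the input.
        return torch.stack(responses, dim=0).max(dim=0).values

layer = C4InvariantConv(in_ch=3, out_ch=8)
img = torch.randn(1, 3, 32, 32)
desc = layer(img).amax(dim=(-2, -1))                            # global pool
desc_rot = layer(torch.rot90(img, 1, dims=(-2, -1))).amax(dim=(-2, -1))
print(torch.allclose(desc, desc_rot, atol=1e-5))  # True, up to float error
```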
Argumentative Topology: Finding Loop(holes) in Logic
Sarah Tymochko, Zachary New, Lucius EJ Bynum, and 4 more authors
Advances in natural language processing have resulted in increased capabilities with respect to multiple tasks. One of the possible causes of the observed performance gains is the introduction of increasingly sophisticated text representations. While many of the new word embedding techniques can be shown to capture particular notions of sentiment or associative structures, we explore the ability of two different word embeddings to uncover or capture the notion of logical shape in text. To this end, we present a novel framework that we call Topological Word Embeddings, which leverages mathematical techniques in dynamical systems analysis and data-driven shape extraction (i.e., topological data analysis). In this preliminary work, we show that using a topological delay embedding we are able to capture and extract a different, shape-based notion of logic aimed at answering the question "Can we find a circle in a circular argument?"
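A minimal sketch of a delay embedding over a token sequence (the reduction of word vectors to a 1-D signal and all parameters below are illustrative assumptions, not the paper's exact pipeline):

```python
# Build a Takens-style delay embedding from a sequence of word vectors; the
# resulting point cloud could then be passed to persistent homology (H1) to
# check whether the text traces out a loop.
import numpy as np

def delay_embedding(signal: np.ndarray, dim: int = 3, tau: int = 1) -> np.ndarray:
    """Map a 1-D signal to points (s_t, s_{t+tau}, ..., s_{t+(dim-1)*tau})."""
    n = len(signal) - (dim - 1) * tau
    return np.stack([signal[i * tau : i * tau + n] for i in range(dim)], axis=1)

# Toy "word embeddings" for a sentence (one vector per token).
rng = np.random.default_rng(0)
word_vectors = rng.normal(size=(12, 50))

# 1-D signal: cosine similarity between consecutive word vectors.
unit = word_vectors / np.linalg.norm(word_vectors, axis=1, keepdims=True)
signal = np.sum(unit[:-1] * unit[1:], axis=1)

cloud = delay_embedding(signal, dim=3, tau=1)
print(cloud.shape)  # (9, 3) point cloud for downstream topological analysis
```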