Rather than describing the concepts that we happen to have in natural languages, my research seeks to describe the concepts that we ought to have by abstracting on the patterns of natural reasoning. Just as the material sciences have devised methods for refining the raw materials of the natural world, philosophical logic employs model theory and proof theory in order to engineer concepts that are better suited to theory building. In addition to these traditional methods, I have developed a programmatic methodology for working in semantics that draws on the computational power available in any laptop to rapidly prototype and explore the implications of a semantic theory (see software for details).

Whereas semantics and logic are typically developed for formal languages consisting of the Boolean operators together with a few novel additions in order to limit the computational complexity required to investigate the resulting theory, this project aims at greater conceptual unity, devising a common semantics and logic for the following operators: tense and modal operators (including epistemic, deontic, and free choice modals); counterfactual and indicative conditionals; causal and relevance operators; and constitutive and normative explanatory operators.

Attempting to develop a unified model-theoretic semantics and proof theory for the collection of operators given above is intractable using traditional methods, where logical consequences are established and derivations written entirely by hand. By contrast, the interactions between these operators are readily explored using the programmatic methodology that I have devised to support the discovery of adequate semantic theories and proof systems. In the next phase of this project, I aim to extend this methodology to include proof assistance so as to streamline the discovery and description of the logics corresponding to each semantic theory.
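
To give a sense of what this kind of exploration looks like in practice, here is a minimal sketch, not drawn from the model-checker's actual interface, of using the Z3 solver (via the z3-solver Python package) to search for a finite countermodel to a candidate principle; the number of worlds, the variable names, and the choice of a universal accessibility relation are all illustrative assumptions.

```python
# A minimal sketch (not the model-checker's actual interface): use Z3 to
# search for a finite countermodel to a candidate principle, here the
# invalid inference from "possibly A" and "possibly B" to "possibly (A and B)".
from z3 import And, Bools, Not, Or, Solver, sat

N = 3  # number of worlds in the candidate model (an illustrative choice)
A = Bools(" ".join(f"A_{w}" for w in range(N)))  # truth value of A at each world
B = Bools(" ".join(f"B_{w}" for w in range(N)))  # truth value of B at each world

def possibly(V):
    # "Possibly V" under a universal accessibility relation: V holds somewhere.
    return Or([V[w] for w in range(N)])

s = Solver()
s.add(possibly(A), possibly(B))                       # premises hold
s.add(Not(Or([And(A[w], B[w]) for w in range(N)])))   # conclusion fails

if s.check() == sat:
    print("Countermodel found:", s.model())
else:
    print("No countermodel with", N, "worlds.")
```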

These methods in conceptual engineering aim to support the development of a unified logical theory with a broad range of applications, including AI interpretability and formal verification.

Verification

This project provides a flexible, intuitive, and unified semantics and logic for tense, modal, counterfactual conditional, and constitutive explanatory operators in order to model system behavior and verify software. Instead of taking possible worlds (i.e., complete histories of the system) and times to be primitive, the semantics models evolutions as appropriately constrained functions from times to instantaneous world states of the system. In order to interpret hyperintensional operators, world states may be identified with maximal possible states, where states in general may be partial or impossible. A primitive task relation is then taken to distinguish possible from impossible state transitions; it is subject to a number of constraints and admits of further generalizations. Finite evolutions are then defined to be functions from a finite interval of times (integers) to states such that there is a task from the state at each time in the interval to the state at its successor (if any).
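
The following is a minimal sketch of these definitions under illustrative assumptions, not the project's actual implementation: states are reduced to bare labels, the task relation is a set of ordered pairs, and a finite evolution is a function from an interval of integers to states that follows that relation.

```python
# A minimal sketch of the definitions above; the names and representations
# are illustrative rather than taken from the project's implementation.
from typing import Dict, Set, Tuple

State = str                  # world states, here reduced to bare labels
Task = Tuple[State, State]   # a possible transition between states

def is_finite_evolution(history: Dict[int, State], tasks: Set[Task]) -> bool:
    """Check that a function from a finite interval of integers (times) to
    states follows the task relation: there is a task from the state at each
    time to the state at its successor, whenever the successor is in the domain."""
    if not history:
        return True
    times = sorted(history)
    if times != list(range(times[0], times[-1] + 1)):   # domain must be an interval
        return False
    return all((history[t], history[t + 1]) in tasks for t in times[:-1])

# Example: a traffic light whose only possible transitions are
# green -> yellow, yellow -> red, and red -> green.
tasks = {("green", "yellow"), ("yellow", "red"), ("red", "green")}
print(is_finite_evolution({0: "green", 1: "yellow", 2: "red"}, tasks))  # True
print(is_finite_evolution({0: "green", 1: "red"}, tasks))               # False
```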

In addition to providing a general semantic framework for interpreting a range of logical operators, this project aims to unify a number of logics employed in formal methods, providing a common notation in which the following may be expressed as special cases: temporal logic, Hoare logic, and dynamic logic.

Temporal logic (in particular, Leslie Lamport’s TLA+) has emerged as a highly practical tool for modeling programs and systems, with adoption and support within major commercial enterprises like AWS, Microsoft, and Oracle. Other tools used in industry, like Microsoft’s Dafny and the Coq formalization of Iris’s higher-order separation logic, rely on Hoare logic. Hoare logic (at least of the classical variety) is expressively subsumed by dynamic logic, where a Hoare triple {P} C {Q} corresponds to the dynamic logic formula P → [C]Q, and both dynamic and temporal logic are, at bottom, modal systems: the result of enriching a standard extensional deductive logic with modal operators. The aim is both to unify and to extend this common lineage, providing an intuitive and general framework in which to gain greater perspective on the existing logics used in formal verification while affording greater expressive power given the hyperintensionality of the present approach.
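
As a rough illustration of how temporal operators might fall out as special cases over the finite evolutions described above, here is a hedged sketch; the representations and helper names are assumptions made for this example rather than the project's own.

```python
# A rough sketch of interpreting temporal operators over a finite evolution,
# illustrating how temporal logic falls out as a special case; the
# representations and helper names are assumptions made for this example.
from typing import Callable, Dict

State = str
Evolution = Dict[int, State]        # a finite evolution: times -> states
Prop = Callable[[State], bool]      # an atomic proposition on states

def holds_next(p: Prop, e: Evolution, t: int) -> bool:
    """'Next p' at time t: p holds at the successor time, if it is in the domain."""
    return (t + 1) in e and p(e[t + 1])

def holds_always(p: Prop, e: Evolution, t: int) -> bool:
    """'Always p' at time t: p holds at every time from t to the end of the interval."""
    return all(p(e[u]) for u in e if u >= t)

def holds_eventually(p: Prop, e: Evolution, t: int) -> bool:
    """'Eventually p' at time t: p holds at some time from t onward."""
    return any(p(e[u]) for u in e if u >= t)

# Example over the traffic-light evolution from the sketch above.
run: Evolution = {0: "green", 1: "yellow", 2: "red"}
is_red: Prop = lambda s: s == "red"
print(holds_eventually(is_red, run, 0))  # True
print(holds_always(is_red, run, 0))      # False
print(holds_next(is_red, run, 1))        # True
```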

Interpretability

In order to assist efforts to maintain the alignment of AI-assisted decision-making in socially and morally sensitive sectors, this project aims to gain insight into the underlying mechanics of AI models by developing a model-theoretic semantics for human-readable object languages in terms of AI model abstractions.

This approach has two main goals. First, it aims to support the challenging task of reverse-engineering AI models. Instead of interpreting model parameters or their abstractions directly, a semantic methodology seeks to construct semantic models for human-readable object languages from AI model abstractions. In particular, I am interested in studying AI model behavior by interpreting languages with counterfactual, causal, relevance, and constitutive explanatory operators in order to explore complex dependencies and conditional structures within AI models, as well as to clarify responses to edge cases in order to better test for model robustness. Integrating tense and modal operators also enables this approach to capture the dynamic aspects of AI decision-making, explaining how outputs evolve over time or respond to initial conditions. Including normative explanatory operators as well as epistemic, indicative conditional, free choice, and deontic modals provides essential leverage in facing the challenges of surveying and maintaining alignment with human values.
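
As a toy illustration of the kind of counterfactual query this approach targets, the sketch below intervenes on a single input feature of a stand-in model and asks whether the output would have differed; the model, features, threshold, and helper names are all hypothetical.

```python
# A toy, purely illustrative sketch of a counterfactual query against an AI
# model abstraction: intervene on one input feature of a stand-in model and
# ask whether the output would have differed. Every name here is hypothetical.
from typing import Callable, Dict

Model = Callable[[Dict[str, float]], float]

def counterfactual_output(model: Model, inputs: Dict[str, float],
                          feature: str, value: float) -> float:
    """Output of the model under the intervention feature := value."""
    return model({**inputs, feature: value})

def would_flip(model: Model, inputs: Dict[str, float],
               feature: str, value: float, threshold: float = 0.5) -> bool:
    """A crude test of counterfactual dependence: does the intervention change
    which side of the decision threshold the output falls on?"""
    actual = model(inputs) >= threshold
    counterfactual = counterfactual_output(model, inputs, feature, value) >= threshold
    return actual != counterfactual

# Stand-in "model": a weighted sum of two features of a loan applicant.
toy_model: Model = lambda x: 0.8 * x["income"] + 0.2 * x["age"]
applicant = {"income": 0.7, "age": 0.3}
print(would_flip(toy_model, applicant, "income", 0.1))  # True: the decision depends on income
```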

The second goal is to support the design of more interpretable AI systems. I am especially interested in architectures that allow for dynamic feedback between an AI model and semantic models for a human-readable object language. This feedback loop resembles human perception, which both shapes and is shaped by an agent’s beliefs, allowing for belief revision given sufficient counter-evidence. More broadly, a semantic approach bridges the opaque aspects of AI models and the transparent determinations expressed in a meaningful object language. By analogy, just as perceptual modules inform cognitive reasoning in human agents even without complete transparency (e.g., one does not need to know why a hand appears as a hand in order to identify it as such), a semantic methodology mediates between AI models and higher-level reasoning about the AI model, articulated in an interpreted human-readable object language, by drawing on the logic for this language.

Since the success of a theory is ultimately determined by the abductive support that it enjoys, it is important to provide adequate tooling for testing and adapting theories rather than merely defending an existing theory from criticism. These considerations motivated the development of the model-checker for testing the implications of the semantic theories that I have provided. Instead of limiting these tools to the specific languages with which I have been concerned, the version of the model-checker currently under development aims to abstract on my semantic theories, providing a general programmatic methodology for developing and exploring new semantic systems as well as their corresponding logics.
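
Purely as an illustration of what abstracting over semantic theories might involve, and without presuming the model-checker's actual interface, the following sketch registers operators together with their semantic clauses and evaluates formulas over a small finite model; every name here is hypothetical.

```python
# An illustrative sketch, not the model-checker's actual interface, of what
# abstracting over semantic theories might involve: operators are registered
# with their semantic clauses, and a generic evaluator interprets formulas
# over a small finite model. Every name here is hypothetical.
from typing import Callable, Dict, FrozenSet, Tuple, Union

World = int
Model = Dict[str, FrozenSet[World]]   # atomic sentences -> worlds where they are true
Formula = Union[str, Tuple]           # e.g. "A", ("not", "A"), ("box", "A")

CLAUSES: Dict[str, Callable] = {}

def clause(name: str):
    """Register the semantic clause for an operator under the given name."""
    def register(fn):
        CLAUSES[name] = fn
        return fn
    return register

def evaluate(phi: Formula, model: Model, worlds: FrozenSet[World], w: World) -> bool:
    if isinstance(phi, str):          # atomic sentence
        return w in model[phi]
    op, *args = phi
    return CLAUSES[op](model, worlds, w, *args)

@clause("not")
def _negation(model, worlds, w, arg):
    return not evaluate(arg, model, worlds, w)

@clause("box")
def _necessity(model, worlds, w, arg):
    # Placeholder clause: necessity relative to a universal accessibility relation.
    return all(evaluate(arg, model, worlds, v) for v in worlds)

worlds = frozenset({0, 1})
model = {"A": frozenset({0, 1}), "B": frozenset({0})}
print(evaluate(("box", "A"), model, worlds, 0))           # True
print(evaluate(("not", ("box", "B")), model, worlds, 0))  # True
```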

In addition to applying these resources to interpret AI models, I am excited about the advances in scientific methodology that solutions to the interpretability problem in AI may provide. Training AI models on specific datasets (e.g., DNA sequences, weather data, etc.) and then extracting meaningful counterfactuals and causal hypotheses may provide powerful new methodologies across the sciences.

Papers

In Progress