Alex Infanger
About
Ahoy. I'm an independent researcher based in the San Francisco Bay Area working on AI interpretability and alignment. In 2022, I finished my PhD on theory and algorithms for Markov chains at the Institute for Computational and Mathematical Engineering at Stanford.
axelnifgarden [ ] amgil [ ] com. (Up to a permutation, that is. Note my middle name is Dara.)
News, Talks, etc.
- October 2024. New preprint: "The Persian Rug: Solving Toy Models of Superposition using Large-Scale Symmetries". (Twitter/X thread)
- We derive an optimal solution and corresponding loss for the toy model of superposition of Elhage et al. (Anthropic) in the limit of large input dimension. We discuss implications for designing and scaling sparse autoencoders.
- March 2024. "A posteriori error bounds for truncated Markov chain linear systems arising from first transition analysis" published in Operations Research Letters.
- Many Markov chain expectations and probabilities can be computed as solutions to systems of linear equations by applying “first transition analysis” (FTA). When the state space is infinite or very large, these linear systems become too large for exact computation, and one must truncate the FTA linear system. This paper derives lower and upper bounds that quantify the error introduced by this truncation and demonstrates their effectiveness on two numerical examples. (code)
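As a minimal illustration of FTA (a toy example of my own for this page, not taken from the paper), here is the linear system for the expected absorption time of a simple symmetric random walk on \(\{0,\dots,N\}\), solved with NumPy:

```python
import numpy as np

# First transition analysis (FTA) for a toy example: expected absorption
# time of a simple symmetric random walk on {0, 1, ..., N} with absorbing
# endpoints 0 and N. Conditioning on the first step gives the linear system
#   u(x) = 1 + 0.5 u(x-1) + 0.5 u(x+1)  for interior x,  u(0) = u(N) = 0,
# i.e. (I - Q) u = 1, where Q is the transition matrix restricted to the
# transient (interior) states.
N = 10
interior = N - 1  # states 1, ..., N-1
Q = np.zeros((interior, interior))
for i in range(interior):
    if i > 0:
        Q[i, i - 1] = 0.5  # step left
    if i < interior - 1:
        Q[i, i + 1] = 0.5  # step right (steps to 0 or N are absorbing)
u = np.linalg.solve(np.eye(interior) - Q, np.ones(interior))

# Sanity check against the classical closed form: starting from x,
# the expected absorption time is x * (N - x).
for x in range(1, N):
    assert abs(u[x - 1] - x * (N - x)) < 1e-9
```

Here the whole system is small enough to solve exactly; the paper's setting is the one where it is not, so the system must be truncated first.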
- December 2023. "Solution Representations for Poisson’s Equation, Martingale Structure, and the Markov Chain Central Limit Theorem" published in Stochastic Systems.
- This paper is concerned with Poisson's equation for Markov chains. For those unaware of this equation, Poisson's equation for Markov chains connects with the classical Poisson PDE \( \Delta u = f \) (e.g. from electricity and magnetism) in the following way. When you discretize the PDE, you can find a representation of the solution that expresses the potential \(u(x)\) as the average value of its neighbors, plus a modified charge value at \(x\) (a classical calculation, do it yourself!). This means \(u(x)\) can be interpreted as the expected value of the sum of the modified charge values "picked up" by a symmetric random walk starting at \(x\). Poisson's equation for Markov chains generalizes this expectation problem to the case where the symmetric random walk is replaced by a general Markov chain.
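A sketch of the classical calculation alluded to above, for a grid in \(d\) dimensions with spacing \(h\) (the notation here is illustrative, not from the paper):

```latex
% Discretize \Delta u = f with the standard central-difference stencil:
\frac{1}{h^2}\sum_{i=1}^{d}\bigl(u(x+he_i)+u(x-he_i)-2u(x)\bigr)=f(x).
% Rearranging expresses u(x) as the average of its 2d neighbors plus a
% "modified charge" g(x) := -\tfrac{h^2}{2d} f(x):
u(x)=\frac{1}{2d}\sum_{i=1}^{d}\bigl(u(x+he_i)+u(x-he_i)\bigr)+g(x)
    =\mathbb{E}_x\!\left[u(X_1)\right]+g(x),
% where X is the simple symmetric random walk on the grid. Iterating, and
% stopping at the boundary (where u is prescribed), gives
u(x)=\mathbb{E}_x\!\left[\sum_{n=0}^{\tau-1} g(X_n)+u(X_\tau)\right],
\qquad \tau=\inf\{n\ge 0: X_n\in\partial\}.
```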
- October 2023. "Eliciting Language Model Behaviors using Reverse Language Models" won a spotlight at the 2023 SoLaR Workshop at NeurIPS.
- At a high level, the idea in this paper is that if you had a reverse language model (a model that predicts text backwards, from last token to first), you could start with dangerous/toxic outputs and sample backwards to find prompts that lead to those outputs. Having found such prompts, you could then adversarially train the original (forward) model to be robust to them.
- Fall 2022. I spent a few months facilitating reading groups on the AGI safety fundamentals curriculum in Boston. This was for the MIT AI Alignment Team (note I was not officially affiliated with MIT: this work was funded by the FTX Future Fund regrant program).
- June 2022. "Truncation Algorithms for Markov Chains and Processes" received an honorable mention for the Gene Golub Dissertation Award.
- This thesis focuses on the problem of approximating an infinite or very large state space Markov chain \(X=(X_n:n\geq 0)\) on a smaller subset \(A\) of the state space. A well-known approach to this problem is to re-route transitions of the original chain that attempt to leave \(A\) for \(A^c\) back into \(A\). We give new conditions under which such an approximation is good for estimating the stationary distribution \(\pi\) of \(X\) (in the sense of convergence as \(A\) gets large). We also provide a new approximation for estimating \(\pi\) on \(A\) that comes with error bounds.
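One common re-routing scheme (a standard choice in this literature, shown here only as an illustration; the thesis treats more general schemes) redistributes the escaping probability mass according to a fixed distribution \(\nu\) on \(A\):

```latex
% Truncation-augmentation: given the transition kernel P of X and a
% redistribution distribution \nu supported on the truncation set A, define
P_A(x,y) = P(x,y) + P(x,A^c)\,\nu(y), \qquad x,y \in A,
% so that each row of P_A sums to one and P_A is a proper transition
% matrix on A. Its stationary distribution \pi_A then serves as an
% approximation to \pi restricted (and renormalized) to A.
```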
- June 2021. I was honored to receive ICME's Teaching Assistant Award for the 2020-2021 academic year. This was the year I was a teaching assistant for two ICME PhD core courses: CME 305 (Discrete Mathematics & Algorithms) and CME 308 (Stochastic Methods in Engineering).
Preprints
-
"The Persian Rug: Solving Toy Models of Superposition using Large-Scale Symmetries". (2024).
A. Cowsik, K. Dolev, A. Infanger.
(code) (Twitter/X thread)
-
"Eliciting Language Model Behaviors using Reverse Language Models". (2023).
J. Pfau, A. Infanger, A. Sheshadri, A. Panda, J. Michael, C. Huebner.
(code)
-
"A new truncation algorithm for Markov chain equilibrium distributions with computable error bounds." (2022).
A. Infanger, P. W. Glynn.
(code)
-
"On convergence of a truncation scheme for approximating stationary distributions of continuous state space Markov chains and processes." (2022).
A. Infanger, P. W. Glynn.
-
"On convergence of general truncation-augmentation schemes for approximating stationary distributions of Markov chains." (2022).
A. Infanger, P. W. Glynn, Y. Liu.
Publications
Links
Twitter, LinkedIn, GitHub, CV (long form CV).
Last updated: 10/24/2024. Website template borrowed from (the totally awesome) Johan Ugander.