If we had to apply k-means clustering to my previous/ongoing research with k=3, we’d get something like this:
The utility of generative modeling in large-scale, high-stakes scientific tasks is hindered by several mathematical challenges. One such issue that I am focused on is incorporating constraints in a cost-efficient and mathematically justified way so that samples and distributions generated by techniques such as flow-based models and diffusion models are faithful to domain knowledge.
In particular, I focus on equality constraints (e.g. fixed bond lengths and angles, linear constraints, or PDEs). Much of my current research focuses on combating the distributional degeneracies induced by such constraints: the reduction in degrees of freedom caused by strict equality constraints can prohibit the existence of a Lebesgue density with full-dimensional ambient support. I am also interested in other topics in this field, including efficient strategies for inequality constraints (e.g. appropriate mappings from constrained to unconstrained spaces for easier generative modeling) and, more broadly, PDE interpretations of existing generative models and their consequences for the constrained case.
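To make the degeneracy concrete, here is a minimal illustration for the linear case. The symbols $A$, $b$, $m$, $n$, $\mathcal{M}$, $\mu$, and $p$ are introduced here only for illustration, and $\lambda^n$ denotes Lebesgue measure on $\mathbb{R}^n$. Suppose the constraint set is

$$
\mathcal{M} = \{\, x \in \mathbb{R}^n : Ax = b \,\}, \qquad A \in \mathbb{R}^{m \times n}, \quad \operatorname{rank}(A) = m \geq 1 .
$$

Then $\mathcal{M}$ is an affine subspace of dimension $n - m < n$, so $\lambda^n(\mathcal{M}) = 0$. If a probability measure $\mu$ satisfied the constraint exactly, i.e. $\mu(\mathcal{M}) = 1$, and also had a density $p$ with respect to $\lambda^n$, we would get

$$
1 = \mu(\mathcal{M}) = \int_{\mathcal{M}} p(x) \, d\lambda^n(x) = 0 ,
$$

a contradiction. Any density for such a $\mu$ therefore has to be defined with respect to a lower-dimensional reference measure on $\mathcal{M}$ rather than on the ambient space.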
Historically (and still today!), the matrix singular value decomposition (SVD) has been the primary workhorse of compression and dimensionality reduction in many data science applications, providing provable guarantees for data that either already has or can be manipulated into a two-dimensional format. However, with the advent of modern large datasets for which just two dimensions are insufficient, tensors have emerged as a popular data structure for storing n-dimensional arrays.
Tensor decompositions aim to extend some of the incredible tools that the matrix SVD offers into higher dimensions. My work in this area has largely focused on the star-M framework, a matrix-mimetic multilinear algebra that admits a tensor extension of the SVD while naturally treating tensors as matrices of “tubes”, bringing many familiar linear algebra concepts into the world of tensors. These works offer provable guarantees (in terms of things such as optimality or compression error) on the compressed data in a computationally efficient and mathematically familiar way.
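For readers who have not seen the star-M product, the following is a minimal NumPy sketch of the standard construction for third-order tensors: transform every tube with an invertible matrix M, multiply the frontal slices facewise, and transform back. The function names (`mode3_apply`, `star_m_product`, `truncated_tsvdm`) are hypothetical and chosen for illustration; this is not the code from the papers listed here, and the truncation shown is the simplest fixed-rank variant.

```python
import numpy as np


def mode3_apply(A, M):
    """Apply the matrix M to every tube (mode-3 fiber) of the tensor A."""
    # result[i, j, k] = sum_l M[k, l] * A[i, j, l]
    return np.einsum("kl,ijl->ijk", M, A)


def star_m_product(A, B, M):
    """Star-M product of A (m x p x n) and B (p x q x n) for invertible M (n x n)."""
    A_hat = mode3_apply(A, M)
    B_hat = mode3_apply(B, M)
    # Facewise product: multiply matching frontal slices in the transform domain.
    C_hat = np.einsum("ipk,pjk->ijk", A_hat, B_hat)
    # Return to the original domain with the inverse transform.
    return mode3_apply(C_hat, np.linalg.inv(M))


def truncated_tsvdm(A, M, rank):
    """Fixed-rank approximation: truncate the SVD of each transformed frontal slice."""
    A_hat = mode3_apply(A, M)
    approx_hat = np.empty_like(A_hat)
    for k in range(A_hat.shape[2]):
        U, s, Vt = np.linalg.svd(A_hat[:, :, k], full_matrices=False)
        # Keep only the leading `rank` singular triplets of this slice.
        approx_hat[:, :, k] = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    return mode3_apply(approx_hat, np.linalg.inv(M))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 3, 5))
    B = rng.standard_normal((3, 2, 5))
    # An orthogonal M (assumed here) preserves the Frobenius norm across domains.
    M, _ = np.linalg.qr(rng.standard_normal((5, 5)))
    print(star_m_product(A, B, M).shape)
    for r in (1, 2, 3):
        print(r, np.linalg.norm(A - truncated_tsvdm(A, M, r)))
```

Because all of the algebra happens slice by slice in the transform domain, matrix notions such as products, transposes, and the SVD carry over directly. The orthogonal M in the usage example is an assumption made so that errors measured in the transform domain equal errors in the original domain.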
Projected Tensor-Tensor Products for Efficient Computation of Optimal Multiway Data Representations
Katherine Keegan, Elizabeth Newman
Linear Algebra and its Applications, Volume 729, 2026, pp. 100-147.
[arXiv] [Publication]
Optimal Matrix-Mimetic Tensor Algebras via Variable Projection
Elizabeth Newman, Katherine Keegan
SIAM Journal on Matrix Analysis and Applications, Volume 46, Issue 3, 2025, pp. 1764-1790.
[arXiv] [Publication]
A Tensor SVD-based Classification Algorithm Applied to fMRI Data
Katherine Keegan, Tanvi Vishwanath, Yihua Xu
SIAM Undergraduate Research Online, Volume 15, 2022, pp. 270-294.
[PDF]
Media Processing and A Modified Watermarking Scheme Based on the Singular Value Decomposition
Katherine Keegan, David Melendez, Jennifer Zheng
SIAM Undergraduate Research Online, Volume 14, 2021, pp. 446-467.
[PDF]
As a computational math PhD student, I think it is important to understand (a) real scientific challenges motivating the fun math problems that make it to the desks of applied mathematicians and (b) the implementation issues inhibiting the scalability of the algorithms we develop. To that end, I have deliberately sought out more applied internships with domain science and high-performance computing researchers to understand where computational math fits into the broader computational science research landscape.