# TAG in Machine Learning

A Workshop at the 39th International Conference on Machine Learning (ICML 2022), Baltimore, MD, July 22, 2022

## Call for Papers

Much of the data fueling the current rapid advances in machine learning is high dimensional, structurally complex, and strongly nonlinear. This challenges researchers' intuition when they ask (i) how and why current algorithms work and (ii) what tools will lead to the next big breakthrough. Mathematicians working in topology, algebra, and geometry have more than a hundred years' worth of finely developed machinery whose purpose is to give structure to, help build intuition about, and generally better understand spaces and structures beyond those that we can naturally understand. This workshop will showcase work that brings methods from topology, algebra, and geometry to bear on challenging questions in machine learning.

With this workshop we will create a vehicle for disseminating machine learning techniques that utilize rich mathematics and address core challenges described in the ICML call for papers. Additionally, this workshop creates an opportunity for the presentation of approaches that may address critical, domain-specific ML challenges but do not necessarily demonstrate improved performance on mainstream, data-rich benchmarks. To this end, our workshop will open up ICML to new researchers who in the past were not able to discuss their novel but dataset-dependent analysis methods.

We interpret topology, algebra, and geometry broadly and welcome submissions ranging from manifold methods to optimal transport to topological data analysis to mathematically informed deep learning. Through intellectual cross-pollination between data-driven and mathematically inspired communities, we believe this workshop will support the continued development of both groups and enable new solutions to problems in machine learning.

All papers accepted for inclusion in the workshop are eligible for inclusion in the PMLR volume entitled "Topology, Algebra, and Geometry in Learning."

Topic areas of interest include, but are not limited to:

Geometric Machine/Deep Learning

Optimal Transport

Topological Data Analysis

Mathematical Machine Learning

Graph-based Methods

Manifold Methods

Abstract Algebra in Machine/Deep Learning

Important Dates:

Paper Submission Deadline: May 19, 2022 (Anywhere on Earth); extended from May 16, 2022

Final Decisions to Authors: June 6, 2022 (Anywhere on Earth)

Camera-Ready Deadline (required for inclusion in proceedings): June 16, 2022 (Anywhere on Earth)

Main Conference: July 17-23, 2022

Workshop Date: July 22, 2022

Workshop Location: Baltimore Convention Center Rooms 318-320

Paper Length and Format

Submissions must be at most 6 pages in length (excluding references and supplementary materials) and must be prepared for double-blind review. We follow the ICML general conference submission criteria for papers; for details, please see the ICML Call For Papers. Note that reviewers are not required to review supplementary materials, so make sure that your paper is self-contained.

Template

Submission Site:

https://cmt3.research.microsoft.com/TAGML2022

## Keynotes

### Dr. Michael Kirby

Colorado State University

Michael Kirby received the SB degree in mathematics from the Massachusetts Institute of Technology and PhD degree from the Division of Applied Mathematics, Brown University. He is currently a professor with the Department of Mathematics, Colorado State University with a joint appointment in the Department of Computer Science. His research interests include low dimensional modeling, geometric models for data and optimization. He authored the textbook Geometric Data Analysis. He was an Alexander von Humboldt fellow at the Institute for Information Verarbeitung, University of Tuebingen, Germany. He also was awarded an IBM Faculty fellowship, and the College of Natural Sciences Award for Graduate Education. He is a member of the IEEE.

### Dr. Bastian Rieck

AIDOS Lab, Institute of AI for Health, Helmholtz Zentrum München

Bastian is the principal investigator of the AIDOS Lab at the Institute of AI for Health at Helmholtz Munich, Germany. His main research interests are developing multi-scale algorithms for analysing complex data sets, with a focus on biomedical applications and healthcare topics. Bastian is also enticed by finding new ways to explain neural networks using concepts from algebraic and differential topology. He is a big proponent of scientific outreach and enjoys blogging about his research, academia, supervision, and software development. Bastian received his M.Sc. degree in mathematics, as well as his Ph.D. in computer science, from Heidelberg University in Germany.

### Dr. Shubhendu Trivedi

Shubhendu Trivedi's current research focuses on developing theoretical and methodological tools to incorporate geometric structure into machine learning models, employing statistical-physics-based approaches for neural network analysis, and developing conformal prediction methods for theoretically grounded uncertainty quantification. Shubhendu received his PhD in 2018 for work on group equivariant neural networks, working at the University of Chicago and the Toyota Technological Institute; an MS from TTI-C for work in computer vision; an MS from Worcester Polytechnic for work on the Szemerédi Regularity Lemma; and a BE in Electrical Engineering. Shubhendu has been a research associate at MIT and an NSF Institute Fellow in Computational Mathematics at Brown University, working on problems in algebraic machine learning. Apart from academic research, Shubhendu has led multiple teams for industrial research on health analytics, equivariant models for relational data, knowledge graph engineering, and zero-shot transfer learning. He has also held positions at Rutgers, ZS, and NEC Labs America, amongst others, and has been associated with a semiconductor startup.

### Dr. Soledad Villar

Johns Hopkins University

Soledad Villar is an Assistant Professor at the Department of Applied Mathematics & Statistics, and at the Mathematical Institute for Data Science, Johns Hopkins University. She received her PhD in mathematics from the University of Texas at Austin and was a research fellow at New York University as well as at the Simons Institute at the University of California, Berkeley. Her mathematical interests are in computational methods for extracting information from data. She studies optimization for data science, machine learning, equivariant representation learning, and graph neural networks. Soledad is originally from Uruguay.

## Accepted Papers and Posters

Fast Proximal Gradient Descent for Support Regularized Sparse Graph, Dongfang Sun (Arizona State University); Yingzhen Yang (Arizona State University), Paper Poster

The Shape of Words - topological structure in natural language data, Stephen Fitz (Keio University), Paper Poster

Stochastic Parallelizable Eigengap Dilation for Large Graph Clustering, Elise van der Pol (University of Amsterdam); Ian Gemp (DeepMind); Yoram Bachrach; Richard Everett (DeepMind), Paper Poster

Multiresolution Matrix Factorization and Wavelet Networks on Graphs, Truong Son Hy (University of Chicago); Risi Kondor (The University of Chicago), Paper Poster

A simple and universal rotation equivariant point-cloud network, Ben Finkelshtein (Technion); Chaim Baskin (Technion); Haggai Maron; Nadav Dym (Duke University), Paper Poster

Robust Graph Representation Learning for Local Corruption Recovery, Bingxin ZHOU (The University of Sydney); Yuanhong Jiang (SJTU); Yuguang Wang (Shanghai Jiao Tong University); Jingwei Liang (Shanghai Jiao Tong University); Junbin Gao (University of Sydney, Australia); Shirui Pan (Monash University); Xiaoqun Zhang (Shanghai Jiao Tong University), Paper Poster

Invariance-adapted decomposition and Lasso-type contrastive learning, Masanori Koyama (Preferred Networks Inc.); Takeru Miyato (Preferred Networks, Inc.); Kenji Fukumizu (The Institute of Statistical Mathematics), Paper Poster

EXACT: How to Train Your Accuracy, Ivan A Karpukhin (Tinkoff); Stanislav Dereka (Tinkoff); Sergey Kolesnikov (Tinkoff), Paper Poster

On the Surprising Behaviour of node2vec, Celia Hacker (EPFL); Bastian A Rieck (Institute of AI for Health, Helmholtz Centre Munich), Paper Poster

Sign and Basis Invariant Networks for Spectral Graph Representation Learning, Derek Lim (MIT); Joshua D Robinson (MIT); Lingxiao Zhao (CMU); Tess Smidt (MIT); Suvrit Sra (Massachusetts Institute of Technology, USA); Haggai Maron; Stefanie Jegelka (MIT), Paper Poster

Riemannian Residual Neural Networks, Isay Katsman (Cornell University); Eric M Chen (Cornell University); Sidhanth Holalkere (Cornell University); Anna C Asch (Cornell University); Aaron Lou (Stanford University); Ser-Nam Lim (Facebook AI); Christopher De Sa (Cornell University), Paper Poster

The PWLR graph representation: A Persistent Weisfeiler-Lehman scheme with Random walks for graph classification, Sun Woo Park (National Institute for Mathematical Sciences); YUN YOUNG CHOI (NIMS); Dosang Joe (National Institute for Mathematical Sciences); U Jin Choi (KAIST); Youngho Woo (National Institute for Mathematical Sciences), Paper Poster

Higher-order Clustering and Pooling for Graph Neural Networks, ALEXANDRE DUVAL (CentraleSupélec); Fragkiskos Malliaros (CentraleSupelec), Paper Poster

Hypergraph Convolutional Networks via Equivalence Between Hypergraphs and Undirected Graphs, Jiying Zhang (Tsinghua University); Fuyang Li (Tsinghua university); Xi Xiao (Tsinghua University); Tingyang Xu (Tencent AI Lab); Yu Rong (Tencent AI Lab); Junzhou Huang (University of Texas at Arlington); Yatao Bian (Tencent AI Lab), Paper Poster

A Geometrical Approach to Finding Difficult Examples in Language, Debajyoti Datta (University of Virginia); Shashwat Kumar (University of Virginia); Laura E Barnes (University of Virginia); P. Thomas Fletcher (University of Virginia), Paper Poster

Rethinking Persistent Homology for Visual Recognition, Ekaterina Khramtsova (University of Queensland); Guido Zuccon (The University of Queensland); Xi Wang (Neusoft); Mahsa Baktashmotlagh (University of Queensland), Paper Poster

Sheaf Neural Networks with Connection Laplacians, Federico Barbero (University of Cambridge); Cristian Bodnar (University of Cambridge); Haitz Sáez de Ocáriz Borde (University of Cambridge); Michael Bronstein (Imperial College / Twitter); Petar Veličković (DeepMind); Pietro Lió (University of Cambridge), Paper Poster

Score Matching for Truncated Density Estimation of Spherical Distributions, Daniel Williams (University of Bristol); Song Liu (University of Bristol), Paper Poster

Local distance preserving autoencoders using continuous kNN graphs, Nutan Chen (Machine Learning Research Lab, Volkswagen Group); Patrick van der Smagt (Machine Learning Research Lab, Volkswagen Group); Botond Cseke (Volkswagen Group), Paper Poster

Geometric Properties of Graph Convolutional Networks from the Perspective of Sheaves and the Neural Tangent Kernel, Thomas Gebhart (University of Minnesota), Paper Poster

Riemannian CUR Decompositions for Robust Principal Component Analysis, Keaton Hamm (University of Texas, Arlington); Mohamed Meskini (University of Texas, Arlington); HanQin Cai (University of California, Los Angeles), Paper Poster

For Manifold Learning, Deep Neural Networks can be Locality Sensitive Hash Functions, Nishanth Dikkala (Google Research); Gal Kaplun (Harvard); Rina Panigrahy (Google), Paper Poster

Nearest Class-Center Simplification through Intermediate Layers, Ido Ben-Shaul (Tel-Aviv University); Shai Dekel (Tel Aviv University), Paper Poster

Deoscillated Adaptive Graph Collaborative Filtering, Zhiwei Liu (University of Illinois, Chicago); Lin Meng (Florida State University); Fei Jiang (University of Chicago); Jiawei Zhang (UC Davis); Philip S Yu (UIC), Paper Poster

Robust Lp-Norm Linear Discriminant Analysis with Proxy Matrix Optimization, Navya Nagananda (Rochester Institute of Technology); Breton L Minnehan (Rochester Institute of Technology); Andreas Savakis (Rochester Institute of Technology), Paper Poster

A Topological characterisation of Weisfeiler-Leman equivalence classes, Jacob Bamberger (EPFL), Paper

GALE: Globally Assessing Local Explanations, Peter Xenopoulos (New York University); Gromit Yeuk-Yin Chan (Adobe Research); Harish Doraiswamy (Microsoft Research India); Luis Gustavo Nonato (USP-SC); Brian Barr (Capital One); Claudio Silva (NYU), Paper Poster

Neural Geometric Embedding Flows, Aaron Lou (Stanford University); Yang Song (Stanford University); Jiaming Song (Stanford University); Stefano Ermon (Stanford University), Paper Poster

Neural Implicit Manifold Learning for Topology-Aware Generative Modeling, Brendan L Ross (Layer 6 AI); Gabriel Loaiza-Ganem (Layer 6 AI); Anthony Caterini (Layer 6 AI); Jesse Cresswell (Layer 6 AI), Paper Poster

Geodesic Properties of a Generalized Wasserstein Embedding for Time Series Analysis, Shiying Li (University of Virginia); Abu Hasnat Mohammad Rubaiyat (University of Virginia); Gustavo Rohde (University of Virginia), Paper Poster

Evaluating Disentanglement in Generative Models Without Knowledge of Latent Factors, Chester Holtz (University of California San Diego); Gal Mishne (UC San Diego); Alexander Cloninger (University of California San Diego), Paper

The Power of Recursion in Graph Neural Networks for Counting Substructures, Behrooz Tahmasebi (MIT); Derek Lim (MIT); Stefanie Jegelka (MIT), Paper Poster

The Manifold Scattering Transform for High-Dimensional Point Cloud Data, Joyce Chew (University of California, Los Angeles); Holly R Steach (Yale University); Siddharth Viswanath (University of California, Irvine); Hau-tieng Wu (Duke University); Matthew Hirn (Michigan State University); Deanna Needell (UCLA); Smita Krishnaswamy (Yale University); Michael Perlmutter (University of California, Los Angeles), Paper Poster

Zeroth-Order Topological Insights into Iterative Magnitude Pruning, Aishwarya H. Balwani (Georgia Institute of Technology); Jakob Krzyston (Georgia Institute of Technology), Paper Poster

Approximate Equivariance SO(3) Needlet Convolution, Kai Yi (University of New South Wales); Jialin Chen (Shanghai Jiao Tong University); Yuguang Wang (Shanghai Jiao Tong University); Bingxin ZHOU (The University of Sydney); Pietro Lió (University of Cambridge); Yanan Fan (University of New South Wales); Jan Hamann (University of New South Wales), Paper Poster

## Organizers

Dr. Tegan Emerson

Pacific Northwest National Laboratory

Colorado State University

University of Texas, El Paso

Tegan Emerson received her PhD in Mathematics from Colorado State University. She was a Jerome and Isabella Karle Distinguished Scholar Fellow in optical sciences at the Naval Research Laboratory from 2017 to 2019. In 2014 she had the honor of being a member of the American delegation at the Heidelberg Laureate Forum. Dr. Emerson is now a Senior Data Scientist and Team Leader in the Data Sciences and Analytics Group at Pacific Northwest National Laboratory. In addition to her role at Pacific Northwest National Laboratory, Dr. Emerson holds joint appointments as Affiliate Faculty in the Departments of Mathematics at Colorado State University and the University of Texas, El Paso. Her research interests include geometric and topological data analysis, dimensionality reduction, algorithms for image processing and materials science, deep learning, and optimization.

Dr. Henry Kvinge

Pacific Northwest National Laboratory

University of Washington

Henry Kvinge received his PhD in Mathematics from UC Davis, where his research focused on the intersection of representation theory, algebraic combinatorics, and category theory. After two years as a postdoc in the Department of Mathematics at Colorado State University, where he worked on the compressive sensing-based algorithms underlying single-pixel cameras, he joined PNNL as a senior data scientist. These days his work focuses on leveraging ideas from geometry and representation theory to build more robust and adaptive deep learning models and frameworks.

Dr. Tim Doster

Pacific Northwest National Laboratory

Tim Doster is a Senior Data Scientist at the Pacific Northwest National Laboratory. He received the B.S. degree in computational mathematics from the Rochester Institute of Technology in 2008 and the Ph.D. degree in applied mathematics and scientific computing from the University of Maryland, College Park, in 2014. From 2014 to 2016, he was a Jerome and Isabella Karle Distinguished Scholar Fellow before becoming a Permanent Research Scientist in the Applied Optics division with the U.S. Naval Research Laboratory. During his time with the U.S. Naval Research Laboratory he won the prestigious DoD Laboratory University Collaboration Initiative (LUCI) grant. His research interests include machine learning, harmonic analysis, manifold learning, remote sensing, few-shot learning, and adversarial machine learning.

Dr. Sarah Tymochko

Michigan State University

Sarah Tymochko received her Ph.D. in Computational Mathematics, Science, and Engineering at Michigan State University. Her dissertation research focused on topological tools for time series analysis. Beginning summer 2022, she will be a Hedrick assistant adjunct professor in the Department of Mathematics at the University of California, Los Angeles. Her research interests include topological data analysis, dynamical systems, time series analysis, network science, and machine learning.

Dr. Alex Cloninger

University of California, San Diego

Alex Cloninger is an Associate Professor in Mathematics and the Halıcıoğlu Data Science Institute at UC San Diego. He received his PhD in Applied Mathematics and Scientific Computation from the University of Maryland in 2014, and was then an NSF Postdoc and Gibbs Assistant Professor of Mathematics at Yale University until 2017, when he joined UCSD. Alex researches problems in the area of geometric data analysis and applied harmonic analysis. He focuses on approaches that model the data as being locally lower dimensional, including data concentrated near manifolds or subspaces. These types of problems arise in a number of scientific disciplines, including imaging, medicine, and artificial intelligence, and the techniques developed relate to a number of machine learning and statistical algorithms, including deep learning, network analysis, and measuring distances between probability distributions.