Designed especially for neurobiologists, FluoRender is an interactive tool for multi-channel fluorescence microscopy data visualization and analysis.
Deep brain stimulation
BrainStimulator is a set of networks that are used in SCIRun to perform simulations of brain stimulation such as transcranial direct current stimulation (tDCS) and transcranial magnetic stimulation (TMS).
Developing software tools for science has always been a central vision of the SCI Institute.

Scientific Computing

Numerical simulation of real-world phenomena provides fertile ground for building interdisciplinary relationships. The SCI Institute has a long tradition of building these relationships in a win-win fashion – a win for the theoretical and algorithmic development of numerical modeling and simulation techniques and a win for the discipline-specific science of interest. High-order and adaptive methods, uncertainty quantification, complexity analysis, and parallelization are just some of the topics being investigated by SCI faculty. These areas of computing are being applied to a wide variety of engineering applications ranging from fluid mechanics and solid mechanics to bioelectricity.


Martin Berzins

Parallel Computing
GPUs
Mike Kirby

Finite Element Methods
Uncertainty Quantification
GPUs
Valerio Pascucci

Scientific Data Management
Chris Johnson

Problem Solving Environments
Amir Arzani

Scientific machine learning
Data-driven fluid flow modeling

Funded Research Projects:


Publications in Scientific Computing:


Complexity-Aware Deep Symbolic Regression with Robust Risk-Seeking Policy Gradients
Subtitled “arXiv:2406.06751,” Z. Bastiani, R.M. Kirby, J. Hochhalter, S. Zhe. 2024.

This paper proposes a novel deep symbolic regression approach to enhance the robustness and interpretability of data-driven mathematical expression discovery. Despite its success, the state-of-the-art method, DSR, is built on recurrent neural networks, is guided purely by data fitness, and can encounter tail barriers, which can zero out the policy gradient and cause inefficient model updates. To overcome these limitations, we use transformers in conjunction with breadth-first search to improve the learning performance. We use the Bayesian information criterion (BIC) as the reward function to explicitly account for the expression complexity and optimize the trade-off between interpretability and data fitness. We propose a modified risk-seeking policy that not only ensures the unbiasedness of the gradient, but also removes the tail barriers, thus ensuring effective updates from top performers. Through a series of benchmarks and systematic experiments, we demonstrate the advantages of our approach.
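The trade-off that a BIC-based reward encodes can be illustrated with a minimal sketch. It assumes Gaussian residuals, so that BIC reduces (up to a constant) to n·ln(MSE) + k·ln(n); the function names, the error values, and the parameter counts below are hypothetical and are not taken from the paper.

```python
import math

def bic(n_samples, mse, n_params):
    """BIC under a Gaussian-residual assumption: n*ln(MSE) + k*ln(n). Lower is better."""
    return n_samples * math.log(mse) + n_params * math.log(n_samples)

def reward(n_samples, mse, n_params):
    """Hypothetical reward: negative BIC, so a higher reward means a better expression."""
    return -bic(n_samples, mse, n_params)

# Two hypothetical candidate expressions evaluated on 100 data points:
# a compact one, and a longer one that fits only marginally better.
simple_reward = reward(n_samples=100, mse=1.00e-2, n_params=2)
complex_reward = reward(n_samples=100, mse=0.98e-2, n_params=8)

# The small gain in fit does not pay for six extra parameters,
# so the compact expression receives the higher reward.
preferred = "simple" if simple_reward > complex_reward else "complex"
print(preferred, round(simple_reward, 2), round(complex_reward, 2))
```

A reward based on data fitness alone would rank the longer expression first here; the complexity term is what reverses the ordering.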



Everywhere & Nowhere: Envisioning a Computing Continuum for Science
Subtitled “arXiv:2406.04480v1,” M. Parashar. 2024.

Emerging data-driven scientific workflows are seeking to leverage distributed data sources to understand end-to-end phenomena, drive experimentation, and facilitate important decision-making. Despite the exponential growth of available digital data sources at the edge, and the ubiquity of nontrivial computational power for processing this data, realizing such science workflows remains challenging. This paper explores a computing continuum that is everywhere and nowhere – one spanning resources at the edges, in the core and in between, and providing abstractions that can be harnessed to support science. It also introduces recent research in programming abstractions that can express what data should be processed and when and where it should be processed, and autonomic middleware services that automate the discovery of resources and the orchestration of computations across these resources.



Visual Exploratory Analysis for Designing Large-Scale Network-on-Chip Architectures: A Domain Expert-Led Design Study
S. Wang, H. Yan, K.E. Isaacs, Y. Sun. In IEEE Transactions on Visualization and Computer Graphics, Vol. 30, pp. 1970-1983. 2024.

Visualization design studies bring together visualization researchers and domain experts to address yet unsolved data analysis challenges stemming from the needs of the domain experts. Typically, the visualization researchers lead the design study process and implementation of any visualization solutions. This setup leverages the visualization researchers' knowledge of methodology, design, and programming, but limited availability to synchronize with the domain experts can hamper the design process. We consider an alternative setup where the domain experts take the lead in the design study, supported by the visualization experts. In this study, the domain experts are computer architecture experts who simulate and analyze novel computer chip designs. These chips rely on a Network-on-Chip (NOC) to connect components. The experts want to understand how the chip designs perform and what in the design led to their performance. To aid this analysis, we develop Vis4Mesh, a visualization system that provides spatial, temporal, and architectural context to simulated NOC behavior. Integration with an existing computer architecture visualization tool enables architects to perform deep-dives into specific architecture component behavior. We validate Vis4Mesh through a case study and a user study with computer architecture researchers. We reflect on our design and process, discussing advantages, disadvantages, and guidance for engaging in domain expert-led design studies.



Enabling Responsible Artificial Intelligence Research and Development Through the Democratization of Advanced Cyberinfrastructure
M. Parashar. In Harvard Data Science Review, Special Issue 4: Democratizing Data, 2024.

Artificial intelligence (AI) is driving discovery, innovation, and economic growth, and has the potential to transform science and society. However, realizing the positive, transformative potential of AI requires that AI research and development (R&D) progress responsibly; that is, in a way that protects privacy, civil rights, and civil liberties, and promotes principles of fairness, accountability, transparency, and equity. This article explores the importance of democratizing AI R&D for achieving the goal of responsible AI and its potential impacts.



Polynomial-Augmented Neural Networks (PANNs) with Weak Orthogonality Constraints for Enhanced Function and PDE Approximation
Subtitled “arXiv preprint arXiv:2406.02336,” M. Cooley, S. Zhe, R.M. Kirby, V. Shankar. 2024.

We present polynomial-augmented neural networks (PANNs), a novel machine learning architecture that combines deep neural networks (DNNs) with a polynomial approximant. PANNs combine the strengths of DNNs (flexibility and efficiency in higher-dimensional approximation) with those of polynomial approximation (rapid convergence rates for smooth functions). To aid in both stable training and enhanced accuracy over a variety of problems, we present (1) a family of orthogonality constraints that impose mutual orthogonality between the polynomial and the DNN within a PANN; (2) a simple basis pruning approach to combat the curse of dimensionality introduced by the polynomial component; and (3) an adaptation of a polynomial preconditioning strategy to both DNNs and polynomials. We test the resulting architecture for its polynomial reproduction properties, ability to approximate both smooth functions and functions of limited smoothness, and as a method for the solution of partial differential equations (PDEs). Through these experiments, we demonstrate that PANNs offer superior approximation properties to DNNs for both regression and the numerical solution of PDEs, while also offering enhanced accuracy over both polynomial and DNN-based regression (each) when regressing functions with limited smoothness.



XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing
Subtitled “arXiv:2401.04552v1,” T. Hoefler, M. Copik, P. Beckman, A. Jones, I. Foster, M. Parashar, D. Reed, M. Troyer, T. Schulthess, D. Ernst, J. Dongarra. 2024.

HPC and Cloud have evolved independently, specializing their innovations into performance or productivity. Acceleration as a Service (XaaS) is a recipe to empower both fields with a shared execution platform that provides transparent access to computing resources, regardless of the underlying cloud or HPC service provider. Bridging HPC and cloud advancements, XaaS presents a unified architecture built on performance-portable containers. Our converged model concentrates on low-overhead, high-performance communication and computing, targeting resource-intensive workloads from climate simulations to machine learning. XaaS lifts the restricted allocation model of Function-as-a-Service (FaaS), allowing users to benefit from the flexibility and efficient resource utilization of serverless while supporting long-running and performance-sensitive workloads from HPC.



Automated Programmatic Performance Analysis of Parallel Programs
Subtitled “arXiv:2401.13150v1,” O. Cankur, A. Tomar, D. Nichols, C. Scully-Allison, K. Isaacs, A. Bhatele. 2024.

Developing efficient parallel applications is critical to advancing scientific development but requires significant performance analysis and optimization. Performance analysis tools help developers manage the increasing complexity and scale of performance data, but often rely on the user to manually explore low-level data and are rigid in how the data can be manipulated. We propose a Python-based API, Chopper, which provides high-level and flexible performance analysis for both single and multiple executions of parallel applications. Chopper facilitates performance analysis and reduces developer effort by providing configurable high-level methods for common performance analysis tasks such as calculating load imbalance, hot paths, scalability bottlenecks, correlation between metrics and CCT nodes, and causes of performance variability within a robust and mature Python environment that provides fluid access to lower-level data manipulations. We demonstrate how Chopper allows developers to quickly and succinctly explore performance and identify issues across applications such as AMG, Laghos, LULESH, Quicksilver and Tortuga.
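One of the tasks named above, calculating load imbalance, can be sketched in plain Python. This is not Chopper's API: the function name, the profile layout, and the timings are hypothetical, and the max-over-mean ratio is only one commonly used definition of imbalance.

```python
from statistics import mean

def load_imbalance(times_per_process):
    """Max-over-mean ratio of per-process times; 1.0 means perfectly balanced."""
    avg = mean(times_per_process)
    if avg == 0:
        return 1.0  # nothing was measured, treat as balanced
    return max(times_per_process) / avg

# Hypothetical profile: seconds spent in each function on four processes.
profile = {
    "solve": [4.0, 4.0, 4.0, 4.0],
    "exchange_halo": [1.0, 1.0, 1.0, 5.0],
}

imbalance = {name: load_imbalance(times) for name, times in profile.items()}
worst = max(imbalance, key=imbalance.get)
print(imbalance, worst)
```

In this made-up profile the function with the smaller total time is the more imbalanced one, which is the kind of issue that is easy to miss when only aggregate timings are inspected.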



Kolmogorov n-Widths for Multitask Physics-Informed Machine Learning (PIML) Methods: Towards Robust Metrics
Subtitled “arXiv preprint arXiv:2402.11126,” M. Penwarden, H. Owhadi, R.M. Kirby. 2024.

Physics-informed machine learning (PIML) as a means of solving partial differential equations (PDE) has garnered much attention in the Computational Science and Engineering (CS&E) world. This topic encompasses a broad array of methods and models aimed at solving a single or a collection of PDE problems, called multitask learning. PIML is characterized by the incorporation of physical laws into the training process of machine learning models in lieu of large data when solving PDE problems. Despite the overall success of this collection of methods, it remains incredibly difficult to analyze, benchmark, and generally compare one approach to another. Using Kolmogorov n-widths as a measure of effectiveness of approximating functions, we judiciously apply this metric in the comparison of various multitask PIML architectures. We compute lower accuracy bounds and analyze the model's learned basis functions on various PDE problems. This is the first objective metric for comparing multitask PIML architectures and helps remove uncertainty in model validation from selective sampling and overfitting. We also identify avenues of improvement for model architectures, such as the choice of activation function, which can drastically affect model generalization to "worst-case" scenarios, which is not observed when reporting task-specific errors. We also incorporate this metric into the optimization process through regularization, which improves the models' generalizability over the multitask PDE problem.



TGPT-PINN: Nonlinear model reduction with transformed GPT-PINNs
Subtitled “arXiv preprint arXiv:2403.03459,” Y. Chen, Y. Ji, A. Narayan, Z. Xu. 2024.

We introduce the Transformed Generative Pre-Trained Physics-Informed Neural Networks (TGPT-PINN) for accomplishing nonlinear model order reduction (MOR) of transport-dominated partial differential equations in an MOR-integrating PINNs framework. Building on the recent development of the GPT-PINN, a network-of-networks design achieving snapshot-based model reduction, we design and test a novel paradigm for nonlinear model reduction that can effectively tackle problems with parameter-dependent discontinuities. Through incorporation of a shock-capturing loss function component as well as a parameter-dependent transform layer, the TGPT-PINN overcomes the limitations of linear model reduction in the transport-dominated regime. We demonstrate this new capability for nonlinear model reduction in the PINNs framework on several nontrivial parametric partial differential equations.



EfficientMorph: Parameter-Efficient Transformer-Based Architecture for 3D Image Registration
Subtitled “arXiv preprint arXiv:2403.11026,” A.Z.B. Aziz, M.S.T. Karanam, T. Kataria, S.Y. Elhabian. 2024.

Transformers have emerged as the state-of-the-art architecture in medical image registration, outperforming convolutional neural networks (CNNs) by addressing their limited receptive fields and overcoming gradient instability in deeper models. Despite their success, transformer-based models require substantial resources for training, including data, memory, and computational power, which may restrict their applicability for end users with limited resources. In particular, existing transformer-based 3D image registration architectures face three critical gaps that challenge their efficiency and effectiveness. Firstly, while mitigating the quadratic complexity of full attention by focusing on local regions, window-based attention mechanisms often fail to adequately integrate local and global information. Secondly, feature similarities across attention heads that were recently found in multi-head attention architectures indicate a significant computational redundancy, suggesting that the capacity of the network could be better utilized to enhance performance. Lastly, the granularity of tokenization, a key factor in registration accuracy, presents a trade-off; smaller tokens improve detail capture at the cost of higher computational complexity, increased memory demands, and a risk of overfitting. Here, we propose EfficientMorph, a transformer-based architecture for unsupervised 3D image registration. It optimizes the balance between local and global attention through a plane-based attention mechanism, reduces computational redundancy via cascaded group attention, and captures fine details without compromising computational efficiency, thanks to a Hi-Res tokenization strategy complemented by merging operations. We compare the effectiveness of EfficientMorph on two public datasets, OASIS and IXI, against other state-of-the-art models. Notably, EfficientMorph sets a new benchmark for performance on the OASIS dataset with ∼16-27× fewer parameters.



Accelerated process parameter selection of polymer-based selective laser sintering via hybrid physics-informed neural network and finite element surrogate modelling
H.P. Yeh, M. Bayat, A. Arzani, J.H. Hattel. In Applied Mathematical Modelling, Vol. 130, pp. 693-712. 2024.

The state of the melt region as well as the temperature field are critical indicators reflecting the stability of the process and subsequent product quality in selective laser sintering (SLS). The present study compares various simulation models for analyzing melt pool morphologies, specifically considering their complex transient evolution. While thermal fluid dynamic simulations offer comprehensive insights into melt regions, their inherent high computational time demand is a drawback. In SLS, the polymer's high viscosity and low conductivity limit liquid flow, thereby promoting a slow evolution of the melt region formation. Based on this observation, utilizing low-complexity pure heat conduction simulation can be adequate for describing melt region morphologies as compared to the more complex thermal fluid dynamic simulations. In the present work, we propose such a purely conduction based finite element (FE) model and use it in combination with an AI-powered partial differential equation (PDE) solver based on a parametric physics-informed neural network (PINN). We specifically conduct the simulations for the sintering process, where large thermal gradients are present, with the parametric PINN based model, whereas we employ the finite element method (FEM) for the cooling phase in which gradients and cooling rates are several orders lower, thus enabling the prediction of sintering temperature and melt region morphology under various configurations. The combined hybrid model demonstrates less than 7% deviation in temperatures and less than 1% in melt pool sizes as compared to the pure FEM-based models, with faster computational times of 0.7 s for sintering and 20 min for cooling. Moreover, the hybrid model is utilized for multi-track simulation with parametric variations with the purpose of optimizing the manufacturing process. Our model provides an approach to determine the most suitable combinations of settings that enhance manufacturing speed while preventing issues such as lack of fusion and material degradation.



Subsampling of Parametric Models with Bifidelity Boosting
N. Cheng, O.A. Malik, Y. Xu, S. Becker, A. Doostan, A. Narayan. In Journal on Uncertainty Quantification, ACM, 2024.

Least squares regression is a ubiquitous tool for building emulators (a.k.a. surrogate models) of problems across science and engineering for purposes such as design space exploration and uncertainty quantification. When the regression data are generated using an experimental design process (e.g., a quadrature grid) involving computationally expensive models, or when the data size is large, sketching techniques have shown promise at reducing the cost of the construction of the regression model while ensuring accuracy comparable to that of the full data. However, random sketching strategies, such as those based on leverage scores, lead to regression errors that are random and may exhibit large variability. To mitigate this issue, we present a novel boosting approach that leverages cheaper, lower-fidelity data of the problem at hand to identify the best sketch among a set of candidate sketches. This in turn specifies the sketch of the intended high-fidelity model and the associated data. We provide theoretical analyses of this bifidelity boosting (BFB) approach and discuss the conditions the low- and high-fidelity data must satisfy for a successful boosting. In doing so, we derive a bound on the residual norm of the BFB sketched solution relating it to its ideal, but computationally expensive, high-fidelity boosted counterpart. Empirical results on both manufactured and PDE data corroborate the theoretical analyses and illustrate the efficacy of the BFB solution in reducing the regression error, as compared to the nonboosted solution.
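The selection step described above can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it uses uniform row subsampling in place of leverage-score sketching, a two-parameter line fit as the regression model, and made-up stand-ins for the high- and low-fidelity models. All function names are hypothetical.

```python
import math
import random

def fit_line(xs, ys):
    """Closed-form least-squares fit of y ~ a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def residual_norm(xs, ys, coef):
    """Euclidean norm of the residual of the fit on all rows."""
    a, b = coef
    return math.sqrt(sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)))

def boosted_sketch(xs, y_low, num_candidates, sketch_size, rng):
    """Pick the candidate row subset whose fit has the smallest residual
    on the full (cheap) low-fidelity data."""
    best_rows, best_res = None, float("inf")
    for _ in range(num_candidates):
        rows = rng.sample(range(len(xs)), sketch_size)
        coef = fit_line([xs[i] for i in rows], [y_low[i] for i in rows])
        res = residual_norm(xs, y_low, coef)
        if res < best_res:
            best_rows, best_res = rows, res
    return best_rows, best_res

xs = [i / 20 for i in range(40)]
y_high = [math.exp(x) for x in xs]         # stand-in for the expensive model
y_low = [1 + x + 0.5 * x * x for x in xs]  # stand-in for a cheap approximation

rows, low_res = boosted_sketch(xs, y_low, num_candidates=10, sketch_size=5,
                               rng=random.Random(0))
# Only the selected rows of the high-fidelity data are needed for the final fit.
coef_high = fit_line([xs[i] for i in rows], [y_high[i] for i in rows])
high_res = residual_norm(xs, y_high, coef_high)
print(sorted(rows), low_res, high_res)
```

The full high-fidelity vector is computed here only to report the residual; in the setting the abstract describes, the point is that the expensive model is evaluated on the chosen rows alone.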



Integrating FAIR Digital Objects (FDOs) into the National Science Data Fabric (NSDF) to Revolutionize Dataflows for Scientific Discovery
M. Taufer, H. Martinez, J. Luettgau, L. Whitnah, G. Scorzelli, P. Newel, A. Panta, T. Bremer, D. Fils, C.R. Kirkpatrick, N. McCurdy, V. Pascucci. In Computing in Science & Engineering, IEEE, 2024.

In this perspective paper, we introduce a paradigm-shifting approach that combines the power of FAIR Digital Objects (FDO) with the National Science Data Fabric (NSDF), defining a new era of data accessibility, scientific discovery, and education. Integrating FDOs into the NSDF opens doors to overcoming substantial data access barriers and facilitating the extraction of machine-actionable metadata aligned with FAIR principles. Our augmented NSDF empowers the exchange of massive climate simulations and streamlines materials science workflows. This paper lays the foundation for an inclusive, web-centric, and network-first design, democratizing data access and fostering unprecedented opportunities for research and collaboration within the scientific community.



Neighborhood built environment, obesity, and diabetes: A Utah siblings study
Q.C. Nguyen, T. Tasdizen, M. Alirezaei, H. Mane, X. Yue, J.S. Merchant, W. Yu, L. Drew, D. Li, T.T. Nguyen. In SSM - Population Health, Vol. 26, 2024.

Background

This study utilizes innovative computer vision methods alongside Google Street View images to characterize neighborhood built environments across Utah.

Methods

Convolutional Neural Networks were used to create indicators of street greenness, crosswalks, and building type on 1.4 million Google Street View images. The demographic and medical profiles of Utah residents came from the Utah Population Database (UPDB). We implemented hierarchical linear models with individuals nested within zip codes to estimate associations between neighborhood built environment features and individual-level obesity and diabetes, controlling for individual- and zip code-level characteristics (n = 1,899,175 adults living in Utah in 2015). Sibling random effects models were implemented to account for shared family attributes among siblings (n = 972,150) and twins (n = 14,122).

Results

Consistent with prior neighborhood research, the variance partition coefficients (VPC) of our unadjusted models nesting individuals within zip codes were relatively small (0.5%–5.3%), except for HbA1c (VPC = 23%), suggesting a small percentage of the outcome variance is at the zip code-level. However, proportional change in variance (PCV) attributable to zip codes after the inclusion of neighborhood built environment variables and covariates ranged between 11% and 67%, suggesting that these characteristics account for a substantial portion of the zip code-level effects. Non-single-family homes (indicator of mixed land use), sidewalks (indicator of walkability), and green streets (indicator of neighborhood aesthetics) were associated with reduced diabetes and obesity. Zip codes in the third tertile for non-single-family homes were associated with a 15% reduction (PR: 0.85; 95% CI: 0.79, 0.91) in obesity and a 20% reduction (PR: 0.80; 95% CI: 0.70, 0.91) in diabetes. This tertile was also associated with a BMI reduction of −0.68 kg/m² (95% CI: −0.95, −0.40).

Conclusion

We observe associations between neighborhood characteristics and chronic diseases, accounting for biological, social, and cultural factors shared among siblings in this large population-based study.
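The two variance summaries reported in the results, VPC and PCV, follow standard multilevel-model definitions and can be sketched as below. The variance values are hypothetical and are not taken from the study.

```python
def vpc(between_var, within_var):
    """Variance partition coefficient: share of total variance at the group (zip code) level."""
    return between_var / (between_var + within_var)

def pcv(null_between_var, adjusted_between_var):
    """Proportional change in group-level variance after adding covariates."""
    return (null_between_var - adjusted_between_var) / null_between_var

# Hypothetical variance components from an unadjusted and an adjusted model.
null_zip_var, individual_var = 0.05, 0.95
adjusted_zip_var = 0.02

print(f"VPC = {vpc(null_zip_var, individual_var):.1%}")  # share of variance between zip codes
print(f"PCV = {pcv(null_zip_var, adjusted_zip_var):.1%}")  # share of that variance explained
```

A small VPC together with a large PCV, as in this toy example, matches the pattern described in the results: little of the outcome variance sits at the zip code level, but the added variables explain much of what does.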



CrossPrefetch: Accelerating I/O Prefetching for Modern Storage
S. Garg, J. Zhang, R. Pitchumani, M. Parashar, B. Xie, S. Kannan. In 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ACM, 2024.

We introduce CrossPrefetch, a novel cross-layered I/O prefetching mechanism that operates across the OS and a user-level runtime to achieve optimal performance. Existing OS prefetching mechanisms suffer from rigid interfaces that do not provide information to applications on the prefetch effectiveness, suffer from high concurrency bottlenecks, and are inefficient in utilizing available system memory. CrossPrefetch addresses these limitations by dividing responsibilities between the OS and runtime, minimizing overhead, and achieving fewer cache misses, less lock contention, and higher I/O performance.

CrossPrefetch tackles the limitations of rigid OS prefetching interfaces by maintaining and exporting cache state and prefetch effectiveness to user-level runtimes. It also addresses scalability and concurrency bottlenecks by distinguishing between regular I/O and prefetch operations paths and introduces fine-grained prefetch indexing for shared files. Finally, CrossPrefetch designs low-interference access pattern prediction combined with support for adaptive and aggressive techniques to exploit memory capacity and storage bandwidth. Our evaluation of CrossPrefetch, encompassing microbenchmarks, macrobenchmarks, and real-world workloads, illustrates performance gains of up to 1.22x-3.7x in I/O throughput. We also evaluate CrossPrefetch across different file systems and local and remote storage configurations.



An illustration of extending Hedgehog to multi-node GPU architectures using GEMM
N. Shingde, T. Blattner, A. Bardakoff, W. Keyrouz, M. Berzins. In Springer Nature (to appear), 2024.

Asynchronous task-based systems offer the possibility of making it easier to take advantage of scalable heterogeneous architectures. This paper extends the previous work, demonstrating how Hedgehog, a dataflow graph-based model developed at the National Institute of Standards and Technology, can be used to obtain high performance for numerical linear algebraic operations as a starting point for complex algorithms. While the results were promising, it was unclear how to scale them to larger matrices and compute node counts. The aim here is to show how the new, improved algorithm inspired by DPLASMA performs equally well using Hedgehog. The results are compared against the leading library DPLASMA to illustrate the performance of different asynchronous dataflow models. The work demonstrates that using general-purpose, high-level abstractions, such as Hedgehog’s dataflow graphs, makes it possible to achieve similar performance to the specialized linear algebra codes such as DPLASMA.



Grand Challenges at the Interface of Engineering and Medicine
S. Subramaniam, M. Miller, several co-authors, Chris R. Johnson, et al. In IEEE Open Journal of Engineering in Medicine and Biology, Vol. 5, IEEE, pp. 1-13. 2024.
DOI: 10.1109/OJEMB.2024.3351717

Over the past two decades Biomedical Engineering has emerged as a major discipline that bridges societal needs of human health care with the development of novel technologies. Every medical institution is now equipped at varying degrees of sophistication with the ability to monitor human health in both non-invasive and invasive modes. The multiple scales at which human physiology can be interrogated provide a profound perspective on health and disease. We are at the nexus of creating “avatars” (herein defined as an extension of “digital twins”) of human patho/physiology to serve as paradigms for interrogation and potential intervention. Motivated by the emergence of these new capabilities, the IEEE Engineering in Medicine and Biology Society, the Departments of Biomedical Engineering at Johns Hopkins University and Bioengineering at University of California at San Diego sponsored an interdisciplinary workshop to define the grand challenges that face biomedical engineering and the mechanisms to address these challenges. The Workshop identified five grand challenges with cross-cutting themes and provided a roadmap for new technologies, identified new training needs, and defined the types of interdisciplinary teams needed for addressing these challenges. The themes presented in this paper include: 1) accumedicine through creation of avatars of cells, tissues, organs and whole human; 2) development of smart and responsive devices for human function augmentation; 3) exocortical technologies to understand brain function and treat neuropathologies; 4) the development of approaches to harness the human immune system for health and wellness; and 5) new strategies to engineer genomes and cells.



Computational Error Estimation for the Material Point Method in 1D and 2D
M. Berzins. In VIII International Conference on Particle-Based Methods, PARTICLES 2023, 2024.

The Material Point Method (MPM) is widely used for challenging applications in engineering and animation but lags behind some other methods in terms of error analysis and computable error estimates. The complexity and nonlinearity of the equations solved by the method and its reliance both on a mesh and on moving particles make error estimation challenging. Some preliminary error analysis of a simple MPM method has shown the global error to be first order in space and time for a widely used variant of the Material Point Method. The overall time-dependent nature of MPM also complicates matters, as both space and time errors and their evolution must be considered, thus leading to the use of explicit error transport equations. The preliminary use of an error estimator based on this transport approach has yielded promising results in the 1D case. One other source of error in MPM is the grid-crossing error, which can be problematic for large deformations, leading to large errors that are identified by the error estimator used. The extension of the error estimation approach to two space dimensions is considered and, together with additional algorithmic and theoretical results, shown to give promising results in preliminary computational experiments.



Making Uintah Performance Portable for Department of Energy Exascale Testbeds
J.K. Holmen, M. García, A. Bagusetty, V. Madananth, A. Sanderson, M. Berzins. In Euro-Par 2023: Parallel Processing, pp. 1-12. 2024.

To help ease ports to forthcoming Department of Energy (DOE) exascale systems, testbeds have been made available to select users. These testbeds are helpful for preparing codes to run on the same hardware and similar software as in their respective exascale systems. This paper describes how the Uintah Computational Framework, an open-source asynchronous many-task (AMT) runtime system, has been modified to be performance portable across the DOE Crusher, DOE Polaris, and DOE Sunspot testbeds in preparation for portable simulations across the exascale DOE Frontier and DOE Aurora systems. The Crusher, Polaris, and Sunspot testbeds feature the AMD MI250X, NVIDIA A100, and Intel PVC GPUs, respectively. This performance portability has been made possible by extending Uintah’s intermediate portability layer [18] to additionally support the Kokkos::HIP, Kokkos::OpenMPTarget, and Kokkos::SYCL back-ends. This paper also describes notable updates to Uintah’s support for Kokkos, which were required to make this extension possible. Results are shown for a challenging radiative heat transfer calculation, central to the University of Utah’s predictive boiler simulations. These results demonstrate single-source portability across AMD-, NVIDIA-, and Intel-based GPUs using various Kokkos back-ends.



Pairwise Learning with Adaptive Online Gradient Descent
T. Sun, Q. Wang, Y. Lei, D. Li, B. Wang. In Transactions on Machine Learning Research, 2023.

In this paper, we propose an adaptive online gradient descent method with momentum for pairwise learning, in which the step size is determined by historical information. Due to the structure of pairwise learning, the sample pairs are dependent on the parameters, causing difficulties in the convergence analysis. To this end, we develop novel techniques for the convergence analysis of the proposed algorithm. We show that the proposed algorithm can output the desired solution in strongly convex, convex, and nonconvex cases. Furthermore, we present theoretical explanations for why our proposed algorithm can accelerate previous workhorses for online pairwise learning. All assumptions used in the theoretical analysis are mild and common, making our results applicable to various pairwise learning problems. To demonstrate the efficiency of our algorithm, we compare the proposed adaptive method with the non-adaptive counterpart on the benchmark online AUC maximization problem.
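The general idea of an online pairwise update whose step size depends on gradient history can be sketched as below. This is an illustrative stand-in, not the algorithm analyzed in the paper: it assumes a linear scorer, a pairwise hinge loss as a surrogate for AUC, an AdaGrad-norm style step size, and an exponential moving average of gradients as the momentum term. Each new example is paired only with the previous one, and all names and constants are hypothetical.

```python
import math
import random

def pairwise_hinge_grad(w, x_pos, x_neg):
    """Gradient of max(0, 1 - w.(x_pos - x_neg)) with respect to w."""
    diff = [p - q for p, q in zip(x_pos, x_neg)]
    margin = sum(wi * di for wi, di in zip(w, diff))
    if margin >= 1.0:
        return [0.0] * len(w)
    return [-d for d in diff]

def adaptive_pairwise_ogd(stream, dim, base_lr=1.0, beta=0.9, eps=1e-8):
    w = [0.0] * dim
    velocity = [0.0] * dim
    grad_sq_sum = 0.0  # running sum of squared gradient norms (the "history")
    prev = None
    for x, y in stream:
        # Form a pair from the new example and the previous one if labels differ.
        if prev is not None and prev[1] != y:
            x_pos, x_neg = (x, prev[0]) if y == 1 else (prev[0], x)
            g = pairwise_hinge_grad(w, x_pos, x_neg)
            grad_sq_sum += sum(gi * gi for gi in g)
            lr = base_lr / math.sqrt(grad_sq_sum + eps)  # step size from history
            velocity = [beta * v + (1 - beta) * gi for v, gi in zip(velocity, g)]
            w = [wi - lr * v for wi, v in zip(w, velocity)]
        prev = (x, y)
    return w

def auc(w, data):
    """Fraction of positive/negative pairs that the linear scorer ranks correctly."""
    score = lambda x: sum(wi * xi for wi, xi in zip(w, x))
    pos = [score(x) for x, y in data if y == 1]
    neg = [score(x) for x, y in data if y == 0]
    correct = sum(1 for p in pos for n in neg if p > n)
    return correct / (len(pos) * len(neg))

# Synthetic stream: labels alternate; the first feature separates the classes.
rng = random.Random(0)
data = []
for i in range(400):
    y = i % 2
    center = 1.0 if y == 1 else -1.0
    data.append(([center + rng.gauss(0, 0.5), rng.gauss(0, 0.5)], y))

w = adaptive_pairwise_ogd(data, dim=2)
print(w, auc(w, data))
```

Because the step size shrinks as squared gradient norms accumulate, no learning-rate schedule has to be tuned by hand, which is the practical appeal of adaptive methods in this setting.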