S. H. Campbell, T. Bidone. 3D Model of Cell Migration and Proliferation in a Tissue Scaffold, In Biophysical Journal, Vol. 120, No. 3, Elsevier, pp. 265a. 2021.
Tissue scaffolds restore tissue functionality without the limitations of transplants. However, successful tissue growth depends on the interplay between scaffold properties and cell activities. It has been previously reported that scaffold porosity and Young's modulus affect cell migration and tissue generation. However, exactly how the geometrical and mechanical properties of a scaffold interplay with cell processes remains poorly understood, even though this interplay is essential for successful tissue growth. We developed a 3D computational model that simulates cell migration and proliferation on a scaffold. The model generates an adjustable 3D porous scaffold environment with a defined pore size and Young's modulus. Cells are treated as explicit spherical particles comparable in size to bone-marrow cells and are initially seeded randomly throughout the scaffold. Cells can create adhesions, proliferate, and independently migrate across pores in a random walk. Cell adhesions during migration follow the molecular-clutch mechanism, where traction force from the cells against the scaffold stiffness reinforces adhesion lifetime up to a threshold. We used the model to test how variations in cell proliferation rate, scaffold Young's modulus, and porosity affect cell migration speed. At a low proliferation rate (1 × 10⁻⁷ s⁻¹), the spread of cell speeds is larger than at a high proliferation rate (1 × 10⁻⁶ s⁻¹). A biphasic relation between Young's modulus and cell speed is also observed, reflecting the molecular-clutch mechanism at the level of individual adhesions. These observations are consistent with previous reports regarding fibroblast migration on collagen-glycosaminoglycan scaffolds. Additionally, our model shows that similar cell and pore diameters induce a crowding effect that decreases cell speed. The results from our study provide important insights about biophysical mechanisms that govern cell motility on scaffolds with different properties for tissue engineering applications.
K.M. Campbell, H. Dai, Z. Su, M. Bauer, P.T. Fletcher, S.C. Joshi. Integrated Construction of Multimodal Atlases with Structural Connectomes in the Space of Riemannian Metrics, Subtitled arXiv preprint arXiv:2109.09808, 2021.
The structural network of the brain, or structural connectome, can be represented by fiber bundles generated by a variety of tractography methods. While such methods give qualitative insights into brain structure, there is controversy over whether they can provide quantitative information, especially at the population level. In order to enable population-level statistical analysis of the structural connectome, we propose representing a connectome as a Riemannian metric, which is a point on an infinite-dimensional manifold. We equip this manifold with the Ebin metric, a natural metric structure for this space, to get a Riemannian manifold along with its associated geometric properties. We then use this Riemannian framework to apply object-oriented statistical analysis to define an atlas as the Fréchet mean of a population of Riemannian metrics. This formulation ties into the existing framework for diffeomorphic construction of image atlases, allowing us to construct a multimodal atlas by simultaneously integrating complementary white matter structure details from DWMRI and cortical details from T1-weighted MRI. We illustrate our framework with 2D data examples of connectome registration and atlas formation. Finally, we build an example 3D multimodal atlas using T1 images and connectomes derived from diffusion tensors estimated from a subset of subjects from the Human Connectome Project.
M. Carlson, X. Zheng, H. Sundar, G. E. Karniadakis, R. M. Kirby. An open-source parallel code for computing the spectral fractional Laplacian on 3D complex geometry domains, In Computer Physics Communications, Vol. 261, North-Holland, pp. 107695. 2021.
We present a spectral element algorithm and open-source code for computing the fractional Laplacian defined by the eigenfunction expansion on finite 2D/3D complex domains with both homogeneous and nonhomogeneous boundaries. We demonstrate the scalability of the spectral element algorithm on large clusters by constructing the fractional Laplacian based on computed eigenvalues and eigenfunctions using up to thousands of CPUs. To demonstrate the accuracy of this eigen-based approach for computing the fractional Laplacian, we approximate the solutions of the fractional diffusion equation using the computed eigenvalues and eigenfunctions on a 2D quadrilateral, and on a 3D cubic and cylindrical domain, and compare the results with the contrived solutions to demonstrate fast convergence. Subsequently, we present simulation results for a fractional diffusion equation on a hand-shaped domain discretized with 3D hexahedra, as well as on a domain constructed from the Hanford site geometry corresponding to nonzero Dirichlet boundary conditions. Finally, we apply the algorithm to solve the surface quasi-geostrophic (SQG) equation on a 2D square with periodic boundaries. Simulation results demonstrate the accuracy, efficiency, and geometric flexibility of our algorithm and that our algorithm can capture the subtle dynamics of anomalous diffusion modeled by the fractional Laplacian on complex geometry domains. The included open-source code is the first of its kind.
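The eigenfunction-expansion definition mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a 1D unit interval with homogeneous Dirichlet boundaries and swaps in a second-order finite-difference discretization in place of the paper's spectral element method; the function name `fractional_laplacian_1d` and all parameters are hypothetical and are not taken from the published code.

```python
import numpy as np

def fractional_laplacian_1d(u, h, s):
    """Apply (-Laplacian)^s to interior samples u via a discrete eigen-expansion.

    Assumption: uniform grid with spacing h and zero Dirichlet boundaries,
    discretized by finite differences (not spectral elements).
    """
    n = u.size
    # Discrete -d^2/dx^2 with zero values assumed just outside the grid
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    lam, phi = np.linalg.eigh(A)      # eigenvalues and orthonormal eigenvectors
    coeffs = phi.T @ u                # project u onto the eigenvectors
    return phi @ (lam**s * coeffs)    # scale each mode by lambda^s and recombine

n = 200
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)        # interior grid points
u = np.sin(np.pi * x)                 # first Dirichlet eigenfunction on (0, 1)
result = fractional_laplacian_1d(u, h, 0.5)
```

For this choice of `u` the continuous operator with s = 0.5 would return π·sin(πx), so `result` should be close to `np.pi * u` up to discretization error.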
K. R. Carney, A. M. Khan, S. C. Samson, N. Mittal, S. J. Han, M. C. Mendoza, T. C. Bidone. Nascent adhesions differentially regulate lamellipodium velocity and persistence, Subtitled bioRxiv, 2021.
Cell migration is essential to physiological and pathological biology. Migration is driven by the motion of a leading edge, in which actin polymerization pushes against the edge and adhesions transmit traction to the substrate while membrane tension increases. How the actin and adhesions synergistically control edge protrusion remains elusive. We addressed this question by developing a computational model in which the Brownian ratchet mechanism governs actin filament polymerization against the membrane and the molecular clutch mechanism governs adhesion to the substrate (BR-MC model). Our model predicted that actin polymerization is the most significant driver of protrusion, as actin had a greater effect on protrusion than adhesion assembly. Increasing the lifetime of nascent adhesions also enhanced velocity, but decreased the protrusion's motional persistence, because filaments maintained against the cell edge ceased polymerizing as membrane tension increased. We confirmed the model predictions with measurement of adhesion lifetime and edge motion in migrating cells. Adhesions with longer lifetime were associated with faster protrusion velocity and shorter persistence. Experimentally increasing adhesion lifetime increased velocity but decreased persistence. We propose a mechanism for actin polymerization-driven, adhesion-dependent protrusion in which balanced nascent adhesion assembly and lifetime generates protrusions with the power and persistence to drive migration.
Y. Chen, C. McNabb, T. Bidone. Computational Model of E-cadherin Clustering under Cortical Tension, In Biophysical Journal, Vol. 120, No. 3, Elsevier, pp. 236a. 2021.
E-cadherins are adhesion proteins that play a critical role in the formation of cell-cell junctions for several physiological processes, including tissue development and homeostasis. The formation of E-cadherin clusters involves extracellular trans- and cis-associations between cadherin ectodomains and stabilization through intracellular coupling with the contractile actomyosin cortex. The dynamic remodeling of cell-cell junctions largely depends on cortical tension, but previous modeling frameworks did not incorporate this effect. In order to gain insights into the effects of cortical tension on the dynamic properties of E-cadherin clusters, here we developed a computational model based on Brownian dynamics. The model considers individual cadherins as explicit point particles undergoing cycles of lateral diffusion on two parallel surfaces that mimic the membrane of neighboring cells. E-cadherins transit between …
Y. Chen, L. Ji, A. Narayan, Z. Xu. L1-based reduced over collocation and hyper reduction for steady state and time-dependent nonlinear equations, In Journal of Scientific Computing, Vol. 87, No. 1, Springer US, pp. 1--21. 2021.
The task of repeatedly solving parametrized partial differential equations (pPDEs) in optimization, control, or interactive applications makes it imperative to design highly efficient and equally accurate surrogate models. The reduced basis method (RBM) presents itself as such an option. Accompanied by a mathematically rigorous error estimator, RBM carefully constructs a low-dimensional subspace of the parameter-induced high fidelity solution manifold on which an approximate solution is computed. It can improve efficiency by several orders of magnitude by leveraging an offline-online decomposition procedure. However, this decomposition, usually implemented with aid from the empirical interpolation method (EIM) for nonlinear and/or parametric-nonaffine PDEs, can be challenging to implement, or can result in severely degraded online efficiency. In this paper, we augment and extend the EIM approach as a direct solver, as opposed to an assistant, for solving nonlinear pPDEs on the reduced level. The resulting method, called Reduced Over-Collocation method (ROC), is stable and capable of avoiding efficiency degradation exhibited in traditional applications of EIM. Two critical ingredients of the scheme are collocation at about twice as many locations as the dimension of the reduced approximation space, and an efficient L1-norm-based error indicator for the strategic selection of the parameter values whose snapshots span the reduced approximation space. Together, these two ingredients ensure that the proposed L1-ROC scheme is both offline- and online-efficient. A distinctive feature is that the efficiency degradation appearing in alternative RBM approaches that utilize EIM for nonlinear and nonaffine problems is circumvented, both in the offline and online stages. Numerical tests on different families of time-dependent and steady-state nonlinear problems demonstrate the high efficiency and accuracy of L1-ROC and its superior stability performance.
J. Chilleri, Y. He, D. Bedrov, R. M. Kirby. Optimal allocation of computational resources based on Gaussian process: Application to molecular dynamics simulations, In Computational Materials Science, Vol. 188, Elsevier, pp. 110178. 2021.
Simulation models have been utilized in a wide range of real-world applications for behavior predictions of complex physical systems or material designs of large structures. While extensive simulation is mathematically preferable, external limitations such as available resources are often necessary considerations. With a fixed computational resource (i.e., total simulation time), we propose a Gaussian process-based numerical optimization framework for optimal time allocation over simulations at different locations, so that a surrogate model with uncertainty estimation can be constructed to approximate the full simulation. The proposed framework is demonstrated first via two synthetic problems, and later using a real test case of a glass-forming system with divergent dynamic relaxations where a Gaussian process is constructed to estimate the diffusivity and its uncertainty with respect to the temperature.
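As a rough illustration of the kind of surrogate the abstract above refers to, the sketch below fits a Gaussian process with a squared-exponential kernel and returns a posterior mean and variance. It does not reproduce the paper's time-allocation optimization; the kernel choice, the noise level, the toy data, and the names `rbf_kernel` and `gp_posterior` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between 1D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6, length_scale=1.0):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(x_train.size)
    K_s = rbf_kernel(x_train, x_test, length_scale)
    L = np.linalg.cholesky(K)                       # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    # Prior variance of this kernel is 1 at every point
    var = 1.0 - np.sum(v**2, axis=0)
    return mean, var

# Toy stand-in for simulation outputs at four "locations" (hypothetical data)
x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.sin(x_train)
x_test = np.array([1.0, 1.5, 6.0])
mean, var = gp_posterior(x_train, y_train, x_test)
```

The variance is nearly zero at a sampled location, grows between samples, and approaches the prior variance far from the data, which is the uncertainty signal an allocation scheme could act on.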
D. Dai, Y. Epshteyn, A. Narayan. Hyperbolicity-Preserving and Well-Balanced Stochastic Galerkin Method for Two-Dimensional Shallow Water Equations, In SIAM Journal on Scientific Computing, Vol. 43, No. 2, Society for Industrial and Applied Mathematics, pp. A929-A952. 2021.
Stochastic Galerkin formulations of the two-dimensional shallow water systems parameterized with random variables may lose hyperbolicity, and hence change the nature of the original model. In this work, we present a hyperbolicity-preserving stochastic Galerkin formulation by carefully selecting the polynomial chaos approximations to the nonlinear terms in the shallow water equations. We derive a sufficient condition to preserve the hyperbolicity of the stochastic Galerkin system which requires only a finite collection of positivity conditions on the stochastic water height at selected quadrature points in parameter space. Based on our theoretical results for the stochastic Galerkin formulation, we develop a corresponding well-balanced hyperbolicity-preserving central-upwind scheme. We demonstrate the accuracy and the robustness of the new scheme on several challenging numerical tests.
D. Dai, Y. Epshteyn, A. Narayan. Non-Dissipative and Structure-Preserving Emulators via Spherical Optimization, Subtitled arXiv:2108.12053, 2021.
Approximating a function with a finite series, e.g., involving polynomials or trigonometric functions, is a critical tool in computing and data analysis. The construction of such approximations via now-standard approaches like least squares or compressive sampling does not ensure that the approximation adheres to certain convex linear structural constraints, such as positivity or monotonicity. Existing approaches that ensure such structure are norm-dissipative, and this can have a deleterious impact when applying these approaches, e.g., when numerically solving partial differential equations. We present a new framework that enforces via optimization such structure on approximations and is simultaneously norm-preserving. This results in a conceptually simple convex optimization problem on the sphere, but the feasible set for such problems can be very complex. We establish well-posedness of the optimization problem through results on spherical convexity and design several spherical-projection-based algorithms to numerically compute the solution. Finally, we demonstrate the effectiveness of this approach through several numerical examples.
E. Deelman, A. Mandal, A. P. Murillo, J. Nabrzyski, V. Pascucci, R. Ricci, I. Baldin, S. Sons, L. Christopherson, C. Vardeman, R. F. da Silva, J. Wyngaard, S. Petruzza, M. Rynge, K. Vahi, W. R. Whitcup, J. Drake, E. Scott. Blueprint: Cyberinfrastructure Center of Excellence, Subtitled arXiv, 2021.
In 2018, NSF funded an effort to pilot a Cyberinfrastructure Center of Excellence (CI CoE or Center) that would serve the cyberinfrastructure (CI) needs of the NSF Major Facilities (MFs) and large projects with advanced CI architectures. The goal of the CI CoE Pilot project (Pilot) effort was to develop a model and a blueprint for such a CoE by engaging with the MFs, understanding their CI needs, understanding the contributions the MFs are making to the CI community, and exploring opportunities for building a broader CI community. This document summarizes the results of community engagements conducted during the first two years of the project and describes the identified CI needs of the MFs. To better understand MFs' CI, the Pilot has developed and validated a model of the MF data lifecycle that follows the data generation and management within a facility and gained an understanding of how this model captures the fundamental stages that the facilities' data passes through from the scientific instruments to the principal investigators and their teams, to the broader collaborations and the public. The Pilot also aimed to understand what CI workforce development challenges the MFs face while designing, constructing, and operating their CI and what solutions they are exploring and adopting within their projects. Based on the needs of the MFs in the data lifecycle and workforce development areas, this document outlines a blueprint for a CI CoE that will learn about and share the CI solutions designed, developed, and/or adopted by the MFs, provide expertise to the largest NSF projects with advanced and complex CI architectures, and foster a …
T.P. Driscoll, T.C. Bidone, S.J. Ahn, A. Yu, A. Groisman, G.A. Voth, M.A. Schwartz. Integrin-Based Mechanosensing through Conformational Deformation, In Biophysical Journal, 2021.
Conversion of integrins from low to high affinity states, termed activation, is important in biological processes including immunity, hemostasis, angiogenesis and embryonic development. Integrin activation is regulated by large-scale conformational transitions from closed, low affinity states to open, high affinity states. While it has been suggested that substrate stiffness shifts the conformational equilibrium of integrin and governs its unbinding, here we address the role of integrin conformational activation in cellular mechanosensing. Comparison of WT vs activating mutants of integrin αVβ3 shows that activating mutants shift cell spreading, FAK activation, traction stress and force on talin toward high stiffness values at lower stiffness. Although all activated integrin mutants showed equivalent binding affinity for soluble ligands, the β3 S243E mutant showed the strongest shift in mechanical responses. To understand this behavior, we used coarse-grained computational models derived from molecular level information. The models predicted that wild type integrin αVβ3 displaces under force, and that activating mutations shift the required force toward lower values, with S243E showing the strongest effect. Cellular stiffness sensing thus correlates with computed effects of force on integrin conformation. Together, these data identify a role for force-induced integrin conformational deformation in cellular mechanosensing.
A. Dubey, M. Berzins, C. Burstedde, M. L. Norman, D. Unat, M. Wahib. Structured Adaptive Mesh Refinement Adaptations to Retain Performance Portability With Increasing Heterogeneity, In Computing in Science & Engineering, Vol. 23, No. 5, pp. 62-66. 2021.
Adaptive mesh refinement (AMR) is an important method that enables many mesh-based applications to run at effectively higher resolution within limited computing resources by allowing high resolution only where really needed. This advantage comes at a cost, however: greater complexity in the mesh management machinery and challenges with load distribution. With the current trend of increasing heterogeneity in hardware architecture, AMR presents an orthogonal axis of complexity. The usual techniques that are necessary to obtain reasonable performance, such as asynchronous communication and hierarchy management for parallelism and memory, are very challenging to reason about with AMR. Different groups working with AMR are bringing different approaches to this challenge. Here, we examine the design choices of several AMR codes and also the degree to which demands placed on them by their users influence these choices.
M. D. Foote, P. E. Dennison, P. R. Sullivan, K. B. O'Neill, A. K. Thorpe, D. R. Thompson, D. H. Cusworth, R. Duren, S. Joshi. Impact of scene-specific enhancement spectra on matched filter greenhouse gas retrievals from imaging spectroscopy, In Remote Sensing of Environment, Vol. 264, Elsevier, pp. 112574. 2021.
Matched filter techniques have been widely used for retrieval of greenhouse gas enhancements from imaging spectroscopy datasets. While multiple algorithmic techniques and refinements have been proposed, the greenhouse gas target spectrum used for concentration enhancement estimation has remained largely unaltered since the introduction of quantitative matched filter retrievals. The magnitude of retrieved methane and carbon dioxide enhancements, and thereby integrated mass enhancements (IME) and estimated flux of point-source emitters, is heavily dependent on this target spectrum. Current standard use of molecular absorption coefficients to create unit enhancement target spectra does not account for absorption by background concentrations of greenhouse gases, solar and sensor geometry, or atmospheric water vapor absorption. We introduce geometric and atmospheric parameters into the generation of scene-specific unit enhancement spectra to provide target spectra that are compatible with all greenhouse gas retrieval matched filter techniques. Specifically, we use radiative transfer modeling to model four parameters that are expected to change between scenes: solar zenith angle, column water vapor, ground elevation, and sensor altitude. These parameter values are well defined, with low variation within a single scene. A benchmark dataset consisting of ten AVIRIS-NG airborne imaging spectrometer scenes was used to compare IME retrieved using a matched filter algorithm. For methane plumes, IME resulting from use of standard, generic enhancement spectra varied from −22 to +28.7% compared to scene-specific enhancement spectra. Due to differences in spectral shape between the generic and scene-specific enhancement spectra, differences in methane plume IME were linked to surface spectral characteristics in addition to geometric and atmospheric parameters. 
IME differences were much larger for carbon dioxide plumes, with generic enhancement spectra producing integrated mass enhancements −76.1 to −48.1% compared to scene-specific enhancement spectra. Fluxes calculated from these integrated enhancements would vary by the same percentages, assuming equivalent wind conditions. Methane and carbon dioxide IME were most sensitive to changes in solar zenith angle and ground elevation. We introduce an interpolation approach that can efficiently generate scene-specific unit enhancement spectra for given sets of parameters. Scene-specific target spectra can improve confidence in greenhouse gas retrievals and flux estimates across collections of scenes with diverse geometric and atmospheric conditions.
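For readers unfamiliar with the retrieval step discussed above, the following is a minimal sketch of a classical matched filter applied to synthetic data. The background statistics and the target spectrum here are random placeholders (assumptions); the scene-specific target spectra that the paper derives from radiative transfer modeling are not reproduced, and the function name `matched_filter` is hypothetical.

```python
import numpy as np

def matched_filter(pixels, mean, cov, target):
    """Per-pixel enhancement estimate: (x - mu)^T C^-1 t / (t^T C^-1 t)."""
    cinv_t = np.linalg.solve(cov, target)   # C^-1 t without forming the inverse
    return (pixels - mean) @ cinv_t / (target @ cinv_t)

rng = np.random.default_rng(0)
n_bands = 10
background = rng.normal(size=(500, n_bands))   # synthetic background radiances
mu = background.mean(axis=0)
cov = np.cov(background, rowvar=False)
target = rng.normal(size=n_bands)              # placeholder unit enhancement spectrum

# A pixel built as background mean plus a known multiple of the target
true_enhancement = 2.5
pixel = mu + true_enhancement * target
estimate = matched_filter(pixel[None, :], mu, cov, target)[0]
```

Because the estimate scales inversely with the magnitude of `target`, any change to the target spectrum changes the retrieved enhancement, which is why the choice of target spectrum matters for the reported mass enhancements.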
K. Gadhave, J. Görtler, Z. Cutler, C. Nobre, O. Deussen, M. Meyer, J.M. Phillips, A. Lex. Predicting intent behind selections in scatterplot visualizations, In Information Visualization, Vol. 20, No. 4, pp. 207-228. 2021.
Predicting and capturing an analyst’s intent behind a selection in a data visualization is valuable in two scenarios: First, a successful prediction of a pattern an analyst intended to select can be used to auto-complete a partial selection which, in turn, can improve the correctness of the selection. Second, knowing the intent behind a selection can be used to improve recall and reproducibility. In this paper, we introduce methods to infer analyst’s intents behind selections in data visualizations, such as scatterplots. We describe intents based on patterns in the data, and identify algorithms that can capture these patterns. Upon an interactive selection, we compare the selected items with the results of a large set of computed patterns, and use various ranking approaches to identify the best pattern for an analyst’s selection. We store annotations and the metadata to reconstruct a selection, such as the type of algorithm and its parameterization, in a provenance graph. We present a prototype system that implements these methods for tabular data and scatterplots. Analysts can select a prediction to auto-complete partial selections and to seamlessly log their intents. We discuss implications of our approach for reproducibility and reuse of analysis workflows. We evaluate our approach in a crowd-sourced study, where we show that auto-completing selection improves accuracy, and that we can accurately capture pattern-based intent.
K. Gadhave, Z.T. Cutler, A. Lex. Reusing Interactive Analysis Workflows, Subtitled OSF Preprints, 2021.
Interactive visual analysis has many advantages, but has the disadvantage that analysis processes and workflows cannot be easily stored and reused, which is in contrast to scripted analysis workflows using a programming language such as Python. In this paper, we introduce methods to semantically capture workflows in interactive visualization systems for different interactions such as selections, filters, categorizing/grouping, labeling, and aggregation. We design these workflows to be robust to updates in the dataset by capturing the semantics of underlying interactions, and, hence, they can be applied to updated datasets. We demonstrate this specification using a prototype that visualizes the data, shows interaction provenance, and allows generating workflows from this provenance. Finally, we introduce a Python library that can consume the workflow and apply it to the datasets, providing a seamless bridge between computational workflows and interactive visualization tools. We demonstrate our techniques using our UI prototype and Jupyter notebooks.
W. W. Good, B. Zenger, J. A. Bergquist, L. C. Rupp, K. K. Gillette, M. A.F. Gsell, G. Plank, R. S. MacLeod. Quantifying the spatiotemporal influence of acute myocardial ischemia on volumetric conduction velocity, In Journal of Electrocardiology, Vol. 66, Churchill Livingstone, pp. 86-94. 2021.
Acute myocardial ischemia occurs when coronary perfusion to the heart is inadequate, which can perturb the highly organized electrical activation of the heart and can result in adverse cardiac events including sudden cardiac death. Ischemia is known to influence the ST and repolarization phases of the ECG, but it also has a marked effect on propagation (QRS); however, studies investigating propagation during ischemia have been limited.
W. W. Good, K. Gillette, B. Zenger, J. Bergquist, L. C. Rupp, J. D. Tate, D. Anderson, M. Gsell, G. Plank, R. S. MacLeod. Estimation and validation of cardiac conduction velocity and wavefront reconstruction using epicardial and volumetric data, In IEEE Transactions on Biomedical Engineering, IEEE, 2021.
Objective: In this study, we have used whole heart simulations parameterized with large animal experiments to validate three techniques (two from the literature and one novel) for estimating epicardial and volumetric conduction velocity (CV). Methods: We used an eikonal-based simulation model to generate ground truth activation sequences with prescribed CVs. Using the sampling density achieved experimentally, we examined the accuracy with which we could reconstruct the wavefront, and then examined the robustness of three CV estimation techniques to reconstruction-related error. We examined triangulation-based, inverse-gradient-based, and streamline-based techniques for estimating CV across the surface and within the volume of the heart. Results: The reconstructed activation times agreed closely with simulated values, with 50-70% of the volumetric nodes and 97-99% of the epicardial nodes within 1 ms of the ground truth. We found close agreement between the CVs calculated using reconstructed versus ground truth activation times, with differences in the median estimated CV on the order of 3-5% volumetrically and 1-2% superficially, regardless of what technique was used. Conclusion: Our results indicate that the wavefront reconstruction and CV estimation techniques are accurate, allowing us to examine changes in propagation induced by experimental interventions such as acute ischemia, ectopic pacing, or drugs. Significance: We implemented, validated, and compared the performance of a number of CV estimation techniques. The CV estimation techniques implemented in this study produce accurate, high-resolution CV fields that can be used to study propagation in the heart experimentally and clinically.
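The inverse-gradient idea named in the abstract above can be sketched on a regular 2D grid, where CV is taken as the reciprocal of the magnitude of the activation-time gradient. This is only an illustration under simplifying assumptions (uniform grid, planar wavefront, no noise); it is not the authors' implementation, and the function name `inverse_gradient_cv` is hypothetical.

```python
import numpy as np

def inverse_gradient_cv(act_times, spacing):
    """Estimate CV as 1 / |grad T| for activation times T on a uniform grid."""
    d_axis0, d_axis1 = np.gradient(act_times, spacing)
    grad_mag = np.hypot(d_axis0, d_axis1)
    with np.errstate(divide="ignore"):
        return 1.0 / grad_mag          # infinite where the gradient vanishes

spacing = 0.5                          # mm between grid nodes (assumed)
true_cv = 0.6                          # mm/ms prescribed wave speed (assumed)
x = np.arange(20) * spacing
# Planar wavefront travelling along the second grid axis: T = x / CV
act_times = np.tile(x / true_cv, (15, 1))
cv = inverse_gradient_cv(act_times, spacing)
```

With measured activation times, noise and locally flat gradients make this estimate unstable, so some smoothing or fitting is usually applied before taking the gradient.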
A. A. Gooch, S. Petruzza, A. Gyulassy, G. Scorzelli, V. Pascucci, L. Rantham, W. Adcock, C. Coopmans. Lessons learned towards the immediate delivery of massive aerial imagery to farmers and crop consultants, In Autonomous Air and Ground Sensing Systems for Agricultural Optimization and Phenotyping VI, Vol. 11747, International Society for Optics and Photonics, pp. 22 -- 34. 2021.
In this paper, we document lessons learned from using ViSOAR Ag Explorer™ in the fields of Arkansas and Utah in the 2018-2020 growing seasons. Our insights come from creating software with fast reading and writing of 2D aerial image mosaics for platform-agnostic collaborative analytics and visualization. We currently enable stitching in the field on a laptop without the need for an internet connection. The full resolution result is then available for instant streaming visualization and analytics via Python scripting. While our software, ViSOAR Ag Explorer™, removes the time and labor software bottleneck in processing large aerial surveys, enabling a cost-effective process to deliver actionable information to farmers, we learned valuable lessons with regard to the acquisition, storage, viewing, analysis, and planning stages of aerial data surveys. Additionally, with the ultimate goal of stitching thousands of images in minutes on board a UAV at the time of data capture, we performed preliminary tests for on-board, real-time stitching and analysis on USU AggieAir sUAS using lightweight computational resources. This system is able to create a 2D map while flying and allow interactive exploration of the full resolution data as soon as the platform has landed or has access to a network. This capability further speeds up the assessment process in the field and opens opportunities for new real-time photogrammetry applications. Flying and imaging over 1500-2000 acres per week provides up-to-date maps that give crop consultants a much broader scope of the field in general as well as providing a better view into planting and field preparation than could be observed from field level. Ultimately, our software and hardware could provide a much better understanding of weed presence and intensity or lack thereof.
W. W. Good, B. Zenger, J. A. Bergquist, L. C. Rupp, K. Gillette, N. Angel, D. Chou, G. Plank, R. S. MacLeod. Combining endocardial mapping and electrocardiographic imaging (ECGI) for improving PVC localization: A feasibility study, In Journal of Electrocardiology, 2021.
Accurate reconstruction of cardiac activation wavefronts is crucial for clinical diagnosis, management, and treatment of cardiac arrhythmias. Furthermore, reconstruction of activation profiles within the intramural myocardium has long been impossible because electrical mapping was only performed on the endocardial surface. Recent advancements in electrocardiographic imaging (ECGI) have made endocardial and epicardial activation mapping possible. We propose a novel approach to use both endocardial and epicardial mapping in a combined approach to reconstruct intramural activation times.
J. K. Holmen, D. Sahasrabudhe, M. Berzins. A Heterogeneous MPI+PPL Task Scheduling Approach for Asynchronous Many-Task Runtime Systems, In Proceedings of the Practice and Experience in Advanced Research Computing 2021 on Sustainability, Success and Impact (PEARC21), ACM, 2021.
Asynchronous many-task runtime systems and MPI+X hybrid parallelism approaches have shown promise for helping manage the increasing complexity of nodes in current and emerging high performance computing (HPC) systems, including those for exascale. The increasing architectural diversity, however, poses challenges for large legacy runtime systems emphasizing broad support for major HPC systems. Performance portability layers (PPL) have shown promise for helping manage this diversity. This paper describes a heterogeneous MPI+PPL task scheduling approach for combining these promising solutions with additional consideration for parallel third party libraries facing similar challenges to help prepare such a runtime for the diverse heterogeneous systems accompanying exascale computing. This approach is demonstrated using a heterogeneous MPI+Kokkos task scheduler and the accompanying portable abstractions implemented in the Uintah Computational Framework, an asynchronous many-task runtime system, with additional consideration for hypre, a parallel third party library. Results are shown for two challenging problems executing workloads representative of typical Uintah applications. These results show performance improvements up to 4.4x when using this scheduler and the accompanying portable abstractions to port a previously MPI-Only problem to Kokkos::OpenMP and Kokkos::CUDA to improve multi-socket, multi-device node use. Good strong-scaling to 1,024 NVIDIA V100 GPUs and 512 IBM POWER9 processors is also shown using MPI+Kokkos::OpenMP+Kokkos::CUDA at scale.