SCIENTIFIC COMPUTING AND IMAGING INSTITUTE
at the University of Utah


SCI Publications

2024


S. Dasetty, T.C. Bidone, A.L. Ferguson. “Data-driven prediction of αIIbβ3 integrin activation pathways using nonlinear manifold learning and deep generative modeling,” In Biophysical Journal, Vol. 123, 2024.

ABSTRACT

The integrin heterodimer is a transmembrane protein critical for driving cellular processes and is a therapeutic target in the treatment of multiple diseases linked to its malfunction. Activation of integrin involves conformational transitions between bent and extended states. Some of the conformations that are intermediate between bent and extended states of the heterodimer have been experimentally characterized, but the full activation pathways remain unresolved both experimentally due to their transient nature and computationally due to the challenges in simulating rare barrier crossing events in these large molecular systems. An understanding of the activation pathways can provide fundamental insight into the biophysical processes associated with the dynamic interconversions between bent and extended states and can unveil new putative therapeutic targets. In this work, we apply nonlinear manifold learning to coarse-grained molecular dynamics simulations of bent, extended, and two intermediate states of αIIbβ3 integrin to learn a low-dimensional embedding of the configurational phase space. We then train deep generative models to learn an inverse mapping between the low-dimensional embedding and high-dimensional molecular space and use these models to interpolate the molecular configurations constituting the activation pathways between the experimentally characterized states. This work furnishes plausible predictions of integrin activation pathways and reports a generic and transferable multiscale technique to predict transition pathways for biomolecular systems.
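The pipeline described above (a nonlinear manifold embedding, an inverse map back to configuration space, and interpolation between known states) can be illustrated with a minimal, hedged sketch. This is not the authors' code: it uses scikit-learn's SpectralEmbedding in place of their manifold learning method, an MLP regressor in place of their deep generative inverse map, and random arrays in place of coarse-grained MD snapshots.

```python
# Minimal illustrative sketch (not the authors' implementation): embed configurations,
# learn an inverse map, and interpolate between two states along the manifold.
import numpy as np
from sklearn.manifold import SpectralEmbedding   # nonlinear manifold learning (Laplacian eigenmaps)
from sklearn.neural_network import MLPRegressor  # stand-in for a deep generative inverse map

rng = np.random.default_rng(0)
configs = rng.normal(size=(500, 300))            # 500 snapshots x 300 coarse-grained coordinates (synthetic)

# Forward map: high-dimensional configurations -> 2D manifold coordinates.
embedding = SpectralEmbedding(n_components=2, n_neighbors=15).fit_transform(configs)

# Inverse map: manifold coordinates -> reconstructed configurations.
decoder = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=2000).fit(embedding, configs)

# Interpolate between a "bent" and an "extended" snapshot in manifold space,
# then decode the latent path back to molecular coordinates.
bent, extended = embedding[0], embedding[-1]
path = np.linspace(bent, extended, num=20)       # 20 waypoints along the latent path
pathway_configs = decoder.predict(path)          # candidate configurations along the pathway
print(pathway_configs.shape)                     # (20, 300)
```

In practice the decoded waypoints would serve as candidate intermediate structures for further simulation or analysis rather than as final predictions.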



J. Dong, E. Kwan, J.A. Bergquist, B.A. Steinberg, D.J. Dosdall, E. DiBella, R.S. MacLeod, T.J. Bunch, R. Ranjan. “Ablation-induced left atrial mechanical dysfunction recovers in weeks after ablation,” In Journal of Interventional Cardiac Electrophysiology, Springer, 2024.

ABSTRACT

Background

The immediate impact of catheter ablation on left atrial mechanical function and the timeline for its recovery in patients undergoing ablation for atrial fibrillation (AF) remain uncertain. The mechanical function response to catheter ablation in patients with different AF types is poorly understood.

Methods

A total of 113 AF patients were included in this retrospective study. Each patient had three magnetic resonance imaging (MRI) studies in sinus rhythm: one pre-ablation, one immediately post-ablation (within 2 days of ablation), and one at post-ablation follow-up (≤ 3 months). We used feature tracking in the MRI cine images to determine peak longitudinal atrial strain (PLAS). We evaluated the change in strain from pre-ablation to immediately post-ablation to post-ablation follow-up, in a short-term study (< 50 days) and a 3-month study (3 months after ablation).

Results

The PLAS exhibited a notable reduction immediately after ablation, compared to both pre-ablation levels and those observed in follow-up studies conducted at short-term (11.1 ± 9.0 days) and 3-month (69.6 ± 39.6 days) intervals. However, there was no difference between follow-up and pre-ablation PLAS. The PLAS returned to 95% of the pre-ablation level within 10 days. Paroxysmal AF patients had significantly higher pre-ablation PLAS than persistent AF patients. Patients with both AF types had significantly lower immediate post-ablation PLAS compared with pre-ablation and post-ablation follow-up PLAS.

Conclusion

The present study suggested a significant drop in PLAS immediately after ablation. Left atrial mechanical function recovered within 10 days after ablation. The drop in PLAS did not show a substantial difference between paroxysmal and persistent AF patients.



J. Dong, E. Kwan, J.A. Bergquist, D.J. Dosdall, E.V. DiBella, R.S. MacLeod, G. Stoddard, K. Konstantidinis, B.A. Steinberg, T.J. Bunch, R. Ranjan. “Left atrial functional changes associated with repeated catheter ablations for atrial fibrillation,” In J Cardiovasc Electrophysiol, 2024.
DOI: 10.1111/jce.16484
PubMed ID: 39474660

ABSTRACT

Introduction: The impact of repeated atrial fibrillation (AF) ablations on left atrial (LA) mechanical function remains uncertain, with limited long-term follow-up data.

Methods: This retrospective study involved 108 AF patients who underwent two catheter ablations with cardiac magnetic resonance imaging (MRI) done before and 3 months after each of the ablations from 2010 to 2021. LA function was assessed using the rate of change in peak longitudinal atrial strain (PLAS). Additionally, a sub-study of 36 patients who underwent an extra MRI before the second ablation provided an additional time segment for evaluating the basis of the change in PLAS.

Results: In sub-study 1 (two ablations, three MRIs), the PLAS percent change rate was similar before and after the first ablation (r11 = -0.9 ± 3.1%/year, p = 0.771). However, the strain change rate from postablation 1 to postablation 2 was significantly worse (r12 = -23.7 ± 4.8%/year, p < 0.001). In sub-study 2 (four MRIs), all three rates were negative, with the reductions from postablation 1 to pre-ablation 2 (r22 = -13.3 ± 2.6%/year, p < 0.001) and from pre-ablation 2 to postablation 2 (r23 = -8.9 ± 3.9%/year, p = 0.028) being significant.

Conclusion: The present study suggests that the more ablations performed, the more significant the decrease in the postablation mechanical function of the LA. The natural progression of AF (strain change from postablation 1 to pre-ablation 2) had a greater negative influence on LA mechanical function than the second ablation itself, suggesting that a second ablation in patients with recurrence after the first ablation is an effective strategy even from the standpoint of LA mechanical function.



S. Dubey, Y. Chong, B. Knudsen, S.Y. Elhabian. “VIMs: Virtual Immunohistochemistry Multiplex staining via Text-to-Stain Diffusion Trained on Uniplex Stains,” Subtitled “arXiv:2407.19113,” 2024.

ABSTRACT

This paper introduces a Virtual Immunohistochemistry Multiplex staining (VIMs) model designed to generate multiple immunohistochemistry (IHC) stains from a single hematoxylin and eosin (H&E) stained tissue section. IHC stains are crucial in pathology practice for resolving complex diagnostic questions and guiding patient treatment decisions. While commercial laboratories offer a wide array of up to 400 different antibody-based IHC stains, small biopsies often lack sufficient tissue for multiple stains while preserving material for subsequent molecular testing. This highlights the need for virtual IHC staining. Notably, VIMs is the first model to address this need, leveraging a large vision-language single-step diffusion model for virtual IHC multiplexing through text prompts for each IHC marker. VIMs is trained on uniplex paired H&E and IHC images, employing an adversarial training module. Testing of VIMs includes both paired and unpaired image sets. To enhance computational efficiency, VIMs utilizes a pre-trained large latent diffusion model fine-tuned with small, trainable weights through the Low-Rank Adapter (LoRA) approach. Experiments on nuclear and cytoplasmic IHC markers demonstrate that VIMs outperforms the base diffusion model and achieves performance comparable to Pix2Pix, a standard generative model for paired image translation. Multiple evaluation methods, including assessments by two pathologists, are used to determine the performance of VIMs. Additionally, experiments with different prompts highlight the impact of text conditioning. This paper represents the first attempt to accelerate histopathology research by demonstrating the generation of multiple IHC stains from a single H&E input using a single model trained solely on uniplex data. This approach relaxes the traditional need for multiplex training sets, significantly broadening the applicability and accessibility of virtual IHC staining techniques.
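The abstract's efficiency argument rests on the Low-Rank Adapter (LoRA) idea: keep the pre-trained diffusion weights frozen and train only small low-rank updates. The generic PyTorch sketch below shows that mechanism for a single linear layer; it is illustrative only and does not reflect the VIMs architecture, its text conditioning, or its adversarial training module.

```python
# Generic LoRA sketch: freeze a pre-trained weight, learn only a low-rank update.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # frozen pre-trained weights
            p.requires_grad = False
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # y = W x + scale * (B A) x, with only A and B trainable
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale

layer = LoRALinear(nn.Linear(768, 768))           # 768 is an assumed, illustrative width
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")       # a small fraction of the frozen 768x768 weight
```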



K. Eckelt, K. Gadhave, A. Lex, M. Streit. “Loops: Leveraging Provenance and Visualization to Support Exploratory Data Analysis in Notebooks,” In Computer Graphics Forum, 2024.

ABSTRACT

Exploratory data science is an iterative process of obtaining, cleaning, profiling, analyzing, and interpreting data. This cyclical way of working creates challenges within the linear structure of computational notebooks, leading to issues with code quality, recall, and reproducibility. To remedy this, we present Loops, a set of visual support techniques for iterative and exploratory data analysis in computational notebooks. Loops leverages provenance information to visualize the impact of changes made within a notebook. In visualizations of the notebook provenance, we trace the evolution of the notebook over time and highlight differences between versions. Loops visualizes the provenance of code, markdown, tables, visualizations, and images and their respective differences. Analysts can explore these differences in detail in a separate view. Loops not only improves the reproducibility of notebooks but also supports analysts in their data science work by showing the effects of changes and facilitating comparison of multiple versions. We demonstrate our approach’s utility and potential impact in two use cases and feedback from notebook users from various backgrounds.



G. Eisenhauer, N. Podhorszki, A. Gainaru, S. Klasky, M. Parashar, M. Wolf, E. Suchtya, E. Fredj, V. Bolea, F. Poschel, K. Steiniger, M. Bussmann, R. Pausch, S. Chandrasekaran. “Streaming Data in HPC Workflows Using ADIOS,” Subtitled “arXiv:2410.00178v1,” 2024.

ABSTRACT

The “IO Wall” problem, in which the gap between computation rate and data access rate grows continuously, poses significant problems for scientific workflows, which have traditionally relied on the filesystem for intermediate storage between workflow stages. One way to avoid this problem in scientific workflows is to stream data directly from producers to consumers, avoiding storage entirely. However, the manner in which this is accomplished is key to both performance and usability. This paper presents the Sustainable Staging Transport (SST), an approach that allows direct streaming between traditional file writers and readers with few application changes. SST is an ADIOS “engine”, accessible via standard ADIOS APIs, and because ADIOS allows engines to be chosen at run time, many existing file-oriented ADIOS workflows can utilize SST for direct application-to-application communication without any source code changes. This paper describes the design of SST and presents performance results from various applications that use SST: feeding model training with simulation data at substantially higher bandwidth than the theoretical limits of Frontier’s file system, strongly coupling separately developed applications for multiphysics multiscale simulation, and performing in situ analysis and visualization of data to complete all data processing shortly after the simulation finishes.



A. Ferrero, E. Ghelichkhan, H. Manoochehri, M.M. Ho, D.J. Albertson, B.J. Brintz, T. Tasdizen, R.T. Whitaker, B. Knudsen. “HistoEM: A Pathologist-Guided and Explainable Workflow Using Histogram Embedding for Gland Classification,” In Modern Pathology, Vol. 37, No. 4, 2024.

ABSTRACT

Pathologists have, over several decades, developed criteria for diagnosing and grading prostate cancer. However, this knowledge has not, so far, been included in the design of convolutional neural networks (CNN) for prostate cancer detection and grading. Further, it is not known whether the features learned by machine-learning algorithms coincide with diagnostic features used by pathologists. We propose a framework that compels algorithms to learn the cellular and subcellular differences between benign and cancerous prostate glands in digital slides from hematoxylin and eosin–stained tissue sections. After accurate gland segmentation and exclusion of the stroma, the central component of the pipeline, named HistoEM, utilizes a histogram embedding of features from the latent space of the CNN encoder. Each gland is represented by 128 feature-wise histograms that provide the input to a second network for benign vs cancer classification of the whole gland. Cancer glands are further processed by a U-Net structured network to separate low-grade from high-grade cancer. Our model demonstrates performance similar to that of other state-of-the-art prostate cancer grading models with gland-level resolution. To understand the features learned by HistoEM, we first rank features based on the distance between benign and cancer histograms and visualize the tissue origins of the 2 most important features. A heatmap of pixel activation by each feature is generated using Grad-CAM and overlaid on nuclear segmentation outlines. We conclude that HistoEM, similar to pathologists, uses nuclear features for the detection of prostate cancer. Altogether, this novel approach can be broadly deployed to visualize computer-learned features in histopathology images.
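As a rough illustration of the histogram-embedding step, the sketch below builds 128 feature-wise histograms per gland from encoder features and feeds the concatenated vector to a simple classifier. It uses synthetic NumPy arrays and a logistic-regression stand-in for the paper's second network; the bin count, value range, and data sizes are assumptions rather than the published configuration.

```python
# Illustrative histogram embedding: represent each gland by per-channel histograms
# of its latent features, then classify the concatenated vector.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_glands, n_pixels, n_features, n_bins = 40, 1000, 128, 16

def histogram_embedding(latent, bins=n_bins, lo=-3.0, hi=3.0):
    # latent: (n_pixels, n_features) encoder features inside one segmented gland
    edges = np.linspace(lo, hi, bins + 1)
    hists = [np.histogram(latent[:, j], bins=edges, density=True)[0] for j in range(latent.shape[1])]
    return np.concatenate(hists)          # 128 feature-wise histograms -> one vector

X = np.stack([histogram_embedding(rng.normal(size=(n_pixels, n_features))) for _ in range(n_glands)])
y = rng.integers(0, 2, size=n_glands)     # benign (0) vs cancer (1); random labels for illustration
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(X.shape, clf.score(X, y))           # (40, 2048) embedding, training accuracy
```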



S. Garg, J. Zhang, R. Pitchumani, M. Parashar, B. Xie, S. Kannan. “CrossPrefetch: Accelerating I/O Prefetching for Modern Storage,” In 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ACM, 2024.

ABSTRACT

We introduce CrossPrefetch, a novel cross-layered I/O prefetching mechanism that operates across the OS and a user-level runtime to achieve optimal performance. Existing OS prefetching mechanisms suffer from rigid interfaces that do not provide applications with information on prefetch effectiveness, suffer from high concurrency bottlenecks, and are inefficient in utilizing available system memory. CrossPrefetch addresses these limitations by dividing responsibilities between the OS and the runtime, minimizing overhead, reducing cache misses and lock contention, and achieving higher I/O performance.

CrossPrefetch tackles the limitations of rigid OS prefetching interfaces by maintaining and exporting cache state and prefetch effectiveness to user-level runtimes. It also addresses scalability and concurrency bottlenecks by distinguishing between regular I/O and prefetch operation paths and introduces fine-grained prefetch indexing for shared files. Finally, CrossPrefetch employs low-interference access-pattern prediction combined with support for adaptive and aggressive techniques to exploit memory capacity and storage bandwidth. Our evaluation of CrossPrefetch, encompassing microbenchmarks, macrobenchmarks, and real-world workloads, demonstrates performance gains of up to 1.22x-3.7x in I/O throughput. We also evaluate CrossPrefetch across different file systems and local and remote storage configurations.
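For intuition about cross-layer prefetching (this is not the CrossPrefetch runtime itself), the sketch below shows a user-level reader that issues POSIX readahead hints to the OS ahead of its own sequential reads. It relies on os.posix_fadvise, which is available on Linux; the chunk and window sizes are arbitrary assumptions.

```python
# User-level readahead sketch: hint the kernel to prefetch a window ahead of the cursor.
import os

CHUNK = 1 << 20            # 1 MiB application reads
WINDOW = 8 * CHUNK         # readahead window maintained ahead of the read cursor

def read_with_prefetch(path):
    """Sequentially read a file while hinting the OS to prefetch the next window."""
    fd = os.open(path, os.O_RDONLY)
    total = 0
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            # Ask the OS to start fetching the upcoming window before we need it.
            os.posix_fadvise(fd, offset, min(WINDOW, size - offset), os.POSIX_FADV_WILLNEED)
            data = os.pread(fd, CHUNK, offset)
            if not data:
                break
            total += len(data)   # stand-in for real processing of `data`
            offset += len(data)
    finally:
        os.close(fd)
    return total
```

A real runtime would additionally track which ranges are already cached and adapt the window to the observed access pattern, which is the part CrossPrefetch coordinates across layers.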



E. Gasparovic, E. Purvine, R. Sazdanovic, B. Wang, Y. Wang, L. Ziegelmeier. “A survey of simplicial, relative, and chain complex homology theories for hypergraphs,” 2024.

ABSTRACT

Hypergraphs have seen widespread applications in network and data science communities in recent years. We present a survey of recent work to define topological objects from hypergraphs—specifically simplicial, relative, and chain complexes—that can be used to build homology theories for hypergraphs. We define and describe nine different homology theories and their relevant topological objects. We discuss some interesting properties of each method to show how the hypergraph structures are preserved or destroyed by modifying a hypergraph. Finally, we provide a series of illustrative examples by computing many of these homology theories for small hypergraphs to show the variability of the methods and build intuition.
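As a small worked example of one construction in this family, the sketch below takes the associated simplicial complex of a hypergraph (the downward closure of its hyperedges) and computes ordinary simplicial Betti numbers over the rationals from boundary-matrix ranks. The hypergraph is a toy example chosen for illustration, and this covers only one of the nine theories surveyed.

```python
# Simplicial homology of the downward closure of a small hypergraph (illustrative only).
from itertools import combinations
import numpy as np

hyperedges = [(0, 1, 2), (2, 3), (3, 4), (4, 0)]   # one 3-vertex hyperedge plus a chain of 2-vertex hyperedges

# Downward closure: every nonempty subset of a hyperedge is a simplex.
simplices = set()
for e in hyperedges:
    for k in range(1, len(e) + 1):
        simplices.update(combinations(sorted(e), k))
by_dim = {}
for s in simplices:
    by_dim.setdefault(len(s) - 1, []).append(s)
for d in by_dim:
    by_dim[d].sort()

def boundary_matrix(d):
    """Matrix of the boundary map from d-simplices to (d-1)-simplices."""
    rows, cols = by_dim.get(d - 1, []), by_dim.get(d, [])
    index = {s: i for i, s in enumerate(rows)}
    B = np.zeros((len(rows), len(cols)))
    for j, s in enumerate(cols):
        for i, v in enumerate(s):
            face = s[:i] + s[i + 1:]
            B[index[face], j] = (-1) ** i
    return B

max_dim = max(by_dim)
for d in range(max_dim + 1):
    n_d = len(by_dim.get(d, []))
    rank_d = np.linalg.matrix_rank(boundary_matrix(d)) if d > 0 else 0
    rank_d1 = np.linalg.matrix_rank(boundary_matrix(d + 1)) if d + 1 in by_dim else 0
    betti = n_d - rank_d - rank_d1    # dim ker of the d-th boundary minus rank of the (d+1)-th
    print(f"Betti_{d} = {betti}")      # expect Betti_0 = 1, Betti_1 = 1, Betti_2 = 0
```

The unfilled 4-cycle through vertices 0, 2, 3, 4 accounts for Betti_1 = 1, while the hyperedge (0, 1, 2) is filled in by the closure and contributes no 2-dimensional hole.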



S. Ghorbany, M. Hu, S. Yao, C. Wang, Q.C. Nguyen, X. Yue, M. Alirezaei, T. Tasdizen, M. Sisk. “Examining the role of passive design indicators in energy burden reduction: Insights from a machine learning and deep learning approach,” In Building and Environment, Elsevier, 2024.

ABSTRACT

Passive design characteristics (PDC) play a pivotal role in reducing the energy burden on households without imposing additional financial constraints on project stakeholders. However, the scarcity of PDC data has posed a challenge in previous studies when assessing their energy-saving impact. To tackle this issue, this research introduces an innovative approach that combines deep learning-powered computer vision with machine learning techniques to examine the relationship between PDC and energy burden in residential buildings. In this study, we employ a convolutional neural network computer vision model to identify and measure key indicators, including window-to-wall ratio (WWR), external shading, and operable window types, using Google Street View images within the Chicago metropolitan area as our case study. Subsequently, we utilize the derived passive design features in conjunction with demographic characteristics to train and compare various machine learning methods. These methods encompass Decision Tree Regression, Random Forest Regression, and Support Vector Regression, culminating in the development of a comprehensive model for energy burden prediction. Our framework achieves a 74.2 % accuracy in forecasting the average energy burden. These results yield invaluable insights for policymakers and urban planners, paving the way toward the realization of smart and sustainable cities.
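The regressor-comparison stage described above can be sketched as follows; this is a hedged illustration using synthetic placeholder features (window-to-wall ratio, shading, and so on) rather than the Chicago Street View data, and the hyperparameters are assumptions rather than the paper's settings.

```python
# Compare several regressors for energy-burden prediction on synthetic placeholder data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Columns stand in for passive design + demographic features:
# [WWR, external shading, operable window type, income, household size, building age]
X = rng.random((1000, 6))
y = 0.3 * X[:, 0] - 0.2 * X[:, 1] + 0.1 * rng.standard_normal(1000)  # synthetic energy burden

models = {
    "decision_tree": DecisionTreeRegressor(max_depth=6),
    "random_forest": RandomForestRegressor(n_estimators=200),
    "svr": SVR(kernel="rbf", C=1.0),
}
for name, model in models.items():
    score = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean cross-validated R^2 = {score:.3f}")
```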



M. Han, J. Li, S. Sane, S. Gupta, B. Wang, S. Petruzza, C.R. Johnson. “Interactive Visualization of Time-Varying Flow Fields Using Particle Tracing Neural Networks,” Subtitled “arXiv preprint arXiv:2312.14973,” 2024.

ABSTRACT

Lagrangian representations of flow fields have gained prominence for enabling fast, accurate analysis and exploration of time-varying flow behaviors. In this paper, we present a comprehensive evaluation to establish a robust and efficient framework for Lagrangian-based particle tracing using deep neural networks (DNNs). Han et al. (2021) first proposed a DNN-based approach to learn Lagrangian representations and demonstrated accurate particle tracing for an analytic 2D flow field. In this paper, we extend and build upon this prior work in significant ways. First, we evaluate the performance of DNN models to accurately trace particles in various settings, including 2D and 3D time-varying flow fields, flow fields from multiple applications, flow fields with varying complexity, as well as structured and unstructured input data. Second, we conduct an empirical study to inform best practices with respect to particle tracing model architectures, activation functions, and training data structures. Third, we conduct a comparative evaluation of prior techniques that employ flow maps as input for exploratory flow visualization. Specifically, we compare our extended model against its predecessor by Han et al. (2021), as well as the conventional approach that uses triangulation and Barycentric coordinate interpolation. Finally, we consider the integration and adaptation of our particle tracing model with different viewers. We provide an interactive web-based visualization interface by leveraging the efficiencies of our framework, and perform high-fidelity interactive visualization by integrating it with an OSPRay-based viewer. Overall, our experiments demonstrate that using a trained DNN model to predict new particle trajectories requires a low memory footprint and results in rapid inference. Following best practices for large 3D datasets, our deep learning approach using GPUs for inference is shown to require approximately 46 times less memory while being more than 400 times faster than the conventional methods.
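A conceptual, self-contained sketch of the underlying idea, learning a flow map with a neural network, is given below. It is not the paper's model: a small MLP is trained to map seed positions to end positions for one fixed time interval of a toy 2D unsteady flow, with explicit Euler tracing providing the training data.

```python
# Learn a flow map: seed position -> end position, for a fixed time interval of a toy flow.
import torch
import torch.nn as nn

def velocity(p, t):                      # toy unsteady 2D velocity field
    x, y = p[:, 0], p[:, 1]
    return torch.stack([-y + 0.3 * torch.sin(t), x], dim=1)

def trace(p0, t0, t1, steps=50):         # reference tracer: explicit Euler integration
    p, dt = p0.clone(), (t1 - t0) / steps
    for i in range(steps):
        p = p + dt * velocity(p, t0 + i * dt)
    return p

seeds = torch.rand(4096, 2) * 2 - 1
targets = trace(seeds, torch.tensor(0.0), torch.tensor(0.5))

model = nn.Sequential(nn.Linear(2, 128), nn.ReLU(), nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(200):                 # fit the flow map for this fixed time interval
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(seeds), targets)
    loss.backward()
    opt.step()
print(f"final training MSE: {loss.item():.2e}")
```

Once trained, evaluating the network for new seeds replaces step-by-step integration, which is the source of the memory and speed advantages the abstract reports.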



M. Han, T. Athawale, J. Li, C.R. Johnson. “Accelerated Depth Computation for Surface Boxplots with Deep Learning,” In IEEE Workshop on Uncertainty Visualization: Applications, Techniques, Software, and Decision Frameworks, IEEE, pp. 38--42. 2024.
DOI: 10.1109/UncertaintyVisualization63963.2024.00009

ABSTRACT

Functional depth is a well-known technique used to derive descriptive statistics (e.g., median, quartiles, and outliers) for 1D data. Surface boxplots extend this concept to ensembles of images, helping scientists and users identify representative and outlier images. However, the computational time for surface boxplots increases cubically with the number of ensemble members, making it impractical for integration into visualization tools. In this paper, we propose a deep-learning solution for efficient depth prediction and computation of surface boxplots for time-varying ensemble data. Our deep learning framework accurately predicts member depths in a surface boxplot, achieving average speedups of 6X on a CPU and 15X on a GPU for the 2D Red Sea dataset with 50 ensemble members compared to the traditional depth computation algorithm. Our approach achieves at least a 99% level of rank preservation, with order flipping occurring only for pairs with extremely similar depth values that are not statistically different. This local flipping does not significantly impact the overall depth order of the ensemble members.
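For context, the sketch below is a naive reference implementation of (simple) band depth for a 1D ensemble; it makes the cubic cost visible, since every member is checked against all O(n^2) member pairs. Surface boxplots apply the same notion to image ensembles, and it is this kind of computation that the paper's deep-learning predictor replaces. The data are synthetic.

```python
# Naive band depth for a 1D ensemble: O(n^2) bands, each checked against all n members.
import numpy as np
from itertools import combinations

def band_depth(ensemble):
    # ensemble: (n_members, n_samples) array of curves
    n = len(ensemble)
    depth = np.zeros(n)
    for i, j in combinations(range(n), 2):          # every pair defines a band ...
        lo = np.minimum(ensemble[i], ensemble[j])
        hi = np.maximum(ensemble[i], ensemble[j])
        inside = np.all((ensemble >= lo) & (ensemble <= hi), axis=1)
        depth += inside                              # ... count members lying fully inside it
    return depth / (n * (n - 1) / 2)

rng = np.random.default_rng(0)
t = np.linspace(0, 6, 200)
curves = np.sin(t) + 0.3 * rng.standard_normal((50, 1))   # 50 vertically shifted sine curves
d = band_depth(curves)
print("most representative member:", d.argmax(), " most outlying member:", d.argmin())
```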



C. Han, K.E. Isaacs. “A Deixis-Centered Approach for Documenting Remote Synchronous Communication around Data Visualizations,” Subtitled “arXiv:2408.04041,” 2024.

ABSTRACT

Referential gestures, or as termed in linguistics, deixis, are an essential part of communication around data visualizations. Despite their importance, such gestures are often overlooked when documenting data analysis meetings. Transcripts, for instance, fail to capture gestures, and video recordings may not adequately capture or emphasize them. We introduce a novel method for documenting collaborative data meetings that treats deixis as a first-class citizen. Our proposed framework captures cursor-based gestural data along with audio and converts them into interactive documents. The framework leverages a large language model to identify word correspondences with gestures. These identified references are used to create context-based annotations in the resulting interactive document. We assess the effectiveness of our proposed method through a user study, finding that participants preferred our automated interactive documentation over recordings, transcripts, and manual note-taking. Furthermore, we derive a preliminary taxonomy of cursor-based deictic gestures from participant actions during the study. This taxonomy offers further opportunities for better utilizing cursor-based deixis in collaborative data analysis scenarios.



C. Han, J. Lieffers, C. Morrison, K.E. Isaacs. “An Overview+Detail Layout for Visualizing Compound Graphs,” Subtitled “arXiv:2408.04045,” 2024.

ABSTRACT

Compound graphs are networks in which vertices can be grouped into larger subsets, with these subsets capable of further grouping, resulting in a nesting that can be many levels deep. In several applications, including biological workflows, chemical equations, and computational data flow analysis, these graphs often exhibit a tree-like nesting structure, where sibling clusters are disjoint. Common compound graph layouts prioritize the lowest level of the grouping, down to the individual ungrouped vertices, which can make the higher level grouped structures more difficult to discern, especially in deeply nested networks. Leveraging the additional structure of the tree-like nesting, we contribute an overview+detail layout for this class of compound graphs that preserves the saliency of the higher level network structure when groups are expanded to show internal nested structure. Our layout draws inner structures adjacent to their parents, using a modified tree layout to place substructures. We describe our algorithm and then present case studies demonstrating the layout’s utility to a domain expert working on data flow analysis. Finally, we discuss network parameters and analysis situations in which our layout is well suited.



G. Hari, N. Joshi, Z. Wang, Q. Gong, D. Pugmire, K. Moreland, C.R. Johnson, S. Klasky, N. Podhorszki, T. Athawale. “FunM2C: A Filter for Uncertainty Visualization of Multivariate Data on Multi-Core Devices,” In IEEE Workshop on Uncertainty Visualization: Applications, Techniques, Software, and Decision Frameworks, IEEE, pp. 43--47. 2024.
DOI: 10.1109/UncertaintyVisualization63963.2024.00010

ABSTRACT

Uncertainty visualization is an emerging research topic in data visualization because neglecting uncertainty in visualization can lead to inaccurate assessments. In this paper, we study the propagation of multivariate data uncertainty in visualization. Although there have been a few advancements in probabilistic uncertainty visualization of multivariate data, three critical challenges remain to be addressed. First, the state-of-the-art probabilistic uncertainty visualization framework is limited to bivariate data (two variables). Second, existing uncertainty visualization algorithms use computationally intensive techniques and lack support for cross-platform portability. Third, as a consequence of the computational expense, integration into production visualization tools is impractical. In this work, we address all three issues and make a threefold contribution. First, we take a step to generalize the state-of-the-art probabilistic framework for bivariate data to multivariate data with an arbitrary number of variables. Second, through utilization of VTK-m’s shared-memory parallelism and cross-platform compatibility features, we demonstrate acceleration of multivariate uncertainty visualization on different many-core architectures, including multi-core CPUs (via OpenMP) and AMD GPUs. Third, we demonstrate the integration of our algorithms with the ParaView software. We demonstrate the utility of our algorithms through experiments on multivariate simulation data with three and four variables.
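The following is a generic, hedged illustration of probabilistic multivariate uncertainty propagation, not the FunM2C filter or its VTK-m/ParaView implementation: it Monte Carlo-estimates, per grid cell, the probability that three uncertain variables (assumed independent and Gaussian, which is itself an assumption) jointly fall inside a user-specified attribute range.

```python
# Per-cell probability that an uncertain 3-variable field falls inside an attribute range.
import numpy as np

rng = np.random.default_rng(0)
nx, ny, n_vars, n_samples = 64, 64, 3, 256
mean = rng.random((nx, ny, n_vars))                  # per-cell mean of each variable
std = 0.05 + 0.05 * rng.random((nx, ny, n_vars))     # per-cell standard deviation
lo = np.array([0.4, 0.4, 0.4])                       # lower bounds of the multivariate range
hi = np.array([0.7, 0.7, 0.7])                       # upper bounds

samples = mean[..., None, :] + std[..., None, :] * rng.standard_normal((nx, ny, n_samples, n_vars))
inside = np.all((samples >= lo) & (samples <= hi), axis=-1)   # all three variables in range?
probability = inside.mean(axis=-1)                   # (nx, ny) probability map to be visualized
print(probability.shape, probability.max())
```

A production filter would evaluate such probabilities with closed-form or data-parallel kernels rather than Monte Carlo sampling, which is where shared-memory parallelism and portability become essential.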



M.M. Ho, S. Dubey, Y. Chong, B. Knudsen, T. Tasdizen. “F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation,” Subtitled “arXiv:2404.12650v1,” 2024.

ABSTRACT

The Frozen Section (FS) technique is a rapid and efficient method, taking only 15-30 minutes to prepare slides for pathologists' evaluation during surgery, enabling immediate decisions on further surgical interventions. However, the FS process often introduces artifacts and distortions like folds and ice-crystal effects. In contrast, these artifacts and distortions are absent in the higher-quality formalin-fixed paraffin-embedded (FFPE) slides, which require 2-3 days to prepare. While Generative Adversarial Network (GAN)-based methods have been used to translate FS to FFPE images (F2F), they may leave morphological inaccuracies with remaining FS artifacts or introduce new artifacts, reducing the quality of these translations for clinical assessments. In this study, we benchmark recent generative models, focusing on GANs and Latent Diffusion Models (LDMs), to overcome these limitations. We introduce a novel approach that combines LDMs with Histopathology Pre-Trained Embeddings to enhance the restoration of FS images. Our framework leverages LDMs conditioned by both text and pre-trained embeddings to learn meaningful features of FS and FFPE histopathology images. Through diffusion and denoising techniques, our approach not only preserves essential diagnostic attributes like color staining and tissue morphology but also proposes an embedding translation mechanism to better predict the targeted FFPE representation of input FS images. As a result, this work achieves a significant improvement in classification performance, with the Area Under the Curve rising from 81.99% to 94.64%, accompanied by an advantageous CaseFD. This work establishes a new benchmark for FS to FFPE image translation quality, promising enhanced reliability and accuracy in histopathology FS image analysis.



M.M. Ho, E. Ghelichkhan, Y. Chong, Y. Zhou, B.S. Knudsen, T. Tasdizen. “DISC: Latent Diffusion Models with Self-Distillation from Separated Conditions for Prostate Cancer Grading,” Subtitled “arXiv:2404.13097,” 2024.

ABSTRACT

Latent Diffusion Models (LDMs) can generate high-fidelity images from noise, offering a promising approach for augmenting histopathology images for training cancer grading models. While previous works successfully generated high-fidelity histopathology images using LDMs, the generation of image tiles to improve prostate cancer grading has not yet been explored. Additionally, LDMs face challenges in accurately generating admixtures of multiple cancer grades in a tile when conditioned by a tile mask. In this study, we train specific LDMs to generate synthetic tiles that contain multiple Gleason Grades (GGs) by leveraging pixel-wise annotations in input tiles. We introduce a novel framework named Self-Distillation from Separated Conditions (DISC) that generates GG patterns guided by GG masks. Finally, we deploy a training framework for pixel-level and slide-level prostate cancer grading, where synthetic tiles are effectively utilized to improve the cancer grading performance of existing models. As a result, this work surpasses previous works in two domains: 1) our LDMs enhanced with DISC produce more accurate tiles in terms of GG patterns, and 2) our training scheme, incorporating synthetic data, significantly improves the generalization of the baseline model for prostate cancer grading, particularly in challenging cases of rare GG5, demonstrating the potential of generative models to enhance cancer grading when data is limited.



T. Hoefler, M. Copik, P. Beckman, A. Jones, I. Foster, M. Parashar, D. Reed, M. Troyer, T. Schulthess, D. Ernst, J. Dongarra. “XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing,” Subtitled “arXiv:2401.04552v1,” 2024.

ABSTRACT

HPC and Cloud have evolved independently, specializing their innovations into performance or productivity. Acceleration as a Service (XaaS) is a recipe to empower both fields with a shared execution platform that provides transparent access to computing resources, regardless of the underlying cloud or HPC service provider. Bridging HPC and cloud advancements, XaaS presents a unified architecture built on performance-portable containers. Our converged model concentrates on low-overhead, high-performance communication and computing, targeting resource-intensive workloads from climate simulations to machine learning. XaaS lifts the restricted allocation model of Function-as-a-Service (FaaS), allowing users to benefit from the flexibility and efficient resource utilization of serverless while supporting long-running and performance-sensitive workloads from HPC.



J.K. Holmen, M. García, A. Bagusetty, V. Madananth, A. Sanderson, M. Berzins. “Making Uintah Performance Portable for Department of Energy Exascale Testbeds,” In Euro-Par 2023: Parallel Processing, pp. 1--12. 2024.

ABSTRACT

To help ease ports to forthcoming Department of Energy (DOE) exascale systems, testbeds have been made available to select users. These testbeds are helpful for preparing codes to run on the same hardware and similar software as in their respective exascale systems. This paper describes how the Uintah Computational Framework, an open-source asynchronous many-task (AMT) runtime system, has been modified to be performance portable across the DOE Crusher, DOE Polaris, and DOE Sunspot testbeds in preparation for portable simulations across the exascale DOE Frontier and DOE Aurora systems. The Crusher, Polaris, and Sunspot testbeds feature the AMD MI250X, NVIDIA A100, and Intel PVC GPUs, respectively. This performance portability has been made possible by extending Uintah’s intermediate portability layer [18] to additionally support the Kokkos::HIP, Kokkos::OpenMPTarget, and Kokkos::SYCL back-ends. This paper also describes notable updates to Uintah’s support for Kokkos, which were required to make this extension possible. Results are shown for a challenging radiative heat transfer calculation, central to the University of Utah’s predictive boiler simulations. These results demonstrate single-source portability across AMD-, NVIDIA-, and Intel-based GPUs using various Kokkos back-ends.



Q. Huang, J. Le, S. Joshi, J. Mendes, G. Adluru, E. DiBella. “Arterial Input Function (AIF) Correction Using AIF Plus Tissue Inputs with a Bi-LSTM Network,” In Tomography, Vol. 10, pp. 660-673. 2024.

ABSTRACT

Background: The arterial input function (AIF) is vital for myocardial blood flow quantification in cardiac MRI to indicate the input time–concentration curve of a contrast agent. Inaccurate AIFs can significantly affect perfusion quantification. Purpose: When only saturated and biased AIFs are measured, this work investigates multiple ways of leveraging tissue curve information, including using AIF + tissue curves as inputs and optimizing the loss function for deep neural network training. Methods: Simulated data were generated using a 12-parameter mathematical model for the AIF. Tissue curves were created from true AIFs combined with compartment-model parameters from a random distribution. Using Bloch simulations, a dictionary was constructed for a saturation-recovery 3D radial stack-of-stars sequence, accounting for deviations such as flip angle, T2* effects, and residual longitudinal magnetization after saturation. A preliminary simulation study established the optimal tissue curve number using a bidirectional long short-term memory (Bi-LSTM) network with just the AIF loss. Further optimization of the loss function involved comparing just the AIF loss, the AIF loss with a compartment-model-based parameter loss, and the AIF loss with a compartment-model tissue loss. The optimized network was examined with both simulation and hybrid data, which included in vivo 3D stack-of-stars datasets for testing. The AIF peak value accuracy and Ktrans results were assessed. Results: Increasing the number of tissue curves can be beneficial when added tissue curves can provide extra information. Using just the AIF loss outperforms the other two proposed losses, including adding either a compartment-model-based tissue loss or a compartment-model parameter loss to the AIF loss. With the simulated data, the Bi-LSTM network reduced the AIF peak error from −23.6 ± 24.4% of the AIF using the dictionary method to 0.2 ± 7.2% (AIF input only) and 0.3 ± 2.5% (AIF + ten tissue curve inputs) of the network AIF. The corresponding Ktrans error was reduced from −13.5 ± 8.8% to −0.6 ± 6.6% and 0.3 ± 2.1%. With the hybrid data (simulated data for training; in vivo data for testing), the AIF peak error was 15.0 ± 5.3% and the corresponding Ktrans error was 20.7 ± 11.6% for the AIF using the dictionary method. The hybrid data revealed that using the AIF + tissue inputs reduced errors, with peak error (1.3 ± 11.1%) and Ktrans error (−2.4 ± 6.7%). Conclusions: Integrating tissue curves with AIF curves into network inputs improves the precision of AI-driven AIF corrections. This result was seen both with simulated data and with applying the network trained only on simulated data to a limited in vivo test dataset.
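A schematic sketch of the network shape described above is given below, assuming the "AIF-only" loss that the study found to work best. It is not the authors' model or training setup: the input sizes, hidden sizes, and random placeholder data are all assumptions made for illustration.

```python
# Bidirectional LSTM that maps (saturated AIF + tissue curves) per time point to a corrected AIF.
import torch
import torch.nn as nn

n_tissue, seq_len = 10, 60                       # assumed: 10 tissue curves, 60 time points

class AIFCorrector(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1 + n_tissue, hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)      # per-time-point corrected AIF value

    def forward(self, x):                         # x: (batch, seq_len, 1 + n_tissue)
        out, _ = self.lstm(x)
        return self.head(out).squeeze(-1)         # (batch, seq_len)

model = AIFCorrector()
measured = torch.rand(8, seq_len, 1 + n_tissue)   # saturated AIF + tissue curves (placeholder)
true_aif = torch.rand(8, seq_len)                 # reference AIF (placeholder)
loss = nn.functional.mse_loss(model(measured), true_aif)   # "AIF-only" loss from the abstract
loss.backward()
print(f"placeholder loss: {loss.item():.3f}")
```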