SCI Publications
2023
S. Saha, W. Gazi, R. Mohammed, T. Rapstine, H. Powers, R. Whitaker.
Multi-task Training as Regularization Strategy for Seismic Image Segmentation, In IEEE Geoscience and Remote Sensing Letters, Vol. 20, IEEE, pp. 1--5. 2023.
DOI: 10.1109/LGRS.2023.3328837
This letter proposes multitask learning as a regularization method for segmentation tasks in seismic images. We examine application-specific auxiliary tasks, such as the estimation/detection of horizons, dip angle, and amplitude that geophysicists consider relevant for identification of channels (a geological feature), which is currently done through painstaking outlining by qualified experts. We show that multitask training helps in better generalization on test datasets with very similar and different structure/statistics. In such settings, we also show that multitask learning performs better on unseen datasets relative to the baseline.
S. Saha, S. Joshi, R. Whitaker.
Matching aggregate posteriors in the variational autoencoder, Subtitled arXiv preprint arXiv:2311.07693, 2023.
The variational autoencoder (VAE) [1] is a well-studied, deep, latent-variable model (DLVM) that efficiently optimizes the variational lower bound of the log marginal data likelihood and has a strong theoretical foundation. However, the VAE’s known failure to match the aggregate posterior often results in pockets/holes in the latent distribution (i.e., a failure to match the prior) and/or posterior collapse, which is associated with a loss of information in the latent space. This paper addresses these shortcomings in VAEs by reformulating the objective function associated with VAEs in order to match the aggregate/marginal posterior distribution to the prior. We use kernel density estimate (KDE) to model the aggregate posterior in high dimensions. The proposed method is named the aggregate variational autoencoder (AVAE) and is built on the theoretical framework of the VAE. Empirical evaluation of the proposed method on multiple benchmark data sets demonstrates the effectiveness of the AVAE relative to state-of-the-art (SOTA) methods.
M. Shao, T. Tasdizen, S. Joshi.
Analyzing the Domain Shift Immunity of Deep Homography Estimation, Subtitled arXiv:2304.09976v1, 2023.
Homography estimation is a basic image-alignment method in many applications. Recently, with the development of convolutional neural networks (CNNs), some learning based approaches have shown great success in this task. However, the performance across different domains has never been researched. Unlike other common tasks (e.g., classification, detection, segmentation), CNN based homography estimation models show a domain shift immunity, which means a model can be trained on one dataset and tested on another without any transfer learning. To explain this unusual performance, we need to determine how CNNs estimate homography. In this study, we first show the domain shift immunity of different deep homography estimation models. We then use a shallow network with a specially designed dataset to analyze the features used for estimation. The results show that networks use low-level texture information to estimate homography. We also design some experiments to compare the performance between different texture densities and image features distorted on some common datasets to demonstrate our findings. Based on these findings, we provide an explanation of the domain shift immunity of deep homography estimation.
N. Shingde, M. Berzins, T. Blattner, W. Keyrouz, A. Bardakoff.
Extending Hedgehog’s dataflow graphs to multi-node GPU architectures, In Workshop on Asynchronous Many-Task Systems and Applications (WAMTA23), 2023.
Asynchronous task-based systems offer the possibility of making it easier to take advantage of scalable heterogeneous architectures.
This paper extends the National Institute of Standards and Technology’s Hedgehog dataflow graph models, which target a single high-end
compute node, to run on a cluster by borrowing aspects of Uintah’s cluster-scale task graphs and applying them to a sample implementation
of matrix multiplication. These results are compared to implementations using the leading libraries, SLATE and DPLASMA, for illustrative purposes only. The motivation behind this work is to demonstrate that using general purpose high-level abstractions, such as Hedgehog’s dataflow graphs, does not negatively impact performance.
K. Shukla, V. Oommen, A. Peyvan, M. Penwarden, L. Bravo, A. Ghoshal, R.M. Kirby, G. Karniadakis.
Deep neural operators can serve as accurate surrogates for shape optimization: A case study for airfoils, Subtitled arXiv:2302.00807v1, 2023.
Deep neural operators, such as DeepONets, have changed the paradigm in high-dimensional nonlinear regression from function regression to (differential) operator regression, paving the way for significant changes in computational engineering applications. Here, we investigate the use of DeepONets to infer flow fields around unseen airfoils with the aim of shape optimization, an important design problem in aerodynamics that typically taxes computational resources heavily. We present results which display little to no degradation in prediction accuracy, while reducing the online optimization cost by orders of magnitude. We consider NACA airfoils as a test case for our proposed approach, as their shape can be easily defined by the four-digit parametrization. We successfully optimize the constrained NACA four-digit problem with respect to maximizing the lift-to-drag ratio and validate all results by comparing them to a high-order CFD solver. We find that DeepONets have low generalization error, making them ideal for generating solutions of unseen shapes. Specifically, pressure, density, and velocity fields are accurately inferred at a fraction of a second, hence enabling the use of general objective functions beyond the maximization of the lift-to-drag ratio considered in the current work.
K. Shukla, V. Oommen, A. Peyvan, M. Penwarden, N. Plewacki, L. Bravo, A. Ghoshal, R.M. Kirby, G. Karniadakis.
Deep neural operators as accurate surrogates for shape optimization, In Engineering Applications of Artificial Intelligence, Vol. 129, pp. 107615. 2023.
ISSN: 0952-1976
Deep neural operators, such as DeepONet, have changed the paradigm in high-dimensional nonlinear regression, paving the way for significant generalization and speed-up in computational engineering applications. Here, we investigate the use of DeepONet to infer flow fields around unseen airfoils with the aim of shape constrained optimization, an important design problem in aerodynamics that typically taxes computational resources heavily. We present results that display little to no degradation in prediction accuracy while reducing the online optimization cost by orders of magnitude. We consider NACA airfoils as a test case for our proposed approach, as the four-digit parameterization can easily define their shape. We successfully optimize the constrained NACA four-digit problem with respect to maximizing the lift-to-drag ratio and validate all results by comparing them to a high-order CFD solver. We find that DeepONets have a low generalization error, making them ideal for generating solutions of unseen shapes. Specifically, pressure, density, and velocity fields are accurately inferred at a fraction of a second, hence enabling the use of general objective functions beyond the maximization of the lift-to-drag ratio considered in the current work. Finally, we validate the ability of DeepONet to handle a complex 3D waverider geometry at hypersonic flight by inferring shear stress and heat flux distributions on its surface at unseen angles of attack. The main contribution of this paper is a modular integrated design framework that uses an over-parametrized neural operator as a surrogate model with good generalizability coupled seamlessly with multiple optimization solvers in a plug-and-play mode.
C. Sicari, A. Catalfamo, L. Carnevale, A. Galletta, D. Balouek-Thomert, M. Parashar, M. Villari.
TEMA: Event Driven Serverless Workflows Platform for Natural Disaster Management, In 2023 IEEE Symposium on Computers and Communications (ISCC), pp. 1-6. 2023.
DOI: 10.1109/ISCC58397.2023.10217920
TEMA project is a Horizon Europe funded project that aims at addressing Natural Disaster Management by the use of sophisticated Cloud-Edge Continuum infrastructures by means of data analysis algorithms wrapped in Serverless functions deployed on a distributed infrastructure according to a Federated Learning scheduler that constantly monitors the infrastructure in search of the best way to satisfy required QoS constraints. In this paper, we discuss the advantages of Serverless workflow and how they can be used and monitored to natively trigger complex algorithm pipelines in the continuum, dynamically placing and relocating them taking into account incoming IoT data, QoS constraints, and the current status of the continuum infrastructure. Therefore we presented the Urgent Function Enabler (UFE) platform, a fully distributed architecture able to define, spread, and manage FaaS functions, using local IOT data managed using the Fiware ecosystem and a computing infrastructure composed of mobile and stable nodes.
C. Sicari, D. Balouek, M. Villari, M. Parashar.
Event-Driven FaaS Workflows for Enabling IoT Data Processing at the Cloud Edge Continuum, In CC 2023 - International Conference on Utility and Cloud Computing, 2023.
Continuum Computing encompasses the integration of diverse infrastructures, including cloud, edge, and fog, to facilitate seamless
migration of applications based on their specific needs, ensuring optimal satisfaction of their requirements. The primary obstacles in this particular context mostly pertain to the incapacity to promptly respond to changes in the environment or the quality of service (QoS) constraints of the application, as well as the incapability to maintain an application in a stateless manner, hence impeding its relocation without the risk of data loss. The objective of this research is to tackle the aforementioned issues through the introduction of a framework based on Function-as-a-Service (FaaS) and event-driven architecture. This framework enables the decomposition, localization, and relocation of applications inside a Continuum infrastructure, facilitated by a rule engine that is both system and data-aware
K.M.A. Sultan, B. Orkild, A. Morris, E. Kholmovski, E. Bieging, E. Kwan, R. Ranjan, E. DiBella, s. Elhabian.
Two-Stage Deep Learning Framework for Quality Assessment of Left Atrial Late Gadolinium Enhanced MRI Images, Subtitled arXiv:2310.08805v1, 2023.
Accurate assessment of left atrial fibrosis in patients with atrial fibrillation relies on high-quality 3D late gadolinium enhancement (LGE) MRI images. However, obtaining such images is challenging due to patient motion, changing breathing patterns, or sub-optimal choice of pulse sequence parameters. Automated assessment of LGE-MRI image diagnostic quality is clinically significant as it would enhance diagnostic accuracy, improve efficiency, ensure standardization, and contributes to better patient outcomes by providing reliable and high-quality LGE-MRI scans for fibrosis quantification and treatment planning. To address this, we propose a two-stage deep-learning approach for automated LGE-MRI image diagnostic quality assessment. The method includes a left atrium detector to focus on relevant regions and a deep network to evaluate diagnostic quality. We explore two training strategies, multi-task learning, and pretraining using contrastive learning, to overcome limited annotated data in medical imaging. Contrastive Learning result shows about 4%, and 9% improvement in F1-Score and Specificity compared to Multi-Task learning when there’s limited data.
K.M.A.Sultan, B. Orkild, A. Morris, E. Kholmovski, E. Bieging, E. Kwan, R. Ranjan, E. DiBella, S. Elhabian.
Two-Stage Deep Learning Framework for Quality Assessment of Left Atrial Late Gadolinium Enhanced MRI Images, Subtitled arXiv:2310.08805, 2023.
Accurate assessment of left atrial fibrosis in patients with atrial fibrillation relies on high-quality 3D late gadolinium enhancement (LGE) MRI images. However, obtaining such images is challenging due to patient motion, changing breathing patterns, or sub-optimal choice of pulse sequence parameters. Automated assessment of LGE-MRI image diagnostic quality is clinically significant as it would enhance diagnostic accuracy, improve efficiency, ensure standardization, and contributes to better patient outcomes by providing reliable and high-quality LGE-MRI scans for fibrosis quantification and treatment planning. To address this, we propose a two-stage deep-learning approach for automated LGE-MRI image diagnostic quality assessment. The method includes a left atrium detector to focus on relevant regions and a deep network to evaluate diagnostic quality. We explore two training strategies, multi-task learning, and pretraining using contrastive learning, to overcome limited annotated data in medical imaging. Contrastive Learning result shows about 4%, and 9% improvement in F1-Score and Specificity compared to Multi-Task learning when there's limited data.
T. Sun, D. Li, B. Wang.
On the Decentralized Stochastic Gradient Descent with Markov Chain Sampling, In IEEE Transactions on Signal Processing, IEEE, July, 2023.
The decentralized stochastic gradient method emerges as a promising solution for solving large-scale machine learning problems. This paper studies the decentralized Markov chain gradient descent (DMGD), a variant of the decentralized stochastic gradient method, which draws random samples along the trajectory of a Markov chain. DMGD arises when obtaining independent samples is costly or impossible, excluding the use of the traditional stochastic gradient algorithms. Specifically, we consider the DMGD over a connected graph, where each node only communicates with its neighbors by sending and receiving the intermediate results. We establish both ergodic and nonergodic convergence rates of DMGD, which elucidate the critical dependencies on the topology of the graph that connects all nodes and the mixing time of the Markov chain. We further numerically verify the sample efficiency of DMGD.
T. Sun, Q. Wang, Y. Lei, D. Li, B. Wang.
Pairwise Learning with Adaptive Online Gradient Descent, In Transactions on Machine Learning Research, 2023.
In this paper, we propose an adaptive online gradient descent method with momentum for pairwise learning, in which the step size is determined by historical information. Due to the structure of pairwise learning, the sample pairs are dependent on the parameters, causing difficulties in the convergence analysis. To this end, we develop novel techniques for the convergence analysis of the proposed algorithm. We show that the proposed algorithm can output the desired solution in strongly convex, convex, and nonconvex cases. Furthermore, we present theoretical explanations for why our proposed algorithm can accelerate previous workhorses for online pairwise learning. All assumptions used in the theoretical analysis are mild and common, making our results applicable to various pairwise learning problems. To demonstrate the efficiency of our algorithm, we compare the proposed adaptive method with the non-adaptive counterpart on the benchmark online AUC maximization problem.
J. Tate, Z. Liu, J.A. Bergquist, S. Rampersad, D. White, C. Charlebois, L. Rupp, D. Brooks, R. MacLeod, A. Narayan.
UncertainSCI: A Python Package for Noninvasive Parametric Uncertainty Quantification of Simulation Pipelines, In Journal of Open Source Software, Vol. 8, No. 90, 2023.
We have developed UncertainSCI (UncertainSCI, 2020) as an open-source tool designed to make modern uncertainty quantification (UQ) techniques more accessible in biomedical simulation applications. UncertainSCI is implemented in Python with a noninvasive interface to meet our software design goals of 1) numerical accuracy, 2) simple application programming interface (API), 3) adaptability to many applications and methods, and 4) interfacing with diverse simulation software. Using a Python implementation in UncertainSCI allowed us to utilize the popularity and low barrier-to-entry of Python and its common packages and to leverage the built-in integration and support for Python in common simulation software packages and languages. Additionally, we used noninvasive UQ techniques and created a similarly noninvasive interface to external modeling software that can be called in diverse ways, depending on the complexity and level of Python integration in the external simulation pipeline. We have developed and included examples applying UncertainSCI to relatively simple 1D simulations implemented in Python, and to bioelectric field simulations implemented in external software packages, which demonstrate the use of UncertainSCI and the effectiveness of the architecture and implementation in achieving our design goals. UnceratainSCI differs from similar software, notably UQLab, Uncertainpy, and Simnibs, in that it can be efficiently and non-invasively used with external simulation software, specifically with high resolution 3D simulations often used in Bioelectric field simulations. Figure 1 illustrates the use of UncertainSCI in computing UQ with modeling pipelines for bioelectricity simulations
R. Tohid, S. Shirzad, C. Taylor, S.A. Sakin, K.E. Isaacs, H. Kaiser.
Halide Code Generation Framework in Phylanx, In Euro-Par 2022: Parallel Processing Workshops, Springer Nature Switzerland, pp. 32--45. 2023.
ISBN: 978-3-031-31209-0
DOI: 10.1007/978-3-031-31209-0_3
Separating algorithms from their computation schedule has become a de facto solution to tackle the challenges of developing high performance code on modern heterogeneous architectures. Common approaches include Domain-specific languages (DSLs) which provide familiar APIs to domain experts, code generation frameworks that automate the generation of fast and portable code, and runtime systems that manage threads for concurrency and parallelism. In this paper, we present the Halide code generation framework for Phylanx distributed array processing platform. This extension enables compile-time optimization of Phylanx primitives for target architectures. To accomplish this, (1) we implemented new Phylanx primitives using Halide, and (2) partially exported Halide's thread pool API to carry out parallelism on HPX (Phylanx's runtime) threads. (3) showcased HPX performance analysis tools made available to Halide applications. The evaluation of the work has been done in two steps. First, we compare the performance of Halide applications running on its native runtime with that of the new HPX backend to verify there is no cost associated with using HPX threads. Next, we compare performances of a number of original implementations of Phylanx primitives against the new ones in Halide to verify performance and portability benefits of Halide in the context of Phylanx.
J Ukey, S Elhabian.
Image2SSM: Localization-aware Deep Learning Framework for Statistical Shape Modeling Directly from Images, In Medical Imaging with Deep Learning, In Proceedings of Machine Learning Research , 2023.
Statistical Shape Modelling (SSM) is an effective tool for quantitatively analyzing anatomical populations. SSM has benefitted largely from advances in deep learning where statistical representations of anatomies (e.g., point distribution models or PDMs) are inferred directly from images, alleviating the need for a time-consuming and expensive workflow of anatomy segmentation, shape registration, and model optimization. Nonetheless, to date, existing deep learning methods do not consider the rigid pose transformation of shapes or anatomy of interest. They also require a tight bounding box to be defined over the image of anatomy-of-interest before feeding the image to the deep network for network training and inference. In this paper, we propose a deep learning framework that simultaneously detects and segments the anatomy of interest, estimate the rigid transformation with respect to the population mean (average) using a spatial transformer, and estimates the corresponding statistical representation of that anatomy, all directly from unsegmented 3D image without the need for any additional supervision. Furthermore, we leverage the segmentation task to provide an attention model for the sub-network that estimates shape representation, giving more accurate shape statistics for shape analysis.
Z. Wang, T. M. Athawale, K. Moreland, J. Chen, C. R. Johnson, D. Pugmire.
FunMC2: A Filter for Uncertainty Visualization of Marching Cubes on Multi-Core Devices, In Eurographics Symposium on Parallel Graphics and Visualization, 2023.
DOI: 10.2312/pgv.20231081
Visualization is an important tool for scientists to extract understanding from complex scientific data. Scientists need to understand the uncertainty inherent in all scientific data in order to interpret the data correctly. Uncertainty visualization has been an active and growing area of research to address this challenge. Algorithms for uncertainty visualization can be expensive, and research efforts have been focused mainly on structured grid types. Further, support for uncertainty visualization in production tools is limited. In this paper, we adapt an algorithm for computing key metrics for visualizing uncertainty in Marching Cubes (MC) to multi-core devices and present the design, implementation, and evaluation for a Filter for uncertainty visualization of Marching Cubes on Multi-Core devices (FunMC2). FunMC2 accelerates the uncertainty visualization of MC significantly, and it is portable across multi-core CPUs and GPUs. Evaluation results show that FunMC2 based on OpenMP runs around 11× to 41× faster on multi-core CPUs than the corresponding serial version using one CPU core. FunMC2 based on a single GPU is around 5× to 9× faster than FunMC2 running by OpenMP. Moreover, FunMC2 is flexible enough to process ensemble data with both structured and unstructured mesh types. Furthermore, we demonstrate that FunMC2 can be seamlessly integrated as a plugin into ParaView, a production visualization tool for post-processing.
K. Williams, A. Bigelow, K.E. Isaacs.
Data Abstraction Elephants: The Initial Diversity of Data Representations and Mental Models, In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ’23), ACM, 2023.
Two people looking at the same dataset will create diferent mental models, prioritize diferent attributes, and connect with diferent visualizations. We seek to understand the space of data abstractions associated with mental models and how well people communicate their mental models when sketching. Data abstractions have a profound infuence on the visualization design, yet it’s unclear how universal they may be when not initially infuenced by a representation. We conducted a study about how people create their mental models from a dataset. Rather than presenting tabular data, we presented each participant with one of three datasets in paragraph form, to avoid biasing the data abstraction and mental model. We observed various mental models, data abstractions, and depictions from the same dataset, and how these concepts are infuenced by communication and purpose-seeking. Our results have implications for visualization design, especially during the discovery and data collection phase.
H. Xu, S. Elhabian.
Image2SSM: Reimagining Statistical Shape Models from Images with Radial Basis Functions, Subtitled arXiv:2305.11946, 2023.
Statistical shape modeling (SSM) is an essential tool for analyzing variations in anatomical morphology. In a typical SSM pipeline, 3D anatomical images, gone through segmentation and rigid registration, are represented using lower-dimensional shape features, on which statistical analysis can be performed. Various methods for constructing compact shape representations have been proposed, but they involve laborious and costly steps. We propose Image2SSM, a novel deep-learning-based approach for SSM that leverages image-segmentation pairs to learn a radial-basis-function (RBF)-based representation of shapes directly from images. This RBF-based shape representation offers a rich self-supervised signal for the network to estimate a continuous, yet compact representation of the underlying surface that can adapt to complex geometries in a data-driven manner. Image2SSM can characterize populations of biological structures of interest by constructing statistical landmark-based shape models of ensembles of anatomical shapes while requiring minimal parameter tuning and no user assistance. Once trained, Image2SSM can be used to infer low-dimensional shape representations from new unsegmented images, paving the way toward scalable approaches for SSM, especially when dealing with large cohorts. Experiments on synthetic and real datasets show the efficacy of the proposed method compared to the state-of-art correspondence-based method for SSM.
H. Xu, A. Morris, S.Y. Elhabian.
Particle-Based Shape Modeling for Arbitrary Regions-of-Interest, In Shape in Medical Imaging, Lecture Notes in Computer Science, vol 14350, 2023.
Statistical Shape Modeling (SSM) is a quantitative method for analyzing morphological variations in anatomical structures. These analyses often necessitate building models on targeted anatomical regions of interest to focus on specific morphological features. We propose an extension to particle-based shape modeling (PSM), a widely used SSM framework, to allow shape modeling to arbitrary regions of interest. Existing methods to define regions of interest are computationally expensive and have topological limitations. To address these shortcomings, we use mesh fields to define free-form constraints, which allow for delimiting arbitrary regions of interest on shape surfaces. Furthermore, we add a quadratic penalty method to the model optimization to enable computationally efficient enforcement of any combination of cutting-plane and free-form constraints. We demonstrate the effectiveness of this method on a challenging synthetic dataset and two medical datasets.
B. Zhang, P.E. Davis, N. Morales, Z. Zhang, K. Teranishi, M. Parashar.
Optimizing Data Movement for GPU-Based In-Situ Workflow Using GPUDirect RDMA, In Euro-Par 2023: Parallel Processing, Springer Nature Switzerland, pp. 323--338. 2023.
ISBN: 978-3-031-39698-4
DOI: 10.1007/978-3-031-39698-4_22
The extreme-scale computing landscape is increasingly dominated by GPU-accelerated systems. At the same time, in-situ workflows that employ memory-to-memory inter-application data exchanges have emerged as an effective approach for leveraging these extreme-scale systems. In the case of GPUs, GPUDirect RDMA enables third-party devices, such as network interface cards, to access GPU memory directly and has been adopted for intra-application communications across GPUs. In this paper, we present an interoperable framework for GPU-based in-situ workflows that optimizes data movement using GPUDirect RDMA. Specifically, we analyze the characteristics of the possible data movement pathways between GPUs from an in-situ workflow perspective, and design a strategy that maximizes throughput. Furthermore, we implement this approach as an extension of the DataSpaces data staging service, and experimentally evaluate its performance and scalability on a current leadership GPU cluster. The performance results show that the proposed design reduces data-movement time by up to 53% and 40% for the sender and receiver, respectively, and maintains excellent scalability for up to 256 GPUs.
Page 7 of 140