SCIENTIFIC COMPUTING AND IMAGING INSTITUTE
at the University of Utah

An internationally recognized leader in visualization, scientific computing, and image analysis

SCI Publications

2024


X. Tang, J. Berquist, B.A. Steinberg, T. Tasdizen. “Hierarchical Transformer for Electrocardiogram Diagnosis,” Subtitled “arXiv:2411.00755,” 2024.

ABSTRACT

Transformers, originally prominent in NLP and computer vision, are now being adapted for ECG signal analysis. This paper introduces a novel hierarchical transformer architecture that segments the model into multiple stages by assessing the spatial size of the embeddings, thus eliminating the need for additional downsampling strategies or complex attention designs. A classification token aggregates information across feature scales, facilitating interactions between different stages of the transformer. By utilizing depth-wise convolutions in a six-layer convolutional encoder, our approach preserves the relationships between different ECG leads. Moreover, an attention gate mechanism learns associations among the leads prior to classification. This model adapts flexibly to various embedding networks and input sizes while enhancing the interpretability of transformers in ECG signal analysis.



T. Tasdizen. “VISTA: A Visual and Textual Attention Dataset for Interpreting Multimodal Models,” Subtitled “arXiv:2410.04609,” 2024.

ABSTRACT

Recent developments in deep learning (DL) have led to the integration of natural language processing (NLP) with computer vision, resulting in powerful integrated Vision and Language Models (VLMs). Despite their remarkable capabilities, these models are frequently regarded as black boxes within the machine learning research community. This raises a critical question: which parts of an image correspond to specific segments of text, and how can we decipher these associations? Understanding these connections is essential for enhancing model transparency, interpretability, and trustworthiness. To answer this question, we present an image-text aligned human visual attention dataset that maps specific associations between image regions and corresponding text segments. We then compare the internal heatmaps generated by VL models with this dataset, allowing us to analyze and better understand the model’s decision-making process. This approach aims to enhance model transparency, interpretability, and trustworthiness by providing insights into how these models align visual and linguistic information. We conducted a comprehensive study on text-guided visual saliency detection in these VL models. This study aims to understand how different models prioritize and focus on specific visual elements in response to corresponding text segments, providing deeper insights into their internal mechanisms and improving our ability to interpret their outputs.



M. Taufer, H. Martinez, J. Luettgau, L. Whitnah, G. Scorzelli, P. Newell, A. Panta, T. Bremer, D. Fils, C.R. Kirkpatrick, N. McCurdy, V. Pascucci. “Integrating FAIR Digital Objects (FDOs) into the National Science Data Fabric (NSDF) to Revolutionize Dataflows for Scientific Discovery,” In Computing in Science & Engineering, IEEE, 2024.

ABSTRACT

In this perspective paper, we introduce a paradigm-shifting approach that combines the power of FAIR Digital Objects (FDO) with the National Science Data Fabric (NSDF), defining a new era of data accessibility, scientific discovery, and education. Integrating FDOs into the NSDF opens doors to overcoming substantial data access barriers and facilitating the extraction of machine-actionable metadata aligned with FAIR principles. Our augmented NSDF empowers the exchange of massive climate simulations and streamlines materials science workflows. This paper lays the foundation for an inclusive, web-centric, and network-first design, democratizing data access and fostering unprecedented opportunities for research and collaboration within the scientific community.



J. Ukey, T. Kataria, S.Y. Elhabian. “MASSM: An End-to-End Deep Learning Framework for Multi-Anatomy Statistical Shape Modeling Directly From Images,” Subtitled “arXiv:2403.11008,” 2024.

ABSTRACT

Statistical Shape Modeling (SSM) is an effective method for quantitatively analyzing anatomical variations within populations. However, its utility is limited by the need for manual segmentations of anatomies, a task that relies on the scarce expertise of medical professionals. Recent advances in deep learning have provided a promising approach that automatically generates statistical representations from unsegmented images. Once trained, these deep learning-based models eliminate the need for manual segmentation for new subjects. Nonetheless, most current methods still require manual pre-alignment of image volumes and specifying a bounding box around the target anatomy prior to inference, resulting in a partially manual inference process. Recent approaches facilitate anatomy localization but only estimate statistical representations at the population level. However, they cannot delineate anatomy directly in images and are limited to modeling a single anatomy. Here, we introduce MASSM, a novel end-to-end deep learning framework that simultaneously localizes multiple anatomies in an image, estimates population-level statistical representations, and delineates each anatomy. Our findings emphasize the crucial role of local correspondences, showcasing their indispensability in providing superior shape information for medical imaging tasks.



J. Ukey, T. Kataria, S.Y. Elhabian. “Weakly SSM: On the Viability of Weakly Supervised Segmentations for Statistical Shape Modeling,” Subtitled “arXiv:2407.15260,” 2024.

ABSTRACT

Statistical Shape Models (SSMs) excel at identifying population-level anatomical variations, which is at the core of various clinical and biomedical applications, including morphology-based diagnostics and surgical planning. However, the effectiveness of SSM is often constrained by the necessity for expert-driven manual segmentation, a process that is both time-intensive and expensive, thereby restricting their broader application and utility. Recent deep learning approaches enable the direct estimation of Statistical Shape Models (SSMs) from unsegmented images. While these models can predict SSMs without segmentation during deployment, they do not address the challenge of acquiring the manual annotations needed for training, particularly in resource-limited settings. Semi-supervised and foundation models for anatomy segmentation can mitigate the annotation burden. Yet, despite the abundance of available approaches, there are no established guidelines to inform end-users on their effectiveness for the downstream task of constructing SSMs. In this study, we systematically evaluate the potential of weakly supervised methods as viable alternatives to manual segmentations for building SSMs. We establish a new performance benchmark by employing various semi-supervised and foundational model methods for anatomy segmentation under low annotation settings, utilizing the predicted segmentations for the task of SSM. We compare the modes of shape variation and use quantitative metrics to compare against a shape model derived from a manually annotated dataset. Our results indicate that some methods produce noisy segmentation, which is very unfavorable for SSM tasks, while others can capture the correct modes of variations in the population cohort with a 60-80% reduction in required manual annotation.



S. Viknesh, Y. Tatari, A. Arzani. “ADAM-SINDy: An Efficient Optimization Framework for Parameterized Nonlinear Dynamical System Identification,” Subtitled “arXiv:2410.16528,” 2024.

ABSTRACT

Identifying nonlinear dynamical systems characterized by nonlinear parameters presents significant challenges in deriving mathematical models that enhance understanding of physical phenomena. Traditional methods, such as Sparse Identification of Nonlinear Dynamics (SINDy) and symbolic regression, can extract governing equations from observational data; however, they also come with distinct advantages and disadvantages. This paper introduces a novel methodology within the SINDy framework, termed ADAM-SINDy, which synthesizes the strengths of established approaches by employing the ADAM optimization algorithm. This integration facilitates the simultaneous optimization of nonlinear parameters and coefficients associated with nonlinear candidate functions, enabling efficient and precise parameter estimation without requiring prior knowledge of nonlinear characteristics such as trigonometric frequencies, exponential bandwidths, or polynomial exponents, thereby addressing a key limitation of the classical SINDy framework. Through an integrated global optimization, ADAM-SINDy dynamically adjusts all unknown variables in response to system-specific data, resulting in a more adaptive and efficient identification procedure that reduces the sensitivity to the library of candidate functions. The performance of the ADAM-SINDy methodology is demonstrated across a spectrum of dynamical systems, including benchmark coupled nonlinear ordinary differential equations such as oscillators, chaotic fluid flows, reaction kinetics, pharmacokinetics, as well as nonlinear partial differential equations (wildfire transport). The results demonstrate significant improvements in identifying parameterized dynamical systems and underscore the importance of concurrently optimizing all parameters, particularly those characterized by nonlinear parameters. 
These findings highlight the potential of ADAM-SINDy to extend the applicability of the SINDy framework in addressing more complex challenges in dynamical system identification.
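
The joint update of coefficients and nonlinear parameters described in this abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation. It assumes a single hypothetical candidate function exp(-k x) with coefficient c, synthetic noise-free derivative samples, and a hand-written Adam update with commonly used hyperparameters, and it leaves out the sparsity-promoting step that a SINDy-type method would normally include.

```python
import math

def fit_adam(xs, dxdt, c=1.0, k=0.5, lr=0.01, steps=5000,
             beta1=0.9, beta2=0.999, eps=1e-8):
    """Jointly fit coefficient c and exponent k in dx/dt ~ c * exp(-k * x).

    All names and default values here are illustrative assumptions.
    Returns the fitted c, the fitted k, and the loss recorded at each step.
    """
    params = [c, k]
    m = [0.0, 0.0]  # first-moment (mean) estimates of the gradient
    v = [0.0, 0.0]  # second-moment (uncentered variance) estimates
    n = len(xs)
    history = []
    for step in range(1, steps + 1):
        c, k = params
        grad_c = grad_k = loss = 0.0
        for x, target in zip(xs, dxdt):
            basis = math.exp(-k * x)
            residual = c * basis - target
            loss += residual ** 2 / n
            grad_c += 2.0 * residual * basis / n             # d(loss)/dc
            grad_k += 2.0 * residual * c * (-x) * basis / n  # d(loss)/dk
        history.append(loss)
        # One Adam step applied to both parameters at once.
        for i, g in enumerate((grad_c, grad_k)):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** step)
            v_hat = v[i] / (1 - beta2 ** step)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params[0], params[1], history

# Synthetic samples generated from dx/dt = 2 * exp(-1.5 * x) (assumed ground truth).
xs = [2.0 * i / 49 for i in range(50)]
dxdt = [2.0 * math.exp(-1.5 * x) for x in xs]
c_fit, k_fit, history = fit_adam(xs, dxdt)
print(c_fit, k_fit, history[0], history[-1])
```

Here the exponent k is not fixed in advance, which is the point the abstract makes about avoiding prior knowledge of nonlinear characteristics; in a classical fixed-library setting only c would be fitted.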



S. Viknesh, A. Tohidi, F. Afghah, R. Stoll, A. Arzani. “Role of flow topology in wind-driven wildfire propagation,” Subtitled “arXiv:2411.04007,” 2024.

ABSTRACT

Wildfires propagate through intricate interactions between wind, fuel, and terrain, resulting in complex behaviors that pose challenges for accurate predictions. This study investigates the interaction between wind velocity topology and wildfire spread dynamics, aiming to enhance our understanding of wildfire spread patterns. We revisited the non-dimensionalization of the governing combustion model by incorporating three distinct time scales. This approach revealed two new non-dimensional numbers, contrasting with the conventional non-dimensionalization that considers only a single time scale. Through scaling analysis, we analytically identified the critical determinants of transient wildfire behavior and established a state-neutral curve, indicating where initial wildfires extinguish for specific combinations of the identified non-dimensional numbers. Subsequently, a wildfire transport solver was developed using a finite difference method, integrating compact schemes and implicit-explicit Runge-Kutta methods. We explored the influence of stable and unstable manifolds in wind velocity on the transport of wildfire under steady wind conditions defined using a saddle-type fixed point flow, emphasizing the role of the non-dimensional numbers. Additionally, we considered the benchmark unsteady double-gyre flow and examined the effect of unsteady wind topology on wildfire propagation, and quantified the wildfire response to varying wind oscillation frequencies and amplitudes using a transfer function approach. The results were also compared to Lagrangian coherent structures (LCS) used to characterize the correspondence of manifolds with wildfire propagation. The comprehensive approach of utilizing the manifolds computed from wind topology provides valuable insights into wildfire dynamics across diverse wind scenarios, offering a potential tool for improved predictive modeling and management strategies.
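
For readers unfamiliar with the benchmark unsteady double-gyre flow mentioned in this abstract, the sketch below writes out its usual analytic velocity field, derived from the stream function psi = A sin(pi f(x, t)) sin(pi y) on the domain [0, 2] x [0, 1]. The parameter values A, eps, and omega are commonly used defaults and are assumptions here; the abstract does not state which values the study used, and the function names are illustrative.

```python
import math

def double_gyre_velocity(x, y, t, A=0.1, eps=0.25, omega=2.0 * math.pi / 10.0):
    """Velocity (u, v) of the double-gyre flow on the domain [0, 2] x [0, 1].

    A, eps, and omega are assumed, commonly used benchmark values.
    """
    a = eps * math.sin(omega * t)
    b = 1.0 - 2.0 * eps * math.sin(omega * t)
    f = a * x * x + b * x
    dfdx = 2.0 * a * x + b
    # u = -d(psi)/dy and v = d(psi)/dx, so the field is divergence-free.
    u = -math.pi * A * math.sin(math.pi * f) * math.cos(math.pi * y)
    v = math.pi * A * math.cos(math.pi * f) * math.sin(math.pi * y) * dfdx
    return u, v

def divergence(x, y, t, h=1e-5):
    """Central-difference estimate of du/dx + dv/dy at one point."""
    du_dx = (double_gyre_velocity(x + h, y, t)[0]
             - double_gyre_velocity(x - h, y, t)[0]) / (2.0 * h)
    dv_dy = (double_gyre_velocity(x, y + h, t)[1]
             - double_gyre_velocity(x, y - h, t)[1]) / (2.0 * h)
    return du_dx + dv_dy

print(double_gyre_velocity(1.3, 0.4, 2.5))
print(divergence(1.3, 0.4, 2.5))
```

Setting eps to zero gives the steady version of the flow; a nonzero eps makes the line separating the two gyres oscillate in time, which is the kind of unsteady wind topology the abstract refers to.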



S. Wang, H. Yan, K.E. Isaacs, Y. Sun. “Visual Exploratory Analysis for Designing Large-Scale Network-on-Chip Architectures: A Domain Expert-Led Design Study,” In IEEE Transactions on Visualization and Computer Graphics, Vol. 30, pp. 1970-1983. 2024.

ABSTRACT

Visualization design studies bring together visualization researchers and domain experts to address yet unsolved data analysis challenges stemming from the needs of the domain experts. Typically, the visualization researchers lead the design study process and implementation of any visualization solutions. This setup leverages the visualization researchers' knowledge of methodology, design, and programming, but the availability to synchronize with the domain experts can hamper the design process. We consider an alternative setup where the domain experts take the lead in the design study, supported by the visualization experts. In this study, the domain experts are computer architecture experts who simulate and analyze novel computer chip designs. These chips rely on a Network-on-Chip (NOC) to connect components. The experts want to understand how the chip designs perform and what in the design led to their performance. To aid this analysis, we develop Vis4Mesh, a visualization system that provides spatial, temporal, and architectural context to simulated NOC behavior. Integration with an existing computer architecture visualization tool enables architects to perform deep-dives into specific architecture component behavior. We validate Vis4Mesh through a case study and a user study with computer architecture researchers. We reflect on our design and process, discussing advantages, disadvantages, and guidance for engaging in domain expert-led design studies.



S.H. Wang, J. Baker, C. Hauck, B. Wang. “Learning to Control the Smoothness of Graph Convolutional Network Features,” Subtitled “arXiv:2410.14604,” 2024.

ABSTRACT

The pioneering work of Oono and Suzuki [ICLR, 2020] and Cai and Wang [arXiv:2006.13318] initializes the analysis of the smoothness of graph convolutional network (GCN) features. Their results reveal an intricate empirical correlation between node classification accuracy and the ratio of smooth to non-smooth feature components. However, the optimal ratio that favors node classification is unknown, and the non-smooth features of deep GCN with ReLU or leaky ReLU activation function diminish. In this paper, we propose a new strategy to let GCN learn node features with a desired smoothness – adapting to data and tasks – to enhance node classification. Our approach has three key steps: (1) We establish a geometric relationship between the input and output of ReLU or leaky ReLU. (2) Building on our geometric insights, we augment the message-passing process of graph convolutional layers (GCLs) with a learnable term to modulate the smoothness of node features with computational efficiency. (3) We investigate the achievable ratio between smooth and non-smooth feature components for GCNs with the augmented message-passing scheme. Our extensive numerical results show that the augmented message-passing schemes significantly improve node classification for GCN and some related models.



H.P. Yeh, M. Bayat, A. Arzani, J.H. Hattel. “Accelerated process parameter selection of polymer-based selective laser sintering via hybrid physics-informed neural network and finite element surrogate modelling,” In Applied Mathematical Modelling, Vol. 130, pp. 693-712. 2024.

ABSTRACT

The state of the melt region as well as the temperature field are critical indicators reflecting the stability of the process and subsequent product quality in selective laser sintering (SLS). The present study compares various simulation models for analyzing melt pool morphologies, specifically considering their complex transient evolution. While thermal fluid dynamic simulations offer comprehensive insights into melt regions, their inherent high computational time demand is a drawback. In SLS, the polymer's high viscosity and low conductivity limit liquid flow, thereby promoting a slow evolution of the melt region formation. Based on this observation, utilizing low-complexity pure heat conduction simulation can be adequate for describing melt region morphologies as compared to the more complex thermal fluid dynamic simulations. In the present work, we propose such a purely conduction based finite element (FE) model and use it in combination with an AI-powered partial differential equation (PDE) solver based on a parametric physics-informed neural network (PINN). We specifically conduct the simulations for the sintering process, where large thermal gradients are present, with the parametric PINN based model, whereas we employ the finite element method (FEM) for the cooling phase in which gradients and cooling rates are several orders lower, thus enabling the prediction of sintering temperature and melt region morphology under various configurations. The combined hybrid model demonstrates less than 7% deviation in temperatures and less than 1% in melt pool sizes as compared to the pure FEM-based models, with faster computational times of 0.7 s for sintering and 20 min for cooling. Moreover, the hybrid model is utilized for multi-track simulation with parametric variations with the purpose of optimizing the manufacturing process. 
Our model provides an approach to determine the most suitable combinations of settings that enhance manufacturing speed while preventing issues such as lack of fusion and material degradation.



H.Y. Zewdie, O.L. Sarmiento, J.D. Pinzón, M.A. Wilches-Mogollon, P.A. Arbelaez, L. Baldovino-Chiquillo, D. Hidalgo, L. Guzman, S.J. Mooney, Q.C. Nguyen, T. Tasdizen, D.A. Quistberg. “Road Traffic Injuries and the Built Environment in Bogotá, Colombia, 2015–2019: A Cross-Sectional Analysis,” In Journal of Urban Health, Springer, 2024.

ABSTRACT

Nine in 10 road traffic deaths occur in low- and middle-income countries (LMICs). Despite this disproportionate burden, few studies have examined built environment correlates of road traffic injury in these settings, including in Latin America. We examined road traffic collisions in Bogotá, Colombia, occurring between 2015 and 2019, and assessed the association between neighborhood-level built environment features and pedestrian injury and death. We used descriptive statistics to characterize all police-reported road traffic collisions that occurred in Bogotá between 2015 and 2019. Cluster detection was used to identify spatial clustering of pedestrian collisions. Adjusted multivariate Poisson regression models were fit to examine associations between several neighborhood-built environment features and rate of pedestrian road traffic injury and death. A total of 173,443 police-reported traffic collisions occurred in Bogotá between 2015 and 2019. Pedestrians made up about 25% of road traffic injuries and 50% of road traffic deaths in Bogotá between 2015 and 2019. Pedestrian collisions were spatially clustered in the southwestern region of Bogotá. Neighborhoods with more street trees (RR, 0.90; 95% CI, 0.82–0.98), traffic signals (0.89, 0.81–0.99), and bus stops (0.89, 0.82–0.97) were associated with lower pedestrian road traffic deaths. Neighborhoods with greater density of large roads were associated with higher pedestrian injury. Our findings highlight the potential for pedestrian-friendly infrastructure to promote safer interactions between pedestrians and motorists in Bogotá and in similar urban contexts globally.
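
As a reading aid for the rate ratios (RR) and confidence intervals quoted in this abstract, the sketch below shows the standard relationship between a log-link Poisson regression coefficient and its rate ratio: RR = exp(beta), with a Wald-type 95% interval of exp(beta ± 1.96·SE). That the authors computed their intervals this way is an assumption; the function names are illustrative, and the coefficient and standard error are back-calculated from the rounded street-tree figures, not taken from the paper.

```python
import math

def rate_ratio_ci(beta, se, z=1.96):
    """Rate ratio and Wald-type interval from a log-scale coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def log_scale_from_ci(rr, lower, upper, z=1.96):
    """Back out the implied coefficient and standard error from a reported interval."""
    beta = math.log(rr)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se

# Street-tree estimate quoted in the abstract: RR 0.90, 95% CI 0.82 to 0.98.
beta, se = log_scale_from_ci(0.90, 0.82, 0.98)
rr, lower, upper = rate_ratio_ci(beta, se)
print(round(rr, 2), round(lower, 2), round(upper, 2))  # prints 0.9 0.82 0.98
```

An interval that lies entirely below 1, as here, corresponds to a lower estimated rate of pedestrian deaths in neighborhoods with more of the feature.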


2023


J. Adams, S. Elhabian. “Fully Bayesian VIB-DeepSSM,” Subtitled “arXiv:2305.05797,” 2023.

ABSTRACT

Statistical shape modeling (SSM) enables population-based quantitative analysis of anatomical shapes, informing clinical diagnosis. Deep learning approaches predict correspondence-based SSM directly from unsegmented 3D images but require calibrated uncertainty quantification, motivating Bayesian formulations. Variational information bottleneck DeepSSM (VIB-DeepSSM) is an effective, principled framework for predicting probabilistic shapes of anatomy from images with aleatoric uncertainty quantification. However, VIB is only half-Bayesian and lacks epistemic uncertainty inference. We derive a fully Bayesian VIB formulation from both the probably approximately correct (PAC)-Bayes and variational inference perspectives. We demonstrate the efficacy of two scalable approaches for Bayesian VIB with epistemic uncertainty: concrete dropout and batch ensemble. Additionally, we introduce a novel combination of the two that further enhances uncertainty calibration via multimodal marginalization. Experiments on synthetic shapes and left atrium data demonstrate that the fully Bayesian VIB network predicts SSM from images with improved uncertainty reasoning without sacrificing accuracy.



J. Adams, S. Elhabian. “Can point cloud networks learn statistical shape models of anatomies?,” Subtitled “arXiv:2305.05610,” 2023.

ABSTRACT

Statistical Shape Modeling (SSM) is a valuable tool for investigating and quantifying anatomical variations within populations of anatomies. However, traditional correspondence-based SSM generation methods require a time-consuming re-optimization process each time a new subject is added to the cohort, making the inference process prohibitive for clinical research. Additionally, they require complete geometric proxies (e.g., high-resolution binary volumes or surface meshes) as input shapes to construct the SSM. Unordered 3D point cloud representations of shapes are more easily acquired from various medical imaging practices (e.g., thresholded images and surface scanning). Point cloud deep networks have recently achieved remarkable success in learning permutation-invariant features for different point cloud tasks (e.g., completion, semantic segmentation, classification). However, their application to learning SSM from point clouds is to date unexplored. In this work, we demonstrate that existing point cloud encoder-decoder-based completion networks can provide an untapped potential for SSM, capturing population-level statistical representations of shapes while reducing the inference burden and relaxing the input requirement. We discuss the limitations of these techniques to the SSM application and suggest future improvements. Our work paves the way for further exploration of point cloud deep learning for SSM, a promising avenue for advancing shape analysis literature and broadening SSM to diverse use cases.



J. Adams, S. Elhabian. “Point2SSM: Learning Morphological Variations of Anatomies from Point Cloud,” Subtitled “arXiv:2305.14486,” 2023.

ABSTRACT

We introduce Point2SSM, a novel unsupervised learning approach that can accurately construct correspondence-based statistical shape models (SSMs) of anatomy directly from point clouds. SSMs are crucial in clinical research for analyzing the population-level morphological variation in bones and organs. However, traditional methods for creating SSMs have limitations that hinder their widespread adoption, such as the need for noise-free surface meshes or binary volumes, reliance on assumptions or predefined templates, and simultaneous optimization of the entire cohort leading to lengthy inference times given new data. Point2SSM overcomes these barriers by providing a data-driven solution that infers SSMs directly from raw point clouds, reducing inference burdens and increasing applicability as point clouds are more easily acquired. Deep learning on 3D point clouds has seen recent success in unsupervised representation learning, point-to-point matching, and shape correspondence; however, their application to constructing SSMs of anatomies is largely unexplored. In this work, we benchmark state-of-the-art point cloud deep networks on the task of SSM and demonstrate that they are not robust to the challenges of anatomical SSM, such as noisy, sparse, or incomplete input and significantly limited training data. Point2SSM addresses these challenges via an attention-based module that provides correspondence mappings from learned point features. We demonstrate that the proposed method significantly outperforms existing networks in terms of both accurate surface sampling and correspondence, better capturing population-level statistics.



J. Adams, S.Y. Elhabian. “Benchmarking Scalable Epistemic Uncertainty Quantification in Organ Segmentation,” Subtitled “arXiv:2308.07506,” 2023.

ABSTRACT

Deep learning based methods for automatic organ segmentation have shown promise in aiding diagnosis and treatment planning. However, quantifying and understanding the uncertainty associated with model predictions is crucial in critical clinical applications. While many techniques have been proposed for epistemic or model-based uncertainty estimation, it is unclear which method is preferred in the medical image analysis setting. This paper presents a comprehensive benchmarking study that evaluates epistemic uncertainty quantification methods in organ segmentation in terms of accuracy, uncertainty calibration, and scalability. We provide a comprehensive discussion of the strengths, weaknesses, and out-of-distribution detection capabilities of each method as well as recommendations for future improvements. These findings contribute to the development of reliable and robust models that yield accurate segmentations while effectively quantifying epistemic uncertainty.



M. Adair, I. Rodero, M. Parashar, D. Melgar. “Accelerating Data-Intensive Seismic Research Through Parallel Workflow Optimization and Federated Cyberinfrastructure,” In Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, ACM, pp. 1970-1977. 2023.
DOI: 10.1145/3624062.3624276

ABSTRACT

Earthquake early warning systems use synthetic data from simulation frameworks like MudPy to train models for predicting the magnitudes of large earthquakes. MudPy, although powerful, has limitations: a lengthy simulation time to generate the required data, lack of user-friendliness, and no platform for discovering and sharing its data. We introduce FakeQuakes DAGMan Workflow (FDW), which utilizes Open Science Grid (OSG) for parallel computations to accelerate and streamline MudPy simulations. FDW significantly reduces runtime and increases throughput compared to a single-machine setup. Using FDW, we also explore partitioned parallel HTCondor DAGMan workflows to enhance OSG efficiency. Additionally, we investigate leveraging cyberinfrastructure, such as Virtual Data Collaboratory (VDC), for enhancing MudPy and OSG. Specifically, we simulate using Cloud bursting policies to enforce FDW job-offloading to VDC during OSG peak demand, addressing shared resource issues and user goals; we also discuss VDC’s value in facilitating a platform for broad access to MudPy products.



D. Akbaba, D. Lange, M. Correll, A. Lex, M. Meyer. “Troubling Collaboration: Matters of Care for Visualization Design Study,” In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ’23), pp. 23-28. April, 2023.

ABSTRACT

A common research process in visualization is for visualization researchers to collaborate with domain experts to solve particular applied data problems. While there is existing guidance and expertise around how to structure collaborations to strengthen research contributions, there is comparatively little guidance on how to navigate the implications of, and power produced through, the socio-technical entanglements of collaborations. In this paper, we qualitatively analyze reflective interviews of past participants of collaborations from multiple perspectives: visualization graduate students, visualization professors, and domain collaborators. We juxtapose the perspectives of these individuals, revealing tensions about the tools that are built and the relationships that are formed: a complex web of competing motivations. Through the lens of matters of care, we interpret this web, concluding with considerations that both trouble and necessitate reformation of current patterns around collaborative work in visualization design studies to promote more equitable, useful, and care-ful outcomes.



M. Aliakbari, M.S. Sadrabadi, P. Vadasz, A. Arzani. “Ensemble physics informed neural networks: A framework to improve inverse transport modeling in heterogeneous domains,” In Physics of Fluids, AIP, 2023.

ABSTRACT

Modeling fluid flow and transport in heterogeneous systems is often challenged by unknown parameters that vary in space. In inverse modeling, measurement data are used to estimate these parameters. Due to the spatial variability of these unknown parameters in heterogeneous systems (e.g., permeability or diffusivity), the inverse problem is ill-posed and infinite solutions are possible. Physics-informed neural networks (PINN) have become a popular approach for solving inverse problems. However, in inverse problems in heterogeneous systems, PINN can be sensitive to hyperparameters and can produce unrealistic patterns. Motivated by the concept of ensemble learning and variance reduction in machine learning, we propose an ensemble PINN (ePINN) approach where an ensemble of parallel neural networks is used and each sub-network is initialized with a meaningful pattern of the unknown parameter. Subsequently, these parallel networks provide a basis that is fed into a main neural network that is trained using PINN. It is shown that an appropriately selected set of patterns can guide PINN in producing more realistic results that are relevant to the problem of interest. To assess the accuracy of this approach, inverse transport problems involving unknown heat conductivity, porous media permeability, and velocity vector fields were studied. The proposed ePINN approach was shown to increase the accuracy in inverse problems and mitigate the challenges associated with non-uniqueness.



A. Arzani, L. Yuan, P. Newell, B. Wang. “Interpreting and generalizing deep learning in physics-based problems with functional linear models,” Subtitled “arXiv:2307.04569,” 2023.

ABSTRACT

Although deep learning has achieved remarkable success in various scientific machine learning applications, its black-box nature poses concerns regarding interpretability and generalization capabilities beyond the training data. Interpretability is crucial and often desired in modeling physical systems. Moreover, acquiring extensive datasets that encompass the entire range of input features is challenging in many physics-based learning tasks, leading to increased errors when encountering out-of-distribution (OOD) data. In this work, motivated by the field of functional data analysis (FDA), we propose generalized functional linear models as an interpretable surrogate for a trained deep learning model. We demonstrate that our model could be trained either based on a trained neural network (post-hoc interpretation) or directly from training data (interpretable operator learning). A library of generalized functional linear models with different kernel functions is considered and sparse regression is used to discover an interpretable surrogate model that could be analytically presented. We present test cases in solid mechanics, fluid mechanics, and transport. Our results demonstrate that our model can achieve comparable accuracy to deep learning and can improve OOD generalization while providing more transparency and interpretability. Our study underscores the significance of interpretability in scientific machine learning and showcases the potential of functional linear models as a tool for interpreting and generalizing deep learning.



T.M. Athawale, C.R. Johnson, S. Sane, D. Pugmire. “Fiber Uncertainty Visualization for Bivariate Data With Parametric and Nonparametric Noise Models,” In IEEE Transactions on Visualization and Computer Graphics, Vol. 29, No. 1, IEEE, pp. 613-23. 2023.

ABSTRACT

Visualization and analysis of multivariate data and their uncertainty are top research challenges in data visualization. Constructing fiber surfaces is a popular technique for multivariate data visualization that generalizes the idea of level-set visualization for univariate data to multivariate data. In this paper, we present a statistical framework to quantify positional probabilities of fibers extracted from uncertain bivariate fields. Specifically, we extend the state-of-the-art Gaussian models of uncertainty for bivariate data to other parametric distributions (e.g., uniform and Epanechnikov) and more general nonparametric probability distributions (e.g., histograms and kernel density estimation) and derive corresponding spatial probabilities of fibers. In our proposed framework, we leverage Green’s theorem for closed-form computation of fiber probabilities when bivariate data are assumed to have independent parametric and nonparametric noise. Additionally, we present a nonparametric approach combined with numerical integration to study the positional probability of fibers when bivariate data are assumed to have correlated noise. For uncertainty analysis, we visualize the derived probability volumes for fibers via volume rendering and extracting level sets based on probability thresholds. We present the utility of our proposed techniques via experiments on synthetic and simulation datasets.