N. Truong, C. Yuksel, C. Watcharopas, J. A. Levine, R. M. Kirby. Particle Merging-and-Splitting, In IEEE Transactions on Visualization and Computer Graphics, IEEE, 2021.
Robustly handling collisions between individual particles in a large particle-based simulation has been a challenging problem. We introduce particle merging-and-splitting, a simple scheme for robustly handling collisions between particles that prevents inter-penetrations of separate objects without introducing numerical instabilities. This scheme merges colliding particles at the beginning of the time-step and then splits them at the end of the time-step. Thus, collisions last for the duration of a time-step, allowing neighboring particles of the colliding particles to influence each other. We show that our merging-and-splitting method is effective in robustly handling collisions and avoiding penetrations in particle-based simulations. We also show how our merging-and-splitting approach can be used for coupling different simulation systems using different and otherwise incompatible integrators. We present simulation tests …
W. Usher, X. Huang, S. Petruzza, S. Kumar, S. R. Slattery, S. T. Reeve, F. Wang, C. R. Johnson,, V. Pascucci. Adaptive Spatially Aware I/O for Multiresolution Particle Data Layouts, In IPDPS, 2021.
V. Vedam-Mai, K. Deisseroth, J. Giordano, G. Lazaro-Munoz, W. Chiong, N. Suthana, J. Langevin, J. Gill, W. Goodman, N. R. Provenza, C. H. Halpern, R. S. Shivacharan, T. N. Cunningham, S. A. Sheth, N. Pouratian, K. W. Scangos, H. S. Mayberg, A. Horn, K. A. Johnson, C. R. Butson, R. Gilron, C. de Hemptinne, R. Wilt, M. Yaroshinsky, S. Little, P. Starr, G. Worrell, P. Shirvalkar, E. Chang, J. Volkmann, M. Muthuraman, S. Groppa, A. A. Kühn, L. Li, M. Johnson, K. J. Otto, R. Raike, S. Goetz, C. Wu, P. Silburn, B. Cheeran, Y. J. Pathak, M. Malekmohammadi, A. Gunduz, J. K. Wong, S. Cernera, A. W. Shukla, A. Ramirez-Zamora, W. Deeb, A. Patterson, K. D. Foote, M. S. Okun.
Proceedings of the Eighth Annual Deep Brain Stimulation Think Tank: Advances in Optogenetics, Ethical Issues Affecting DBS Research, Neuromodulatory Approaches for Depression, Adaptive Neurostimulation, and Emerging DBS Technologies, In Frontiers in Human Neuroscience, Vol. 15, pp. 169. 2021.
We estimate that 208,000 deep brain stimulation (DBS) devices have been implanted to address neurological and neuropsychiatric disorders worldwide. DBS Think Tank presenters pooled data and determined that DBS expanded in its scope and has been applied to multiple brain disorders in an effort to modulate neural circuitry. The DBS Think Tank was founded in 2012 providing a space where clinicians, engineers, researchers from industry and academia discuss current and emerging DBS technologies and logistical and ethical issues facing the field. The emphasis is on cutting edge research and collaboration aimed to advance the DBS field. The Eighth Annual DBS Think Tank was held virtually on September 1 and 2, 2020 (Zoom Video Communications) due to restrictions related to the COVID-19 pandemic. The meeting focused on advances in: (1) optogenetics as a tool for comprehending neurobiology of diseases and on optogenetically-inspired DBS, (2) cutting edge of emerging DBS technologies, (3) ethical issues affecting DBS research and access to care, (4) neuromodulatory approaches for depression, (5) advancing novel hardware, software and imaging methodologies, (6) use of neurophysiological signals in adaptive neurostimulation, and (7) use of more advanced technologies to improve DBS clinical outcomes. There were 178 attendees who participated in a DBS Think Tank survey, which revealed the expansion of DBS into several indications such as obesity, post-traumatic stress disorder, addiction and Alzheimer’s disease. This proceedings summarizes the advances discussed at the Eighth Annual DBS Think Tank.
A. Venkat, A. Gyulassy, G. Kosiba, A. Maiti, H. Reinstein, R. Gee, P.-T. Bremer, V. Pascucci. Towards replacing physical testing of granular materials with a Topology-based Model, Subtitled arXiv preprint arXiv:2109.08777, 2021.
In the study of packed granular materials, the performance of a sample (e.g., the detonation of a high-energy explosive) often correlates to measurements of a fluid flowing through it. The "effective surface area," the surface area accessible to the airflow, is typically measured using a permeametry apparatus that relates the flow conductance to the permeable surface area via the Carman-Kozeny equation. This equation allows calculating the flow rate of a fluid flowing through the granules packed in the sample for a given pressure drop. However, Carman-Kozeny makes inherent assumptions about tunnel shapes and flow paths that may not accurately hold in situations where the particles possess a wide distribution in shapes, sizes, and aspect ratios, as is true with many powdered systems of technological and commercial interest. To address this challenge, we replicate these measurements virtually on micro-CT images of the powdered material, introducing a new Pore Network Model based on the skeleton of the Morse-Smale complex. Pores are identified as basins of the complex, their incidence encodes adjacency, and the conductivity of the capillary between them is computed from the cross-section at their interface. We build and solve a resistive network to compute an approximate laminar fluid flow through the pore structure. We provide two means of estimating flow-permeable surface area: (i) by direct computation of conductivity, and (ii) by identifying dead-ends in the flow coupled with isosurface extraction and the application of the Carman-Kozeny equation, with the aim of establishing consistency over a range of particle shapes, sizes, porosity levels, and void distribution patterns.
B. Wang, D. Zou, Q. Gu, S. J. Osher. Laplacian smoothing stochastic gradient markov chain monte carlo, In SIAM Journal on Scientific Computing, Vol. 43, No. 1, SIAM, pp. A26-A53. 2021.
As an important Markov chain Monte Carlo (MCMC) method, the stochastic gradient Langevin dynamics (SGLD) algorithm has achieved great success in Bayesian learning and posterior sampling. However, SGLD typically suffers from a slow convergence rate due to its large variance caused by the stochastic gradient. In order to alleviate these drawbacks, we leverage the recently developed Laplacian smoothing technique and propose a Laplacian smoothing stochastic gradient Langevin dynamics (LS-SGLD) algorithm. We prove that for sampling from both log-concave and non-log-concave densities, LS-SGLD achieves strictly smaller discretization error in 2-Wasserstein distance, although its mixing rate can be slightly slower. Experiments on both synthetic and real datasets verify our theoretical results and demonstrate the superior performance of LS-SGLD on different machine learning tasks including posterior …
Z. Wang, W. Xing, R. Kirby, S. Zhe. Multi-Fidelity High-Order Gaussian Processes for Physical Simulation, In International Conference on Artificial Intelligence and Statistics, PMLR, pp. 847-855. 2021.
The key task of physical simulation is to solve partial differential equations (PDEs) on discretized domains, which is known to be costly. In particular, high-fidelity solutions are much more expensive than low-fidelity ones. To reduce the cost, we consider novel Gaussian process (GP) models that leverage simulation examples of different fidelities to predict high-dimensional PDE solution outputs. Existing GP methods are either not scalable to high-dimensional outputs or lack effective strategies to integrate multi-fidelity examples. To address these issues, we propose Multi-Fidelity High-Order Gaussian Process (MFHoGP) that can capture complex correlations both between the outputs and between the fidelities to enhance solution estimation, and scale to large numbers of outputs. Based on a novel nonlinear coregionalization model, MFHoGP propagates bases throughout fidelities to fuse information, and places a deep matrix GP prior over the basis weights to capture the (nonlinear) relationships across the fidelities. To improve inference efficiency and quality, we use bases decomposition to largely reduce the model parameters, and layer-wise matrix Gaussian posteriors to capture the posterior dependency and to simplify the computation. Our stochastic variational learning algorithm successfully handles millions of outputs without extra sparse approximations. We show the advantages of our method in several typical applications.
The main objective for understanding fluorescence microscopy data is to investigate and evaluate the fluorescent signal intensity distributions as well as their spatial relationships across multiple channels. The quantitative analysis of 3D fluorescence microscopy data needs interactive tools for researchers to select and focus on relevant biological structures. We developed an interactive tool based on volume visualization techniques and GPU computing for streamlining rapid data analysis. Our main contribution is the implementation of common data quantification functions on streamed volumes, providing interactive analyses on large data without lengthy preprocessing. Data segmentation and quantification are coupled with brushing and executed at an interactive speed. A large volume is partitioned into data bricks, and only user-selected structures are analyzed to constrain the computational load. We designed a framework to assemble a sequence of GPU programs to handle brick borders and stitch analysis results. Our tool was developed in collaboration with domain experts and has been used to identify cell types. We demonstrate a workflow to analyze cells in vestibular epithelia of transgenic mice.
In-situ and in-transit processing alleviate the gap between the computing and I/O capabilities by scheduling data analytics close to the data source. Hybrid in-situ processing splits data analytics into two stages: the data processing that runs in-situ aims to extract regions of interest, which are then transferred to staging services for further in-transit analytics. To facilitate this type of hybrid in-situ processing, the data staging service needs to support complex intermediate data representations generated/consumed by the in-situ tasks. Unstructured (or irregular) mesh is one such derived data representation that is typically used and bridges simulation data and analytics. However, how staging services efficiently support unstructured mesh transfer and processing remains to be explored. This paper investigates design options for transferring and processing unstructured mesh data using staging services. Using polygonal mesh data as an example, we show that hybrid in-situ workflows with staging-based unstructured mesh processing can effectively support hybrid in-situ workflows, and can significantly decrease data movement overheads.
W. W. Xing, A. A. Shah, P. Wang, S. Zhe, Q. Fu, R. M. Kirby. Residual Gaussian process: A tractable nonparametric Bayesian emulator for multi-fidelity simulations, In Applied Mathematical Modelling, Vol. 97, Elsevier, pp. 36-56. 2021.
Challenges in multi-fidelity modelling relate to accuracy, uncertainty estimation and high-dimensionality. A novel additive structure is introduced in which the highest fidelity solution is written as a sum of the lowest fidelity solution and residuals between the solutions at successive fidelity levels, with Gaussian process priors placed over the low fidelity solution and each of the residuals. The resulting model is equipped with a closed-form solution for the predictive posterior, making it applicable to advanced, high-dimensional tasks that require uncertainty estimation. Its advantages are demonstrated on univariate benchmarks and on three challenging multivariate problems. It is shown how active learning can be used to enhance the model, especially with a limited computational budget. Furthermore, error bounds are derived for the mean prediction in the univariate case.
W. W. Xing, R. M. Kirby, S. Zhe. Deep coregionalization for the emulation of simulation-based spatial-temporal fields, In Journal of Computational Physics, Academic Press, pp. 109984. 2021.
Data-driven surrogate models are widely used for applications such as design optimization and uncertainty quantification, where repeated evaluations of an expensive simulator are required. For most partial differential equation (PDE) simulations, the outputs of interest are often spatial or spatial-temporal fields, leading to very high-dimensional outputs. Despite the success of existing data-driven surrogates for high-dimensional outputs, most methods require a significant number of samples to cover the response surface in order to achieve a reasonable degree of accuracy. This demand makes the idea of surrogate models less attractive considering the high-computational cost to generate the data. To address this issue, we exploit the multifidelity nature of a PDE simulation and introduce deep coregionalization, a Bayesian nonparametric autoregressive framework for efficient emulation of spatial-temporal fields. To effectively extract the output correlations in the context of multifidelity data, we develop a novel dimension reduction technique, residual principal component analysis. Our model can simultaneously capture the rich output correlations and the fidelity correlations and make high-fidelity predictions with only a small number of expensive, high-fidelity simulation samples. We show the advantages of our model in three canonical PDE models and a fluid dynamics problem. The results show that the proposed method can not only approximate simulation results with significantly less cost (by bout 10%-25%) but also further improve model accuracy.
Y. Xu, V. Keshavarzzadeh, R. M. Kirby, A. Narayan. A bandit-learning approach to multifidelity approximation, Subtitled arXiv preprint arXiv:2103.15342, 2021.
Multifidelity approximation is an important technique in scientific computation and simulation. In this paper, we introduce a bandit-learning approach for leveraging data of varying fidelities to achieve precise estimates of the parameters of interest. Under a linear model assumption, we formulate a multifidelity approximation as a modified stochastic bandit, and analyze the loss for a class of policies that uniformly explore each model before exploiting. Utilizing the estimated conditional mean-squared error, we propose a consistent algorithm, adaptive Explore-Then-Commit (AETC), and establish a corresponding trajectory-wise optimality result. These results are then extended to the case of vector-valued responses, where we demonstrate that the algorithm is efficient without the need to worry about estimating high-dimensional parameters. The main advantage of our approach is that we require neither hierarchical model structure nor\textit a priori knowledge of statistical information (eg, correlations) about or between models. Instead, the AETC algorithm requires only knowledge of which model is a trusted high-fidelity model, along with (relative) computational cost estimates of querying each model. Numerical experiments are provided at the end to support our theoretical findings.
Y. Xu, A. Narayan. Randomized weakly admissible meshes, Subtitled arXiv preprint arXiv:2101.04043, 2021.
A weakly admissible mesh (WAM) on a continuum real-valued domain is a sequence of discrete grids such that the discrete maximum norm of polynomials on the grid is comparable to the supremum norm of polynomials on the domain. The asymptotic rate of growth of the grid sizes and of the comparability constant must grow in a controlled manner. In this paper we generalize the notion of a WAM to a hierarchical subspaces of not necessarily polynomial functions, and we analyze particular strategies for random sampling as a technique for generating WAMs. Our main results show that WAM's and their stronger variant, admissible meshes, can be generated by random sampling, and our analysis provides concrete estimates for growth of both the meshes and the discrete-continuum comparability constants.
Y. Xu, A. Narayan. Budget-limited distribution learning in multifidelity problems, Subtitled arXiv preprint arXiv:2105.04599, 2021.
Multifidelity methods are widely used for statistical estimation of quantities of interest (QoIs) in uncertainty quantification using simulation codes of differing costs and accuracies. Many methods approximate numerical-valued statistics that represent only limited information of the QoIs. In this paper, we introduce a semi-parametric approach that aims to effectively describe the distribution of a scalar-valued QoI in the multifidelity setup. Under a linear model hypothesis, we propose an exploration-exploitation strategy to reconstruct the full distribution of a scalar-valued QoI using samples from a subset of low-fidelity regressors. We derive an informative asymptotic bound for the mean 1-Wasserstein distance between the estimator and the true distribution, and use it to adaptively allocate computational budget for parametric estimation and non-parametric reconstruction. Assuming the linear model is correct, we prove that such a procedure is consistent, and converges to the optimal policy (and hence optimal computational budget allocation) under an upper bound criterion as the budget goes to infinity. A major advantage of our approach compared to several other multifidelity methods is that it is automatic, and its implementation does not require a hierarchical model setup, cross-model information, or \textita priori known model statistics. Numerical experiments are provided in the end to support our theoretical analysis.
V. Zala, R. M. Kirby, A. Narayan. Structure-preserving Nonlinear Filtering for Continuous and Discontinuous Galerkin Spectral/hp Element Methods, Subtitled arXiv preprint arXiv:2106.08316, 2021.
Finite element simulations have been used to solve various partial differential equations (PDEs) that model physical, chemical, and biological phenomena. The resulting discretized solutions to PDEs often do not satisfy requisite physical properties, such as positivity or monotonicity. Such invalid solutions pose both modeling challenges, since the physical interpretation of simulation results is not possible, and computational challenges, since such properties may be required to advance the scheme. We, therefore, consider the problem of computing solutions that preserve these structural solution properties, which we enforce as additional constraints on the solution. We consider in particular the class of convex constraints, which includes positivity and monotonicity. By embedding such constraints as a postprocessing convex optimization procedure, we can compute solutions that satisfy general types of convex constraints. For certain types of constraints (including positivity and monotonicity), the optimization is a filter, i.e., a norm-decreasing operation. We provide a variety of tests on one-dimensional time-dependent PDEs that demonstrate the method's efficacy, and we empirically show that rates of convergence are unaffected by the inclusion of the constraints.
R. Zambre, D. Sahasrabudhe, H. Zhou, M. Berzins, A. Chandramowlishwaran, P. Balaji. Logically Parallel Communication for Fast MPI+Threads Communication, In Proceedings of the Transactions on Parallel and Distributed Computing, IEEE, April, 2021.
Supercomputing applications are increasingly adopting the MPI+threads programming model over the traditional “MPI everywhere” approach to better handle the disproportionate increase in the number of cores compared with other on-node resources. In practice, however, most applications observe a slower performance with MPI+threads primarily because of poor communication performance. Recent research efforts on MPI libraries address this bottleneck by mapping logically parallel communication, that is, operations that are not subject to MPI’s ordering constraints to the underlying network parallelism. Domain scientists, however, typically do not expose such communication independence information because the existing MPI-3.1 standard’s semantics can be limiting. Researchers had initially proposed user-visible endpoints to combat this issue, but such a solution requires intrusive changes to the standard (new APIs). The upcoming MPI-4.0 standard, on the other hand, allows applications to relax unneeded semantics and provides them with many opportunities to express logical communication parallelism. In this paper, we show how MPI+threads applications can achieve high performance with logically parallel communication. Through application case studies, we compare the capabilities of the new MPI-4.0 standard with those of the existing one and user-visible endpoints (upper bound). Logical communication parallelism can boost the overall performance of an application by over 2x.
B. Zenger, W. W. Good, J. A. Bergquist, L. C. Rupp, M. Perez, G. J. Stoddard, V. Sharma, R. S. MacLeod. Transient recovery of epicardial and torso ST-segment ischemic signals during cardiac stress tests: A possible physiological mechanism, In Journal of Electrocardiology, Churchill Livingstone, 2021.
Acute myocardial ischemia has several characteristic ECG findings, including clinically detectable ST-segment deviations. However, the sensitivity and specificity of diagnosis based on ST-segment changes are low. Furthermore, ST-segment deviations have been shown to be transient and spontaneously recover without any indication the ischemic event has subsided.
Assess the transient recovery of ST-segment deviations on remote recording electrodes during a partial occlusion cardiac stress test and compare them to intramyocardial ST-segment deviations.
We used a previously validated porcineBZ experimental model of acute myocardial ischemia with controllable ischemic load and simultaneous electrical measurements within the heart wall, on the epicardial surface, and on the torso surface. Simulated cardiac stress tests were induced by occluding a coronary artery while simultaneously pacing rapidly or infusing dobutamine to stimulate cardiac function. Postexperimental imaging created anatomical models for data visualization and quantification. Markers of ischemia were identified as deviations in the potentials measured at 40% of the ST-segment. Intramural cardiac conduction speed was also determined using the inverse gradient method. We assessed changes in intramyocardial ischemic volume proportion, conduction speed, clinical presence of ischemia on remote recording arrays, and regional changes to intramyocardial ischemia. We defined the peak deviation response time as the time interval after onset of ischemia at which maximum ST-segment deviation was achieved, and ST-recovery time was the interval when ST deviation returned to below thresholded of ST elevation.
In both epicardial and torso recordings, the peak ST-segment deviation response time was 4.9±1.1 min and the ST-recovery time was approximately 7.9±2.5 min, both well before the termination of the ischemic stress. At peak response time, conduction speed was reduced by 50% and returned to near baseline at ST-recovery. The overall ischemic volume proportion initially increased, on average, to 37% at peak response time; however, it recovered only to 30% at the ST-recovery time. By contrast, the subepicardial region of the myocardial wall showed 40% ischemic volume at peak response time and recovered much more strongly to 25% as epicardial ST-segment deviations returned to baseline.
Our data show that remote ischemic signal recovery correlates with a recovery of the subepicardial myocardium, while subendocardial ischemic development persists.
We propose a data-driven space-filling curve method for 2D and 3D visualization. Our flexible curve traverses the data elements in the spatial domain in a way that the resulting linearization better preserves features in space compared to existing methods. We achieve such data coherency by calculating a Hamiltonian path that approximately minimizes an objective function that describes the similarity of data values and location coherency in a neighborhood. Our extended variant even supports multiscale data via quadtrees and octrees. Our method is useful in many areas of visualization, including multivariate or comparative visualization,ensemble visualization of 2D and 3D data on regular grids, or multiscale visual analysis of particle simulations. The effectiveness of our method is evaluated with numerical comparisons to existing techniques and through examples of ensemble and multivariate datasets.
Y. Zhou, N. Chalapathi, A. Rathore, Y. Zhao, Bei Wang. Mapper Interactive: A Scalable, Extendable, and Interactive Toolbox for the Visual Exploration of High-Dimensional Data., In IEEE Pacific Visualization Symposium, 2021.
The mapper algorithm is a popular tool from topological data analysis for extracting topological summaries of high-dimensional datasets. In this paper, we present Mapper Interactive, a web-based framework for the interactive analysis and visualization of high-dimensional point cloud data. It implements the mapper algorithm in an interactive, scalable, and easily extendable way, thus supporting practical data analysis. In particular, its command-line API can compute mapper graphs for 1 million points of 256 dimensions in about 3 minutes (4 times faster than the vanilla implementation). Its visual interface allows on-the-fly computation and manipulation of the mapper graph based on user-specified parameters and supports the addition of new analysis modules with a few lines of code. Mapper Interactive makes the mapper algorithm accessible to nonspecialists and accelerates topological analytics workflows.
The term “in situ processing” has evolved over the last decade to mean both a specific strategy for visualizing and analyzing data and an umbrella term for a processing paradigm. The resulting confusion makes it difficult for visualization and analysis scientists to communicate with each other and with their stakeholders. To address this problem, a group of over fifty experts convened with the goal of standardizing terminology. This paper summarizes their findings and proposes a new terminology for describing in situ systems. An important finding from this group was that in situ systems are best described via multiple, distinct axes: integration type, proximity, access, division of execution, operation controls, and output type. This paper discusses these axes, evaluates existing systems within the axes, and explores how currently used terms relate to the axes.
L. Cinquini, S. Petruzza, Jason J. Boutte, S. Ames, G. Abdulla, V. Balaji, R. Ferraro, A. Radhakrishnan, L. Carriere, T. Maxwell, G. Scorzelli, V. Pascucci. Distributed Resources for the Earth System Grid Advanced Management (DREAM), Final Report, 2020.
The DREAM project was funded more than 3 years ago to design and implement a next-generation ESGF (Earth System Grid Federation ) architecture which would be suitable for managing and accessing data and services resources on a distributed and scalable environment. In particular, the project intended to focus on the computing and visualization capabilities of the stack, which at the time were rather primitive. At the beginning, the team had the general notion that a better ESGF architecture could be built by modularizing each component, and redefining its interaction with other components by defining and exposing a well defined API. Although this was still the high level principle that guided the work, the DREAM project was able to accomplish its goals by leveraging new practices in IT that started just about 3 or 4 years ago: the advent of containerization technologies (specifically, Docker), the development of frameworks to manage containers at scale (Docker Swarm and Kubernetes), and their application to the commercial Cloud. Thanks to these new technologies, DREAM was able to improve the ESGF architecture (including its computing and visualization services) to a level of deployability and scalability beyond the original expectations.