
SCI Publications

2017


W. W. Good, B. Erem, J. Coll-Font, D. H. Brooks, R. S. MacLeod. “Detecting Ischemic Stress to the Myocardium Using Laplacian Eigenmaps and Changes to Conduction Velocity,” In Computing in Cardiology, Vol. 44, IEEE, 2017.

ABSTRACT

The underlying pathophysiology of ischemia and its electrocardiographic consequences are poorly understood, resulting in unreliable diagnosis of this disease. This limited knowledge of underlying mechanisms suggests a data-driven approach, which seeks to identify patterns in the ECG that can be linked statistically to underlying behavior and conditions of ischemic stress. The gold standard ECG metrics for evaluating ischemia monitor vertical deflections within the ST segment. However, ischemia influences all portions of the electrogram. Another metric that targets the QRS complex during ischemia is Conduction Velocity (CV). An even more inclusive, data-driven approach is known as "Laplacian Eigenmaps" (LE), which can identify trajectories, or "manifolds", that respond to different spatiotemporal consequences of ischemic stress, and these changes to the trajectories on the manifold may serve as a clinically relevant biomarker. In this study, we compared the LE- and CV-based markers against two gold standards for detecting ischemic stress, both derived from the ST segment. We evaluated the response time and fidelity of each biomarker using a Time to Threshold (TTT) and Contrast Ratio (CR) measure, over 51 episodes recorded as cardiac electrograms from a canine model of controlled ischemia. The results show that metrics designed to monitor regions beyond the ST segment can perform at least as well as, if not better than, traditional ST-segment-based metrics.
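
As a rough illustration of the LE idea referenced above, the sketch below embeds a matrix of per-beat electrogram features with a basic Laplacian Eigenmaps computation; the input data, neighborhood size, and kernel width are hypothetical placeholders, not the paper's processing pipeline.

```python
# A minimal Laplacian Eigenmaps sketch, assuming a hypothetical beats-by-features matrix.
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(X, n_neighbors=10, sigma=1.0, n_components=2):
    """Embed rows of X (samples x features) into n_components dimensions."""
    # pairwise squared Euclidean distances
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)

    # symmetric k-nearest-neighbor graph with Gaussian edge weights
    W = np.zeros_like(d2)
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    for i, nbrs in enumerate(idx):
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2.0 * sigma**2))
    W = np.maximum(W, W.T)

    # unnormalized graph Laplacian L = D - W; generalized eigenproblem L v = lam D v
    D = np.diag(W.sum(axis=1))
    L = D - W
    vals, vecs = eigh(L, D)
    # skip the trivial constant eigenvector (eigenvalue ~ 0)
    return vecs[:, 1:n_components + 1]

# Hypothetical usage: each row is one beat's electrogram feature vector.
beats = np.random.rand(200, 64)
trajectory = laplacian_eigenmaps(beats)   # (200, 2) points tracing a manifold
```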



C. Gritton, J. Guilkey, J. Hooper, D. Bedrov, R. M. Kirby, M. Berzins. “Using the material point method to model chemical/mechanical coupling in the deformation of a silicon anode,” In Modelling and Simulation in Materials Science and Engineering, Vol. 25, No. 4, pp. 045005. 2017.

ABSTRACT

The lithiation and delithiation of a silicon battery anode is modeled using the material point method (MPM). The main challenges in modeling this process using the MPM are to simulate stress-dependent diffusion coupled with concentration-dependent stress within a material that undergoes large deformations. MPM is chosen as the numerical method because of its ability to handle large deformations. A method for modeling diffusion within MPM is described. A stress-dependent model for diffusivity and three different constitutive models that fully couple the equations for stress with the equations for diffusion are considered. Verification tests for the accuracy of the numerical implementations of the models and validation tests with experimental results show the accuracy of the approach. The fully coupled stress-diffusion model implemented in MPM is then applied to modeling the lithiation and delithiation of silicon nanopillars.
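
To make the coupling idea concrete, here is a minimal 1D explicit finite-difference sketch of diffusion with a concentration-dependent diffusivity; it is not the paper's MPM discretization, and the diffusivity law, grid size, and time step are illustrative assumptions.

```python
# Minimal 1D explicit sketch of dc/dt = d/dx( D(c) dc/dx ) with zero-flux boundaries.
# D0, beta, grid spacing, and time step are illustrative assumptions, not the paper's model.
import numpy as np

def step(c, dx, dt, D0=1.0, beta=0.5):
    """One explicit update of the nonlinear diffusion equation."""
    D = D0 * (1.0 + beta * c)                         # hypothetical concentration-dependent diffusivity
    D_face = 0.5 * (D[:-1] + D[1:])                   # diffusivity at interior cell faces
    flux = -D_face * np.diff(c) / dx                  # Fickian flux at interior faces
    full_flux = np.concatenate(([0.0], flux, [0.0]))  # zero-flux outer boundaries
    return c - dt * np.diff(full_flux) / dx

c = np.zeros(101)
c[:10] = 1.0                                          # concentration ramp at one end
for _ in range(2000):
    c = step(c, dx=0.01, dt=2.0e-5)                   # dt chosen inside the explicit stability limit
```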



L. Guo, A. Narayan, T. Zhou, Y. Chen. “Stochastic Collocation Methods via L1 Minimization Using Randomized Quadratures,” In SIAM Journal on Scientific Computing, Vol. 39, No. 1, pp. A333--A359. Jan, 2017.
ISSN: 1064-8275
DOI: 10.1137/16M1059680

ABSTRACT

In this work, we discuss the problem of approximating a multivariate function via an ℓ1 minimization method, using a randomly chosen sub-grid of the corresponding tensor grid of Gaussian points. The independent variables of the function are assumed to be random variables, and thus the framework provides a non-intrusive way to construct the generalized polynomial chaos expansions, stemming from the motivating application of Uncertainty Quantification (UQ). We provide theoretical analysis on the validity of the approach. The framework covers both bounded measures, such as the uniform and Chebyshev measures, and unbounded measures, such as the Gaussian measure. Several numerical examples are given to confirm the theoretical results.
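
A small 1D sketch of the sampling-plus-ℓ1 recipe, assuming a Gauss-Legendre grid for the uniform measure and a synthetic sparse target; the grid size, subsample count, and sparsity pattern are illustrative, and the ℓ1 problem is posed as a basis-pursuit linear program.

```python
# Recover a sparse Legendre expansion from a randomly chosen sub-grid of a Gauss grid.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# full 1D Gauss-Legendre grid (uniform measure) and a synthetic sparse expansion
nodes, _ = np.polynomial.legendre.leggauss(64)
n_basis = 32
c_true = np.zeros(n_basis)
c_true[[2, 7, 19]] = [1.0, -0.5, 0.25]
V_full = np.polynomial.legendre.legvander(nodes, n_basis - 1)
f_full = V_full @ c_true

# randomly chosen sub-grid: fewer samples than unknowns, more than the sparsity level
idx = rng.choice(nodes.size, size=20, replace=False)
A, b = V_full[idx, :], f_full[idx]

# basis pursuit  min ||c||_1  s.t.  A c = b,  via the split c = u - v with u, v >= 0
res = linprog(c=np.ones(2 * n_basis),
              A_eq=np.hstack([A, -A]), b_eq=b,
              bounds=[(0, None)] * (2 * n_basis),
              method="highs")
c_rec = res.x[:n_basis] - res.x[n_basis:]
print(np.max(np.abs(c_rec - c_true)))   # near zero when recovery succeeds
```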



J. K. Holmen, A. Humphrey, D. Sutherland, M. Berzins. “Improving Uintah's Scalability Through the Use of Portable Kokkos-Based Data Parallel Tasks,” In Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact, PEARC17, No. 27, pp. 27:1--27:8. 2017.
ISBN: 978-1-4503-5272-7
DOI: 10.1145/3093338.3093388

ABSTRACT

The University of Utah's Carbon Capture Multidisciplinary Simulation Center (CCMSC) is using the Uintah Computational Framework to predict the performance of a 1000 MWe ultra-supercritical clean coal boiler. The center aims to utilize the Intel Xeon Phi-based DOE systems, Theta and Aurora, through the Aurora Early Science Program by using the Kokkos C++ library to enable node-level performance portability. This paper describes infrastructure advancements and portability improvements made possible by our integration of Kokkos within Uintah. Scalability results are presented that compare serial and data parallel task execution models for a challenging radiative heat transfer calculation, central to the center's predictive boiler simulations. These results demonstrate good strong-scaling characteristics up to 256 Knights Landing (KNL) processors on the NSF Stampede system and show that the KNL-based calculation is competitive with prior GPU-based results for the same calculation.



J. Jakeman, A. Narayan, T. Zhou. “A Generalized Sampling and Preconditioning Scheme for Sparse Approximation of Polynomial Chaos Expansions,” In SIAM Journal on Scientific Computing, Vol. 39, No. 3, SIAM, pp. A1114--A1144. Jan, 2017.
ISSN: 1064-8275
DOI: 10.1137/16M1063885

ABSTRACT

In this paper we propose an algorithm for recovering sparse orthogonal polynomials using stochastic collocation. Our approach is motivated by the desire to use generalized polynomial chaos expansions (PCE) to quantify uncertainty in models subject to uncertain input parameters. The standard sampling approach for recovering sparse polynomials is to use Monte Carlo (MC) sampling of the density of orthogonality. However, MC methods result in poor function recovery when the polynomial degree is high. Here we propose a general algorithm that can be applied to any admissible weight function on a bounded domain and a wide class of exponential weight functions defined on unbounded domains. Our proposed algorithm samples with respect to the weighted equilibrium measure of the parametric domain, and subsequently solves a preconditioned ℓ1-minimization problem, where the weights of the diagonal preconditioning matrix are given by evaluations of the Christoffel function. We present theoretical analysis to motivate the algorithm, and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest. Numerical examples are also provided that demonstrate that our proposed Christoffel Sparse Approximation algorithm leads to comparable or improved accuracy even when compared with Legendre and Hermite specific algorithms.



J. Jiang, Y. Chen, A. Narayan. “Offline-Enhanced Reduced Basis Method Through Adaptive Construction of the Surrogate Training Set,” In Journal of Scientific Computing, Vol. 73, No. 2-3, Springer Nature, pp. 853--875. September, 2017.
DOI: 10.1007/s10915-017-0551-3

ABSTRACT

The reduced basis method (RBM) is a popular certified model reduction approach for solving parametrized partial differential equations. One critical stage of the offline portion of the algorithm is a greedy algorithm, requiring maximization of an error estimate over parameter space. In practice this maximization is usually performed by replacing the parameter domain continuum with a discrete "training" set. When the dimension of parameter space is large, it is necessary to significantly increase the size of this training set in order to effectively search parameter space. Large training sets diminish the attractiveness of RBM algorithms since this proportionally increases the cost of the offline phase. In this work we propose novel strategies for offline RBM algorithms that mitigate the computational difficulty of maximizing error estimates over a training set. The main idea is to identify a subset of the training set, a "surrogate training set" (STS), on which to perform greedy algorithms. The STS we construct is much smaller in size than the full training set, yet our examples suggest that it is accurate enough to induce the solution manifold of interest at the current offline RBM iteration. We propose two algorithms to construct the STS: our first algorithm, the successive maximization method, is inspired by inverse transform sampling for non-standard univariate probability distributions. The second constructs an STS by identifying pivots in the Cholesky decomposition of an approximate error correlation matrix. We demonstrate the algorithm through numerical experiments, showing that it is capable of accelerating offline RBM procedures without degrading accuracy, assuming that the solution manifold has rapidly decaying Kolmogorov width.
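
A sketch of the second STS construction mentioned above: select surrogate training parameters as the pivots of a partial pivoted Cholesky factorization of an approximate error correlation matrix. The matrix G below is a random stand-in; in the method it would come from inexpensive error estimates over the full training set.

```python
# Pick a surrogate training set (STS) via pivoted Cholesky of a PSD correlation matrix.
import numpy as np

def cholesky_pivots(G, m):
    """Indices of the first m pivots of a pivoted Cholesky factorization of PSD G."""
    d = np.diag(G).astype(float).copy()
    L = np.zeros((G.shape[0], m))
    pivots = []
    for k in range(m):
        p = int(np.argmax(d))                 # largest remaining diagonal entry
        pivots.append(p)
        L[:, k] = (G[:, p] - L[:, :k] @ L[p, :k]) / np.sqrt(d[p])
        d = d - L[:, k] ** 2                  # update the Schur-complement diagonal
        d[pivots] = 0.0                       # never re-select a chosen pivot
    return pivots

# stand-in "error correlation" matrix over a 500-point training set (illustrative only)
rng = np.random.default_rng(3)
F = rng.random((500, 30))
G = F @ F.T
sts = cholesky_pivots(G, m=20)                # indices forming the surrogate training set
```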



M. Kern, A. Lex, N. Gehlenborg, C. R. Johnson. “Interactive Visual Exploration And Refinement Of Cluster Assignments,” In BMC Bioinformatics, Cold Spring Harbor Laboratory, April, 2017.
DOI: 10.1101/123844

ABSTRACT

Background:
With ever-increasing amounts of data produced in biology research, scientists are in need of efficient data analysis methods. Cluster analysis, combined with visualization of the results, is one such method that can be used to make sense of large data volumes. At the same time, cluster analysis is known to be imperfect and depends on the choice of algorithms, parameters, and distance measures. Most clustering algorithms don't properly account for ambiguity in the source data, as records are often assigned to discrete clusters, even if an assignment is unclear. While there are metrics and visualization techniques that allow analysts to compare clusterings or to judge cluster quality, there is no comprehensive method that allows analysts to evaluate, compare, and refine cluster assignments based on the source data, derived scores, and contextual data.

Results:
In this paper, we introduce a method that explicitly visualizes the quality of cluster assignments, allows comparisons of clustering results and enables analysts to manually curate and refine cluster assignments. Our methods are applicable to matrix data clustered with partitional, hierarchical, and fuzzy clustering algorithms. Furthermore, we enable analysts to explore clustering results in context of other data, for example, to observe whether a clustering of genomic data results in a meaningful differentiation in phenotypes.

Conclusions:
Our methods are integrated into Caleydo StratomeX, a popular, web-based, disease subtype analysis tool. We show in a usage scenario that our approach can reveal ambiguities in cluster assignments and produce improved clusterings that better differentiate genotypes and phenotypes.
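
As a toy illustration of the kind of per-record ambiguity signal such a tool can surface, the snippet below scores each record by how close its top two fuzzy cluster memberships are; the membership matrix is random and the scoring rule is an illustrative choice, not the exact metric used in StratomeX.

```python
# Score assignment ambiguity from a fuzzy membership matrix (illustrative rule only).
import numpy as np

def ambiguity(U):
    """U: records x clusters membership matrix with rows summing to 1.
    Returns 0 for a confident assignment and approaches 1 when the top two clusters tie."""
    top2 = np.sort(U, axis=1)[:, -2:]
    return 1.0 - (top2[:, 1] - top2[:, 0])

rng = np.random.default_rng(4)
U = rng.dirichlet(alpha=[1.0] * 4, size=100)   # 100 records, 4 clusters (random stand-in)
scores = ambiguity(U)
uncertain = np.argsort(scores)[-10:]           # records most worth manual review
```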



S. Kumar, D. Hoang, S. Petruzza, J. Edwards, V. Pascucci. “Reducing Network Congestion and Synchronization Overhead During Aggregation of Hierarchical Data,” In 2017 IEEE 24th International Conference on High Performance Computing (HiPC), pp. 223-232. Dec, 2017.
DOI: 10.1109/HiPC.2017.00034

ABSTRACT

Hierarchical data representations have been shown to be effective tools for coping with large-scale scientific data. Writing hierarchical data on supercomputers, however, is challenging as it often involves all-to-one communication during aggregation of low-resolution data which tends to span the entire network domain, resulting in several bottlenecks. We introduce the concept of indexing templates, which succinctly describe data organization and can be used to alter movement of data in beneficial ways. We present two techniques, domain partitioning and localized aggregation, that leverage indexing templates to alleviate congestion and synchronization overheads during data aggregation. We report experimental results that show significant I/O speedup using our proposed schemes on two of today's fastest supercomputers, Mira and Shaheen II, using the Uintah and S3D simulation frameworks.
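
The sketch below is a purely conceptual illustration of why localized aggregation helps: it computes which rank would aggregate each data block under an all-to-one plan versus a partitioned plan, showing how the single hot spot disappears. Block counts and the partitioning rule are illustrative assumptions, not the actual I/O library implementation.

```python
# Compare aggregation plans: all-to-one versus partitioned, localized aggregation.
def all_to_one_plan(num_blocks):
    # every low-resolution block funnels to a single aggregator rank
    return {b: 0 for b in range(num_blocks)}

def localized_plan(num_blocks, num_partitions):
    # blocks are grouped into partitions, each with its own aggregator,
    # so traffic stays inside a partition instead of crossing the whole network
    per_part = num_blocks // num_partitions
    return {b: b // per_part for b in range(num_blocks)}

def load_per_aggregator(plan):
    counts = {}
    for agg in plan.values():
        counts[agg] = counts.get(agg, 0) + 1
    return counts

print(load_per_aggregator(all_to_one_plan(1024)))      # {0: 1024}: one hot spot
print(load_per_aggregator(localized_plan(1024, 16)))   # 64 blocks per aggregator
```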



A. Narayan, J. Jakeman, T. Zhou. “A Christoffel function weighted least squares algorithm for collocation approximations,” In Mathematics of Computation, Vol. 86, No. 306, pp. 1913--1947. 2017.
ISSN: 0025-5718, 1088-6842
DOI: 10.1090/mcom/3192

ABSTRACT

We propose, theoretically investigate, and numerically validate an algorithm for the Monte Carlo solution of least-squares polynomial approximation problems in a collocation framework. Our method is motivated by generalized Polynomial Chaos approximation in uncertainty quantification where a polynomial approximation is formed from a combination of orthogonal polynomials. A standard Monte Carlo approach would draw samples according to the density of orthogonality. Our proposed algorithm samples with respect to the equilibrium measure of the parametric domain, and subsequently solves a weighted least-squares problem, with weights given by evaluations of the Christoffel function. We present theoretical analysis to motivate the algorithm, and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest.
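
A 1D sketch of this recipe for the uniform measure on [-1, 1]: draw samples from the (arcsine) equilibrium measure, weight each sample using the Christoffel function of the orthonormal Legendre basis, and solve the weighted least-squares problem. The target function and problem sizes are illustrative assumptions.

```python
# Christoffel-function weighted least squares for a 1D Legendre expansion.
import numpy as np

rng = np.random.default_rng(1)
N, M = 20, 80                                   # basis size, sample count (illustrative)

# samples from the equilibrium measure of [-1, 1] (Chebyshev / arcsine density)
x = np.cos(np.pi * rng.random(M))

# Legendre Vandermonde, orthonormalized w.r.t. the uniform probability measure:
# p_k = sqrt(2k + 1) * P_k
V = np.polynomial.legendre.legvander(x, N - 1) * np.sqrt(2 * np.arange(N) + 1)

# Christoffel-function weights  w_i = N / sum_k p_k(x_i)^2
w = N / np.sum(V ** 2, axis=1)

f = np.exp(x) * np.sin(3 * x)                   # hypothetical model output
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(sw[:, None] * V, sw * f, rcond=None)
```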



T.A.J. Ouermi, A. Knoll, R.M. Kirby, M. Berzins. “OpenMP 4 Fortran Modernization of WSM6 for KNL,” In Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact, PEARC17, No. 12, ACM, pp. 12:1--12:8. 2017.
ISBN: 978-1-4503-5272-7
DOI: 10.1145/3093338.3093387

ABSTRACT

Parallel code portability in the petascale era requires modifying existing codes to support new architectures with large core counts and SIMD vector units. OpenMP is a well established and increasingly supported vehicle for portable parallelization. As architectures mature and compiler OpenMP implementations evolve, best practices for code modernization change as well. In this paper, we examine the impact of newer OpenMP features (in particular OMP SIMD) on the Intel Xeon Phi Knights Landing (KNL) architecture, applied in optimizing loops in the single moment 6-class microphysics module (WSM6) in the US Navy's NEPTUNE code. We find that with functioning OMP SIMD constructs, low thread invocation overhead on KNL and reduced penalty for unaligned access compared to previous architectures, one can leverage OpenMP 4 to achieve reasonable scalability with relatively minor reorganization of a production physics code.



T.A.J. Ouermi, A. Knoll, R.M. Kirby, M. Berzins. “Optimization Strategies for WRF Single-Moment 6-Class Microphysics Scheme (WSM6) on Intel Microarchitectures,” In Proceedings of the Fifth International Symposium on Computing and Networking (CANDAR '17), Awarded Best Paper, IEEE, 2017.

ABSTRACT

Optimizations in the petascale era require modifications of existing codes to take advantage of new architectures with large core counts and SIMD vector units. This paper examines high-level and low-level optimization strategies for numerical weather prediction (NWP) codes. These strategies employ thread-local structures of arrays (SOA) and an OpenMP directive such as OMP SIMD. These optimization approaches are applied to the Weather Research and Forecasting single-moment 6-class microphysics scheme (WSM6) in the US Navy NEPTUNE system. The results of this study indicate that the high-level approach with SOA and low-level OMP SIMD improves thread and vector parallelism by increasing data and temporal locality. The modified version of WSM6 runs 70x faster than the original serial code. This improvement is about 23.3x faster than the performance achieved by Ouermi et al., and 14.9x faster than the performance achieved by Michalakes et al.
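
The snippet below is a loose NumPy analogue of the structure-of-arrays idea: storing each physical field in its own contiguous array lets a field update run as one vectorized sweep, the same property the Fortran SOA plus OMP SIMD rewrite exposes to the compiler. Field names and the update rule are made up for illustration.

```python
# Array-of-structures versus structure-of-arrays, as a rough Python analogue.
import numpy as np

n = 100_000

# array of structures: one dict per grid cell (poor locality, scalar Python loop)
aos = [{"qv": 1.0e-3, "t": 280.0} for _ in range(n)]
for cell in aos:
    cell["qv"] *= 0.99
    cell["t"] += 0.01

# structure of arrays: one contiguous array per field (vectorized, cache friendly)
soa = {"qv": np.full(n, 1.0e-3), "t": np.full(n, 280.0)}
soa["qv"] *= 0.99
soa["t"] += 0.01
```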



B. Peterson, A. Humphrey, J. Schmidt, M. Berzins. “Addressing Global Data Dependencies in Heterogeneous Asynchronous Runtime Systems on GPUs,” Awarded Best Paper, In Proceedings of the Third International Workshop on Extreme Scale Programming Models and Middleware - ESPM2'17, ACM, 2017.
DOI: 10.1145/3152041.3152082

ABSTRACT

Large-scale parallel applications with complex global data dependencies beyond those of reductions pose significant scalability challenges in an asynchronous runtime system. Internodal challenges include identifying the all-to-all communication of data dependencies among the nodes. Intranodal challenges include gathering together these data dependencies into usable data objects while avoiding data duplication. This paper addresses these challenges within the context of a large-scale, industrial coal boiler simulation using the Uintah asynchronous many-task runtime system on GPU architectures. We show significant reduction in time spent analyzing data dependencies through refinements in our dependency search algorithm. Multiple task graphs are used to eliminate subsequent analysis when task graphs change in predictable and repeatable ways. A combined data store and task scheduler redesign reduces data dependency duplication, ensuring that problems fit within host and GPU memory. These modifications did not require any changes to application code or sweeping changes to the Uintah runtime system. We report results running on the DOE Titan system on 119K CPU cores and 7.5K GPUs simultaneously. Our solutions can be generalized to other task dependency problems with global dependencies among thousands of nodes which must be processed efficiently at large scale.



M. Rautenhaus, M. Böttinger, S. Siemen, R. Hoffman, R.M. Kirby, M. Mirzargar, N. Rober, R. Westermann. “Visualization in Meteorology---A Survey of Techniques and Tools for Data Analysis Tasks,” In IEEE Transactions on Visualization and Computer Graphics, IEEE, pp. 1--1. 2017.
DOI: 10.1109/tvcg.2017.2779501

ABSTRACT

This article surveys the history and current state of the art of visualization in meteorology, focusing on visualization techniques and tools used for meteorological data analysis. We examine characteristics of meteorological data and analysis tasks, describe the development of computer graphics methods for visualization in meteorology from the 1960s to today, and visit the state of the art of visualization techniques and tools in operational weather forecasting and atmospheric research. We approach the topic from both the visualization and the meteorological side, showing visualization techniques commonly used in meteorological practice, and surveying recent studies in visualization research aimed at meteorological applications. Our overview covers visualization techniques from the fields of display design, 3D visualization, flow dynamics, feature-based visualization, comparative visualization and data fusion, uncertainty and ensemble visualization, interactive visual analysis, efficient rendering, and scalability and reproducibility. We discuss demands and challenges for visualization research targeting meteorological data analysis, highlighting aspects in demonstration of benefit, interactive visual analysis, seamless visualization, ensemble visualization, 3D visualization, and technical issues.



P. Seshadri, A. Narayan, S. Mahadevan. “Effectively Subsampled Quadratures for Least Squares Polynomial Approximations,” In SIAM/ASA Journal on Uncertainty Quantification, pp. 1003--1023. Jan, 2017.

ABSTRACT

This paper proposes a new deterministic sampling strategy for constructing polynomial chaos approximations for expensive physics simulation models. The proposed approach, effectively subsampled quadratures, involves sparsely subsampling an existing tensor grid using QR column pivoting. For polynomial interpolation using hyperbolic or total order sets, we then solve the resulting square least squares problem. For polynomial approximation, we use a column pruning heuristic that removes columns based on the highest total orders and then solves the tall least squares problem. While we provide bounds on the condition number of such tall submatrices, it is difficult to ascertain how column pruning affects solution accuracy as this is problem specific. We conclude with numerical experiments on an analytical function and a model piston problem that show the efficacy of our approach compared with randomized subsampling. We also show an example where this method fails.
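
A 1D sketch of the subsampling step, assuming a Gauss-Legendre tensor grid and a made-up model output: QR with column pivoting on the transpose of the Vandermonde matrix ranks the grid points, and the fit is then computed only on the retained points.

```python
# Subsample a quadrature grid with pivoted QR, then fit by least squares on the subset.
import numpy as np
from scipy.linalg import qr

deg = 15
nodes, _ = np.polynomial.legendre.leggauss(40)        # full 1D tensor grid (illustrative size)
A = np.polynomial.legendre.legvander(nodes, deg)      # 40 x 16 Vandermonde matrix

# pivoted QR on A^T ranks the grid points; keep as many as there are basis terms
_, _, piv = qr(A.T, pivoting=True)
keep = piv[:deg + 1]

f = np.cos(2.0 * nodes)                               # hypothetical model output
coef, *_ = np.linalg.lstsq(A[keep, :], f[keep], rcond=None)
```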



A. Suh, M. Hajij, B. Wang, C. Scheidegger, P. Rosen. “Driving Interactive Graph Exploration Using 0-Dimensional Persistent Homology Features,” In CoRR, 2017.

ABSTRACT

Graphs are commonly used to encode relationships among entities, yet their abstractness makes them incredibly difficult to analyze. Node-link diagrams are a popular method for drawing graphs. Classical techniques for node-link diagrams include various layout methods that rely on derived information to position points, which often lack interactive exploration functionalities; and force-directed layouts, which ignore global structures of the graph. This paper addresses the graph drawing challenge by leveraging topological features of a graph as derived information for interactive graph drawing. We first discuss extracting topological features from a graph using persistent homology. We then introduce an interactive persistence barcode to study the substructures of a force-directed graph layout; in particular, we add contracting and repulsing forces guided by the 0-dimensional persistent homology features. Finally, we demonstrate the utility of our approach across three datasets.
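
For reference, 0-dimensional persistence of an edge-weighted graph can be read off a union-find sweep over edges sorted by weight, as in the sketch below; the toy graph is an illustrative assumption.

```python
# 0-dimensional persistence barcode of an edge-weighted graph via union-find.
def zero_dim_barcode(num_vertices, weighted_edges):
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    bars = []                               # (birth, death) pairs; all vertices born at 0
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv                 # one component dies at weight w
            bars.append((0.0, w))
    bars.append((0.0, float("inf")))        # the surviving component never dies
    return bars

edges = [(0.3, 0, 1), (0.5, 1, 2), (0.9, 0, 2), (1.2, 2, 3)]
print(zero_dim_barcode(4, edges))           # [(0.0, 0.3), (0.0, 0.5), (0.0, 1.2), (0.0, inf)]
```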



J. Tate, K. Gillette, B. Burton, W. Good, J. Coll-Font, D. Brooks, R. MacLeod. “Analyzing Source Sampling to Reduce Error in ECG Forward Simulations,” In Computing in Cardiology, Vol. 44, 2017.

ABSTRACT

A continuing challenge in validating ECG Imaging is the persistent error in the associated forward problem observed in experimental studies. One possible cause of error is insufficient representation of the cardiac sources, which is often measured from only the ventricular epicardium, ignoring the endocardium and the atria. We hypothesize that measurements that completely cover the heart are required for accurate forward solutions. In this study, we used simulated and measured cardiac potentials to test the effect of different levels of sampling on the forward simulation. We found that omitting source samples on the atria increases the peak RMS error by a mean of 464 μV when compared to the fully sampled cardiac surface. Increasing the sampling on the atria in stages reduced the average error of the forward simulation proportionally to the number of additional samples and revealed that some strategies may reduce error with fewer samples, such as adding samples to the AV plane and the atrial roof. Based on these results, we can design a sampling strategy to use in future validation studies.



W. Thevathasan, B. Debu, T. Aziz, B. R. Bloem, C. Blahak, C. Butson, V. Czernecki, T. Foltynie, V. Fraix, D. Grabli, C. Joint, A. M. Lozano, M. S. Okun, J. Ostrem, N. Pavese, C. Schrader, C. H. Tai, J. K. Krauss, E. Moro. “Pedunculopontine nucleus deep brain stimulation in Parkinson's disease: A clinical review,” In Movement Disorders, Vol. 33, No. 1, pp. 10--20. 2017.
ISSN: 1531-8257
DOI: 10.1002/mds.27098

ABSTRACT

Pedunculopontine nucleus region deep brain stimulation (DBS) is a promising but experimental therapy for axial motor deficits in Parkinson's disease (PD), particularly gait freezing and falls. Here, we summarise the clinical application and outcomes reported during the past 10 years. The published dataset is limited, comprising fewer than 100 cases. Furthermore, there is great variability in clinical methodology between and within surgical centers. The most common indication has been severe medication refractory gait freezing (often associated with postural instability). Some patients received lone pedunculopontine nucleus DBS (unilateral or bilateral) and some received costimulation of the subthalamic nucleus or internal pallidum. Both rostral and caudal pedunculopontine nucleus subregions have been targeted. However, the spread of stimulation and variance in targeting means that neighboring brain stem regions may be implicated in any response. Low stimulation frequencies are typically employed (20-80 Hertz). The fluctuating nature of gait freezing can confound programming and outcome assessments. Although firm conclusions cannot be drawn on therapeutic efficacy, the literature suggests that medication refractory gait freezing and falls can improve. The impact on postural instability is unclear. Most groups report a lack of benefit on gait or limb akinesia or dopaminergic medication requirements. The key question is whether pedunculopontine nucleus DBS can improve quality of life in PD. So far, the evidence supporting such an effect is minimal. Development of pedunculopontine nucleus DBS to become a reliable, established therapy would likely require a collaborative effort between experienced centres to clarify biomarkers predictive of response and the optimal clinical methodology.



W. Usher, J. Amstutz, C. Brownlee, A. Knoll, I. Wald. “Progressive CPU Volume Rendering with Sample Accumulation,” In Eurographics Symposium on Parallel Graphics and Visualization, Edited by Alexandru Telea and Janine Bennett, The Eurographics Association, 2017.
ISBN: 978-3-03868-034-5
ISSN: 1727-348X
DOI: 10.2312/pgv.20171090

ABSTRACT

We present a new method for progressive volume rendering by accumulating object-space samples over successively rendered frames. Existing methods for progressive refinement either use image space methods or average pixels over frames, which can blur features or integrate incorrectly with respect to depth. Our approach stores samples along each ray, accumulates new samples each frame into a buffer, and progressively interleaves and integrates these samples. Though this process requires additional memory, it ensures interactivity and is well suited for CPU architectures with large memory and cache. This approach also extends well to distributed rendering in cluster environments. We implement this technique in Intel's open source OSPRay CPU ray tracing framework and demonstrate that it is particularly useful for rendering volumetric data with costly sampling functions.
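
A single-ray sketch of the accumulation idea, assuming a made-up transfer function and sample counts: each "frame" appends newly jittered object-space samples to a per-ray buffer, and the pixel is re-integrated from the merged, depth-sorted buffer so refinement never blurs across depth. A real renderer would also rescale per-sample opacity as the sample spacing shrinks, which this sketch omits.

```python
# Progressive per-ray sample accumulation with depth-sorted front-to-back compositing.
import numpy as np

rng = np.random.default_rng(2)

def transfer(x):
    """Hypothetical scalar field along the ray mapped to (color, opacity) per sample."""
    val = np.exp(-(x - 0.5) ** 2 / 0.02)
    return val, 0.05 * val

def integrate(depths, colors, alphas):
    """Front-to-back compositing of depth-sorted samples."""
    order = np.argsort(depths)
    color_out, transmittance = 0.0, 1.0
    for i in order:
        color_out += transmittance * alphas[i] * colors[i]
        transmittance *= 1.0 - alphas[i]
    return color_out

depth_buf, color_buf, alpha_buf = [], [], []
for frame in range(8):
    new_depths = (np.arange(32) + rng.random(32)) / 32.0   # jittered object-space samples
    c, a = transfer(new_depths)
    depth_buf.extend(new_depths)
    color_buf.extend(c)
    alpha_buf.extend(a)
    pixel = integrate(np.array(depth_buf), np.array(color_buf), np.array(alpha_buf))
    # `pixel` sharpens every frame as samples accumulate along the ray
```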



W. Usher, P. Klacansky, F. Federer, P. T. Bremer, A. Knoll, J. Yarch, A. Angelucci, V. Pascucci. “A Virtual Reality Visualization Tool for Neuron Tracing,” In IEEE Transactions on Visualization and Computer Graphics, IEEE, 2017.
ISSN: 1077-2626
DOI: 10.1109/TVCG.2017.2744079

ABSTRACT

Tracing neurons in large-scale microscopy data is crucial to establishing a wiring diagram of the brain, which is needed to understand how neural circuits in the brain process information and generate behavior. Automatic techniques often fail for large and complex datasets, and connectomics researchers may spend weeks or months manually tracing neurons using 2D image stacks. We present a design study of a new virtual reality (VR) system, developed in collaboration with trained neuroanatomists, to trace neurons in microscope scans of the visual cortex of primates. We hypothesize that using consumer-grade VR technology to interact with neurons directly in 3D will help neuroscientists better resolve complex cases and enable them to trace neurons faster and with less physical and mental strain. We discuss both the design process and technical challenges in developing an interactive system to navigate and manipulate terabyte-sized image volumes in VR. Using a number of different datasets, we demonstrate that, compared to widely used commercial software, consumer-grade VR presents a promising alternative for scientists.