
SCI Publications

2013


R.M. Kirby, M.D. Meyer. “Visualization Collaborations: What Works and Why,” In IEEE Computer Graphics and Applications: Visualization Viewpoints, Vol. 33, No. 6, pp. 82--88. 2013.

ABSTRACT

In 1987, Bruce McCormick and his colleagues outlined the current state and future vision of visualization in scientific computing [1]. That same year, Donna Cox pioneered her concept of the "Renaissance team," a multidisciplinary team of experts focused on solving visualization problems [2]. Even if a member of the visualization community has never read McCormick and his colleagues' report or heard Donna Cox speak, he or she has probably been affected by some of their ideas.

Of particular interest to us is their vision for collaboration. McCormick and his colleagues envisioned an interdisciplinary team that through close interaction would develop visualization tools that not only were effective in the context of their immediate collaborative environment but also could be reused by scientists and engineers in other fields. McCormick and his colleagues categorized the types of researchers they imagined constituting these teams, one type being the "visualization scientist/engineer." They even commented on the skills these individuals might have. However, they provided little guidance on how to make such teams successful.

In the more than 25 years since the report, researchers have refined the concepts of interaction versus collaboration [3], interdisciplinary versus multidisciplinary teams [4, 5], and independence versus interdependence [6]. Here, we use observations from our collective 18 years of collaborative visualization research to help shed light on not just the composition of current and future visualization collaborative teams but also pitfalls and recommendations for successful collaboration. Although our statements might reflect what seasoned visualization researchers are already doing, we believe that reexpressing and possibly reaffirming basic collaboration principles provide benefits.



A. Knoll, I. Wald, P. Navratil, M.E. Papka, K.P. Gaither. “Ray Tracing and Volume Rendering Large Molecular Data on Multi-core and Many-core Architectures,” In Proc. 8th International Workshop on Ultrascale Visualization at SC13 (Ultravis), 2013.

ABSTRACT

Visualizing large molecular data requires efficient means of rendering millions of data elements that combine glyphs, geometry and volumetric techniques. The geometric and volumetric loads challenge traditional rasterization-based vis methods. Ray casting presents a scalable and memory-efficient alternative, but modern techniques typically rely on GPU-based acceleration to achieve interactive rendering rates. In this paper, we present bnsView, a molecular visualization ray tracing framework that delivers fast volume rendering and ball-and-stick ray casting on both multi-core CPUs and many-core Intel® Xeon Phi™ co-processors, implemented in a SPMD language that generates efficient SIMD vector code for multiple platforms without source modification. We show that our approach running on co-processors is competitive with similar techniques running on GPU accelerators, and we demonstrate large-scale parallel remote visualization from TACC's Stampede supercomputer to large-format display walls using this system.
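
As a rough illustration of the ball-and-stick ray casting mentioned in this abstract, the core primitive for the "ball" part is a ray-sphere intersection test. The sketch below is plain scalar Python written for this listing, not code from bnsView (which the abstract says is implemented in a SPMD language); the function name and argument layout are assumptions made for the example.

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Return the nearest non-negative ray parameter t, or None on a miss.

    Points on the ray are origin + t * direction. Substituting into the
    sphere equation |p - center|^2 = radius^2 gives a quadratic in t.
    """
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # the ray's supporting line never touches the sphere
    sqrt_disc = math.sqrt(disc)
    # Try the near root first, then the far root (ray starting inside the sphere).
    for t in ((-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)):
        if t >= 0.0:
            return t
    return None  # both intersections lie behind the ray origin
```

A renderer would run a test like this per pixel against candidate atoms, usually behind an acceleration structure so that only nearby spheres are tested.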



S. Kumar, A. Saha, V. Vishwanath, P. Carns, J.A. Schmidt, G. Scorzelli, H. Kolla, R. Grout, R. Latham, R. Ross, M.E. Papka, J. Chen, V. Pascucci. “Characterization and modeling of PIDX parallel I/O for performance optimization,” In Proceedings of SC13: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 67. 2013.

ABSTRACT

Parallel I/O library performance can vary greatly in response to user-tunable parameter values such as aggregator count, file count, and aggregation strategy. Unfortunately, manual selection of these values is time consuming and dependent on characteristics of the target machine, the underlying file system, and the dataset itself. Some characteristics, such as the amount of memory per core, can also impose hard constraints on the range of viable parameter values. In this work we address these problems by using machine learning techniques to model the performance of the PIDX parallel I/O library and select appropriate tunable parameter values. We characterize both the network and I/O phases of PIDX on a Cray XE6 as well as an IBM Blue Gene/P system. We use the results of this study to develop a machine learning model for parameter space exploration and performance prediction.

Keywords: I/O, Network Characterization, Performance Modeling



A. Lex, C. Partl, D. Kalkofen, M. Streit, A. Wasserman, S. Gratzl, D. Schmalstieg, H. Pfister. “Entourage: Visualizing Relationships between Biological Pathways using Contextual Subsets,” In IEEE Transactions on Visualization and Computer Graphics (InfoVis '13), Vol. 19, No. 12, pp. 2536--2545. 2013.
ISSN: 1077-2626
DOI: 10.1109/TVCG.2013.154

ABSTRACT

Biological pathway maps are highly relevant tools for many tasks in molecular biology. They reduce the complexity of the overall biological network by partitioning it into smaller manageable parts. While this reduction of complexity is their biggest strength, it is, at the same time, their biggest weakness. By removing what is deemed not important for the primary function of the pathway, biologists lose the ability to follow and understand cross-talks between pathways. Considering these cross-talks is, however, critical in many analysis scenarios, such as judging the effects of drugs.

In this paper we introduce Entourage, a novel visualization technique that provides contextual information lost due to the artificial partitioning of the biological network, but at the same time limits the presented information to what is relevant to the analyst's task. We use one pathway map as the focus of an analysis and allow a larger set of contextual pathways. For these context pathways we only show the contextual subsets, i.e., the parts of the graph that are relevant to a selection. Entourage suggests related pathways based on similarities and highlights parts of a pathway that are interesting in terms of mapped experimental data. We visualize interdependencies between pathways using stubs of visual links, which we found effective yet not obtrusive. By combining this approach with visualization of experimental data, we can provide domain experts with a highly valuable tool.

We demonstrate the utility of Entourage with case studies conducted with a biochemist who researches the effects of drugs on pathways. We show that the technique is well suited to investigate interdependencies between pathways and to analyze, understand, and predict the effect that drugs have on different cell types.



T. Liu, M. Seyedhosseini, M. Ellisman, T. Tasdizen. “Watershed Merge Forest Classification for Electron Microscopy Image Stack Segmentation,” In Proceedings of the 2013 International Conference on Image Processing, 2013.

ABSTRACT

Automated electron microscopy (EM) image analysis techniques can be tremendously helpful for connectomics research. In this paper, we extend our previous work [1] and propose a fully automatic method to utilize inter-section information for intra-section neuron segmentation of EM image stacks. A watershed merge forest is built via the watershed transform with each tree representing the region merging hierarchy of one 2D section in the stack. A section classifier is learned to identify the most likely region correspondence between adjacent sections. The inter-section information from such correspondence is incorporated to update the potentials of tree nodes. We resolve the merge forest using these potentials together with consistency constraints to acquire the final segmentation of the whole stack. We demonstrate that our method leads to notable segmentation accuracy improvement by experimenting with two types of EM image data sets.



Y. Livnat, E. Jurrus, A.V. Gundlapalli, P. Gestland. “The CommonGround visual paradigm for biosurveillance,” In Proceedings of the 2013 IEEE International Conference on Intelligence and Security Informatics (ISI), pp. 352--357. 2013.
ISBN: 978-1-4673-6214-6
DOI: 10.1109/ISI.2013.6578857

ABSTRACT

Biosurveillance is a critical area in the intelligence community for real-time detection of disease outbreaks. Identifying epidemics enables analysts to detect and monitor disease outbreak that might be spread from natural causes or from possible biological warfare attacks. Containing these events and disseminating alerts requires the ability to rapidly find, classify and track harmful biological signatures. In this paper, we describe a novel visual paradigm to conduct biosurveillance using an Infectious Disease Weather Map. Our system provides a visual common ground in which users can view, explore and discover emerging concepts and correlations such as symptoms, syndromes, pathogens and geographic locations.

Keywords: biosurveillance, visualization, interactive exploration, situational awareness



D. Maljovec, Bei Wang, V. Pascucci, P.-T. Bremer, M. Pernice, D. Mandelli, R. Nourgaliev. “Exploration of High-Dimensional Scalar Function for Nuclear Reactor Safety Analysis and Visualization,” In Proceedings of the 2013 International Conference on Mathematics and Computational Methods Applied to Nuclear Science & Engineering (M&C), pp. 712--723. 2013.

ABSTRACT

The next generation of methodologies for nuclear reactor Probabilistic Risk Assessment (PRA) explicitly accounts for the time element in modeling the probabilistic system evolution and uses numerical simulation tools to account for possible dependencies between failure events. The Monte-Carlo (MC) and the Dynamic Event Tree (DET) approaches belong to this new class of dynamic PRA methodologies. A challenge of dynamic PRA algorithms is the large amount of data they produce which may be difficult to visualize and analyze in order to extract useful information. We present a software tool that is designed to address these goals. We model a large-scale nuclear simulation dataset as a high-dimensional scalar function defined over a discrete sample of the domain. First, we provide structural analysis of such a function at multiple scales and provide insight into the relationship between the input parameters and the output. Second, we enable exploratory analysis for users, where we help the users to differentiate features from noise through multi-scale analysis on an interactive platform, based on domain knowledge and data characterization. Our analysis is performed by exploiting the topological and geometric properties of the domain, building statistical models based on its topological segmentations and providing interactive visual interfaces to facilitate such explorations. We provide a user's guide to our software tool by highlighting its analysis and visualization capabilities, along with a use case involving data from a nuclear reactor safety simulation.

Keywords: high-dimensional data analysis, computational topology, nuclear reactor safety analysis, visualization



D. Maljovec, Bei Wang, D. Mandelli, P.-T. Bremer, V. Pascucci. “Adaptive Sampling Algorithms for Probabilistic Risk Assessment of Nuclear Simulations,” In Proceedings of the 2013 International Topical Meeting on Probabilistic Safety Assessment and Analysis (PSA 2013), Note: First runner-up for Best Student Paper Award, 2013.

ABSTRACT

Nuclear simulations are often computationally expensive, time-consuming, and high-dimensional with respect to the number of input parameters. Thus exploring the space of all possible simulation outcomes is infeasible using finite computing resources. During simulation-based probabilistic risk analysis, it is important to discover the relationship between a potentially large number of input parameters and the output of a simulation using as few simulation trials as possible. This is a typical context for performing adaptive sampling where a few observations are obtained from the simulation, a surrogate model is built to represent the simulation space, and new samples are selected based on the model constructed. The surrogate model is then updated based on the simulation results of the sampled points. In this way, we attempt to gain the most information possible with a small number of carefully selected sampled points, limiting the number of expensive trials needed to understand features of the simulation space.

We analyze the specific use case of identifying the limit surface, i.e., the boundaries in the simulation space between system failure and system success. In this study, we explore several techniques for adaptively sampling the parameter space in order to reconstruct the limit surface. We focus on several adaptive sampling schemes. First, we seek to learn a global model of the entire simulation space using prediction models or neighborhood graphs and extract the limit surface as an iso-surface of the global model. Second, we estimate the limit surface by sampling in the neighborhood of the current estimate based on topological segmentations obtained locally.

Our techniques draw inspiration from a topological structure known as the Morse-Smale complex. We highlight the advantages and disadvantages of using a global prediction model versus a local topological view of the simulation space, comparing several different strategies for adaptive sampling in both contexts. One of the most interesting models we propose attempts to marry the two by obtaining a coarse global representation using prediction models, and a detailed local representation based on topology. Our methods are validated on several analytical test functions as well as a small nuclear simulation dataset modeled after a simplified Pressurized Water Reactor.

Keywords: high-dimensional data analysis, computational topology, nuclear reactor safety analysis, visualization
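
The adaptive sampling loop described in this abstract (observe a few points, build a surrogate, score candidates, sample, update) can be sketched in one dimension. This is a minimal illustration under stated assumptions, not one of the paper's schemes: the test function `expensive_simulation`, the piecewise-linear surrogate, and the interval score (width times change of the surrogate across the interval) are all hypothetical choices made for the example.

```python
import math

def expensive_simulation(x):
    """Hypothetical stand-in for a costly simulation with a sharp transition."""
    return math.tanh(20.0 * (x - 0.6))

def adaptive_sample(f, lo, hi, n_initial, n_total, eps=1e-3):
    """Return sorted (x, f(x)) samples chosen adaptively on [lo, hi]."""
    # Initial design: evenly spaced points covering the whole domain.
    xs = [lo + i * (hi - lo) / (n_initial - 1) for i in range(n_initial)]
    samples = sorted((x, f(x)) for x in xs)
    while len(samples) < n_total:
        # Surrogate: the piecewise-linear interpolant through the samples.
        # Score each interval by its width times the change of the surrogate
        # across it; eps keeps flat regions from being ignored entirely.
        def score(pair):
            (x0, y0), (x1, y1) = pair
            return (x1 - x0) * (abs(y1 - y0) + eps)
        (x0, _), (x1, _) = max(zip(samples, samples[1:]), key=score)
        x_new = 0.5 * (x0 + x1)           # candidate with the highest score
        samples.append((x_new, f(x_new)))  # run the expensive trial
        samples.sort()                     # update the surrogate
    return samples

samples = adaptive_sample(expensive_simulation, 0.0, 1.0, n_initial=5, n_total=15)
```

With this score, new trials cluster around the transition near x = 0.6, which plays the role of the limit surface in this toy setting.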



D. Maljovec, Bei Wang, D. Mandelli, P.-T. Bremer, V. Pascucci. “Analyze Dynamic Probabilistic Risk Assessment Data through Clustering,” In Proceedings of the 2013 International Topical Meeting on Probabilistic Safety Assessment and Analysis (PSA 2013), 2013.

ABSTRACT

We investigate the use of a topology-based clustering technique on the data generated by dynamic event tree methodologies. The clustering technique we utilize focuses on a domain-partitioning algorithm based on topological structures known as the Morse-Smale complex, which partitions the data points into clusters based on their uniform gradient flow behavior. We perform both end state analysis and transient analysis to classify the set of nuclear scenarios. We demonstrate our methodology on a dataset generated for a sodium-cooled fast reactor during an aircraft crash scenario. The simulation tracks the temperature of the reactor as well as the time for a recovery team to fix the passive cooling system. Combined with clustering results obtained previously through mean shift methodology, we present the user with complementary views of the data that help illuminate key features that may be otherwise hidden using a single methodology. By clustering the data, the number of relevant test cases to be selected for further analysis can be drastically reduced by selecting a representative from each cluster. Identifying the similarities of simulations within a cluster can also aid in the drawing of important conclusions with respect to safety analysis.
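
The idea of grouping points by their gradient flow behavior can be approximated on a discrete neighborhood graph: each point is labeled with the local minimum reached by steepest descent and the local maximum reached by steepest ascent, and points sharing both labels form one cluster. The sketch below is a simplified illustration on a hypothetical 1D chain of samples, not the authors' algorithm; the function names and the toy data are assumptions.

```python
def follow(values, neighbors, start, sign):
    """Walk along the graph to a local extremum.

    sign=+1 follows steepest ascent, sign=-1 follows steepest descent.
    """
    current = start
    while True:
        # Neighbor with the largest (ascent) or smallest (descent) value.
        best = max(neighbors[current], key=lambda j: sign * values[j], default=current)
        if sign * values[best] <= sign * values[current]:
            return current  # no neighbor improves: local extremum reached
        current = best

def morse_smale_labels(values, neighbors):
    """Label each point with the (minimum, maximum) pair its flow connects."""
    return [
        (follow(values, neighbors, i, -1), follow(values, neighbors, i, +1))
        for i in range(len(values))
    ]

# Hypothetical scalar samples on a 1D chain; each point neighbors the adjacent ones.
values = [0.0, 1.0, 2.0, 1.5, 0.5, 1.0, 3.0, 2.0]
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < len(values)] for i in range(len(values))]
labels = morse_smale_labels(values, neighbors)
```

In higher dimensions the chain would be replaced by a neighborhood graph over the sample points, and ties between equal-valued neighbors would need an explicit rule.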



D. Maljovec, Bei Wang, A. Kupresanin, G. Johannesson, V. Pascucci, P.-T. Bremer. “Adaptive Sampling with Topological Scores,” In Int. J. Uncertainty Quantification, Vol. 3, No. 2, Begell House, pp. 119--141. 2013.
DOI: 10.1615/int.j.uncertaintyquantification.2012003955

ABSTRACT

Understanding and describing expensive black box functions such as physical simulations is a common problem in many application areas. One example is the recent interest in uncertainty quantification with the goal of discovering the relationship between a potentially large number of input parameters and the output of a simulation. Typically, the simulation of interest is expensive to evaluate and thus the sampling of the parameter space is necessarily small. As a result, choosing a "good" set of samples at which to evaluate is crucial to glean as much information as possible from the fewest samples. While space-filling sampling designs such as Latin hypercubes provide a good initial cover of the entire domain, more detailed studies typically rely on adaptive sampling: Given an initial set of samples, these techniques construct a surrogate model and use it to evaluate a scoring function which aims to predict the expected gain from evaluating a potential new sample. There exist a large number of different surrogate models as well as different scoring functions, each with their own advantages and disadvantages. In this paper we present an extensive comparative study of adaptive sampling using four popular regression models combined with six traditional scoring functions compared against a space-filling design. Furthermore, for a single high-dimensional output function, we introduce a new class of scoring functions based on global topological rather than local geometric information. The new scoring functions are competitive in terms of the root mean squared prediction error but are expected to better recover the global topological structure. Our experiments suggest that the most common point of failure of adaptive sampling schemes is ill-suited regression models. Nevertheless, even given well-fitted surrogate models many scoring functions fail to outperform a space-filling design.
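
For readers unfamiliar with the space-filling baseline named in this abstract, a Latin hypercube design places exactly one sample in each of n equal-width strata along every dimension. The following is a minimal sketch on the unit cube using only the standard library; the function name and the `seed` parameter are choices made for this example, not part of the paper.

```python
import random

def latin_hypercube(n_samples, n_dims, seed=0):
    """Return n_samples points in [0, 1)^n_dims forming a Latin hypercube."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One random point inside each of the n_samples strata of this axis.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired randomly across dimensions.
        rng.shuffle(column)
        columns.append(column)
    # Row i of the design takes the i-th entry of every column.
    return [tuple(column[i] for column in columns) for i in range(n_samples)]

design = latin_hypercube(8, 3)
```

Projecting the design onto any single axis yields one point per stratum, which is the property that makes it a reasonable initial cover before adaptive refinement.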



R.K. McClure, M. Styner, J.A. Lieberman, S. Gouttard, G. Gerig, X. Shi, H. Zhu. “Localized differences in caudate and hippocampal shape associated with schizophrenia but not antipsychotic type,” In Psychiatry Research: Neuroimaging, Vol. 211, No. 1, pp. 1--10. January, 2013.
DOI: 10.1016/j.pscychresns.2012.07.001
PubMed Central ID: PMC3557605

ABSTRACT

Caudate and hippocampal volume differences in patients with schizophrenia are associated with disease and antipsychotic treatment, but local shape alterations have not been thoroughly examined. Schizophrenia patients randomly assigned to haloperidol and olanzapine treatment underwent magnetic resonance imaging (MRI) at 3, 6, and 12 months. The caudate and hippocampus were represented as medial representations (M-reps), mesh structures derived from automatic segmentations of high resolution MRIs. Two quantitative shape measures were examined: local width and local deformation. A novel nonparametric statistical method, adjusted exponentially tilted (ET) likelihood, was used to compare the shape measures across the three groups while controlling for covariates. Longitudinal shape change was not observed in the hippocampus or caudate when the treatment groups and controls were examined in a global analysis, nor when the three groups were examined individually. Both baseline and repeated measures analysis showed differences in local caudate and hippocampal size between patients and controls, while no consistent differences were shown between treatment groups. Regionally specific differences in local hippocampal and caudate shape are present in schizophrenic patients. Treatment-related longitudinal shape change was not observed within the studied timeframe. Our results provide additional evidence for disrupted cortico-basal ganglia-thalamo-cortical circuits in schizophrenia. CLINICAL TRIAL INFORMATION: This longitudinal study was conducted from March 1, 1997 to July 31, 2001 at 14 academic medical centers (11 in the United States, one in Canada, one in the Netherlands, and one in England). This study was performed prior to the establishment of centralized registries of federally and privately supported clinical trials.



K.S. McDowell, F. Vadakkumpadan, R. Blake, J. Blauer, G. Plank, R.S. MacLeod, N.A. Trayanova. “Mechanistic Inquiry into the Role of Tissue Remodeling in Fibrotic Lesions in Human Atrial Fibrillation,” In Biophysical Journal, Vol. 104, pp. 2764--2773. 2013.
DOI: 10.1016/j.bpj.2013.05.025
PubMed ID: 23790385
PubMed Central ID: PMC3686346

ABSTRACT

Atrial fibrillation (AF), the most common arrhythmia in humans, is initiated when triggered activity from the pulmonary veins propagates into atrial tissue and degrades into reentrant activity. Although experimental and clinical findings show a correlation between atrial fibrosis and AF, the causal relationship between the two remains elusive. This study used an array of 3D computational models with different representations of fibrosis based on a patient-specific atrial geometry with accurate fibrotic distribution to determine the mechanisms by which fibrosis underlies the degradation of a pulmonary vein ectopic beat into AF. Fibrotic lesions in models were represented with combinations of: gap junction remodeling; collagen deposition; and myofibroblast proliferation with electrotonic or paracrine effects on neighboring myocytes. The study found that the occurrence of gap junction remodeling and the subsequent conduction slowing in the fibrotic lesions was a necessary but not sufficient condition for AF development, whereas myofibroblast proliferation and the subsequent electrophysiological effect on neighboring myocytes within the fibrotic lesions was the sufficient condition necessary for reentry formation. Collagen did not alter the arrhythmogenic outcome resulting from the other fibrosis components. Reentrant circuits formed throughout the noncontiguous fibrotic lesions, without anchoring to a specific fibrotic lesion.



C. McGann, N. Akoum, A. Patel, E. Kholmovski, P. Revelo, K. Damal, B. Wilson, J. Cates, A. Harrison, R. Ranjan, N.S. Burgon, T. Greene, D. Kim, E.V.R. DiBella, D. Parker, R.S. MacLeod, N.F. Marrouche. “Atrial Fibrillation Ablation Outcome is Predicted by Left Atrial Remodeling on MRI,” In Circulation: Arrhythmia and Electrophysiology, Note: Published online before print, December, 2013.
DOI: 10.1161/CIRCEP.113.000689

ABSTRACT

Background: While catheter ablation therapy for atrial fibrillation (AF) is becoming more common, results vary widely and patient selection criteria remain poorly defined. We hypothesized that late gadolinium enhancement magnetic resonance imaging (LGE-MRI) can identify left atrial (LA) wall structural remodeling (SRM) and stratify patients who are likely or not to benefit from ablation therapy.

Methods and Results: LGE-MRI was performed on 426 consecutive AF patients without contraindications to MRI and before undergoing their first ablation procedure and on 21 non-AF control subjects. Patients were categorized by SRM stage (I-IV) based on percentage of LA wall enhancement for correlation with procedure outcomes. Histological validation of SRM was performed comparing LGE-MRI to surgical biopsy. A total of 386 patients (91%) with adequate LGE-MRI scans were included in the study. Post-ablation, 123 (31.9%) experienced recurrent atrial arrhythmias over one-year follow-up. Recurrent arrhythmias (failed ablations) occurred at higher SRM stages with 28/133 (21.0%) stage I, 40/140 (29.3%) stage II, 24/71 (33.8%) stage III, and 30/42 (71.4%) stage IV. In multi-variate analysis, ablation outcome was best predicted by advanced SRM stage (hazard ratio (HR) 4.89; p

Keywords: atrial fibrillation arrhythmia, catheter ablation, magnetic resonance imaging, remodeling, outcome



T. McLoughlin, M.W. Jones, R.S. Laramee, R. Malki, I. Masters, C.D. Hansen. “Similarity Measures for Enhancing Interactive Streamline Seeding,” In IEEE Transactions on Visualization and Computer Graphics (TVCG), Vol. 19, No. 8, pp. 1342--1353. 2013.
ISSN: 1077-2626
DOI: 10.1109/TVCG.2012.150
PubMed ID: 23744264

ABSTRACT

Streamline seeding rakes are widely used in vector field visualization. We present new approaches for calculating similarity between integral curves (streamlines and pathlines). While others have used similarity distance measures, the computational expense involved with existing techniques is relatively high due to the vast number of Euclidean distance tests, restricting interactivity and their use for streamline seeding rakes. We introduce the novel idea of computing streamline signatures based on a set of curve-based attributes. A signature produces a compact representation for describing a streamline. Similarity comparisons are performed by using a popular statistical measure on the derived signatures. We demonstrate that this novel scheme, including a hierarchical variant, produces good clustering results and is computed over two orders of magnitude faster than previous methods. Similarity-based clustering enables filtering of the streamlines to provide a nonuniform seeding distribution along the seeding object. We show that this method preserves the overall flow behavior while using only a small subset of the original streamline set. We apply focus + context rendering using the clusters, which allows for faster and easier analysis in cases of high visual complexity and occlusion. The method provides a high level of interactivity and allows the user to easily fine tune the clustering results at runtime while avoiding any time-consuming recomputation. Our method maintains interactive rates even when hundreds of streamlines are used.
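
To make the signature idea concrete, the sketch below reduces a 2D polyline to a normalized histogram of its turning angles and compares two such histograms with a chi-squared distance. Both choices are assumptions made for illustration: the abstract does not say which curve attributes or which statistical measure the paper uses, and the function names here are hypothetical.

```python
import math

def turning_angles(points):
    """Absolute change of heading (radians) at each interior vertex of a 2D polyline."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        heading_in = math.atan2(y1 - y0, x1 - x0)
        heading_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the difference into [-pi, pi) before taking its magnitude.
        delta = (heading_out - heading_in + math.pi) % (2.0 * math.pi) - math.pi
        angles.append(abs(delta))
    return angles

def signature(points, n_bins=8):
    """Compact curve descriptor: normalized histogram of turning angles over [0, pi]."""
    hist = [0.0] * n_bins
    for angle in turning_angles(points):
        hist[min(int(angle / math.pi * n_bins), n_bins - 1)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def chi2_distance(p, q):
    """Chi-squared distance between two histograms (0 means identical)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)

straight = [(float(i), 0.0) for i in range(20)]
arc = [(math.cos(0.5 * i), math.sin(0.5 * i)) for i in range(20)]
distance = chi2_distance(signature(straight), signature(arc))
```

Because each curve is reduced to a fixed-length vector once, comparing many curves costs a few arithmetic operations per pair instead of repeated point-to-point distance tests, which is the kind of saving the abstract attributes to signatures.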



Q. Meng, A. Humphrey, J. Schmidt, M. Berzins. “Preliminary Experiences with the Uintah Framework on Intel Xeon Phi and Stampede,” SCI Technical Report, No. UUSCI-2013-002, SCI Institute, University of Utah, 2013.

ABSTRACT

In this work, we describe our preliminary experiences on the Stampede system in the context of the Uintah Computational Framework. Uintah was developed to provide an environment for solving a broad class of fluid-structure interaction problems on structured adaptive grids. Uintah uses a combination of fluid-flow solvers and particle-based methods, together with a novel asynchronous task-based approach and fully automated load balancing. While we have designed scalable Uintah runtime systems for large CPU core counts, the emergence of heterogeneous systems presents considerable challenges in terms of effectively utilizing additional on-node accelerators and co-processors, deep memory hierarchies, as well as managing multiple levels of parallelism. Our recent work has addressed the emergence of heterogeneous CPU/GPU systems with the design of a Unified heterogeneous runtime system, enabling Uintah to fully exploit these architectures with support for asynchronous, out-of-order scheduling of both CPU and GPU computational tasks. Using this design, Uintah has run at full scale on the Keeneland System and TitanDev. With the release of the Intel Xeon Phi co-processor and the recent availability of the Stampede system, we show that Uintah may be modified to utilize such a co-processor based system. We also explore the different usage models provided by the Xeon Phi with the aim of understanding portability of a general purpose framework like Uintah to this architecture. These usage models range from the pragma based offload model to the more complex symmetric model, utilizing all co-processor and host CPU cores simultaneously. We provide preliminary results of the various usage models for a challenging adaptive mesh refinement problem, as well as a detailed account of our experience adapting Uintah to run on the Stampede system.
Our conclusion is that while the Stampede system is easy to use, obtaining high performance from the Xeon Phi co-processors requires a substantial but different investment to that needed for GPU-based systems.

Keywords: Uintah, hybrid parallelism, scalability, parallel, adaptive, MIC, Xeon Phi, heterogeneous systems, Stampede, co-processor



Q. Meng, A. Humphrey, J. Schmidt, M. Berzins. “Investigating Applications Portability with the Uintah DAG-based Runtime System on PetaScale Supercomputers,” SCI Technical Report, No. UUSCI-2013-003, SCI Institute, University of Utah, 2013.

ABSTRACT

Present trends in high performance computing present formidable challenges for applications code using multicore nodes possibly with accelerators and/or co-processors and reduced memory while still attaining scalability. Software frameworks that execute machine-independent applications code using a runtime system that shields users from architectural complexities offer a possible solution. The Uintah framework, for example, solves a broad class of large-scale problems on structured adaptive grids using fluid-flow solvers coupled with particle-based solids methods. Uintah executes directed acyclic graphs of computational tasks with a scalable asynchronous and dynamic runtime system for CPU cores and/or accelerators/co-processors on a node. Uintah's clear separation between application and runtime code has led to scalability increases of 1000x without significant changes to application code. This methodology is tested on three leading Top500 machines: OLCF Titan, TACC Stampede and ALCF Mira, using three diverse and challenging applications problems. This investigation of scalability with regard to the different processors and communications performance leads to the overall conclusion that the adaptive DAG-based approach provides a very powerful abstraction for solving challenging multi-scale multi-physics engineering problems on some of the largest and most powerful computers available today.

Keywords: Uintah, hybrid parallelism, scalability, parallel, adaptive, MIC, Xeon Phi, heterogeneous systems, Stampede, co-processor



Q. Meng, A. Humphrey, J. Schmidt, M. Berzins. “Investigating Applications Portability with the Uintah DAG-based Runtime System on PetaScale Supercomputers,” In Proceedings of SC13: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 96:1--96:12. 2013.
ISBN: 978-1-4503-2378-9
DOI: 10.1145/2503210.2503250

ABSTRACT

Present trends in high performance computing present formidable challenges for applications code using multicore nodes possibly with accelerators and/or co-processors and reduced memory while still attaining scalability. Software frameworks that execute machine-independent applications code using a runtime system that shields users from architectural complexities offer a possible solution. The Uintah framework, for example, solves a broad class of large-scale problems on structured adaptive grids using fluid-flow solvers coupled with particle-based solids methods. Uintah executes directed acyclic graphs of computational tasks with a scalable asynchronous and dynamic runtime system for CPU cores and/or accelerators/co-processors on a node. Uintah's clear separation between application and runtime code has led to scalability increases of 1000x without significant changes to application code. This methodology is tested on three leading Top500 machines: OLCF Titan, TACC Stampede and ALCF Mira, using three diverse and challenging applications problems. This investigation of scalability with regard to the different processors and communications performance leads to the overall conclusion that the adaptive DAG-based approach provides a very powerful abstraction for solving challenging multi-scale multi-physics engineering problems on some of the largest and most powerful computers available today.

Keywords: Blue Gene/Q, GPU, Xeon Phi, adaptive, application, co-processor, heterogeneous systems, hybrid parallelism, parallel, scalability, software, uintah, NETL
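
The separation between application tasks and the runtime that orders them can be illustrated with a very small DAG executor. This is a serial sketch built on Python's standard `graphlib.TopologicalSorter` (Python 3.9 or later), not Uintah's runtime, which is asynchronous and distributed; the task names and the `run_task_graph` helper are hypothetical.

```python
from graphlib import TopologicalSorter

def run_task_graph(tasks, deps):
    """Execute tasks in an order that respects their dependencies.

    tasks: task name -> callable taking the dict of results computed so far.
    deps:  task name -> set of task names that must finish first.
    Returns (results, execution order).
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    results, order = {}, []
    while sorter.is_active():
        # Every task returned here has all prerequisites satisfied, so a
        # parallel runtime could dispatch the whole batch concurrently.
        for name in sorter.get_ready():
            results[name] = tasks[name](results)
            order.append(name)
            sorter.done(name)
    return results, order

# Hypothetical application code: it only declares tasks and their dependencies.
tasks = {
    "load": lambda r: 2,
    "square": lambda r: r["load"] ** 2,
    "double": lambda r: r["load"] * 2,
    "sum": lambda r: r["square"] + r["double"],
}
deps = {"load": set(), "square": {"load"}, "double": {"load"}, "sum": {"square", "double"}}
results, order = run_task_graph(tasks, deps)
```

The application side never states an execution order, so the scheduling loop could be swapped for a threaded or accelerator-aware one without touching the task definitions.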



Q. Meng, A. Humphrey, J. Schmidt, M. Berzins. “Preliminary Experiences with the Uintah Framework on Intel Xeon Phi and Stampede,” In Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery (XSEDE 2013), San Diego, California, pp. 48:1--48:8. 2013.
DOI: 10.1145/2484762.2484779

ABSTRACT

In this work, we describe our preliminary experiences on the Stampede system in the context of the Uintah Computational Framework. Uintah was developed to provide an environment for solving a broad class of fluid-structure interaction problems on structured adaptive grids. Uintah uses a combination of fluid-flow solvers and particle-based methods, together with a novel asynchronous task-based approach and fully automated load balancing. While we have designed scalable Uintah runtime systems for large CPU core counts, the emergence of heterogeneous systems presents considerable challenges in terms of effectively utilizing additional on-node accelerators and co-processors, deep memory hierarchies, as well as managing multiple levels of parallelism. Our recent work has addressed the emergence of heterogeneous CPU/GPU systems with the design of a Unified heterogeneous runtime system, enabling Uintah to fully exploit these architectures with support for asynchronous, out-of-order scheduling of both CPU and GPU computational tasks. Using this design, Uintah has run at full scale on the Keeneland System and TitanDev. With the release of the Intel Xeon Phi co-processor and the recent availability of the Stampede system, we show that Uintah may be modified to utilize such a co-processor based system. We also explore the different usage models provided by the Xeon Phi with the aim of understanding portability of a general purpose framework like Uintah to this architecture. These usage models range from the pragma based offload model to the more complex symmetric model, utilizing all co-processor and host CPU cores simultaneously. We provide preliminary results of the various usage models for a challenging adaptive mesh refinement problem, as well as a detailed account of our experience adapting Uintah to run on the Stampede system. 
Our conclusion is that while the Stampede system is easy to use, obtaining high performance from the Xeon Phi co-processors requires a substantial investment, though one different from that needed for GPU-based systems.

Keywords: MIC, Xeon Phi, adaptive, co-processor, heterogeneous systems, hybrid parallelism, parallel, scalability, Stampede, Uintah, c-safe



D.C.B. de Oliveira, Z. Rakamaric, G. Gopalakrishnan, A. Humphrey, Q. Meng, M. Berzins. “Crash Early, Crash Often, Explain Well: Practical Formal Correctness Checking of Million-core Problem Solving Environments for HPC,” In Proceedings of the 35th International Conference on Software Engineering (ICSE 2013), pp. (accepted). 2013.

ABSTRACT

While formal correctness checking methods have been deployed at scale in a number of important practical domains, we believe that such an experiment has yet to occur in the domain of high performance computing at the scale of a million CPU cores. This paper presents preliminary results from the Uintah Runtime Verification (URV) project that has been launched with this objective. Uintah is an asynchronous task-graph based problem-solving environment that has shown promising results on problems as diverse as fluid-structure interaction and turbulent combustion at well over 200K cores to date. Uintah has been tested on leading platforms such as Kraken, Keeneland, and Titan consisting of multicore CPUs and GPUs, incorporates several innovative design features, and is following a roadmap for development well into the million core regime. The main results from the URV project to date are crystallized in two observations: (1) A diverse array of well-known ideas from lightweight formal methods and testing/observing HPC systems at scale have an excellent chance of succeeding. The real challenges are in finding out exactly which combinations of ideas to deploy, and where. (2) Large-scale problem solving environments for HPC must be designed such that they can be "crashed early" (at smaller scales of deployment) and "crashed often" (have effective ways of input generation and schedule perturbation that cause vulnerabilities to be attacked with higher probability). Furthermore, following each crash, one must "explain well" (given the extremely obscure ways in which an error finally manifests itself, we must develop ways to record information leading up to the crash in informative ways, to minimize offsite debugging burden). Our plans to achieve these goals and to measure our success are described. We also highlight some of the broadly applicable concepts and approaches.

Keywords: Uintah



B. Paniagua, O. Emodi, J. Hill, J. Fishbaugh, L.A. Pimenta, S.R. Aylward, E. Andinet, G. Gerig, J. Gilmore, J.A. van Aalst, M. Styner. “3D of brain shape and volume after cranial vault remodeling surgery for craniosynostosis correction in infants,” In Proceedings of SPIE 8672, Medical Imaging 2013: Biomedical Applications in Molecular, Structural, and Functional Imaging, 86720V, 2013.
DOI: 10.1117/12.2006524

ABSTRACT

The skull of young children is made up of bony plates that enable growth. Craniosynostosis is a birth defect that causes one or more sutures on an infant’s skull to close prematurely. Corrective surgery focuses on cranial and orbital rim shaping to return the skull to a more normal shape. Functional problems caused by craniosynostosis such as speech and motor delay can improve after surgical correction, but a post-surgical analysis of brain development in comparison with age-matched healthy controls is necessary to assess surgical outcome. Full brain segmentations obtained from pre- and post-operative computed tomography (CT) scans of 8 patients with single suture sagittal (n=5) and metopic (n=3), nonsyndromic craniosynostosis from 41 to 452 days-of-age were included in this study. Age-matched controls obtained via 4D acceleration-based regression of a cohort of 402 full brain segmentations from healthy controls' magnetic resonance images (MRI) were also used for comparison (ages 38 to 825 days). 3D point-based models of patient and control cohorts were obtained using the SPHARM-PDM shape analysis tool. From a full dataset of regressed shapes, 240 healthy regressed shapes between 30 and 588 days-of-age (time step = 2.34 days) were selected. Volumes and shape metrics were obtained for craniosynostosis and healthy age-matched subjects. Volumes and shape metrics in single suture craniosynostosis patients were larger than in age-matched controls both pre- and post-surgery. The use of 3D shape and volumetric measurements shows that brain growth is not normal in patients with single suture craniosynostosis.