Designed especially for neurobiologists, FluoRender is an interactive tool for multi-channel fluorescence microscopy data visualization and analysis.

BrainStimulator is a set of networks that are used in SCIRun to perform simulations of brain stimulation such as transcranial direct current stimulation (tDCS) and transcranial magnetic stimulation (TMS).

Developing software tools for science has always been a central vision of the SCI Institute.

Visualization

Visualization, sometimes referred to as visual data analysis, uses the graphical representation of data as a means of gaining understanding and insight into the data. Visualization research at SCI has focused on applications spanning computational fluid dynamics, medical imaging and analysis, biomedical data analysis, healthcare data analysis, weather data analysis, poetry, network and graph analysis, financial data analysis, etc.

Research ranges from novel algorithm and technique development to building tools and systems that assist in the comprehension of massive amounts of (scientific) data. We also research the process of creating successful visualizations.

We strongly believe in the role of interactivity in visual data analysis. Therefore, much of our research is concerned with creating visualizations that are intuitive to interact with and also render at interactive rates.

Visualization at SCI includes the academic subfields of Scientific Visualization, Information Visualization and Visual Analytics.

Charles Hansen

Volume Rendering
Ray Tracing
Graphics

Valerio Pascucci

Topological Methods
Data Streaming
Big Data

Chris Johnson

Scalar, Vector, and Tensor Field Visualization
Uncertainty Visualization

Mike Kirby

Uncertainty Visualization

Ross Whitaker

Topological Methods
Uncertainty Visualization

Alex Lex

Information Visualization

Bei Wang

Information Visualization
Scientific Visualization
Topological Data Analysis

Centers and Labs:

Funded Research Projects:

SCALE MoDL: Advancing Theoretical Minimax Deep Learning: Optimization, Resilience, and Interpretability

Bei Wang
The past decade has witnessed the great success of deep learning in broad societal and commercial applications. However, conventional deep learning relies on fitting data with neural networks, which is known to produce models that lack resilience. The next-generation deep learning paradigm needs to deliver resilient models that promote robustness to malicious attacks, fairness among users, and privacy preservation. In this project, the investigators will collaboratively develop a comprehensive minimax learning theory that advances the fundamental understanding of minimax deep learning from the perspectives of optimization, resilience, and interpretability.

Enabling Reproducibility of Interactive Visual Data Analysis

Alex Lex
Reproducibility and justifiability are widely recognized as critical aspects of data-driven decision making in fields as varied as scientific research, business, healthcare, and intelligence analysis. This project is concerned with enabling reproducibility and justifiability of decisions in the data analysis process, specifically as it relates to visual data analysis. Visualization is an important tool for discovery, yet decisions made by humans based on visualizations of data are difficult to capture and to justify. This project will develop methods to justify, communicate, and audit decisions made based on visual analysis. This, in turn, will lead to better outcomes, achieved with less effort and cost. The increasing use of visual analysis tools for decision making will make data analysis accessible to a broad variety of people, as visual analysis tools are generally easier to use than scripting languages and do not require extensive computational and statistical training. This research and its related activities increase accessibility and enhance the data analysis infrastructure for research and education.

To achieve these goals, this research will develop a framework for making visual analysis sessions not only reproducible but also reusable. The approach is based on tracking semantically meaningful provenance data during an interactive visual analysis session. Once a discovery is made, analysts can use this history to curate a succinct analysis story, adding justifications and explanations to make their analysis reproducible by others. Using a semi-automatic process, analysts will be able to make their actions data-aware, so that their analysis processes become robust to changes, such as updates in the data. A second contribution of the proposed work is the integration of visual analysis into computational analysis processes. While visualization is commonly used to present computational analysis results, the results of a visual analysis session are rarely used to feed into further computational processes. The techniques developed in this project will allow analysts to feed analysis results (selections, aggregations, filters, etc.) back into a computational environment. This will make it possible to use interactive visualization at any point in the data analysis process while maintaining reproducibility and enabling reuse. The expected results include new methods to capture user intent, create data stories from analysis processes, and integrate computational and visual data analysis, leveraging the strengths of both human abilities and computational power. The results will be disseminated in publications and in the form of open source software, and accessible via the project website (http://vdl.sci.utah.edu/projects/2018-nsf-reproducibility/).
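
The idea of data-aware provenance can be illustrated with a small sketch. It is not the project's software: the `Action` and `Session` classes, the field names, and the sample rows are all hypothetical. The point it shows is that storing a selection as a rule (a predicate) rather than as a fixed list of row ids lets the recorded history be replayed when the data is updated.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class Action:
    """One semantically meaningful step in an analysis session (hypothetical)."""
    label: str                        # human-readable description of the step
    predicate: Callable[[Row], bool]  # data-aware rule, not a list of row ids
    note: str = ""                    # justification added while curating the story


@dataclass
class Session:
    """Records filter actions so the session can be replayed on new data."""
    history: List[Action] = field(default_factory=list)

    def record(self, label, predicate, note=""):
        self.history.append(Action(label, predicate, note))

    def replay(self, rows):
        """Re-apply every recorded filter, in order, to a dataset."""
        for action in self.history:
            rows = [r for r in rows if action.predicate(r)]
        return rows

    def story(self):
        """Return the recorded steps together with their justifications."""
        return [f"{a.label}: {a.note}" if a.note else a.label
                for a in self.history]


session = Session()
session.record("keep adults", lambda r: r["age"] >= 18, note="study protocol")
session.record("keep high scores", lambda r: r["score"] > 0.5)

original = [{"age": 30, "score": 0.9}, {"age": 12, "score": 0.8}]
# An updated dataset: the same history still applies without re-selection.
updated = original + [{"age": 45, "score": 0.7}, {"age": 50, "score": 0.1}]

print(session.replay(original))
print(session.replay(updated))
print(session.story())
```

Because the history holds rules rather than row indices, the result of `replay` could also be handed to a script or notebook, which is one simple way to picture feeding visual analysis results back into a computational environment.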

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Reproducible Visual Analysis of Multivariate Networks with Multinet

Miriah Meyer, Bryan Jones, Alexander Lex
Multivariate networks -- datasets that link together entities that are associated with multiple different variables -- are a critical data representation for a range of high-impact problems, from understanding how our bodies work to uncovering how social media influences society. These data representations are a rich and complex reflection of the multifaceted relationships that exist in the world. Reasoning about a problem using a multivariate network allows an analyst to ask questions beyond those about explicit connectivity alone: Do groups of social-media influencers have similar backgrounds or experiences? Do species that co-evolve live in similar climates? What patterns of cell-types support different types of brain functions? Questions like these require understanding patterns and trends about entities with respect to both their attributes and their connectivity, leading to inferences about relationships beyond the initial network structure. As data continues to become an increasingly important driver of scientific discovery, datasets of networks have also become increasingly complex. These networks capture information about relationships between entities as well as attributes of the entities and the connections. Tools used in practice today provide very limited support for reasoning about networks and are also limited in how users can interact with them. This lack of support leaves analysts and scientists to piece together workflows using separate tools and significant amounts of programming, especially in the data preparation step. This project aims to fill this critical gap in the existing cyber-infrastructure ecosystem for reasoning about multivariate networks by developing MultiNet, a robust, flexible, secure, and sustainable open-source visual analysis system.

MultiNet aims to change the landscape of visual analysis capabilities for reasoning about and analyzing multivariate networks. The web-based tool, along with an underlying plug-in-based framework, will support three core capabilities: (1) interactive, task-driven visualization of both the connectivity and attributes of networks, (2) reshaping the underlying network structure to bring the network into a shape that is well suited to address analysis questions, and (3) leveraging provenance data to support reproducibility, communication, and integration in computational workflows. These capabilities will allow scientists to ask new classes of questions about network datasets, and lead to insights about a wide range of pressing topics. To meet this goal, we will ground the design of MultiNet in four deeply collaborative case studies with domain scientists in biology, neuroscience, sociology, and geology.

Visualizing Robust Features in Vector and Tensor Fields

Bei Wang
Vector and tensor fields provide a powerful language to describe physical phenomena in many scientific applications. In atmospheric sciences, vectors are used to represent air movements with speed and direction and to capture typical and atypical atmospheric conditions. In materials science, stress and strain tensors are used to specify the behaviors of material bodies experiencing deformations and to facilitate the study of material strength. The main objective of this project is to define and quantify robust features in vector and tensor fields and to derive scientifically meaningful visualization for knowledge discovery. Robust features are objects, structures, or regions of interest that are stable under small perturbations of the data that arise from measurement noise, numerical instability, or simulation uncertainty. Robust features are defined and evaluated via close collaborations with domain scientists to help them discriminate spurious from essential structures in the data. In materials science, the extraction of robust features in stress tensor fields will help materials scientists better characterize and predict 3D cracking for manufacturing stronger materials. In neuroscience, quantifying the robustness of degenerate elements in brain imaging will offer new metrics and visualization in characterizing tissue microstructure for disease diagnostics. In bioengineering, robust vortex extraction and tracking of 3D conduction velocity fields in the heart will help bioengineers develop new metrics that detect and characterize ischemic stress associated with a heart attack. In atmospheric sciences, extracting and visualizing robust features in wind data will help atmospheric scientists establish situational awareness of hazardous weather conditions such as wildfires and provide wildfire weather forecasting and resource planning for firefighting personnel.
This project will also provide a unique environment for multidisciplinary activities and training opportunities for students in integrating visualization with scientific applications.

This project will establish a new approach to feature-based visualization with three interconnected aims. First, it will derive novel mathematical formulations of robust features for vector and tensor fields and their ensembles. Second, it will develop new robustness-driven algorithms in feature extraction, tracking, simplification, visual representation, and uncertainty visualization. Third, it will apply and evaluate the proposed framework via close collaborations with scientists in four high-impact application areas: materials science, neuroscience, bioengineering, and atmospheric sciences. Using simulated micro-mechanical fields in an uncracked polycrystal, the project will integrate robust features with visualization to improve the interpretability of micro-mechanical fields and the prediction of fatigue-failure surfaces. Using diffusion tensor imaging (DTI) from the Human Connectome Project, the project will investigate quantifiable characteristics of crossing fibers as part of a long-term goal for deep brain stimulator placement. Using 3D conduction velocity generated in volumes of swine and canine tissues, the project will generate feature-based signatures from vortex stability and evolution and use them, in the long term, for disease diagnostics and medical intervention. Using ensemble datasets generated from the High-Resolution Rapid Refresh Model (HRRR), the project will use robust features in the visualization and statistical analysis of atmospheric models to identify atypical atmospheric conditions for wildfire weather assessment. The research results will be instantiated by a collection of research papers and open-source software tools targeting the communities of collaborating scientists and the large research community. These software tools will be made available via GitHub under MIT or BSD licenses.

EAGER: Understanding and Mitigating Misinformation in Visualizations on Social Media

Alexander Lex
In a time of crisis, such as during a hurricane or a global pandemic, social media is an important source of information for the general population. In these scenarios, data visualizations are often used to convey information that is critical for decision making by individuals. For example, a visualization of the path of a hurricane can inform the affected population about the need to prepare or evacuate, while a visualization about the prevalence of a disease in a certain area can inform personal choices, such as limiting interactions with others during a relevant time period. Visualizations, however, can be flawed, which can lead to misinterpretation of the data, and, in a crisis, lead to decisions with negative consequences. This project seeks to identify aspects of visualizations that make them widely shared, identify flaws a visualization might have, and warn social media users about them. Ultimately, this project can lead to better responses to a crisis by the general population, and contribute to improving visualization literacy. Finally, this project will also enable the training of two graduate students, provide opportunities for undergraduate research, and curate material that can be leveraged by educators teaching about visualization design.

These goals will be achieved by applying existing and novel methods, such as topic modeling and calculating measures of social attention, to three large datasets of social media posts related to recent crises. Using a qualitative coding approach, a taxonomy of design problems will be developed. This taxonomy will be used to label a large dataset. Finally, a prototype intervention will be developed in the form of a plug-in that warns of problematic visualizations and also enables users to classify problems with visualizations they encounter. The dataset and the annotations compiled in the course of this project will be shared publicly. The software created will be released under a permissive, non-viral open source license.

FluoRender: Visualization-Based and Interactive Analysis for Multi-Channel Microscopy Data

Chuck Hansen
FluoRender is a software package for visualizing and analyzing 3D and 4D (3D over time) fluorescence microscopy data. This project will serve the needs of biologists utilizing confocal microscopy for understanding cell development in many organisms and addresses the big-data problem from the massive increase of imaging data from modern high-resolution fluorescence microscopes.

Specific Aim 1: Visualization of an extended number of volume channels: FluoRender will be enhanced with multichannel visualization capability by simultaneously supporting several tens to hundreds of channels, which can be acquired from multispectral imaging devices or by registering data of multiple scans. FluoRender will take advantage of the latest volume rendering techniques to visualize significantly improved signal intensity detail compared to pseudo-surfaces.

Specific Aim 2: Interactive comparison and organization of volume channels: A package of measures will be implemented in FluoRender for directly comparing volume channels. Leveraging the OpenCL programming interface, shape comparisons will be performed interactively on graphics hardware, allowing compound measures for complex morphology as well as immediate visual feedback via multichannel visualization. Interactive comparison will further enable the development of functions for semiautomatic channel organization and multichannel colocalization analysis.

Specific Aim 3: 4D tracking of structures with irregular and changing shapes: Tracking irregularly shaped and shape-changing structures will substantially expand FluoRender's application for developmental and morphological studies of intracellular organelles, cells, and tissues. This will include a comprehensive tracking system that integrates different modules and allows them to work in an iterative and integrated environment, allowing user-guided, progressive refinement of the segmentation and tracking results.

Specific Aim 4: Fully hardware-accelerated and customizable computing modules: FluoRender will be restructured using compute modules based on the OpenCL standard, which provides not only hardware-accelerated execution speed, but also convenience for customization and reuse. Computing modules will be integrated with visualization features, enabling interactive and visualization-centered analysis. Users will also be able to reorganize and build modules to customize specific workflows for greater adaptability.

CPS: Synergy: A Layered Framework of Sensors, Models, Land-Use Information and Citizens for Understanding Air Quality in Urban Environments

Miriah Meyer, Ross Whitaker, Kerry Kelly, Pierre-Emmanuel Gaillardon
Poor air quality has been linked not just to adverse health effects such as increased incidence of cardiac arrhythmia, lung cancer, heart disease, and mortality, but also to the vitality of a region’s economy. These issues are particularly important in cities such as Salt Lake City (SLC), where topography, climate, and urban expansion combine to create some of the worst air quality episodes in the country. Cities like SLC currently rely on small numbers of expensive sensors placed across a large geographic area to measure air quality, making local, neighborhood-level measurements impossible to determine. Meanwhile, new commodity technologies are leading to fine-grained, community-based strategies for measuring and communicating air quality. Leveraging both of these approaches, this project will develop and deploy a dense, distributed, and dynamic air quality cyber-physical framework -- focusing on fine particulate matter and using SLC as an urban testbed -- to produce neighborhood-level estimates of air quality. The framework includes a network of low-cost sensors, hosted and maintained through a citizen science effort and maker-kit approach.

This research will result in novel developments in three areas: (i) sensor development that focuses on dramatically reducing cost and a movement toward cheap, wearable, passive sensors; (ii) computational modeling that combines heterogeneous sensor measurements with information about weather, topography, and land use patterns; and (iii) visualization interface design that communicates air quality estimates over space and time, coupled with related uncertainty measurements. Each of these areas requires a multidisciplinary approach that integrates existing and novel insights about sensor networks, computational modeling, and sense-making of data, as well as leveraging an engaged and connected community of residents through citizen science.

SBIR Phase II Immediate Delivery of Massive Aerial Imagery to Farmers and Crop Consultants

Valerio Pascucci, Amy Gooch
This Small Business Innovation Research (SBIR) Phase II project will accelerate the adoption of data-intensive precision agriculture, increasing yields while decreasing farm inputs such as fertilizers and pesticides. This project removes the software bottleneck (time and labor) in processing large aerial surveys taken by Unmanned Aerial Systems, enabling a cost-effective and timely process to deliver actionable information to farmers. Using frequent high-quality aerial scans, farmers may optimize the use of fertilizers and more finely control the amount of pesticides and herbicides necessary to increase crop yield. Furthermore, farmers mitigate costs and losses by being able to spot problem areas, minimize the spread of plant diseases, and identify issues such as standing water, irrigation malfunctions, and persistent automated machinery errors in planting or cultivation. This project provides special benefit for rural customers having inadequate internet infrastructure by eliminating the need to upload massive imagery to the cloud for processing. The technology is part of a broad initiative in agriculture addressing the need for large increases in food production by 2050 in response to the projected growth of the world’s population to over 9 billion people.

This project will continue development of algorithms for on-the-fly orthorectification, stitching, and normalization of aerial image mosaics and their deployment in an easy-to-use software prototype. Phase I already demonstrated industry-leading speeds for such image processing. The technology behind this research project is designed from the ground up to process massive data with less memory and increased speed relative to other approaches, enabled by a proprietary streaming image representation that allows multichannel gigapixel and terapixel images to be treated as ordinary images. Phase II supports new extensions to the software that simplify and accelerate delivering a stitched and analyzed map, such as prioritizing computation in regions of the image that a customer is exploring. This would effectively eliminate the delay between image acquisition on unmanned aerial vehicles and when the imagery can be used. Crop consultants have identified this as a transformative capability, as it enables ground-truthing information derived from aerial imagery in the same field visit, saving time and labor. The performance gains in compute-limited environments supported by this project are a key link between new capabilities to gather information and a farmer’s ability to utilize it to increase productivity while reducing costs.
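
As a rough illustration of prioritizing computation in the region a user is exploring, the sketch below orders the tiles of a mosaic by their distance from the tile currently in view. The project's streaming image representation is proprietary and is not reproduced here; the tile grid, the function name `tile_order`, and the Manhattan-distance priority are all assumptions made for the example.

```python
import heapq


def tile_order(grid_w, grid_h, view_x, view_y):
    """Return tile coordinates ordered by distance to the viewed tile.

    Hypothetical scheme: tiles closest to (view_x, view_y) are processed
    first, so the region being explored becomes usable before the rest.
    """
    heap = []
    for ty in range(grid_h):
        for tx in range(grid_w):
            # Manhattan distance in tile units serves as the priority.
            dist = abs(tx - view_x) + abs(ty - view_y)
            heapq.heappush(heap, (dist, tx, ty))

    order = []
    while heap:
        _, tx, ty = heapq.heappop(heap)
        order.append((tx, ty))
    return order


# A 3x3 mosaic with the user looking at the bottom-right tile.
order = tile_order(3, 3, 2, 2)
print(order[0])   # the viewed tile is scheduled first
print(order[-1])  # the farthest tile is scheduled last
```

In an interactive tool the priorities would presumably be recomputed as the viewport moves; the sketch only shows a single static ordering.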

Topology-Preserving Data Sketching for Scientific Visualization

Bei Wang
We are experiencing an information overload from streams of data that arise from scientific instruments and simulations. For example, material scientists use molecular dynamics (MD) simulations to study how fluids (such as gas, oil, and water) interact with heterogeneous porous solids (such as ceramics, cement, and rock) to improve transport phenomena within porous materials, which play critical roles in our energy sector. Such simulations generate large, time-varying, and complex forms of data under different physical and chemical conditions. Keeping track of interesting phenomena and applying appropriate actions (such as storage, analysis, and visualization) while the simulation is running is necessary but challenging. To address this challenge, the goal is no longer to capture and store observations or simulation in detail, but rather to process data efficiently and approximately in order to create a summary - a sketch - which allows queries over large volumes of data to be answered quickly.
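
To make the notion of a sketch concrete, here is a minimal Count-Min sketch, a standard fixed-size summary for streaming frequency counts. It is a generic example and is not topology-preserving, so it should not be read as the technique this project develops; the class, its parameters, and the sample event labels are assumptions chosen for illustration. It shows the trade the paragraph describes: the stream is never stored in full, yet approximate queries can still be answered quickly.

```python
import hashlib


class CountMinSketch:
    """Fixed-size summary of a stream that answers approximate count queries."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        # Memory use is depth * width counters, independent of stream length.
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One deterministic hash per row, derived by salting with the row id.
        digest = hashlib.sha256(f"{row}:{item}".encode()).hexdigest()
        return int(digest, 16) % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum over rows
        # never underestimates the true count.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))


sketch = CountMinSketch()
# Hypothetical event labels emitted while a simulation runs.
for _ in range(100):
    sketch.add("pore_A")
for _ in range(3):
    sketch.add("pore_B")

print(sketch.estimate("pore_A"))
print(sketch.estimate("pore_B"))
```

The estimates are upper bounds on the true counts, and the error shrinks as `width` grows, which is the usual accuracy-for-memory trade in sketching.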

The objective of this research is to conduct a systematic study of topology-preserving data sketching techniques to improve visual exploration and understanding of large scientific data. The project will employ topological sketches, that is, compressed representations of the full data that preserve their important structural properties, to support analysis and visualization as the data are generated. Our proposed solution transforms data sketching ideas from statistics, geometry, and linear algebra to develop new topological sketches of complex data. Such sketches will exploit the high spatial resolution and temporal fidelity of in situ data in an intelligent and scalable way. They will reduce data in situ while preserving its structural properties, and subsequently support interactive data exploration. In addition, topological triggers will be integrated into an adaptive workflow to support anomaly detection, computational steering, and decision optimization. The multidisciplinary nature of the proposed work will be broadly applicable in many scientific areas, including applications in computational fluid dynamics and materials science.

Novel 3D Experiments and Simulations Combined with Genetic Optimization for Accelerated Design of Metallic Foams

Valerio Pascucci
Open-cell metallic foams are an exciting class of structural materials that comprise a network of interconnected metallic ligaments, resulting in an interesting foam architecture. These low-density materials have garnered much attention over the past two decades based on their recognized potential for use in multi-functional applications. For example, in addition to serving as light-weight, load-bearing structures, open-cell metallic foams have the potential to serve concurrently as electrodes for energy-storage devices, as hosts for newly generated bone and blood vessels in biomedical implants, or as impact absorbers and noise insulators for advanced high-speed ground transportation. Despite their potential, the widespread deployment of open-cell metallic foams for a broader range of multi-functional applications remains hampered by inefficient, trial-and-error manufacturing approaches. This Designing Materials to Revolutionize and Engineer our Future (DMREF) Grant Opportunities for Academic Liaison with Industry (GOALI) award supports a joint academic-industry research effort to enable more efficient and intelligent design of open-cell metallic foams, and to achieve precise control over their performance for targeted applications. The results will provide dramatic improvements for the industry by increasing both the manufacturing efficiency and the tailorability of the foams, which will help to expand deployment of the foams throughout the energy, defense, biomedical, aerospace, and automotive industries. The research team will host outreach activities to expose students in K-12, undergraduate, and graduate school to this multi-disciplinary STEM research.

This DMREF GOALI award supports research to enable an accelerated and performance-based design paradigm for open-cell metallic foams through the integration of emergent methods in 3D materials characterization with multi-scale modeling and Bayesian optimization. The new design paradigm will be made possible through the discovery of process-structure-property relationships in the foams. The specific objectives include: experimentally modifying manufacturing parameters to produce variants of open-cell metallic foams; performing 3D synchrotron-based crystal-orientation measurements and in-situ X-ray computed tomography experiments to gain unprecedented insight into the hierarchical structure and multi-scale deformation mechanisms of the foam; using high-fidelity, multi-scale (grain-to-continuum) finite-element modeling to investigate micromechanical behavior and predict performance of the as-manufactured foams; conducting virtual tests on synthetic-foam variants to further populate a metallic-foam design space; and using Bayesian optimization on the simulation-based results to enable selection of optimal hierarchical structures (i.e., topology and crystallography) for targeted performance metrics. The research will be the first to decouple the effects of ligament topology and underlying crystal structure on the micromechanical behavior of open-cell metallic foams (including microbuckling, local accumulation of slip, and distribution of crack-nucleation sites), which is postulated to influence their performance.

A Scalable Framework for Visual Exploration and Hypotheses Extraction of Phenomics Data

Bei Wang
Understanding how gene-by-environment interactions result in specific phenotypes is a core goal of modern biology and has real-world impacts on such things as crop management. Developing and managing successful crop practices is a goal that is fundamentally tied to our national food security. By applying novel computational visual analytical methods, this project seeks to identify and unravel the complex web of interactions linking genotypes, environments and phenotypes. These methods will first need to be designed and developed into usable software applications that can handle large volumes of crop phenomics data. High-throughput sensing technologies collect large volumes of field data for many plant traits, such as flowering time, related to crop development and production. The maize cultivars used here come from multiple genotypes that have been grown under a variety of environmental conditions, in order to give the widest range of conditions for understanding the interactions. The resulting data sets are growing quickly, both in size and complexity, but the analytical tools needed to extract knowledge and catalyze scientific discoveries have significantly lagged behind. The methodologies to be developed in this project represent a systematic attempt at bridging this rapidly widening divide. The project is inherently interdisciplinary, involving close research partnerships among computer scientists, plant scientists, and mathematicians.
The research outcomes will be tightly integrated with education using a multipronged approach that includes, among others, postdoctoral and student training (graduates and undergraduates), curriculum development for a new campus-wide interdisciplinary undergraduate degree in Data Analytics, conference tutorials for training phenomics data practitioners, and contribution to the recruitment and retention of underrepresented minorities (particularly women) in STEM fields through the Pacific Northwest Louis Stokes Alliance for Minority Participation.

This project will lead to the design and development of a new, scalable, visual analytics platform suitable for hypothesis extraction and refinement from complex phenomics data sets. Focus on hypothesis extraction is critical in the context of phenomics data sets because much of the high-throughput sensing data being generated in crop fields are generated in the absence of specifically formulated hypotheses. Extracting plausible hypotheses from the data represents an important but tedious task. To this end, this project will apply and develop new capabilities using emerging advanced algorithmic principles, particularly from the branch of mathematics called algebraic topology that studies shapes and structure of complex data. The research objectives are three-fold. First, the project will employ and extend emerging algorithmic techniques from algebraic topology to decode the structure of large, complex phenomics data. Second, an interactive visual analytic platform will be developed to facilitate knowledge discovery using the extracted topological structures. Lastly, the quality and validity of a new visual analytic platform designed by this team will be tested using real-world maize data sets as well as simulated inputs as testbeds. The developed framework will encode functions for scientists to delineate hypotheses of three kinds: i) genetic characterization of single complex traits; ii) genetic characterization of multiple traits that share potentially pleiotropic effects; and iii) decoding and detailed characterization of genotype-by-environmental interactions, in particular, through a collaborative pilot study of maize flowering and growth traits. The expected significance of the proposed work is that biologists will be able to extract different types of testable hypotheses from plant phenomics data sets by employing a new class of visual analytic tools, and thus obtain a deeper understanding of the interactions among genotypes, environments and phenotypes. 
The project is potentially transformative in two ways: i) it will introduce advanced mathematical and computational principles into mainstream phenomic data analysis; and ii) it will usher in a new era where biologists spearhead data-driven hypothesis extraction and discovery with the aid of interactive, informative, and intuitive tools. The project will have a direct impact on the state of software in phenomics for fundamental data-driven discovery. To facilitate broader community adoption, the project will integrate the tools into the CyVerse Institute and into a community phenomics software outlet. It will also lead to the development of automated scientific workflows. Project website: http://tdaphenomics.eecs.wsu.edu/.

COVID - RAPID: Building a Visual Consensus Model of the SARS-CoV-2 Life Cycle

Janet Iwasa, Miriah Meyer
The COVID-19 epidemic has motivated hundreds (if not thousands) of biological researchers around the globe to redirect their research efforts towards the understanding of SARS-CoV-2. This is leading to an explosion of data, and it will be essential to find ways to rapidly digest and integrate new information into a context that facilitates consensus building in the research community. How do researchers and the broader community stay abreast of this flood of information? And how can we quickly move towards building a consensus model of the SARS-CoV-2 life cycle that builds on this explosive body of scientific data and expertise? This work proposes to take a novel and intuitive approach to facilitate scientific discourse and dissemination through the development of: (1) detailed molecular 3-D depictions that put a diverse dataset into the context of the SARS-CoV-2 life cycle; and (2) annotation tools that researchers can use to explore and capture scientific discussions, speeding up consensus building and promoting a mechanistic understanding of how this virus works. If successful, the work will reduce the time of consensus building from years to months. In addition, a graduate student and postdoc will receive training at the intersection of biological and computer sciences.

Specifically, researchers will work with an international group of SARS-CoV-2 experts to develop detailed and accurate visualizations of all stages of the viral life cycle including cellular entry, RNA replication and transcription, and viral assembly and egress with known energy states, rates, and spatial accuracy. These 3-D visualizations, which will be made freely available online, will be used to stimulate discussions within the scientific community, and will be iteratively updated based on community feedback and new data. To facilitate consensus building, annotation tools will be developed to interactively describe the data used to generate the visualizations and will also mediate and capture scientific discourse surrounding the various molecular mechanisms involved in viral infection. This project will rapidly produce a rich and publicly accessible collection of knowledge about SARS-CoV-2 biology for the global community.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

OpenSpace: An Engine for Dynamic Visualization of Earth and Space Science for Informal Education and Beyond

Chuck Hansen
The American Museum of Natural History (AMNH), in collaboration with informal science institutions (ISI), NASA mission teams and Subject Matter Experts (SME), and academic partners, seeks support for a five-year project to enable STEM education and improve U.S. scientific literacy by engaging a broad spectrum of the American public and STEM learners in cutting-edge NASA science and engineering content.

This project will develop an open source software, called OpenSpace, for visualizing NASA astrophysics, heliophysics, planetary science, and Earth science mission engineering activities and science results for the general public, students, teachers, and citizen scientists everywhere. The project will develop and widely disseminate OpenSpace; create innovative and networked programs with ISI partners; produce educational resources for middle and high school teachers and students; and establish robust partnerships with NASA SMD missions, ISIs, and visualization research centers.

The project is based on the success of pilot efforts to visualize the New Horizons mission and heliophysics and space weather simulation data generated by NASA Goddard’s Community Coordinated Modeling Center. It builds on AMNH’s expertise in science visualization and its record of success in partnering with NASA to develop innovative programming, exhibitions, and Space Shows that engage, inspire, and educate students, teachers, and learners of all ages.

Drawing together a highly qualified and exceptionally talented team of scientists, educators, software engineers, and visualization specialists, the project’s aim is to build a pipeline for transmitting visualized science content from across NASA SMD divisions to ISIs, secondary school classrooms, and the public.

To do so, the project proposes the following objectives:

Develop OpenSpace into a robust and flexible interactive visualization software that supports the presentation of dynamic data sets and that is easily updated for the presentation of current science.
Form a network of ISIs to inform the development of OpenSpace and develop associated programming to engage and educate diverse audiences.
Disseminate OpenSpace via the web to individual users, including teachers as a key audience, with resources for leveraging it as an educational tool.

Project outcomes include:

The establishment of a pipeline connecting NASA SMD content and SMEs with ISIs, secondary school classrooms, and the public.
The development of a new and powerful educational tool for the visualization of a wide range of NASA SMD mission activities and data products.
Enhanced understanding and engagement in STEM among youth, informal and formal educators, and the general public.

Project objectives, activities, and outcomes are closely aligned with, and aim to fulfill, the SMD science education objectives of enabling STEM education, improving U.S. scientific literacy, and advancing national education goals of increasing and sustaining youth and public engagement in STEM and leveraging efforts through partnerships.

Because OpenSpace will be open source, it will be freely accessible to users. It is designed to be compatible with multi-video channel cluster operations for high-resolution wall displays and planetarium domes, as well as for single-channel polar rendering fisheye projections and flat screens, in 2D and 3D. A WebGL version will make it possible for anyone with Internet access to explore OpenSpace. Another core design principle of this project is the ability to network across the Internet to synchronize displays in different locations, creating opportunities for shared experiences of high profile NASA content, including live events. This open source project will have a life far beyond the award period, as it will provide science and education communities access to the source code to modify, enhance, and extend its functionality to best serve audiences in the future.

Extracting the Full Information Content of Astrophysical Data Cubes

Bei Wang
An IFU (Integral Field Unit Spectrometer) allows one to take a high-resolution spectrum at multiple physical locations within an external target. The signal from an astronomical target is distributed into a large number of spaxels (spatial pixels), each with noise from the sky and detectors, and a greatly varying signal-to-noise ratio across the bundle. The IFU bundle technique gives rise to 3-dimensional astrophysical data cubes (two spatial directions and one frequency direction) that require advanced analysis techniques to extract their salient features. In many cases, the complex kinematic structure of features of interest further complicates the problem. Furthermore, it is intrinsically difficult to visualize such data, and common analysis techniques often involve slicing the data cube along a particular axis, either at a fixed frequency or a fixed spatial location.
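
The two common slicing operations mentioned above can be illustrated with a minimal sketch. The index order `cube[f][y][x]` and both function names are assumptions made for this example, not part of any MaNGA or ALMA tooling.

```python
def image_at_frequency(cube, f):
    """Return the 2D spatial image at frequency channel f.

    The cube is assumed to be a nested list indexed as cube[f][y][x].
    """
    return [row[:] for row in cube[f]]


def spectrum_at_spaxel(cube, y, x):
    """Return the 1D spectrum (one value per frequency channel) at spaxel (y, x)."""
    return [plane[y][x] for plane in cube]


# Toy cube with 3 frequency channels and 2 x 2 spaxels (synthetic values).
cube = [[[f * 100 + y * 10 + x for x in range(2)] for y in range(2)]
        for f in range(3)]
image = image_at_frequency(cube, 1)
spectrum = spectrum_at_spaxel(cube, 1, 0)
```

Either slice discards one or two of the three dimensions, which is why features with complex structure across both space and frequency are hard to see this way.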

A prominent example of data produced by the IFU bundle technique comes from the Mapping Nearby Galaxies at APO (MaNGA) survey, which is part of the Sloan Digital Sky Survey IV (SDSS-IV). PI Phillips and PI Rosen have been working to analyze similar data cubes taken at radio frequencies with the ALMA telescope in Chile (see http://alma-tda.cspaul.com). They have been using sophisticated mathematical techniques known as topological data analysis, in particular the contour tree, in order to extract features and remove noise for visualizing data cubes very similar to the ones that arise from IFUs.

Objective
We would like to apply advanced data analysis and visualization techniques, in particular, those from topological data analysis, to data observed at UV, optical and infrared wavelengths, in order to extract features that are currently inaccessible. In particular, we would like to start by studying the SDSS-IV MaNGA dataset, to which Carnegie Institution for Science and the University of Utah (where the MaNGA reduction and analysis pipelines are run via the Center for High Performance Computing) have full access as Institutional members (the SDSS Data Scientist, Prof. Joel Brownstein of the University of Utah, is a PI on this project).

Furthermore, we will explore the applicability of such techniques to other similar datasets that have been acquired using other IFU facilities.

Topological Analysis for Energetic Materials Characterization

Valerio Pascucci
This statement of work supports ongoing efforts towards improved analysis of characterization and surveillance data of energetic materials. The goals are to: 1) use topological segmentations to analyze microstructural changes under aging; 2) explore extending the analysis tools to characterize fine-prill materials; 3) develop techniques to quantify permeable surface area of a lower-density system; and 4) extract age-trendable features from 2D surface profile data.

Tasks

1. Analyze microstructural changes under aging. At various aging points (in time-temperature space):

Determine matching scales and simplification levels to create best matching segmentations for each dataset
Develop techniques to affinely align pre- & post-aged data sets for maximal correspondence
Use per-grain matching to analyze material changes over time

2. Explore extending the analysis tools to the characterization of fine-prill materials:

In previous years the Utah technology could successfully analyze X-ray CT data for coarser-prill HE materials. Explore the effectiveness of such technology in performing similar analysis on X-ray CT data for fine-prill systems.

3. Develop techniques to quantify permeable surface area of lower-density systems:

The topological segmentation theory could be used to quantify the permeable surface area of lower-density (e.g., porous-powder) systems, and to compute the gas-flow rate through such a specimen under a given pressure-gradient. CONTINGENCY: Availability of high-quality micro-CT data.

4. Extract age-trendable features from surface profilometry data:

Analyze 2D height-map data from pellet surfaces (measured using a surface profilometer) and devise quantitative features that can be used to track age-related changes in material morphology and performance.

Advanced Visualization of Silent Error Propagation in HPC Applications

Valerio Pascucci
High Performance Computing (HPC) systems contain increasingly large numbers of components. This trend, combined with practical limitations on component reliability, makes HPC systems vulnerable to a wide range of faults. These faults degrade system efficiency and even threaten the correctness of application results. The problem is expected to grow even more significant for Exascale systems. Designing resilient software to run efficiently on such hardware is challenging, and uncertainty about how failures affect programs only complicates the problem.

Disruptions to the micro‐architectural state of hardware components (e.g., caches, reorder buffers, or pipeline registers) may cause these components to crash or compute erroneous results. These errors then propagate through layers of the software stack, including the runtime system, support libraries, and application logic. Local memory accesses to erroneous results can easily propagate the effects of errors across cores, and remote memory accesses on modern networks propagate errors across nodes. The reordering of memory accesses by memory systems introduces further difficulties by obscuring the consistency (ordering) of memory accesses when errors occur. Identifying the propagation of errors through space and time and quantifying it in terms developers can understand is a major problem for error recovery schemes. This is especially true for scientific applications that rely on complex physical or numerical invariants and for resilience techniques that need to identify consistent states.

The ultimate goal of this research is to provide a visualization of the propagation of errors through application and system software in order to identify for application developers the vulnerability of their data structures and code regions to different types of errors, and the way these errors propagate through application state and logic.

VisStore: Seamless Acquisition, Storage, and Distribution of Massive Imagery

Ease of Use and Deployment for a Fast, Scalable Data Movement Infrastructure

Publications in Visualization:

Uncertainty Visualization of the Marching Squares and Marching Cubes Topology Cases
Subtitled “arXiv:2108.03066,” T. M. Athawale, S. Sane, C. R. Johnson. 2021.

Marching squares (MS) and marching cubes (MC) are widely used algorithms for level-set visualization of scientific data. In this paper, we address the challenge of uncertainty visualization of the topology cases of the MS and MC algorithms for uncertain scalar field data sampled on a uniform grid. The visualization of the MS and MC topology cases for uncertain data is challenging due to their exponential nature and the possibility of multiple topology cases per cell of a grid. We propose the topology case count and entropy-based techniques for quantifying uncertainty in the topology cases of the MS and MC algorithms when noise in data is modeled with probability distributions. We demonstrate the applicability of our techniques for independent and correlated uncertainty assumptions. We visualize the quantified topological uncertainty via color mapping proportional to uncertainty, as well as with interactive probability queries in the MS case and entropy isosurfaces in the MC case. We demonstrate the utility of our uncertainty quantification framework in identifying the isovalues exhibiting relatively high topological uncertainty. We illustrate the effectiveness of our techniques via results on synthetic, simulation, and hixel datasets.
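
As a rough illustration of the topology case count and entropy ideas for a single marching squares cell, the following minimal sketch estimates the case probabilities by Monte Carlo sampling. It assumes independent Gaussian noise at the four cell corners; the function names, the sampling approach, and the noise model are assumptions of this sketch and are not taken from the paper, which may compute these quantities differently.

```python
import math
import random
from collections import Counter


def ms_case(corners, isovalue):
    """Return the marching squares case index (0..15) for four corner values."""
    index = 0
    for bit, value in enumerate(corners):
        if value >= isovalue:
            index |= 1 << bit
    return index


def case_distribution(means, stddevs, isovalue, samples=10000, seed=0):
    """Monte Carlo estimate of the case probabilities for one cell.

    Assumes independent Gaussian noise per corner (an assumption of this sketch).
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(samples):
        corners = [rng.gauss(m, s) for m, s in zip(means, stddevs)]
        counts[ms_case(corners, isovalue)] += 1
    return {case: n / samples for case, n in counts.items()}


def entropy(distribution):
    """Shannon entropy (in bits) of a case probability distribution."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)


# A cell whose corner means all sit on the isovalue is highly uncertain.
dist = case_distribution([0.5] * 4, [1.0] * 4, isovalue=0.5)
case_count = len(dist)      # number of distinct topology cases observed
cell_entropy = entropy(dist)
```

A cell with a single possible case has zero entropy, while a cell in which all 16 cases are equally likely approaches 4 bits, so color mapping either quantity highlights topologically uncertain regions.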

Data-Driven Estimation of Temporal-Sampling Errors in Unsteady Flows
H. Bhatia, S. N. Petruzza, R. Anirudh, A. G. Gyulassy, R. M. Kirby, V. Pascucci, P. T. Bremer. 2021.

While computer simulations typically store data at the highest available spatial resolution, it is often infeasible to do so for the temporal dimension. Instead, the common practice is to store data at regular intervals, the frequency of which is strictly limited by the available storage and I/O bandwidth. However, this manner of temporal subsampling can cause significant errors in subsequent analysis steps. More importantly, since the intermediate data is lost, there is no direct way of measuring this error after the fact. One particularly important use case that is affected is the analysis of unsteady flows using pathlines, as it depends on an accurate interpolation across time. Although the potential problem with temporal undersampling is widely acknowledged, there currently does not exist a practical way to estimate the potential impact. This paper presents a simple-to-implement yet powerful technique to estimate the error in pathlines due to temporal subsampling. Given an unsteady flow, we compute pathlines at the given temporal resolution as well as subsamples thereof. We then compute the error induced due to various levels of subsampling and use it to estimate the error between the given resolution and the unknown ground truth. Using two turbulent flows, we demonstrate that our approach, for the first time, provides an accurate, a posteriori error estimate for pathline computations. This estimate will enable scientists to better understand the uncertainties involved in pathline-based analysis techniques and can lead to new uncertainty visualization approaches using the predicted errors.
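
The measurement step described above (computing pathlines at the given temporal resolution and at coarser subsamples, then comparing them) can be sketched as follows. The analytic flow, the midpoint integrator, linear interpolation in time, and all function names are assumptions chosen for this example; the extrapolation from these measured errors to the unknown ground truth is not shown.

```python
import math


def velocity(x, y, t):
    """Hypothetical unsteady flow: a rotation whose angular speed oscillates in time."""
    omega = 1.0 + 0.5 * math.sin(2.0 * t)
    return (-omega * y, omega * x)


def interpolated_velocity(x, y, t, snapshot_dt):
    """Velocity reconstructed from snapshots stored every snapshot_dt,
    using linear interpolation in time between the two nearest snapshots."""
    i = math.floor(t / snapshot_dt)
    t0, t1 = i * snapshot_dt, (i + 1) * snapshot_dt
    w = (t - t0) / snapshot_dt
    u0, v0 = velocity(x, y, t0)
    u1, v1 = velocity(x, y, t1)
    return ((1 - w) * u0 + w * u1, (1 - w) * v0 + w * v1)


def pathline_end(seed, t_end, snapshot_dt, step=0.001):
    """Integrate a pathline from t = 0 to t_end with midpoint (RK2) steps
    and return its end position."""
    x, y = seed
    for k in range(round(t_end / step)):
        t = k * step
        u, v = interpolated_velocity(x, y, t, snapshot_dt)
        xm, ym = x + 0.5 * step * u, y + 0.5 * step * v
        u, v = interpolated_velocity(xm, ym, t + 0.5 * step, snapshot_dt)
        x, y = x + step * u, y + step * v
    return (x, y)


def subsampling_errors(seed, t_end, base_dt, factors):
    """End-point distance between the pathline at the stored resolution
    and pathlines computed from every f-th snapshot."""
    reference = pathline_end(seed, t_end, base_dt)
    return {f: math.dist(reference, pathline_end(seed, t_end, base_dt * f))
            for f in factors}


errors = subsampling_errors((1.0, 0.0), 2.0, 0.05, [2, 4, 8])
```

In this toy flow the error grows as the snapshots become sparser; it is this trend across subsampling levels that an estimate of the error at the stored resolution would build on.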

Reusing Interactive Analysis Workflows
Subtitled “OSF Preprints,” K. Gadhave, Z.T. Cutler, A. Lex. 2021.

Interactive visual analysis has many advantages, but has the disadvantage that analysis processes and workflows cannot be easily stored and reused, which is in contrast to scripted analysis workflows using a programming language such as Python. In this paper, we introduce methods to semantically capture workflows in interactive visualization systems for different interactions such as selections, filters, categorizing/grouping, labeling, and aggregation. We design these workflows to be robust to updates in the dataset by capturing the semantics of underlying interactions, and, hence, they can be applied to updated datasets. We demonstrate this specification using a prototype that visualizes the data, shows interaction provenance, and allows generating workflows from this provenance. Finally, we introduce a Python library that can consume the workflow and apply it to the datasets, providing a seamless bridge between computational workflows and interactive visualization tools. We demonstrate our techniques using our UI prototype and Jupyter notebooks.

Towards replacing physical testing of granular materials with a Topology-based Model
Subtitled “arXiv preprint arXiv:2109.08777,” A. Venkat, A. Gyulassy, G. Kosiba, A. Maiti, H. Reinstein, R. Gee, P.-T. Bremer, V. Pascucci. 2021.

In the study of packed granular materials, the performance of a sample (e.g., the detonation of a high-energy explosive) often correlates to measurements of a fluid flowing through it. The "effective surface area," the surface area accessible to the airflow, is typically measured using a permeametry apparatus that relates the flow conductance to the permeable surface area via the Carman-Kozeny equation. This equation allows calculating the flow rate of a fluid flowing through the granules packed in the sample for a given pressure drop. However, Carman-Kozeny makes inherent assumptions about tunnel shapes and flow paths that may not accurately hold in situations where the particles possess a wide distribution in shapes, sizes, and aspect ratios, as is true with many powdered systems of technological and commercial interest. To address this challenge, we replicate these measurements virtually on micro-CT images of the powdered material, introducing a new Pore Network Model based on the skeleton of the Morse-Smale complex. Pores are identified as basins of the complex, their incidence encodes adjacency, and the conductivity of the capillary between them is computed from the cross-section at their interface. We build and solve a resistive network to compute an approximate laminar fluid flow through the pore structure. We provide two means of estimating flow-permeable surface area: (i) by direct computation of conductivity, and (ii) by identifying dead-ends in the flow coupled with isosurface extraction and the application of the Carman-Kozeny equation, with the aim of establishing consistency over a range of particle shapes, sizes, porosity levels, and void distribution patterns.
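
The resistive network step can be illustrated with a minimal sketch in which pores are nodes, capillaries are edges with a conductance, and the pressure is fixed at an inlet and an outlet. The Gauss-Seidel relaxation, the data layout, and the function names are assumptions of this example rather than the solver used in the paper.

```python
def solve_pressures(conductance, fixed, iterations=2000):
    """Solve for pore pressures in a resistive network.

    conductance: dict mapping an edge (a, b) to its conductance g.
    fixed: dict mapping boundary nodes (e.g., inlet and outlet) to pressures.
    Interior pressures are relaxed with Gauss-Seidel sweeps so that the net
    flow into every interior pore is zero.
    """
    neighbors = {}
    for (a, b), g in conductance.items():
        neighbors.setdefault(a, []).append((b, g))
        neighbors.setdefault(b, []).append((a, g))
    pressure = {node: fixed.get(node, 0.0) for node in neighbors}
    for _ in range(iterations):
        for node, adjacent in neighbors.items():
            if node in fixed:
                continue
            weighted = sum(g * pressure[m] for m, g in adjacent)
            pressure[node] = weighted / sum(g for _, g in adjacent)
    return pressure


def total_flow(conductance, pressure, inlet):
    """Total flow leaving the inlet node: sum of g * (pressure drop) over its edges."""
    flow = 0.0
    for (a, b), g in conductance.items():
        if a == inlet:
            flow += g * (pressure[a] - pressure[b])
        elif b == inlet:
            flow += g * (pressure[b] - pressure[a])
    return flow


# Two capillaries in series, unit conductance each, unit pressure drop.
network = {("in", "pore"): 1.0, ("pore", "out"): 1.0}
p = solve_pressures(network, {"in": 1.0, "out": 0.0})
q = total_flow(network, p, "in")
```

The ratio of total flow to the applied pressure drop gives the effective conductance of the sample, which is the quantity a permeametry measurement would report.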

Visualizing Interactions Between Solar Photovoltaic Farms and the Atmospheric Boundary Layer
T. M. Athawale, B. J. Stanislawski, S. Sane, C. R. Johnson. In Twelfth ACM International Conference on Future Energy Systems, pp. 377--381. 2021.

The efficiency of solar panels depends on the operating temperature. As the panel temperature rises, efficiency drops. Thus, the solar energy community aims to understand the factors that influence the operating temperature, which include wind speed, wind direction, turbulence, ambient temperature, mounting configuration, and solar cell material. We use high-resolution numerical simulations to model the flow and thermal behavior of idealized solar farms. Because these simulations model such complex behavior, advanced visualization techniques are needed to investigate and understand the results. Here, we present advanced 3D visualizations of numerical simulation results to illustrate the flow and heat transport in an idealized solar farm. The findings can be used to understand how flow behavior influences module temperatures, and vice versa.

Predicting intent behind selections in scatterplot visualizations
K. Gadhave, J. Görtler, Z. Cutler, C. Nobre, O. Deussen, M. Meyer, J.M. Phillips, A. Lex. In Information Visualization, Vol. 20, No. 4, pp. 207-228. 2021.
DOI: 10.1177/14738716211038604

Predicting and capturing an analyst’s intent behind a selection in a data visualization is valuable in two scenarios: First, a successful prediction of a pattern an analyst intended to select can be used to auto-complete a partial selection which, in turn, can improve the correctness of the selection. Second, knowing the intent behind a selection can be used to improve recall and reproducibility. In this paper, we introduce methods to infer analyst’s intents behind selections in data visualizations, such as scatterplots. We describe intents based on patterns in the data, and identify algorithms that can capture these patterns. Upon an interactive selection, we compare the selected items with the results of a large set of computed patterns, and use various ranking approaches to identify the best pattern for an analyst’s selection. We store annotations and the metadata to reconstruct a selection, such as the type of algorithm and its parameterization, in a provenance graph. We present a prototype system that implements these methods for tabular data and scatterplots. Analysts can select a prediction to auto-complete partial selections and to seamlessly log their intents. We discuss implications of our approach for reproducibility and reuse of analysis workflows. We evaluate our approach in a crowd-sourced study, where we show that auto-completing selection improves accuracy, and that we can accurately capture pattern-based intent.
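
The comparison between a selection and a set of precomputed patterns can be sketched as a simple ranking. Jaccard similarity is used here as one plausible ranking measure; the pattern names, the item identifiers, and the function names are hypothetical and only serve the illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of item identifiers."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_patterns(selection, patterns):
    """Rank precomputed patterns by how well they match the selection.

    patterns: dict mapping a pattern name to the set of items it contains.
    Returns (score, name) pairs, best match first.
    """
    scored = [(jaccard(selection, items), name) for name, items in patterns.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


def auto_complete(selection, patterns):
    """Items that would be added if the best-ranked pattern were accepted."""
    _, best = rank_patterns(selection, patterns)[0]
    return set(patterns[best]) - set(selection)


# Hypothetical patterns computed on a scatterplot with items 1..10.
patterns = {"cluster A": {1, 2, 3, 4}, "outliers": {9, 10}, "range": {2, 3}}
ranking = rank_patterns({1, 2, 3}, patterns)
missing = auto_complete({1, 2, 3}, patterns)
```

Storing the name and parameters of the accepted pattern, rather than the raw item list, is what makes the selection reproducible on the same data later.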

Leveraging Topological Events in Tracking Graphs for Understanding Particle Diffusion
T. McDonald, R. Shrestha, X. Yi, H. Bhatia, D. Chen, D. Goswami, V. Pascucci, T. Turbyville, P‐T Bremer. In Computer Graphics Forum, Vol. 40, No. 3, pp. 251-262. 2021.

Single particle tracking (SPT) of fluorescent molecules provides significant insights into the diffusion and relative motion of tagged proteins and other structures of interest in biology. However, despite the latest advances in high-resolution microscopy, individual particles are typically not distinguished from clusters of particles. This lack of resolution obscures potential evidence for how merging and splitting of particles affect their diffusion and any implications on the biological environment. The particle tracks are typically decomposed into individual segments at observed merge and split events, and analysis is performed without knowing the true count of particles in the resulting segments. Here, we address the challenges in analyzing particle tracks in the context of cancer biology. In particular, we study the tracks of KRAS protein, which is implicated in nearly 20% of all human cancers, and whose clustering and aggregation have been linked to the signaling pathway leading to uncontrolled cell growth. We present a new analysis approach for particle tracks by representing them as tracking graphs and using topological events (merging and splitting) to disambiguate the tracks. Using this analysis, we infer a lower bound on the count of particles as they cluster and create conditional distributions of diffusion speeds before and after merge and split events. Using thousands of time-steps of simulated and in-vitro SPT data, we demonstrate the efficacy of our method, as it offers the biologists a new, detailed look into the relationship between KRAS clustering and diffusion speeds.

Investigating In Situ Reduction via Lagrangian Representations for Cosmology and Seismology Applications
S. Sane, C. R. Johnson, H. Childs. In Computational Science -- ICCS 2021, Springer International Publishing, pp. 436--450. 2021.
DOI: 10.1007/978-3-030-77961-0_36

Although many types of computational simulations produce time-varying vector fields, subsequent analysis is often limited to single time slices due to excessive costs. Fortunately, a new approach using a Lagrangian representation can enable time-varying vector field analysis while mitigating these costs. With this approach, a Lagrangian representation is calculated while the simulation code is running, and the result is explored after the simulation. Importantly, the effectiveness of this approach varies based on the nature of the vector field, requiring in-depth investigation for each application area. With this study, we evaluate the effectiveness for previously unexplored cosmology and seismology applications. We do this by considering encumbrance (on the simulation) and accuracy (of the reconstructed result). To inform encumbrance, we integrated in situ infrastructure with two simulation codes, and evaluated on representative HPC environments, performing Lagrangian in situ reduction using GPUs as well as CPUs. To inform accuracy, our study conducted a statistical analysis across a range of spatiotemporal configurations as well as a qualitative evaluation. In all, we demonstrate effectiveness for both cosmology and seismology—time-varying vector fields from these domains can be reduced to less than 1% of the total data via Lagrangian representations, while maintaining accurate reconstruction and requiring under 10% of total execution time in over 80% of our experiments.

Scalable In Situ Computation of Lagrangian Representations via Local Flow Maps
S. Sane, A. Yenpure, R. Bujack, M. Larsen, K. Moreland, C. Garth, C. R. Johnson, H. Childs. In Eurographics Symposium on Parallel Graphics and Visualization, The Eurographics Association, 2021.
DOI: 10.2312/pgv.20211040

In situ computation of Lagrangian flow maps to enable post hoc time-varying vector field analysis has recently become an active area of research. However, the current literature is largely limited to theoretical settings and lacks a solution to address scalability of the technique in distributed memory. To improve scalability, we propose and evaluate the benefits and limitations of a simple, yet novel, performance optimization. Our proposed optimization is a communication-free model resulting in local Lagrangian flow maps, requiring no message passing or synchronization between processes, intrinsically improving scalability, and thereby reducing overall execution time and alleviating the encumbrance placed on simulation codes from communication overheads. To evaluate our approach, we computed Lagrangian flow maps for four time-varying simulation vector fields and investigated how execution time and reconstruction accuracy are impacted by the number of GPUs per compute node, the total number of compute nodes, particles per rank, and storage intervals. Our study consisted of experiments computing Lagrangian flow maps with up to 67M particle trajectories over 500 cycles and used as many as 2048 GPUs across 512 compute nodes. In all, our study contributes an evaluation of a communication-free model as well as a scalability study of computing distributed Lagrangian flow maps at scale using in situ infrastructure on a modern supercomputer.

Distributed merge forest: a new fast and scalable approach for topological analysis at scale
X. Huang, P. Klacansky, S. Petruzza, A. Gyulassy, P.T. Bremer, V. Pascucci. In Proceedings of the ACM International Conference on Supercomputing, pp. 367-377. 2021.

Topological analysis is used in several domains to identify and characterize important features in scientific data, and is now one of the established classes of techniques of proven practical use in scientific computing. The growth in parallelism and problem size tackled by modern simulations poses a particular challenge for these approaches. Fundamentally, the global encoding of topological features necessitates inter process communication that limits their scaling. In this paper, we extend a new topological paradigm to the case of distributed computing, where the construction of a global merge tree is replaced by a distributed data structure, the merge forest, trading slower individual queries on the structure for faster end-to-end performance and scaling. Empirically, the queries that are most negatively affected also tend to have limited practical use. Our experimental results demonstrate the scalability of both the merge forest construction and the parallel queries needed in scientific workflows, and contrast this scalability with the two established alternatives that construct variations of a global tree.
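
For readers unfamiliar with merge trees, the following minimal sketch computes the merge events of a global merge tree for a 1D array of samples, sweeping from high to low values with a union-find structure. This is the classic global construction that the abstract contrasts against, not the distributed merge forest itself; the function name and the event format are assumptions of this example.

```python
def merge_events(values):
    """Sweep 1D samples from high to low and record where components merge.

    Returns a list of (saddle_index, surviving_max_index, absorbed_max_index),
    one entry per merge of two superlevel-set components.
    """
    order = sorted(range(len(values)), key=lambda i: -values[i])
    parent = {}   # union-find forest over the indices processed so far
    highest = {}  # component root -> index of the highest maximum it contains

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    events = []
    for i in order:
        roots = {find(j) for j in (i - 1, i + 1) if j in parent}
        parent[i] = i
        highest[i] = i
        if len(roots) == 2:
            # Two existing components meet at i, so i is a saddle.
            a, b = sorted(roots, key=lambda r: -values[highest[r]])
            events.append((i, highest[a], highest[b]))
        for r in roots:
            # Attach the neighboring component and keep track of its top maximum.
            parent[r] = i
            highest[i] = max(highest[r], highest[i], key=lambda k: values[k])
    return events


samples = [0, 3, 1, 5, 2, 4, 0]
events = merge_events(samples)
```

Every merge event depends on the global ordering of values, which hints at why a single global tree requires communication between processes once the data is distributed.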

NViSII: A Scriptable Tool for Photorealistic Image Generation
Subtitled “arXiv preprint arXiv:2105.13962,” N. Morrical, J. Tremblay, Y. Lin, S. Tyree, S. Birchfield, V. Pascucci, I. Wald. 2021.

We present a Python-based renderer built on NVIDIA's OptiX ray tracing engine and the OptiX AI denoiser, designed to generate high-quality synthetic images for research in computer vision and deep learning. Our tool enables the description and manipulation of complex dynamic 3D scenes containing object meshes, materials, textures, lighting, volumetric data (e.g., smoke), and backgrounds. Metadata, such as 2D/3D bounding boxes, segmentation masks, depth maps, normal maps, material properties, and optical flow vectors, can also be generated. In this work, we discuss design goals, architecture, and performance. We demonstrate the use of data generated by path tracing for training an object detector and pose estimator, showing improved performance in sim-to-real transfer in situations that are difficult for traditional raster-based renderers. We offer this tool as an easy-to-use, performant, high-quality renderer for advancing research in synthetic data generation and deep learning.

Interactive Analysis for Large Volume Data from Fluorescence Microscopy at Cellular Precision
Y. Wan, H.A. Holman, C. Hansen. In Computers & Graphics, Vol. 98, Pergamon, pp. 138-149. 2021.
DOI: 10.1016/j.cag.2021.05.006

The main objective for understanding fluorescence microscopy data is to investigate and evaluate the fluorescent signal intensity distributions as well as their spatial relationships across multiple channels. The quantitative analysis of 3D fluorescence microscopy data needs interactive tools for researchers to select and focus on relevant biological structures. We developed an interactive tool based on volume visualization techniques and GPU computing for streamlining rapid data analysis. Our main contribution is the implementation of common data quantification functions on streamed volumes, providing interactive analyses on large data without lengthy preprocessing. Data segmentation and quantification are coupled with brushing and executed at an interactive speed. A large volume is partitioned into data bricks, and only user-selected structures are analyzed to constrain the computational load. We designed a framework to assemble a sequence of GPU programs to handle brick borders and stitch analysis results. Our tool was developed in collaboration with domain experts and has been used to identify cell types. We demonstrate a workflow to analyze cells in vestibular epithelia of transgenic mice.

Spatio-Temporal Visualization of Interdependent Battery Bus Transit and Power Distribution Systems
A. Bagherinezhad, M. Young, Bei Wang, M. Parvania. In IEEE PES Innovative Smart Grid Technologies Conference (ISGT), IEEE, 2021.

The high penetration of transportation electrification and its associated charging requirements magnify the interdependency of the transportation and power distribution systems. The emergent interdependency requires that system operators fully understand the status of both systems. To this end, a visualization tool is presented to illustrate the interdependency of battery bus transit and power distribution systems and the associated components. The tool aims at monitoring components from both systems, such as the locations of electric buses, the state of charge of batteries, the price of electricity, voltage, current, and active/reactive power flow. The results showcase the success of the visualization tool in monitoring the bus transit and power distribution components to determine a reliable cost-effective scheme for spatio-temporal charging of electric buses.

TopoAct: Visually Exploring the Shape of Activations in Deep Learning
A. Rathore, N. Chalapathi, S. Palande, Bei Wang. In Computer Graphics Forum, Vol. 40, No. 1, pp. 382-397. 2021.

Deep neural networks such as GoogLeNet, ResNet, and BERT have achieved impressive performance in tasks such as image and text classification. To understand how such performance is achieved, we probe a trained deep neural network by studying neuron activations, i.e., combinations of neuron firings, at various layers of the network in response to a particular input. With a large number of inputs, we aim to obtain a global view of what neurons detect by studying their activations. In particular, we develop visualizations that show the shape of the activation space, the organizational principle behind neuron activations, and the relationships of these activations within a layer. Applying tools from topological data analysis, we present TopoAct, a visual exploration system to study topological summaries of activation vectors. We present exploration scenarios using TopoAct that provide valuable insights into learned representations of neural networks. We expect TopoAct to give a topological perspective that enriches the current toolbox of neural network analysis, and to provide a basis for network architecture diagnosis and data anomaly detection.

Mapper Interactive: A Scalable, Extendable, and Interactive Toolbox for the Visual Exploration of High-Dimensional Data
Y. Zhou, N. Chalapathi, A. Rathore, Y. Zhao, Bei Wang. In IEEE Pacific Visualization Symposium, 2021.

The mapper algorithm is a popular tool from topological data analysis for extracting topological summaries of high-dimensional datasets. In this paper, we present Mapper Interactive, a web-based framework for the interactive analysis and visualization of high-dimensional point cloud data. It implements the mapper algorithm in an interactive, scalable, and easily extendable way, thus supporting practical data analysis. In particular, its command-line API can compute mapper graphs for 1 million points of 256 dimensions in about 3 minutes (4 times faster than the vanilla implementation). Its visual interface allows on-the-fly computation and manipulation of the mapper graph based on user-specified parameters and supports the addition of new analysis modules with a few lines of code. Mapper Interactive makes the mapper algorithm accessible to nonspecialists and accelerates topological analytics workflows.
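
For readers unfamiliar with the mapper algorithm mentioned above, here is a minimal pure-Python sketch of its usual three steps: cover the range of a filter function with overlapping intervals, cluster the points in each interval, and connect clusters that share points. It does not use or reproduce the Mapper Interactive API. All names are hypothetical, and the threshold-based single-linkage clustering is a stand-in for whatever clustering method a real toolbox offers.

```python
import math

def cover_intervals(lo, hi, n_intervals, overlap):
    """Split [lo, hi] into n intervals; `overlap` is a fraction of the interval length."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]

def single_linkage(points, indices, eps):
    """Connected components of the graph linking points at distance <= eps."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining if math.dist(points[i], points[j]) <= eps}
            remaining -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(cluster)
    return clusters

def mapper_graph(points, filter_fn, n_intervals, overlap, eps):
    """Return (nodes, edges): nodes are sets of point indices, edges join overlapping nodes."""
    values = [filter_fn(p) for p in points]
    nodes = []
    for a, b in cover_intervals(min(values), max(values), n_intervals, overlap):
        # Small tolerance so floating-point rounding does not drop boundary points.
        members = [i for i, v in enumerate(values) if a - 1e-9 <= v <= b + 1e-9]
        nodes.extend(single_linkage(points, members, eps))
    edges = {(u, v)
             for u in range(len(nodes))
             for v in range(u + 1, len(nodes))
             if nodes[u] & nodes[v]}
    return nodes, edges

# Assumed example: 24 points on a unit circle, filtered by their x coordinate.
circle = [(math.cos(2 * math.pi * k / 24), math.sin(2 * math.pi * k / 24)) for k in range(24)]
nodes, edges = mapper_graph(circle, filter_fn=lambda p: p[0],
                            n_intervals=3, overlap=0.25, eps=0.3)
```

With these parameters the circle is summarized by four nodes joined in a single cycle, which is the kind of topological summary the mapper graph is meant to capture. The all-pairs distance checks here are quadratic and would not scale to the million-point datasets cited in the abstract.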

Loon: Using Exemplars to Visualize Large Scale Microscopy Data
D. Lange, E. Polanco, R. Judson-Torres, T. Zangle, A. Lex. In OSF Preprints, 2021.

Which drug is most promising for a cancer patient? This is a question a new microscopy-based approach for measuring the mass of individual cancer cells treated with different drugs promises to answer in only a few hours. However, the analysis pipeline for extracting data from these images is still far from complete automation: human intervention is necessary for quality control of preprocessing steps such as segmentation, to adjust filters and remove noise, and for the analysis of the result. To address this workflow, we developed Loon, a visualization tool for analyzing drug screening data based on quantitative phase microscopy imaging. Loon visualizes both derived data, such as growth rates, and imaging data. Since the images are collected automatically at a large scale, manual inspection of images and segmentations is infeasible. However, reviewing representative samples of cells is essential, both for quality control and for data analysis. We introduce a new approach of choosing and visualizing representative exemplar cells that retain a close connection to the low-level data. By tightly integrating the derived data visualization capabilities with the novel exemplar visualization and providing selection and filtering capabilities, Loon is well suited for making decisions about which drugs are suitable for a specific patient.

Adaptive Spatially Aware I/O for Multiresolution Particle Data Layouts
W. Usher, X. Huang, S. Petruzza, S. Kumar, S. R. Slattery, S. T. Reeve, F. Wang, C. R. Johnson, V. Pascucci. In IPDPS, 2021.

Evaluation of GPU Volume Rendering in PyTorch Using Data-Parallel Primitives
N. Marshak, P. Grosset, A. Knoll, J. P. Ahrens, C. R. Johnson. In Eurographics Symposium on Parallel Graphics and Visualization (EGPGV), 2021.

Data-parallel programming (DPP) has attracted considerable interest from the visualization community, fostering major software initiatives such as VTK-m. However, there has been relatively little recent investigation of data-parallel APIs in higher-level languages such as Python, which could help developers sidestep the need for low-level application programming in C++ and CUDA. Moreover, machine learning frameworks exposing data-parallel primitives, such as PyTorch and TensorFlow, have exploded in popularity, making them attractive platforms for parallel visualization and data analysis. In this work, we benchmark data-parallel primitives in PyTorch, and investigate its application to GPU volume rendering using two distinct DPP formulations: a parallel scan and reduce over the entire volume, and repeated application of data-parallel operators to an array of rays. We find that most relevant DPP primitives exhibit performance similar to a native CUDA library. However, our volume rendering implementation reveals that PyTorch is limited in expressiveness when compared to other DPP APIs. Furthermore, while render times are sufficient for an early "proof of concept", memory usage acutely limits scalability.
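
To make the "scan and reduce" idea above concrete, the sketch below composites the samples along a single ray using standard front-to-back emission-absorption compositing, written as an exclusive product scan followed by a sum reduction. It is plain Python rather than PyTorch and is not the paper's implementation. The function names and the per-ray framing are assumptions; in PyTorch the analogous primitives would plausibly be `torch.cumprod` and `torch.sum`.

```python
from itertools import accumulate
import operator

def composite_ray(colors, alphas):
    """Composite one ray's samples (front to back) with a scan and a reduce."""
    # Exclusive product scan: the transmittance reaching sample i is the
    # product of (1 - alpha_j) over all samples j in front of it.
    transparencies = [1.0 - a for a in alphas]
    transmittance = list(accumulate([1.0] + transparencies[:-1], operator.mul))
    # Map, then reduce: each sample contributes color * alpha * transmittance.
    return sum(c * a * t for c, a, t in zip(colors, alphas, transmittance))

def composite_ray_loop(colors, alphas):
    """Sequential reference implementation of the same compositing rule."""
    color, trans = 0.0, 1.0
    for c, a in zip(colors, alphas):
        color += trans * a * c
        trans *= 1.0 - a
    return color

# Assumed example values: three grayscale samples, the last one fully opaque.
sample_colors = [1.0, 0.5, 0.25]
sample_alphas = [0.5, 0.5, 1.0]
result = composite_ray(sample_colors, sample_alphas)
```

The scan formulation removes the loop-carried dependency on the running transmittance, which is what lets a data-parallel library evaluate all samples at once. The price is that the whole transmittance array has to be stored, which is one plausible reading of the memory limits noted in the abstract.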

Visualization of Uncertain Multivariate Data via Feature Confidence Level-Sets
S. Sane, T. Athawale, C.R. Johnson. In EuroVis 2021, 2021.

Recent advancements in multivariate data visualization have opened new research opportunities for the visualization community. In this paper, we propose an uncertain multivariate data visualization technique called feature confidence level-sets. Conceptually, feature level-sets refer to level-sets of multivariate data. Our proposed technique extends the existing idea of univariate confidence isosurfaces to multivariate feature level-sets. Feature confidence level-sets are computed by considering the trait for a specific feature, a confidence interval, and the distribution of data at each grid point in the domain. Using uncertain multivariate data sets, we demonstrate the utility of the technique to visualize regions with uncertainty in relation to the specific trait or feature, and the ability of the technique to provide secondary feature structure visualization based on uncertainty.

HyperLabels---Browsing of Dense and Hierarchical Molecular 3D Models
D. Kouřil, T. Isenberg, B. Kozlíková, M. Meyer, E. Gröller, I. Viola. In IEEE Transactions on Visualization and Computer Graphics, IEEE, 2021.
DOI: 10.1109/TVCG.2020.2975583

We present a method for the browsing of hierarchical 3D models in which we combine the typical navigation of hierarchical structures in a 2D environment---using clicks on nodes, links, or icons---with a 3D spatial data visualization. Our approach is motivated by large molecular models, for which the traditional single-scale navigational metaphors are not suitable. Multi-scale phenomena, e.g., in astronomy or geography, are complex to navigate due to their large data spaces and multi-level organization. Models from structural biology are, in addition, densely crowded in space and scale. Cutaways are needed to show individual model subparts. The camera has to support exploration on the level of a whole virus, as well as on the level of a small molecule. We address these challenges by employing HyperLabels: active labels that---in addition to their annotational role---also support user interaction. Clicks on HyperLabels select the next structure to be explored. Then, we adjust the visualization to showcase the inner composition of the selected subpart and enable further exploration. Finally, we use a breadcrumbs panel for orientation and as a mechanism to traverse upwards in the model hierarchy. We demonstrate our concept of hierarchical 3D model browsing using two exemplary models from meso-scale biology.

