A. Quistberg, C.I. Gonzalez, P. Arbeláez, O.L. Sarmiento, L. Baldovino-Chiquillo, Q. Nguyen, T. Tasdizen, L.A.G. Garcia, D. Hidalgo, S.J. Mooney, A.V.D. Roux, G. Lovasi. 430 Training neural networks to identify built environment features for pedestrian safety, In Injury Prevention, Vol. 28, No. 2, BMJ, pp. A65. 2022.
We used panoramic images and neural networks to measure street-level built environment features with relevance to pedestrian safety.
Street-level features were identified from systematic literature search and local experience in Bogota, Colombia (study location). Google Street View© panoramic images were sampled from 10,810 intersection and street segment locations, including 2,642 where pedestrian collisions occurred 2015–2019; the most recent, nearest (<25 meters) available image was selected for each sampled intersection or segment. Human raters annotated image features which were used to train neural networks. Neural networks and human raters were compared across all features using mean Average Recall (mAR) and mean Average Precision (mAP) estimated performance. Feature prevalence was compared by pedestrian vs non-pedestrian collision locations.
Thirty features were identified related to roadway (e.g., medians), crossing areas (e.g., crosswalk), traffic control (e.g., pedestrian signal), and roadside (e.g., trees) with streetlights the most frequently detected object (N=10,687 images). Neural networks achieved mAR=15.4 versus 25.4 for humans, and a mAP=16.0. Bus lanes, pedestrian signals, and pedestrian bridges were significantly more prevalent at pedestrian collision locations, whereas speed bumps, school zones, sidewalks, trees, potholes and streetlights were significantly more prevalent at non-pedestrian collision locations.
Neural networks have substantial potential to obtain timely, accurate built environment data crucial to improve road safety. Training images need to be well-annotated to ensure accurate object detection and completeness.
1) Describe how neural networks can be used for road safety research; 2) Describe challenges of using neural networks.
D. Reed, D. Gannon, J. Dongarra. Reinventing High Performance Computing: Challenges and Opportunities, Subtitled UUSCI-2022-001, University of Utah, 2022.
The world of computing is in rapid transition, now dominated by a world of smartphones and cloud services, with profound implications for the future of advanced scientific computing. Simply put, high-performance computing (HPC) is at an important inflection point. For the last 60 years, the world's fastest supercomputers were almost exclusively produced in the United States on behalf of scientific research in the national laboratories. Change is now in the wind. While costs now stretch the limits of U.S. government funding for advanced computing, Japan and China are now leaders in the bespoke HPC systems funded by government mandates. Meanwhile, the global semiconductor shortage and political battles surrounding fabrication facilities affect everyone. However, another, perhaps even deeper, fundamental change has occurred. The major cloud vendors have invested in global networks of massive scale systems that dwarf today's HPC systems. Driven by the computing demands of AI, these cloud systems are increasingly built using custom semiconductors, reducing the financial leverage of traditional computing vendors. These cloud systems are now breaking barriers in game playing and computer vision, reshaping how we think about the nature of scientific computation. Building the next generation of leading edge HPC systems will require rethinking many fundamentals and historical approaches by embracing end-to-end co-design; custom hardware configurations and packaging; large-scale prototyping, as was common thirty years ago; and collaborative partnerships with the dominant computing ecosystem companies, smartphone, and cloud computing vendors.
J.R. Reimer, F.R. Adler, K.M. Golden, A. Narayan. Uncertainty quantification for ecological models with random parameters, In Ecology Letters, Wiley, pp. 1--13. 2022.
There is often considerable uncertainty in parameters in ecological models. This uncertainty can be incorporated into models by treating parameters as random variables with distributions, rather than fixed quantities. Recent advances in uncertainty quantification methods, such as polynomial chaos approaches, allow for the analysis of models with random parameters. We introduce these methods with a motivating case study of sea ice algal blooms in heterogeneous environments. We compare Monte Carlo methods with polynomial chaos techniques to help understand the dynamics of an algal bloom model with random parameters. Modelling key parameters in the algal bloom model as random variables changes the timing, intensity and overall productivity of the modelled bloom. The computational efficiency of polynomial chaos methods provides a promising avenue for the broader inclusion of parametric uncertainty in ecological models, leading to improved model predictions and synthesis between models and data.
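As a rough illustration of the comparison described in this abstract, the sketch below contrasts Monte Carlo sampling with a Hermite polynomial chaos expansion on a toy problem. The growth model `exp(r * t)`, the normal distribution assumed for `r`, and all function names are assumptions made for illustration; none of them are taken from the paper or its algal bloom model.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

def model(r, t=1.0):
    # Toy model (assumption): exponential growth with an uncertain rate r.
    return np.exp(r * t)

def pce_coefficients(mu, sigma, t=1.0, order=6, n_quad=20):
    # Project the model onto probabilists' Hermite polynomials He_k,
    # with r = mu + sigma * xi and xi ~ N(0, 1).
    x, w = hermegauss(n_quad)
    w = w / math.sqrt(2.0 * math.pi)  # normalize to the standard normal density
    f = model(mu + sigma * x, t)
    coeffs = []
    for k in range(order + 1):
        basis = hermeval(x, [0.0] * k + [1.0])  # He_k evaluated at the nodes
        coeffs.append(np.sum(w * f * basis) / math.factorial(k))
    return np.array(coeffs)

def pce_mean_var(coeffs):
    # Mean is the zeroth coefficient; variance uses E[He_k^2] = k!.
    mean = coeffs[0]
    var = sum(c ** 2 * math.factorial(k) for k, c in enumerate(coeffs) if k > 0)
    return mean, var

def mc_mean_var(mu, sigma, t=1.0, n=20000, seed=0):
    # Plain Monte Carlo estimate of the same two moments.
    rng = np.random.default_rng(seed)
    y = model(rng.normal(mu, sigma, n), t)
    return y.mean(), y.var()
```

In this toy setting the expansion needs only 20 model evaluations, whereas the Monte Carlo estimate uses 20,000 samples and still carries sampling noise; how this trade-off plays out for a real ecological model depends on the model's smoothness and the number of random parameters.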
L.C. Rupp, B. Zenger, J.A. Bergquist, A. Busatto, R.S. MacLeod. The Role of Beta-1 Receptors in the Response to Myocardial Ischemia, In Computing in Cardiology, Vol. 49, 2022.
Acute myocardial ischemia is commonly diagnosed by ST-segment deviations. These deviations, however, can show a paradoxical recovery even in the face of ongoing ischemic stress. A possible mechanism for this response may be the cardio-protective effects of the autonomic nervous system (ANS) via beta-1 receptors. We assessed the role of norepinephrine (NE), a beta-1 agonist, and esmolol (ES), a beta-1 antagonist, in the recovery of ST-segment deviations during myocardial ischemia. We used an experimental model of controlled myocardial ischemia in which we simultaneously recorded electrograms intramurally and on the epicardial surface. We measured ischemia as deviations in the potentials measured at 40% of the ST-segment duration. During control intervention, 27% of epicardial electrodes showed no ischemic ST-segment deviations, whereas during the interventions with NE and ES, 100% of epicardial electrodes showed no ischemic ST-segment deviations. Intramural electrodes revealed a different behavior with 71% of electrodes showing no ischemic ST-segment deviations during control ischemia, increasing to 79% and 82% for NE infusion and ES infusion interventions, respectively. These preliminary results suggest that recovery of intramural regions of the heart is delayed by the presence of both beta-1 agonists and antagonists even as epicardial potentials show almost complete recovery.
S. Saha, O. Choi, R. Whitaker. Few-Shot Segmentation of Microscopy Images Using Gaussian Process, In Medical Optical Imaging and Virtual Microscopy Image Analysis, Springer Nature Switzerland, pp. 94--104. 2022.
Few-shot segmentation has received recent attention because of its promise to segment images containing novel classes based on a handful of annotated examples. Few-shot-based machine learning methods build generic and adaptable models that can quickly learn new tasks. This approach finds potential application in many scenarios that do not benefit from large repositories of labeled data, which strongly impacts the performance of the existing data-driven deep-learning algorithms. This paper presents a few-shot segmentation method for microscopy images that combines a neural-network architecture with a Gaussian-process (GP) regression. The GP regression is used in the latent space of an autoencoder-based segmentation model to learn the distribution of functions from the encoded image representations to the corresponding representation of the segmentation masks in the support set. This regression analysis serves as the prior for predicting the segmentation mask for the query image. The rich latent representation built by the GP using examples in the support set significantly impacts the performance of the segmentation model, demonstrated by extensive experimental evaluation.
S. Sane, C. R. Johnson, H. Childs. Demonstrating the viability of Lagrangian in situ reduction on supercomputers, In Journal of Computational Science, Vol. 61, Elsevier, 2022.
Performing exploratory analysis and visualization of large-scale time-varying computational science applications is challenging due to inaccuracies that arise from under-resolved data. In recent years, Lagrangian representations of the vector field computed using in situ processing are being increasingly researched and have emerged as a potential solution to enable exploration. However, prior works have offered limited estimates of the encumbrance on the simulation code as they consider “theoretical” in situ environments. Further, the effectiveness of this approach varies based on the nature of the vector field, benefitting from an in-depth investigation for each application area. With this study, an extended version of Sane et al. (2021), we contribute an evaluation of Lagrangian analysis viability and efficacy for simulation codes executing at scale on a supercomputer. We investigated previously unexplored cosmology and seismology applications as well as conducted a performance benchmarking study by using a hydrodynamics mini-application targeting exascale computing. To inform encumbrance, we integrated in situ infrastructure with simulation codes, and evaluated Lagrangian in situ reduction in representative homogeneous and heterogeneous HPC environments. To inform post hoc accuracy, we conducted a statistical analysis across a range of spatiotemporal configurations as well as a qualitative evaluation. Additionally, our study contributes cost estimates for distributed-memory post hoc reconstruction. In all, we demonstrate viability for each application — data reduction to less than 1% of the total data via Lagrangian representations, while maintaining accurate reconstruction and requiring under 10% of total execution time in over 90% of our experiments.
S. Subramanian, R.M. Kirby, M.W. Mahoney, A. Gholami. Adaptive Self-supervision Algorithms for Physics-informed Neural Networks, Subtitled arXiv:2207.04084, 2022.
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function, but recent work has shown that this can lead to optimization difficulties. Here, we study the impact of the location of the collocation points on the trainability of these models. We find that the vanilla PINN performance can be significantly boosted by adapting the location of the collocation points as training proceeds. Specifically, we propose a novel adaptive collocation scheme which progressively allocates more collocation points (without increasing their number) to areas where the model is making higher errors (based on the gradient of the loss function in the domain). This, coupled with a judicious restarting of the training during any optimization stalls (by simply resampling the collocation points in order to adjust the loss landscape) leads to better estimates for the prediction error. We present results for several problems, including a 2D Poisson and diffusion-advection system with different forcing functions. We find that training vanilla PINNs for these problems can result in up to 70% prediction error in the solution, especially in the regime of low collocation points. In contrast, our adaptive schemes can achieve up to an order of magnitude smaller error, with similar computational complexity as the baseline. Furthermore, we find that the adaptive methods consistently perform on-par or slightly better than vanilla PINN method, even for large collocation point regimes. The code for all the experiments has been open sourced.
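The idea of moving a fixed budget of collocation points toward high-error regions can be sketched as a weighted resampling step. This is a minimal sketch under stated assumptions, not the authors' scheme: the function name `resample_collocation`, the `uniform_frac` mixing parameter, and the use of a generic per-point `score` (standing in for the loss-gradient magnitude mentioned in the abstract) are all hypothetical.

```python
import numpy as np

def resample_collocation(candidates, score, n_points, uniform_frac=0.1, seed=0):
    # Draw n_points distinct collocation points from a candidate pool,
    # favoring candidates with a large score while keeping some uniform coverage.
    rng = np.random.default_rng(seed)
    candidates = np.asarray(candidates)
    score = np.abs(np.asarray(score, dtype=float))
    n = len(candidates)
    uniform = np.full(n, 1.0 / n)
    total = score.sum()
    adaptive = score / total if total > 0 else uniform
    # Mix the adaptive and uniform distributions (assumed design choice).
    p = (1.0 - uniform_frac) * adaptive + uniform_frac * uniform
    idx = rng.choice(n, size=n_points, replace=False, p=p)
    return candidates[idx]
```

The number of points stays constant across resampling steps, matching the abstract's statement that points are reallocated without increasing their number; the uniform share is an added safeguard so that low-score regions are never left entirely unsampled.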
T. Sun, D. Li, B. Wang. Adaptive Random Walk Gradient Descent for Decentralized Optimization, In Proceedings of the 39th International Conference on Machine Learning, 2022.
In this paper, we study the adaptive step size random walk gradient descent with momentum for decentralized optimization, in which the training samples are drawn dependently with each other. We establish theoretical convergence rates of the adaptive step size random walk gradient descent with momentum for both convex and nonconvex settings. In particular, we prove that adaptive random walk algorithms perform as well as the nonadaptive method for dependent data in general cases but achieve acceleration when the stochastic gradients are “sparse”. Moreover, we study the zeroth-order version of adaptive random walk gradient descent and provide corresponding convergence results. All assumptions used in this paper are mild and general, making our results applicable to many machine learning problems.
T. Sun, D. Li, B. Wang. Finite-Time Analysis of Adaptive Temporal Difference Learning with Deep Neural Networks, In 36th Conference on Neural Information Processing Systems (NeurIPS 2022), October, 2022.
Temporal difference (TD) learning with function approximations (linear functions or neural networks) has achieved remarkable empirical success, giving impetus to the development of finite-time analysis. As an accelerated version of TD, the adaptive TD has been proposed and proved to enjoy finite-time convergence under the linear function approximation. Existing numerical results have demonstrated the superiority of adaptive algorithms to vanilla ones. Nevertheless, the performance guarantee of adaptive TD with neural network approximation remains widely unknown. This paper establishes the finite-time analysis for the adaptive TD with multi-layer ReLU networks approximation whose samples are generated from a Markov decision process. Our established theory shows that if the width of the deep neural network is large enough, the adaptive TD using neural network approximation can find the (optimal) value function with high probabilities under the same iteration complexity as TD in general cases. Furthermore, we show that the adaptive TD using neural network approximation, with the same width and searching area, can achieve theoretical acceleration when the stochastic semigradients decay fast.
G. Tarcea, B. Puchala, T. Berman, G. Scorzelli, V. Pascucci, M. Taufer, J. Allison. The Materials Commons Data Repository, In 2022 IEEE 18th International Conference on e-Science (e-Science), pp. 405--406. 2022.
Repositories are increasingly used for publishing and sharing scientific data. The Materials Commons is a data repository that follows the FAIR (Findable, Accessible, Inter-operable, Reusable) principles. We demonstrate the challenges with FAIR and how Materials Commons solves them. We also discuss the National Science Data Fabric (NSDF), a project that is democratizing data access, and show how Materials Commons with the NSDF software stack accelerates data access and scientific research.
M. Toloubidokhti, N. Kumar, Z. Li, P. K. Gyawali, B. Zenger, W. W. Good, R. S. MacLeod, L. Wang. Interpretable Modeling and Reduction of Unknown Errors in Mechanistic Operators, In Medical Image Computing and Computer Assisted Intervention -- MICCAI 2022, Springer Nature Switzerland, pp. 459--468. 2022.
Prior knowledge about the imaging physics provides a mechanistic forward operator that plays an important role in image reconstruction, although myriad sources of possible errors in the operator could negatively impact the reconstruction solutions. In this work, we propose to embed the traditional mechanistic forward operator inside a neural function, and focus on modeling and correcting its unknown errors in an interpretable manner. This is achieved by a conditional generative model that transforms a given mechanistic operator with unknown errors, arising from a latent space of self-organizing clusters of potential sources of error generation. Once learned, the generative model can be used in place of a fixed forward operator in any traditional optimization-based reconstruction process where, together with the inverse solution, the error in prior mechanistic forward operator can be minimized and the potential source of error uncovered. We apply the presented method to the reconstruction of heart electrical potential from body surface potential. In controlled simulation experiments and in-vivo real data experiments, we demonstrate that the presented method allowed reduction of errors in the physics-based forward operator and thereby delivered inverse reconstruction of heart-surface potential with increased accuracy.
D. Tong, N. Soley, R. Kolasangiani, M.A. Schwartz, T.C. Bidone. αIIbβ3 integrin intermediates: from molecular dynamics to adhesion assembly, In Biophysical Journal, 2022.
The platelet integrin αIIbβ3 undergoes long range conformational transitions associated with its functional conversion from inactive (low affinity) to active (high affinity) states during hemostasis. Although new conformations intermediate between the well-characterized bent and extended states have been identified, their molecular dynamic properties and functions in the assembly of adhesions remain largely unexplored. In this study, we evaluated the properties of intermediate conformations of integrin αIIbβ3 and characterized their effects on the assembly of adhesions by combining all-atom simulations, principal component analysis, and mesoscale modeling. Our results show that in the low affinity, bent conformation, the integrin ectodomain tends to pivot around the legs; in intermediate conformations the upper headpiece becomes partially extended, away from the lower legs. In the fully open, active state, αIIbβ3 is flexible and the motions between upper headpiece and lower legs are accompanied by fluctuations of the transmembrane helices. At the mesoscale, bent integrins form only unstable adhesions, but intermediate or open conformations stabilize the adhesions. These studies reveal a mechanism by which small variations in ligand binding affinity and enhancement of the ligand-bound lifetime in the presence of actin retrograde flow stabilize αIIbβ3 integrin adhesions.
H. D. Tran, M. Fernando, K. Saurabh, B. Ganapathysubramanian, R. M. Kirby, H. Sundar. A scalable adaptive-matrix SPMV for heterogeneous architectures, In 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), IEEE, pp. 13--24. 2022.
In most computational codes, the core computational kernel is the Sparse Matrix-Vector product (SpMV) that enables specialized linear algebra libraries like PETSc to be used, especially in the distributed memory setting. However, optimizing SpMV performance and scalability at all levels of a modern heterogeneous architecture can be challenging as it is characterized by irregular memory access. This work presents a hybrid approach (HyMV) for evaluating SpMV for matrices arising from PDE discretization schemes such as the finite element method (FEM). The approach enables localized structured memory access that provides improved performance and scalability. Additionally, it simplifies the programmability and portability on different architectures. The developed HyMV approach enables efficient parallelization using MPI, SIMD, OpenMP, and CUDA with minimum programming effort. We present a detailed comparison of HyMV with the two traditional approaches in computational code, matrix-assembled and matrix-free approaches, for structured and unstructured meshes. Our results demonstrate that the HyMV approach achieves excellent scalability and outperforms both approaches, e.g., achieving average speedups of 11x for matrix setup, 1.7x for SpMV with structured meshes, 3.6x for SpMV with unstructured meshes, and 7.5x for GPU SpMV.
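For readers unfamiliar with the kernel, the matrix-assembled baseline mentioned in the abstract can be illustrated with a plain compressed sparse row (CSR) SpMV. This is a sketch of the standard CSR product, not of HyMV; the function name `csr_spmv` and the argument layout are assumptions chosen for illustration.

```python
import numpy as np

def csr_spmv(data, indices, indptr, x):
    # Compute y = A @ x for a matrix stored in CSR form:
    #   data    - nonzero values, row by row
    #   indices - column index of each nonzero
    #   indptr  - start offset of each row in data/indices (length n_rows + 1)
    data = np.asarray(data, dtype=float)
    indices = np.asarray(indices)
    x = np.asarray(x, dtype=float)
    y = np.zeros(len(indptr) - 1)
    for row in range(len(y)):
        start, end = indptr[row], indptr[row + 1]
        # The gather x[indices[...]] is the irregular memory access
        # that makes SpMV hard to optimize.
        y[row] = np.dot(data[start:end], x[indices[start:end]])
    return y
```

The indirect lookup through `indices` is where the irregular access pattern described in the abstract arises; a production code would rely on an optimized library routine rather than a Python loop.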
W. Usher, J. Amstutz, J. Günther, A. Knoll, G. P. Johnson, C. Brownlee, A. Hota, B. Cherniak, T. Rowley, J. Jeffers, V. Pascucci. Scalable CPU Ray Tracing for In Situ Visualization Using OSPRay, In In Situ Visualization for Computational Science, Springer International Publishing, pp. 353--374. 2022.
In situ visualization increasingly involves rendering large numbers of images for post hoc exploration. As both the number of images to be rendered and the data being rendered are large, the scalability of the rendering component is of key concern. Furthermore, the renderer must be able to support a wide range of data distributions, simulation configurations, and HPC systems to provide the flexibility required for a portable, general purpose in situ rendering package. In this chapter, we discuss recent developments in OSPRay’s support for MPI-parallel applications to provide a flexible and scalable rendering API, with a focus on how these developments can be applied to enable scalable, high-quality in situ visualization.
A. Venkat, D. Hoang, A. Gyulassy, P.T. Bremer, F. Federer, V. Pascucci. High-Quality Progressive Alignment of Large 3D Microscopy Data, In 2022 IEEE 12th Symposium on Large Data Analysis and Visualization (LDAV), pp. 1--10. 2022.
Large-scale three-dimensional (3D) microscopy acquisitions frequently create terabytes of image data at high resolution and magnification. Imaging large specimens at high magnifications requires acquiring 3D overlapping image stacks as tiles arranged on a two-dimensional (2D) grid that must subsequently be aligned and fused into a single 3D volume. Due to their sheer size, aligning many overlapping gigabyte-sized 3D tiles in parallel and at full resolution is memory intensive and often I/O bound. Current techniques trade accuracy for scalability, perform alignment on subsampled images, and require additional postprocess algorithms to refine the alignment quality, usually with high computational requirements. One common solution to the memory problem is to subdivide the overlap region into smaller chunks (sub-blocks) and align the sub-block pairs in parallel, choosing the pair with the most reliable alignment to determine the global transformation. Yet aligning all sub-block pairs at full resolution remains computationally expensive. The key to quickly developing a fast, high-quality, low-memory solution is to identify a single or a small set of sub-blocks that give good alignment at full resolution without touching all the overlapping data. In this paper, we present a new iterative approach that leverages coarse resolution alignments to progressively refine and align only the promising candidates at finer resolutions, thereby aligning only a small user-defined number of sub-blocks at full resolution to determine the lowest error transformation between pairwise overlapping tiles. Our progressive approach is 2.6x faster than the state of the art, requires less than 450MB of peak RAM (per parallel thread), and offers a higher quality alignment without the need for additional postprocessing refinement steps to correct for alignment errors.
Z. Wang, Y. Xu, C. Tillinghast, S. Li, A. Narayan, S. Zhe. Nonparametric Embeddings of Sparse High-Order Interaction Events, In Proceedings of the 39th International Conference on Machine Learning, PMLR, pp. 23237--23253. 2022.
High-order interaction events are common in real-world applications. Learning embeddings that encode the complex relationships of the participants from these events is of great importance in knowledge mining and predictive tasks. Despite the success of existing approaches, e.g., Poisson tensor factorization, they ignore the sparse structure underlying the data, namely that the interactions that occur are far fewer than the possible interactions among all the participants. In this paper, we propose Nonparametric Embeddings of Sparse High-order interaction events (NESH). We hybridize a sparse hypergraph (tensor) process and a matrix Gaussian process to capture both the asymptotic structural sparsity within the interactions and nonlinear temporal relationships between the participants. We prove strong asymptotic bounds (including both a lower and an upper bound) of the sparse ratio, which reveals the asymptotic properties of the sampled structure. We use batch-normalization, stick-breaking construction and sparse variational GP approximations to develop an efficient, scalable model inference algorithm. We demonstrate the advantage of our approach in several real-world applications.
Z. Wang, M. Dorier, P. Subedi, P.E. Davis, M. Parashar. Adaptive elasticity policies for staging-based in situ visualization, In Future Generation Computer Systems, 2022.
In situ processing aims to alleviate the growing gap between computation and I/O capabilities by performing data processing close to the data source. In situ processing is widely used to process data generated by multiple data sources, including observation data from edge devices or scientific observational facilities and the simulation data generated by scientific computation on a high-performance computing (HPC) platform. For a scientific workflow that is run on an HPC platform and composed of a simulation program and an in situ data analytics or visualization (abbreviated as ana/vis) task, there is an implicit assumption that the computing resources assigned to the workflow remain static during the workflow execution. However, with the convergence of HPC and cloud computing platforms, running the in situ ana/vis task elastically promises to decrease its overhead and improve its resource utilization rate. Resource elasticity represents the ability to change resource configurations such as the number of computing nodes/processes during workflow execution. An elastic job may dynamically adjust resource configurations; it may use a few resources at the beginning and more resources toward the end of the job when interesting data appear. However, it is hard to predict a priori how many computing nodes/processes need to be added/removed during the workflow execution to adapt to changing workflow needs. How to efficiently guide elasticity operations, such as growing or shrinking the number of processes used for in situ analysis during workflow execution, is an open-ended research question. In this article, we present adaptive elasticity policies that adopt workflow runtime information collected during workflow execution to predict how to trigger the addition/removal of processes in order to minimize in situ processing overhead.
Taking in situ visualization tasks as an example, we integrate the presented elasticity policies into a staging-based elastic workflow and evaluate its efficiency in multiple elasticity scenarios. Compared with the situation without elasticity or with a static elasticity policy that uses a fixed number of processes for each rescaling operation, the adaptive elasticity policy can save overhead in finding a proper resource configuration and improve resource utilization efficiency. For example, one experiment illustrates that the adaptive elasticity policy saves 41% of core-hours compared with the situation without the resource elasticity.
V. Zala, A. Narayan, R.M. Kirby. Convex Optimization-Based Structure-Preserving Filter For Multidimensional Finite Element Simulations, Subtitled arXiv preprint arXiv:2203.09748, 2022.
In simulation sciences, it is desirable to capture the real-world problem features as accurately as possible. Methods popular for scientific simulations such as the finite element method (FEM) and finite volume method (FVM) use piecewise polynomials to approximate various characteristics of a problem, such as the concentration profile and the temperature distribution across the domain. Polynomials are prone to creating artifacts such as Gibbs oscillations while capturing a complex profile. An efficient and accurate approach must be applied to deal with such inconsistencies in order to obtain accurate simulations. This often entails dealing with negative values for the concentration of chemicals, exceeding a percentage value over 100, and other such problems. We consider these inconsistencies in the context of partial differential equations (PDEs). We propose an innovative filter based on convex optimization to deal with the inconsistencies observed in polynomial-based simulations. In two or three spatial dimensions, additional complexities are involved in solving the problems related to structure preservation. We present the construction and application of a structure-preserving filter with a focus on multidimensional PDEs. Methods used such as the Barycentric interpolation for polynomial evaluation at arbitrary points in the domain and an optimized root-finder to identify points of interest improve the filter efficiency, usability, and robustness. Lastly, we present numerical experiments in 2D and 3D using discontinuous Galerkin formulation and demonstrate the filter's efficacy to preserve the desired structure. As a real-world application …
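To give a flavor of what a convex-optimization-based structure-preserving step looks like, the sketch below solves a much simpler pointwise analogue: find the closest vector (in the least-squares sense) to a set of sampled values that is nonnegative and keeps the same sum. This is not the filter constructed in the paper, which acts on polynomial representations in a finite element setting; the function name `project_nonneg_fixed_sum` and the choice of constraints are assumptions for illustration. The problem is solved with the standard sort-based projection onto a scaled simplex.

```python
import numpy as np

def project_nonneg_fixed_sum(u):
    # Solve: minimize ||v - u||^2  subject to  v >= 0  and  sum(v) == sum(u).
    # This is a Euclidean projection onto a scaled simplex.
    u = np.asarray(u, dtype=float)
    total = u.sum()
    if total <= 0:
        raise ValueError("sum of values must be positive for a nonnegative result")
    s = np.sort(u)[::-1]                      # values in descending order
    cssv = np.cumsum(s) - total               # cumulative sums shifted by the target sum
    k = np.arange(1, len(u) + 1)
    rho = np.nonzero(s - cssv / k > 0)[0][-1] # last index that stays positive
    theta = cssv[rho] / (rho + 1)             # uniform shift applied to all entries
    return np.maximum(u - theta, 0.0)
```

For example, the values `[0.6, 0.5, -0.1]` (a concentration profile with a small negative undershoot) are mapped to `[0.55, 0.45, 0.0]`: the negative entry is clipped and the removed amount is redistributed so that the total is unchanged.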
B. Zhang, P. Subedi, P. E. Davis, F. Rizzi, K. Teranishi, M. Parashar. Assembling Portable In-Situ Workflow from Heterogeneous Components using Data Reorganization, In 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 41--50. 2022.
Heterogeneous computing is becoming common in the HPC world. The fast-changing hardware landscape is pushing programmers and developers to rely on performance-portable programming models to rewrite old and legacy applications and develop new ones. While this approach is suitable for individual applications, outstanding challenges still remain when multiple applications are combined into complex workflows. One critical difficulty is the exchange of data between communicating applications where performance constraints imposed by heterogeneous hardware advantage different data layouts. We attempt to solve this problem by exploring asynchronous data layout conversions for applications requiring different memory access patterns for shared data. We implement the proposed solution within the DataSpaces data staging service, extending it to support heterogeneous application workflows across a broad spectrum of programming models. In addition, we integrate heterogeneous DataSpaces with the Kokkos programming model and propose the Kokkos Staging Space as an extension of the Kokkos data abstraction. This new abstraction enables us to express data on a virtual shared space for multiple Kokkos applications, thus guaranteeing the portability of each application when assembling them into an efficient heterogeneous workflow. We present performance results for the Kokkos Staging Space using a synthetic workflow emulator and three different scenarios representing access frequency and use patterns in shared data. The results show that the Kokkos Staging Space is a superior solution in terms of time-to-solution and scalability compared to existing file-based Kokkos data abstractions for inter-application data exchange.
L. Zhou, M. Fan, C. Hansen, C. R. Johnson, D. Weiskopf. A Review of Three-Dimensional Medical Image Visualization, In Health Data Science, Vol. 2022, 2022.
Importance. Medical images are essential for modern medicine and an important research subject in visualization. However, medical experts are often not aware of the many advanced three-dimensional (3D) medical image visualization techniques that could increase their capabilities in data analysis and assist the decision-making process for specific medical problems. Our paper provides a review of 3D visualization techniques for medical images, intending to bridge the gap between medical experts and visualization researchers. Highlights. Fundamental visualization techniques are revisited for various medical imaging modalities, from computational tomography to diffusion tensor imaging, featuring techniques that enhance spatial perception, which is critical for medical practices. The state-of-the-art of medical visualization is reviewed based on a procedure-oriented classification of medical problems for studies of individuals and populations. This paper summarizes free software tools for different modalities of medical images designed for various purposes, including visualization, analysis, and segmentation, and it provides respective Internet links. Conclusions. Visualization techniques are a useful tool for medical experts to tackle specific medical problems in their daily work. Our review provides a quick reference to such techniques given the medical problem and modalities of associated medical images. We summarize fundamental techniques and readily available visualization tools to help medical experts to better understand and utilize medical imaging data. This paper could contribute to the joint effort of the medical and visualization communities to advance precision medicine.