SCIENTIFIC COMPUTING AND IMAGING INSTITUTE
at the University of Utah

An internationally recognized leader in visualization, scientific computing, and image analysis

SCI Publications

2013


J.T. Elison, S.J. Paterson, J.J. Wolff, J.S. Reznick, N.J. Sasson, H. Gu, K.N. Botteron, S.R. Dager, A.M. Estes, A.C. Evans, G. Gerig, H.C. Hazlett, R.T. Schultz, M. Styner, L. Zwaigenbaum, J. Piven for the IBIS Network. “White Matter Microstructure and Atypical Visual Orienting in 7-Month-Olds at Risk for Autism,” In American Journal of Psychiatry, Vol. AJP-12-09-1150.R2, March, 2013.
DOI: 10.1176/appi.ajp.2012.12091150
PubMed ID: 23511344

ABSTRACT

Objective: To determine whether specific patterns of oculomotor functioning and visual orienting characterize 7-month-old infants later classified with an autism spectrum disorder (ASD) and to identify the neural correlates of these behaviors.

Method: Ninety-seven infants contributed data to the current study (16 high-familial risk infants later classified with an ASD, 40 high-familial risk infants not meeting ASD criteria (high-risk-negative), and 41 low-risk infants). All infants completed an eye tracking task at 7 months and a clinical assessment at 25 months; diffusion weighted imaging data were acquired on 84 infants at 7 months. Primary outcome measures included average saccadic reaction time in a visually guided saccade procedure and radial diffusivity (an index of white matter organization) in fiber tracts that included corticospinal pathways and the splenium and genu of the corpus callosum.

Results: Visual orienting latencies were increased in 7-month-old infants who later expressed ASD symptoms at 25 months when compared with both high-risk-negative infants (p = 0.012, d = 0.73) and low-risk infants (p = 0.032, d = 0.71). Visual orienting latencies were uniquely associated with the microstructural organization of the splenium of the corpus callosum in low-risk infants, but this association was not apparent in infants later classified with ASD.

Conclusions: Flexibly and efficiently orienting to salient information in the environment is critical for subsequent cognitive and social-cognitive development. Atypical visual orienting may represent an early-emerging prodromal feature of ASD, and abnormal functional specialization of posterior cortical circuits directly informs a novel model of ASD pathogenesis.



B. Erem, J. Coll-Font, R.M. Orellana, P. Stovicek, D.H. Brooks, R.S. MacLeod. “Noninvasive reconstruction of potentials on endocardial surface from body surface potentials and CT imaging of partial torso,” In Journal of Electrocardiology, Vol. 46, No. 4, pp. e28. 2013.
DOI: 10.1016/j.jelectrocard.2013.05.104



B. Erem, R.M. Orellana, P. Stovicek, D.H. Brooks, R.S. MacLeod. “Improved averaging of multi-lead ECGs and electrograms,” In Journal of Electrocardiology, Vol. 46, No. 4, Elsevier, pp. e28. July, 2013.
DOI: 10.1016/j.jelectrocard.2013.05.103



T. Etiene, D. Jonsson, T. Ropinski, C. Scheidegger, J. Comba, L.G. Nonato, R.M. Kirby, A. Ynnerman, C.T. Silva. “Verifying Volume Rendering Using Discretization Error Analysis,” SCI Technical Report, No. UUSCI-2013-001, SCI Institute, University of Utah, 2013.

ABSTRACT

We propose an approach for verification of volume rendering correctness based on an analysis of the volume rendering integral, the basis for most direct volume rendering (DVR) algorithms. With respect to the most common discretization of this continuous model, we make assumptions about the impact of parameter changes on the rendered results and derive convergence curves describing the expected behavior. Specifically, we progressively refine the number of samples along the ray, the grid size, and the pixel size, and evaluate how the errors observed during refinement compare against the expected approximation errors. We derive the theoretical foundations of our verification approach, explain how to realize it in practice, and discuss its limitations as well as the identified errors.

Keywords: discretization errors, volume rendering, verifiable visualization
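As an illustration of the refinement-based verification idea summarized above, the following sketch (our own, not the report's code) estimates the observed order of accuracy from errors measured under progressive refinement of a discretization parameter h; the observed order can then be compared against the expected approximation order to flag implementation errors. The function name and synthetic errors are assumptions for illustration.

import numpy as np

def observed_order(hs, errors):
    """Least-squares slope of log(error) versus log(h)."""
    hs, errors = np.asarray(hs, float), np.asarray(errors, float)
    slope, _ = np.polyfit(np.log(hs), np.log(errors), 1)
    return slope

# Synthetic errors that decay like O(h^2), standing in for errors measured
# while refining the ray step size, grid spacing, or pixel size.
hs = np.array([1/32, 1/64, 1/128, 1/256])
errors = 3.0 * hs**2 + 1e-9
print(observed_order(hs, errors))  # ~2.0, matching the expected order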



K. Fakhar, E. Hastings, C.R. Butson, K.D. Foote, P. Zeilman, M.S. Okun. “Management of deep brain stimulator battery failure: battery estimators, charge density, and importance of clinical symptoms,” In PLoS One, Vol. 8, No. 3, pp. e58665. January, 2013.
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0058665
PubMed ID: 23536810

ABSTRACT

We aimed in this investigation to study deep brain stimulation (DBS) battery drain with special attention directed toward patient symptoms prior to and following battery replacement.



M. Farzinfar, Y. Li, A.R. Verde, I. Oguz, G. Gerig, M.A. Styner. “DTI Quality Control Assessment via Error Estimation From Monte Carlo Simulations,” In Proceedings of SPIE Medical Imaging 2013: Image Processing, Vol. 8669, 2013.
DOI: 10.1117/12.2006925
PubMed ID: 23833547
PubMed Central ID: PMC3702180

ABSTRACT

Diffusion Tensor Imaging (DTI) is currently the state-of-the-art method for characterizing the microscopic tissue structure of white matter in normal or diseased brain in vivo. DTI is estimated from a series of Diffusion Weighted Imaging (DWI) volumes. DWIs suffer from a number of artifacts which mandate stringent Quality Control (QC) schemes to eliminate lower quality images for optimal tensor estimation. Conventionally, QC procedures exclude artifact-affected DWIs from subsequent computations leading to a cleaned, reduced set of DWIs, called DWI-QC. Often, a rejection threshold is heuristically/empirically chosen above which the entire DWI-QC data is rendered unacceptable and thus no DTI is computed. In this work, we have devised a more sophisticated, Monte Carlo (MC) simulation-based method for the assessment of resulting tensor properties. This allows for a consistent, error-based threshold definition in order to reject/accept the DWI-QC data. Specifically, we propose the estimation of two error metrics related to directional distribution bias of Fractional Anisotropy (FA) and the Principal Direction (PD). The bias is modeled from the DWI-QC gradient information and a Rician noise model incorporating the loss of signal due to the DWI exclusions. Our simulations further show that the estimated bias can be substantially different with respect to magnitude and directional distribution depending on the degree of spatial clustering of the excluded DWIs. Thus, determination of diffusion properties with minimal error requires an evenly distributed sampling of the gradient directions before and after QC.
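A minimal sketch of the kind of Monte Carlo experiment described above, written by us for illustration only: simulate DWI signals from a known tensor over a reduced gradient set, add Rician noise, refit the tensor, and record the resulting FA bias. The gradient table, b-value, noise level, and ground-truth tensor are illustrative assumptions, not values from the paper.

import numpy as np

rng = np.random.default_rng(0)

def random_gradients(n):
    g = rng.normal(size=(n, 3))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def signals(D, grads, b=1000.0, S0=1.0):
    # S_i = S0 * exp(-b * g_i^T D g_i)
    return S0 * np.exp(-b * np.einsum('ij,jk,ik->i', grads, D, grads))

def fit_tensor(S, grads, b=1000.0, S0=1.0):
    # Linear least-squares fit of the log signal to the tensor model.
    gx, gy, gz = grads.T
    A = np.column_stack([gx*gx, gy*gy, gz*gz, 2*gx*gy, 2*gx*gz, 2*gy*gz])
    d, *_ = np.linalg.lstsq(A, -np.log(S / S0) / b, rcond=None)
    return np.array([[d[0], d[3], d[4]],
                     [d[3], d[1], d[5]],
                     [d[4], d[5], d[2]]])

def fa(D):
    lam = np.linalg.eigvalsh(D)
    return np.sqrt(1.5 * np.sum((lam - lam.mean())**2) / np.sum(lam**2))

D_true = np.diag([1.7e-3, 0.3e-3, 0.3e-3])        # prolate, "fiber-like" tensor
grads = random_gradients(64)
keep = np.ones(64, dtype=bool)
keep[:16] = False                                  # pretend QC excluded 16 DWIs
sigma = 0.02                                       # illustrative noise level

biases = []
for _ in range(1000):                              # Monte Carlo trials
    S = signals(D_true, grads[keep])
    noisy = np.sqrt((S + rng.normal(0, sigma, S.shape))**2 +
                    rng.normal(0, sigma, S.shape)**2)   # Rician noise model
    biases.append(fa(fit_tensor(noisy, grads[keep])) - fa(D_true))

print(f"mean FA bias after exclusion: {np.mean(biases):+.4f}")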



M. Farzinfar, I. Oguz, R.G. Smith, A.R. Verde, C. Dietrich, A. Gupta, M.L. Escolar, J. Piven, S. Pujol, C. Vachet, S. Gouttard, G. Gerig, S. Dager, R.C. McKinstry, S. Paterson, A.C. Evans, M.A. Styner. “Diffusion imaging quality control via entropy of principal direction distribution,” In NeuroImage, Vol. 82, pp. 1--12. 2013.
ISSN: 1053-8119
DOI: 10.1016/j.neuroimage.2013.05.022

ABSTRACT

Diffusion MR imaging has received increasing attention in the neuroimaging community, as it yields new insights into the microstructural organization of white matter that are not available with conventional MRI techniques. While the technology has enormous potential, diffusion MRI suffers from a unique and complex set of image quality problems, limiting the sensitivity of studies and reducing the accuracy of findings. Furthermore, the acquisition time for diffusion MRI is longer than conventional MRI due to the need for multiple acquisitions to obtain directionally encoded Diffusion Weighted Images (DWI). This leads to increased motion artifacts, reduced signal-to-noise ratio (SNR), and increased proneness to a wide variety of artifacts, including eddy-current and motion artifacts, “venetian blind” artifacts, as well as slice-wise and gradient-wise inconsistencies. Such artifacts mandate stringent Quality Control (QC) schemes in the processing of diffusion MRI data. Most existing QC procedures are conducted in the DWI domain and/or on a voxel level, but our own experiments show that these methods often do not fully detect and eliminate certain types of artifacts, often only visible when investigating groups of DWIs or a derived diffusion model, such as the most-employed diffusion tensor imaging (DTI). Here, we propose a novel regional QC measure in the DTI domain that employs the entropy of the regional distribution of the principal directions (PD). The PD entropy quantifies the scattering and spread of the principal diffusion directions and is invariant to the patient's position in the scanner. A high entropy value indicates that the PDs are distributed relatively uniformly, while a low entropy value indicates the presence of clusters in the PD distribution. The novel QC measure is intended to complement the existing set of QC procedures by detecting and correcting residual artifacts. Such residual artifacts cause a directional bias in the measured PD and are here called dominant direction artifacts. Experiments show that our automatic method can reliably detect and potentially correct such artifacts, especially the ones caused by the vibrations of the scanner table during the scan. The results further indicate the usefulness of this method for general quality assessment in DTI studies.

Keywords: Diffusion magnetic resonance imaging, Diffusion tensor imaging, Quality assessment, Entropy
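The core quantity, the entropy of a regional principal-direction distribution, can be sketched as follows; this is our illustration under simplifying assumptions (a plain latitude-longitude histogram), not the paper's actual estimator or binning. Sign-ambiguous principal eigenvectors are folded onto one hemisphere, histogrammed on the sphere, and scored by the Shannon entropy of the bin occupancy.

import numpy as np

def pd_entropy(principal_dirs, n_bins=16):
    """principal_dirs: (N, 3) unit vectors (sign-ambiguous)."""
    v = np.asarray(principal_dirs, float)
    v = v * np.sign(v[:, 2:3] + 1e-12)          # fold antipodes to one hemisphere
    theta = np.arccos(np.clip(v[:, 2], -1, 1))  # polar angle
    phi = np.arctan2(v[:, 1], v[:, 0])          # azimuth
    hist, _, _ = np.histogram2d(theta, phi, bins=n_bins)
    p = hist.ravel() / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

# Spread-out directions should score high; clustered directions (as in a
# dominant-direction artifact) should score low.
rng = np.random.default_rng(1)
uniform = rng.normal(size=(5000, 3))
uniform /= np.linalg.norm(uniform, axis=1, keepdims=True)
clustered = uniform * 0.1 + np.array([0.0, 0.0, 1.0])
clustered /= np.linalg.norm(clustered, axis=1, keepdims=True)
print(pd_entropy(uniform), pd_entropy(clustered))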



N. Farah, A. Zoubi, S. Matar, L. Golan, A. Marom, C.R. Butson, I. Brosh, S. Shoham. “Holographically patterned activation using photo-absorber induced neural-thermal stimulation,” In Journal of Neural Engineering, Vol. 10, No. 5, pp. 056004. October, 2013.
ISSN: 1741-2560
DOI: 10.1088/1741-2560/10/5/056004

ABSTRACT

Objective. Patterned photo-stimulation offers a promising path towards the effective control of distributed neuronal circuits. Here, we demonstrate the feasibility and governing principles of spatiotemporally patterned microscopic photo-absorber induced neural-thermal stimulation (PAINTS) based on light absorption by exogenous extracellular photo-absorbers. Approach. We projected holographic light patterns from a green continuous-wave (CW) or an IR femtosecond laser onto exogenous photo-absorbing particles dispersed in the vicinity of cultured rat cortical cells. Experimental results are compared to predictions of a temperature-rate model (where membrane currents follow I ∝ dT/dt). Main results. The induced microscopic photo-thermal transients have sub-millisecond thermal relaxation times and stimulate adjacent cells. PAINTS activation thresholds for different laser pulse durations (0.02 to 1 ms) follow the Lapicque strength-duration formula, but with different chronaxies and minimal threshold energy levels for the two excitation lasers (an order of magnitude lower for the IR system).
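For reference, the two relations named in the abstract can be written as follows (our notation, not the paper's; I_rh denotes the rheobase current and t_c the chronaxie):

% Temperature-rate model for the photo-thermally induced membrane current,
% and the Lapicque strength-duration relation for the activation threshold
% as a function of pulse duration.
\begin{align}
  I_m(t) \;\propto\; \frac{dT}{dt},
  \qquad
  I_{\mathrm{th}}(t_{\mathrm{pulse}}) \;=\; I_{rh}\left(1 + \frac{t_c}{t_{\mathrm{pulse}}}\right).
\end{align}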



J. Fishbaugh, M.W. Prastawa, G. Gerig, S. Durrleman. “Geodesic Shape Regression in the Framework of Currents,” In Proceedings of the International Conference on Information Processing in Medical Imaging (IPMI), Vol. 23, pp. 718--729. 2013.
PubMed ID: 24684012
PubMed Central ID: PMC4127488

ABSTRACT

Shape regression is emerging as an important tool for the statistical analysis of time dependent shapes. In this paper, we develop a new generative model which describes shape change over time, by extending simple linear regression to the space of shapes represented as currents in the large deformation diffeomorphic metric mapping (LDDMM) framework. By analogy with linear regression, we estimate a baseline shape (intercept) and initial momenta (slope) which fully parameterize the geodesic shape evolution. This is in contrast to previous shape regression methods which assume the baseline shape is fixed. We further leverage a control point formulation, which provides a discrete and low dimensional parameterization of large diffeomorphic transformations. This flexible system decouples the parameterization of deformations from the specific shape representation, allowing the user to define the dimensionality of the deformation parameters. We present an optimization scheme that estimates the baseline shape, location of the control points, and initial momenta simultaneously via a single gradient descent algorithm. Finally, we demonstrate our proposed method on synthetic data as well as real anatomical shape complexes.
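Schematically, and in our own notation rather than the paper's, the estimation problem can be summarized as minimizing a currents-based data-matching term jointly over the baseline shape, the control points, and the initial momenta that parameterize the geodesic flow:

\begin{equation}
  E(X_0, c_0, \alpha_0) \;=\; \sum_{i} d_{\mathcal{W}}\!\big(\phi_{t_i}^{c_0,\alpha_0}(X_0),\, S_{t_i}\big)^2 \;+\; \gamma\, \|v_0\|_V^2,
\end{equation}

where S_{t_i} are the observed shapes at times t_i, d_W is the distance between shapes represented as currents, phi_t is the geodesic flow of diffeomorphisms generated by the control points c_0 and momenta alpha_0, and the last term regularizes the initial velocity; all arguments are optimized together by gradient descent.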



J. Fishbaugh, M. Prastawa, G. Gerig, S. Durrleman. “Geodesic image regression with a sparse parameterization of diffeomorphisms,” In Proceedings of the Geometric Science of Information Conference (GSI), Lecture Notes in Computer Science (LNCS), Vol. 8085, pp. 95--102. 2013.

ABSTRACT

Image regression allows for time-discrete imaging data to be modeled continuously, and is a crucial tool for conducting statistical analysis on longitudinal images. Geodesic models are particularly well suited for statistical analysis, as image evolution is fully characterized by a baseline image and initial momenta. However, existing geodesic image regression models are parameterized by a large number of initial momenta, equal to the number of image voxels. In this paper, we present a sparse geodesic image regression framework which greatly reduces the number of model parameters. We combine a control point formulation of deformations with a L1 penalty to select the most relevant subset of momenta. This way, the number of model parameters reflects the complexity of anatomical changes in time rather than the sampling of the image. We apply our method to both synthetic and real data and show that we can decrease the number of model parameters (from the number of voxels down to hundreds) with only minimal decrease in model accuracy. The reduction in model parameters has the potential to improve the power of ensuing statistical analysis, which faces the challenging problem of high dimensionality.
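In the same spirit, and again in our own schematic notation, the sparse variant adds an L1 penalty on the initial momentum vectors attached to the control points, so that only the most relevant momenta remain non-zero:

\begin{equation}
  E\big(I_0, \{c_0^k, \alpha_0^k\}\big) \;=\; \sum_{i} \big\| I(t_i) - J_{t_i} \big\|_{L^2}^2 \;+\; \gamma\, \|v_0\|_V^2 \;+\; \lambda \sum_{k} \big\|\alpha_0^k\big\|,
\end{equation}

where J_{t_i} are the observed images, I(t) is the baseline image I_0 deformed along the geodesic, and lambda controls how many control-point momenta are selected.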



T. Fogal, A. Schiewe, J. Krüger. “An Analysis of Scalable GPU-Based Ray-Guided Volume Rendering,” In 2013 IEEE Symposium on Large Data Analysis and Visualization (LDAV), 2013.

ABSTRACT

Volume rendering continues to be a critical method for analyzing large-scale scalar fields, in disciplines as diverse as biomedical engineering and computational fluid dynamics. Commodity desktop hardware has struggled to keep pace with data size increases, challenging modern visualization software to deliver responsive interactions for O(N^3) algorithms such as volume rendering. We target the data type common in these domains: regularly-structured data.

In this work, we demonstrate that the major limitation of most volume rendering approaches is their inability to switch the data sampling rate (and thus data size) quickly. Using a volume renderer inspired by recent work, we demonstrate that the actual amount of visualizable data for a scene is typically bound considerably lower than the memory available on a commodity GPU. Our instrumented renderer is used to investigate design decisions typically swept under the rug in volume rendering literature. The renderer is freely available, with binaries for all major platforms as well as full source code, to encourage reproduction and comparison with future research.



Z. Fu, R.M. Kirby, R.T. Whitaker. “A Fast Iterative Method for Solving the Eikonal Equation on Tetrahedral Domains,” In SIAM Journal on Scientific Computing, Vol. 35, No. 5, pp. C473--C494. 2013.

ABSTRACT

Generating numerical solutions to the eikonal equation and its many variations has a broad range of applications in both the natural and computational sciences. Efficient solvers on cutting-edge, parallel architectures require new algorithms that may not be theoretically optimal, but that are designed to allow asynchronous solution updates and have limited memory access patterns. This paper presents a parallel algorithm for solving the eikonal equation on fully unstructured tetrahedral meshes. The method is appropriate for the type of fine-grained parallelism found on modern massively-SIMD architectures such as graphics processors and takes into account the particular constraints and capabilities of these computing platforms. This work builds on previous work for solving these equations on triangle meshes; in this paper we adapt and extend previous 2D strategies to accommodate three-dimensional, unstructured, tetrahedralized domains. These new developments include a local update strategy with data compaction for tetrahedral meshes that provides solutions on both serial and parallel architectures, with a generalization to inhomogeneous, anisotropic speed functions. We also propose two new update schemes, specialized to mitigate the natural data increase observed when moving to three dimensions, and the data structures necessary for efficiently mapping data to parallel SIMD processors in a way that maintains computational density. Finally, we present descriptions of the implementations for a single CPU, as well as multicore CPUs with shared memory and SIMD architectures, with comparative results against state-of-the-art eikonal solvers.
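To make the active-list idea concrete, here is a deliberately simplified Fast Iterative Method on a uniform 2D grid with unit speed, written by us for illustration only; the paper itself addresses fully unstructured tetrahedral meshes, inhomogeneous anisotropic speed functions, and SIMD/GPU data structures, none of which appear in this sketch.

import numpy as np

def solve_local(u, i, j, h=1.0, f=1.0):
    """First-order upwind local eikonal solver at grid node (i, j)."""
    INF = np.inf
    a = min(u[i-1, j] if i > 0 else INF, u[i+1, j] if i < u.shape[0]-1 else INF)
    b = min(u[i, j-1] if j > 0 else INF, u[i, j+1] if j < u.shape[1]-1 else INF)
    if abs(a - b) >= h / f:
        return min(a, b) + h / f
    return 0.5 * (a + b + np.sqrt(2 * (h / f)**2 - (a - b)**2))

def neighbors(p, shape):
    i, j = p
    return [(i+di, j+dj) for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= i+di < shape[0] and 0 <= j+dj < shape[1]]

def fim(shape, seeds, h=1.0, eps=1e-9):
    u = np.full(shape, np.inf)
    active = set()
    for s in seeds:
        u[s] = 0.0
        active.update(neighbors(s, shape))
    while active:
        next_active = set()
        for (i, j) in active:
            old, new = u[i, j], solve_local(u, i, j, h)
            u[i, j] = new
            if abs(old - new) < eps:                   # node converged: retire it
                for n in neighbors((i, j), shape):     # and try to activate neighbors
                    if n not in active:
                        r = solve_local(u, *n, h)
                        if r < u[n]:
                            u[n] = r
                            next_active.add(n)
            else:
                next_active.add((i, j))                # keep iterating on this node
        active = next_active
    return u

u = fim((64, 64), seeds=[(0, 0)])
print(u[63, 63])  # approximate distance from the corner seed (unit speed)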



M. Gamell, I. Rodero, M. Parashar, J.C. Bennett, H. Kolla, J.H. Chen, P.-T. Bremer, A. Landge, A. Gyulassy, P. McCormick, S. Pakin, V. Pascucci, S. Klasky. “Exploring Power Behaviors and Trade-offs of In-situ Data Analytics,” In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Association for Computing Machinery, 2013.
ISBN: 978-1-4503-2378-9
DOI: 10.1145/2503210.2503303

ABSTRACT

As scientific applications target exascale, challenges related to data and energy are becoming dominating concerns. For example, coupled simulation workflows are increasingly adopting in-situ data processing and analysis techniques to address costs and overheads due to data movement and I/O. However, it is also critical to understand these overheads and associated trade-offs from an energy perspective. The goal of this paper is to explore data-related energy/performance trade-offs for end-to-end simulation workflows running at scale on current high-end computing systems. Specifically, this paper presents: (1) an analysis of the data-related behaviors of a combustion simulation workflow with an in-situ data analytics pipeline, running on the Titan system at ORNL; (2) a power model based on system power and data exchange patterns, which is empirically validated; and (3) the use of the model to characterize the energy behavior of the workflow and to explore energy/performance trade-offs on current as well as emerging systems.

Keywords: SDAV
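As a rough illustration of what a data-movement-aware energy model can look like (this is our generic form, not the model developed in the paper), total energy can be decomposed into measured power integrated over workflow phases plus an empirically fitted per-byte cost for data exchanges:

\begin{equation}
  E_{\mathrm{total}} \;=\; \sum_{p \in \mathrm{phases}} P_{p}\, t_{p}
  \;+\; \sum_{x \in \mathrm{exchanges}} e_{\mathrm{byte}}\, B_{x},
\end{equation}

where P_p is the system power during phase p, t_p its duration, B_x the bytes moved in exchange x, and e_byte an energy cost per byte fitted from measurements.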



G. Gardner, A. Morris, K. Higuchi, R.S. MacLeod, J. Cates. “A Point-Correspondence Approach to Describing the Distribution of Image Features on Anatomical Surfaces, with Application to Atrial Fibrillation,” In Proceedings of the 2013 IEEE 10th International Symposium on Biomedical Imaging (ISBI), pp. 226--229. 2013.
DOI: 10.1109/ISBI.2013.6556453

ABSTRACT

This paper describes a framework for summarizing and comparing the distributions of image features on anatomical shape surfaces in populations. The approach uses a point-based correspondence model to establish a mapping among surface positions and may be useful for anatomy that exhibits a relatively high degree of shape variability, such as cardiac anatomy. The approach is motivated by the MRI-based study of diseased, or fibrotic, tissue in the left atrium of atrial fibrillation (AF) patients, which has been difficult to measure quantitatively using more established image and surface registration techniques. The proposed method is to establish a set of point correspondences across a population of shape surfaces that provides a mapping from any surface to a common coordinate frame, where local features like fibrosis can be directly compared. To establish correspondence, we use a previously-described statistical optimization of particle-based shape representations. For our atrial fibrillation population, the proposed method provides evidence that more intense and widely distributed fibrosis patterns exist in patients who do not respond well to radiofrequency ablation therapy.
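Once particle-based correspondences are established, the comparison step reduces to analyzing features sampled at corresponding points across subjects. The toy sketch below is ours for illustration: the group sizes, particle count, and synthetic fibrosis values are assumptions, not data from the study.

import numpy as np

rng = np.random.default_rng(0)
M = 256                                        # corresponding particles per atrium
responders = rng.beta(2, 8, size=(20, M))      # per-point fibrosis fraction
nonresponders = rng.beta(3, 6, size=(15, M))   # (synthetic stand-in data)

# Point-wise difference of group means in the common coordinate frame; in a
# real study this map would be visualized on a mean shape and assessed
# statistically rather than simply counted.
diff = nonresponders.mean(axis=0) - responders.mean(axis=0)
print("points where non-responders show more fibrosis:",
      int((diff > 0).sum()), "/", M)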



S. Gerber, O. Ruebel, P.-T. Bremer, V. Pascucci, R.T. Whitaker. “Morse-Smale Regression,” In Journal of Computational and Graphical Statistics, Vol. 22, No. 1, pp. 193--214. 2013.
DOI: 10.1080/10618600.2012.657132

ABSTRACT

This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse-Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this paper introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to over-fitting. The Morse-Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse-Smale regression. Supplementary materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse-Smale complex approximation, and additional tables for the climate-simulation study.
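A rough sketch of the idea, not the msr package or the paper's construction: approximate a Morse-Smale style segmentation on a k-nearest-neighbor graph by following steepest ascent and descent of the response, label each point by its (maximum, minimum) pair, and fit one linear model per resulting partition. The neighborhood size, dataset, and use of scikit-learn are our assumptions.

import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.linear_model import LinearRegression

def morse_smale_partition(X, y, k=15):
    nn = NearestNeighbors(n_neighbors=k).fit(X)
    _, idx = nn.kneighbors(X)
    def follow(sign):
        target = np.arange(len(y))
        for i in range(len(y)):
            nbrs = idx[i]                                  # includes the point itself
            target[i] = nbrs[np.argmax(sign * y[nbrs])]    # steepest neighbor (or self)
        for _ in range(len(y)):                            # chase pointers to extrema
            target = target[target]
        return target
    return follow(+1), follow(-1)   # index of reached maximum, reached minimum

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))
y = np.sin(3 * X[:, 0]) * np.cos(3 * X[:, 1]) + 0.05 * rng.normal(size=2000)

maxima, minima = morse_smale_partition(X, y)
labels = maxima * 10**6 + minima                # one label per (max, min) cell
models = {}
for cell in np.unique(labels):
    mask = labels == cell
    if mask.sum() > 10:                         # fit a linear model per cell
        models[cell] = LinearRegression().fit(X[mask], y[mask])
print(len(models), "cells with a local linear fit")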



J.M. Gililland, L.A. Anderson, H.B. Henninger, E.N. Kubiak, C.L. Peters. “Biomechanical analysis of acetabular revision constructs: is pelvic discontinuity best treated with bicolumnar or traditional unicolumnar fixation?,” In Journal of Arthroplasty, Vol. 28, No. 1, pp. 178--186. 2013.
DOI: 10.1016/j.arth.2012.04.031

ABSTRACT

Pelvic discontinuity in revision total hip arthroplasty presents problems with component fixation and union. A construct was proposed based on bicolumnar fixation for transverse acetabular fractures. Each of 3 reconstructions was performed on 6 composite hemipelvises: (1) a cup-cage construct, (2) a posterior column plate construct, and (3) a bicolumnar construct (no. 2 plus an antegrade 4.5-mm anterior column screw). Bone-cup interface motions were measured, whereas cyclical loads were applied in both walking and descending stair simulations. The bicolumnar construct provided the most stable construct. Descending stair mode yielded more significant differences between constructs. The bicolumnar construct provided improved component stability. Placing an antegrade anterior column screw through a posterior approach is a novel method of providing anterior column support in this setting.



S. Gratzl, A. Lex, N. Gehlenborg, H. Pfister, M. Streit. “LineUp: Visual Analysis of Multi-Attribute Rankings,” In IEEE Transactions on Visualization and Computer Graphics (InfoVis '13), Vol. 19, No. 12, pp. 2277--2286. 2013.
ISSN: 1077-2626
DOI: 10.1109/TVCG.2013.173

ABSTRACT

Rankings are a popular and universal approach to structure otherwise unorganized collections of items by computing a rank for each item based on the value of one or more of its attributes. This allows us, for example, to prioritize tasks or to evaluate the performance of products relative to each other. While the visualization of a ranking itself is straightforward, its interpretation is not because the rank of an item represents only a summary of a potentially complicated relationship between its attributes and those of the other items. It is also common that alternative rankings exist that need to be compared and analyzed to gain insight into how multiple heterogeneous attributes affect the rankings. Advanced visual exploration tools are needed to make this process efficient.

In this paper we present a comprehensive analysis of requirements for the visualization of multi-attribute rankings. Based on these considerations, we propose a novel and scalable visualization technique - LineUp - that uses bar charts. This interactive technique supports the ranking of items based on multiple heterogeneous attributes with different scales and semantics. It enables users to interactively combine attributes and flexibly refine parameters to explore the effect of changes in the attribute combination. This process can be employed to derive actionable insights into which attributes of an item need to be modified in order for its rank to change.
Additionally, through integration of slope graphs, LineUp can also be used to compare multiple alternative rankings on the same set of items, for example, over time or across different attribute combinations. We evaluate the effectiveness of the proposed multi-attribute visualization technique in a qualitative study. The study shows that users are able to successfully solve complex ranking tasks in a short period of time.
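The scoring model underlying such multi-attribute rankings can be illustrated with a toy example; the attribute names, weights, and data below are made up, and LineUp itself is an interactive visualization technique rather than this script. Each attribute is normalized, attributes where lower is better are inverted, and items are ranked by the weighted combination.

import numpy as np

items = ["A", "B", "C", "D"]
attrs = np.array([[0.9, 120, 3.2],     # rows: items, columns: attributes
                  [0.7, 200, 4.1],
                  [0.8,  80, 2.5],
                  [0.6, 150, 4.8]])
higher_is_better = np.array([True, False, True])
weights = np.array([0.5, 0.2, 0.3])    # user-chosen attribute weights

lo, hi = attrs.min(axis=0), attrs.max(axis=0)
norm = (attrs - lo) / (hi - lo)                      # map each attribute to [0, 1]
norm[:, ~higher_is_better] = 1.0 - norm[:, ~higher_is_better]
scores = norm @ weights
for rank, i in enumerate(np.argsort(-scores), start=1):
    print(rank, items[i], round(float(scores[i]), 3))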



A. Grosset, M. Schott, G.-P. Bonneau, C.D. Hansen. “Evaluation of Depth of Field for Depth Perception in DVR,” In Proceedings of the 2013 IEEE Pacific Visualization Symposium (PacificVis), pp. 81--88. 2013.

ABSTRACT

In this paper we present a user study on the use of Depth of Field for depth perception in Direct Volume Rendering. Direct Volume Rendering with Phong shading and perspective projection is used as the baseline. Depth of Field is then added to see its impact on the correct perception of ordinal depth. Accuracy and response time are used as the metrics to evaluate the usefulness of Depth of Field. The onsite user study has two parts: static and dynamic. Eye tracking is used to monitor the gaze of the subjects. From our results we see that though Depth of Field does not act as a proper depth cue in all conditions, it can be used to reinforce the perception of which feature is in front of the other. The best results (high accuracy & fast response time) for correct perception of ordinal depth occurs when the front feature (out of the two features users were to choose from) is in focus and perspective projection is used.



L.K. Ha, J. King, Z. Fu, R.M. Kirby. “A High-Performance Multi-Element Processing Framework on GPUs,” SCI Technical Report, No. UUSCI-2013-005, SCI Institute, University of Utah, 2013.

ABSTRACT

Many computational engineering problems ranging from finite element methods to image processing involve the batch processing of a large number of data items. While multi-element processing has the potential to harness the computational power of parallel systems, current techniques often concentrate on maximizing elemental performance. Frameworks that take this greedy optimization approach often fail to extract the maximum processing power of the system for multi-element processing problems. By utilizing the knowledge that the same operation will be accomplished on a large number of items, we can organize the computation to maximize the computational throughput available in parallel streaming hardware. In this paper, we analyze the weaknesses of existing methods and propose efficient parallel programming patterns, implemented in a high-performance multi-element processing framework, to harness the processing power of GPUs. Our approach is capable of leveling out the performance curve even in the range of small element sizes.
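The throughput argument can be illustrated, far from the GPU setting of the report, with a simple CPU/NumPy contrast between per-element processing and batched multi-element processing of the same small operation; the operation, element size, and counts are arbitrary choices of ours, not the framework's API.

import numpy as np, time

n = 100_000
mats = np.random.rand(n, 4, 4)     # many small, independent elements
vecs = np.random.rand(n, 4)

t0 = time.perf_counter()
out_loop = np.stack([m @ v for m, v in zip(mats, vecs)])   # one element at a time
t1 = time.perf_counter()
out_batch = np.einsum('nij,nj->ni', mats, vecs)            # all elements in one call
t2 = time.perf_counter()

assert np.allclose(out_loop, out_batch)
print(f"per-element: {t1 - t0:.3f}s  batched: {t2 - t1:.3f}s")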



M. Hall, R.M. Kirby, F. Li, M.D. Meyer, V. Pascucci, J.M. Phillips, R. Ricci, J. Van der Merwe, S. Venkatasubramanian. “Rethinking Abstractions for Big Data: Why, Where, How, and What,” In Cornell University Library, 2013.

ABSTRACT

Big data refers to large and complex data sets that, under existing approaches, exceed the capacity and capability of current compute platforms, systems software, analytical tools and human understanding [7]. Numerous lessons on the scalability of big data can already be found in asymptotic analysis of algorithms and from the high-performance computing (HPC) and applications communities. However, scale is only one aspect of current big data trends; fundamentally, current and emerging problems in big data are a result of unprecedented complexity: in the structure of the data and how to analyze it, in dealing with unreliability and redundancy, in addressing the human factors of comprehending complex data sets, in formulating meaningful analyses, and in managing the dense, power-hungry data centers that house big data.

The computer science solution to complexity is finding the right abstractions, those that hide as much triviality as possible while revealing the essence of the problem that is being addressed. The "big data challenge" has disrupted computer science by stressing to the very limits the familiar abstractions which define the relevant subfields in data analysis, data management and the underlying parallel systems. Efficient processing of big data has shifted systems towards increasingly heterogeneous and specialized units, with resilience and energy becoming important considerations. The design and analysis of algorithms must now incorporate emerging costs in communicating data driven by IO costs, distributed data, and the growing energy cost of these operations. Data analysis representations as structural patterns and visualizations surpass human visual bandwidth, structures studied at small scale are rare at large scale, and large-scale high-dimensional phenomena cannot be reproduced at small scale.

As a result, not enough of these challenges are revealed by isolating abstractions in a traditional software stack or standard algorithmic and analytical techniques, and attempts to address complexity either oversimplify or require low-level management of details. The authors believe that the abstractions for big data need to be rethought, and this reorganization needs to evolve and be sustained through continued cross-disciplinary collaboration.

In what follows, we first consider the question of why big data and why now. We then describe the where (big data systems), the how (big data algorithms), and the what (big data analytics) challenges that we believe are central and must be addressed as the research community develops these new abstractions. We equate the biggest challenges that span these areas of big data with big mythological creatures, namely cyclops, that should be conquered.