Many real-world applications generate functional data, that is, data measured over a continuous domain such as space or time. Classical functional data analysis operates on scalar-valued functions, where samples along a function are treated as point measurements. However, these scalar values are generally obtained early in the pipeline as statistical summaries of underlying distributions. Discarding the variability information in this reduction may introduce significant errors that propagate through subsequent procedures. For example, temporal changes in medical images are often evaluated along a parametrized function that represents a structure of interest, such as a white matter tract extracted from diffusion tensor images. Traditional frameworks for population studies and hypothesis testing operate on scalar measurements derived along these functions by averaging within regions of interest and tract cross-sections. In this dissertation, we extend the idea to distribution-valued functions. By attributing each sample along a function with the distribution of image properties in its local voxel neighborhood, we create distribution-valued signatures of these functions. More generally, these local distribution profiles capture the complexity of local information through a more nuanced variable that can take the form of a distribution, an interval, or a scalar. They can be used to summarize local data via aggregation or to capture the uncertainty associated with measurements. Our goal is to model a smooth temporal evolution trajectory from a set of such functions observed at discrete time points. We believe that retaining and using the rich variability information throughout the analysis can improve model estimates and statistical power in downstream analysis. We develop four different methodologies to achieve this goal.
We draw on and combine ideas from the fields of functional data analysis and distribution-valued regression. The methods explore a wide range of model design choices for capturing the trend over time, from parametric (linear and logistic change) to semi-parametric and non-parametric. We also propose inference approaches that fully exploit the estimated spatio-temporal trajectories, which in some of these methods are entirely distribution-valued. Our results on clinical data are promising and indicate that distributions provide a detailed yet compact feature that can better inform regression models and statistical frameworks.
Posted by: Nathan Galli