![]() Orchestration of materials science workflows for heterogeneous resources at large scale, N. Zhou, G. Scorzelli, J. Luettgau, R.R. Kancharla, J. Kane, R. Wheeler, B. Croom, B. Newell, V. Pascucci, M. Taufer. In The International Journal of High Performance Computing Applications, Sage, 2023. In the era of big data, materials science workflows need to handle large-scale data distribution, storage, and computation. Any of these areas can become a performance bottleneck. We present a framework for analyzing internal material structures (e.g., cracks) to mitigate these bottlenecks. We demonstrate the effectiveness of our framework for a workflow performing synchrotron X-ray computed tomography reconstruction and segmentation of a silica-based structure. Our framework provides a cloud-based, cutting-edge solution to challenges such as growing intermediate and output data and heavy resource demands during image reconstruction and segmentation. Specifically, our framework efficiently manages data storage, scaling up compute resources on the cloud. The multi-layer software structure of our framework includes three layers. A top layer uses Jupyter notebooks and serves as the user interface. A middle layer uses Ansible for resource deployment and managing the execution environment. A low layer is dedicated to resource management and provides resource management and job scheduling on heterogeneous nodes (i.e., GPU and CPU). At the core of this layer, Kubernetes supports resource management, and Dask enables large-scale job scheduling for heterogeneous resources. The broader impact of our work is four-fold: through our framework, we hide the complexity of the cloud’s software stack to the user who otherwise is required to have expertise in cloud technologies; we manage job scheduling efficiently and in a scalable manner; we enable resource elasticity and workflow orchestration at a large scale; and we facilitate moving the study of nonporous structures, which has wide applications in engineering and scientific fields, to the cloud. While we demonstrate the capability of our framework for a specific materials science application, it can be adapted for other applications and domains because of its modular, multi-layer architecture. |
![]() ![]() The effects of passive design on indoor thermal comfort and energy savings for residential buildings in hot climates: A systematic review M. Hu, K. Zhang, Q. Nguyen, T. Tasdizen. In Urban Climate, Vol. 49, pp. 101466. 2023. DOI: https://doi.org/10.1016/j.uclim.2023.101466 In this study, a systematic review and meta-analysis were conducted to identify, categorize, and investigate the effectiveness of passive cooling strategies (PCSs) for residential buildings. Forty-two studies published between 2000 and 2021 were reviewed; they examined the effects of PCSs on indoor temperature decrease, cooling load reduction, energy savings, and thermal comfort hour extension. In total, 30 passive strategies were identified and classified into three categories: design approach, building envelope, and passive cooling system. The review found that using various passive strategies can achieve, on average, (i) an indoor temperature decrease of 2.2 °C, (ii) a cooling load reduction of 31%, (iii) energy savings of 29%, and (v) a thermal comfort hour extension of 23%. Moreover, the five most effective passive strategies were identified as well as the differences between hot and dry climates and hot and humid climates. |
![]() ![]() A Metalearning Approach for Physics-Informed Neural Networks (PINNs): Application to Parameterized PDEs M. Penwarden, S. Zhe, A. Narayan, R.M. Kirby. In Journal of Computational Physics, Elsevier, 2023. DOI: https://doi.org/10.1016/j.jcp.2023.111912 Physics-informed neural networks (PINNs) as a means of discretizing partial differential equations (PDEs) are garnering much attention in the Computational Science and Engineering (CS&E) world. At least two challenges exist for PINNs at present: an understanding of accuracy and convergence characteristics with respect to tunable parameters and identification of optimization strategies that make PINNs as efficient as other computational science tools. The cost of PINNs training remains a major challenge of Physics-informed Machine Learning (PiML) – and, in fact, machine learning (ML) in general. This paper is meant to move towards addressing the latter through the study of PINNs on new tasks, for which parameterized PDEs provides a good testbed application as tasks can be easily defined in this context. Following the ML world, we introduce metalearning of PINNs with application to parameterized PDEs. By introducing metalearning and transfer learning concepts, we can greatly accelerate the PINNs optimization process. We present a survey of model-agnostic metalearning, and then discuss our model-aware metalearning applied to PINNs as well as implementation considerations and algorithmic complexity. We then test our approach on various canonical forward parameterized PDEs that have been presented in the emerging PINNs literature. |
![]() ![]() Multi-Task Classification for Improved Health Outcome Prediction Based on Environmental Indicators M. Alirezaei, Q.C. Nguyen, R. Whitaker, T. Tasdizen. In IEEE Access, 2022. DOI: 10.1109/ACCESS.2023.3295777 The influence of the neighborhood environment on health outcomes has been widely recognized in various studies. Google street view (GSV) images offer a unique and valuable tool for evaluating neighborhood environments on a large scale. By annotating the images with labels indicating the presence or absence of certain neighborhood features, we can develop classifiers that can automatically analyze and evaluate the environment. However, labeling GSV images on a large scale is a time-consuming and labor-intensive task. Considering these challenges, we propose using a multi-task classifier to improve training a classifier with limited supervised, GSV data. Our multi-task classifier utilizes readily available, inexpensive online images collected from Flicker as a related classification task. The hypothesis is that a classifier trained on multiple related tasks is less likely to overfit to small amounts of training data and generalizes better to unseen data. We leverage the power of multiple related tasks to improve the classifier’s overall performance and generalization capability. Here we show that, with the proposed learning paradigm, predicted labels for GSV test images are more accurate. Across different environment indicators, the accuracy, F1 score and balanced accuracy increase up to 6 % in the multi-task learning framework compared to its single-task learning counterpart. The enhanced accuracy of the predicted labels obtained through the multi-task classifier contributes to a more reliable and precise regression analysis determining the correlation between predicted built environment indicators and health outcomes. The R2 values calculated for different health outcomes improve by up to 4 % using multi-task learning detected indicators. |
![]() ![]() Accelerating Physics Schemes in Numerical Weather Prediction Codes and Preserving Positivity in the Physics-Dynamics coupling Timbwaoga Aime Judicael (TAJO) Ouermi. University of Utah, 2022. |
![]() ![]() The Materials Commons Data Repository G. Tarcea, B. Puchala, T. Berman, G. Scorzelli, V. Pascucci, M, Taufer, J. Allison. In 2022 IEEE 18th International Conference on e-Science (e-Science), pp. 405--406. 2022. DOI: 10.1109/eScience55777.2022.00060 Repositories are increasingly used for publishing and sharing scientific data. The Materials Commons is a data repository that follows the FAIR (Findable, Accessible, Inter-operable, Reusable) principles. We demonstrate the challenges with FAIR and how Materials Commons solves them. We also discuss the Nationals Science Data Fabric (NSDF) [1], a project that is democratizing data access, and show how Materials Commons with the NSDF software stack accelerates data access and scientific research. |