E. Natasha Stavros and 12 more authors

The Surface Biology and Geology global imaging spectrometer is primarily designed to observe the chemical fingerprint of the Earth’s surface. However, imaging spectroscopy across the visible to shortwave infrared (VSWIR) can also provide important atmospheric observations of methane point sources: highly concentrated emissions from energy, waste management and livestock operations. Relating these point-source observations to greenhouse gas inventories and to coarser, regional methane observations from sensors like the European Space Agency (ESA) TROPOMI instrument will contribute to reducing uncertainties in local, regional and global carbon budgets. We present the Multi-scale Methane Analytic Framework (M2AF), which facilitates disentangling confounding processes by streamlining analysis of cross-scale, multi-sensor methane observations across three key, overlapping spatial scales: 1) global to regional scale, 2) regional to local scale, and 3) facility (point-source) scale. M2AF is an information system that bridges methane research and applied science by integrating tiered observations of methane from surface measurements, airborne sensors and satellites. Reducing uncertainty in methane fluxes with multi-scale analyses can improve carbon accounting and attribution, which is valuable to both the formulation and the verification of mitigation actions. M2AF lays the foundation for extending existing methane analysis systems beyond their current experimental states, reducing the latency and cost of methane data analysis and improving accessibility for researchers and decision makers. M2AF leverages the NASA Methane Source Finder (MSF), the NASA Science Data Analytics Platform (SDAP), Amazon Web Services (AWS) and two supercomputers for fast, on-demand analytics of cross-scale, integrated, quality-controlled methane flux estimates.
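The cross-scale reconciliation at the heart of M2AF can be illustrated with a minimal sketch: summing facility-scale (point-source) flux estimates within a region and comparing the total against a coarser regional flux, as a TROPOMI-scale observation or inventory might provide. All names and numbers below are illustrative assumptions, not part of the M2AF implementation.

```python
# Hypothetical sketch of cross-scale methane flux reconciliation:
# resolved point-source fluxes (e.g., from airborne imaging-spectrometer
# plume retrievals) are aggregated and compared against a regional
# estimate for the same area. Values are illustrative only.

def regional_budget(point_sources, regional_estimate_kg_hr):
    """Fraction of a regional methane flux attributable to resolved point sources.

    point_sources: list of (site_name, flux_kg_hr) tuples.
    regional_estimate_kg_hr: flux for the same region from a coarser
        sensor or inventory.
    """
    total_point_flux = sum(flux for _, flux in point_sources)
    return total_point_flux / regional_estimate_kg_hr

sources = [("landfill_A", 450.0), ("compressor_B", 120.0), ("dairy_C", 85.0)]
fraction = regional_budget(sources, regional_estimate_kg_hr=1500.0)
print(f"Point sources explain {fraction:.0%} of the regional flux")
```

A gap between the summed point sources and the regional total is exactly the kind of residual that multi-scale analysis helps attribute, whether to diffuse area sources or to retrieval uncertainty.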

Joseph Jacob and 2 more authors

The need to better understand climate change has driven model simulations to greater fidelity with improved spatiotemporal resolution (e.g., < 10 km at sub-hourly cadence). For example, the 7 km GEOS-5 Nature Run (G5NR) with 30-minute outputs from 2005-07 at the NASA Center for Climate Simulation (NCCS) is ~4 PB and is not easily portable. The rise of these high-fidelity climate models coincides with the emergence of cloud computing as a viable platform for scientific analytics. NASA has adopted a cloud computing strategy using public providers like Amazon Web Services (AWS). However, it is not cost- or time-effective to move the High-Performance Computing (HPC)-based model computations and data to the cloud. Thus, there is a need for scalable model evaluation compatible with both the cloud and HPC platforms like NCCS. To fill this need we have extended the analytics component of the Apache Science Data Analytics Platform (SDAP) with a streamlined version that specifically targets high-resolution science data products and climate model outputs on a regular coordinate grid. Gridded inputs (as opposed to other data structures like point clouds or swath-based measurements supported by SDAP) enable offsets to particular grid cells to be computed directly, allow for processing on the original NetCDF or HDF granules, do not require a second tiled copy of the data, and accommodate a simpler technology stack since no geospatial database is required for lookups or tile storage. Our core module, Parmap, abstracts the map-reduce model so that users can select from a variety of map computational modes, including Spark, Dask, serverless AWS Lambda, PySparkling, and Python multiprocessing. Example analytics include area-averaged time series and time-averaged, correlation and climatological maps. Benchmarks compare favorably with the full SDAP implementation.
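The Parmap idea of abstracting the map step behind selectable computational modes can be sketched as follows. Only serial and multiprocessing modes are shown here to keep the example self-contained; the actual Parmap module also dispatches to Spark, Dask, AWS Lambda, and PySparkling backends, and all names below are illustrative rather than Parmap's real API.

```python
# Minimal sketch of a Parmap-like abstraction: one parmap() entry point
# that dispatches the map step to a user-selected computational mode.

from multiprocessing import Pool

def _serial_map(func, items):
    # Plain in-process map, useful for debugging.
    return [func(x) for x in items]

def _multiprocessing_map(func, items, processes=4):
    # Fan the map step out across local worker processes.
    with Pool(processes) as pool:
        return pool.map(func, items)

MODES = {
    "serial": _serial_map,
    "multiprocessing": _multiprocessing_map,
    # Real backends would add e.g. "spark", "dask", "lambda", ...
}

def parmap(func, items, mode="serial", **kwargs):
    """Apply func over items using the selected computational mode."""
    return MODES[mode](func, items, **kwargs)

def cell_mean(chunk):
    # Example map task: area-average one time step of a gridded variable.
    return sum(chunk) / len(chunk)

chunks = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(parmap(cell_mean, chunks, mode="serial"))  # [2.0, 5.0]
```

The appeal of this design is that an analytic like an area-averaged time series is written once as a map function, and the deployment target (laptop, HPC node, or serverless cloud) is chosen at call time.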

Edward Armstrong and 16 more authors

Before complex analysis of oceanographic or other Earth science data can occur, the data must be placed in the proper domain of computing and software resources. In the past this was nearly always the scientist’s personal computer or institutional computer servers. The problem with this approach is that the data products must be brought directly to these compute resources, leading to large data transfers and storage requirements, especially for high-volume satellite or model datasets. In this presentation we describe a new technological solution under development and implementation at the NASA Jet Propulsion Laboratory for conducting oceanographic and related research based on satellite data and other sources. Fundamentally, our approach for satellite resources is to tile (partition) the data inputs into cloud-optimized, computation-friendly databases that allow distributed computing resources to perform on-demand, server-side computation and data analytics. This technology, known as NEXUS, has already been implemented in several existing NASA data portals to support oceanographic, sea-level, and gravity data time series analysis, with capabilities to output time-averaged maps, correlation maps, Hovmöller plots, climatological averages and more. A further extension of this technology will integrate ocean in situ observations, event-based data discovery (e.g., natural disasters), data quality screening and additional capabilities. This activity is an open source project known as the Apache Science Data Analytics Platform (SDAP) (https://sdap.apache.org), colloquially OceanWorks, and is funded by the NASA AIST program. It harmonizes data, tools and computational resources for the researcher, allowing them to focus on research results and hypothesis testing rather than on security, data preparation and management.
We will present a few oceanographic and interdisciplinary use cases demonstrating the capabilities for characterizing regional sea-level rise, sea surface temperature anomalies, and ocean hurricane responses.
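The tiling idea behind NEXUS can be illustrated with a minimal sketch: gridded data are partitioned into small spatial tiles so that an on-demand, server-side query, here an area-averaged time series over a bounding box, touches only the tiles that intersect the region of interest. The tile layout and class names below are hypothetical, not the NEXUS schema.

```python
# Illustrative sketch of tiled, on-demand analytics: only tiles whose
# bounding boxes intersect the query region contribute to the result.

import numpy as np

class Tile:
    def __init__(self, lat_min, lat_max, lon_min, lon_max, data):
        self.bounds = (lat_min, lat_max, lon_min, lon_max)
        self.data = data  # array of shape (time, lat, lon)

    def intersects(self, lat_min, lat_max, lon_min, lon_max):
        b = self.bounds
        return not (lat_max < b[0] or lat_min > b[1]
                    or lon_max < b[2] or lon_min > b[3])

def area_averaged_time_series(tiles, lat_min, lat_max, lon_min, lon_max):
    """Spatial mean per time step, computed only from intersecting tiles."""
    hits = [t for t in tiles if t.intersects(lat_min, lat_max, lon_min, lon_max)]
    # Flatten each tile's spatial cells and average across all of them.
    stacked = np.concatenate(
        [t.data.reshape(t.data.shape[0], -1) for t in hits], axis=1)
    return stacked.mean(axis=1)

tiles = [
    Tile(0, 10, 0, 10, np.full((3, 2, 2), 1.0)),
    Tile(0, 10, 10, 20, np.full((3, 2, 2), 3.0)),
    Tile(20, 30, 0, 10, np.full((3, 2, 2), 9.0)),  # outside the query box
]
ts = area_averaged_time_series(tiles, 0, 10, 0, 20)
print(ts)  # [2. 2. 2.]
```

Because the tile index filters data server-side, a researcher's query moves only a small result (here, a three-element time series) rather than the full dataset, which is the core of the NEXUS approach described above.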