1. NSLS-II Friday Lunchtime Seminar

    "Provenance and workflow tools for multimodal experiments"

    Presented by Line Pouchard, Computational Science Initiative, BNL

    Friday, September 25, 2020, 12 pm
    ZoomGov

    Hosted by: Ignace Jarrige

    New data management techniques are needed to address the increasing volume and complexity of data produced by the latest generation of detectors at Scientific User Facilities such as NSLS-II. Experiments carried out by highly diverse user communities produce data that are processed in many unique, highly customized scientific workflows. The facilities add further complexity through their large and adaptable collections of instruments and their broad ranges of data rates and data access patterns. In addition, multimodal techniques that characterize samples with different imaging modalities are poised to further increase the heterogeneity of data processing and analysis methods. One particular challenge is the development of collections of well-annotated datasets, including their provenance, for use with machine learning techniques. Provenance is the detailed recording of data lineage and of the software processes operating on the data, which enables interpreting, validating, and reproducing results. This seminar will describe provenance and workflow tools developed by a joint NSLS-II-CSI team under LDRD, including a text mining portal that classifies papers from the scientific literature by XAS edges, and a graph-based provenance wrapper developed for XPD and recently matured to run with a 3D reconstruction code.