WE3BC —  Data Management   (11-Oct-23   14:00—16:00)
Chair: A. Ashton, PSI, Villigen PSI, Switzerland
Paper Title Page
WE3BCO01 Modular and Scalable Archiving for EPICS and Other Time Series Using ScyllaDB and Rust 1008
  • D. Werder, T. Humar
    PSI, Villigen PSI, Switzerland
  At PSI we currently run too many different products with the common goal of archiving timestamped data. These include the EPICS Channel Archiver and the Archiver Appliance for EPICS IOCs, a buffer storage for beam-synchronous data at SwissFEL, and more. So many monolithic solutions are hard to maintain, and they overlap in functionality. Each solution brings its own storage engine, file format and centralized design, which is hard to scale. In this talk I report on how we factored the system into modular components with clean interfaces. At the core, the different storage engines and file formats have been replaced by ScyllaDB, an open source product with enterprise support and remarkable adoption in industry. We gain from its distributed, fault-tolerant and scalable design. The ingest of data into ScyllaDB is factored into components according to the protocols of the sources, e.g. Channel Access. Here we build upon the Rust language and achieve robust, maintainable and performant services. One interface to access and process the recorded data is the HTTP retrieval service. This service offers, for example, channel search by various criteria, and full event data as well as aggregated and binned data in either JSON or binary formats. It can also run user-defined data transformations and act as a data source for Grafana for a first view into recorded channel data. Our setup for SwissFEL ingests ~370k EPICS updates/s from ~220k PVs (scalar and waveform) with rates between 0.1 and 100 Hz.
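The abstract mentions retrieval of "aggregated and binned data" without giving details. As an illustration only, the following minimal sketch shows one common way to bin timestamped events into fixed-width time bins with count, minimum, maximum and mean. It is not the PSI retrieval service: the names `Bin` and `bin_events`, the bin layout and the sample events are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass
class Bin:
    """Aggregate of all events whose timestamp falls in [start, start + width)."""
    start: float
    count: int
    minimum: float
    maximum: float
    total: float

    @property
    def mean(self) -> float:
        return self.total / self.count


def bin_events(events: Iterable[tuple[float, float]], t0: float, width: float) -> list[Bin]:
    """Group (timestamp, value) events into fixed-width time bins starting at t0.

    Events before t0 are ignored and empty bins are omitted (hypothetical policy).
    """
    bins: dict[int, Bin] = {}
    for ts, value in events:
        if ts < t0:
            continue
        index = int((ts - t0) // width)
        current = bins.get(index)
        if current is None:
            bins[index] = Bin(t0 + index * width, 1, value, value, value)
        else:
            current.count += 1
            current.minimum = min(current.minimum, value)
            current.maximum = max(current.maximum, value)
            current.total += value
    return [bins[i] for i in sorted(bins)]


# Hypothetical scalar channel: four events grouped into 10-second bins.
events = [(0.5, 1.0), (3.0, 3.0), (12.0, 10.0), (19.9, 20.0)]
binned = bin_events(events, t0=0.0, width=10.0)
```

Keeping the sum and count rather than a running mean lets adjacent bins be merged later into coarser bins without revisiting the raw events.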
Slides WE3BCO01 [1.179 MB]
DOI • reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-WE3BCO01  
About • Received ※ 04 October 2023 — Revised ※ 09 November 2023 — Accepted ※ 14 December 2023 — Issued ※ 14 December 2023
WE3BCO03 Data Management for Tracking Optic Lifetimes at the National Ignition Facility 1012
  • R.D. Clark, L.M. Kegelmeyer
    LLNL, Livermore, California, USA
  Funding: This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
The National Ignition Facility (NIF), the most energetic laser in the world, employs over 9000 optics to reshape, amplify, redirect, smooth, focus, and convert the wavelength of laser light as it travels along 192 beamlines. Underlying the management of these optics is an extensive Oracle database storing details of the entire life of each optic from the time it leaves the vendor to the time it is retired. This journey includes testing and verification, preparing, installing, monitoring, removing, and in some cases repairing and re-using the optics. This talk will address data structures and processes that enable storing information about each step, such as identifying where an optic is in its lifecycle and tracking damage through time. We will describe tools for reporting status and enabling key decisions, such as which damage sites should be blocked or repaired and which optics should be exchanged. Managing relational information and ensuring its integrity is key to managing the status and inventory of optics for NIF.
LLNL Release Number: LLNL-ABS-847598
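The abstract describes relational structures for locating an optic in its lifecycle and tracking damage through time, but gives no schema. The sketch below is purely illustrative: it uses Python's built-in `sqlite3` module as a stand-in for the Oracle database, and every table, column, stage name and optic identifier is a hypothetical choice for this example, not the NIF design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # reject rows that reference unknown optics
conn.executescript("""
    CREATE TABLE optic (
        optic_id TEXT PRIMARY KEY,
        kind     TEXT NOT NULL
    );
    -- One row per lifecycle step; the latest row gives the current stage.
    CREATE TABLE lifecycle_event (
        event_id   INTEGER PRIMARY KEY,
        optic_id   TEXT NOT NULL REFERENCES optic(optic_id),
        stage      TEXT NOT NULL,
        event_time TEXT NOT NULL   -- ISO 8601, so text order equals time order
    );
    -- One row per inspection of a damage site, to follow its growth over time.
    CREATE TABLE damage_observation (
        optic_id    TEXT NOT NULL REFERENCES optic(optic_id),
        site_id     INTEGER NOT NULL,
        observed_at TEXT NOT NULL,
        diameter_um REAL NOT NULL,
        PRIMARY KEY (optic_id, site_id, observed_at)
    );
""")

# Hypothetical example data.
conn.execute("INSERT INTO optic VALUES (?, ?)", ("OPT-001", "example window"))
conn.executemany(
    "INSERT INTO lifecycle_event (optic_id, stage, event_time) VALUES (?, ?, ?)",
    [
        ("OPT-001", "received", "2023-01-10T09:00:00"),
        ("OPT-001", "installed", "2023-02-01T14:30:00"),
        ("OPT-001", "removed", "2023-06-15T08:00:00"),
    ],
)
conn.executemany(
    "INSERT INTO damage_observation VALUES (?, ?, ?, ?)",
    [
        ("OPT-001", 1, "2023-03-01T00:00:00", 40.0),
        ("OPT-001", 1, "2023-05-01T00:00:00", 95.0),
    ],
)


def current_stage(conn, optic_id):
    """Return the most recent lifecycle stage of an optic, or None if unknown."""
    row = conn.execute(
        "SELECT stage FROM lifecycle_event WHERE optic_id = ? "
        "ORDER BY event_time DESC, event_id DESC LIMIT 1",
        (optic_id,),
    ).fetchone()
    return row[0] if row else None


def damage_history(conn, optic_id, site_id):
    """Return (observed_at, diameter_um) pairs for one damage site, oldest first."""
    return conn.execute(
        "SELECT observed_at, diameter_um FROM damage_observation "
        "WHERE optic_id = ? AND site_id = ? ORDER BY observed_at",
        (optic_id, site_id),
    ).fetchall()
```

Storing lifecycle steps as append-only events, rather than overwriting a single status column, keeps the full history of each optic available for reporting.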
Slides WE3BCO03 [2.379 MB]
DOI • reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-WE3BCO03  
About • Received ※ 26 September 2023 — Revised ※ 09 October 2023 — Accepted ※ 13 October 2023 — Issued ※ 24 October 2023
WE3BCO04 Improving Observability of the SCADA Systems Using Elastic APM, Reactive Streams and Asynchronous Communication 1016
  • I. Khokhriakov
    University of California, San Diego (UCSD), La Jolla, California, USA
  • V. Mazalova
    CFEL, Hamburg, Germany
  • O. Merkulova
    IK, Moscow, Russia
  As modern control systems grow in complexity, ensuring observability and traceability becomes more challenging. To meet this challenge, we present a novel solution that seamlessly integrates with multiple SCADA frameworks to provide end-to-end visibility into complex system interactions. Our solution utilizes Elastic APM to monitor and trace the performance of system components, allowing for real-time analysis and diagnosis of issues. In addition, our solution is built using reactive design principles and asynchronous communication, enabling it to scale to meet the demands of large, distributed systems. This presentation will describe our approach and discuss how it can be applied to various use cases, including particle accelerators and other scientific facilities. We will also discuss the benefits of our solution, such as improved system observability and traceability, reduced downtime, and better resource allocation. We believe that our approach represents a significant step forward in the development of modern control systems, and we look forward to sharing our work with the community at ICALEPCS 2023.
* Igor Khokhriakov et al., "A novel solution for controlling hardware components of accelerators and beamlines"
Slides WE3BCO04 [3.377 MB]
DOI • reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-WE3BCO04  
About • Received ※ 29 September 2023 — Revised ※ 14 November 2023 — Accepted ※ 19 December 2023 — Issued ※ 22 December 2023
WE3BCO05 The CMS Detector Control Systems Archiving Upgrade 1022
  • W. Karimeh
    CERN, Meyrin, Switzerland
  The CMS experiment relies on its Detector Control System (DCS) to monitor and control over 10 million channels, ensuring a safe and operable detector that is ready to take physics data. The data is archived in the CMS Oracle conditions database, which is accessed by operators, trigger and data acquisition systems. In the upcoming extended year-end technical stop of 2023/2024, the CMS DCS software will be upgraded to the latest WinCC-OA release, which will utilise the SQLite database and the Next Generation Archiver (NGA), replacing the current Raima database and RDB manager. Taking advantage of this opportunity, CMS has developed its own version of the NGA backend to improve its DCS database interface. This paper presents the CMS DCS NGA backend design and mechanism to improve the efficiency of the read-and-write data flow. This is achieved by simplifying the current Oracle conditions schema and introducing a new caching mechanism. The proposed backend will enable faster data access and retrieval, ultimately improving the overall performance of the CMS DCS.  
Slides WE3BCO05 [1.920 MB]
DOI • reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-WE3BCO05  
About • Received ※ 06 October 2023 — Revised ※ 12 October 2023 — Accepted ※ 14 December 2023 — Issued ※ 14 December 2023
WE3BCO06 Assonant: A Beamline-Agnostic Event Processing Engine for Data Collection and Standardization 1025
  • P.B. Mausbach, E.X. Miqueles, A. Pinto
    LNLS, Campinas, Brazil
  Synchrotron radiation facilities comprise beamlines designed to perform a wide range of X-ray experimental techniques, which require complex instruments to monitor thermodynamic variables and sample-related variables, among others. Thus, synchrotron beamlines can produce heterogeneous sets of data and metadata, hereafter referred to as data, which pose several challenges for standardization. For open science and the FAIR principles, such standardization is paramount for research reproducibility, besides accelerating the development of scalable and reusable data-driven solutions. To address this issue, Assonant was devised to collect and standardize the data produced at the beamlines of Sirius, the Brazilian fourth-generation synchrotron light source. This solution enables a NeXus-compliant, technique-centric data standard at Sirius that is transparent to beamline teams: it removes the burden of standardization tasks from them and provides a unified standardization solution for several techniques at Sirius. Assonant implements a software interface that abstracts data-format-related specificities and sends the produced data to an event-driven infrastructure composed of stream processing and microservices, which transforms the data flow according to NeXus*. This paper presents the development process of Assonant, the strategy adopted to standardize beamlines at different operating stages, and the challenges faced during the standardization process for macromolecular crystallography and imaging data at Sirius.
* M. Könnecke et al., "The NeXus data format", Journal of Applied Crystallography, vol. 48, no. 1, pp. 301-305, 2015.
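To make the idea of transforming heterogeneous beamline data into a NeXus-style layout concrete, here is a minimal sketch under stated assumptions. It is not Assonant's interface: the function `standardize`, the beamline field names and the mapping table are hypothetical, and the target paths only imitate the hierarchical group/field style of NeXus rather than claiming to be a validated application definition.

```python
from __future__ import annotations

# Hypothetical mapping from beamline-specific field names to NeXus-style paths.
FIELD_MAP = {
    "mono_energy_kev": "entry/instrument/monochromator/energy",
    "sample_T": "entry/sample/temperature",
    "sample_name": "entry/sample/name",
}


def standardize(event: dict, field_map: dict[str, str]) -> tuple[dict, list[str]]:
    """Place each mapped field of a flat event into a nested, path-based tree.

    Returns the nested tree and the sorted names of fields that had no mapping,
    so unmapped metadata is reported instead of silently lost.
    """
    tree: dict = {}
    unmapped: list[str] = []
    for name, value in event.items():
        path = field_map.get(name)
        if path is None:
            unmapped.append(name)
            continue
        *groups, leaf = path.split("/")
        node = tree
        for group in groups:
            node = node.setdefault(group, {})  # create intermediate groups on demand
        node[leaf] = value
    return tree, sorted(unmapped)


# Hypothetical event as it might arrive from a beamline.
event = {"mono_energy_kev": 12.0, "sample_T": 100.0, "shutter_state": "open"}
tree, unmapped = standardize(event, FIELD_MAP)
```

Keeping the mapping in a plain table, separate from the transformation code, means a new technique or beamline can be supported by adding entries rather than changing logic.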
Slides WE3BCO06 [4.909 MB]
DOI • reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-WE3BCO06  
About • Received ※ 05 October 2023 — Accepted ※ 08 December 2023 — Issued ※ 18 December 2023  
WE3BCO07 Extending the ICAT Metadata Catalogue to New Scientific Use Cases 1033
  • A. Götz, M. Bodin, A. De Maria Antolinos, M. Gaonach
    ESRF, Grenoble, France
  • M. AlMohammad, S.A. Matalgah
    SESAME, Allan, Jordan
  • P. Austin, V. Bozhinov, L.E. Davies, A. Gonzalez Beltran, K.S. Phipps
    STFC/RAL/SCD, Didcot, United Kingdom
  • R. Cabezas Quirós
    ALBA-CELLS, Cerdanyola del Vallès, Spain
  • R. Krahl
    HZB, Berlin, Germany
  • A. Pinto
    LNLS, Campinas, Brazil
  • K. Syder
    DLS, Oxfordshire, United Kingdom
  The ICAT metadata catalogue is a flexible solution for managing scientific metadata and data from a wide variety of domains following the FAIR data principles. This paper will present an update on recent developments of the ICAT metadata catalogue and the latest status of the ICAT collaboration. ICAT was originally developed by the UK Science and Technology Facilities Council (STFC) to manage the scientific data of the ISIS Neutron and Muon Source and Diamond Light Source. STFC has since been joined by a number of other institutes, including ESRF, HZB, SESAME, and ALBA, which together now form the ICAT Collaboration [1]. ICAT has been used to manage petabytes of scientific data for ISIS, DLS, ESRF and HZB, and will in future do so for SESAME and ALBA, making these data FAIR. We will present the latest version of the ICAT core, the new user interfaces DataGateway and DataHub, and extensions to ICAT for free-text searching, a common search interface across photon and neutron catalogues, a protocol-based interface that makes the metadata available for findability, electronic logbooks, sample tracking, and web-based data and domain-specific viewers developed by the community. Finally, recent developments that use ICAT to build applications for processed data with rich metadata in the fields of small-angle scattering, macromolecular crystallography and cryo-electron microscopy will be described.
[1] https://icatproject.org
Slides WE3BCO07 [7.888 MB]
DOI • reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-WE3BCO07  
About • Received ※ 05 October 2023 — Revised ※ 23 October 2023 — Accepted ※ 14 December 2023 — Issued ※ 14 December 2023
WE3BCO08 Efficient and Automated Metadata Recording and Viewing for Scientific Experiments at MAX IV 1041
  • D. van Dijken, V. Da Silva (presenter), M. Eguiraun, V. Hardion, J.M. Klingberg, M. Leorato, M. Lindberg
    MAX IV Laboratory, Lund University, Lund, Sweden
  Advances in beamline instrumentation have significantly improved the capabilities of synchrotron research facilities. The detectors used today can generate thousands of frames within seconds. Consequently, an organized and adaptable framework is essential for efficient access to, and assessment of, the enormous volumes of data produced. Our communication presents a metadata management solution recently implemented at MAX IV, which automatically retrieves and records metadata from Tango devices relevant to the current experiment. The solution includes user-selected scientific metadata and predefined defaults related to the beamline setup, which are integrated into the Sardana control system and automatically recorded during each scan via the SciFish[1] library. The recorded metadata is stored in the SciCat[2] database, which can be accessed through a web-based interface called Scanlog[3]. The interface, built on ReactJS, allows users to easily sort, filter, and extract important information from the recorded metadata. The tool also provides real-time access to metadata, enabling users to monitor experiments and export data for post-processing. These new software tools ensure that recorded data is findable, accessible, interoperable and reusable (FAIR[4]) for many years to come. Collaborations are ongoing to develop these tools at other particle accelerator research facilities.
[1] https://gitlab.com/MaxIV/lib-maxiv-scifish
[2] https://scicatproject.github.io/
[3] https://gitlab.com/MaxIV/svc-maxiv-scanlog
[4] https://www.nature.com/articles/sdata201618
Slides WE3BCO08 [1.914 MB]
DOI • reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-WE3BCO08  
About • Received ※ 06 October 2023 — Revised ※ 23 October 2023 — Accepted ※ 14 December 2023 — Issued ※ 16 December 2023
WE3BCO09 IR of FAIR - Principles at the Instrument Level 1046
  • G. Günther, O. Mannix, V. Serve
    HZB, Berlin, Germany
  • S. Baunack
    KPH, Mainz, Germany
  • L. Capozza, F. Maas, M.C. Wilfert
    HIM, Mainz, Germany
  • O. Freyermuth
    Uni Bonn, Bonn, Germany
  • P. Gonzalez-Caminal, S. Karstensen, A. Lindner, I. Oceano, C. Schneide, K. Schwarz, T. Schörner-Sadenius, L.-M. Stein
    DESY, Hamburg, Germany
  • B. Gou
    IMP/CAS, Lanzhou, People’s Republic of China
  • J. Isaak, S. Typel
    TU Darmstadt, Darmstadt, Germany
  • A.K. Mistry
    GSI, Darmstadt, Germany
  Awareness of the need for FAIR data management has increased in recent years, but examples of how to achieve it are often missing. Focusing on the large-scale instrument A4 at the MAMI accelerator, we transfer findings of the EMIL project at the BESSY synchrotron* to improve raw data, i.e. the primary output stored on a long-term basis, according to the FAIR principles. Here, the instrument control software plays a key role as the central authority that starts measurements and orchestrates the connected (meta)data-taking processes. In regular discussions we incorporate the experiences of a wider community and work to optimize instrument output through various measures, ranging from conversion to machine-readable formats, through metadata enrichment, to additional files that create scientific context. The improvements have already been applied to next-generation instruments currently being built and could serve as a general guideline for publishing data sets.
*G. Günther et al. FAIR meets EMIL: Principles in Practice. Proceedings of ICALEPCS2021, https://doi.org/10.18429/JACoW-ICALEPCS2021-WEBL05
Slides WE3BCO09 [1.400 MB]
DOI • reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-WE3BCO09  
About • Received ※ 04 October 2023 — Revised ※ 24 October 2023 — Accepted ※ 08 December 2023 — Issued ※ 15 December 2023