ICALEPCS2023 - Table of Session: TU1BC (Artificial Intelligence & Machine Learning)

Paper	Title	Page
TU1BCO01	A Workflow for Training and Deploying Machine Learning Models to EPICS	244
	M.F. Leputa, K.R.L. Baker, M. Romanovschi
	STFC/RAL/ISIS, Chilton, Didcot, Oxon, United Kingdom
	The transition to EPICS as the control system for the ISIS Neutron and Muon Source accelerators is an opportunity to integrate machine learning into operations more easily. But developing high-quality machine learning (ML) models is insufficient. Integration into critical operations requires good development practices to ensure stability and reliability during deployment and to allow robust and easy maintenance. For these reasons we implemented a workflow for training and deploying models that uses off-the-shelf, industry-standard tools such as MLflow. We discuss our experience of how adopting these tools can make developers' lives easier during the training phase of a project. We describe how these tools may be used in an automated deployment pipeline to allow the ML model to interact with our EPICS ecosystem through Python-based IOCs within a containerized environment. This reduces the developer effort required to produce GUIs for interacting with the models within the ISIS Main Control Room, as tools familiar to operators, such as Phoebus, may be used.
	Slides TU1BCO01 [3.370 MB]
DOI •	reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-TU1BCO01
About •	Received ※ 05 October 2023 — Accepted ※ 12 October 2023 — Issued ※ 19 October 2023

TU1BCO02	Integrating System Knowledge in Unsupervised Anomaly Detection Algorithms for Simulation-Based Failure Prediction of Electronic Circuits	249
	F. Waldhauser, H. Boukabache, D. Perrin, S. Roesler
	CERN, Meyrin, Switzerland
	M. Dazer
	Universität Stuttgart, Stuttgart, Germany
	Funding: This work has been sponsored by the Wolfgang Gentner Programme of the German Federal Ministry of Education and Research (grant no. 13E18CHA).
	Machine learning algorithms enable failure prediction of large-scale, distributed systems using historical time-series datasets. Although unsupervised learning algorithms can detect an evolving variety of anomalies, they do not provide links between detected data events and system failures. Additional system knowledge is required for machine learning algorithms to determine the nature of detected anomalies, which may represent either healthy system behavior or failure precursors. However, knowledge of failure behavior is expensive to obtain and might only be available upon pre-selection of anomalous system states using unsupervised algorithms. Moreover, system knowledge obtained from evaluation of system states needs to be appropriately provided to the algorithms to enable performance improvements. In this paper, we present an approach to efficiently configure the integration of system knowledge into unsupervised anomaly detection algorithms for failure prediction. The methodology is based on simulations of failure modes of electronic circuits. Triggering system failures based on synthetically generated failure behaviors enables analysis of the detectability of failures and generation of different types of datasets containing system knowledge. In this way, the requirements for the type and extent of system knowledge from different sources can be determined, and suitable algorithms allowing the integration of additional data can be identified.
	Slides TU1BCO02 [2.541 MB]
DOI •	reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-TU1BCO02
About •	Received ※ 02 October 2023 — Accepted ※ 12 October 2023 — Issued ※ 25 October 2023

TU1BCO03	Systems Modelling, AI/ML Algorithms Applied to Control Systems	257
	S.A. Mnisi
	SARAO, Cape Town, South Africa
	Funding: National Research Foundation (South Africa)
	The 64-receptor radio telescope in the Karoo, South Africa (with 20 more receptors being built), comprises a large number of devices and components connected to the Control-and-Monitoring (CAM) system via the Karoo Array Telescope Communication Protocol (KATCP). KATCP is used extensively for internal communications between CAM components and other subsystems. A KATCP interface exposes requests and sensors; sampling strategies are set on sensors, ranging from several updates per second to infrequent on-change updates. The sensor samples are of different types, from small integers to text fields. The samples and associated timestamps are permanently stored and made available for scientists, engineers and operators to query and analyze. This presentation shows how to apply machine learning tools, which use data-driven algorithms and statistical models, to analyze sensor data sets and then draw inferences from identified patterns or make predictions based on them. The algorithms learn from the sensor data as they run against it, unlike traditional rules-based analytics systems that follow explicit instructions. Since this involves data preprocessing, we will go through how the MeerKAT telescope data storage infrastructure (called Katstore) manages the variety, velocity and volume of this data.
	Slides TU1BCO03 [1.647 MB]
DOI •	reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-TU1BCO03
About •	Received ※ 06 October 2023 — Revised ※ 09 November 2023 — Accepted ※ 14 December 2023 — Issued ※ 21 December 2023

TU1BCO04	Laser Focal Position Correction Using FPGA-Based ML Models	262
	J.A. Einstein-Curtis, S.J. Coleman, N.M. Cook, J.P. Edelen
	RadiaSoft LLC, Boulder, Colorado, USA
	S.K. Barber, C.E. Berger, J. van Tilborg
	LBNL, Berkeley, California, USA
	Funding: This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of High Energy Physics under Award Number DE-SC 00259037.
	High repetition-rate, ultrafast laser systems play a critical role in a host of modern scientific and industrial applications. We present a diagnostic and correction scheme for controlling and determining laser focal position by utilizing fast wavefront sensor measurements from multiple positions to train a focal position predictor. This predictor and additional control algorithms have been integrated into a unified control interface and FPGA-based controller on beamlines at the BELLA facility at LBNL. An optics section is adjusted online to provide the desired correction to the focal position on millisecond timescales by determining corrections for an actuator in a telescope section along the beamline. Our initial proof-of-principle demonstrations leveraged pre-compiled data and pre-trained networks operating ex situ from the laser system. A framework for generating a low-level hardware description of ML-based correction algorithms on FPGA hardware was coupled directly to the beamline using the AMD Xilinx Vitis AI toolchain in conjunction with deployment scripts. Lastly, we consider the use of remote computing resources, such as the Sirepo scientific framework*, to actively update these correction schemes and deploy models to a production environment.
	* M.S. Rakitin et al., "Sirepo: an open-source cloud-based software interface for X-ray source and optics simulations", Journal of Synchrotron Radiation 25, 1877-1892 (Nov 2018).
	Slides TU1BCO04 [1.876 MB]
DOI •	reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-TU1BCO04
About •	Received ※ 06 October 2023 — Accepted ※ 14 November 2023 — Issued ※ 18 December 2023

TU1BCO05	Model Driven Reconfiguration of LANSCE Tuning Methods	267
	C.E. Taylor, P.M. Anisimov, S.A. Baily, E.-C. Huang, H.L. Leffler, L. Rybarcyk, A. Scheinker, H.A. Watkins, E.E. Westbrook, D.D. Zimmermann
	LANL, Los Alamos, New Mexico, USA
	Funding: National Nuclear Security Administration (NNSA)
	This work presents a review of the shift in tuning methods employed at the Los Alamos Neutron Science Center (LANSCE). We explore the tuning categories and methods employed in four key sections of the accelerator, namely the Low-Energy Beam Transport (LEBT), the Drift Tube Linac (DTL), the Side-Coupled Cavity Linac (CCL), and the High-Energy Beam Transport (HEBT). The study also presents the findings of employing novel software tools and algorithms to enhance each domain’s beam quality and performance. It showcases the efficacy of integrating model-driven and model-independent tuning techniques, along with acceptance and adaptive tuning strategies, to improve beam delivery to experimental facilities. The research also addresses prospective strategies for augmenting the control system and diagnostics of LANSCE.
	* R.W. Garnett, J. Phys.: Conf. Ser. 1021 012001
	** A. Scheinker, Rev. ST Accel. Beams 16 102803 2013
	*** R. Keller, Proc of Part Accel Conf
	**** M. Oothoudt, Proc of Part Accel Conf, 2003, v4
	Slides TU1BCO05 [2.886 MB]
DOI •	reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-TU1BCO05
About •	Received ※ 06 October 2023 — Revised ※ 08 October 2023 — Accepted ※ 12 December 2023 — Issued ※ 13 December 2023

TU1BCO06	Disentangling Beam Losses in The Fermilab Main Injector Enclosure Using Real-Time Edge AI	273
	K.J. Hazelwood, J.M.S. Arnold, M.R. Austin, J.R. Berlioz, P.M. Hanlet, M.A. Ibrahim, A.T. Livaudais-Lewis, J. Mitrevski, V.P. Nagaslaev, A. Narayanan, D.J. Nicklaus, G. Pradhan, A.L. Saewert, B.A. Schupbach, K. Seiya, R.M. Thurman-Keup, N.V. Tran
	Fermilab, Batavia, Illinois, USA
	J.YC. Hu, J. Jiang, H. Liu, S. Memik, R. Shi, A.M. Shuping, M. Thieme, C. Xu
	Northwestern University, Evanston, Illinois, USA
	A. Narayanan
	Northern Illinois University, DeKalb, Illinois, USA
	The Fermilab Main Injector enclosure houses two accelerators, the Main Injector and the Recycler Ring. During normal operation, high-intensity proton beams exist simultaneously in both. The two accelerators share the same beam loss monitors (BLMs) and monitoring system. Deciphering the origin of any of the 260 BLM readings is often difficult. The (Accelerator) Real-time Edge AI for Distributed Systems project, or READS, has developed an AI/ML model, implemented on fast FPGA hardware, that disentangles mixed beam losses and, in real time, assigns each BLM a probability of which machine(s) the loss originated from. The model inferences are then streamed to the Fermilab accelerator controls network (ACNET), where they are available to operators and experts alike to aid in tuning the machines.
DOI •	reference for this paper ※ doi:10.18429/JACoW-ICALEPCS2023-TU1BCO06
About •	Received ※ 06 October 2023 — Revised ※ 11 October 2023 — Accepted ※ 15 November 2023 — Issued ※ 06 December 2023
