Multimodal Industrial Scene Characterisation for Pouring Process Monitoring Using a Mixture of Experts

Industrial pouring processes operate under highly dynamic conditions where small deviations can lead to defects, scrap, and production losses. Although modern foundries are equipped with multiple sensors and visual inspection systems, most monitoring approaches remain fragmented, unimodal, and difficult to interpret. Furthermore, annotated anomalous samples are scarce in industrial settings, which hinders the development of traditional supervised methods. As a result, many critical pouring anomalies are detected too late or lack sufficient contextual information for effective decision-making. In this work, we propose a multimodal framework for industrial scene characterisation that combines visual information and process signals through an explainable Mixture-of-Experts (MoE)-style expert-fusion strategy. First, we deploy an ensemble of specialised modules that collaborate to identify regions of interest, assess pouring quality, and contextualise events within the production process, thereby generating an interpretable description of pouring events. Second, we introduce a novel anomaly detection method for multimodal video data that combines a self-supervised transformer with an outlier-aware clustering algorithm. This approach identifies rare anomalies without requiring extensive manual labelling. The resulting information is structured into a digital-twin-ready representation, supporting synchronisation between the physical system and its virtual counterpart. This solution provides a scalable, deployable pathway to transform heterogeneous industrial data into actionable knowledge, supporting advanced monitoring, anomaly detection, and quality control in real foundry environments.

 

Acknowledgements: This research work was funded by the Elkartek Programme (Basque Government) for the IKUN project (grant number KK-2024/00064). The views and opinions expressed are solely those of the authors and do not necessarily reflect those of the Basque Government, nor can the Basque Government be held responsible for them.

 
Authors:

Javier Nieves (AZTERLAN), Javier Selva (Vicomtech), Guillermo Elejoste-Rementeria (AZTERLAN), Jorge Angulo-Pines (AZTERLAN), Jon Leiñena (Vicomtech), Xuban Barberena (Vicomtech), Fátima A. Saiz (Vicomtech)

Keywords:

multimodal monitoring; industrial foundry; digital twin; Mixture of Experts; computer vision; deep learning; casting process monitoring; anomaly detection; video–sensor fusion; clustering
