
2025 IEEE 7th International Conference on Civil Aviation Safety and Information Technology (ICCASIT) / 2025

Research on the Intelligent Perception System of Robots Integrated with Multimodal AI Technology

Guanchi Zhu, Minwei Sun

AI Safety, Computer Vision, Multimodal Learning, Robotics

Single-modal robot perception is easily disturbed by lighting changes, noise, and similar factors, which leaves it insufficiently robust in complex environments. To address this, the paper proposes an intelligent robot perception system that integrates vision, hearing, and touch. The system adopts a hierarchical "perception layer - preprocessing layer - fusion layer - application layer" architecture and uses a modality attention mechanism to allocate weights to cross-modal features dynamically, addressing heterogeneous data alignment and information complementarity. Experimental results show that, compared with single-modal systems, the system improves perception accuracy on target recognition and positioning tasks by 23%-35% and accuracy on object attribute judgment by 18%-27%. Compared with traditional fusion methods, its robustness under noise interference improves by 15%-20%, and the single-sample processing delay stays within 80 ms, which can meet the real-time interaction requirements of applications such as service robots and industrial inspection.
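
The abstract does not give the details of the modality attention mechanism, so the following is only a minimal sketch of what dynamic weight allocation across modalities could look like, not the authors' implementation. It assumes each modality has already been preprocessed into a feature vector of the same length, and it scores each modality with a scaled dot product against a scoring vector. The names `modality_attention_fuse`, `features`, and `query`, as well as all numeric values, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def modality_attention_fuse(features, query):
    """Fuse per-modality feature vectors using attention weights.

    features: dict mapping modality name -> feature vector. All vectors
              are assumed to be aligned to one shared embedding size.
    query:    hypothetical scoring vector of the same length (in a
              trained system this would be learned, not hand-set).
    Returns (weights, fused_vector).
    """
    dim = len(query)
    for name, vec in features.items():
        if len(vec) != dim:
            raise ValueError(f"{name} has length {len(vec)}, expected {dim}")

    names = list(features)
    # Scaled dot-product relevance score for each modality.
    scores = [
        sum(f * q for f, q in zip(features[n], query)) / math.sqrt(dim)
        for n in names
    ]
    # Softmax turns the scores into weights that sum to 1.
    weights = dict(zip(names, softmax(scores)))
    # Weighted sum of the modality vectors, element by element.
    fused = [
        sum(weights[n] * features[n][i] for n in names)
        for i in range(dim)
    ]
    return weights, fused


# Illustrative values only; they are not taken from the paper.
features = {
    "vision": [0.9, 0.1, 0.4, 0.2],
    "audio": [0.2, 0.8, 0.1, 0.3],
    "touch": [0.1, 0.2, 0.7, 0.6],
}
query = [1.0, 0.0, 0.5, 0.0]
weights, fused = modality_attention_fuse(features, query)
```

Because the weights come from a softmax, they always sum to 1, and a modality whose features score poorly against the scoring vector (for example, vision under bad lighting) contributes less to the fused vector. A learned version would typically replace the fixed `query` with trained parameters.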

0 citations, 0 influential
