Research on a digital twin-driven decision and control system for tunneling robots

张旭辉; 吕欣媛; 王 甜; 黄本鑫; 郑西利

Abstract: Remote control of tunneling equipment suffers from weak equipment decision-making capability, low tunneling efficiency, and serious safety hazards. To address these problems, a digital twin-driven decision and control method for tunneling robots is proposed. Based on a comparative review of current digital twin research in the coal mining field, a system framework is designed that comprises six modules: physical space, virtual space, twin data, planning layer, control layer, and execution layer. Within this framework, the virtual prototype plans and decides autonomously and remotely drives the physical prototype to move in synchrony. First, a local obstacle avoidance strategy for unstructured environments is studied using virtual reality technology. A motion control model and a sensor observation model of the tunneling robot are established, obstacles in the roadway are reconstructed in the virtual environment from LiDAR data, and the Ray-Col method is used to detect collisions between the robot and obstacles, which lays the foundation for path planning decisions. Second, a global path planning method based on a virtual agent is studied using deep reinforcement learning. The Muti-PPO algorithm, an improvement on the PPO algorithm, is proposed; a virtual agent of the tunneling robot is built through a reward and penalty mechanism and trained on the Unity3D platform. The training results show that, compared with the PPO and SAC algorithms, Muti-PPO raises the average reward by 13.82% and 11.31%, respectively, lowers the standard deviation by 17.85% and 16.81%, respectively, and raises the maximum reward by 0.14% and 0.43%, respectively, giving the best performance of the three algorithms. Finally, a decision and control platform is built: decision commands generated in the virtual space are sent to the end effector of the physical prototype, and sensor data from the physical prototype drive the virtual prototype to change in synchrony. To verify the planning and decision-making, bidirectional mapping, and remote control functions of the system, a path planning experiment and a virtual-real synchronous motion experiment are designed. The path planning results show that, under three working conditions of differing complexity, the error between the end point of the path planned by the virtual agent and the target point is within 1.2 cm, and the control information can be transmitted to the physical space to control the robot remotely. The synchronous motion results show that the virtual and physical prototypes move in synchrony while the tunneling robot is running, and their poses in the roadway remain consistent. The method realizes a new unmanned decision and control mode characterized by "data-driven operation, bidirectional mapping, collision detection, autonomous decision-making, and human-machine collaboration", and offers a new approach to making tunneling equipment intelligent.
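The abstract names the Ray-Col method but does not describe its internals. As a general illustration of ray-based collision detection, and not a reconstruction of the authors' method, the following minimal sketch casts a fan of rays from the robot and reports the distance to the nearest obstacle along each ray. The 2D geometry, the circular obstacles, and all names (ray_circle_distance, cast_rays, collides, safety_margin) are assumptions made for this example.

```python
import math


def ray_circle_distance(origin, direction, center, radius):
    """Distance along a unit-length ray to a circle, or None if the ray misses."""
    ox, oy = origin
    dx, dy = direction
    cx, cy = center
    # Vector from the circle centre to the ray origin.
    fx, fy = ox - cx, oy - cy
    # Solve |f + t*d|^2 = r^2 for t, using |d| = 1.
    b = fx * dx + fy * dy
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - c
    if disc < 0:
        return None  # the ray's line never touches the circle
    root = math.sqrt(disc)
    t_near, t_far = -b - root, -b + root
    if t_far < 0:
        return None  # the circle lies entirely behind the ray origin
    return max(t_near, 0.0)  # 0.0 means the origin is already inside


def cast_rays(position, heading, obstacles, n_rays=9, fov=math.pi, max_range=5.0):
    """Cast n_rays evenly across fov; return each hit distance, capped at max_range."""
    distances = []
    for i in range(n_rays):
        angle = heading - fov / 2 + fov * i / (n_rays - 1)
        direction = (math.cos(angle), math.sin(angle))
        nearest = max_range
        for center, radius in obstacles:
            d = ray_circle_distance(position, direction, center, radius)
            if d is not None and d < nearest:
                nearest = d
        distances.append(nearest)
    return distances


def collides(distances, safety_margin):
    """Flag a collision risk when any ray hits closer than the safety margin."""
    return min(distances) < safety_margin


if __name__ == "__main__":
    # Hypothetical scene: one obstacle of radius 0.5 m, 2 m ahead of the robot.
    obstacles = [((2.0, 0.0), 0.5)]
    readings = cast_rays((0.0, 0.0), 0.0, obstacles)
    print(readings, collides(readings, safety_margin=2.0))
```

A vector of ray distances like this is one common way to give a reinforcement learning agent a compact view of nearby obstacles; whether the paper's virtual agent uses such an observation is not stated in the abstract.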
