Modern Defense Technology, 2026, Vol. 54, Issue (3): 93-103. DOI: 10.3969/j.issn.1009-086x.2026.03.009

• PAPERS •

Simulation of Game-Theoretic Decision-Making for Beyond-Visual-Range Combat with UCAVs

Tongyu SHI1, Hao WANG2, Youkun WANG1, Maolong LÜ1   

  1. Air Traffic Control and Navigation School, Air Force Engineering University, Xi'an 710051, China
  2. Academy of Military Science, Institute of Military Intelligence, Beijing 100091, China
  • Received: 2025-05-28 Revised: 2025-08-21 Online: 2026-06-28 Published: 2026-07-03
  • Contact: Maolong Lü

Simulation of Game-Confrontation Decision-Making in Beyond-Visual-Range Air Combat for Unmanned Combat Aircraft

史桐雨1, 王昊2, 王酉琨1, 吕茂隆1   

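  1. Air Traffic Control and Navigation School, Air Force Engineering University, Xi'an, Shaanxi 710051
  2. Institute of Military Intelligence, Academy of Military Science, Beijing 100091
  • Corresponding author: 吕茂隆 (Maolong LÜ)
  • About the author: 史桐雨 (Tongyu SHI, born 2004), male, from Nanyang, Henan. Undergraduate student; research interest: manned-unmanned cooperative air combat.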

Abstract:

The performance of reinforcement learning (RL) in beyond-visual-range (BVR) air combat is constrained by the quality of the opponents available for training. This paper proposes a rule-based agent decision framework that serves as the training adversary for RL agents; simulations show that agents trained against it efficiently master typical tactical maneuvers and achieve markedly higher combat effectiveness. Fundamental aircraft maneuvers are modeled, and an air combat simulation module and a cooperative strategy training module are built. To address the incomplete rule coverage and complexity of conventional rule-based decision trees, a state-machine-driven framework uses event conditions to trigger state transitions and combat decisions, and it outperforms the conventional decision tree in comparative simulations. Finally, a single-aircraft BVR RL agent trained against this state-machine-based opponent under expert-knowledge guidance autonomously acquires classical maneuvers and shows better decision adaptability, offering a direction for further research on BVR air combat decision systems.

Key words: beyond-visual-range air combat, unmanned aerial vehicle (UAV), autonomous decision-making, rule-based decision-making, reinforcement learning, expert knowledge
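
The state-machine-driven framework mentioned in the abstract, in which event conditions trigger state transitions and decisions, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the state names, the `Observation` fields, the `LAUNCH_RANGE_KM` threshold, and the action labels are all hypothetical and chosen only to show the general mechanism.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    # Hypothetical tactical states; the paper's actual states are not given here.
    SEARCH = auto()
    APPROACH = auto()
    ATTACK = auto()
    EVADE = auto()


@dataclass
class Observation:
    # Hypothetical situation inputs, for illustration only.
    target_detected: bool
    target_range_km: float
    missile_warning: bool


LAUNCH_RANGE_KM = 40.0  # assumed threshold, not taken from the paper

# Each state owns an ordered list of (event condition, next state) pairs.
# The first condition that holds fires; if none holds, the state is kept.
TRANSITIONS = {
    State.SEARCH: [
        (lambda o: o.missile_warning, State.EVADE),
        (lambda o: o.target_detected, State.APPROACH),
    ],
    State.APPROACH: [
        (lambda o: o.missile_warning, State.EVADE),
        (lambda o: not o.target_detected, State.SEARCH),
        (lambda o: o.target_range_km <= LAUNCH_RANGE_KM, State.ATTACK),
    ],
    State.ATTACK: [
        (lambda o: o.missile_warning, State.EVADE),
        (lambda o: not o.target_detected, State.SEARCH),
        (lambda o: o.target_range_km > LAUNCH_RANGE_KM, State.APPROACH),
    ],
    State.EVADE: [
        (lambda o: not o.missile_warning and o.target_detected, State.APPROACH),
        (lambda o: not o.missile_warning, State.SEARCH),
    ],
}

# One placeholder action label per state.
ACTIONS = {
    State.SEARCH: "patrol",
    State.APPROACH: "pursue",
    State.ATTACK: "launch_missile",
    State.EVADE: "break_turn",
}


def step(state, obs):
    """Apply the first matching transition, then return (new state, action)."""
    for condition, next_state in TRANSITIONS[state]:
        if condition(obs):
            state = next_state
            break
    return state, ACTIONS[state]
```

Calling `step(State.SEARCH, Observation(True, 80.0, False))` moves the agent from searching to approaching. Keeping the rules in a per-state table means each state only has to enumerate the events relevant to it, which is one plausible reading of why a state machine is easier to keep complete than a single large decision tree.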

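Abstract (translated from Chinese):

In beyond-visual-range air combat, the performance of reinforcement learning is limited by the quality of the training opponent. To address this, a rule-based agent decision framework is proposed as the training opponent for reinforcement learning agents. Simulations verify that agents trained with this framework efficiently master typical air combat strategies and achieve clearly improved combat effectiveness. The basic maneuvers of the fighter aircraft are introduced, and an air combat simulation module and a cooperative strategy training module are built. Because existing rule-based decision trees suffer from incomplete rule coverage and are tedious and difficult to organize, a decision logic framework based on state-machine transitions is proposed. It uses event conditions to drive state transitions and decisions, and it has stronger air combat decision-making capability than a traditional decision tree. A single-aircraft beyond-visual-range air combat reinforcement learning agent is built and trained with the state-machine-based decision logic framework as its opponent. Guided by rule-based expert knowledge, the trained agent learns typical maneuvers autonomously and shows better decision adaptability and combat capability, offering ideas for further research on beyond-visual-range air combat decision systems.

Key words (translated from Chinese): beyond-visual-range air combat, unmanned aerial vehicle, autonomous decision-making, rule-based decision-making, reinforcement learning, expert knowledge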

CLC Number: