Modern Defense Technology ›› 2020, Vol. 48 ›› Issue (5): 59-66. DOI: 10.3969/j.issn.1009-086x.2020.05.009

• COMMAND CONTROL AND COMMUNICATION •

Modeling of Air Target Threat to Warship Based on Deep Reinforcement Learning

FANG Xiao, ZENG Bi, SONG Xiang-xiang, JIA Zheng-xuan   

  1. Beijing Institute of Electronic Engineering, Beijing 100854, China
  • Received: 2020-04-24 Revised: 2020-05-18 Online: 2020-10-20 Published: 2021-02-01

基于深度强化学习的舰艇空中威胁行为建模

房霄, 曾贲, 宋祥祥, 贾正轩   

  1. 北京电子工程总体研究所,北京 100854
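  • About the author: FANG Xiao (1986-), male, from Beijing. Senior engineer, master's degree; mainly engaged in research on command and control and on equipment simulation training technology. Address: P.O. Box 142, Sub-box 30, Beijing 100854. E-mail: hmjs_0814@126.com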

Abstract: As weapons and equipment become intelligent at an accelerating pace, traditional training methods can no longer meet the training demands of large-scale modern warfare. Over the past decade, artificial intelligence (AI) methods such as deep reinforcement learning have achieved major breakthroughs in board games and e-sports. These results demonstrate the strength of AI methods on game problems with large search spaces and their ability to handle situation prediction and on-the-spot adjustment in military confrontation. Against this background, deep reinforcement learning is studied for modeling air threat behavior in warship air defense. A deep reinforcement learning model is first built using parallel scenario modeling and air threat decision-behavior modeling. A converged penetration strategy is then obtained through iterative adversarial learning in a single-aircraft penetration scenario. The result verifies the feasibility of deep reinforcement learning for modeling air threat behavior and provides support for further work on constructing training scenarios for fleet joint air defense.

Key words: deep reinforcement learning, artificial intelligence (AI), warship air defense, air threat, penetration strategy, modeling

摘要: 随着武器装备智能化发展的速度加快,传统武器装备的训练方法已经无法满足大规模现代战争的训练需求。在近十年中深度强化学习等人工智能方法在棋类以及电子竞技游戏中取得了极大突破,证明了人工智能方法在面对大搜索空间博弈问题的优势,能够有效解决军事对抗问题中的形势预判和临机调整问题。基于此背景,依托海军舰艇对空方面作战,开展了深度强化学习的方法研究。首先通过并行场景建模技术以及空中威胁决策行为建模技术实现深度学习模型的构建,之后通过单机突防场景的对抗迭代学习,得到收敛的突防策略。验证了深度强化学习方法在空中威胁行为构建场景的可行性,为后续深入开展编队联合防空训练场景构建提供支撑。

关键词: 深度强化学习, 人工智能, 舰艇防空, 空中威胁, 突防策略, 场景构建
