Modern Defence Technology (现代防御技术) ›› 2026, Vol. 54 ›› Issue (3): 190-200. DOI: 10.3969/j.issn.1009-086x.2026.03.017

Research on Military Equipment Knowledge Extraction Method Based on Retrieval-Augmented Generation

Fengguang ZHOU, Chunyan HU, Yuan ZHOU, Haoyuan ZHANG

  1. Beijing Aerocim Technology Co., Ltd., Beijing 102308, China
  • Received: 2025-08-15 Revised: 2025-10-01 Online: 2026-06-28 Published: 2026-07-03
  • About the author: Fengguang ZHOU (周丰光), male, from Zhoukou, Henan. Master's student; research interest: command information systems.

Abstract:

To address the difficulty of extracting knowledge from unstructured data in the military equipment domain, this paper proposes a knowledge extraction method based on retrieval-augmented generation (RAG) with hybrid search. First, a large language model assists in constructing an ontology model. Guided by this ontology, knowledge is extracted from semi-structured data as triples, and the resulting triples are used to build a database. Second, for unstructured data, a hybrid search method that combines sparse retrieval and dense retrieval is proposed to retrieve similar knowledge blocks, which serve as reference examples in prompt design. Finally, prompts for knowledge extraction in the military equipment domain are designed, and a large language model uses them to extract knowledge from unstructured data. The results show that the proposed method can extract knowledge from unstructured data. Compared with knowledge extraction without the RAG framework and with RAG-based extraction without hybrid search, the proposed method extracts more triples and achieves a higher recall rate.

Key words: retrieval-augmented generation (RAG), unstructured data, military equipment, large language model, hybrid search method, prompt
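
As an illustration of the hybrid search idea summarized in the abstract, the following is a minimal sketch rather than the authors' implementation. The abstract does not say how the sparse and dense results are combined, so the sketch assumes reciprocal rank fusion. It uses Okapi BM25 for the sparse side and substitutes a hashed character-trigram vector for a real embedding model on the dense side. All function names, parameters, and the sample knowledge blocks are hypothetical.

```python
import math
import re
import zlib
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a real system would use a domain-aware tokenizer."""
    return re.findall(r"\w+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Sparse retrieval: Okapi BM25 score of every document for the query."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    doc_freq = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (len(docs) - df + 0.5) / (df + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def toy_embed(text, dim=256):
    """ASSUMPTION: stand-in for a real embedding model (hashed trigram counts)."""
    vec = [0.0] * dim
    s = text.lower()
    for i in range(len(s) - 2):
        vec[zlib.crc32(s[i:i + 3].encode("utf-8")) % dim] += 1.0
    return vec


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def ranking(scores):
    """Document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def hybrid_search(query, docs, top_k=2, rrf_k=60):
    """Fuse sparse and dense rankings with reciprocal rank fusion (assumed)."""
    sparse = ranking(bm25_scores(query, docs))
    q_vec = toy_embed(query)
    dense = ranking([cosine(q_vec, toy_embed(d)) for d in docs])
    fused = Counter()
    for order in (sparse, dense):
        for rank, idx in enumerate(order, start=1):
            fused[idx] += 1.0 / (rrf_k + rank)
    best = sorted(fused, key=lambda i: fused[i], reverse=True)[:top_k]
    return [docs[i] for i in best]


# Hypothetical knowledge blocks, invented purely for illustration.
blocks = [
    "Vehicle X is equipped with Engine Y.",
    "Radar Z has a detection range of 100 km.",
    "The canteen opens at noon.",
]
query = "What is the detection range of Radar Z?"
print(hybrid_search(query, blocks, top_k=1))
```

In a pipeline like the one the abstract describes, the retrieved blocks would then be placed in the prompt as reference examples for the large language model. Rank-based fusion is used here only because it avoids having to put BM25 scores and cosine similarities on a common scale; a weighted sum of normalized scores would be an equally plausible choice.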

CLC number: