
Journal of Civil Aviation University of China ›› 2022, Vol. 40 ›› Issue (2): 24-30.

• Civil Aviation •

Research on entity recognition methods for unruly civil aviation passengers

CAO Weidong, XU Xiuli

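  1. (College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300)
  • Received: 2020-11-12 Revised: 2020-11-12 Accepted: 2020-09-23 Online: 2022-06-05 Published: 2022-06-05
  • About the author: CAO Weidong (born 1964), female, from Tianjin, professor, Ph.D.; research interests: databases and data mining.
  • Funding:
    National Natural Science Foundation of China (U1833114); Major Special Project for Civil Aviation Science and Technology Innovation (MHRD20160109); Civil Aviation Safety Capacity Building Fund Project (TRSA201803)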

Research on methods of identifying unruly passengers in civil aviation

CAO Weidong, XU Xiuli    

  1. (College of Computer Science and Technology, CAUC, Tianjin 300300, China) 
  • Received: 2020-11-12 Revised: 2020-11-12 Accepted: 2020-09-23 Online: 2022-06-05 Published: 2022-06-05

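Abstract: To address rule-violating behavior by civil aviation passengers, such as making phone calls on board and disturbing other passengers, a Tag+Bi-LSTM+CRF neural network model is proposed to recognize entity information about unruly passengers. Because a single sentence in civil aviation text records often contains several entities, and the pattern in which entities appear in a sentence may carry useful semantic information, the characters in the named entity recognition task are tagged with the BIOES scheme and the tags are concatenated with word embeddings and position embeddings to enrich the input representation. First, the Yedda tool is used to annotate entities in randomly selected civil aviation passenger record texts, and the annotations are combined with word embeddings and position embeddings as the model input. Second, a bi-directional long short-term memory (Bi-LSTM) model captures the contextual features of the text sequence. Then, a conditional random field (CRF) model produces the sequence labeling result. Finally, dropout layers are added at the input layer and at the Bi-LSTM layer to prevent overfitting. Experimental results show that the model achieves precision, recall, and F1 all above 96% on unruly passenger entity recognition and effectively extracts the behavior, severity level, penalty, and penalty period of unruly passengers.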
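The BIOES scheme mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' label set or preprocessing, so the function name `bioes_tags`, the span format, the sample text, and the labels `BEHAVIOR` and `PUNISH` are all assumptions made for illustration only.

```python
from typing import List, Tuple


def bioes_tags(length: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Assign a BIOES tag to each character position.

    spans holds (start, end, label) with end exclusive.
    Spans are assumed not to overlap.
    """
    tags = ["O"] * length                      # O: outside any entity
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"S-{label}"         # S: single-character entity
            continue
        tags[start] = f"B-{label}"             # B: first character
        for i in range(start + 1, end - 1):
            tags[i] = f"I-{label}"             # I: interior characters
        tags[end - 1] = f"E-{label}"           # E: last character
    return tags


# Hypothetical record fragment; label names are invented for this sketch.
text = "机上接打电话罚款"
spans = [(2, 6, "BEHAVIOR"), (6, 8, "PUNISH")]
for ch, tag in zip(text, bioes_tags(len(text), spans)):
    print(ch, tag)
```

In the model described above, each such tag would be mapped to an embedding and concatenated with the word and position embeddings of the same character; that step is not shown here.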

Keywords: named entity recognition, long short-term memory (LSTM), conditional random field, unruly passengers

Abstract: Aiming at the various unruly behaviors of civil aviation passengers, such as making phone calls on board and disturbing other passengers, a Tag+Bi-LSTM+CRF neural network model is proposed to recognize entity information about unruly passengers. A sentence in a civil aviation text record often contains multiple entities, and the pattern in which entities appear in the sentence may contain useful semantic information. The characters in the named entity recognition task are therefore tagged with the BIOES method, and the tags are concatenated with word embedding and position embedding to enrich the input representation. First, the Yedda tool is used to label the entities in randomly selected record texts of civil aviation passengers, and the labels are combined with word embedding and position embedding as input to the model. Second, the bi-directional long short-term memory (Bi-LSTM) model is used to obtain the contextual features of the sequence text. Then, the sequence labeling result is obtained through the conditional random field (CRF) model. Finally, dropout layers are added to the input layer and the Bi-LSTM layer to prevent overfitting. The experimental results show that the precision, recall, and F1 of the model on unruly passenger entity recognition all exceed 96%, and that it can effectively obtain information about unruly passengers with regard to behavior, grade, punishment, and time limit.
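For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how precision, recall, and F1 are commonly computed for entity recognition. It assumes entity-level exact matching on (start, end, label) triples; the abstract does not state which matching criterion the authors used, and the function name `prf1` and the sample entities are hypothetical.

```python
from typing import Set, Tuple

Entity = Tuple[int, int, str]  # (start, end, label), end exclusive


def prf1(gold: Set[Entity], pred: Set[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match scoring."""
    tp = len(gold & pred)                          # entities predicted exactly right
    precision = tp / len(pred) if pred else 0.0    # share of predictions that are correct
    recall = tp / len(gold) if gold else 0.0       # share of gold entities recovered
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented example: four gold entities, two predicted, both correct.
gold = {(2, 6, "BEHAVIOR"), (6, 8, "PUNISH"), (9, 11, "GRADE"), (12, 15, "TERM")}
pred = {(2, 6, "BEHAVIOR"), (6, 8, "PUNISH")}
print(prf1(gold, pred))
```

Under this criterion a prediction with the right label but a boundary off by one character counts as both a false positive and a false negative.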

Key words: named entity recognition, long short-term memory (LSTM), conditional random field (CRF), unruly passengers

