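Laser & Optoelectronics Progress, Vol. 60, No. 2, January 2023, 0210013-1
Research Article

Fine-Grained Image Classification Model Based on Improved Transformer

Tian Zhansheng, Liu Libo*
School of Information Engineering, Ningxia University, Yinchuan 750021, Ningxia, China

Abstract Fine-grained images are characterized by small differences between subclasses and large differences within the same subclass. When processing such images, existing network models suffer from insufficient feature extraction ability, redundant feature representation, and weak inductive bias; an improved Transformer image classification model is therefore proposed. First, external attention replaces the self-attention of the original Transformer model, improving feature extraction by capturing correlations between samples. Second, a feature selection module is introduced to select discriminative features and remove redundant information, strengthening the feature representation. Finally, a fused multivariate loss is introduced to strengthen the model's inductive bias and its ability to separate different subclasses and group samples of the same subclass. Experimental results show that the proposed method reaches classification accuracies of 89.8%, 90.2%, and 94.7% on the three fine-grained image datasets CUB-200-2011, Stanford Dogs, and Stanford Cars, respectively, outperforming several mainstream fine-grained image classification methods.

Keywords fine-grained image classification; Transformer; external attention; feature selection; multivariate loss
CLC number TP391.4  Document code A  DOI: 10.3788/LOP220453

Fine-Grained Image Classification Model Based on Improved Transformer

Tian Zhansheng, Liu Libo*
School of Information Engineering, Ningxia University, Yinchuan 750021, Ningxia, China

Abstract Fine-grained images show subtle differences between subclasses and large differences within the same subclass, and existing neural network models struggle to process them because of insufficient feature extraction ability, redundant feature representation, and weak inductive bias; therefore, an enhanced Transformer image classification model is proposed in this study. First, external attention is employed to replace the self-attention in the original Transformer model, and the model’s feature extraction ability is enhanced by capturing the correlation between samples. Second, a feature selection module is introduced to select discriminative features and eliminate redundant information, improving feature representation capability. Finally, a multivariate loss is added to strengthen the model’s inductive bias and its ability to differentiate subclasses and group samples of the same subclass. The experimental findings demonstrate that the proposed method’s classification accuracy on three fine-grained image datasets of CUB-200-2011, Stanford Dog...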