Facet Annotation by Extending CNN with a Matching Strategy

Bei Wu, Bifan Wei, Jun Liu, Yuanhao Zheng, Zhaotong Guo, Qinghua Zheng
SPKLSTN Lab, Department of Computer Science and Technology, Xi'an Jiaotong University, 710049, China

Abstract: Most community question answering (CQA) websites manage plenty of question-answer pairs (QAPs) through topic-based organization, which cannot satisfy users' search demands. Facets of topics serve as a powerful tool for navigating, refining, and grouping the QAPs. In this work, we propose FACM, a model for facet annotation by extending a Convolutional Neural Network (CNN) with a matching strategy. First, considering the importance of topic phrases for QAPs in a knowledge domain, phrase information is incorporated into the text representation by a CNN with different kernel sizes. Then, through a matching strategy between QAPs and facet label texts (FaLTs) acquired from Wikipedia, we generate similarity matrices to deal with facet heterogeneity. Finally, a three-channel CNN is trained as a binary classifier for facet label assignment of QAPs. Experiments on three real-world datasets show that FACM outperforms three state-of-the-art methods.

Keywords: knowledge domain, natural language processing, facet annotation, matching strategy, convolutional neural network

CLC number: TP391

Foundations: Doctoral Fund of Ministry of Education of China under Grant No. 20130201130002.
Author introduction: Bei Wu (1992-), female, currently working toward the Ph.D. degree at Xi'an Jiaotong University; research directions: data mining and natural language processing (wubei784872134@stu.xjtu.edu.cn). Corresponding author: Qinghua Zheng (1969-), male, professor; research directions: theory and technology of intelligent e-Learning environments, network public opinion and harmful information monitoring, and software reliability evaluation (qhzheng@xjtu.edu.cn).
Figure 1: An example of the topic facet-based organization mode for Binary tree. (The figure shows the topic Binary tree with the facets definition, property, application, and example, and sample QAPs such as "What is a binary tree? A tree with not more than two branches at any branching point." and "What are the real world examples of binary trees (not search trees)?")

0 Introduction

Community question answering (CQA) websites like Quora (https://www.quora.com/) and Stack Overflow (http://stackoverflow.com/) have become popular platforms for knowledge sharing and host plenty of question-answer pairs (QAPs). A QAP consists of a question and a corresponding answer. Most CQA websites organize questions by topics in specific knowledge domains [1], such as Data Structure, of which Binary tree is a typical topic. For example, the Quora question "What is a binary tree?" is assigned three topics: Binary Trees, Data Structures, and Algorithms.

Topic-based organization of QAPs cannot satisfy the human demand for fine-grained knowledge acquisition. For example, a user who wants to learn the definition of Binary tree from Quora issues a short query like "Binary tree, definition". Quora returns thousands of QAPs. After statistical analysis of the top 10 questions (involving 1,210 QAPs), we find that only 5% of the QAPs are related to the definition of Binary tree. It is unacceptably time-consuming for users to locate their expected knowledge.

Facets organize a QAP into a triple (topic, facet, QAP), which is helpful for navigating, refining, and grouping the information targets. A facet describes an important aspect of a specific topic in a knowledge domain, such as definition, property, application, and example. An example of the topic facet-based organization mode for Binary tree is shown in Figure 1. In this work we focus on facet annotation, which is to assign one or more facet labels to a given QAP. Two challenges of facet annotation are as follows:

1. Unlike normal short texts, QAPs are bound up with topics and contain plenty of topic phrases. Taking Red-black tree as an example, the combination of red, black, and tree represents a specific topic in a knowledge domain.
In this case, red, black, and tree do not really retain their original meanings. Existing representation methods such as the BoW model [2] and word embeddings [3] cannot effectively represent the phrases in texts. Effective representation of the phrase information is a challenging task.

2. There are notable differences among QAPs of the same facet but different topics (such as QAP [a] and QAP [b] below) in statistical features like word distribution, which we call facet heterogeneity. Hence, it is difficult to obtain a model with good generalization capability from a small amount of labeled data.

[a] definition of Binary tree: "What is a binary tree? A tree with not more than two branches at any branching point." (https://www.quora.com/What-is-a-binary-tree)

[b] definition of Array: "What is an array? An array's a kind of data structure that can store a fixed-size sequential collection of elements of the same type." (https://www.quora.com/What-is-an-array)

To address the above challenges, we propose a model called FACM for Facet Annotation by extending a Convolutional Neural Network (CNN) with a Matching strategy. Firstly, an effective text representation method is developed to capture the phrase information of QAPs; the results are regarded as the initial features of the feature extraction pipeline. Secondly, a matching strategy is proposed to generate similarity matrices between QAPs and knowledge acquired from Wikipedia (https://en.wikipedia.org/). Finally, we adopt a three-channel CNN to implement the multiple facet label classification with the similarity matrices.

To sum up, our contributions are three-fold:

1. As far as we know, we are the first to formalize and investigate the problem of facet annotation for QAPs in knowledge domains.

2. We propose a model called FACM for facet annotation by extending a CNN with a matching strategy. FACM incorporates phrase information into the QAP representation and combines a matching strategy to avoid the massive manual work of data annotation caused by facet heterogeneity.

3. To validate the proposed model, comprehensive experiments are conducted on three real-world datasets from Quora in different knowledge domains: Data Structure, Data Mining, and Computer Network. Experimental results show that the FACM model outperforms highly-cited and popular methods.

1 Related Work

Facet annotation of QAPs assigns one or more facet labels to a given QAP in CQA. It is closely related to the tasks of short text classification and subtopic mining.
1.1 Short Text Classification

Short text classification assigns labels from a predefined taxonomy to short texts such as questions, instant messages, tweets, and comments [4]. The approaches to short text classification can be divided into two categories: rule-based and machine learning-based.

Rule-based approaches classify short texts in a straightforward manner using a set of predefined heuristic rules. Some earlier short text classification works employed manually hand-crafted rules [5, 6] to classify the questions from TREC-8. More recently, Haris and Omar [7] adopted various types of interrogative words and analyzed the cognitive levels of Bloom's taxonomy to classify each question through the development of rules. Rule-based approaches perform well in terms of accuracy within a specific domain. However, they suffer from the time-consuming and laborious human effort of hand-crafting rules, as well as low recall across different domains.

Machine learning-based approaches classify short texts into predefined categories using automatically extracted features. Li and Roth [8] proposed a hierarchical Sparse Network of Winnows with bag-of-words features for question classification. In the experiments carried out by Zhang and Lee [9], a support vector machine (SVM) combined with bag-of-words performs well in short question classification with fine-grained categories such as abbreviation and expansion. Sriram et al. [10] classified short texts into a predefined set of generic classes such as News, Events, Opinions, Deals, and Private Messages, with a small set of domain-specific features extracted from the author's profile and text. Li et al. [11] introduced a semi-supervised question classification method based on ensemble learning. Recently, Kim [12] utilized a CNN trained on top of pre-trained word vectors for sentence-level classification. Most machine learning-based approaches to short text classification ignore the phrase information of short texts.

1.2 Subtopic Mining

Subtopics are often defined as keywords that represent texts at a finer granularity than topics. Subtopics can be mined from query logs [13, 14], anchor texts [15], and document corpora [16]. Some existing approaches conduct clustering of texts and extract keywords of the clusters in separate steps to address subtopic mining. For instance, Wu et al. [13] developed a unified framework using non-negative matrix factorization to simultaneously conduct question clustering and keyword extraction for mining query subtopics. Kong and Allan [17] proposed a supervised approach based on a graphical model to recognize query facets from noisy candidates. In short, the strings of subtopics, such as city and person, often appear in the text of the document corpora, whereas facet labels are largely absent from QAPs; the methods for subtopic mining may therefore fail to achieve satisfactory results. The detailed statistics are shown in Table 1 of Subsection 2.4.
Figure 2: Schematic framework of FACM, with three stages: text representation, similarity matrix construction, and facet label assignment. In practice, a QAP can be assigned one or more facets; for simplicity, the framework is shown for one QAP and one facet. We transform the task of automatic facet annotation into multiple binary classification problems.

2 FACM Model

In this section, a model called FACM is proposed to automatically annotate facet labels of QAPs by extending a CNN with a matching strategy. The problem of facet annotation is defined formally at first, and then the schematic framework of the FACM model is described with a detailed description of each stage.

2.1 Problem Definition

Our target is to assign one or more facet labels to a given QAP. Formally, let $P = \{p_1, p_2, \ldots, p_i, \ldots, p_n\}$ denote a QAP set related to a topic set $T = \{t_1, t_2, \ldots, t_j, \ldots, t_r\}$. $p_i = (w_1, w_2, \ldots, w_k, \ldots, w_s)$ denotes a QAP with topic $t_j \in T$, where $w_k$ is the $k$-th word. $L = \{l_1, l_2, \ldots, l_m\}$ denotes a facet set consisting of $m$ facets, such as definition, property, application, and example. The goal of our work is to learn a multi-label classification function $P \times L \rightarrow \{0, 1\}$.

2.2 Framework

The schematic framework of FACM is shown in Figure 2.

Firstly, the short natural language texts of QAPs are embedded into three vector spaces by a generalization of the CNN [12] with a convolution layer and a max-pooling layer. Each QAP is represented as three ordered lists indicating the phrase information at three levels: unigram, bigram, and trigram. Each ordered list of word vectors at a level forms a matrix, and the output of this stage is three matrices for the three levels.

Secondly, a matching strategy is utilized to construct three similarity matrices to deal with facet heterogeneity. We introduce texts corresponding to topic facets from Wikipedia, called Facet Label Texts (FaLTs). A FaLT is also represented as three matrices with different levels of phrase information. Then, three similarity matrices are constructed by similarity measures between a QAP and a FaLT at the different levels through a matching strategy.

Finally, the three similarity matrices are combined and fed to a three-channel CNN, consisting of two convolution layers, two max-pooling layers, and a multi-layer perceptron layer, for facet label assignment served as a binary classification.

2.3 Text Representation

Effectively representing the embedded phrase information in QAPs is a challenging task. A word embedding matrix $M \in \mathbb{R}^{|V| \times d}$ is pretrained on a large unlabeled corpus from English Wikipedia 2015, where $|V|$ is the size of the vocabulary and $d$ is the dimension of the vector space. Each word vector (a row in $M$) represents a word in the corpus. For each QAP, we build a matrix $D \in \mathbb{R}^{s \times d}$ containing sequential information, which can also be represented as an ordered list of word vectors $(a_1, a_2, \ldots, a_k, \ldots, a_s)^\top$ corresponding to QAP $p_i = (w_1, w_2, \ldots, w_k, \ldots, w_s)$.

Phrase information plays an important role in the natural language texts of QAPs. For example, Red-black tree, the combination of red, black, and tree, represents a specific topic in the knowledge domain Data Structure. However, most methods ignore the phrase information. To more precisely capture the information of a sentence in QAPs, the phrase information of words is taken into consideration in our model by utilizing the proposed convolutional layer with different kernels $C^{\alpha \times d}$ ($\alpha \in \{1, 2, 3\}$), shown in Figure 3.

Figure 3: Text representation with different kernels $C^{\alpha \times d}$ capturing phrase information at different levels. $\alpha \in \{1, 2, 3\}$ denotes the number of words being combined and the height of the kernels. $\alpha = 1$: unigram representation $D^{(1)}$; $\alpha = 2$: bigram representation $D^{(2)}$; $\alpha = 3$: trigram representation $D^{(3)}$.

The proposed text representation model contains a convolution layer and a max-pooling layer. At the convolution level, three kernels are adopted to conduct different convolution operations, capturing different phrase information at different levels.
The output of the max-pooling layer, $D^{(\alpha)}$ ($\alpha \in \{1, 2, 3\}$), has the fixed size $z_1 \times z_2$ for every value of $\alpha$, achieved by using different pool sizes. $z_1$ denotes the number of rows, and $z_2$ denotes the number of feature maps. Given a sentence matrix $D^{s \times d}$, where $s$ denotes the length of the input sentence and $d$ denotes the dimension of a word vector, the output $D^{(\alpha)}$ ($\alpha \in \{1, 2, 3\}$) of the CNN is calculated as follows:

$\beta = \lfloor s_\alpha / z_1 \rfloor$,  (1)
$\gamma = s_\alpha \bmod z_1$,  (2)
$p = \beta + \gamma$,  (3)
$D^{(\alpha)} = \mathrm{CNN}(D^{s \times d}, C^{\alpha \times d}, P^{1 \times p}, \beta, z_2)$.  (4)

In Equation (4), $C^{\alpha \times d}$ represents the kernel of convolution, $P^{1 \times p}$ denotes the kernel of max-pooling, and $\beta$ is the stride of max-pooling. Here $s_\alpha$ denotes the number of rows output by the convolution with kernel height $\alpha$; choosing the pool size and stride by Equations (1) to (3) guarantees that the pooled output has exactly $z_1$ rows.

2.4 Matching Strategy

We conducted a statistical analysis on three real-world datasets regarding the number of QAPs in which facet labels appear. The results, shown in Table 1, indicate that facet labels rarely appear in the texts of QAPs. It is very time-consuming to manually annotate topic facets of QAPs because of the absence of facet labels and the facet heterogeneity of topics. To reduce the massive manual work of facet annotation, a matching strategy is proposed to generate similarity matrices between FaLTs from Wikipedia and QAPs.

Table 1: Statistics of the three datasets in different knowledge domains. Size: the number of QAPs; fSize: the number of QAPs in which facet labels appear; Proportion: fSize / Size x 100%.

  Domain            Size     fSize   Proportion (%)
  Data Structure    35,076   4,309   12.28
  Data Mining       12,723   2,205   17.33
  Computer Network  13,081   1,490   10.77

Let $F_j = \{f_{j1}, f_{j2}, \ldots, f_{jk}, \ldots, f_{jm}\}$ denote a FaLT set consisting of $m$ FaLTs corresponding to topic $t_j$. The FaLT set $F = \bigcup_{1 \le j \le r} F_j$ is acquired by our web crawler. Each FaLT is extracted from a corresponding Wikipedia page. For example, a FaLT describing the facet definition of topic Binary tree is extracted from the Wikipedia page titled Binary tree.

After the text representation described in Subsection 2.3, a QAP representation set $D_P = \{D_{p_1}, D_{p_2}, \ldots, D_{p_i}, \ldots, D_{p_n}\}$ is generated corresponding to QAP set $P$. $D_{p_i} = (D_{p_i}^{(1)}, D_{p_i}^{(2)}, D_{p_i}^{(3)})$ denotes the three matrix representations of QAP $p_i$, where $D_{p_i}^{(\alpha)}$ ($\alpha \in \{1, 2, 3\}$) is the QAP representation at the $\alpha$-level of phrase information. A FaLT representation set $D_{F_j} = \{D_{f_{j1}}, D_{f_{j2}}, \ldots, D_{f_{jk}}, \ldots, D_{f_{jm}}\}$ of $D_F = \bigcup_{1 \le j \le r} D_{F_j}$ represents the FaLT set $F_j$ corresponding to topic $t_j$. $D_{f_{jk}} = (D_{f_{jk}}^{(1)}, D_{f_{jk}}^{(2)}, D_{f_{jk}}^{(3)})$ denotes the three matrix representations of FaLT $f_{jk}$, where $D_{f_{jk}}^{(\alpha)}$ ($\alpha \in \{1, 2, 3\}$) is the FaLT representation at the $\alpha$-level of phrase information.
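The paper's own implementation is not reproduced here, so the following is a minimal Python/tf.keras sketch of the three-level text representation of Subsection 2.3 and the pooling arithmetic of Equations (1) to (3). The function names (`pool_params`, `build_text_representation`) are ours, and the 'valid' convolution giving $s_\alpha = s - \alpha + 1$ output rows is an assumption.

```python
import tensorflow as tf

def pool_params(s_alpha: int, z1: int):
    """Pooling kernel and stride per Equations (1)-(3):
    stride beta = floor(s_alpha / z1), remainder gamma = s_alpha % z1,
    pool size p = beta + gamma, so the output has exactly z1 rows."""
    beta = s_alpha // z1
    gamma = s_alpha % z1
    return beta + gamma, beta

def build_text_representation(s=200, d=50, z1=30, z2=30):
    """One Conv1D per n-gram level (kernel height alpha = 1, 2, 3),
    each followed by max-pooling down to a fixed z1 x z2 matrix D^(alpha)."""
    words = tf.keras.Input(shape=(s, d))  # sentence matrix D in R^{s x d}
    outputs = []
    for alpha in (1, 2, 3):
        conv = tf.keras.layers.Conv1D(filters=z2, kernel_size=alpha,
                                      activation='relu')(words)  # kernel C^{alpha x d}
        s_alpha = s - alpha + 1  # rows after a 'valid' convolution (assumed)
        p, beta = pool_params(s_alpha, z1)
        pooled = tf.keras.layers.MaxPool1D(pool_size=p, strides=beta)(conv)
        outputs.append(pooled)  # each has shape (z1, z2)
    return tf.keras.Model(words, outputs)

model = build_text_representation()
for t in model.outputs:
    print(t.shape)  # (None, 30, 30) for each of the three levels
```

With $s = 200$ and $z_1 = 30$, Equations (1) to (3) yield pool sizes 26, 25, and 24 for $\alpha = 1, 2, 3$ respectively, each producing exactly 30 pooled rows.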
Figure 4: Examples of visual similarity matrices $S^{(1)}$; deeper color represents a higher value. (a) QAP: "What is a binary tree? A tree with not more than two branches at any branching point." (https://www.quora.com/What-is-a-binary-tree) vs. FaLT: "A binary tree is a tree data structure in which each node has at most two children, which are referred to as the left child and the right child." (https://en.wikipedia.org/wiki/Binary_tree). (b) QAP: "What is a binary tree? A binary tree is a tree data structure in which each node has at most two children, which are referred to as the left child and the right child." vs. the same FaLT.

For each $D_{p_i}$ and $D_{f_{jk}}$ corresponding to the same topic $t_j$, the similarity matrices $S_{ik}$ are generated as follows:

$S_{ik} = (S_{ik}^{(1)}, S_{ik}^{(2)}, S_{ik}^{(3)})$,  (5)
$S_{ik}^{(\alpha)} = \cos(D_{p_i}^{(\alpha)}, D_{f_{jk}}^{(\alpha)})$  ($\alpha \in \{1, 2, 3\}$).  (6)

An element $s_{rc}$ in the matrix $S_{ik}^{(\alpha)}$ is calculated as follows:

$s_{rc} = \dfrac{\vec{w}_r \vec{w}_c^\top}{\|\vec{w}_r\| \, \|\vec{w}_c\|}$.  (7)

$\vec{w}_r$ is the $r$-th ($1 \le r \le z_1$) row of $D_{p_i}^{(\alpha)}$, and $\vec{w}_c$ is the $c$-th ($1 \le c \le z_1$) row of $D_{f_{jk}}^{(\alpha)}$. $S_{ik} = (S_{ik}^{(1)}, S_{ik}^{(2)}, S_{ik}^{(3)})$ is thus calculated by cosine similarity between QAP $p_i$ and FaLT $f_{jk}$ corresponding to the same topic $t_j$. A similarity matrix set $S_i = \{S_{i1}, S_{i2}, \ldots, S_{ik}, \ldots, S_{im}\}$ is produced for each QAP $p_i$ corresponding to topic $t_j$, as the input features of facet label assignment.

2.5 Facet Label Assignment

We discovered local correlations in the similarity matrices by visual analysis. Figure 4 shows examples of similarity matrices for Binary tree. Local correlation means that a QAP matches a FaLT in a local part. The similarity pattern at the top-left corner of Figure 4 indicates that the local parts of a QAP and a FaLT are matched. In addition, we find that the similarity pattern may appear in any part of a similarity matrix, as shown in Figures 4a and 4b.

Inspired by the local correlations of similarity patterns, we extend a three-channel CNN in FACM to combine the three levels of phrase information for facet label assignment, shown in Figure 5. This three-channel CNN utilizes the similarity matrices as input and consists of two convolution layers (conv1, conv2), two max-pooling layers (pool1, pool2), and a multi-layer perceptron layer (mlp).

Figure 5: Three-channel CNN for facet label assignment, with unigram, bigram, and trigram similarity matrices as input channels, followed by convolution, max-pooling, convolution, max-pooling, an MLP, and a softmax output (0/1).

For each similarity matrix $S_{ik}$ between QAP $p_i$ and FaLT $f_{jk}$, the representation $\hat{v}_{ik} = (P_{ik}^0, P_{ik}^1)$ of the three-channel CNN is computed by

$\hat{v}_{ik} = \mathrm{logit}(\mathrm{CNN}'(S_{ik}, \theta))$,  (8)

where $\theta = \{W_1, W_2, W_{mlp}\}$ denotes the parameters of conv1, conv2, and mlp respectively, and $\mathrm{CNN}'(S_{ik}, \theta)$ is the output of the mlp layer. Afterwards, a logistic function $\mathrm{logit}(\cdot)$ with two output values is utilized to obtain the probability $P_{ik}^1$ of predicting $(p_i, f_{jk})$ as 1, and $P_{ik}^0$ of predicting it as 0. $(p_i, f_{jk})$ means a pair of QAP $p_i$ and FaLT $f_{jk}$ corresponding to the same topic $t_j$. In the last step, the result of the three-channel CNN is $\hat{y}_{ik} \in \{0, 1\}$, computed by Equation (9):

$\hat{y}_{ik} = \arg\max_{x \in \{0, 1\}} (P_{ik}^x)$.  (9)

In Equation (9), $\hat{y}_{ik} = 1$ denotes that QAP $p_i$ matches FaLT $f_{jk}$ corresponding to facet label $l_k \in L$ ($1 \le k \le m$) of topic $t_j$; namely, QAP $p_i$ can be assigned facet label $l_k$. Otherwise, $p_i$ cannot be assigned $l_k$.

The proposed model is trained in a supervised manner by minimizing the loss function

$\mathrm{loss} = -\sum_{i=1}^{N} \sum_{k=1}^{m} y_{ik} \log(P_{ik}^1)$,  (10)

where $y_{ik} \in \{0, 1\}$ denotes whether QAP $p_i$ matches the FaLT $f_{jk}$ of facet label $l_k$, both corresponding to topic $t_j$.

3 Experiment

We evaluate the FACM model on three datasets from different knowledge domains: Data Structure, Data Mining, and Computer Network. The three datasets are described in detail below; experimental settings and results are then discussed.
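As a concrete illustration of Equations (5) to (7), the sketch below computes one $z_1 \times z_1$ similarity matrix per phrase level with NumPy and stacks the three levels into the channel dimension of the classifier input. The function and variable names (`cosine_matrix`, `build_similarity_channels`) are ours, not from the paper, and the random inputs stand in for real representations.

```python
import numpy as np

def cosine_matrix(D_qap: np.ndarray, D_falt: np.ndarray) -> np.ndarray:
    """Equation (7): s_rc = (w_r . w_c) / (||w_r|| ||w_c||) for every row pair,
    where w_r is a row of the QAP representation and w_c a row of the FaLT one."""
    qn = D_qap / (np.linalg.norm(D_qap, axis=1, keepdims=True) + 1e-12)
    fn = D_falt / (np.linalg.norm(D_falt, axis=1, keepdims=True) + 1e-12)
    return qn @ fn.T  # shape (z1, z1)

def build_similarity_channels(D_p, D_f):
    """Equations (5)-(6): one similarity matrix per level alpha in {1, 2, 3},
    stacked as channels for the three-channel CNN."""
    return np.stack([cosine_matrix(dp, df) for dp, df in zip(D_p, D_f)], axis=-1)

z1, z2 = 30, 30
D_p = [np.random.randn(z1, z2) for _ in range(3)]  # QAP representations D^(1..3)
D_f = [np.random.randn(z1, z2) for _ in range(3)]  # FaLT representations
S_ik = build_similarity_channels(D_p, D_f)
print(S_ik.shape)  # (30, 30, 3)
```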
Table 2: Statistics of the three datasets in different knowledge domains. #Topic: the number of topics; #QAP: the number of QAPs.

  Dataset           #Topic   #QAP
  Data Structure    193      35,076
  Data Mining       94       12,723
  Computer Network  84       13,081

3.1 Dataset and Evaluation Metric

Datasets. We construct three datasets for different knowledge domains owing to the absence of proper datasets for FACM. According to all the topics of the three knowledge domains, we acquire QAPs from Quora automatically with our web crawler. Each QAP is represented as a word sequence of a question and its corresponding answer.

We recruited seven students as annotators to build datasets for the three knowledge domains: Data Structure, Data Mining, and Computer Network. Each QAP in the three domain datasets was annotated with one or more facet labels from the predefined facet label set {definition, property, application, example}. The annotators learned strict annotation guidelines with annotated examples before annotation. Every two annotators were asked to label the same dataset independently, and the seventh annotator resolved annotation disagreements during group meetings. Statistics of the three datasets are shown in Table 2. In our experiments, 70% of each dataset is used as the training set, 10% as the validation set, and the remaining 20% as the test set.

Evaluation Metric. The macro F1 score for multi-label classification, one of the most widely used evaluation metrics, is selected as the main metric and is computed as follows:

$\mathrm{MacroP} = \frac{1}{m} \sum_{i=1}^{m} P_i$,  (11)
$\mathrm{MacroR} = \frac{1}{m} \sum_{i=1}^{m} R_i$,  (12)
$\mathrm{MacroF1} = \dfrac{2 \cdot \mathrm{MacroP} \cdot \mathrm{MacroR}}{\mathrm{MacroP} + \mathrm{MacroR}}$.  (13)

In Equations (11) to (13), $m$ is the number of facets. $P_i$ and $R_i$ are the precision and recall scores of the $i$-th label, respectively.

3.2 Experimental Settings

Text representation. The 50-dimensional word embedding vectors are pretrained on a 12.2 GB corpus from Wikipedia 2015 with the word2vec utility [18]. The length of QAPs varies; Table 3 shows the proportion of QAPs covered by different lengths. Since the input of the CNN should be of fixed size, we tile or trim our QAPs to the same size $s = 200$, so that each QAP is represented as a 200 x 50 matrix.

Table 3: Coverage proportion (%) of QAPs with different values of s.

  Dataset           s=100   s=200   s=300   s=400
  Data Structure    79.58   97.08   98.33   99.58
  Data Mining       78.00   96.33   98.98   99.54
  Computer Network  80.99   98.91   99.06   99.72

We use $z_2$ feature maps, with kernels of width $d = 50$. In general, a phrase has no more than three words, so we adopt $\alpha \in \{1, 2, 3\}$. According to Equations (1) to (3), the kernel and stride of max-pooling are preset given a value of $z_1$. A simple grid search is used to estimate the values of $z_1$ and $z_2$; we select $z_1 = 30$ and $z_2 = 30$.

Facet label assignment. Convolution layer conv1 has 32 feature maps with 5 x 5 kernels, and convolution layer conv2 has 64 feature maps with 5 x 5 kernels. The max-pooling sizes for pool1 and pool2 are both 2 x 2. The dropout rate is 0.5 [19]. These hyperparameters are selected through grid search on the validation set. FACM is trained with the back-propagation algorithm [20] and implemented with TensorFlow [21] (a minimal sketch of this classifier follows the baseline list below).

Baseline methods. We develop three baseline experiments based on state-of-the-art methods.

1. Baseline I: The highly cited approach proposed by Kim [12] is selected as the first baseline. Kim trained a simple CNN with one convolution layer on top of word vectors obtained from an unsupervised neural language model to classify sentences. To exclude incomparability of results due to different input features, we choose CNN-static with the fifty-dimensional pre-trained vectors from word2vec.

2. Baseline II: Zhang and Lee [9] carried out experiments using an SVM combined with bag-of-words, which performs well in short question classification with fine-grained categories. The method is utilized as a baseline because it is popular and effective in short text classification.

3. Baseline III: Haris and Omar [7] developed a series of rules to analyze the cognitive levels of Bloom's taxonomy for each question, which improves the precision of the results in short text classification. The application-level rules of the method are directly adopted, together with others developed by ourselves.
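The following is a minimal tf.keras sketch of the facet label assignment network under the settings above (conv1: 32 feature maps with 5 x 5 kernels; conv2: 64 feature maps with 5 x 5 kernels; 2 x 2 max-pooling; dropout 0.5; an MLP with a softmax over two classes, trained with the cross-entropy loss of Equation (10)). The hidden MLP width and the optimizer are our assumptions; the paper does not specify them.

```python
import tensorflow as tf

def build_facet_classifier(z1=30, mlp_units=128):
    """Three-channel CNN of Figure 5: stacked similarity matrices
    (z1 x z1 x 3) in, the pair (P^0, P^1) of Equation (8) out."""
    S = tf.keras.Input(shape=(z1, z1, 3))  # unigram/bigram/trigram channels
    x = tf.keras.layers.Conv2D(32, (5, 5), activation='relu')(S)   # conv1
    x = tf.keras.layers.MaxPooling2D((2, 2))(x)                    # pool1
    x = tf.keras.layers.Conv2D(64, (5, 5), activation='relu')(x)   # conv2
    x = tf.keras.layers.MaxPooling2D((2, 2))(x)                    # pool2
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(mlp_units, activation='relu')(x)     # mlp (width assumed)
    x = tf.keras.layers.Dropout(0.5)(x)
    probs = tf.keras.layers.Dense(2, activation='softmax')(x)      # (P^0, P^1)
    model = tf.keras.Model(S, probs)
    model.compile(optimizer='adam',                      # optimizer assumed
                  loss='sparse_categorical_crossentropy',  # Equation (10)
                  metrics=['accuracy'])
    return model

build_facet_classifier().summary()
```

Predicting $\hat{y}_{ik}$ per Equation (9) then corresponds to taking the argmax over the two softmax outputs, e.g. `model.predict(S_batch).argmax(axis=-1)`.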
3.3 Experimental Results and Analysis

Table 4 summarizes the performance of our proposed FACM model and the other state-of-the-art methods.

Table 4: Comparison of FACM with baseline methods.

  Dataset           Model         Precision (%)  Recall (%)  F1 (%)
  Data Structure    Baseline I    83.54          82.28       82.91
                    Baseline II   83.21          78.46       80.77
                    Baseline III  93.78          60.25       73.37
                    FACM          85.08          86.86       85.96
  Data Mining       Baseline I    82.09          80.13       81.10
                    Baseline II   84.58          76.63       80.41
                    Baseline III  85.52          62.56       72.26
                    FACM          83.60          82.72       83.16
  Computer Network  Baseline I    80.54          83.33       81.91
                    Baseline II   82.12          81.25       81.68
                    Baseline III  83.75          68.68       75.47
                    FACM          85.66          84.29       84.97

FACM consistently improves the macro F1 scores compared with all baseline methods on the three datasets, and improves the MacroP and MacroR scores compared with Baseline I and Baseline II. Baseline III, with its high-quality hand-crafted rules, achieves high MacroP, even exceeding 93% on Data Structure; in our experiment, strict rules were crafted so that the labeled data are of high quality, but this leads to a lower MacroR. FACM improves the macro F1 scores by about 10% through the improvement of MacroR. This indicates that FACM performs well in text representation with phrase information and has good generalization capability.

To further assess the models' generalization capability across different topics of the same knowledge domain, we compare the standard deviation $\sigma$ of macro F1 among all methods. $\sigma$ is formulated as

$\sigma = \sqrt{\dfrac{1}{r} \sum_{i=1}^{r} (F_i - \bar{F})^2}$,  (14)

where $r$ is the number of topics in a specific domain, $F_i$ is the macro F1 on the $i$-th topic, and $\bar{F}$ is the macro F1 on the domain.

Figure 6: Comparison of FACM with baseline methods in standard deviation (%) of macro F1 across the three datasets.

Figure 6 shows that the standard deviation of Baseline III is the highest because its rules depend strongly on topics. Meanwhile, the $\sigma$ of FACM is the lowest among all methods, which indicates that FACM implements facet annotation with good generalization capability across different topics.

4 Conclusion

We present a neural network model called FACM that annotates facets of QAPs by extending a CNN with a matching strategy. The FACM model incorporates phrase information into the QAP representation and combines a matching strategy to avoid the massive manual work of data annotation caused by facet heterogeneity. A three-channel CNN is finally trained as a binary classifier to implement facet label assignment. Results of comprehensive experiments indicate that FACM outperforms the highly-cited and popular methods related to facet annotation. In the future, we would like to mine a dynamic number of facets corresponding to a specific topic and incorporate the hierarchical structure of facets into facet annotation.

Acknowledgements

The research was supported in part by the National Science Foundation of China under Grant Nos. 61672419 and 61672418, and the Doctoral Fund of Ministry of Education of China under Grant No. 20130201130002. Thanks to helpful people and companies.

References

[1] G. Wang, K. Gill, M. Mohanlal, H. Zheng, and B. Y. Zhao, "Wisdom in the social crowd: an analysis of Quora," in Proceedings of the 22nd International Conference on World Wide Web (WWW), 2013, pp. 1341-1352.
[2] Z. S. Harris, "Distributional structure," Word, vol. 10, no. 2-3, pp. 146-162, 1954.
[3] Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, "A neural probabilistic language model," Journal of Machine Learning Research, vol. 3, pp. 1137-1155, 2003.
[4] M. Chen, X. Jin, and D. Shen, "Short text classification improved by learning multi-granularity topics," in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2011, pp. 1776-1781.
[5] D. A. Hull, "Xerox TREC-8 question answering track report," in Proceedings of the Text REtrieval Conference (TREC), 1999, pp. 743-752.
[6] J. Prager, D. Radev, E. Brown, A. Coden, and V. Samn, "The use of predictive annotation for question answering in TREC8," in Proceedings of the 8th Text REtrieval Conference (TREC), 1999, pp. 399-409.
[7] S. S. Haris and N. Omar, "A rule-based approach in Bloom's taxonomy question classification through natural language processing," in Proceedings of the 7th International Conference on Computing and Convergence Technology (ICCCT), 2012, pp. 410-414.
[8] X. Li and D. Roth, "Learning question classifiers," in Proceedings of the 19th International Conference on Computational Linguistics (COLING), 2002, pp. 1-7.
[9] D. Zhang and W. S. Lee, "Question classification using support vector machines," in Proceedings of the 26th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2003, pp. 26-32.
[10] B. Sriram, D. Fuhry, E. Demir, H. Ferhatosmanoglu, and M. Demirbas, "Short text classification in Twitter to improve information filtering," in Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010, pp. 841-842.
[11] Y. Li, L. Su, J. Chen, and L. Yuan, "Semi-supervised learning for question classification in CQA," Natural Computing, pp. 1-11, 2016.
[12] Y. Kim, "Convolutional neural networks for sentence classification," arXiv preprint arXiv:1408.5882, 2014.
[13] Y. Wu, W. Wu, Z. Li, and M. Zhou, "Mining query subtopics from questions in community question answering," in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2015, pp. 339-345.
[14] Y. Hu, Y. Qian, H. Li, D. Jiang, J. Pei, and Q. Zheng, "Mining query subtopics from search log data," in Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2012, pp. 305-314.
[15] V. Dang, X. Xue, and W. B. Croft, "Inferring query aspects from reformulations using clustering," in Proceedings of the 20th ACM International Conference on Information and Knowledge Management (CIKM), 2011, pp. 2117-2120.
[16] J. Allan and H. Raghavan, "Using part-of-speech patterns to reduce query ambiguity," in Proceedings of the 25th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2002, pp. 307-314.
[17] W. Kong and J. Allan, "Extracting query facets from search results," in Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2013, pp. 93-102.
[18] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Proceedings of Advances in Neural Information Processing Systems (NIPS), 2013, pp. 3111-3119.
[19] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv preprint arXiv:1207.0580, 2012.
[20] R. Hecht-Nielsen et al., "Theory of the backpropagation neural network," Neural Networks, vol. 1, Supplement-1, pp. 445-448, 1988.
[21] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin et al., "TensorFlow: Large-scale machine learning on heterogeneous distributed systems," arXiv preprint arXiv:1603.04467, 2016.