Data Mining: Concepts and Techniques (3rd ed.)
Chapter 12
Jiawei Han, Micheline Kamber, and Jian Pei
University of Illinois at Urbana-Champaign & Simon Fraser University
©2012 Han, Kamber & Pei. All rights reserved.

Chapter 12. Outlier Analysis
  - Outlier and Outlier Analysis
  - Outlier Detection Methods
  - Statistical Approaches
  - Proximity-Based Approaches
  - Clustering-Based Approaches
  - Classification Approaches
  - Mining Contextual and Collective Outliers
  - Outlier Detection in High Dimensional Data
  - Summary

What Are Outliers?
  - Outlier: a data object that deviates significantly from the normal objects, as if it were generated by a different mechanism
      - Ex.: unusual credit card purchase; in sports: Michael Jordan, Wayne Gretzky, ...
  - Outliers are different from noise data
      - Noise is random error or variance in a measured variable
      - Noise should be removed before outlier detection
  - Outliers are interesting: they violate the mechanism that generates the normal data
  - Outlier detection vs. novelty detection: at an early stage a novel pattern is treated as an outlier, but it is later merged into the model
  - Applications:
      - Credit card fraud detection
      - Telecom fraud detection
      - Customer segmentation
      - Medical analysis

Types of Outliers (I)
  - Three kinds: global, contextual, and collective outliers
  - Global outlier (or point anomaly)
      - An object is a global outlier, O_g, if it deviates significantly from the rest of the data set
      - Ex.: intrusion detection in computer networks
      - Issue: finding an appropriate measurement of deviation
  - Contextual outlier (or conditional outlier)
      - An object is a contextual outlier, O_c, if it deviates significantly with respect to a selected context
      - Ex.: 80°F in Urbana: outlier? (It depends on whether it is summer or winter.)
      - Attributes of data objects should be divided into two groups
          - Contextual attributes: define the context, e.g., time and location
          - Behavioral attributes: characteristics of the object, used in outlier evaluation, e.g., temperature
      - Can be viewed as a generalization of local outliers, i.e., objects whose density deviates significantly from that of their local area
      - Issue: how to define or formulate a meaningful context?
  - [Figure: global outlier]

Types of Outliers (II)
  - Collective outliers
      - A subset of data objects collectively deviates significantly from the whole data set, even if the individual data objects may not be outliers
      - Applications: E.g., intrusion detecti...