Data Mining: Concepts and Techniques (3rd ed.)
Chapter 6

Jiawei Han, Micheline Kamber, and Jian Pei
University of Illinois at Urbana-Champaign & Simon Fraser University
©2011 Han, Kamber & Pei. All rights reserved.
January 26, 2025


Chapter 6: Mining Frequent Patterns, Association and Correlations: Basic Concepts and Methods

- Basic Concepts
- Frequent Itemset Mining Methods
- Which Patterns Are Interesting? Pattern Evaluation Methods
- Summary


What Is Frequent Pattern Analysis?

- Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set
- First proposed by Agrawal, Imielinski, and Swami [AIS93] in the context of frequent itemsets and association rule mining
- Motivation: finding inherent regularities in data
    - What products were often purchased together? Beer and diapers?!
    - What are the subsequent purchases after buying a PC?
    - What kinds of DNA are sensitive to this new drug?
    - Can we automatically classify web documents?
- Applications
    - Basket data analysis, cross-marketing, catalog design, sale campaign analysis, Web log (click stream) analysis, and DNA sequence analysis


Why Is Freq. Pattern Mining Important?

- Freq. pattern: an intrinsic and important property of datasets
- Foundation for many essential data mining tasks
    - Association, correlation, and causality analysis
    - Sequential, structural (e.g., sub-graph) patterns
    - Pattern analysis in spatiotemporal, multimedia, time-series, and stream data
    - Classification: discriminative, frequent pattern analysis
    - Cluster analysis: frequent pattern-based clustering
    - Data warehousing: iceberg cube and cube-gradient
    - Semantic data compression: fascicles
- Broad applications


Basic Concepts: Frequent Patterns

- itemset: a set of one or more items
- k-itemset: X = {x1, …, xk}
- (absolute) support, or support count, of X: the frequency or number of occurrences of an itemset X
- (relative) support, s: the fraction of transactions that contain X (i.e., the probability that a transaction contains X)
- An itemset X is frequent if X's support is no less than a minsup threshold

[Figure: Venn diagram with regions "Customer buys diaper", "Customer buys both", and "Customer buys beer"]

Tid   Items bought
10    Beer, Nuts, Diaper
20    Beer, Coffee, Diaper
30    Beer, Diaper, Eggs
40    Nuts, Eggs, Mi...