Data Mining: Concepts and Techniques
Chapter 2

Jiawei Han, Micheline Kamber, and Jian Pei
University of Illinois at Urbana-Champaign
Simon Fraser University
©2012 Han, Kamber, and Pei. All rights reserved.
January 26, 2025

Chapter 2: Getting to Know Your Data
- Data Objects and Attribute Types
- Basic Statistical Descriptions of Data
- Data Visualization
- Measuring Data Similarity and Dissimilarity
- Summary

Types of Data Sets
- Record
  - Relational records
  - Data matrix, e.g., numerical matrix, crosstabs
  - Document data: text documents: term-frequency vector
  - Transaction data
- Graph and network
  - World Wide Web
  - Social or information networks
  - Molecular structures
- Ordered
  - Video data: sequence of images
  - Temporal data: time-series
  - Sequential data: transaction sequences
  - Genetic sequence data
- Spatial, image and multimedia
  - Spatial data: maps
  - Image data
  - Video data

Document data (term-frequency vectors):

              team  coach  play  ball  score  game  win  lost  timeout  season
  Document 1     3      0     5     0      2     6    0     2        0       2
  Document 2     0      7     0     2      1     0    0     3        0       0
  Document 3     0      1     0     0      1     2    2     0        3       0

Transaction data:

  TID  Items
  1    Bread, Coke, Milk
  2    Beer, Bread
  3    Beer, Coke, Diaper, Milk
  4    Beer, Bread, Diaper, Milk
  5    Coke, Diaper, Milk

Important Characteristics of Structured Data
- Dimensionality
  - Curse of dimensionality
- Sparsity
  - Only presence counts
- Resolution
  - Patterns depend on the scale
- Distribution
  - Centrality and dispersion

Data Objects
- Data sets are made up of data objects.
- A data object represents an entity.
- Examples:
  - sales database: customers, store items, sales
  - medical database: patients, treatments
  - university database: students, professors, courses
- Also called samples, examples, instances, data points, objects, tuples.
- Data objects are described by attributes.
- Database rows -> data objects; columns -> attributes.

Attributes
- Attribute (or dimensions, features, variables): a data field, representing a characteristic or feature of a data object.
  - E.g., customer_ID, name, address
- Types:
  - Nominal
  - Binary
  - Numeric: quantitative
    - Interval-scaled
    - Ratio-scaled

Attribute Types
- Nominal: categories, states, or “names of things”
  - Hair_color = {auburn, black, blond, brown, grey, red, white}
  - marital status, occupation, ID numbers, zip codes
- Binary
  - Nominal attribute with only 2 states (0 and 1...