The cell can be considered the fundamental unit in biology. For centuries, biologists have known that multicellular organisms are characterized by a plethora of distinct cell types. Although the notion of a cell type is intuitively clear, a consistent and rigorous definition has remained elusive. Cells can be distinguished by their size and shape using a microscope, and attributes based on their physical appearance have traditionally been the primary determinant of cell type. Later, discoveries in molecular biology made it possible to characterize cell types on the basis of the presence or absence of surface proteins. However, surface proteins represent only a small fraction of the proteome, and it is likely that important differences are not manifested at the cell membrane.

Advances in microfluidics have made it possible to isolate a large number of cells, and along with improvements in RNA isolation and amplification methods, it is now possible to profile the transcriptome of individual cells using next-generation sequencing technologies. Technological developments have advanced at a breathtaking speed. The first single-cell RNA sequencing (scRNA-seq) experiment was published in 2009, and the authors profiled only eight cells [1]. Only 7 years later, 10X Genomics released a data set of more than 1.3 million cells [2]. Thus, we are now in an era where large volumes of scRNA-seq data make it possible to provide detailed catalogues of the cells found in a sample. For researchers to be able to take full advantage of these rich data sets, efficient computational methods are required.

There are several steps involved in the computational analysis of scRNA-seq data, including quality control, mapping, quantification, normalization, clustering, finding trajectories and identifying differentially expressed genes (Fig. 1). The steps upstream of clustering may have a substantial impact on the outcome, and for each step numerous tools are available. Moreover, there are also software packages that implement the entire clustering workflow, for example, Seurat [3], scanpy [4] and SINCERA [5]. We encourage the reader to consult recently published overviews of this workflow [6–10], as this Review focuses on clustering alone. As clustering is the key step in defining cell types based on the transcriptome, one must caref...