New applications of the multivariate analysis framework NeuroBayes for an
inclusive b-jet cross section measurement at CMS

Dissertation approved by the Faculty of Physics of the
Karlsruher Institut für Technologie (KIT)
for the academic degree of
Doctor of Natural Sciences (Doktor der Naturwissenschaften)

by

Dipl.-Phys. Simon Honc

from Stuttgart

Date of the oral examination: 13 May 2011
Referee: Prof. Dr. M. Feindt, Institut für Experimentelle Kernphysik
Co-referee: Prof. Dr. Th. Müller, Institut für Experimentelle Kernphysik

"The accusation that my doctoral thesis is a plagiarism is absurd."
(K.-T. zu Guttenberg, 16 February 2011)

Für ma Lisa

Contents

1 Introduction
2 The Standard Model of particle physics
2.1 Parameters of the Standard Model
2.1.1 The elementary particles of the Standard Model
2.1.2 The interactions of the Standard Model
2.2 Heavy quark production
2.2.1 b-jet cross section
2.2.2 QCD predictions
2.2.3 Historical context
2.2.4 Conclusion
3 The CMS experiment
3.1 CERN - Conseil Européen pour la Recherche Nucléaire
3.2 LHC - Large hadron collider
3.3 CMS detector - Compact muon solenoid detector
3.3.1 Tracking system
3.3.2 Calorimeter
3.3.3 Muon detector
4 Event reconstruction
4.1 Trigger system
4.1.1 Level 1 trigger
4.1.2 High level trigger
4.2 Luminosity measurement
4.3 Event reconstruction and object identification
4.3.1 Track reconstruction
4.3.2 Primary vertex reconstruction
4.3.3 Secondary vertex reconstruction
4.3.4 Electron reconstruction
4.3.5 Muon reconstruction
4.3.6 Jets
4.3.7 Jet energy corrections
4.3.8 Jet flavor definition
4.3.9 b jet tagging
4.4 Monte Carlo samples
4.5 Data samples
5 New applications of NeuroBayes
5.1 NeuroBayes
5.1.1 Introduction
5.1.2 Preprocessing
5.1.3 Target correlation and prediction
5.2 NeuroBayes probability
5.2.1 NeuroBayes probability transformation
5.2.2 Boost training - NeuroBayes and weights
5.2.3 sPlot
5.3 NeuroBayes b-jet tagger
5.3.1 b-jet tagging variables
5.3.2 NeuroBayes MC tagger (NBMC)
5.3.3 MC to data comparison
5.3.4 NeuroBayes data tagger (NBD)
6 b jet cross section measurement
6.1 Recent b cross section measurement at CMS
6.1.1 Event and jet selections
6.1.2 b-tagging
6.1.3 Measurement
6.2 Update of the flavor content fitter
6.2.1 Template fit
6.2.2 pT/|y| binning
6.2.3 Fit results
6.2.4 Systematic uncertainties
6.2.5 Tagging efficiencies from Monte Carlo simulation
6.2.6 Updated result
6.3 NeuroBayes application
6.3.1 NeuroBayes template fit
7 Conclusion
A Distributions of b-jet tagging variables
B Results of data to Monte Carlo comparison
C Dependency check
D Fit histograms of flavour content fitter
E Fit histograms of NB flavour content fitter
List of figures
Bibliography

Chapter 1

Introduction

This thesis is about the measurement of the inclusive b-jet cross section and new applications of the multivariate analysis framework NeuroBayes. As for most PhD theses, the title is an accumulation of technical terms and more or less incomprehensible to most people. Nevertheless, the title is deliberate.

New applications of the multivariate analysis framework NeuroBayes for an
inclusive b-jet cross section measurement at CMS

The title covers the main topics of this thesis: the new applications of NeuroBayes and the measurement of a physical quantity, the inclusive b-jet cross section. The reader will be led to these topics after an extensive introduction to the required background.
In chapter 2 I will outline the theoretical basis of the physics. To this end I will give a short account of the historical development up to the formulation of quantum chromodynamics (QCD). QCD is the theory underlying the measurement presented in this thesis and describes the strong interactions between particles. Its intellectual father, Murray Gell-Mann, won the Nobel Prize "for his contributions and discoveries concerning the classification of elementary particles and their interactions" as early as 1969 [Nob].
Many years of research extended the list of elementary particles. So far twelve elementary fermions have been found; six of them, the quarks, take part in the strong interaction. The quarks differ in electric charge and mass. One of these particles is the so-called bottom quark (b-quark). Its mass is about 4.2 GeV. The name bottom was chosen in analogy to the down quark, which is a constituent of the proton.
The b-quark was discovered in 1977 at Fermilab [HHL+77]. This event was the starting point of a huge field of research: b-physics. There are three main topics in b-physics: b production, B spectroscopy and b flavor physics. The first two cover physical effects caused by the strong interaction, while the last deals with the weak decay of the b-quark, which is very important for the discovery of physics beyond the Standard Model. The LHCb experiment at CERN was built especially for analyses in this sector.
Studies of the strongly interacting sector of the b-quark also play an important role. On the one hand, b-quarks contribute to the background distributions of many analyses. A good understanding of the b-quark production mechanism may lead to significant improvements. This is non-negligible especially for new particles which decay into b-quarks. Furthermore, a good identification of the b-quarks is important for these processes.
On the other hand, the production of b-quarks is itself very interesting. In the last decades analyses on this topic led to curious results. At the beginning of the new millennium they had already claimed new production mechanisms beyond QCD predictions [Ber01], [Jun03]. It was not until ten years later that the results could be brought in line after a recalculation of the old theory. Above all, the difficulty of solving the perturbative calculations for heavy particles and the insufficient modeling of the hadronization process led to those discrepancies between theory and experiment.
Today it is possible to reach a new energy region with the experiments at the Large Hadron Collider (LHC). It is therefore again very interesting to check whether the theory is able to describe the new measurements. In this thesis I will present the first analysis which covers the production of b-quarks at such energies. The studies are done at the CMS experiment.
The third and the fourth part of the thesis describe the experiment. I will sketch the history of CERN and its experiments, which culminates in the construction of the LHC and its experiments. I will introduce the layout of the CMS detector in chapter 3. With this apparatus we are able to measure the physical processes which happen after the collision of protons at a center-of-mass energy of 7 TeV. It is planned to increase the energy up to 14 TeV in the near future. The recorded data must be transformed into physical objects. In chapter 4 I will present how these objects are reconstructed.
The main topics are presented in chapters 5 and 6.
But let us have a more detailed look at the main parts. The wording of the title was chosen to emphasize these topics: for the measurement, new applications were developed. An important part of this thesis deals with the combination of known and new methods based on NeuroBayes. NeuroBayes is unknown to most people, so it is described in more detail.
NeuroBayes is a tool for multivariate analysis. In the simplest case this means that many input variables are used to classify events into two targets: background and signal or, more generally, target 0 and target 1. The inputs are combined and transformed into a single output variable which carries all the information needed for the classification. The multivariate phase space is reduced to a one-dimensional one. Section 5.1 gives a detailed summary of NeuroBayes.
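The idea of condensing many inputs into one discriminating output can be sketched with a deliberately simple stand-in. The example below is not NeuroBayes and does not use its API; it is a plain logistic model trained on invented Gaussian toy data, and all names and numbers are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Map a real number onto the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def output(x, weights, bias):
    """Combine all input variables of one event into a single output."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(events, n_inputs, epochs=20, rate=0.05):
    """Fit weights and bias by stochastic gradient descent on (x, target) pairs."""
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for x, target in events:
            err = output(x, weights, bias) - target
            weights = [w - rate * err * xi for w, xi in zip(weights, x)]
            bias -= rate * err
    return weights, bias


# Toy data (invented): three input variables per event,
# target 0 = background centred at 0.0, target 1 = signal centred at 1.5.
rng = random.Random(1)
background = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
signal = [[rng.gauss(1.5, 1.0) for _ in range(3)] for _ in range(200)]
events = [(x, 0) for x in background] + [(x, 1) for x in signal]
rng.shuffle(events)

weights, bias = train(events, n_inputs=3)
mean_background = sum(output(x, weights, bias) for x in background) / len(background)
mean_signal = sum(output(x, weights, bias) for x in signal) / len(signal)
print(f"mean output: background {mean_background:.2f}, signal {mean_signal:.2f}")
```

Each event, described by three numbers, is reduced to one number between 0 and 1; a cut on that single variable then separates the two targets.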
In the thesis itself I developed different applications of it. They range from general derivations of how to combine known methods with NeuroBayes, through new approaches based on NeuroBayes, to specific applications for the physics analysis.
In section 5.2 I will focus on the interpretation of the NeuroBayes output. NeuroBayes is constructed in such a way that its result can be transformed into a probability. This depends, among other things, on the specification of the input samples. I will discuss two different setups. The first I will call Monte Carlo (MC) based, the other data based. The main difference is the type of one of the targets for the classification: we take either simulations of the background distribution in the MC based approach or a data sample in the data based approach. I will show the derivations in section 5.2.1.
The knowledge of the probability can be used to perform a so-called boost training. In section 5.2.2 I will introduce the basic idea, the implementation and the resulting possibilities.
Another application of NeuroBayes that was developed is the transformation of its output into sPlot weights. sPlot is a method to determine the inclusive distribution of a specific variable using the inclusive information of a source variable which is uncorrelated with the former. In section 5.2.3 I will derive the connection between NeuroBayes and sPlot. With this it is possible to take the NeuroBayes output directly as the source for the sPlot method. The inspected variable has to be uncorrelated with the NeuroBayes output.
With many tools based on NeuroBayes in place, it is natural to apply them to physics topics. Here NeuroBayes is used for the classification of jets. A jet is an observable pattern in the detector. Quarks and gluons create many particles with a similar direction. On the one hand this happens because of the hadronization of these particles. On the other hand most of the created hadrons in turn decay into lighter particles. All these particles are combined into so-called jets. If such a jet is created by a b-quark it is called a b-jet. The aim of a b-jet tagger is to classify jets into b-jets and non-b-jets.
In section 5.3.1 I will present a NeuroBayes b-jet tagger based on Monte Carlo (MC) simulations and, in a second case, based on data. For the data based tagger, differences between data and MC have to be studied. This was also done with NeuroBayes. Furthermore, I applied another method, called boost, to show how much improvement is possible with this procedure.

In the last chapter I will focus on the inclusive b-jet cross section measurement done by CMS. I will present the recent analysis done on early CMS data in section 6.1. This measurement was done on data with an integrated luminosity of 60/nb. During the first year of data taking CMS already collected 36/pb of data. Therefore an update of the recent measurement is planned. In section 6.2 I will present our contributions to this analysis and prospects for future results.
At the end I will start a discussion of alternative approaches to this inclusive b-jet cross section measurement. I will present a method based on the jet classification performed with the NeuroBayes framework. In section 6.3 I will show how the results change if the newly developed NeuroBayes b-jet tagger is used. The new tagger is used in the same manner as in the former measurement.

Finally, the results of this thesis will be summarized and discussed in chapter 7.


Chapter 2

The Standard Model of particle physics

In this chapter I will introduce the Standard Model of particle physics. The first part is an overview of the parameters of the model. Precise measurements of these quantities are needed to make further predictions about the behaviour of nature at the elementary level.
The second part is about a special part of the model, perturbative quantum chromodynamics (pQCD). In this thesis we want to compare the experimental data with the predictions made by the Standard Model. Therefore we need the calculations done by theorists. I will focus on the problems that came up over the years in such calculations and present the current status.
Finally, I will work out the prospects of a further measurement of the inclusive b-jet cross section at CMS.

2.1 Parameters of the Standard Model

Particle physics had its genesis with the discovery of the electron by Thomson in 1897 [Tho97]. Physicists were not yet aware of the further particles that would be discovered, but the door was opened to a new field of physics. At the beginning of the 20th century the picture of the atom was completed, and the first antiparticle, the positron, was discovered in 1932 by C. D. Anderson [And33].
Nobody expected what would happen in the following years. It started with the discovery of a new particle in cosmic rays: the muon in 1937 [SS37]. Its discovery led to the formulation of quantum electrodynamics (QED) in the 1940s by J. Schwinger, R. Feynman and S. Tomonaga [Dys49]. From then on many further particles were found; many were detected in cosmic rays, but the first accelerators were also built. The first discovery of a new particle produced in collisions was the neutral pion in 1949 [SPS50]. Until 1961 the number of particles increased rapidly. Many of the ground states of what are today called mesons and baryons were found, and the first neutrinos were discovered in 1956 [RC56]. By then more than 20 different particles had been found.
In 1961 a particle called η was discovered [PRS61]. This was the ingredient needed for a new phenomenological classification of the particles: the eightfold way. Based on the measured quantum numbers a systematic ordering was possible (see figure 2.1).
In 1964 the model was confirmed by the discovery of the Ω⁻, which was the last missing particle to complete the structure of the eightfold way. Knowing this, Gell-Mann and Zweig independently saw the possibility of an underlying theory using group theory with an SU(3) symmetry. This was the birth of today's quark model and finally, together with quantum electrodynamics (QED), of


Figure 2.1: The eightfold way is an ordering of the ground state particles as proposed in 1961. The particles are ordered according to their quantum numbers S and q. This arrangement led to the prediction of the later discovered particle Ω⁻ and is the basis for the later formulated quark model.

                                left-handed charges                   right-handed charges
Gen.  Particle                  e.m.     weak    color   Mass         e.m.     weak    color
Leptons
I     neutrino   ν_e            -        +1/2    -       < 2 eV       -        -       -
      electron   e              -e       -1/2    -       511 keV      -e       -       -
II    neutrino   ν_μ            -        +1/2    -       < 0.19 MeV   -        -       -
      muon       μ              -e       -1/2    -       106 MeV      -e       -       -
III   neutrino   ν_τ            -        +1/2    -       < 18.2 MeV   -        -       -
      tau        τ              -e       -1/2    -       1.8 GeV      -e       -       -
Quarks
I     up         u              +2/3 e   +1/2    rgb     2.49 MeV     +2/3 e   -       rgb
      down       d              -1/3 e   -1/2    rgb     5.05 MeV     -1/3 e   -       rgb
II    charm      c              +2/3 e   +1/2    rgb     1.27 GeV     +2/3 e   -       rgb
      strange    s              -1/3 e   -1/2    rgb     101 MeV      -1/3 e   -       rgb
III   top        t              +2/3 e   +1/2    rgb     172 GeV      +2/3 e   -       rgb
      bottom     b              -1/3 e   -1/2    rgb     4.2 GeV      -1/3 e   -       rgb

Table 2.1: List of elementary particles of the Standard Model. The parameters are taken from [N+10].

the Standard Model of particle physics.
In the following I will present the elementary particles and the interactions of the Standard Model. This section is partially extracted from [Ind04].

2.1.1 The elementary particles of the Standard Model
One part of the Standard Model are the so-called fermions. All particles with a half-integer spin quantum number are assigned to this class. The elementary particles of this kind are divided into leptons and quarks. For leptons as well as for quarks, three generations of doublets exist. Across the generations the corresponding particles differ only in their mass; the quantum numbers are the same. Table 2.1 shows an overview of the different elementary particles.
Particles only participate in interactions for which they carry a charge. Note that the leptons have no color charge. Furthermore, the neutrinos do not even have electromagnetic charge. The weak interaction only couples to left-handed fermions. Therefore right-handed neutrinos do not interact with other particles except by gravitation. Because of the weakness of gravitation, this kind of neutrino is not detectable. It is not known whether such particles exist.
For each particle there also exists an antiparticle with opposite quantum numbers but the same mass.
Quarks appear only in color-neutral combinations. Combinations consisting of two quarks are called mesons, combinations of three quarks baryons. Combinations with more than three constituents are also possible but have not yet been observed in nature.

Figure 2.2: Feynman diagram of electron scattering. A virtual photon is exchanged in the interaction.

2.1.2 The interactions of the Standard Model
There are four types of interactions which we are able to experience: the gravitational, electromagnetic (e.m.), weak and strong interactions.

Gravitation
Gravitation is not included in the Standard Model of particle physics. I will briefly discuss its main properties.
The pull of gravitation is very small compared to the other interactions for energies below the Planck scale of 10^19 GeV. We can neglect it in the model of particle physics. Gravitation has no repulsion. For cosmic phenomena it becomes the dominant contribution, e.g. for the motion of the planets, stars and galaxies.
A boson called the graviton, which would carry the interaction, has not been discovered so far.

Electromagnetic interaction
The electromagnetic interaction is described by quantum electrodynamics (QED). QED is a gauge theory with the abelian symmetry group U(1). The U(1) implies only one generator, which represents the electromagnetic charge. The mediators of the interaction are virtual photons. Photons are bosons with spin 1 and have no mass. Furthermore they carry no charge themselves and are thus not able to couple to each other.
Figure 2.2 shows a leading order Feynman diagram of a QED process. Two electrons interact with each other via a virtual photon. The coupling at the vertices is proportional to √α, where the coupling constant is α ≈ 1/137 for small energy transfers. Therefore it is possible to calculate QED processes in terms of perturbation theory. Quite good approximations are already reached at O(α²).

Weak interaction
The weak interactions are described by the nonabelian symmetry group SU(2). This group implies three generators. One possible representation of these generators is given by the Pauli matrices J_i, i = 1, 2, 3, multiplied by a factor of 1/2. Three generators lead to three gauge bosons. All three gauge bosons have been discovered. There is the neutral Z boson with a mass of 91.2 GeV and the two charged W± bosons with a mass of 80.4 GeV. The Z and W± bosons are spin 1 particles.
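The generator counts quoted in this section follow from a general rule, written out here as a short worked step. The symbol T_i for the SU(2) generators is notation introduced only for this illustration; the text writes the Pauli matrices as J_i.

```latex
% Number of generators of SU(N) (U(1) has a single generator)
\dim SU(N) = N^2 - 1
\quad\Rightarrow\quad
\dim SU(2) = 3, \qquad \dim SU(3) = 8 .

% SU(2) generators built from the Pauli matrices J_i
T_i = \tfrac{1}{2}\, J_i , \qquad
[\,T_i, T_j\,] = i\,\epsilon_{ijk}\, T_k , \qquad i, j, k \in \{1, 2, 3\} .
```

The non-vanishing commutator is what "nonabelian" means; for U(1) there is only one generator, so every commutator vanishes.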


[Plot: CMS preliminary, 36 pb⁻¹ at √s = 7 TeV, Z/γ* → μμ; (1/σ_Z) dσ/dM versus M(μμ) [GeV]; data compared with NNLO, FEWZ+MSTW08, with the uncertainty on modeling.]

Figure 2.3: The normalized Drell-Yan mass spectrum, (1/σ_Z) dσ/dM, obtained in the di-muon channel and compared to theoretical predictions. The uncertainty on the modeling accounts for differences in the acceptance corrections obtained with POWHEG and FEWZ. [CMS11]

Furthermore, the high masses lead to a short range of the weak force because of the Heisenberg uncertainty principle. The product of the energy ΔE borrowed from the vacuum and the time scale Δt for which it is borrowed is limited. Therefore the weak interaction is strongly suppressed. A rough estimate of the maximum range l of the weak interaction is given by l = c·Δt, where c is the speed of light.
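As a rough worked estimate of this range, one can assume ΔE·Δt ≈ ħ and take ΔE to be the rest energy of the W boson (80.4 GeV, as quoted above). Both the use of ħ rather than ħ/2 and the value ħc ≈ 197.3 MeV fm are inputs assumed here, not taken from the text, so only the order of magnitude is meaningful.

```latex
\Delta t \approx \frac{\hbar}{\Delta E} = \frac{\hbar}{M_W c^2}
\quad\Rightarrow\quad
l = c\,\Delta t \approx \frac{\hbar c}{M_W c^2}
  \approx \frac{197.3\ \mathrm{MeV\,fm}}{80.4 \times 10^{3}\ \mathrm{MeV}}
  \approx 2.5 \times 10^{-3}\ \mathrm{fm}
  = 2.5 \times 10^{-18}\ \mathrm{m} .
```

Under these assumptions the range is a few hundred times smaller than the radius of a proton (about 1 fm).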
The gauge bosons couple to the third component of the weak isospin, I3. Right-handed particles and left-handed antiparticles have I3 = 0. Therefore the weak interaction couples only to half of the particles. The Lorentz-invariant property of handedness is the chirality. It is a generalization of the helicity h, which is defined as the normalized projection of the spin S of the particle onto the direction of its momentum p: h = S·p̂. For massive particles the helicity is not Lorentz invariant.
Figure 2.3 shows the Drell-Yan mass spectrum in the range of the Z mass obtained in the di-muon channel at CMS. Z bosons are easy to produce if the center-of-mass energy is close to the boson mass. We see a resonant structure in the spectrum. Away from the resonance we see the expected 1/s dependence (note the double logarithmic scale), where s is the square of the center-of-mass energy √s.
Another difference between the weak and the electromagnetic interaction is the possibility for the gauge bosons to couple to each other. This is due to the nonabelian structure of the gauge theory. Vertices with three or four Z and W± bosons are possible.

Strong interaction
The strong interaction couples only to quarks and not to leptons. The quark model is formulated as a nonabelian gauge theory with the symmetry group SU(3). The charge is called color charge in analogy to the mixing of colored light. There are red, green and blue charges and the corresponding anticolors. All physical objects are colorless. This can be achieved by combining a color with the same anticolor or by combining all three colors. This leads to mesons, composed of a quark and an antiquark, and baryons, composed of three quarks.
The SU(3) implies eight generators, which lead to eight different gluons. The generators are the lambda matrices. The gluons carry a color and an anticolor. Because of the nonabelian structure of SU(3), couplings of the gluons to each other are again possible. There exist three-gluon as well as four-gluon vertices. The gluons are spin 1 particles.
In contrast to the electromagnetic theory, the coupling constant of the strong interaction, α_s, is not small. As a consequence, physical processes of the strong interaction can be calculated by perturbation theory only for large momentum transfer. α_s becomes smaller for larger momentum transfer and asymptotically approaches zero. This is called asymptotic freedom. If the momentum transfer is small, the higher order contributions O(α_s^n) become large and approximations are very difficult.
Another effect of the strong interaction is the increase of the potential with increasing distance. This leads to confinement. If one tries to separate two quarks, the required energy rises with the distance between the two particles until enough energy is available to generate a new quark/antiquark pair out of the vacuum.
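The decrease of α_s with increasing momentum transfer can be made concrete with the one-loop running formula. The sketch below is an illustration only: the reference value α_s(M_Z) = 0.118, the fixed number of five active quark flavors and the function name alpha_s_one_loop are assumptions introduced here, not taken from this thesis.

```python
import math

M_Z = 91.2          # GeV, Z boson mass as quoted in the text
ALPHA_S_MZ = 0.118  # assumed reference value of alpha_s at the Z mass


def alpha_s_one_loop(q, n_flavors=5):
    """One-loop strong coupling at momentum transfer q (in GeV).

    The coupling is evolved from the reference scale M_Z with a fixed
    number of active flavors, which is a simplification.
    """
    b0 = (33.0 - 2.0 * n_flavors) / (12.0 * math.pi)
    return ALPHA_S_MZ / (1.0 + b0 * ALPHA_S_MZ * math.log(q**2 / M_Z**2))


for q in (10.0, 91.2, 1000.0):
    print(f"alpha_s({q:g} GeV) = {alpha_s_one_loop(q):.3f}")
```

The printed values fall as q grows, which is the behaviour called asymptotic freedom above. Towards small q the denominator approaches zero and the one-loop expression stops being meaningful, in line with the statement that perturbation theory fails at small momentum transfer.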
At hadron colliders this leads to so-called hadronization. Instead of single quarks, many particles are produced which form so-called jets. The jets consist of hadrons moving in similar directions. The jets themselves are reconstructed as objects with properties corresponding to the properties of the leading quarks.

2.2 Heavy quark production
I found a useful introduction to heavy quark production in the proceedings of the heavy flavor working group at the HERA-LHC Workshop in 2006 [BBB+06]. The main parts are extracted here:
Perturbative QCD is expected to provide reliable predictions for the production of bottom and (to a lesser extent) charm quarks, since their masses are large enough to assure the applicability of perturbative calculations. However, a direct comparison of perturbative QCD predictions to heavy flavor production data is not straightforward. Difficulties arise
• from the presence of scales very different from the quark masses, which reduce the predictivity of fixed-order theory,
• from the non-perturbative ingredients, which are needed to parametrize the fragmentation of the heavy quarks into the observed heavy hadrons, and
• from the limited phase space accessible to present detectors.
Moreover, a breakdown of the standard collinear factorization approach can be expected at low momenta of the partons. The study of heavy quark production in hadronic interactions, together with the results of the electron-proton collisions at HERA, has therefore been an active field in the effort to overcome these difficulties and to gain a deeper understanding of hard interactions.
Besides its intrinsic interest, a precise understanding of heavy quark production is important at the LHC because charm and beauty from QCD processes are relevant backgrounds to other interesting processes from the Standard Model (e.g. Higgs to bb̄) or beyond. Moreover, theoretical and experimental techniques developed at HERA in the heavy quark field, such as heavy-quark parton densities or b-tagging, are also of great value for future measurements at the LHC.
After exciting years with the 'rise and fall of the bottom quark production excess' [Cac04], oil was poured on troubled waters and a rational route for further investigations of this interesting topic in physics emerged.
In my description of the theory behind the analysis I will refer mainly to these proceedings and summarize the ideas and prospects they formulated for the measurements at the LHC. In the following I will introduce the basics of the analysis and draw the picture of the main physics behind this thesis. I will point to the complex approaches needed for predictions in QCD, and therefore to the problems in the perturbative as well as in the non-perturbative parts of the calculations. To complete the picture of how confusing measurements of heavy flavor production were, I will recount the historical curiosities and conclude this chapter with the paradigm put forward by Matteo Cacciari [Cac04].

2.2.1 b-jet cross section
From the point of view of standard perturbative QCD calculations, the situation has not changed since the beginning of the 90s: fully massive next-to-leading order (NLO) calculations were made available for hadron-hadron, photon-hadron (i.e. photoproduction) and electron-hadron (i.e. Deep Inelastic Scattering, DIS) collisions. These calculations still constitute the state of the art as far as fixed order results are concerned, and they form the basis for all modern phenomenological predictions.
This statement on the theory of heavy flavor production was given in [BBB+06] in 2006. The perturbative QCD calculations are therefore the basis for an inclusive b-jet cross section measurement.
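From the experimental point of view the cross section σ is defined by the number of events N produced at a certain integrated luminosity L:
\[ \sigma = \frac{N}{L}. \]
In this thesis we are interested in the differential cross section of b-jets:
\[ \frac{d^2\sigma_{b\text{-jet}}}{dp_T\,dy} = \frac{\partial^2 N_{b\text{-jet}}}{\partial p_T\,\partial y\;L}. \]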
To measure this quantity, the number of b-jets has to be counted in different ranges of the transverse momentum pT of the jet and its rapidity y. In addition the integrated luminosity has to be measured. The latter was already done by the CMS collaboration [CMS10g]. The remaining part, the analysis of the b-jets, was first done during summer 2010 on the very early data of the CMS experiment [CMS10e]. In this thesis the update and the improvement of the former measurement will be discussed.
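Written for a single bin, the definition above amounts to dividing the b-jet count by the integrated luminosity and by the bin widths. The following minimal sketch shows this arithmetic; the function name and the example numbers are hypothetical, and the efficiency and purity corrections that a real measurement requires are deliberately left out.

```python
def differential_cross_section(n_bjets, luminosity, pt_range, y_range):
    """Return N / (L * delta_pT * delta_y) for one (pT, y) bin.

    n_bjets    -- number of b-jets counted in the bin
    luminosity -- integrated luminosity, here in 1/pb
    pt_range   -- (low, high) edges of the pT bin in GeV
    y_range    -- (low, high) edges of the rapidity bin
    The result is in pb/GeV for these units.
    """
    delta_pt = pt_range[1] - pt_range[0]
    delta_y = y_range[1] - y_range[0]
    return n_bjets / (luminosity * delta_pt * delta_y)


# Hypothetical bin: 3600 b-jets with 40 < pT < 50 GeV and 0.0 < y < 0.5,
# in a sample of 36/pb: 3600 / (36 * 10 * 0.5) = 20 pb/GeV.
xsec = differential_cross_section(3600, 36.0, (40.0, 50.0), (0.0, 0.5))
print(f"d2sigma/dpT dy = {xsec:.1f} pb/GeV")
```

Dividing by the bin widths makes results from bins of different size comparable, which matters when the pT bins widen towards high momenta.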
Let us start with the already mentioned theoretical basis, the perturbative QCD calculations:

2.2.2 QCD predictions
A good overview of heavy quark production is given in [FNW03]. The following explanations are extracted from it.
For heavy flavor production we distinguish three production mechanisms: flavor creation (FCR), flavor excitation (FEX) and gluon splitting (GSP). FCR processes occur already at O(α_s²), while FEX and GSP first appear at O(α_s³).
At leading order (LO) we have the following FCR processes:
\[ gg \to Q\bar{Q}, \qquad q\bar{q} \to Q\bar{Q}, \]
where g denotes the gluons, q the light quarks and Q the heavy quarks. At next-to-leading order (NLO) we have the following:
\[ gg \to Q\bar{Q}g, \qquad q\bar{q} \to Q\bar{Q}g, \qquad gq \to Q\bar{Q}q, \qquad g\bar{q} \to Q\bar{Q}\bar{q}. \]
There we can specify two further kinds of production mechanism. For FEX we define:
\[ qQ \to qQ, \qquad \bar{q}Q \to \bar{q}Q, \qquad gQ \to gQ, \]
and GSP is a hard gg → gg process followed by:
\[ g \to Q\bar{Q}. \]
Figures 2.4 to 2.6 illustrate the different production mechanisms.

Figure 2.4: Heavy quark Q production mechanisms at leading order (LO). These processes are called flavor creation (FCR).

Figure 2.5: At next-to-leading order the production mechanisms are classified into three kinds. Here an example of flavor excitation (FEX) is illustrated. Processes are assigned to it if the scattering involves a heavy quark out of the proton.
FEX and GSP processes are well defined only in the case of large transverse momenta of the heavy quark. Their extrapolation to the low transverse momentum region can at best be considered a very rough model of higher-order heavy flavor production processes. Figure 2.7 shows the transverse momentum pT spectrum of the different production mechanisms, modeled with the Pythia6 event generator, tune Z2. The gluon splitting process becomes more dominant for high-momentum jets.
Figure 2.6: The gluon splitting category (GSP) refers to a hard gluon production process which is followed by the intrinsic gluon splitting.

[Plot: CMS simulation, √s = 7 TeV; number of jets per bin versus pT for GSP, FEX and FCR.]

Figure 2.7: Transverse jet momentum spectrum of the different b production mechanisms. For high pT jets the production is dominated by gluon splitting processes. The plot is made from a Pythia6 Tune Z2 sample where pT > 37 GeV.

In [BBB+06] the problems of such a calculation were discussed:
Perturbative calculations of heavy quark production contain badly converging logarithmic terms of quasi-collinear origin in higher orders when a second energy scale is present and it is much larger than the heavy quark mass m. Examples are the (square root of the) photon virtuality Q² in DIS and the transverse momentum pT in either hadroproduction or photoproduction. Naming generically E the large scale, we can write schematically the cross section for the production of the heavy quark Q as
\[ \sigma_Q(E,m) = \sigma_0 \left( 1 + \sum_{n=1}^{\infty} \alpha_s^n \sum_{k=0}^{n} c_{nk} \ln^k\!\left[\frac{E^2}{m^2}\right] + \mathcal{O}\!\left(\frac{m}{E}\right) \right), \]
where σ_0 is the Born cross section, and the coefficients c_nk can contain constants as well as functions of m and E, vanishing as powers of m/E when E ≫ m. Solving this equation for next-to-leading order processes needs advanced resummation approaches. Various approaches have been developed with the goal of resumming the leading logarithms (α_s^n ln^n(E²/m²), LL) and next-to-leading logarithms (α_s^n ln^(n−1)(E²/m²), NLL).
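To see why these logarithms spoil the convergence, it helps to put in numbers. The following estimate assumes a large scale E = pT = 100 GeV and α_s ≈ 0.12; both values are chosen here purely for illustration, while m = 4.2 GeV is the b-quark mass quoted earlier.

```latex
\ln\frac{E^2}{m^2} = 2\,\ln\frac{100\ \mathrm{GeV}}{4.2\ \mathrm{GeV}} \approx 6.3 ,
\qquad
\alpha_s \,\ln\frac{E^2}{m^2} \approx 0.12 \times 6.3 \approx 0.76 .
```

Under these assumptions each additional order brings a factor of order one rather than of order 0.1, so truncating the series at fixed order is unreliable and the terms α_s^n ln^n(E²/m²) have to be summed to all orders.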
Over the years, and with increasing experimental accuracies, it however became evident that perturbative QCD alone did not suffice. In fact, real particles - hadrons and leptons - are observed in the detectors, not the quarks and gluons of perturbative QCD. A proper comparison between theory and experiment requires that this gap be bridged by a description of the transition. Of course, the accuracy of such a description will reflect on the overall accuracy of the comparison. When the precision requirements were not too tight, one usually employed a Monte Carlo description to correct the data, deconvoluting hadronization effects and extrapolating to the full phase space. The final experimental result could then easily be compared to the perturbative calculation. This procedure has the inherent drawback of including the bias of our theoretical understanding (as implemented in the Monte Carlo) into an experimental measurement. This bias is of course likely to be more important when the correction to be performed is very large. It can sometimes become almost unacceptable, for instance when exclusive measurements are extrapolated by a factor of ten or so in order to produce an experimental result for a total photoproduction cross section or a heavy quark structure function.
The alternative approach is to present (multi)differential experimental measurements, with cuts as close as possible to the real ones, which is to say with as little theoretical correction and extrapolation as possible. The theoretical prediction must then be refined in order to compare with the real data that it must describe. This has two consequences. First, one has to deal with differential distributions which, in certain regions of phase space, display a bad convergence in perturbation theory. All-order resummations must then be performed in order to produce reliable predictions. Second, differential distributions of real hadrons depend unavoidably on some non-perturbative phenomenological inputs, fragmentation functions. Such inputs must be extracted from data and matched to the perturbative theory in a proper way, pretty much like parton distribution functions of light quarks and gluons are.
To satisfy these requirements, Stefano Frixione and Bryan R. Webber proposed in [FW02] the MC@NLO method for matching the next-to-leading order calculation of a given QCD process with a parton shower Monte Carlo simulation. In almost all QCD analyses this method is used to compare the measurements with the NLO prediction.

2.2.3 Historical context
A good overview of the strange historical progress of b-quark production measurements can be found in [Cac04]. The main points are listed here:
Measurements of the bottom transverse momentum spectrum at colliders began in the late 80s, when the UA1 Collaboration, taking data at the CERN Spp̄S with √s = 546 and 630 GeV, published results for the pT > m_b (the bottom quark mass) region.

Figure 2.8: UA1 b-quark cross section measurement [A+91]. The experimental points originate from independent measurements: b → J/Ψ X (solid circle), high mass dimuons (open circle), low mass dimuons (triangles) and muon jets (squares). These results were compared to the then recently completed next-to-leading order (NLO) calculations.
The UA1 collaboration published two papers about beauty production: [A+87] and [A+91].
Figure 2.8 shows the result from [A+91]. The inclusive b cross section is plotted for rapidity |y| < 1.5. The experimental points come from independent measurements: b → J/Ψ X (solid circle), high mass dimuons (open circle), low mass dimuons (triangles) and muon jets (squares). These results were compared to the then recently completed next-to-leading order (NLO), i.e. order α_s³, calculation ([NDE88] and [NDE89]), and were found to be in good agreement.
During the 90s the CDF and D0 Collaborations also measured the bottom quark pT distribution in pp̄ collisions at the Fermilab Tevatron at √s = 1800 GeV. The main difference from the UA1 measurements is that they mainly derived the b cross sections from the production rates of specific B hadrons. This involves more difficult parts of non-perturbative QCD which are needed to describe the hadronization process and were not well modeled at that time.
There were seven papers published by the CDF collaboration:
• Measurement of the B-meson and b-quark cross sections at √s = 1.8 TeV using the exclusive decay B± → J/ψ K± [A+92]
• Measurement of the bottom quark production cross section using semileptonic decay electrons in pp̄ collisions at √s = 1.8 TeV [A+93b]
• Measurement of bottom quark production in 1.8 TeV pp̄ collisions using muons from b-quark decays [A+93a]
• Measurement of the B meson and b quark cross sections at √s = 1.8 TeV using the exclusive decay B0 → J/ψ K*(892)0 [A+94]


Figure 2.9: b quark production cross section for |y_b| < 1.0 compared with the inclusive single muon results and the NLO QCD prediction. On the right is the result of the B+ meson differential cross section measurements from CDF normalized to the NLO predictions.
• Measurement of the B Meson Differential Cross Section dσ/dpT in pp̄ Collisions at √s = 1.8 TeV [A+95b]
• Measurement of the B+ total cross section and B+ differential cross section dσ/dpT in pp̄ collisions at √s = 1.8 TeV [A+02a]
• Measurement of the ratio of b quark production cross sections in pp̄ collisions at √s = 630 GeV and √s = 1800 GeV [A+02b]

Three papers were published by the D0 collaboration:
• Inclusive μ and b-Quark Production Cross Sections in pp̄ Collisions at √s = 1.8 TeV [A+95a]
• Small-Angle Muon and Bottom-Quark Production in pp̄ Collisions at √s = 1.8 TeV [A+00a]
• The bb̄ production cross section and angular correlations in pp̄ collisions at √s = 1.8 TeV [A+00b]
Figure 2.9 shows results of the b cross section measurements done at Tevatron Run 1. The left plot is taken from [A+00b] and shows the b quark production cross section for |y_b| < 1.0 compared with the revised inclusive single muon results and the NLO QCD prediction. The error bars on the data represent the total error. The theoretical uncertainty band shows the uncertainty associated with the factorization and renormalization scales and the b quark mass. Also shown are the inclusive single muon data from CDF [A+93a]. The right-hand side of the figure shows the result of the B+ meson differential cross section measurements from CDF normalized to the NLO predictions [A+02a]. Both plots show a large discrepancy between data and prediction.
Apparently at odds with the UA1 results, the Tevatron data seemed to display an excess with respect to NLO QCD predictions.
At the same time, rates for bottom production that appeared higher than QCD predictions were also observed in so-called γγ collisions by three LEP experiments: L3 ([A+01], [A+05]), OPAL [C+00] and DELPHI [Sil04]. A γγ collision at an electron-positron collider means that both initial particles remain after the interaction. The collision happens through the exchange of photons γ.


Figure 2.10: Photoproduction of beauty quarks in events with two jets and a muon. The filled points show the ZEUS results from this analysis and the open point is the previous ZEUS measurement in the electron channel [B+01]. The full error bars are the quadratic sum of the statistical (inner part) and systematic uncertainties. The dashed line shows the NLO QCD prediction with the theoretical uncertainty shown as the shaded band.

Differences were also found by the H1 [A+99] and ZEUS [B+01] Collaborations in ep collisions at HERA.
All these analyses measured the open beauty production in γγ collisions. This higher-order process is needed to be sensitive to the pQCD NLO calculation at an electron collider.
But despite this seemingly overwhelming evidence of an excess of b quarks, theorists argue that QCD is instead rather successful in predicting bottom production rates. Improved theoretical analyses and more recent experimental measurements by the CDF and ZEUS Collaborations support this claim, which is also borne out by a critical reconsideration of previous results.
The ZEUS collaboration measured the photoproduction of beauty quarks in events with two jets and a muon [C+04]. The resulting b cross section is shown in figure 2.10. The improved theoretical predictions are compatible with the data.
In Tevatron Run 2 two results were presented by the CDF collaboration. One covers the inclusive b cross section [CDF05] and the other measures the bb̄ di-jet production [CDF07]. Both show good agreement with the NLO predictions (Figures 2.11 and 2.12).
Finally we have results from the LHC. Until now the data seem to lie in the predicted regions, although we have already reached regions not covered by the Tevatron experiments, where pT > 400 GeV or the rapidity y is in the very forward direction. The forward region was explored by the LHCb experiment [A+10].
Further results have been published by the CMS collaboration: measurements of the inclusive b-hadron production cross section with muons [K+11b] and of the B± production cross section [K+11a]. The angular correlations of two b quarks were also analyzed [K+11c].
The inclusive b-jet cross section measurement is presented in this thesis in chapter 6. For this measurement only a preliminary result exists so far [CMS10e].


[Plots: CDF Run II Preliminary, √s = 1.96 TeV, ∫L ≈ 300 pb⁻¹; MidPoint jets, Rcone = 0.7, fmerge = 0.75, |Y| < 0.7; data corrected to hadron level, with systematic errors, compared with the NLO prediction (CTEQ6M) and its scale uncertainties. Upper panel: d²σ/dY dpT (nb/[GeV/c]) versus jet pT (GeV/c). Lower panel: data / NLO prediction versus jet pT (GeV/c).]
Figure 2.11: The upper plot shows the inclusive b-jet cross section over a pT range between 38 and 400 GeV measured at CDF Run 2. In the lower plot the same is shown relative to the theory predictions.

[Plots: CDF Run II Preliminary, √s = 1.96 TeV, L ≈ 260 pb⁻¹; JetClu jets, Rcone = 0.4, |η| < 1.2, ET,1 > 35 GeV, ET,2 > 32 GeV. Data with systematic uncertainty compared with Pythia (CTEQ5L) Tune A, Herwig (CTEQ5L) + Jimmy and MC@NLO (CTEQ6M) + Jimmy. Left panel: differential cross section [pb/GeV] versus leading jet ET (GeV). Right panel: dσ/d(Δφ) [pb/rad] versus Δφ (rad).]
Figure 2.12: The differential bb̄ cross section as a function of the leading jet ET is plotted on the left. On the right the angle between the two b quarks is shown. This distribution is sensitive to the fractions of the three production mechanisms FCR, FEX and GSP.


2.2.4 Conclusion
Looking back, the measurement of b quark production rates has caused a number of ambiguities and misinterpretations. Matteo Cacciari therefore formulated in [Cac04] a paradigm which points out the problems in comparisons between measurements and predictions and proposes a procedure for analyses of b quark production:

’We shall take NLO QCD calculations as a benchmark for comparisons. We shall require the experimental measurements to be genuine observable quantities. By this we mean that as a matter of principle we do not wish to compare data for, e.g., b-quark pT distributions, since such a quantity is clearly an unphysical one: the quark not being directly observed, its cross sections have to be inferred rather than directly measured. A meaningful comparison will therefore be one between a physical cross section and a QCD calculation with at least NLO accuracy. Non-perturbative information, where needed, will have to be introduced in a minimal and self-consistent way. This means that we refrain from using unjustified models, and we shall only include non-perturbative information that has been extracted from one experiment and then employed in predicting another observable, using the same underlying perturbative framework in both cases. Such a precaution allows for a good matching between the perturbative and the non-perturbative phases, a necessity in that only the combination of the two steps leads to an unambiguous measurable quantity.
In practice, the non-perturbative information relative to the hadronization of the b-quarks into B-hadrons is extracted from LEP data with a calculation which has NLO+NLL accuracy. ... the LEP (or SLD) data are translated to Mellin moments space, and only the moments around N = 5 are fitted. This ensures that it is the relevant part of the non-perturbative information which is properly determined. These non-perturbative moments are then used together with a calculation having the same perturbative features, FONLL (Fixed Order plus Next-to-Leading Log - in this case log(pT²/mb²)), to evaluate the cross sections in pp̄ collisions.
The expectation is then that total cross sections be reproduced by the NLO calculations for b quarks, and that differential distributions for B hadrons be correctly described by a proper convolution of the FONLL perturbative spectrum for b quarks and the non-perturbative information extracted from LEP data. Notice that a minimalist use of non-perturbative information is made: there is no attempt to fully describe the hadronization process. Only the relevant phenomenological information is determined from data and used in the predictions.
A successful comparison will see data and theory in agreement within their combined uncertainties. The theoretical ones will be assessed by varying as extensively as reasonable the parameters and the unphysical scales entering the predictions. As for the experimental errors, it is perhaps worth reminding that only 1-sigma errors are usually shown on the plots, so that non-overlapping bands do not necessarily point to a solid disagreement.’
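
The last point of the quotation can be illustrated with a small numerical sketch. It assumes uncorrelated Gaussian uncertainties that are added in quadrature, which is a simplification and not a prescription taken from [Cac04]; the function name and all numbers are hypothetical and chosen only for illustration.

```python
import math


def pull(data, data_err, theory, theory_err):
    """Data-theory difference in units of the combined uncertainty.

    Assumption: the two uncertainties are uncorrelated and Gaussian,
    so they are combined in quadrature.
    """
    combined = math.hypot(data_err, theory_err)
    return (data - theory) / combined


# Hypothetical cross section values in arbitrary units (illustration only).
data, data_err = 3.0, 0.5
theory, theory_err = 2.0, 0.4

# Do the 1-sigma bands drawn on a plot touch each other?
bands_overlap = (data - data_err) <= (theory + theory_err)
significance = pull(data, data_err, theory, theory_err)

print(f"1-sigma bands overlap: {bands_overlap}")
print(f"difference: {significance:.2f} combined standard deviations")
```

With these illustrative numbers the two 1-sigma bands do not overlap, yet the difference is well below two combined standard deviations, which is the situation the quotation warns about.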

Chapter 3

The CMS experiment

In this chapter I outline, drawing on a variety of publications, the facilities that are bound up with the successful realization of this analysis. Starting with one of the most important centers of scientific research, where the accelerator and the experiment are built, I will go step by step into more detail until I reach the individual detector components that are relevant for my studies.
The following chapter is a summary with extracts from the public CERN web page (cern.ch) and, in addition, parts of the most informative papers I found, to give a complete impression of the large effort that was made to realize such a project.

3.1 CERN - Conseil Européen pour la Recherche Nucléaire

’CERN, the European Organization for Nuclear Research, is one of the world's largest and most respected centers for scientific research. Its business is fundamental physics, finding out what the Universe is made of and how it works. At CERN, the world's largest and most complex scientific instruments are used to study the basic constituents of matter - the fundamental particles. By studying what happens when these particles collide, physicists learn about the laws of Nature.
The instruments used at CERN are particle accelerators and detectors. Accelerators boost beams of particles to high energies before they are made to collide with each other or with stationary targets. Detectors observe and record the results of these collisions.
Founded in 1954, the CERN Laboratory sits astride the Franco-Swiss border near Geneva. It was one of Europe's first joint ventures and now has 20 Member States.’ [CER08a]
This is the short introduction on their web page. It is addressed to interested readers as an entry point into the fascinating world of CERN. For more than 50 years CERN has been a leader in scientific and technical inventions, which have resulted in various research highlights.

• 1954 Foundations for European science. CERN was ratified by the 12 founding Member States: Belgium, Denmark, France, the Federal Republic of Germany, Greece, Italy, the Netherlands, Norway, Sweden, Switzerland, the United Kingdom, and Yugoslavia. On 29 September 1954 the European Organization for Nuclear Research officially came into being.
• 1957 The first accelerator began operation. The 600 MeV Synchrocyclotron (SC) was CERN's first accelerator and it provided beams for CERN's first particle and nuclear physics experiments.
• 1959 The PS started up. The Proton Synchrotron (PS) accelerated protons for the first time. With a beam energy of 28 GeV, the PS became host to CERN's particle physics program, and provides beams for experiments to this day.


• 1968 Georges Charpak revolutionized detection. Georges Charpak developed the multiwire proportional chamber, a gas-filled box with a large number of parallel detector wires, each connected to individual amplifiers. Linked to a computer, it could achieve a counting rate a thousand times better than existing detectors. The invention revolutionized particle detection, which passed from the manual to the electronic era. Charpak was awarded the 1992 Nobel Prize in Physics for his work on particle detectors.
• 1971 The world's first proton-proton collider. The Intersecting Storage Rings (ISR) produced the world's first proton-proton collisions, providing CERN with valuable knowledge and expertise for its subsequent colliding-beam projects.
• 1973 Neutral currents are revealed. In an experiment conducted by André Lagarrigue and colleagues, an invisible neutrino passed through the Gargamelle bubble chamber at CERN, jolting an electron in its wake.
• 1976 The SPS is commissioned. Measuring 7 km in circumference, the Super Proton Synchrotron (SPS) was the first of CERN's giant rings. Built in a tunnel, it was also the first accelerator to cross the Franco-Swiss border. Initially conceived as a proton accelerator with a beam energy of 300 GeV, the SPS operates today at up to 450 GeV, and has handled many different kinds of particles.
• 1983 Discovery of the W and Z particles. In 1983, CERN announced the discovery of the W and Z particles. The discovery was so important that Carlo Rubbia and Simon van der Meer, the two key scientists behind the discovery, received the Nobel Prize in physics only a year after.
• 1986 Heavy-ion collisions begin. CERN began to accelerate heavy ions - nuclei containing many neutrons and protons - in the Super Proton Synchrotron (SPS). The aim was to deconfine the quarks by smashing the heavy ions into appropriate targets.
• 1989 Giant LEP started up. LEP was commissioned in July 1989. During 11 years of research, LEP and its experiments provided a detailed study of the electroweak interaction based on solid experimental foundations. Measurements performed at LEP also proved that there are three - and only three - generations of particles of matter. LEP was closed down on 2 November 2000 to make way for the construction of the LHC in the same tunnel.
• 1990 Tim Berners-Lee invented the Web. Berners-Lee defined the Web's basic concepts, the URL, http and html, and wrote the first browser and server software.
• 1993 Precise results on matter-antimatter asymmetry. The NA31 experiment at CERN published the first precise results on what is known as direct CP symmetry breaking, which indicates more clearly the physics underlying the phenomenon.
• 1995 First observation of antihydrogen. A team led by Walter Oelert created atoms of antihydrogen for the first time at CERN's Low Energy Antiproton Ring (LEAR) facility. Nine of these atoms were produced in collisions between antiprotons and xenon atoms over a period of three weeks.
• 2002 Capturing antihydrogen atoms. Two CERN experiments, ATHENA and ATRAP, took a major step towards understanding antimatter in 2002 by creating thousands of atoms of antimatter in a cold state.


• 2004 CERN celebrates its 50th anniversary. The inauguration of the Globe in 2004 coincided with the official celebration of CERN's anniversary, attended by representatives of the Organization's 20 Member States including the heads of state of France, Spain and Switzerland.
• 2009/10 The LHC started up.
More details on the highlights can be found in [CER08a].

3.2 LHC - Large Hadron Collider
The Large Hadron Collider (LHC) is a gigantic scientific instrument near Geneva, where it spans the border between Switzerland and France about 100 m underground. It is a particle accelerator used by physicists to study the smallest known particles - the fundamental building blocks of all things. It will revolutionise our understanding, from the minuscule world deep within atoms to the vastness of the Universe.
Two beams of subatomic particles called hadrons - either protons or lead ions - will travel in opposite directions inside the circular accelerator, gaining energy with every lap. Physicists will use the LHC to recreate the conditions just after the Big Bang, by colliding the two beams head-on at very high energy. Teams of physicists from around the world will analyse the particles created in the collisions using special detectors in a number of experiments dedicated to the LHC.
There are many theories as to what will result from these collisions, but what's for sure is that a brave new world of physics will emerge from the new accelerator, as knowledge in particle physics goes on to describe the workings of the Universe. For decades, the Standard Model of particle physics has served physicists well as a means of understanding the fundamental laws of Nature, but it does not tell the whole story. Only experimental data using the higher energies reached by the LHC can push knowledge forward, challenging those who seek confirmation of established knowledge, and those who dare to dream beyond the paradigm. [CER08b]
The LHC, the world's largest and most powerful particle accelerator, is the latest addition to CERN's accelerator complex. It mainly consists of a 27 km ring of superconducting magnets with a number of accelerating structures to boost the energy of the particles along the way.
Inside the accelerator, two beams of particles travel at close to the speed of light with very high energies before colliding with one another. The beams travel in opposite directions in separate beam pipes - two tubes kept at ultrahigh vacuum. They are guided around the accelerator ring by a strong magnetic field, achieved using superconducting electromagnets. These are built from coils of special electric cable that operates in a superconducting state, efficiently conducting electricity without resistance or loss of energy. This requires chilling the magnets to about −271 °C - a temperature colder than outer space! For this reason, much of the accelerator is connected to a distribution system of liquid helium, which cools the magnets, as well as to other supply services.
Thousands of magnets of different varieties and sizes are used to direct the beams around the accelerator. These include 1232 dipole magnets of 15 m length which are used to bend the beams, and 392 quadrupole magnets, each 5-7 m long, to focus the beams. Just prior to collision, another type of magnet is used to squeeze the particles closer together to increase the chances of collisions. The particles are so tiny that the task of making them collide is akin to firing needles from two positions 10 km apart with such precision that they meet halfway!
All the controls for the accelerator, its services and technical infrastructure are housed under one roof at the CERN Control Centre. From here, the beams inside the LHC will be made to collide at four locations around the accelerator ring, corresponding to the positions of the particle detectors. [CER08b]


Figure 3.1: Map of the LHC and its hinterland. The red regions correspond to the villages next to CERN. The four experiments are drawn at their locations in the accelerator ring.

Figure 3.1 shows the detectors and their location. The six experiments at the LHC are all run by international collaborations, bringing together scientists from institutes all over the world. Each experiment is distinct, characterized by its unique particle detector.
The two large experiments, ATLAS and CMS, are based on general-purpose detectors to analyse the myriad of particles produced by the collisions in the accelerator. They are designed to investigate the largest range of physics possible. Having two independently designed detectors is vital for cross-confirmation of any new discoveries made.
Two medium-size experiments, ALICE and LHCb, have specialized detectors for analyzing the LHC collisions in relation to specific phenomena.
Two experiments, TOTEM and LHCf, are much smaller in size. They are designed to focus on forward particles (protons or heavy ions). These are particles that just brush past each other as the beams collide, rather than meeting head-on.
The ATLAS, CMS, ALICE and LHCb detectors are installed in four huge underground caverns located around the ring of the LHC. The detectors used by the TOTEM experiment are positioned near the CMS detector, whereas those used by LHCf are near the ATLAS detector. [CER08b]

3.3 CMS detector - Compact Muon Solenoid detector

CMS stands for Compact Muon Solenoid: compact because it is small for its enormous weight, muon for one of the particles it detects, and solenoid for the coil inside its huge superconducting magnet. It is a high-energy physics experiment in Cessy, France, part of the Large Hadron Collider (LHC) at CERN.
CMS is designed to see a wide range of particles and phenomena produced in high-energy collisions in the LHC. Like a cylindrical onion, different layers of detectors stop and measure the different particles, and use this key data to build up a picture of events at the heart of the collision.
Scientists then use this data to search for new phenomena that will help to answer questions such as: What is the Universe really made of and what forces act within it? And what gives

Figure 3.2: Model of the CMS detector decorated with pictures of the different detector components.²

everything substance? CMS will also measure the properties of previously discovered particles with unprecedented precision, and be on the lookout for completely new, unpredicted phenomena. [CMS08b]
Detectors consist of layers of material that exploit the different properties of particles to catch and measure the energy and momentum of each one. CMS was designed around getting the best possible scientific results, and therefore to look for the most efficient ways of finding evidence for new physical theories. This put certain requirements on the design. CMS needed:

• a high performance system to detect and measure muons,
• a high resolution method to detect and measure electrons and photons (an electromagnetic calorimeter),
• a high quality central tracking system to give accurate momentum measurements, and
• a hermetic hadron calorimeter, designed to entirely surround the collision and prevent particles from escaping.

With these priorities in mind, the first essential item was a very strong magnet. The higher a charged particle's momentum, the less its path is curved in the magnetic field, so when we know its path we can measure its momentum. A strong magnet was therefore needed to allow us to accurately measure even the very high momentum particles, such as muons. A large magnet also allowed for a number of layers of muon detectors within the magnetic field, so momentum could be measured both inside the coil (by the tracking devices) and outside of the coil (by the muon chambers).
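
The relation between curvature and momentum described above can be sketched numerically. The snippet uses the standard rule of thumb pT [GeV/c] ≈ 0.3 · B [T] · R [m] for a particle of unit charge moving perpendicular to a uniform field. The field value of 3.8 T and the 1 m chord length are assumptions made for this illustration and are not taken from the text; the function names are hypothetical.

```python
import math

B_FIELD_TESLA = 3.8  # assumed solenoid field strength (not stated in the text)
CHORD_M = 1.0        # assumed straight-line distance between first and last hit


def bending_radius_m(pt_gev, b_tesla=B_FIELD_TESLA, charge=1.0):
    """Radius of the circular track in the plane transverse to the field."""
    return pt_gev / (0.3 * abs(charge) * b_tesla)


def sagitta_m(pt_gev, chord_m=CHORD_M, b_tesla=B_FIELD_TESLA):
    """Maximum deviation of the arc from the straight chord between two hits."""
    radius = bending_radius_m(pt_gev, b_tesla)
    half_chord = chord_m / 2.0
    if half_chord > radius:
        raise ValueError("track curls up before spanning the chord")
    return radius - math.sqrt(radius**2 - half_chord**2)


for pt in (10.0, 100.0, 1000.0):
    print(f"pT = {pt:6.0f} GeV/c: R = {bending_radius_m(pt):8.1f} m, "
          f"sagitta = {sagitta_m(pt) * 1e3:.3f} mm")
```

The sagitta shrinks roughly in proportion to 1/pT, so the higher the momentum, the more precise the position measurements or the stronger the field has to be to resolve the curvature at all.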
The magnet is the Solenoid in Compact Muon Solenoid (CMS). The solenoid is a coil of superconducting wire that creates a magnetic field when electricity flows through it; in CMS the solenoid has an overall length of 13 m and a diameter of 7 m, and a magnetic field about 100,000 times stronger than that of the Earth. It is the largest magnet of its type ever constructed and allows
² Taken from http://bigscience.web.cern.ch/bigscience/en/cms/cms2.html


the tracker and calorimeter detectors to be placed inside the coil, resulting in a detector that is, overall, compact, compared to detectors of similar weight.
The design of the whole detector was also inspired by lessons learnt from previous CERN experiments at LEP (the Large Electron Positron Collider). Engineers found that building sections above ground, rather than constructing them in the cavern with all its access and safety issues, saved valuable time. Another important conclusion was that sub-detectors should be made more easily accessible to allow for easier and faster maintenance.
Thus CMS was designed in fifteen separate sections or slices that were built on the surface and lowered down ready-made into the cavern. Being able to work in parallel on excavating the cavern and building the detector saved valuable time. This slicing, along with the careful design of cabling and piping, also ensures that the sections can be fully opened and closed with minimum disruption, and each piece remains accessible within the cavern.
These considerations, along with the unique conditions of the LHC, affected the design of each layer of the detector. [CMS08b]

3.3.1 Tracking system
The tracking system consists of two main components: the pixel detector next to the beam pipe and the silicon strip detectors surrounding it.

Pixels
Momentum of particles is crucial in helping us to build up a picture of events at the heart of the collision. One method to calculate the momentum of a particle is to track its path through a magnetic field; the more curved the path, the less momentum the particle had. The CMS tracker records the paths taken by charged particles by finding their positions at a number of key points.
The tracker can reconstruct the paths of high-energy muons, electrons and hadrons (particles made up of quarks) as well as see tracks coming from the decay of very short-lived particles such as beauty or b quarks that will be used to study the differences between matter and antimatter.
The tracker needs to record particle paths accurately yet be lightweight so as to disturb the particle as little as possible. It does this by taking position measurements so accurate that tracks can be reliably reconstructed using just a few measurement points. Each measurement is accurate to 10 μm, a fraction of the width of a human hair. It is also the innermost layer of the detector and so receives the highest volume of particles: the construction materials were therefore carefully chosen to resist radiation.
The final design consists of a tracker made entirely of silicon: the pixels, at the very core of the detector and dealing with the highest intensity of particles, and the silicon microstrip detectors that surround it. As particles travel through the tracker the pixels and microstrips produce tiny electric signals that are amplified and detected. The tracker employs sensors covering an area the size of a tennis court, with 75 million separate electronic read-out channels: in the pixel detector there are some 6000 connections per square centimeter. [CMS08b]

Silicon Strip Detectors
After the pixels and on their way out of the tracker, particles pass through ten layers of silicon strip detectors, reaching out to a radius of 130 centimeters.
The tracker silicon strip detector consists of four inner barrel (TIB) layers assembled in shells with two inner endcaps (TID), each composed of three small discs. The outer barrel (TOB) consists of


Figure 3.3: Model of the CMS detector decorated with pictures of the different detector components.

six concentric layers. Finally two endcaps (TEC) close off the tracker. Each has silicon modules designed differently for its place within the detector.
This part of the tracker contains 15,200 highly sensitive modules with a total of 10 million detector strips read by 80,000 microelectronic chips. Each module consists of three elements: a set of sensors, its mechanical support structure and readout electronics.
Silicon sensors are highly suited to receive many particles in a small space due to their fast response and good spatial resolution. The silicon detectors work in much the same way as the pixels: as a charged particle crosses the material it knocks electrons from atoms, and within the applied electric field these move, giving a very small pulse of current lasting a few nanoseconds. This small amount of charge is then amplified by APV25 chips, giving us hits when a particle passes and allowing us to reconstruct its path. [CMS08b]

Looking at the two tracking detector components, it is easy to see that they place a considerable amount of material in the innermost regions of the detector. Electrons, photons and pions are therefore likely to interact before reaching the calorimeters for their energy measurements. This makes the reconstruction of the physical objects more difficult, but opens up a new field of particle identification based on bremsstrahlung. Figure 3.3 shows the material budget of the CMS tracker in units of radiation length. [CMS09b]

3.3.2 Calorimeter

Outside the tracker are calorimeters that measure the energy of particles. In measuring the momentum, the tracker should interfere with the particles as little as possible, whereas the calorimeters are specifically designed to stop the particles in their tracks.
The Electromagnetic Calorimeter (ECAL) - made of lead tungstate, a very dense material that produces light when hit - measures the energy of photons and electrons, whereas the Hadron Calorimeter (HCAL) is designed principally to detect any particle made up of quarks (the basic building blocks of protons and neutrons). The size of the magnet allows the tracker and calorimeters to be placed inside its coil, resulting in an overall compact detector. [CMS08b]


Electromagnetic calorimeter (ECAL)
In order to build up a picture of events occurring in the LHC, CMS must find the energies of emerging particles. Of particular interest are electrons and photons, because of their use in finding the Higgs boson and other new physics.
These particles are measured using an electromagnetic calorimeter (ECAL). But to find them with the necessary precision in the very strict conditions of the LHC - a high magnetic field, high levels of radiation and only 25 nanoseconds between collisions - required very particular detector materials.
Lead tungstate crystal is made primarily of metal and is heavier than stainless steel, but with a touch of oxygen in this crystalline form it is highly transparent and scintillates when electrons and photons pass through it. This means it produces light in proportion to the particle's energy. These high-density crystals produce light in fast, short, well-defined photon bursts that allow for a precise, fast and fairly compact detector.
Photodetectors that have been especially designed to work within the high magnetic field are also glued onto the back of each of the crystals to detect the scintillation light and convert it to an electrical signal that is amplified and sent for analysis.
The ECAL, made up of a barrel section and two endcaps, forms a layer between the tracker and the HCAL. The cylindrical barrel consists of 61,200 crystals formed into 36 supermodules, each weighing around three tonnes and containing 1700 crystals. The flat ECAL endcaps seal off the barrel at either end and are made up of almost 15,000 further crystals.
For extra spatial precision, the ECAL also contains Preshower detectors that sit in front of the endcaps. These allow CMS to distinguish between single high-energy photons (often signs of exciting physics) and the less interesting close pairs of low-energy photons. [CMS08b]

Hadron Calorimeter (HCAL)
The Hadron Calorimeter (HCAL) measures the energy of hadrons, particles made of quarks and gluons (for example protons, neutrons, pions and kaons). Additionally it provides indirect measurement of the presence of non-interacting, uncharged particles such as neutrinos.
Measuring these particles is important as they can tell us if new particles such as the Higgs boson or supersymmetric particles (much heavier versions of the standard particles we know) have been formed.
As these particles decay they may produce new particles that do not leave a record of their presence in any part of the CMS detector. To spot these the HCAL must be hermetic, that is, make sure it captures, to the extent possible, every particle emerging from the collisions. This way if we see particles shoot out one side of the detector, but not the other, with an imbalance in the momentum and energy (measured in the sideways transverse direction relative to the beam line), we can deduce that we are producing invisible particles.
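
The imbalance argument can be made concrete with a minimal sketch: summing the transverse energy vectors of all measured deposits and taking the negative of the sum gives the missing transverse energy. The input format (pairs of transverse energy and azimuthal angle) and the function name are assumptions for this illustration; this is not the reconstruction used in the CMS software.

```python
import math


def missing_transverse_energy(deposits):
    """Return (magnitude, azimuth) of the missing transverse energy.

    `deposits` is an iterable of (et_gev, phi_rad) pairs, one per measured
    energy deposit. The missing ET vector is minus the vector sum of all
    deposits in the plane transverse to the beam line.
    """
    sum_x = sum(et * math.cos(phi) for et, phi in deposits)
    sum_y = sum(et * math.sin(phi) for et, phi in deposits)
    magnitude = math.hypot(sum_x, sum_y)
    azimuth = math.atan2(-sum_y, -sum_x)
    return magnitude, azimuth


# Hypothetical events: two back-to-back deposits, balanced and unbalanced.
balanced = [(50.0, 0.0), (50.0, math.pi)]
unbalanced = [(50.0, 0.0), (20.0, math.pi)]

met_balanced, _ = missing_transverse_energy(balanced)
met_unbalanced, phi_unbalanced = missing_transverse_energy(unbalanced)

print(f"balanced event:   missing ET = {met_balanced:.1f} GeV")
print(f"unbalanced event: missing ET = {met_unbalanced:.1f} GeV")
```

In the unbalanced example the missing ET points opposite to the larger deposit, which is where an undetected particle would have to have gone. The same sum would also report a spurious imbalance if a familiar particle escaped through a gap, which is why hermeticity matters.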
To ensure that we are seeing something new, rather than just letting familiar particles escape undetected, layers of the HCAL were built in a staggered fashion so that there are no gaps in direct lines that a familiar particle might escape through.
The HCAL is a sampling calorimeter, meaning it finds a particle's position, energy and arrival time using alternating layers of absorber and fluorescent scintillator materials that produce a rapid light pulse when the particle passes through. Special optic fibers collect up this light and feed it into readout boxes where photodetectors amplify the signal. When the amount of light in a given region is summed up over many layers of tiles in depth, called a tower, this total amount of light is a measure of a particle's energy.
As the HCAL is massive and thick, fitting it into compact CMS was a challenge, as the cascades of

particles produced when a hadron hits the dense absorber material (known as showers) are large, and the minimum amount of material needed to contain and measure them is about one meter.
To accomplish this feat, the HCAL is organized into barrel (HB and HO), endcap (HE) and forward (HF) sections. There are 36 barrel wedges, each weighing 26 tonnes. These form the last layer of detector inside the magnet coil whilst a few additional layers, the outer barrel (HO), sit outside the coil, ensuring no energy leaks out the back of the HB undetected. Similarly, 36 endcap wedges measure particle energies as they emerge through the ends of the solenoid magnet.
Lastly, the two hadronic forward calorimeters (HF) are positioned at either end of CMS, to pick up the myriad particles coming out of the collision region at shallow angles relative to the beam line. These receive the bulk of the particle energy contained in the collision so must be very resistant to radiation and use different materials to the other parts of the HCAL. [CMS08b]

3.3.3 Muon detector
As the name Compact Muon Solenoid suggests, detecting muons is one of CMS's most important tasks. Muons are charged particles that are just like electrons and positrons, but are 200 times heavier. We expect them to be produced in the decay of a number of potential new particles; for instance, one of the clearest "signatures" of the Higgs boson is its decay into four muons.
Because muons can penetrate several meters of iron without interacting, unlike most particles they are not stopped by any of CMS's calorimeters. Therefore, chambers to detect muons are placed at the very edge of the experiment where they are the only particles likely to register a signal.
A particle is measured by fitting a curve to hits among the four muon stations, which sit outside the magnet coil and are interleaved with iron "return yoke" plates. By tracking its position through the multiple layers of each station, combined with tracker measurements, the detectors precisely trace a particle's path. This gives a measurement of its momentum because we know that particles traveling with more momentum bend less in a magnetic field. As a consequence, the CMS magnet is very powerful so we can bend even the paths of very high-energy muons and calculate their momenta.
In total there are 1400 muon chambers: 250 drift tubes (DTs) and 540 cathode strip chambers (CSCs) track the particles' positions and provide a trigger, while 610 resistive plate chambers (RPCs) form a redundant trigger system, which quickly decides whether to keep the acquired muon data or not. Because of the many layers of detector and different specialities of each type, the system is naturally robust and able to filter out background noise.
DTs and RPCs are arranged in concentric cylinders around the beam line (the barrel region) whilst CSCs and RPCs make up the endcap disks that cover the ends of the barrel. [CMS08b]


Chapter 4

Event reconstruction

Recording the collisions with the CMS detector (see 3.3) is the first part of a physics analysis. But it is almost impossible to study physics behaviour on the raw data. Most of the manpower is needed to transform this data into a usable structure. The main goal of this transformation is to reconstruct objects with well defined physics properties. Such objects are tracks, which are mainly reconstructed with the tracking system (see 3.3.1) in the middle of the detector. Tracks are produced by charged particles. Most of them are pions, but protons, kaons, muons and electrons also leave a visible signal which is recorded. With the additional information of the calorimeter (see 3.3.2) it is possible to build objects called jets, which correspond to elementary particles produced in the QCD process. To study QCD processes a well understood reconstruction of jet objects is very important. The muon detector (see 3.3.3) helps us to find muon candidates with a high purity. Furthermore it is possible to reconstruct electron candidates, tau candidates or b-jet candidates, which need more advanced pattern recognition algorithms for their identification.
The following section introduces how the recorded data is filtered by the CMS trigger system. I then present the samples which are used for this analysis, and explain the different physics objects stored in their files and how they are reconstructed within the CMS software framework (CMSSW).

4.1 Trigger system
The trigger system is the essential infrastructure which selects the samples for further analysis. It performs a rough classification of each event. Its main job is to reduce the huge amount of data and to distribute it into so-called trigger streams. A trigger stream provides an enriched sample of interesting events which are needed for the analysis. A good description of the trigger system can be found in [A+09]. The main parts of this section are extracted from there and condensed as much as possible, in order to give an introductory idea of how the large amount of data measured by the CMS experiment is filtered and recorded for analysis.
The trigger system consists of two modules: the CMS trigger [B+00] and the data acquisition system [CRS02]. They are designed to cope with unprecedented luminosities and interaction rates. At the LHC design luminosity of 10³⁴ cm⁻² s⁻¹ and bunch-crossing rates of 40 MHz, an average of about 20 interactions will take place at each bunch crossing. The trigger system must reduce the bunch-crossing rate to a final output rate of O(100) Hz, consistent with an archival storage capability of O(100) MB/s.
Only two trigger levels are employed in CMS. The first one, the Level-1 Trigger (L1T) [B+00], is implemented using custom electronics and is designed to reduce the event rate to 100 kHz. The second trigger level, the High Level Trigger (HLT), provides further rate reduction by analyzing full-granularity detector data, using software reconstruction and filtering algorithms running on a large computing cluster consisting of ordinary CPUs, the Event Filter Farm.
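
The rate reduction quoted above can be checked with simple arithmetic. The sketch below uses only the numbers given in the text and treats the "O(100)" values as exactly 100, which is an assumption made purely for illustration; the variable names are hypothetical.

```python
# Numbers quoted in the text; O(100) is taken as exactly 100 for illustration.
BUNCH_CROSSING_RATE_HZ = 40e6      # 40 MHz
INTERACTIONS_PER_CROSSING = 20     # at design luminosity
L1_OUTPUT_RATE_HZ = 100e3          # 100 kHz
HLT_OUTPUT_RATE_HZ = 100.0         # O(100) Hz
STORAGE_BANDWIDTH_MB_PER_S = 100.0 # O(100) MB/s

l1_rejection = BUNCH_CROSSING_RATE_HZ / L1_OUTPUT_RATE_HZ
hlt_rejection = L1_OUTPUT_RATE_HZ / HLT_OUTPUT_RATE_HZ
total_rejection = l1_rejection * hlt_rejection

# Implied average size of a stored event and the total interaction rate.
event_size_mb = STORAGE_BANDWIDTH_MB_PER_S / HLT_OUTPUT_RATE_HZ
interaction_rate_hz = BUNCH_CROSSING_RATE_HZ * INTERACTIONS_PER_CROSSING

print(l1_rejection, hlt_rejection, total_rejection)
print(event_size_mb, interaction_rate_hz)
```

Under these assumptions the Level-1 trigger keeps roughly one bunch crossing in 400 and the HLT roughly one Level-1 accept in 1000.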

4.1.1 Level 1 trigger
The Level 1 trigger is explained in detail in [B+00]: The CMS L1 trigger is based on the identification of muons, electrons, photons, jets, and missing transverse energy. The trigger must have a sufficiently high and understood efficiency at a sufficiently low threshold to ensure a high yield of events in the final CMS physics plots, to provide enough statistics, and enough efficiency for these events so that the correction for this efficiency does not add appreciably to the systematic error of the measurement.
Given the high event rate at the nominal LHC luminosity, only a limited portion of the detector information from the calorimeters and the muon chambers is used by the L1T system to perform the first event selection, while the full granularity data are stored in the detector front-end electronics modules, waiting for the L1T decision. The overall latency to deliver the trigger signal (L1A) is set by the depth of the front-end pipelines and corresponds to 128 bunch crossings. The L1T processing elements compute the physics candidates (muons, jets, e/γ, etc.) based on which the final decision is taken.
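
The latency of 128 bunch crossings can be converted into a time using the 40 MHz bunch-crossing rate quoted in section 4.1. This is a small worked example; the variable names are hypothetical.

```python
BUNCH_CROSSING_RATE_HZ = 40e6   # from section 4.1
PIPELINE_DEPTH_CROSSINGS = 128  # depth of the front-end pipelines

bunch_spacing_s = 1.0 / BUNCH_CROSSING_RATE_HZ           # time between crossings
l1_latency_s = PIPELINE_DEPTH_CROSSINGS * bunch_spacing_s

print(f"bunch spacing: {bunch_spacing_s * 1e9:.1f} ns")
print(f"L1 latency:    {l1_latency_s * 1e6:.2f} microseconds")
```

The Level-1 decision therefore has to arrive within a few microseconds, which is why it runs on custom electronics with reduced detector information.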
Only the jet trigger is relevant for this analysis. The definition of the level-1 jet trigger can be found in [VPB07]. Level-1 jets are defined using the transverse energy sums in 12x12 calorimeter trigger tower windows. A calorimeter trigger tower is defined as an array of 5x5 crystals in the ECAL of dimensions 0.087x0.087 (Δη x Δφ), which corresponds 1:1 to the physics tower size of the HCAL. The algorithm uses a sliding-window technique that steps in units of 4x4 trigger towers, called trigger regions, to give complete (η, φ) coverage of the calorimeter. The four highest jets in the central and forward calorimeters, as well as four central τ jets, are selected. Also selected are single, double, triple and quad-jet triggers with varying thresholds and prescale factors.
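
The sliding-window sums described above can be sketched as follows. This is a toy illustration, not the firmware algorithm: the tower grid is a plain nested list, the periodicity in φ and any jet-finding condition beyond the window sum are ignored, and the function names are hypothetical.

```python
def region_sums(towers, step=4):
    """Sum tower E_T values into step x step trigger regions."""
    n_eta, n_phi = len(towers), len(towers[0])
    return [
        [
            sum(
                towers[e][p]
                for e in range(r * step, (r + 1) * step)
                for p in range(c * step, (c + 1) * step)
            )
            for c in range(n_phi // step)
        ]
        for r in range(n_eta // step)
    ]


def window_sums(regions):
    """E_T in every 3x3-region (12x12-tower) window, keyed by its central region.

    Moving the centre by one region corresponds to a step of 4x4 towers.
    """
    sums = {}
    for r in range(1, len(regions) - 1):
        for c in range(1, len(regions[0]) - 1):
            sums[(r, c)] = sum(
                regions[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            )
    return sums


# Toy input: a 12x12 grid of towers with 1 GeV each gives a single window.
towers = [[1.0] * 12 for _ in range(12)]
print(window_sums(region_sums(towers)))
```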

4.1.2 High level trigger
As described in the CMS Technical Design Reports on the DAQ/HLT and on the Physics Performance of the experiment [CMS06], the HLT selection is implemented as a sequence of reconstruction and selection steps of increasing complexity, reconstruction refinement and physics sophistication. The fully programmable nature of the processors in the Event Filter Farm enables the implementation of very complex algorithms utilizing any and all information in the event.
At HLT, jets are reconstructed using an iterative cone algorithm with cone size R = 0.5. The algorithm is identical to the one used in the offline analysis. The inputs to the jet algorithm are calorimeter towers, which are constructed from one or more projected HCAL cells and corresponding projected ECAL crystals, and satisfy certain threshold requirements. For inclusion in the jet finding algorithm, the calorimeter towers must have pT > 0.5 GeV and at least one tower must satisfy the jet seed requirement of pT > 1 GeV. After jet finding, a correction for the calorimeter response is applied to the reconstructed jets. This correction was obtained using QCD di-jet events generated by PYTHIA and run through the full CMS detector simulation in CMSSW.
More details on the high level trigger, as well as this short summary, can be found in [VPB07].

4.2 Luminosity measurement
For any cross section analysis the measurement of the luminosity is important. The online and offline methods for measuring the CMS luminosity are summarized in [CMS10g]. The following is extracted from there:

Online methods: The CMS online luminosity measurement employs signals from the forward hadronic calorimeter (HF), which covers the pseudorapidity range 3 < |η| < 5. Two methods for extracting a real-time relative instantaneous luminosity with the HF have been implemented in firmware. The first is based on zero counting, in which the average fraction of empty towers is used to infer the mean number of interactions per bunch crossing. The second method exploits the linear relationship between the average transverse energy per tower and the luminosity. Although all HF towers are outfitted with luminosity firmware, the best linearity is obtained by limiting the coverage to four azimuthal (2π) rings in the range 3.5 < |η| < 4.2. The principal reason for restricting the η range is to avoid non-linearities introduced by averaging the tower occupancy over a range of η rings with very different probabilities for having an occupied tower in a single interaction event. In this case, the average fraction of empty towers becomes a sum over exponentials and is no longer linear with the number of interactions per bunch crossing. The digital outputs of the circuits used to read the signals from the HF PMTs are monitored in a non-invasive way and used to collect channel-occupancy and ET-sum data in histograms that have one bin for each of the 3564 possible bunch crossings.
Both methods can operate up to the full luminosity of the LHC (10³⁴ cm⁻² s⁻¹). At very low luminosities (10²⁵ cm⁻² s⁻¹ and below) the algorithms just described are subject to small noise backgrounds, but were demonstrated to function well for the luminosities delivered by the LHC during the initial stages of the 2010 run. Since the tower occupancy method offers somewhat better performance at the relatively low luminosities delivered by the LHC thus far, it has been adopted as the default method. Results referred to as HF online are based on tower occupancy unless stated otherwise.
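
The zero-counting idea, and the non-linearity that arises when towers with very different occupancy probabilities are averaged, can be illustrated with a toy model. The sketch assumes that the number of interactions per crossing is Poisson distributed, so that a tower hit with probability p per interaction is empty with probability exp(−μ·p). The function names and the probabilities used are hypothetical and chosen only for illustration.

```python
import math


def empty_fraction(mu, hit_probs):
    """Average fraction of empty towers for a mean of mu interactions.

    Each tower is empty with probability exp(-mu * p), p being its
    single-interaction hit probability (Poisson assumption).
    """
    return sum(math.exp(-mu * p) for p in hit_probs) / len(hit_probs)


def zero_counting_estimate(f_empty, p_ref):
    """Invert f_empty = exp(-mu * p_ref) to estimate mu."""
    return -math.log(f_empty) / p_ref


# Towers with one common hit probability: the estimate is exact.
uniform = [0.1, 0.1, 0.1, 0.1]
print(zero_counting_estimate(empty_fraction(2.0, uniform), 0.1))

# Towers with very different hit probabilities: the averaged empty fraction
# is a sum of exponentials and the estimate is biased low.
mixed = [0.05, 0.5]
p_mean = sum(mixed) / len(mixed)
print(zero_counting_estimate(empty_fraction(4.0, mixed), p_mean))
```

In this toy model the second estimate falls well below the true value of 4, which is the kind of non-linearity the restriction to a narrow η range is meant to avoid.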

Offline methods: As a cross check on the HF-based online luminosity monitor, two offline algorithms were developed for luminosity monitoring. One of these methods is based on energy depositions in the HF, while the other makes use of tracking and vertex finding. The offline methods have the drawback of long latency (typically 24 hours elapse before the offline information from a given run is available), but allow for better background rejection than the online methods. Most importantly, the offline techniques employ a largely independent data-handling path, and in the case of the vertex-counting method, involve a completely separate set of systematic uncertainties. They thus complement the online method nicely.
The offline HF method is based on the coincidence of ET depositions of at least 1 GeV in the forward and backward HF arrays (the sum in each HF runs over all towers). Timing cuts, where |tHF| < 8 ns for both HF+ and HF-, are imposed to eliminate non-collision backgrounds.
A second offline method requires that at least one vertex with at least two tracks be found in the event. The z-position of the vertex is required to lie within 150 mm of the center of the interaction region. This method provides good efficiency for minimum bias (MB) events, while suppressing non-collision backgrounds to the few per mille level.
An overview of the increasing luminosity is shown in figure 4.1. For our analysis we decided to ignore the very first data recorded in the commissioning era. This is justifiable because of the small luminosities during these runs (runs 132440-135802). The commissioning era ends around the end of May. The integrated luminosity is plotted on the right of the figure.

Figure 4.1: Peak luminosity per day for the first year of data taking with the CMS experiment. The period from the first collisions until June is called the commissioning era, the period after it the Run2010 era. On the right the integrated luminosity is shown.2

2 taken from https://twiki.cern.ch/twiki/bin/view/CMSPublic/LumiPublicResults

4.3 Event reconstruction and object identification
This section addresses the reconstruction of physics objects appearing during the proton collisions provided by the LHC. As already explained, the measurements of all detector components are filtered by the trigger system (see 4.1) and stored on tape. The data is saved in the so-called RAW data format, in which the information of all detector components is available.
The RAW dataset is the basis for almost all physics analyses at CMS, but most of them do not use it before the reconstruction of physics objects. The RECO data format is created promptly after data taking on the grid infrastructure. The grid is a global network of high-performance computing centers, located all over the world, which processes the data streams from the CMS experiment as well as those of the other LHC experiments. The processing is done in multiple steps. Finally, the reconstructed data is stored in a distributed way all over the world, but remains available to all members of the collaboration via the grid.
Using the CMS software, every collaboration member is able to select the physics objects of interest for their analysis. In the majority of cases the analyst produces relatively small files with a flat structure for their studies.
In my case this structure is related to jet objects. A jet is a particular structure in the detector, which was generated by highly energetic quarks or gluons created in the QCD process. Each jet is further linked to other physics objects like tracks, electrons, muons and vertices. The whole constellation of the different objects is used to analyse the inclusive b cross section.
This section is a summary of the important publications which studied and explained the reconstruction of the physics objects needed for my analysis. The information is mostly extracted from the papers related to the commissioning of the CMS experiment and its components.

4.3.1 Track reconstruction
The default track reconstruction at CMS is performed by the combinatorial track finder (CTF). Starting from the reconstructed hits, the track reconstruction is decomposed into four logical parts [AMST06]:

• Seed generation
• Pattern recognition, or trajectory building
• Ambiguity resolution
• Final track fit

Triplets of hits in the tracker, or pairs of hits with an additional constraint from the beam spot or a vertex, are used as initial estimates, or seeds, of tracks [CKKT06]. The seeds are then propagated outward in a search for compatible hits. As hits are found, they are added to the seed trajectory and the track parameters and uncertainties are updated. This search continues until either the limit of the tracker is reached or no more compatible hits can be found, yielding the collection of hits that belong to the track. In the final step, this collection of hits is fit to obtain the best estimate of the track parameters.
The CTF performs multiple iterations. Between each iteration, hits that can be unambiguously assigned to tracks in the previous iteration are removed from the collection of tracker hits to create a smaller collection that can be used in the subsequent iteration. At the end of each iteration, the reconstructed tracks are filtered to remove tracks that are likely fake and to flag the expected purity of the tracks. More details can be found in [CMS10m].

4.3.2 Primary vertex reconstruction
To give a complete overview of the CMSSW components needed for this thesis, the main parts of the primary vertex reconstruction are extracted from [CMS10l]:
In the primary vertex reconstruction, the measurements of the location and uncertainty of an interaction vertex are computed from a given set of reconstructed tracks. The prompt tracks originating from the primary interaction region are selected based on the transverse impact parameter significance with respect to the beam line, the number of strip and pixel hits, and the normalized track χ².
The beam line represents the three-dimensional profile of the luminous region where the LHC beams collide at CMS. The beam line is determined in an average over many events, in contrast to the event-by-event primary vertex which gives the precise position of a single collision. A good measurement of the position and slope of the beam line is an important component of the event reconstruction.
The selected tracks are then clustered based on their z coordinates at the point of closest approach to the beam line. Vertex candidates are formed by grouping tracks that are separated in z by less than a distance z_sep = 1 cm from their nearest neighbor. Candidates containing at least two tracks are then fit with an adaptive vertex fit to compute the best estimate of vertex parameters such as position and covariance matrix, as well as the indicators of the success of the fit, such as the number of degrees of freedom of the vertex and track weights of the tracks in the vertex. The adaptive vertex fitter does not reject an outlying track; rather it down-weights the outliers with a weight w_i. The weight w_i depends on the compatibility of track i with the vertex, as measured by χ² [FWV07]. For a track consistent with the common vertex, its weight is close to 1. The number of degrees of freedom is defined as $n_{\mathrm{dof}} = 2 \sum_{i=1}^{n_{\mathrm{Tracks}}} w_i - 3$, where w_i is the weight of the i-th track. It is thus strongly correlated to the number of tracks compatible with the primary interaction region. For this reason, the number of degrees of freedom of the vertex can be used to select real proton-proton interactions. The primary vertex resolution depends strongly on the number of tracks used in fitting the vertex and the pT of those tracks.
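
The gap clustering in z and the definition of the number of degrees of freedom can be sketched in a few lines. This is a simplified illustration and not the CMSSW implementation: tracks are represented only by their z coordinate at the point of closest approach, the adaptive fit itself is not performed, and the function names are hypothetical.

```python
def cluster_tracks_in_z(z_values_cm, z_sep_cm=1.0):
    """Group tracks separated in z by less than z_sep_cm from their neighbour.

    Returns only candidates with at least two tracks, as required for the fit.
    """
    clusters = []
    for z in sorted(z_values_cm):
        if clusters and z - clusters[-1][-1] < z_sep_cm:
            clusters[-1].append(z)
        else:
            clusters.append([z])
    return [cluster for cluster in clusters if len(cluster) >= 2]


def vertex_ndof(track_weights):
    """n_dof = 2 * sum(w_i) - 3 for the weights assigned by the adaptive fit."""
    return 2.0 * sum(track_weights) - 3.0


print(cluster_tracks_in_z([0.0, 0.3, 0.9, 5.0, 5.2, 9.0]))
print(vertex_ndof([1.0, 1.0, 1.0]))   # three fully compatible tracks
print(vertex_ndof([1.0, 1.0, 0.1]))   # one strongly down-weighted outlier
```

The last two lines show how a down-weighted outlier lowers n_dof, which is why n_dof tracks the number of compatible tracks.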
Figure 4.2 shows, as an example, the distribution of the reconstructed primary vertices from a single run. The plots show the result in one and two dimensions.

Figure 4.2: Plots of the primary vertex distributions from a single run (CMS preliminary 2009, √s = 900 GeV). One-dimensional panels: events per bin versus primary vertex x, y and z (cm), with quoted widths of 0.22 mm, 0.25 mm and 39 mm. Two-dimensional panels: pairs of the primary vertex x, y and z coordinates (cm).3

3 taken from [CMS10m]

4.3.3 Secondary vertex reconstruction
The secondary vertex reconstruction is extracted from [MPQW06]:
Decay vertices which result from long-lived particles are called secondary vertices. Most vertex finders are sensitive to primary (PV) and secondary vertices (SV), so a vertex filter is needed to select only the secondary vertex candidates. The discrimination is based on the distance of a vertex to the beam line or to an already reconstructed primary vertex.
The trimmed Kalman vertex finder [SPF+06] searches for vertex candidates among the input set of tracks in an iterative way. During the first iteration, a trimmed Kalman vertex fitter is applied to the complete input set of tracks, yielding as outputs a vertex candidate and a set of tracks which are incompatible with that vertex candidate. During the subsequent iterations, the same procedure is applied to the set of incompatible tracks identified at previous iterations.
The trimmed Kalman vertex finder is sensitive to primary and secondary vertices, so a vertex filter is used to select secondary vertex candidates. The vertex filter uses the following cuts on the vertices:
• The distance from the vertex to the beam line has to exceed 100 μm but must not exceed 2 cm. The lower limit should reject primary vertices, the upper limit photon conversions and nuclear interactions in the beam pipe.
• The distance from the vertex to the beam line in the transverse (r-φ) plane divided by its uncertainty has to be greater than three: $L_t / \sigma_{L_t} > 3$.
• The total invariant mass of the vertex must be smaller than 6.5 GeV/c² to discard primary vertices.
• Vertices with two tracks with opposite charge and an invariant mass compatible with the K⁰ mass (±50 MeV) are rejected.
The 100 μm cut and the 3σ cut on the transverse flight distance are most important, because they reject most of the primary vertices. The effect of the cut of 6.5 GeV/c² on the total invariant mass of the vertex is smaller.
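
The four cuts of the vertex filter can be written as a single predicate. The sketch below is an illustration under stated assumptions: the function and argument names are hypothetical, the vertex is described by a handful of precomputed quantities, and the K⁰ mass of 0.4976 GeV/c² is a standard value that is not quoted in the text.

```python
K0_MASS_GEV = 0.4976  # standard value, not quoted in the text


def passes_sv_filter(dist_cm, lt_cm, lt_err_cm, mass_gev, track_charges):
    """Apply the secondary vertex filter cuts listed above.

    dist_cm       distance from the vertex to the beam line
    lt_cm         transverse flight distance and
    lt_err_cm     its uncertainty
    mass_gev      total invariant mass of the vertex
    track_charges charges of the tracks attached to the vertex
    """
    if not (0.01 < dist_cm <= 2.0):          # 100 micrometres < d <= 2 cm
        return False
    if lt_err_cm <= 0.0 or lt_cm / lt_err_cm <= 3.0:
        return False
    if mass_gev >= 6.5:
        return False
    k0_like = (
        len(track_charges) == 2
        and track_charges[0] * track_charges[1] < 0
        and abs(mass_gev - K0_MASS_GEV) < 0.050
    )
    return not k0_like


print(passes_sv_filter(0.2, 0.2, 0.02, 2.5, [1, -1, 1]))  # b-like vertex
print(passes_sv_filter(0.2, 0.2, 0.02, 0.50, [1, -1]))    # K0-like vertex
```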
In most cases b-hadrons produce a tertiary vertex, because the decay chain proceeds via charm production (the b-c decay chain). The lifetime and the number of tracks from the decay vertex are smaller for weakly decaying c-hadrons than for weakly decaying b-hadrons. For this reason the secondary and the tertiary vertices are merged into one vertex in most cases. If tracks coming from a tertiary vertex are also used to fit the secondary vertex, the measured flight distance is shifted to a higher value. Another effect that degrades the secondary vertex resolution is the misassociation of tracks from the primary vertex or from the underlying event.

4.3.4 Electron reconstruction
The electron reconstruction of the CMSSW is described in [CMS10d]. The following summary is extracted from there:
Electron reconstruction uses two complementary algorithms at the track seeding stage: tracker driven seeding, which is more suitable for low pT electrons and performs better for electrons inside jets, and ECAL driven seeding.
The ECAL driven algorithm starts by the reconstruction of ECAL superclusters of transverse energy ET > 4 GeV and is optimized for isolated electrons in the pT range relevant for Z or W decays and down to pT > 5 GeV/c. A supercluster is a group of one or more associated clusters of energy deposits in the ECAL, constructed using an algorithm which takes into account their characteristic narrow width in the η coordinate and their characteristic spread in φ due to the bending in the magnetic field of electrons radiating in the tracker material. As a first filtering step, superclusters are matched to track seeds (pairs or triplets of hits) in the inner tracker layers, and electron tracks are built from these track seeds. Trajectories are reconstructed using a dedicated modeling of the electron energy loss and fitted with a Gaussian Sum Filter (GSF).
The filtering performed at the seeding step is complemented by a preselection. For candidates found only by the tracker driven seeding algorithm, the preselection is based on a multivariate analysis as described in [CMS10b]. For candidates found by the ECAL driven seeding algorithm, the preselection is based on the matching between the GSF track and the supercluster in η and φ [BCF+07]. The few ECAL driven electron candidates (1% for isolated electrons) not accepted by these matching cuts but passing the multivariate preselection are also kept.

4.3.5 Muon reconstruction
In the standard CMS reconstruction for pp collisions, tracks are first reconstructed independently in the silicon tracker (tracker track) and in the muon spectrometer (standalone-muon track). Based on these, two reconstruction approaches are used:
• Global Muon reconstruction (outside-in): starting from a standalone muon in the muon system, a matching tracker track is found and a global-muon track is fitted combining hits from the tracker track and standalone-muon track. At large transverse momenta (pT ≳ 200 GeV/c), the global-muon fit can improve the momentum resolution compared to the tracker-only fit.
• Tracker Muon reconstruction (inside-out): in this approach, all tracker tracks with pT > 0.5 GeV/c and p > 2.5 GeV/c are considered as possible muon candidates and are extrapolated to the muon system, taking into account the expected energy loss and the uncertainty due to multiple scattering. If at least one muon segment (i.e. a short track stub made of DT or CSC hits) matches the extrapolated track in position, the corresponding tracker track qualifies as a tracker-muon track.
At low momentum (roughly p < 5 GeV/c) this approach is more efficient than the global muon reconstruction, since it requires only a single muon segment in the muon system, while global muon reconstruction typically becomes efficient with two or more segments. The majority of muons from collisions (with sufficient momentum) are reconstructed either as a Global Muon or a Tracker Muon, or very often as both. However, if both approaches fail and only a standalone-muon track is found, this leads to a third category of muon candidates:
• Standalone-muon track only: this occurs only for about 1% of muons from collisions, thanks to the high tracker-track efficiency. On the other hand, the acceptance of this type of muon track for cosmic-ray muons is a factor 10² to 10³ larger, thus leading to a collision muon to cosmic-ray muon ratio that is a factor 10⁴ to 10⁵ less favorable than for the previous two muon categories.
The results of these three algorithms are merged into a single collection of muon candidates, each one containing information from the standalone, tracker, and global fit, when available. Candidates found both by the Tracker Muon and the Global Muon approach that share the same tracker track are merged into a single candidate. Similarly, standalone-muon tracks not included in a Global Muon are merged with a Tracker Muon if they share a muon segment. Additional muon identification information is stored for each candidate. The combination of different algorithms provides a robust and efficient muon reconstruction. A given physics analysis can achieve the desired balance between identification efficiency and purity by applying a selection based on the muon identification variables. Several standard selections are provided.
The basic selection important for this analysis is the Soft Muon Selection: This selection requires the candidate to be a Tracker Muon, with the additional requirement that a matching segment be found in the outermost station where a segment is expected (based on muon position and momentum), matching both in position and direction with the prediction of the track extrapolation. Segments that form a better match in position with a different tracker track are not considered. These additional requirements are optimized for low pT (< 10 GeV/c) muons. This selection is presently used in B-physics analyses in CMS, in addition to Global Muons. [CMS10k]

4.3.6 Jets
The jet reconstruction is explained in [CMS10f]:
Jets are experimental signatures of quarks and gluons, which are produced in high energy processes such as the hard scattering of partons in pp collisions. Four types of jets are reconstructed at CMS, which differently combine individual contributions from subdetectors to form the inputs to the jet clustering algorithm: calorimeter jets, Jet-Plus-Track (JPT) jets, Particle-Flow (PFlow or PF) jets, and track jets.
In this analysis only PFlow jets are used. They are reconstructed using the anti-kT [CSS08] clustering algorithm with the size parameter R = 0.5. The Particle Flow algorithm is described in [CMS10c]:
The Particle Flow algorithm combines the information from all CMS sub-detectors to identify and reconstruct all particles in the event, namely muons, electrons, photons, charged hadrons and neutral hadrons. Electrons and muons aside, the particle-flow algorithm can be roughly summarized in the following way. Tracks reconstructed in the central silicon tracker are extrapolated to the electromagnetic (ECAL) and hadron (HCAL) calorimeter. The charged hadron candidates, in particular their energies and directions, are reconstructed from these tracks. A track is linked to a calorimetric energy cluster in the ECAL and/or in the HCAL if the track extrapolation falls within the boundaries of one of the energy deposits of the cluster. Photons and neutral hadrons are reconstructed from calorimetric energy clusters: clusters separated from the extrapolated position of tracks in the calorimeters constitute a clear signature of these neutral particles; neutral particles overlapping with charged particles in the calorimeters can be detected as calorimeter energy excesses with respect to the sum of the associated track momenta.

Figure 4.3: Exemplary clustering of the anti-kt jet algorithm. Larger transverse momenta are responsible for more conical clusters.5

The particle-flow objects are then fed to the anti-kt jet clustering algorithm. In [CSS08] the anti-kt algorithm is described as follows: The functionality of the anti-kt algorithm can be understood by considering an event with a few well separated hard particles with transverse momenta kt1, kt2, ... and many soft particles. Soft particles will tend to cluster with hard ones long before they cluster among themselves. If a hard particle has no hard neighbors within a distance 2R, then it will simply accumulate all the soft particles within a circle of radius R, resulting in a perfectly conical jet. If another hard particle is present such that R < δ12 < 2R then there will be two hard jets. It is not possible for both to be perfectly conical. If kt1 ≫ kt2 then jet 1 will be conical and jet 2 will be partly conical, since it will miss the part overlapping with jet 1. Instead if kt1 = kt2 neither jet will be conical and the overlapping part will simply be divided by a straight line equally between the two. Similarly one can work out what happens with δ12 < R. Here particles 1 and 2 will cluster to form a single jet. If kt1 ≫ kt2 then it will be a conical jet centered on k1. For kt1 ∼ kt2 the shape will instead be more complex, being the union of cones (radius < R) around each hard particle plus a cone (of radius R) centered on the final jet. Figure 4.3 shows the φ/η plane with an exemplary clustering of jets by the anti-kt algorithm.
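
The behaviour described above follows from the anti-kt distance measures, d_ij = min(1/kti², 1/ktj²) · ΔR_ij² / R² between two objects and d_iB = 1/kti² between an object and the beam. The following is a naive sketch for illustration only, not the implementation used in CMSSW: it loops over all pairs at every step and, as a simplifying assumption, merges objects by a pT-weighted average of η and φ instead of adding four-vectors. All function names and the toy event are hypothetical.

```python
import math


def delta_r2(a, b):
    """Squared distance in the (eta, phi) plane, with phi wrapped to [0, pi]."""
    dphi = abs(a["phi"] - b["phi"])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (a["eta"] - b["eta"]) ** 2 + dphi ** 2


def merge(a, b):
    """pT-weighted recombination (a simplification of four-vector addition)."""
    pt = a["pt"] + b["pt"]
    dphi = b["phi"] - a["phi"]
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    elif dphi < -math.pi:
        dphi += 2.0 * math.pi
    phi = a["phi"] + b["pt"] / pt * dphi
    phi = (phi + math.pi) % (2.0 * math.pi) - math.pi  # keep phi in [-pi, pi)
    eta = (a["pt"] * a["eta"] + b["pt"] * b["eta"]) / pt
    return {"pt": pt, "eta": eta, "phi": phi}


def anti_kt(particles, radius=0.5):
    """Cluster particles (dicts with pt, eta, phi) into jets, hardest first."""
    objs = [dict(p) for p in particles]
    jets = []
    while objs:
        # Smallest beam distance d_iB = 1 / pt^2.
        best = min(range(len(objs)), key=lambda i: objs[i]["pt"] ** -2)
        d_min, pair = objs[best]["pt"] ** -2, None
        # Smallest pair distance d_ij, if it beats the beam distance.
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                d_ij = (
                    min(objs[i]["pt"] ** -2, objs[j]["pt"] ** -2)
                    * delta_r2(objs[i], objs[j])
                    / radius ** 2
                )
                if d_ij < d_min:
                    d_min, pair = d_ij, (i, j)
        if pair is None:
            jets.append(objs.pop(best))      # promote the object to a jet
        else:
            i, j = pair
            merged = merge(objs[i], objs[j])
            objs.pop(j)                      # j > i, so remove j first
            objs.pop(i)
            objs.append(merged)
    return sorted(jets, key=lambda jet: -jet["pt"])


# Toy event: two well separated hard particles, each with nearby soft ones.
event = [
    {"pt": 100.0, "eta": 0.0, "phi": 0.0},
    {"pt": 1.0, "eta": 0.3, "phi": 0.0},
    {"pt": 1.0, "eta": 0.0, "phi": 0.4},
    {"pt": 50.0, "eta": 2.0, "phi": 2.0},
    {"pt": 1.0, "eta": 2.2, "phi": 2.0},
]
for jet in anti_kt(event):
    print(jet)
```

In the toy event the soft particles are absorbed by the nearest hard particle before they can cluster with each other, because their mutual distance is weighted by their own small transverse momenta.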
CMS has developed jet quality criteria (JetID) for calorimeter jets and PFlow jets which are found to retain the vast majority of real jets in the simulation while rejecting most fake jets arising from calorimeter and/or readout electronics noise. These are studied in pure noise non-collision data samples such as cosmic trigger data or data from triggers on empty bunches during LHC operation. The PFlow jets are required to have a charged hadron fraction CHF > 0.0 if within the tracking fiducial region of |η| < 2.4, a neutral hadron fraction NHF < 1.0, a charged electromagnetic (electron) fraction CEF < 1.0, and a neutral electromagnetic (photon) fraction NEF < 1.0. These requirements remove fake jets arising from spurious energy depositions in a single sub-detector. In the studies presented, jets are required to pass the JetID criteria.
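
The PFlow JetID requirements listed above translate directly into a small selection function. The function and argument names are hypothetical; only the thresholds are taken from the text.

```python
def passes_pf_jet_id(eta, chf, nhf, cef, nef):
    """Apply the PFlow JetID criteria quoted above.

    chf, nhf, cef, nef are the charged hadron, neutral hadron, charged
    electromagnetic and neutral electromagnetic energy fractions of the jet.
    """
    # The charged hadron requirement only applies inside the tracker acceptance.
    if abs(eta) < 2.4 and not chf > 0.0:
        return False
    return nhf < 1.0 and cef < 1.0 and nef < 1.0


print(passes_pf_jet_id(eta=0.5, chf=0.6, nhf=0.1, cef=0.0, nef=0.3))  # typical jet
print(passes_pf_jet_id(eta=0.5, chf=0.0, nhf=0.0, cef=0.0, nef=1.0))  # photon-only deposit
```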

5 taken from [CSS08]

Figure 4.4: Jet energy corrections applied to the PF jets (CMS Simulation, anti-kT R = 0.5). The multi-step procedure for MC-truth jet energy corrections applies absolute (left) and relative (right) corrections.7

4.3.7 Jet energy corrections
Jet energy measured in the detector is typically different from the corresponding particle jet energy. The latter is obtained in the simulation by clustering, with the same jet algorithm, the stable particles produced during the hadronization process that follows the hard interaction. The main cause for this energy mismatch is the non-uniform and non-linear response of the CMS calorimeters. Furthermore, electronics noise and additional pp interactions in the same bunch crossing (event pile-up) can lead to extra unwanted energy. The purpose of the jet energy correction is to relate, on average, the energy measured in the detector to the energy of the corresponding particle jet. The information on the jet energy correction is extracted from [CMS08a] and [CMS10f]:
CMS has developed a factorized multi-step procedure for the jet energy calibration (JEC). The following three subsequent (sub-)corrections are devised to correct calorimeter, PFlow and JPT jets to the corresponding particle jet level: offset, relative and absolute corrections. The offset correction aims to correct the jet energy for the excess unwanted energy due to electronics noise and pile-up. The relative correction removes variations in jet response versus jet η relative to a central control region chosen as a reference because of the uniformity of the detector. The absolute correction removes variations in jet response versus jet pT. CMS pursues two complementary approaches to determine the jet energy correction factors: utilizing MC truth information (MC truth JEC), and using physics processes from pp collisions for in-situ jet calibration. At the current initial stage of LHC running, MC truth JEC is used to correct jets in both data and MC simulation. In figure 4.4 the two correction steps for the MC-truth jet energy corrections are shown. The offset corrections are not factorized out.
Current physics analyses in CMS use 5% JEC uncertainties for PFlow jets, with an additional 2% uncertainty per unit rapidity.
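
The factorized structure of the correction and the quoted uncertainty can be sketched as follows. Several points are assumptions made for illustration and are not stated in the text: the correction factors are passed in as placeholder callables rather than real calibration tables, the absolute correction is evaluated at the pT obtained after the relative step, and the 5% and 2% terms are added linearly. All names are hypothetical.

```python
def corrected_pt(raw_pt, eta, relative_corr, absolute_corr):
    """Apply the relative (eta-dependent) and absolute (pT-dependent) steps.

    relative_corr and absolute_corr are callables returning multiplicative
    factors; here they stand in for the real calibration tables.
    """
    pt_after_relative = raw_pt * relative_corr(eta)
    return pt_after_relative * absolute_corr(pt_after_relative)


def jec_uncertainty(rapidity, base=0.05, per_unit_rapidity=0.02):
    """Relative JEC uncertainty: 5% plus 2% per unit rapidity.

    The linear sum is an assumption; the text does not say how the two
    terms are combined.
    """
    return base + per_unit_rapidity * abs(rapidity)


# Placeholder factors, chosen only to make the example concrete.
print(corrected_pt(30.0, 1.0, lambda eta: 1.1, lambda pt: 1.2))
print(jec_uncertainty(2.0))
```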

4.3.8 Jet flavor definition
There is no unambiguous answer to the correct underlying flavor of a reconstructed jet. Three definitions are used, reflecting three different points of view:

7 taken from [CMS10f]

Physics definition: Reconstructed jets are matched to initial partons from the primary physics process. They must be within the reconstructed jet cone with ΔR < 0.3. For example, for tt̄ events, the initial partons would be two b-jets from the decays of the top quarks, two non-b-jets per hadronic W decay, and no initial gluon jets. There is no matching if hard (FS) radiation occurred and the parton direction changes significantly. No flavor is assigned if no unambiguous answer is possible when more than one initial parton is matched. Gluon jets splitting to c- or b-quarks are labeled as gluon jets.

Algorithmic definition: The parton that most likely determines the properties of the jet defines the true flavor of the jet. The final state partons, after showering and radiation, are analyzed. The partons must be within ΔR < 0.3 of the reconstructed jet cone. Jets from radiation are matched with full efficiency. If there is a b-quark or a c-quark within the jet cone, it is labeled accordingly, otherwise the jet is assigned the flavor of the hardest parton.
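
The algorithmic definition can be expressed as a short matching routine. This is a sketch under stated assumptions rather than the CMSSW tool: jets and partons are plain dictionaries, flavors are given as PDG particle codes (5 for b, 4 for c, 21 for gluon), and the function names are hypothetical.

```python
import math


def delta_r(a, b):
    """Distance in the (eta, phi) plane with phi wrapped to [0, pi]."""
    dphi = abs(a["phi"] - b["phi"])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a["eta"] - b["eta"], dphi)


def algorithmic_flavor(jet, partons, max_dr=0.3):
    """Return |PDG id| of the jet flavor, or None if no parton is matched.

    A b quark in the cone wins, then a c quark; otherwise the flavor of
    the hardest matched parton is used.
    """
    matched = [p for p in partons if delta_r(jet, p) < max_dr]
    if not matched:
        return None
    flavors = {abs(p["pdg_id"]) for p in matched}
    if 5 in flavors:
        return 5
    if 4 in flavors:
        return 4
    return abs(max(matched, key=lambda p: p["pt"])["pdg_id"])


jet = {"eta": 0.0, "phi": 0.0}
partons = [
    {"pdg_id": 21, "pt": 40.0, "eta": 0.05, "phi": 0.0},   # hard gluon
    {"pdg_id": -5, "pt": 8.0, "eta": 0.10, "phi": 0.1},    # softer b quark
]
print(algorithmic_flavor(jet, partons))
```

The example also illustrates the remark made for gluon splitting: a soft b quark inside the cone labels the jet as a b-jet even though the hardest parton is a gluon.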

Energetic definition: This definition applies to generated jets (GenJets), where the constituents of a jet are a set of generator objects (GenParticleCandidate). A variable is built for each jet by computing the fraction of the energy of the jet which comes from b or c hadrons. These quantities can be used to attribute a flavor to the GenJet. A matched reconstructed jet can get the same flavor as the matched GenJet.
The definitions differ mainly for jets from gluon splitting. Only the physics and energetic definitions see gluon splitting. The algorithmic definition is blind to it. Furthermore, the algorithmic definition causes some contamination of b and c jets from gluon splitting. All three definitions can be applied to GenJets. Only the first two (physics and algorithmic definitions) can be applied to particle flow jets.8
For b-jet tagging the algorithmic definition is used.

4.3.9 b-jet tagging
The CMS software already contains various b-jet tagging algorithms for different purposes. The summarized description of the different taggers is taken from [CMS09a]. Each tagger produces an output value for each jet. The output of any algorithm is the so-called discriminator, defined as a single number which the user can cut on to select different regions in the efficiency versus purity phase space. The discriminator can be a simple physics quantity like the IP significance for some taggers, or a complex variable like the output of a likelihood ratio or a neural network.

Track counting (TCHE, TCHP): The simplest way of producing a discriminator based on track impact parameters is an extension of the so-called track counting algorithm. The track counting approach identifies a jet as a b-jet if there are at least N tracks each with a significance of the impact parameter exceeding S. This algorithm has two major parameters (N and S). The way of producing a continuous discriminator for this algorithm is to fix the value of N, and consider as discriminating variable the impact parameter significance of the Nth track (ordered in decreasing significance). If one is interested in a high efficiency for b-jets, the second track can be used; for higher purity selections the third track is a better choice. The discriminators obtained in this way are plotted for QCD events in Figure 4.5, and are simply the IP significance shapes for the chosen track.

8 taken from https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideBTagMCTools
9 taken from [CMS10a]
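
The track counting discriminator described above amounts to sorting the impact parameter significances and reading off the Nth value. The sketch below uses a hypothetical function name and toy numbers.

```python
def track_counting_discriminator(ip_significances, n):
    """Return the IP significance of the n-th track in decreasing order.

    n = 2 corresponds to the high-efficiency choice, n = 3 to the
    high-purity choice. Returns None if the jet has fewer than n tracks.
    """
    ordered = sorted(ip_significances, reverse=True)
    return ordered[n - 1] if len(ordered) >= n else None


significances = [1.0, 7.5, 3.2, -0.4]  # toy values for one jet
print(track_counting_discriminator(significances, 2))  # high efficiency
print(track_counting_discriminator(significances, 3))  # high purity
```

Cutting on this value at S is equivalent to requiring at least N tracks with significance above S, which is the original track counting criterion.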

Figure 4.5: Discriminator of the track counting b-jet tagger (TCHP and TCHE discriminators; CMS Preliminary 2010, √s = 7 TeV, L = 15 nb⁻¹; data compared with simulated light, charm and bottom jets, with the data/simulation ratio below).9

Figure 4.6: Discriminator of the jet probability b-jet tagger (JetBProb and JetProb discriminators; CMS Preliminary 2010, √s = 7 TeV, L = 15 nb⁻¹; data compared with simulated light, charm and bottom jets, with the data/simulation ratio below).10

Jet probability (JPT, JBPT): The jet probability algorithms are a natural extension of the track counting algorithms [CMS09a]. The idea is to combine the information coming from all selected tracks. For each track, the probability to come from the primary vertex is computed, and these probabilities are combined to provide the jet probability. The track probability distribution is calibrated by means of the distribution of track impact parameters with negative signs. The negative part of the impact parameter distribution is used for this purpose because it is mainly made up of primary vertex tracks. The advantage of this method with respect to track counting is that a single discriminator is used (i.e. there is no need to choose which track to use) and that information from all tracks is used at the same time [RPS06]. Two discriminators are provided. The first, labeled jet probability, is strictly related to the combined probability that all the tracks in the jet come from the primary vertex. The second, labeled jet B probability, estimates how likely it is that the four most displaced tracks are compatible with the primary vertex; the selection comes from the fact that the average charged track multiplicity in weak b hadron decay is 5, and from the average track reconstruction efficiency, around 80% for tracks in jets. The shapes of the discriminant variables are presented in Figure 4.6.
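The text does not spell out how the track probabilities are combined. A minimal sketch, assuming the commonly used combination P_jet = Π · Σ_{j=0}^{N-1} (−ln Π)^j / j! with Π the product of the N track probabilities (the formula and the function name are assumptions for illustration, not taken from the CMS software):

```python
import math

def jet_probability(track_probs):
    """Combine per-track primary-vertex probabilities into one jet probability.

    Assumed formula: P_jet = Pi * sum_{j=0}^{N-1} (-ln Pi)^j / j!,
    where Pi is the product of the N track probabilities (each in (0, 1]).
    """
    if not track_probs:
        raise ValueError("need at least one track probability")
    # Work with the logarithm of the product for numerical stability.
    log_pi = sum(math.log(p) for p in track_probs)
    n = len(track_probs)
    series = sum((-log_pi) ** j / math.factorial(j) for j in range(n))
    return math.exp(log_pi) * series

# A jet with more displaced tracks (small probabilities) gets a smaller value.
prompt_like = jet_probability([0.6, 0.8, 0.5])
displaced = jet_probability([0.01, 0.02, 0.5])
```

With this choice a jet made only of primary-vertex tracks yields a uniformly distributed value, so small values point to displaced tracks.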

Figure 4.7: Input variables for the soft muon b-jet tagger¹¹. Panels: muon pT,rel [GeV/c] and muon 3D IP significance (CMS Preliminary 2010, √s = 7 TeV, L = 15 nb⁻¹). Each panel shows the entries per bin for data and simulation (light, charm, bottom) with the Data/Sim ratio below.

Soft muon (SMT): The presence of a muon close to the jet is already a hint of a weak decay of a B hadron. This can be complemented with an additional quantity in order to build a discriminator. In the soft muon by pT,rel algorithm the pT of the muon with respect to the jet axis is used [CMS09a]; harder cuts yield higher purities. In the soft muon by IP significance algorithm the IP significance of the muon is used instead, but only when it is found to be positive. In all cases, when more than one muon is reconstructed, the one with the highest discriminator value is used. Figure 4.7 shows the shapes of the pT,rel and the IP significance of the soft muons, which are used to build those taggers.
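The momentum of the muon transverse to the jet axis can be written as |p_μ × a| / |a| for a jet axis vector a. A small sketch of that geometry (the function name and the plain 3-tuples are illustrative assumptions; whether the muon is included in the jet axis is not addressed here):

```python
import math

def pt_rel(muon_p, jet_axis):
    """Muon momentum component transverse to the jet axis: |p x a| / |a|."""
    px, py, pz = muon_p
    ax, ay, az = jet_axis
    # Cross product p x a
    cx = py * az - pz * ay
    cy = pz * ax - px * az
    cz = px * ay - py * ax
    axis_norm = math.sqrt(ax * ax + ay * ay + az * az)
    if axis_norm == 0.0:
        raise ValueError("jet axis must not be the null vector")
    return math.sqrt(cx * cx + cy * cy + cz * cz) / axis_norm

# Muon with 5 GeV/c perpendicular to a jet pointing along z
example = pt_rel((3.0, 4.0, 10.0), (0.0, 0.0, 1.0))
```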

Soft electron (SET): It is also possible to create a soft electron b-jet tagger. Because of the large number of pions appearing in each event, it is not possible to obtain a b-jet tagger based purely on soft electrons in the same way as in the soft muon case. At the moment there is no official soft electron b-jet tagger at CMS. Nevertheless, a NeuroBayes soft electron tagger was developed in [Mar09].

Simple secondary vertex (SSV): Secondary vertices can be used to select jets from B hadrons with high purity. A simple version, called the simple secondary vertex tagging algorithm, is based upon the reconstruction of at least one secondary vertex. If no such vertex is found, the algorithm returns no discriminator, limiting its maximum b-jet efficiency to the probability of finding a vertex in the presence of a weak B hadron decay (around 60-70%). The significance of the 3D flight distance is used as the discriminating variable for this tagger [CMS09a]. Two variants based on the minimum number of tracks attached to the vertex are considered: Ntrk ≥ 2 yields the high efficiency version (SSVHE), and Ntrk ≥ 3 yields the high purity version (SSVHP) [CMS10a]. The distribution of this discriminator is shown in Figure 4.8.

Combined secondary vertex (CSV): A more complex approach involves the use of secondary vertices together with other lifetime information, like the IP significance or decay lengths. By using these additional variables, the combined secondary vertex algorithm provides discrimination even when no secondary vertices are found, so the maximum possible b-tagging efficiency is not limited by the secondary vertex reconstruction efficiency [CMS09a]. In many cases, tracks with an
10 taken from [CMS10a]
11 taken from [CMS10a]
12 taken from [CMS10a]

Figure 4.8: Discriminator for the simple secondary vertex b-jet tagger¹². Panels: SSV High Pur discriminator and SSV High Eff discriminator (CMS Preliminary 2010, √s = 7 TeV, L = 15 nb⁻¹). Each panel shows Entries/0.14 for data and simulation (light, charm, bottom) with the Data/Sim ratio below.
IP significance > 2 can be combined in a so-called pseudo vertex, allowing for the computation of a subset of secondary vertex based quantities even without an actual vertex fit. When even this is not possible, a no vertex category reverts simply to track based variables, similarly to the jet probability algorithm. These variables are used as input to a likelihood ratio, which is used twice, to discriminate between b- and c-jets and between b- and light jets; the two results are then combined additively with factors of 0.75 and 0.25 respectively. The combined secondary vertex algorithm was not included in the commissioning of the b-jet taggers. Because of its complex structure it needs a larger amount of data before it can be put into operation.
Figure 4.9 shows the performance of the CMS b-jet taggers. The performance is calculated on the same Monte Carlo samples which are used for further comparisons in this thesis. The y-axis shows the mistag rate and the x-axis the b-jet efficiency. The ideal working point is the lower right corner. As expected, the more complex taggers, jet B probability and combined secondary vertex, perform better.
Tagging efficiency
Plans on how to measure the b-tagging efficiencies are presented in [CMS07b]. The following section is extracted from there. All the b-jet tagging algorithms rely upon the reconstruction of lower level objects like tracks, vertices, and jets, which might make it difficult for the Monte Carlo simulation to exactly reproduce the performance in data. The Tevatron collider experiments have developed methods to measure the performance of the lifetime tagging algorithms in collider data. The CMS collaboration adapted these methods to measure the b-tagging efficiency using data in which jets with muons appear. The pTrel method relies directly on a fit to the pT,rel distribution of the muon before and after tagging the muon-jet; the Counting method also relies on pT,rel fits but uses additional information derived from the data. The third method, the System8 method, consists of solving a system of eight equations constructed from the total number of events in two samples with different b-jet content, before and after tagging with two b-tagging algorithms.
pTrel method: The basic idea of the pTrel method is to measure the b-quark content of a muon+jet sample by fitting the pT,rel distribution of the muons to a linear combination of the b-quark and c/light-quark jet templates. The process is repeated after tagging the muon-jet. The b-tagging efficiency is calculated as the ratio between the number of b jets after and before tagging,

Figure 4.9: Performance of the CMS b-jet taggers (CMS simulation, √s = 7 TeV; non-b-jet efficiency versus b-jet efficiency for combinedSV, jetProb, simpleSV, simpleSVP, softMuonByIP, softMuonByPt, trackCountHEff and trackCountHPur). To compare the different taggers, they are plotted in the mistag rate/efficiency plane. The lower right corner represents a tagger for which all non-b-jets could be suppressed without losing any b-jet. The colors are chosen to distinguish the different kinds of taggers. Blue: the muon taggers (SMT); green: the simple secondary vertex taggers (SSV); violet: the track counting taggers (TC); orange: the two more complex taggers, jet B probability and combined secondary vertex.


as determined by the pT,rel fits. The pT,rel fits can be applied to the muon-jet + away-jet sample (pTrel(n) method) or to the muon-jet + tagged-away-jet sample (pTrel(p) method) [CMS07b].

Counting method: The Counting method uses a different approach to estimate the b content of the sample before tagging; it assumes that the away-jets in the n sample are dominated by light jets, and that the average probability of tagging them can be estimated from a light jet data sample with negative impact parameter with respect to the interaction point [CMS07b].
The major source of systematic uncertainty for methods that rely on the pT,rel fit is the modeling of the templates. To estimate this uncertainty, the templates were rederived using a different sample. This alternative set of templates is then used to remeasure the b-tagging efficiency, and the difference in the central value obtained is assigned as the systematic uncertainty. Typical variations in the range between 10 and 20% are observed, with the larger values corresponding to bins with lower statistics in the samples used to derive the templates. The Counting method has an additional systematic uncertainty arising from the measurement of the mistag rate, which is evaluated by varying the number of cl jets before tagging by ±5% [CMS07b].
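System8 method: The System8 method has been developed by the D0 collaboration [CDD+03]. It does not rely on pT,rel fits to extract the b-jet content of the samples; the Monte Carlo simulation is only used to evaluate correlation factors between different tagging algorithms. For the current implementation of the System8 method, two data samples are used: the muon-jet + away-jet sample, and the muon-jet + tagged-away-jet sample.
The following system of eight equations is then obtained:

\[
\begin{aligned}
n &= n_b + n_{cl} \\
p &= p_b + p_{cl} \\
n^{\mathrm{tag}} &= \varepsilon_b^{\mathrm{tag}}\, n_b + \varepsilon_{cl}^{\mathrm{tag}}\, n_{cl} \\
p^{\mathrm{tag}} &= \beta\, \varepsilon_b^{\mathrm{tag}}\, p_b + \alpha\, \varepsilon_{cl}^{\mathrm{tag}}\, p_{cl} \\
n^{\mu} &= \varepsilon_b^{\mu}\, n_b + \varepsilon_{cl}^{\mu}\, n_{cl} \\
p^{\mu} &= \varepsilon_b^{\mu}\, p_b + \varepsilon_{cl}^{\mu}\, p_{cl} \\
n^{\mathrm{tag},\mu} &= \kappa_b\, \varepsilon_b^{\mathrm{tag}} \varepsilon_b^{\mu}\, n_b + \kappa_{cl}\, \varepsilon_{cl}^{\mathrm{tag}} \varepsilon_{cl}^{\mu}\, n_{cl} \\
p^{\mathrm{tag},\mu} &= \beta\, \kappa_b\, \varepsilon_b^{\mathrm{tag}} \varepsilon_b^{\mu}\, p_b + \alpha\, \kappa_{cl}\, \varepsilon_{cl}^{\mathrm{tag}} \varepsilon_{cl}^{\mu}\, p_{cl}
\end{aligned}
\]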

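The terms on the left hand side represent the total number of muon-jets in each sample before tagging ($n$, $p$) and after tagging with a lifetime tagger ($n^{\mathrm{tag}}$, $p^{\mathrm{tag}}$), the muon pT,rel cut ($n^{\mu}$, $p^{\mu}$), and both ($n^{\mathrm{tag},\mu}$, $p^{\mathrm{tag},\mu}$). The eight unknowns on the right hand side of the equations consist of the number of b and c+light jets in the two samples ($n_b$, $n_{cl}$, $p_b$, $p_{cl}$), and the tagging efficiencies for b and c+light jets for the lifetime tag and the muon pT,rel cut ($\varepsilon_b^{\mathrm{tag}}$, $\varepsilon_b^{\mu}$, $\varepsilon_{cl}^{\mathrm{tag}}$, $\varepsilon_{cl}^{\mu}$). The method assumes that the efficiency for tagging a jet with both the lifetime tag and the muon pT,rel cut can approximately be calculated as the product of the individual efficiencies.
Four additional parameters are needed to solve the system of equations: $\kappa_b$, $\kappa_{cl}$, $\alpha$, and $\beta$. The first two parameters represent the correlation between the lifetime tag and the muon requirement for b jets ($\kappa_b$) and c+light jets ($\kappa_{cl}$), respectively. They are defined as

\[
\kappa_b = \frac{\varepsilon_b^{\mathrm{tag},\mu}}{\varepsilon_b^{\mathrm{tag}}\, \varepsilon_b^{\mu}}, \qquad
\kappa_{cl} = \frac{\varepsilon_{cl}^{\mathrm{tag},\mu}}{\varepsilon_{cl}^{\mathrm{tag}}\, \varepsilon_{cl}^{\mu}}
\]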

Figure 4.10: pT spectrum of all jets in black and of b-jets in red (left) and the fraction of b-jets as a function of the jet pT (right) for the Monte Carlo samples reconstructed in CMSSW version 3.6 (CMS simulation, √s = 7 TeV).
The parameters α and β represent the ratio between the lifetime tagging efficiencies of the two data samples, used to solve System8, for b and c/light jets [CMS07b].
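To make the structure of the eight equations concrete, the following sketch evaluates their right hand sides for a given set of unknowns and correlation factors. It is only the forward model, not a solver; the function and argument names are illustrative assumptions, and the numbers in the example are invented.

```python
def system8_counts(nb, ncl, pb, pcl,
                   eff_b_tag, eff_cl_tag, eff_b_mu, eff_cl_mu,
                   kappa_b=1.0, kappa_cl=1.0, alpha=1.0, beta=1.0):
    """Return the eight observable counts predicted by the System8 equations."""
    return {
        "n": nb + ncl,
        "p": pb + pcl,
        "n_tag": eff_b_tag * nb + eff_cl_tag * ncl,
        "p_tag": beta * eff_b_tag * pb + alpha * eff_cl_tag * pcl,
        "n_mu": eff_b_mu * nb + eff_cl_mu * ncl,
        "p_mu": eff_b_mu * pb + eff_cl_mu * pcl,
        "n_tag_mu": (kappa_b * eff_b_tag * eff_b_mu * nb
                     + kappa_cl * eff_cl_tag * eff_cl_mu * ncl),
        "p_tag_mu": (beta * kappa_b * eff_b_tag * eff_b_mu * pb
                     + alpha * kappa_cl * eff_cl_tag * eff_cl_mu * pcl),
    }

# Invented example: 100 b and 900 c+light jets in the n sample,
# 80 b and 120 c+light jets in the b-enriched p sample.
counts = system8_counts(100, 900, 80, 120,
                        eff_b_tag=0.5, eff_cl_tag=0.02,
                        eff_b_mu=0.6, eff_cl_mu=0.1)
```

In a measurement the eight counts are observed and the eight unknowns are obtained by solving the system numerically, with the κ, α and β factors fixed from simulation.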
Furthermore, there is another method to determine the tagging efficiency of light quark and gluon jets. This method uses tracks with negative values of the signed impact parameter [CMS07a].
4.4 Monte Carlo samples
In this thesis two different sets of MC samples are used. Both were generated by PYTHIA 6 with different tunes. They also differ in their Global Tags and CMSSW release.
The samples are defined in specific p̂T bins. The p̂T range of each sample can be obtained from the sample name. To get meaningful, unbiased Monte Carlo statistics the samples must be combined. Before doing this, all samples need to be normalized to the same integrated luminosity L with a weight

\[ w = \frac{\text{cross section}}{\#\,\text{events}} \]

Pythia 6 QCD DiJet
Table 4.1 contains the dataset names along with the corresponding number of events and the cross section. All values are extracted from the official MC generation page¹³. For the Summer 2010 reprocessing the CMS software version 3.6 is used. The Global Tag of the reconstruction is START36V10::All. The table shows all 20 samples, which cover the p̂T region from 0 GeV to 3500 GeV. Each sample has a specific number of events, created with the quoted cross section. The last column lists the weights which scale the distribution to an integrated luminosity L = 1/pb.
After applying these weights, the pT spectrum of all jets (black) and of b-jets can be seen in figure 4.10. On the right the expected fraction of b-jets is shown.
The smoothness of the curve indicates that the weights were applied correctly. Furthermore, to check the amount of statistics of the Monte Carlo, the number of events of each object is plotted in figure 4.11.
13 https://twiki.cern.ch/twiki/bin/viewauth/CMS/ProductionReProcessingSummer10
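The weights can be reproduced from the event counts and cross sections quoted in the sample table. A short sketch (the function name is an illustrative assumption; the explicit luminosity argument defaults to the L = 1/pb used in the text):

```python
def sample_weight(cross_section_pb, n_events, lumi_inv_pb=1.0):
    """Weight that scales one MC sample to the integrated luminosity lumi_inv_pb."""
    return cross_section_pb * lumi_inv_pb / n_events

# First three rows of the Pythia 6 QCD DiJet table: (events, cross section in pb)
rows = {
    "QCDDiJetPt0to15": (2197029, 4.844e10),
    "QCDDiJetPt15to20": (2256430, 5.794e8),
    "QCDDiJetPt20to30": (1032250, 2.361e8),
}
weights = {name: sample_weight(xsec, n) for name, (n, xsec) in rows.items()}
```

Filling every event of a sample with its weight and summing the histograms of all samples gives the combined spectrum.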

name                     # events   cross section [pb]   weight factor
QCDDiJetPt0to15          2197029    4.844·10^10          2.205·10^4
QCDDiJetPt15to20         2256430    5.794·10^8           2.568·10^2
QCDDiJetPt20to30         1032250    2.361·10^8           2.287·10^2
QCDDiJetPt30to50         1161768    5.311·10^7           4.571·10^1
QCDDiJetPt50to80         111289     6.358·10^6           5.713·10^1
QCDDiJetPt80to120        606771     7.849·10^5           1.294·10^0
QCDDiJetPt120to170       58888      1.151·10^5           1.955·10^0
QCDDiJetPt170to230       51680      2.014·10^4           3.897·10^−1
QCDDiJetPt230to300       52894      4.094·10^3           7.740·10^−2
QCDDiJetPt300to380       64265      9.346·10^2           1.454·10^−2
QCDDiJetPt380to470       52207      2.338·10^2           4.478·10^−3
QCDDiJetPt470to600       20380      7.021·10^1           3.445·10^−3
QCDDiJetPt600to800       22448      1.557·10^1           6.936·10^−4
QCDDiJetPt800to1000      26000      1.843·10^0           7.088·10^−5
QCDDiJetPt1000to1400     23956      3.318·10^−1          1.385·10^−5
QCDDiJetPt1400to1800     20575      1.086·10^−2          5.278·10^−7
QCDDiJetPt1800to2200     33070      3.499·10^−4          1.058·10^−8
QCDDiJetPt2200to2600     22580      7.549·10^−6          3.343·10^−10
QCDDiJetPt2600to3000     20644      6.465·10^−8          3.132·10^−12
QCDDiJetPt3000to3500     23460      6.295·10^−11         2.683·10^−15
Table 4.1: Monte Carlo samples: /name/Summer10-START36V9S09-v1/GEN-SIM-RECO, CMSSW36X (Global Tag: START36V10::All)

Figure 4.11: Left: Amount of statistics for the different objects (jets, tracks, SVs, muons, electrons) which are used in the analyses, as a function of the jet pT (CMS simulation, √s = 7 TeV). MC samples for different p̂T bins are available. The different samples must be weighted to get the true pT spectrum. Right: Composition of the weighted MC created from different samples with different p̂T bin ranges (Pt0to15 to Pt3000to3500).


name               # events   cross section [pb]   weight factor
QCDPt0to5          549809     4.844·10^10          8.810·10^4
QCDPt5to15         1648096    3.675·10^10          2.230·10^4
QCDPt15to30        5454640    8.159·10^8           1.496·10^2
QCDPt30to50        3264660    5.312·10^7           1.627·10^1
QCDPt50to80        3191546    6.359·10^6           1.992·10^0
QCDPt80to120       3208299    7.843·10^5           2.445·10^−1
QCDPt120to170      3045200    1.151·10^5           3.780·10^−2
QCDPt170to300      3220080    2.426·10^4           7.534·10^−3
QCDPt300to470      3171240    1.168·10^3           3.683·10^−4
QCDPt470to600      2019732    7.022·10^1           3.477·10^−5
QCDPt600to800      1979055    1.555·10^1           7.857·10^−6
QCDPt800to1000     2084404    1.844·10^0           8.847·10^−7
QCDPt1000to1400    1086966    3.321·10^−1          3.055·10^−7
QCDPt1400to1800    1021510    1.087·10^−2          1.064·10^−8
QCDPt1800          529360     3.575·10^−4          6.753·10^−10
Table 4.2: Monte Carlo samples: /nameTuneZ27TeVpythia6/Fall10-START38V12-v1/GEN-SIM-RECO, CMSSW38X (Global Tag: START38V14::All)

To avoid running over samples which have a very small influence on the overall distribution, the fraction contributed by the different samples was studied (figure 4.11, right). An analysis in specific pT bins requires only samples with a sufficient share of weighted events.
These samples are used in the recent inclusive b-jet cross section measurement, which was performed on early CMS data. In this thesis they are also used as an independent test dataset.

Pythia 6 QCD Tune Z2
The CMS collaboration also provides samples of Pythia 6 QCD Tune Z2. Tune Z2 is a renewed fit of the parameters of the Monte Carlo generator.
Table 4.2 contains the dataset names along with the corresponding number of events and the cross section. All values are extracted from the official MC generation page¹⁴. For the Fall 2010 production CMSSW version 3.8 was used. The Global Tag of the reconstruction is START38V14::All.
The 15 samples provide more statistics than the former ones. They are separated into p̂T bins up to 1800 GeV. In addition, an inclusive sample which covers the high pT region above 1800 GeV is added. Applying the weights results in a fully inclusive spectrum up to large pT values.
For this Monte Carlo sample the same distributions as above are plotted. Figure 4.12 shows the pT spectrum and the b-jet fraction. Again the smoothness of the curve indicates that the weights were applied correctly. To check the amount of statistics of the sample, the number of events of each object is plotted in figure 4.13 (left). The fraction contributed by the different samples was studied as well (figure 4.13, right).
These samples are used for all studies related to b-jet tagging as well as for the b cross section measurements. Monte Carlo expectations for comparison with data are extracted from them.
14 https://twiki.cern.ch/twiki/bin/view/CMS/ProductionFall2010

Figure 4.12: pT spectrum of all jets in black and of b-jets in red (left) and the fraction of b-jets as a function of the jet pT (right) for the Monte Carlo samples reconstructed in CMSSW version 3.8 (CMS simulation, √s = 7 TeV).

Figure 4.13: Left: Amount of statistics for the different objects (jets, tracks, SVs, muons, electrons) which are used in the analyses, as a function of the jet pT (CMS simulation, √s = 7 TeV). MC samples for different p̂T bins are available. The different samples must be weighted to get the true pT spectrum. Right: Composition of the TuneZ2 MC sample created from different samples with different p̂T bin ranges (Pt0to5 to Pt1800).


name                   # events    Run range
JetMETTau/Run2010A     15042368    135821-141887
JetMET/Run2010A        57606424    141950-144114
Jet/Run2010B           64027020    146240-149711
Table 4.3: Trigger streams which make up the dataset for the analysis. The jet stream, which is relevant for this analysis, was combined with the missing ET (MET) and the tau stream in earlier periods.

4.5 Data samples
For this analysis the complete data of the Run2010 era are used, making use of the reprocessing in November 2010. The data are reconstructed using the updated CMS software version 3.8.
The recorded data are structured according to the different so-called eras of data taking and the different trigger streams. The trigger stream changed twice because of the increasing luminosity provided by the LHC.
While for the early data taking it was possible to pool events for jet, missing energy or tau studies, later the trigger streams only provided data for one analysis direction. The three datasets used are listed in table 4.3.
An earlier reconstruction of the first data sample was also used for an inclusive b cross section study on early CMS data, which was performed for the summer conferences 2010. Part of my thesis is to take my studies for the early analysis as a base for further investigations and to provide an update to the whole Run2010 dataset.
Besides selecting the right trigger streams, it is also necessary to check for an acceptable operation of the detector. For each run the lumi sections are centrally certified in so-called good runs. In these, all detector components worked well and the reconstruction of the events can be trusted. The certified lumi sections are provided by the CMS collaboration and listed in a published JSON file. We use the following official JSON file:

Cert136033-1494427TeVNov4ReRecoCollisions10JSON.txt¹⁵
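A sketch of how such a certification file can be applied to events, assuming the usual layout of these files: a JSON object that maps a run number (as a string) to a list of inclusive [first, last] lumi section ranges. The layout, the helper names and the example content are assumptions for illustration; the run and lumi numbers below are invented and not taken from the real file.

```python
import json

def load_good_lumis(json_text):
    """Parse a certification JSON into {run: [(first_ls, last_ls), ...]}."""
    raw = json.loads(json_text)
    return {int(run): [tuple(r) for r in ranges] for run, ranges in raw.items()}

def is_certified(good_lumis, run, lumi_section):
    """True if the lumi section of this run lies inside a certified range."""
    return any(first <= lumi_section <= last
               for first, last in good_lumis.get(run, []))

# Invented example content in the assumed layout
example_json = '{"136035": [[1, 259]], "141956": [[10, 20], [25, 94]]}'
good = load_good_lumis(example_json)
keep_event = is_certified(good, 141956, 30)
```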

The attention of the CMS collaboration is focused on high energy physics not yet covered by the Tevatron experiments. With the gain in luminosity it became necessary to prescale single, low energy jet triggers by some factor N. This must be done because of the technical limitations on recording all collisions (section 4.1). It means that, instead of each event, only every Nth event accepted by a specific trigger is recorded. Therefore the listing of the amount of data used for the analyses needs a separate look at the different triggers.
TheintegratedluminositiesLofthedataweanalyzedareshownintable4.4.Theyweremeasured
bytheCMSluminositysystem[CMS10g].Fordifferenttriggerrangestheintegratedluminosities
.separatelylistedareTheprescalingofthelowenergetictriggersdemandsanefficientfunctionalityofthehigherenergetic
trigger.Itispossibletotestthisandfindasocalledturnonpointforeachtrigger.Thiscanbe
achievedbycomparingthetriggerrateoftheoneweareinterestedinwithanotherfullyefficient
onewhichisalsounprescaledinthisrunrange.Thedeterminedefficiencyfromsuchacomparison
canbeseeninfigure4.14.Theestimatedturnonpoints,wherethetriggerismorethan99%
efficientisputinadditionintothetable4.4.
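One simple way to read a turn-on point off a binned efficiency curve is to take the first pT bin from which the efficiency stays at or above the threshold. The sketch below assumes this definition; the function name, the binning and the efficiency values are invented for illustration.

```python
def turn_on_point(bin_low_edges, efficiencies, threshold=0.99):
    """Lower edge of the first bin from which all later bins reach the threshold.

    Returns None if the efficiency never stays above the threshold.
    """
    turn_on = None
    for low_edge, eff in zip(bin_low_edges, efficiencies):
        if eff >= threshold:
            if turn_on is None:
                turn_on = low_edge
        else:
            # A later dip below the threshold invalidates earlier candidates.
            turn_on = None
    return turn_on

# Invented efficiency curve in bins of jet pT [GeV]
edges = [20, 25, 30, 35, 40, 45]
effs = [0.40, 0.80, 0.97, 0.995, 0.999, 1.0]
point = turn_on_point(edges, effs)
```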
15 https://cms-service-dqm.web.cern.ch/cms-service-dqm/CAF/certification/Collisions10/7TeV/Reprocessing/

sample        first run   turn on   JetMETTau   JetMET        Jet           all
HLTJet15U     136035      37        0.0140      9.60·10^−3    1.86·10^−3    0.0256
HLTJet30U     136035      84        0.120       0.192         0.0374        0.352
HLTJet50U     136035      114       0.285       2.87          0.317         3.50
HLTJet70U     141956      153       -           2.87          5.99          9.17
HLTJet100U    141956      196       -           2.87          16.6          19.8
HLTJet140U    147196      245       -           -             27.8          36.0
HLTJet180U    148822      300       -           -             18.3          36.0
Table 4.4: Integrated luminosity L in pb⁻¹ for different samples and triggers. The different triggers are shown with the run number of their activation and the turn-on position in pT,jet, where the trigger becomes 99% efficient. The last column shows the integrated luminosity of this pT,jet range. The luminosities for the major trigger and the lower energy triggers are summed.

Figure 4.14: Efficiency of the different triggers (jet15U, jet30U, jet50U, jet70U, jet100U, jet140U, jet180U) as a function of pT,jet (CMS private work, √s = 7 TeV). The position where the trigger becomes 99% efficient is extracted as the turn-on point.

Figure 4.15: Available statistics of the different objects (jets, tracks, SVs, muons, electrons) in bins of the transverse momentum of the jet (CMS private work, 36 pb⁻¹, √s = 7 TeV). The amount is relevant for the precision of the measurement. Due to trigger prescaling in the low pT regions, the statistics there are more or less permanently limited to the given number.

In the end we get the spectrum of the objects we want to analyse, as plotted in figure 4.15. The structure in the shape is caused by the prescaling of the triggers. The spectrum starts at a transverse jet momentum of 37 GeV, where the lowest-threshold jet trigger (HLTJet15U) becomes efficient. It is also visible that jets with a transverse momentum of 1000 GeV have already been collected.
Accounting for the prescaling factor and the integrated luminosity, we are able to plot the inclusive jet cross section of the reconstructed jets (see figure 4.16).

Figure 4.16: Top: the jet spectrum normalized by the integrated luminosity L (N_jet/L per bin; CMS private work, 36 pb⁻¹, √s = 7 TeV). It is compared with the two Monte Carlo expectations of the samples used in this thesis (Pythia 6 QCD DiJet and Pythia 6 QCD TuneZ2). Bottom: relative pT,jet spectrum. The MC spectra from the top plot are plotted relative to the data distribution.

Chapter 5

New applications of NeuroBayes

Many analyses for very different purposes are performed with NeuroBayes [Fei04]. It has been applied not only to physics but also to economics. With NeuroBayes interesting and important knowledge was obtained¹. A summary of the different topics is available at the public web page neurobayes.de.
In 2008 this knowledge was applied for the first time to the CMS experiment. A b-jet tagger for specific b-quarks decaying to electrons was developed [Mar09]. Based on this experience, further applications of NeuroBayes for the CMS experiment were developed.
In this chapter I introduce the NeuroBayes framework and describe the new tasks I developed for CMS.

5.1 NeuroBayes

In this section I will give a complete overview of the multivariate analysis framework NeuroBayes. I will explain the architecture and the statistical methods included in this framework.

5.1.1 Introduction
NeuroBayes is a multivariate analysis framework which was originally designed by Michael Feindt [PT10]. Like most frameworks for physics analysis it was developed to tackle one of the most important challenges in physics: the prediction of physics properties. This ranges from the binary case, where we are interested in whether or not an event belongs to a specific class, for example signal or background, to the continuous case, where the property of an object, for example the decay time, is estimated [Mor06]. In the latter case NeuroBayes delivers the probability density of the target for each event [Fei04].
These predictions are made by an intelligent combination of well known statistical methods [Fei04]. In this section I will introduce these methods and their contribution to NeuroBayes. The section is separated into two parts. The first describes the preprocessing of the input variables. The second part deals with the correlation of the input variables to the target and the calculation of the prediction.
For a NeuroBayes analysis the various methods must be set up and calibrated on so-called training samples. I will limit myself to explaining the classification mode of NeuroBayes. The classification is performed using two different samples, from now on labeled target 0 (T0) and target 1 (T1). The aim is to calibrate a so-called NeuroBayes expertise/expert which is able to distinguish those two samples. In this section I will introduce the different methods which are
1 http://neurobayes.phi-t.de/index.php/theses/jresearch-theses

Figure 5.1: Probability integral transform for an exemplary variable (η of the track; CMS private work 2010, √s = 7 TeV, pythia6 MC). The left histogram shows the original distribution. The middle plot shows the cumulative distribution created from the left one. Dividing the y-axis into equally sized slices delivers the boundaries of the bins of the right histogram. Each bin of the last plot contains the same amount of statistics.

implemented in NeuroBayes. An overview of the possible setup parameters can be found in [PT10]. The resulting expertise file is used to get the predictions for each event of a data sample.

5.1.2 Preprocessing
Preprocessing is the umbrella term for all methods applied to the input variables x_i before the study of their correlation to the target. This includes transformations of the single variables with and without knowledge of the target information as well as rotations of the complete input vector x.

Probability integral transform
The probability integral transform is the transformation of a random variable X distributed with density f(x) to a uniform distribution. For a known cumulative distribution function

\[ F(x) = \int_{-\infty}^{x} f(\tilde{x})\, d\tilde{x} \]

the variable Y = F(X) is distributed uniformly. If the distribution of X is unknown, it is possible to estimate this transformation. For this, the cumulative histogram is created, which represents the cumulative distribution function. Dividing the y-axis into equally sized slices gives us the boundaries of the bins in x for a new histogram (figure 5.1). The bins of this histogram all contain the same amount of statistics.
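The estimated transform amounts to equal-frequency binning. A minimal sketch that takes the bin boundaries from the sorted sample (an empirical version of slicing the cumulative distribution; the function names are illustrative and this is not the NeuroBayes implementation):

```python
from bisect import bisect_right

def flat_bin_edges(values, n_bins):
    """Bin edges such that every bin holds about the same number of entries."""
    data = sorted(values)
    n = len(data)
    inner = [data[(k * n) // n_bins] for k in range(1, n_bins)]
    return [data[0]] + inner + [data[-1]]

def flat_bin_index(x, edges):
    """Index of the flat bin that x falls into (0 .. n_bins - 1)."""
    return bisect_right(edges[1:-1], x)

# A strongly non-uniform sample: squares of 0..99
sample = [i * i for i in range(100)]
edges = flat_bin_edges(sample, 4)
counts = [0, 0, 0, 0]
for value in sample:
    counts[flat_bin_index(value, edges)] += 1
```

The bins become wide where the original distribution is sparse and narrow where it is dense, which is the varying bin width visible in the flattened histograms.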

Parametrization of the input variable distributions
Typically the distributions of the input variables for a given target are not known. To transform the input variables in the most convenient way, it is recommended to parametrize these distributions. One possible procedure is the concept of orthogonal polynomials [BL98]. Each function $\tilde{y}_i(x_i)$ can be constructed by a linear combination of orthogonal polynomials

\[ p_k(x) = \sum_{i}^{k} b_{ki}\, x^i \]

in the following way:

\[ \tilde{y}_i(x_i) = \sum_{j}^{m} a_j\, p_j(x_i). \]

It is possible to get an estimate of the parameters $a_j$ by a fit to the events. For $a_j$ with expectation zero the estimate $\hat{a}_j$ follows a normal distribution with mean $E[\hat{a}_j] = 0$ and variance $\mathrm{Var}(\hat{a}_j) = 1$. For the construction of the polynomial for the parametrization only the parameters $a_j$ which are significantly different from zero are used.
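The selection of significant coefficients can be illustrated with a toy case. The sketch assumes a variable already mapped to [−1, 1], Legendre polynomials as the orthogonal basis, and a flat reference distribution; under these assumptions the scaled estimate z_j = sqrt(N) · mean(sqrt(2j+1) · P_j(x)) has mean 0 and variance 1 when the true coefficient vanishes. The choice of basis, the normalization and all names are assumptions for illustration, not the procedure of [BL98] or NeuroBayes.

```python
import math

def legendre(order, x):
    """Legendre polynomial P_order(x) from the three-term recurrence."""
    p_prev, p = 1.0, x
    if order == 0:
        return p_prev
    for n in range(1, order):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def coefficient_significances(xs, max_order):
    """List of (order, estimate, significance) for orders 1..max_order."""
    n = len(xs)
    result = []
    for j in range(1, max_order + 1):
        norm = math.sqrt(2 * j + 1)  # unit variance for a flat distribution
        a_hat = sum(norm * legendre(j, x) for x in xs) / n
        result.append((j, a_hat, a_hat * math.sqrt(n)))
    return result

def significant_orders(xs, max_order, n_sigma=3.0):
    """Orders whose coefficient differs from zero by more than n_sigma."""
    return [j for j, _, z in coefficient_significances(xs, max_order)
            if abs(z) > n_sigma]

n_points = 1000
flat = [-1.0 + (2 * k + 1) / n_points for k in range(n_points)]
# Sample with density (1 + x) / 2 on [-1, 1]: only the linear term is present
sloped = [2.0 * math.sqrt((k + 0.5) / n_points) - 1.0 for k in range(n_points)]
```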

Figure 5.2: The target 1 distribution of two exemplary input variables (black; η of the track and secondary vertex mass, pythia6 MC b-jets, CMS private work 2010, √s = 7 TeV) is fitted with the orthogonal polynomial method. The resulting function is plotted in red. Depending on the shape of the variable, it may be difficult to find a good parametrization.
In figure 5.2 an orthogonal polynomial fit is performed. The fit looks good for the left plot. On the other hand, it is possible to get a worse description of the events, as shown on the right. This happens because of the low statistics in the corresponding bins of the histogram and the lack of flexibility of the fit function. To reduce such an effect, a probability integral transformation can be applied before fitting.
Another important tool for the parametrization is a fit of a spline function $S_n(x_i)$. This is a function defined piecewise by polynomials of degree n (see also [BL98]). With degree n = 0 it is a step function, identical to the histogram. The natural cubic spline function has degree n = 3. It is twice continuously differentiable and the curvature at the endpoints a, b is defined by $S_3''(a) = S_3''(b) = 0$. This requirement leads to the smallest possible curvature. The knots of the spline function are constrained to the bin values.
Probability transformation
NeuroBayes makes use of both methods described above. First the distributions of the input variables x_i are transformed by the probability integral transform. Thus each bin of the input variable histogram has the same statistical power. In the next step we are only interested in the fraction of the events of one class. Figure 5.3 shows a histogram with the distribution of the two targets of a NeuroBayes classification in red and black. The plot is extracted from the output file of the official monitoring macro analysis.C. The plot can be identified by the label on the right. 'Flat' stands here for the result of the probability integral transform, split into the target 0 distribution in black and the target 1 distribution in red. Note the varying bin width, labeled on the x-axis.
Based on this histogram it is already possible to estimate a conditional probability P(T1|x_i) for each event,

\[ P(T1|x_i) = \frac{N_{\mathrm{bin}}(T1)}{N_{\mathrm{bin}}(T0) + N_{\mathrm{bin}}(T1)}. \]

N_bin is the number of T0 or T1 events per bin. The fraction for each bin is shown in figure 5.4. The plot is labeled 'spline fit'. The 100 bins are simply labeled with the bin number.
To reduce binning effects and the effects of the statistical uncertainties of each bin, we can perform a fit to the fraction by the method of orthogonal polynomials.
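The per-bin estimate is a simple ratio of counts. A short sketch with invented bin contents (the function name is illustrative; empty bins are returned as None because the ratio is undefined there):

```python
def purity_per_bin(n_t0, n_t1):
    """Target 1 fraction N(T1) / (N(T0) + N(T1)) for every bin."""
    purities = []
    for t0, t1 in zip(n_t0, n_t1):
        total = t0 + t1
        purities.append(t1 / total if total > 0 else None)
    return purities

# Invented counts of target 0 and target 1 events in four flat bins
purity = purity_per_bin([90, 50, 10, 0], [10, 50, 90, 0])
```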

Figure 5.3: Target 1 (red) and target 0 (black) distribution after the probability integral transform.

Figure 5.4: The fraction of the signal distribution of the flattened histogram (black) is fitted with the orthogonal polynomial method. The resulting function is plotted in red.

Now we are able to transform the input variables directly to an estimate of their probability P(T1|x_i), where N is the normalization factor, which is given by the overall number of events,

\[ N = \sum_{\mathrm{bins}} \left( N_{\mathrm{bin}}(T0) + N_{\mathrm{bin}}(T1) \right). \]

This normalization factor cancels with the transformed a priori distribution F(x_i), because this is a uniform distribution. We have constructed the simplest case of Bayes' theorem with a flat prior:

\[ P(T1|x_i) = \frac{1}{N}\, \tilde{y}_i(x_i|T1)\, F(x_i) = \tilde{y}_i(x_i|T1). \]

NeuroBayes does this for all input variables. For the following calculations each x_i is replaced by its P(T1|x_i).
Standardization and correlation coefficients
In preparation for the calculation of the correlation coefficients, the distributions of the different input variables are transformed once more. This time the variable distributions are standardized. The transformation is chosen in such a way that the mean of each distribution is zero and its variance is one (figure 5.5):

\[ y_i = \frac{\tilde{y}_i - E[\tilde{y}_i]}{\sqrt{\mathrm{Var}(\tilde{y}_i)}}. \]

To calculate the correlation coefficient $\rho_{ij}$ of two input variables i and j, we can sum the product of the transformed values over all N events:

\[ \rho_{ij} = \frac{1}{N} \sum_{\mathrm{events}} y_i\, y_j \]

Figure 5.6 shows the matrix of the correlation coefficients of an exemplary NeuroBayes training.
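Both steps can be written in a few lines. The sketch uses the population variance (division by N) so that the standardized values have variance exactly one; the function names and the toy numbers are illustrative assumptions.

```python
import math

def standardize(values):
    """Shift and scale the values to mean 0 and variance 1."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    return [(v - mean) / std for v in values]

def correlation(a, b):
    """Correlation coefficient as the mean product of standardized values."""
    ya, yb = standardize(a), standardize(b)
    return sum(x * y for x, y in zip(ya, yb)) / len(ya)

# Toy variables: the second is a scaled copy of the first, the third is reversed
var_1 = [0.1, 0.4, 0.5, 0.9]
var_2 = [0.2, 0.8, 1.0, 1.8]
var_3 = [0.9, 0.5, 0.4, 0.1]
rho_12 = correlation(var_1, var_2)
rho_13 = correlation(var_1, var_3)
```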

Figure 5.5: Standardization of the estimated signal probability of the variable from figure 5.4. The mean of this distribution is zero with a width of one. This is the final distribution of the transformed input variable after the preprocessing.

Figure 5.6: Matrix of correlation coefficients $\rho_{ij}$. The first column is the target distribution, where target 1 is set to 1 and target 0 is set to 0. In this example there are many highly correlated variables ($|\rho_{ij}| \to 1$), which are painted in deep red and deep blue colors.


First order decorrelation
The main problem of multivariate analysis is the unknown correlation of the input variables. If the correlations were known, we could construct the likelihood function and test any hypothesis by a likelihood ratio test [NP33]. Otherwise one has to find a procedure to handle the correlated variables. The easiest way is to remove all variables which are highly correlated. This results in a more robust procedure but causes a loss of information which might be relevant for the discrimination.
Therefore a more advanced procedure is to attempt a decorrelation of the input variables. One possibility is to diagonalize the matrix of correlation coefficients. Such a diagonalization is a rotation in the n-dimensional phase space. The decorrelation is done in first order only. Second order correlations still remain.
Formally the diagonal matrix D can be described as

\[ A^{-1} \rho A = D \]

where ρ denotes the matrix of correlation coefficients. Technically, the direct calculation of the rotation matrix A is almost impossible, because it needs the inverse of the n-dimensional matrix. But there are iterative methods which converge to the diagonalized form. The method used in NeuroBayes is the Jacobi rotation [BL98]. The idea is to perform several 2-dimensional rotation steps until a close to diagonal matrix is formed. The product of all these sub-rotations gives us the matrix A.
We are able to construct a set of uncorrelated variables $\tilde{z}_i = A_{ij} y_j$.
The remaining correlation to the target of each row represents the information which the input variable finally adds to the classification. With this information it is possible to prune variables with less relevance. This applies to variables with little information for the classification as well as to variables with more information but large correlation to others.
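The Jacobi idea can be sketched in pure Python: each step rotates one pair of axes (p, q) so that the off-diagonal element d[p][q] vanishes, and the rotations are accumulated. This is a textbook cyclic Jacobi iteration written for clarity rather than speed; it is not the NeuroBayes code, and the function names and the example matrix are invented for illustration.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def jacobi_diagonalize(matrix, tol=1e-12, max_sweeps=50):
    """Return (d, a) with transpose(a) * matrix * a = d close to diagonal."""
    n = len(matrix)
    d = [row[:] for row in matrix]
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(d[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if d[p][q] == 0.0:
                    continue
                # Angle that zeroes d[p][q]: tan(2 theta) = 2 d_pq / (d_qq - d_pp)
                diff = d[q][q] - d[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4.0, d[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * d[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                rot = [[1.0 if i == j else 0.0 for j in range(n)]
                       for i in range(n)]
                rot[p][p], rot[q][q] = c, c
                rot[p][q], rot[q][p] = s, -s
                d = matmul(transpose(rot), matmul(d, rot))
                a = matmul(a, rot)
    return d, a

# Invented correlation matrix of three input variables
rho = [[1.0, 0.8, 0.3],
       [0.8, 1.0, 0.5],
       [0.3, 0.5, 1.0]]
diag, rotation = jacobi_diagonalize(rho)
```

Because the accumulated matrix is orthogonal, its transpose is its inverse, so in this convention the decorrelated variables are obtained by applying the transposed rotation to the vector of standardized inputs.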

5.1.3 Target correlation and prediction
The preprocessing gives us a set of n almost uncorrelated variables $\tilde{z}_i$. The literature offers many methods for hypothesis testing under these conditions. Many of them have a needlessly long running time, above all if some input variables have a very small correlation to the target. Depending on the correlation to the target, a selection of the relevant input variables is recommended.
NeuroBayes has an automatic sorting algorithm for the variables. Variables are sorted by relevance and, furthermore, it is possible to neglect variables with low significance.
To get the predictions NeuroBayes provides three different modes which lead to a calibrated expertise.

Zero iteration training

The fastest and therefore most widely used procedure is an analytic method called zero iteration training. As described above, we have a matrix of uncorrelated standardized variables. This represents a sphere in the n-dimensional phase space, so we have the freedom to choose a direction without influencing the variables themselves. We can choose the direction with the most discriminating power with respect to the target. This direction is now called $z_0$. It is a linear combination of the $\tilde{z}_i$:

$$z_0 = r_{ij}\,\tilde{z}_i$$

where $r_{ij}$ are the coefficients of the rotation matrix which provides the chosen direction.

[Figure 5.7 plots: CMS simulation, √s = 7 TeV, NeuroBayes Teacher. Right panel: target 1 purity versus NeuroBayes output for pythia6 QCD b-jet, with the diagonal.]

Figure 5.7: Diagonalization of the zero iteration training. This is needed to get a probability interpretation of the NeuroBayes output. On the left, the transformation function is determined by fitting a monotonically rising spline function. On the right, the resulting distribution of the target 1 purity $P(T_1|o_t)$ versus the final output value $o_t$ is plotted. The expected behaviour, shown by the diagonal line, is fulfilled.

This method works very well because of the transformation we applied to the input variables. The final input variables $y_i$ have a monotonically rising dependency on the target 1 purity of each variable, $P(T_1|y_i)$. This dependency is conserved during the decorrelation. The projection of the variables $z_i$ onto the target is a very good discriminator $z_0$. But this $z_0$ does not fulfill the probability interpretation. Plotting the $T_1$ fraction of $z_0$ and fitting it with a monotonic spline function transforms it to the probability $P(T_1|z_0)$. The probability transformation can be seen in figure 5.7 on the left. If the purity of each bin is plotted (as done on the right), it corresponds directly to the mean value of each bin. That means the output can be interpreted as a probability.
The output value is now called $o_t$. The index t, finally combined with an identification number, is used to specify different NeuroBayes trainings. Hence for the probability I use the following notation:

$$o_t = P(T_1|o_t).$$
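The diagonal check described above (the purity of each bin should equal the mean output value in that bin) can be sketched in a few lines. This is an illustration on a toy sample, not the NeuroBayes spline fit; the function name `binned_purity`, the binning and the generated sample are assumptions of this sketch.

```python
import numpy as np

def binned_purity(output, is_target1, n_bins=10):
    """Return (mean output, target 1 purity, event count) per bin of an output in [0, 1]."""
    output = np.asarray(output, dtype=float)
    is_target1 = np.asarray(is_target1, dtype=bool)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index per event; the value 1.0 is assigned to the last bin.
    idx = np.clip(np.digitize(output, edges) - 1, 0, n_bins - 1)
    n_all = np.bincount(idx, minlength=n_bins)
    n_t1 = np.bincount(idx, weights=is_target1.astype(float), minlength=n_bins)
    sum_o = np.bincount(idx, weights=output, minlength=n_bins)
    purity = np.full(n_bins, np.nan)
    mean_o = np.full(n_bins, np.nan)
    filled = n_all > 0
    purity[filled] = n_t1[filled] / n_all[filled]
    mean_o[filled] = sum_o[filled] / n_all[filled]
    return mean_o, purity, n_all

# Toy sample in which the output is a true probability by construction.
rng = np.random.default_rng(1)
output = rng.uniform(0.0, 1.0, size=200_000)
is_target1 = rng.uniform(size=output.size) < output
mean_o, purity, n_all = binned_purity(output, is_target1)
```

For a well calibrated output the points (`mean_o`, `purity`) lie on the diagonal within their statistical uncertainty. A systematic deviation would indicate that the probability interpretation is not fulfilled.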

Neural Network Training

Another, more time-consuming method is an artificial neural network [Ros58]. With this we can handle higher order correlations too. In NeuroBayes just a simple feed-forward network with one hidden layer is implemented. The default value for the number of hidden nodes is the number of input nodes minus one. The number of nodes N of the output layer depends on the NeuroBayes mode. For a binary target it has one node; for the continuous mode N = 20. The output values $o_{t,i}$ are calculated by:

$$o_{t,i} = S\left(\sum_j w^{2\to3}_{ji}\, S\left(\sum_k w^{1\to2}_{kj}\, y_k\right)\right)$$

where the weights $w^{a\to b}_{ji}$ are the connections between the different layers, $y_k$ is the transformed input value of variable k and S(x) is the sigmoid function

$$S(x) = \frac{2}{1+e^{-x}} - 1$$

used as the transfer function for each node. A possible architecture of the artificial neural network implemented in NeuroBayes is shown in figure 5.8. There are three layers: the input layer, one hidden layer and the output layer with only one node. The number of input nodes is reduced because of the minor correlation of some input variables $y_i$ to the target. The thickness of the lines corresponds to the absolute values of the weights $w^{a\to b}$. These are calculated by the backpropagation mechanism [RHW87]. There is the possibility to use the so-called BFGS method [BRO70] to minimize the error function in a more efficient way. As shown in [Fei04], the output of such an artificial neural network can be interpreted as a probability, $o_t = P_t(T_1|o_t)$.

Figure 5.8: Example of an architecture of an artificial neural network calibrated in NeuroBayes. The thickness of the lines corresponds to the absolute values of the weights $w^{a\to b}$.
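The forward pass of such a three-layer network can be sketched as follows. The symmetric sigmoid is written here as $S(x) = 2/(1+e^{-x}) - 1$, which is identical to $\tanh(x/2)$. The weights are random placeholders and the names `network_output`, `w_in_hidden` and `w_hidden_out` are assumptions of this sketch; no training is performed.

```python
import numpy as np

def sigmoid(x):
    """Symmetric sigmoid S(x) = 2 / (1 + exp(-x)) - 1 with values in (-1, 1)."""
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

def network_output(y, w_in_hidden, w_hidden_out):
    """Forward pass: input layer -> one hidden layer -> output layer."""
    hidden = sigmoid(y @ w_in_hidden)      # activations of the hidden nodes
    return sigmoid(hidden @ w_hidden_out)  # activations of the output nodes

rng = np.random.default_rng(0)
n_input = 4
n_hidden = n_input - 1                     # default quoted in the text
w_in_hidden = rng.normal(size=(n_input, n_hidden))   # placeholder weights
w_hidden_out = rng.normal(size=(n_hidden, 1))        # one output node (binary target)
y = rng.normal(size=n_input)               # one set of transformed input values
o = network_output(y, w_in_hidden, w_hidden_out)
```

The raw network output lies in the interval (-1, 1). A linear mapping to (0, 1) is needed before it can be read as a probability.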

5.2 NeuroBayes probability
In this section I will present applications of the probability interpretation of the NeuroBayes output. I will show how we must transform it to get the right probabilities for a given data sample and how it can be used to estimate the fraction of signal events. Further, I will introduce the sPlot method and how it can be implemented if we have the NeuroBayes output distribution at hand.

5.2.1 NeuroBayes probability transformation
For a given NeuroBayes classification with two samples $T_0$ (target 0) and $T_1$ (target 1), the result of the training t gives us for each of the events the probability $P_t(T_1|o_t)$ with the NeuroBayes output value $o_t$. The overall number of events is given by $N = N_{T_0} + N_{T_1}$. For the output we have the following equations:

$$P_t(T_1|o_t) = o_t,$$
$$P_t(T_0|o_t) = 1 - o_t.$$

Being a probability is one of the main properties of the NeuroBayes output. Checking for this is therefore a cross-check of a reliable calibration of the NeuroBayes expert. In figure 5.9 on the right, the purity $p_i = n_i(T_1)/n_i(T_1 \vee T_0)$ of each bin is shown for the exemplary NeuroBayes output variable $o_t$ on the left. As expected, all the calculated purity values lie on the diagonal, which corresponds to the quoted property.

[Figure 5.9 plots: CMS simulation, √s = 7 TeV, NeuroBayes Teacher. Left: number of jets per 0.01 versus NeuroBayes output for pythia6 MC b-jet and non-b-jet. Right: b-jet purity per bin versus NeuroBayes output for pythia6 QCD b-jet, with the diagonal.]

Figure 5.9: On the left, the output distribution of an exemplary NeuroBayes calibration is shown. The background is drawn in black and the signal distribution in red. On the right, the purity of the signal distribution is plotted for each bin.

If the target fraction differs from that of the analysis sample, a monotonic transformation is needed to maintain the probability interpretation. Here I will discuss two cases where such a transformation is needed.

Let us assume a general case, where we want to analyse a given data sample consisting of two classes: signal S and background B. The size of the sample is given by $N_d = N(S) + N(B)$. We are interested in the probability $P(S|o_{nb})$ of some event out of this sample to be a signal event, dependent on the NeuroBayes output $o_{nb}$.

In the first case we have some simulations to study the differences between the two classes. Therefore we have one sample $S_{MC}$, where $\mathrm{pdf}(x|S_{MC}) \approx \mathrm{pdf}(x|S)$, and one sample $B_{MC}$, where $\mathrm{pdf}(x|B_{MC}) \approx \mathrm{pdf}(x|B)$, with given numbers of events $N(S_{MC})$ and $N(B_{MC})$. These two samples are called training samples.

The way to go is quite easy. A successful NeuroBayes classification for case one ($t_1: x \to o_{t_1}$) on the simulated samples gives us the probability $P_{t_1}(S_{MC}|o_{t_1})$:

$$o_{t_1} = P_{t_1}(S_{MC}|o_{t_1}) \quad\text{and}\quad 1 - o_{t_1} = P_{t_1}(B_{MC}|o_{t_1})$$

Bayes' theorem says:

$$P_{t_1}(S_{MC}|o_{t_1})\,\mathrm{pdf}(o_{t_1}) = \mathrm{pdf}(o_{t_1}|S_{MC})\,P_{t_1}(S_{MC})$$
$$P_{t_1}(B_{MC}|o_{t_1})\,\mathrm{pdf}(o_{t_1}) = \mathrm{pdf}(o_{t_1}|B_{MC})\,P_{t_1}(B_{MC})$$

and gives us the ratio for the training sample as:

$$\frac{o_{t_1}}{1-o_{t_1}} = \frac{P_{t_1}(S_{MC}|o_{t_1})}{P_{t_1}(B_{MC}|o_{t_1})} = \frac{\mathrm{pdf}(o_{t_1}|S_{MC})\,P_{t_1}(S_{MC})}{\mathrm{pdf}(o_{t_1}|B_{MC})\,P_{t_1}(B_{MC})} \approx \frac{\mathrm{pdf}(o_{t_1}|S)\,P_{t_1}(S_{MC})}{\mathrm{pdf}(o_{t_1}|B)\,P_{t_1}(B_{MC})}$$

To simplify the formula in the further steps, we can introduce the likelihood ratio:


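$$\Lambda_{t_1}(o_{t_1}) = \frac{\mathrm{pdf}(o_{t_1}|S)}{\mathrm{pdf}(o_{t_1}|B)} = \frac{o_{t_1}}{1-o_{t_1}}\,\frac{P_{t_1}(B_{MC})}{P_{t_1}(S_{MC})}.$$

Bayes' theorem is also true for the data sample, so we get for the ratio on data:

$$\frac{P(S|o_{t_1})}{P(B|o_{t_1})} = \frac{\mathrm{pdf}(o_{t_1}|S)\,P(S)}{\mathrm{pdf}(o_{t_1}|B)\,P(B)}.$$

With $P(B|o_{t_1}) = 1 - P(S|o_{t_1})$ and the ratio for the training sample we get the probability of an event to be signal for a given NeuroBayes value:

$$P(S|o_{t_1}) = \frac{\Lambda_{t_1}(o_{t_1})}{\frac{P(B)}{P(S)} + \Lambda_{t_1}(o_{t_1})} = \frac{\Lambda_{t_1}(o_{t_1})\,P(S)}{1 + P(S)\,(\Lambda_{t_1}(o_{t_1}) - 1)}.$$

The likelihood ratio $\Lambda_{t_1}(o_{t_1})$ is easy to calculate from the known properties of the NeuroBayes training. So only the fraction of signal events P(S) is needed to calculate the posterior probability $P(S|o_{t_1})$. P(S) can be estimated by a template fit using $P(o_t|S)$ and $P(o_t|B)$ as templates.
Using $P(S|o_{t_1})$ as a weight on a data sample, we can unfold the signal distribution of any variable which is totally correlated to $o_{t_1}$. In [PL05] this simple behavior is named inPlot.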
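The transformation from the training prior to the prior of the data sample can be written down directly from the two formulas above. The following lines are a small sketch; the function names and the example signal fractions are assumptions chosen for illustration.

```python
import numpy as np

def likelihood_ratio(o, signal_fraction_training):
    """Lambda(o) = o / (1 - o) * P_t(B_MC) / P_t(S_MC)."""
    o = np.asarray(o, dtype=float)
    f = signal_fraction_training
    return o / (1.0 - o) * (1.0 - f) / f

def posterior_signal(o, signal_fraction_training, signal_fraction_data):
    """P(S|o) = Lambda * P(S) / (1 + P(S) * (Lambda - 1))."""
    lam = likelihood_ratio(o, signal_fraction_training)
    p_s = signal_fraction_data
    return lam * p_s / (1.0 + p_s * (lam - 1.0))

o = np.array([0.1, 0.5, 0.9])
p_same = posterior_signal(o, 0.5, 0.5)    # data prior equals training prior
p_rare = posterior_signal(o, 0.5, 0.05)   # signal is much rarer in data
```

If the signal fraction in data equals the one used in the training, the output is returned unchanged. A smaller signal fraction in data pulls all probabilities downwards, while the ordering of the events is preserved because the transformation is monotonic.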
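In a second case we want to calculate this probability with only a given sample of simulated signal ($\mathrm{pdf}(x|S_{MC}) \approx \mathrm{pdf}(x|S)$). All information about the background must be taken from the data sample D = S + B. For the NeuroBayes classification ($t_2: x \to o_{t_2}$) we use the signal simulation as target 1 sample and the data sample as target 0 sample: $P_{t_2}(D) + P_{t_2}(S_{MC}) = 1$. Bayes' theorem for the NeuroBayes training gives us:

$$o_{t_2} = P_{t_2}(S_{MC}|o_{t_2}) = \frac{\mathrm{pdf}(o_{t_2}|S_{MC})\,P_{t_2}(S_{MC})}{\mathrm{pdf}(o_{t_2}|S)\,P_{t_2}(S) + \mathrm{pdf}(o_{t_2}|B)\,P_{t_2}(B) + \mathrm{pdf}(o_{t_2}|S_{MC})\,P_{t_2}(S_{MC})}$$

For the probability of interest $P(S|o_{t_2})$ we know:

$$P(S|o_{t_2}) = \frac{\mathrm{pdf}(o_{t_2}|S)\,P(S)}{\mathrm{pdf}(o_{t_2}|S)\,P(S) + \mathrm{pdf}(o_{t_2}|B)\,P(B)}$$

With $P_{t_2}(S)/P_{t_2}(B) = P(S)/P(B)$, $N_{t_2}(S)/N_{t_2}(D) = N(S)/N_d$ and Bayes' theorem for the NeuroBayes training we get:

$$P(S|o_{t_2}) = \frac{P_{t_2}(S)}{P_{t_2}(S_{MC})}\,\frac{o_{t_2}}{1-o_{t_2}} = P(S)\,\Lambda_{t_2}(o_{t_2})$$

In this case we have similar dependencies as in the first case. The likelihood ratio $\Lambda_{t_2}(o_{t_2})$ is well known, but we need an estimate of the unknown signal fraction P(S). Another property of this equation is the limitation of $\Lambda_{t_2}(o_{t_2})$. Because of $\max(P(S|o_{t_2})) = 1$ we get $\Lambda_{t_2}(o_{t_2}) \le 1/P(S)$. This upper limit is smeared by the resolution of the NeuroBayes training.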

5.2.2 Boost Training - NeuroBayes and weights
Finally, I want to focus on the procedure of boosting an already calibrated NeuroBayes expertise. Boosting is an umbrella term for calibrating more than one expertise to get a final result. The goal of each iteration step is to correct possible imprecisions of the former steps. For example, for a very complex problem it is sensible to learn the obvious features in a first step and the complicated ones in a second. The second step becomes easier because we start from a better initial position.

To implement such a boost, all events must be weighted for the new calibration. The easiest and most intuitive approach for a boost training is to weight the events e of one target with the probability to be of the other target:

$$w_{T_0} = P(T_1|e),$$
$$w_{T_1} = P(T_0|e).$$

A given region r with a given number of events N out of two classes $T_0$ and $T_1$ has the probabilities $P(T_0|r) = \frac{N(T_0)}{N(T_1)+N(T_0)}$ and vice versa. Applying the weights from above results in the same effective number of events $N_b$ for both classes:

$$N_b(T_0) = w_{T_0}\,N(T_0) = \frac{N(T_1)\,N(T_0)}{N(T_1)+N(T_0)} = w_{T_1}\,N(T_1) = N_b(T_1).$$
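The equality of the effective event numbers can be verified on a small example. In this sketch a single region with 300 target 0 and 100 target 1 events is assumed, and the true probability of that region is used for the weights; the function name `boost_weights` is hypothetical.

```python
import numpy as np

def boost_weights(is_target1, p_target1):
    """Weight each event with the probability to belong to the other target."""
    is_target1 = np.asarray(is_target1, dtype=bool)
    p_target1 = np.asarray(p_target1, dtype=float)
    # Target 1 events get P(T0|e) = 1 - P(T1|e), target 0 events get P(T1|e).
    return np.where(is_target1, 1.0 - p_target1, p_target1)

# One region with N(T0) = 300 and N(T1) = 100, hence P(T1|r) = 0.25.
is_target1 = np.array([False] * 300 + [True] * 100)
p_target1 = np.full(is_target1.size, 100 / (100 + 300))
w = boost_weights(is_target1, p_target1)
nb_t0 = w[~is_target1].sum()   # effective number of target 0 events
nb_t1 = w[is_target1].sum()    # effective number of target 1 events
```

Both effective numbers come out as 300 * 100 / 400 = 75, so within this region no separation between the two targets is left after the weighting.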
If the true P(T|r) of the given region is known, no further classification is possible. For any inclusive distribution we have:

$$P(x|T_1)\,w_{T_0} = P(x|T_1)\,P(T_0|x) = P(x|T_1)\int dr\,P(T_0|x,r)\,P(r|x)$$

A usual NeuroBayes classification results in an estimate $\hat{P}(T|r) = P(T|o_t)$. The region r is defined by the input variables. If we construct a weight in the same way with this estimate, only the information gained by the training vanishes. An additional, so-called boost training can find further quantities to separate the two classes. The combination of both improves the overall classification.

By construction the outputs of the two experts should be only weakly correlated. The combination of the two results can be approximated by the multiplication of their likelihood ratios, given by

$$\Lambda = \frac{\mathrm{pdf}(o_t|T_1)}{\mathrm{pdf}(o_t|T_0)} = \frac{o_t}{1-o_t}\,\frac{N(T_0)}{N(T_1)}.$$
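A combination of a first expert and its boost by multiplying the likelihood ratios could look like the sketch below. It assumes that the boost training sees equal effective numbers of events for both targets, so that its likelihood ratio is simply $o/(1-o)$. The conversion of the combined ratio back to a probability for the original target fractions is an addition of this sketch, and all function names are hypothetical.

```python
def likelihood_ratio(o, n_t0, n_t1):
    """Lambda = o / (1 - o) * N(T0) / N(T1) for one expert output o."""
    return o / (1.0 - o) * n_t0 / n_t1

def combine_experts(o_first, o_boost, n_t0, n_t1):
    """Multiply the likelihood ratios of the first expert and the boost expert."""
    lam_first = likelihood_ratio(o_first, n_t0, n_t1)
    # Boost training: equal effective event numbers for both targets (assumption).
    lam_boost = likelihood_ratio(o_boost, 1.0, 1.0)
    # Convert the combined ratio back to a probability for the original fractions.
    odds = lam_first * lam_boost * n_t1 / n_t0
    return odds / (1.0 + odds)

unchanged = combine_experts(0.3, 0.5, n_t0=300, n_t1=100)
improved = combine_experts(0.3, 0.8, n_t0=300, n_t1=100)
```

A boost output of 0.5 carries no additional information and leaves the first probability untouched. A boost output above 0.5 raises it.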
Correlations between the two experts can still appear. This happens when the probability interpretation of the output variable $o_t$ of the unboosted calibration is not entirely correct. The weights for the boost training then involve a bias from the badly estimated events. To control effects from this source, it is advisable to take $o_t$ as an input for the boosted training. If everything is correct, the variable has no correlation to the target and does not influence the boost training. Any dependency between $o_t$ and the target is a hint towards problems which must be investigated.
In most cases the boost training makes only small corrections to the first training. Therefore possible correlations are also small and the likelihood ratio combination can be used.
Such a boost training can be applied many times. In fact, most of the improvement is already achieved by the first boost training. Only in very special cases may a gain with more iterations be possible.
For the boost it is allowed to change the calibration settings in any imaginable way, as long as the effective number of events in any region is the same for both classes. This enables various implementations, which I will explain in the following:
• It is possible to focus the calibration on specific regions of the samples. This can be arranged if the weights are varied by any focusing function $F_f$:

$$w_{T_0} = P(T_1|o_t)\,F_f$$
$$w_{T_1} = P(T_0|o_t)\,F_f$$

The effective number of events is still the same for the two classes, but transformed by the function $F_f$:

$$N_b(T) = w_T\,N(T) = \frac{N(T_0)\,N(T_1)}{N(T_1)+N(T_0)}\,F_f$$

If the multivariate analysis technique with a correct error propagation works without a preprocessing which can cause binning effects, nothing should change. In most cases, e.g. when using probability integral transformations or a binned fit of a regularization function, the application of the focusing function breaks up the binning. It is therefore possible to see the structure of the phase space on a sub-bin level. This way information lost during preprocessing is recovered and the classification is improved.
As an example, for b-jet tagging it is interesting to enable very pure b-jet samples. Choosing Ff=P(T10|ot) focuses the boost training on this so-called purity region.
• It is also possible to enlarge the number of events to be more precise for the next calibration. For this it is important to take the new statistics into account when calculating the probabilities from the NeuroBayes output.
• It is allowed to add new variables to the boost training or leave some out.
• Another application is to study properties of the events independent of some other variable x. With the weighting we can remove the dependencies on this variable x and do a new classification. The calibration of this kind of boost training gives us an estimate independent of the variable x. This feature is interesting for b-jet tagging efficiency measurements. With a boost training it is possible to create an uncorrelated partner for each existing b-jet tagger. This can be used for efficiency measurements on the data sample (see 4.3.9).

NeuroBayes has the ability to handle each event with a specific weight. That is why it is easy to implement such a boost. NeuroBayes also has an internal boost mode, where in a first step a zero iteration training is performed and in a second step it attempts to find second order correlations with an artificial neural network.

Weighting issue

NeuroBayes is indeed constructed for the use of weights. The idea is to have a correct error propagation included in the framework for the application of advanced algorithms as described in this section of the thesis. Unfortunately there is a bug in version 20101026. Phi-T claimed to fix this issue in the updated versions. The bug lies in a wrong uncertainty calculation when large weights are applied (figure 5.10). The effect of very large weights can be seen in the bins on the right. The purity distribution is affected by the large weight of single events. For the regularization a wrong estimate of the expected bin content is used.
This issue influences the result dramatically if very large weights are used. Therefore it is necessary for this thesis to avoid large weights.

5.2.3 sPlot
sPlot is an unfolding technique introduced in 2005 by Muriel Pivk [PL05]. The sPlot technique is a method where the so-called sPlot weights are applied to each event of a given sample. The weight is calculated corresponding to a target T. For any variable x not correlated to this weight, its distribution is transformed to the conditioned distribution of T:

$$\int dw_{sPlot}\;\mathrm{pdf}(x|w_{sPlot})\,w_{sPlot} = \mathrm{pdf}(x|T)$$

Figure 5.10: Effect of large weights. The bins on the right contain only one event with a large weight. The purity is affected by the target type of this single event. A more reasonable estimate of the purity would be around 0.5 with an adequate uncertainty.

The sPlot paper includes, among other things, the derivation of the sPlot weights. Because of the differing notation used in this thesis and analogies to a later method used for the b cross section, I will introduce the sPlot method in my own words.
I will explain the sPlot method for the special case of only two classes. Similar to the descriptions above, we have one class called target 0 ($T_0$) and the other called target 1 ($T_1$). The sPlot weights $w_{sPlot}$ are determined using the output values $o_t$ of a given NeuroBayes expert.
Looking at the inclusive distribution of $o_t$ for a given x we find:

$$\int do_t\,\mathrm{pdf}(o_t|x) = \int do_t \sum_T P(T|x)\,\mathrm{pdf}(o_t|T,x)$$

This equation is the usual basis for any inclusive study. We have two variables. It is possible to define an inclusive region of one variable x and take the other variable for studies of the target. For this we need external knowledge about $\mathrm{pdf}(o_t|T,x)$, e.g. templates from a Monte Carlo sample. If we do this in further inclusive regions, we can get a picture of how the first variable is related to the target:

$$\mathrm{pdf}(x|T) = \frac{P(T|x)\,\mathrm{pdf}(x)}{\int dx\,P(T|x)\,\mathrm{pdf}(x)}.$$

If $o_t$ is uncorrelated to the variable x for both targets T, we are able to determine the x distribution in a more advanced way. This is called the sPlot method. For this the following requirement

$$\mathrm{pdf}(o_t|T,x) = \mathrm{pdf}(o_t|T)$$

must be fulfilled: there is no correlation of the two variables for the different targets T. This brings us the following simplification:

$$\int do_t\,\mathrm{pdf}(o_t|x) = \int do_t\,\left[P(T_0|x)\,\mathrm{pdf}(o_t|T_0) + P(T_1|x)\,\mathrm{pdf}(o_t|T_1)\right].$$

A simple trick allows us to calculate the distributions P(T|x). If we weight each event in $o_t$ with the probabilities $P(T_0)/P(T_0|o_t)$ or $P(T_1)/P(T_1|o_t)$, we get two similar equations:

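$$\int do_t\,\mathrm{pdf}(o_t|x)\,\frac{P(T)}{P(T|o_t)} = \sum_{T'} \mathrm{pdf}(T'|x)\int do_t\,\mathrm{pdf}(o_t|T')\,\frac{P(T)}{P(T|o_t)}.$$

After applying Bayes' theorem we can write this in matrix notation:

$$\begin{pmatrix} \int do_t\,\mathrm{pdf}(o_t|x)\,\frac{P(T_1)}{P(T_1|o_t)} \\ \int do_t\,\mathrm{pdf}(o_t|x)\,\frac{P(T_0)}{P(T_0|o_t)} \end{pmatrix} = V^{-1} \begin{pmatrix} \frac{\mathrm{pdf}(T_0|x)}{P(T_0)} \\ \frac{\mathrm{pdf}(T_1|x)}{P(T_1)} \end{pmatrix}$$

where $V^{-1}$ is the matrix:

$$V^{-1} = \begin{pmatrix} \frac{1}{N}\sum \Lambda_{t_1}^{-1} & 1 \\ 1 & \frac{1}{N}\sum \Lambda_{t_1} \end{pmatrix}$$

The integration over $o_t$ is replaced by the sum over the finite number of events from the sample. $\Lambda_{t_1}$ is the likelihood ratio as defined in section 5.2.1 for the first case: $\Lambda_{t_1} = \frac{\mathrm{pdf}(o_t|T_1)}{\mathrm{pdf}(o_t|T_0)}$. The integration of the normalized distribution $\mathrm{pdf}(o_t)$ in the off-diagonal matrix elements is one. After the determination of $\Lambda_{t_1}$ with a NeuroBayes training we are able to calculate the matrix V, which is the inverse of $V^{-1}$.
For $\mathrm{pdf}(x|T_0)$ and $\mathrm{pdf}(x|T_1)$ we finally get:

$$\mathrm{pdf}(x|T_1) = \mathrm{pdf}(x)\int do_t\,\mathrm{pdf}(o_t|x)\,\underbrace{\frac{P(T_0)}{P(T_0|o_t)}\left(V_{T_1,T_1} + \Lambda_{t_1} V_{T_1,T_0}\right)}_{w_{sPlot}(T_1)}$$

$$\mathrm{pdf}(x|T_0) = \mathrm{pdf}(x)\int do_t\,\mathrm{pdf}(o_t|x)\,\underbrace{\frac{P(T_0)}{P(T_0|o_t)}\left(V_{T_0,T_1} + \Lambda_{t_1} V_{T_0,T_0}\right)}_{w_{sPlot}(T_0)}$$

Here we can define the sPlot weights as requested in the beginning.
The sPlot weights $w_{sPlot}$ have further properties shown in [PL05]. The sum of the signal and the background weight is given by $w_{sPlot}(T_0) + w_{sPlot}(T_1) = 1$. The weights are not limited to the interval (0,1). It is also possible to get weights smaller than zero and larger than one. The statistical uncertainties can be calculated from the sum of the squared weights ($\sigma_{sPlot} = \sqrt{\sum w_{sPlot}^2}$).
The sPlot method is a nice feature to extract the signal and background shapes of variables where the particular distributions are unknown. It is possible to get these distributions by running over the whole data sample. The uncertainty depends on the amount of statistics which is available for signal and background.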
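As a numerical illustration of the two properties quoted above, the following sketch computes sPlot weights in the standard form of [PL05], $w_n = \sum_j V_{nj} f_j / \sum_k N_k f_k$ with $V^{-1}_{nj} = \sum_e f_n f_j / (\sum_k N_k f_k)^2$. This notation differs from the one used in this thesis. The binned templates, the yields and the function name `splot_weights` are assumptions of the sketch, and the observed counts are set exactly to their expectation so that the yields coincide with the maximum likelihood estimate.

```python
import numpy as np

def splot_weights(templates, yields, counts):
    """sPlot weights per class and bin of the discriminating variable.

    templates: (n_classes, n_bins) normalised distributions f_k of the output
    yields:    (n_classes,) fitted event numbers N_k
    counts:    (n_bins,) observed number of events per bin
    """
    f = np.asarray(templates, dtype=float)
    yields = np.asarray(yields, dtype=float)
    counts = np.asarray(counts, dtype=float)
    denom = yields @ f                        # sum_k N_k f_k for each bin
    v_inv = (f * counts / denom**2) @ f.T     # inverse covariance matrix of the yields
    v = np.linalg.inv(v_inv)
    return (v @ f) / denom

# Hypothetical templates of the NeuroBayes output for target 1 and target 0.
templates = np.array([[0.1, 0.2, 0.3, 0.4],   # target 1 (signal)
                      [0.4, 0.3, 0.2, 0.1]])  # target 0 (background)
yields = np.array([200.0, 800.0])
counts = yields @ templates                   # idealised sample without fluctuations
w = splot_weights(templates, yields, counts)
```

The weights of the two classes add up to one in every bin, and the weighted sum over all events reproduces the yield of each class. Individual weights can be negative, as stated in the text.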

5.3 NeuroBayes b-jet tagger
In this section I will present two new methods for discriminating b-jets from non-b-jets. Both methods make use of the multivariate analysis framework NeuroBayes. For the first tagger the NeuroBayes expert is calibrated using a sample of simulated b-jets (signal target, $T_1$) and non-b-jets (background target, $T_0$). The other is calibrated using the real data sample as background target.
First I will explain the needed input variables and their quality for b-jet tagging. In the second part I explain the differences between the two b-jet taggers and show how they perform.

category         input variables of track objects
four vector      momentum p
                 pseudorapidity η
primary vertex   significance of the two dimensional signed impact parameter
                 significance of the three dimensional signed impact parameter
                 two dimensional signed impact parameter
                 three dimensional signed impact parameter
                 track decay length
jet position     track transverse momentum, relative to the jet axis
                 track parallel momentum, along the jet axis
                 ΔR of the track to the jet axis
                 minimum track approach distance to jet axis
jet energy       transverse momentum, relative to the jet axis, normalized to its energy
                 parallel momentum, along the jet axis, normalized to its energy
quality          χ2 value of the track fit [SAF+06]
                 number of hits in the pixel detector
                 number of hits in all tracking detectors
b hadron         distance to reconstructed b hadron axis
                 significance of distance to reconstructed b hadron axis
                 track weight for b hadron reconstruction

Table 5.1: Input variables of the track objects

5.3.1 b-jet tagging variables
For the two NeuroBayes b-jet taggers I decided to develop a framework of conditional NeuroBayes experts. To achieve the best performance I use all available information concerning b-jets. This includes lifetime information as well as lepton information. For this it is necessary to match different objects to the jets. These objects are the tracks from the jet, secondary vertices, which are reconstructed from tracks, and electron and muon tracks (see 4.3). The available input variables are listed in tables 5.1-5.5. Properties which correspond to the physical quantities as well as the quality of the objects are stored in the input variables.
All of these variables have to be compared to data. Only input variables which agree with data are used for the NeuroBayes b-jet tagger. All of these objects will be used by NeuroBayes to decide how b-like a jet is.

Tracks

Details of the track reconstruction can be found in section 4.3.1. From the four-momentum vector the transverse momentum and the pseudorapidity are extracted. Further geometrical properties of the track relative to the primary vertex and the jet are used, e.g. the impact parameter and its significance of being inconsistent with the primary vertex. The ratio of the sum of the track momenta to the jet energy measured in the calorimeter is calculated, as well as properties which describe the track kinematics relative to the jet. Finally some quality variables, the fit parameters and the number of hits in tracking detector components, are used (table 5.1).

Muons

Details of the muon reconstruction can be found in section 4.3.5. The muon input parameter list is partially the same as for the general track objects. There are two new variables which depend on the jet energy. The momentum of the muon track is boosted into the jet rest frame. This quantity, and the same quantity normalized by the jet energy, are taken as additional input variables. The quality of the muon track is described by the χ2 value of the track fit. Neither information about the detector components nor muon identification variables are used. The full list can be seen in table 5.2.
The muon objects are very pure. The fractions of misidentified pions, kaons and protons are 0.26%, 0.3% and 0.05% [CMS10k].

category         input variables of muon candidates
four vector      momentum p
                 pseudorapidity η
                 φ angle
primary vertex   significance of the two dimensional signed impact parameter
                 significance of the three dimensional signed impact parameter
jet position     track transverse momentum, relative to the jet axis
                 track pseudorapidity, relative to the jet axis
                 ΔR of the track to the jet axis
jet energy       track momentum along the jet axis, in the jet rest frame
                 same, normalized to jet energy
quality          χ2 value of the track fit [SAF+06]

Table 5.2: Input variables of the muon objects

Electron candidates

Details of the electron reconstruction can be found in section 4.3.4. The same variables as for the muons are used to describe the electron candidates. Electrons are difficult to identify. Therefore some variables which deliver information about the electron likeliness of the candidates are added. Since electrons have a small mass, variables which provide information on possible bremsstrahlung are created. The output of a classifier which separates electrons from pions is also used (table 5.3).

Secondary vertices

Details of the secondary vertex reconstruction can be found in section 4.3.3. In addition to the real reconstructed secondary vertices, the secondary vertex objects in this analysis contain so-called pseudo vertices. The pseudo vertices are the sum of the four vectors of tracks displaced from the primary vertex which do not fulfill the secondary vertex requirements. The variables formed from the properties of these objects are listed in table 5.4. For the real secondary vertices, the distance of the vertex position to the primary vertex is calculated in addition.

Jets

Details of the jet reconstruction can be found in section 4.3.6. For the jet classification the mean values of the NeuroBayes output from the classifications of the individual objects corresponding to the jet are taken as additional input variables. Further we have the corrected four vector of the jet and the discriminating variables of all existing b-jet taggers (table 5.5).

5.3.2 NeuroBayes MC tagger (NBMC)
MC training is the common way in which NeuroBayes (see also 5.1) is used. For calibrating the NeuroBayes expert two simulated samples are needed: one sample for the signal target S and one sample for the background target B. The calibration procedure is often called training. A fully calibrated NeuroBayes expert is able to discriminate events with signal target from events with background target. This expertise is applied to the data sample. Thus each jet is assigned a transformed NeuroBayes output value $o_t$ in the interval (0,1). Small values of $o_t$ represent background-like events, larger values stand for more signal-like events.

category         input variables of electron candidate
four vector      momentum p
                 pseudorapidity η
                 φ angle
primary vertex   significance of the two dimensional signed impact parameter
                 significance of the three dimensional signed impact parameter
jet position     track transverse momentum, relative to the jet axis
                 track pseudorapidity, relative to the jet axis
                 ΔR of the track to the jet axis
jet energy       track momentum along the jet axis, in the jet rest frame
                 same, normalized to jet energy
quality          χ2 value of the track fit [SAF+06]
                 output of an MVA electron/pion classifier [CMS10j]
                 position of first hit in z direction
                 position of first hit in radial direction
                 inverse ΔR of first and last hit of the track
                 ΔR of electron candidate and Gaussian sum filter track
                 inverted energy of bremsstrahlung
                 energy loss before calorimeter

Table 5.3: Input variables of the electron candidate objects

category         input variables of secondary vertex (SV)
four vector      mass of track sum at secondary vertex
primary vertex   2D distance of the SV to the primary vertex
                 significance of 2D distance of the SV to the primary vertex
                 3D distance of the SV to the primary vertex
                 significance of 3D distance of the SV to the primary vertex
jet              ΔR of the SV to the jet axis
                 ratio of energy at secondary vertex over total energy
track            number of tracks connected to the vertex
                 ΔR of the SV to the track sum
                 ratio of energy at secondary vertex over track sum
quality          category of secondary vertex (Reco, Pseudo, No)

Table 5.4: Input variables of the secondary vertex objects

category         input variables of jets
four vector      corrected transverse momentum pT
                 bare jet energy
                 pseudorapidity η
                 φ angle
b tag            combined SV tagger
                 jet B probability tagger
                 simple SV tagger
                 simple SV high purity tagger
                 soft muon impact parameter tagger
                 soft muon transverse momentum tagger
                 track counting high efficiency tagger
                 track counting high purity tagger
objects          number of track objects
                 number of secondary vertex objects
                 number of electron candidates
                 number of muon candidates

Table 5.5: Input variables of the jet objects
For the NeuroBayes b-jet tagger a multi-level architecture was designed. It is shown in figure 5.11. Calibrations of NeuroBayes experts on each of the five physical objects (tracks, secondary vertices, electron candidates, muons and jets) are needed. We get an estimate for each object of how likely it is to be part of a b-jet. This information is collected for each jet. For the final NeuroBayes calibrations on jet level new input variables were defined. These were constructed out of the output values of the object-level experts.
The jet level consists of two steps. It is easy to achieve a good separation between b-jets and non-b-jets because of the lifetime of the b-hadron. This is done by a first NeuroBayes calibration. To be more effective for the jets which are difficult to separate, a boost training was performed in addition. For this the number of target 0 events was increased and the events were weighted by the procedure introduced in 5.2.2.
For all calibrations NeuroBayes is set up with default parameters. The number of hidden nodes is the number of input nodes minus one. Each input node is fed by one of the input variables. For all calibrations of the experts NeuroBayes is used in classification mode with the global preprocessing flag 422, which represents preprocessing and zero iteration training. The result is boosted by an internal artificial neural network training. The maximal number of iterations for the neural network is 100. Further, each variable needs at least 2σ significance to be used for the classification.
The decision to use the internal boost mode is basically an aesthetic rather than a technical one. In most cases the weights in the neural network converge to zero and the training is stopped before 100 iterations. This means that it is not possible to enhance the result found by the zero iteration training; the runtime is enlarged without a qualitative improvement. Nevertheless the output distribution is slightly different, not in separation power, but in shape. Figure 5.12 shows two output distributions on the same target for the different setup modes. The runtime in zero iteration mode was only 35.43 s, in contrast to the internal boost mode, which took 290.47 s.

Figure 5.11: Architecture of the NeuroBayes b-jet tagger. Each box stands for a single NeuroBayes calibration. The arrows point to where the result of the expert is used. The colors distinguish the different objects.

[Figure 5.12 plots: CMS private work 2010, √s = 7 TeV, NeuroBayes Teacher. Both panels: number of events per 0.01 versus NeuroBayes output for target 0 and target 1.]

Figure 5.12: Left: the output distribution of the zero iteration mode. The training results in a Gini index of 19.9, calculated in 35.43 s. A classification with the same setup parameters plus an additional internal boost results in an output distribution as shown on the right. The Gini index is also 19.9, but the shape is much smoother, at the cost of a much longer runtime of 290.47 s. The explanation of this effect is a different setting for the monotonic spline fit: NeuroBayes setup parameter DIAG instead of DIA2 (see 5.1.3).


The explanation of this effect is a different value of the parameter which controls the curvature in the monotonic spline fit for the diagonalization (see 5.1.3). Instead of the common NeuroBayes shape parameter DIAG, the alternative DIA2 is used. The results of both setup modes are equivalent discriminators.
In this thesis the output values are used for another NeuroBayes training. A structure similar to many delta functions can affect the preprocessing of the further NeuroBayes expert calibration. Therefore I decided to use the smooth diagonalization mode (DIA2). Further, I accepted the enlarged, but still small, runtime of the internal boost to get the best possible result.
For the calibrations of the experts the Pythia6 QCD Tune Z2 samples are used. As mentioned above, the samples must be weighted by $w^{(sample)}$ to get a smooth, realistic $p_{T,jet}$ spectrum (see 4.4). The available statistics are more or less flat in $\log_{10}(p_{T,jet})$ (see also figure 4.13). This leads to large weights, which are problematic in the recent NeuroBayes version. To avoid weighting effects, the spectrum is transformed once more to a flat distribution. The $p_{T,jet}$ spectrum plotted in double logarithmic scale can be fitted by a third-order polynomial with sufficient accuracy in the range 37 GeV < $p_{T,jet}$ < 1000 GeV:

$$f(p_{T,jet}) = \exp\left(a_0 + a_1 \log_{10}(p_{T,jet}) + a_2 \log_{10}(p_{T,jet})^2 + a_3 \log_{10}(p_{T,jet})^3\right)$$

The parameters are determined as follows:

$$a_0 = 39.86 \pm 0.89,\quad a_1 = -24.79 \pm 0.50,\quad a_2 = 8.39 \pm 0.17,\quad a_3 = -1.582 \pm 0.048$$

Figure 5.13 shows the fitted spectrum. The final weights are calculated from the Monte Carlo weights $w^{(sample)}$ multiplied with the weight extracted from the fit:

$$w_{final} = \alpha \cdot \frac{w^{(sample)}}{f(p_{T,jet})}$$

The constant factor α is chosen such that the sum of all weights corresponds to the amount of statistics.
After this preprocessing the input variables are used for the various calibrations of NeuroBayes experts. The amount of statistics for the training is kept as small as possible to avoid time-consuming disk access operations. A main feature of NeuroBayes is that good results for a discriminating output variable can already be achieved with a relatively small number of training events. Further, the calibration itself is very fast compared to other advanced multivariate analysis methods like boosted decision trees or artificial neural networks. A detailed study on this can be found in [Mar10]. Table 5.6 shows the number of events used. The runtime of the NeuroBayes training is also listed. Long runtimes occur when the internal boost is able to improve the zero iteration result.
A list of the relevant input variables and the NeuroBayes output distributions are shown on the next pages. A complete list is attached in appendix A.
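The flattening of the jet transverse momentum spectrum described above can be sketched as follows. The sketch uses the central values of the fit parameters quoted in the text and assumes that the final weight is the sample weight divided by the fitted spectrum, with α fixed so that the weights sum to the number of events. The function names and the example jets are hypothetical.

```python
import numpy as np

# Central values of the fit parameters quoted in the text.
A0, A1, A2, A3 = 39.86, -24.79, 8.39, -1.582

def spectrum_fit(pt_jet):
    """Fitted spectrum f(pT) = exp(a0 + a1 x + a2 x^2 + a3 x^3) with x = log10(pT / GeV)."""
    x = np.log10(np.asarray(pt_jet, dtype=float))
    return np.exp(A0 + A1 * x + A2 * x**2 + A3 * x**3)

def flattening_weights(pt_jet, w_sample):
    """Final weights alpha * w_sample / f(pT), normalised to the number of events."""
    raw = np.asarray(w_sample, dtype=float) / spectrum_fit(pt_jet)
    alpha = raw.size / raw.sum()
    return alpha * raw

# Hypothetical jets whose sample weights follow the fitted spectrum exactly.
pt_jet = np.array([50.0, 100.0, 300.0, 800.0])   # GeV, inside the fitted range
w_sample = 3.0 * spectrum_fit(pt_jet)
w_final = flattening_weights(pt_jet, w_sample)
```

In this idealised case every jet ends up with a weight of one, which is the purpose of the transformation: large event weights are avoided while the sum of weights still matches the number of events.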

Track training

In table 5.7 the input variables of the track calibration are listed. They are sorted by their relevance for the classification.
The most important variable is the three dimensional significance of the impact parameter of the track (figure 5.14). The plots show the variable first in the usual representation in logarithmic scale with equidistant bins and second with a binning calculated by the probability integral transformation (see 5.1.2). In both, the distributions for tracks from b-jets and non-b-jets are plotted. It is easy to see that a good separation between signal and background can be established with this input.

[Figure 5.13 plots: CMS simulation, √s = 7 TeV. Number of jets per bin versus pT. Fit result: χ2/ndf = 0.01463/133, p0 = 39.86 ± 0.89, p1 = -24.79 ± 0.50, p2 = 8.39 ± 0.17, p3 = -1.582 ± 0.048.]

Figure 5.13: To transform the $p_{T,jet}$ spectrum into a flat distribution, it was fitted by a polynomial function at double logarithmic scale. The shown spectrum is calculated from the available MC events. Therefore different samples must be weighted by a specific value (see 4.4). This causes the binning effects on the right.

calibration   #t0 events   #t1 events   target 1 fraction   number of input vars   runtime
track         184541       225460       50.0%               19                     336.35 s
vertex        362611       448418       51.1%               11                     1969.51 s
electron      145278       204932       53.3%               16                     1396.76 s
muon          253361       345744       48.3%               11                     1335.45 s
jet           192631       217312       48.2%               10                     974.31 s
boost         138772       433023       49.0%               10                     167.72 s

Table 5.6: Runtime of the different NeuroBayes calibrations. For the different trainings the number of events and the target 1 fraction are listed. Because of the weighting of the events, this fraction does not correspond to the value expected from the numbers of events. The last column shows how long it takes to obtain the calibration of the NeuroBayes expert.

name              added significance   only this   loss, when removed   correlation to others
trackSip3dSig     102.89               102.89      13.67                97.8%
trackEta          13.88                13.54       12.88                22.5%
trackBdistSig     8.81                 53.16       6.01                 98.4%
trackSip3d        7.71                 98.48       7.23                 97.4%
trackJetDist      9.85                 67.93       7.24                 82.9%
trackMom          7.70                 10.32       6.86                 24.9%
trackJetDeltaR    2.49                 8.05        7.21                 83.8%
trackPtRelFrac    5.71                 8.73        7.28                 85.2%
trackLxy          5.67                 81.42       5.41                 84.0%
trackChi2         4.10                 2.71        4.27                 5.8%
trackBDist        3.75                 43.07       3.76                 94.6%
trackHits         3.66                 3.24        3.41                 18.4%
trackPxHits       2.03                 5.82        1.99                 20.5%
trackBweight      1.82                 53.53       1.83                 97.8%
trackSip2dSig     1.64                 92.41       0.56                 97.0%
trackSip2d        0.39                 86.67       0.39                 96.3%
trackPparFrac     0.00                 8.73        0.00                 100.0%

Table 5.7: Input variables of the track object NeuroBayes classification. Only variables with more than 2σ significance are used by NeuroBayes. The table also shows information about the classification power of each variable on its own and how important it is that this variable is used. The last column shows the correlation to the other variables.
[Figure 5.14 plots: CMS simulation, √s = 7 TeV; pythia 6 QCD b-jet and non-b-jet. Left: N/N_total per bin versus signed IP3d/σIP. Right: number of events in flat bins versus signed IP3d/σIP.]
Figure 5.14: Track object training: The most important input variable is the significance of the
signed impact parameter. The distribution of this variable is shown as a classical histogram with
equidistant bins on a logarithmic scale (left) and in the probability integral transform (right). The
b-jet tracks are plotted in red. The differences to the non-b-jet tracks (black) are clearly visible.

[Figure 5.15 plots: √s = 7 TeV. Left: number of tracks per 0.01 versus NeuroBayes output. Right: number of vertices per 0.01 versus NeuroBayes output.]

Figure 5.15: NeuroBayes output of the track and the vertex classification. Objects coming from
a b-jet are shown in red, objects coming from other jets in black.

With this variable included, the other variables act more or less as corrections to this powerful one.
Variables which are highly correlated with it are ranked down. For example, the two-dimensional
significance of the impact parameter is sorted out by NeuroBayes because of the small additional
information which is left after the decorrelation. The momentum of the track parallel to the
jet axis, normalized by the jet energy, is even 100% correlated. This means that NeuroBayes is
able to reconstruct this variable from the other variables.
Figure 5.15 shows on the left the output distributions of the track classification. Each event in this
plot corresponds to one track. The two classes, tracks from a b-jet (red) and tracks from other jets
(black), show the expected behavior. There is a good separation between signal and background.
We see an interesting double-peak structure in the red curve. It results from the
fragmentation of the b-jet. The tracks in the right peak primarily correspond to tracks from the b
hadron. The tracks in the left peak are mainly pions from the hadronization process. There are also
large output values. This shows that there are tracks which are very specific to b decays. On
the other hand, there are no tracks in the first 20% of the output interval. This tells
us that all kinds of tracks appear in b-jets. No single track can be excluded from stemming from a
b-jet.
This behavior is used to construct an additional input variable for the final b-jet tagger. Similar
to the number of tracks corresponding to a secondary vertex, here the tracks which
correspond to the b hadron candidate Hb, and which should appear in the right peak, are counted.
This is implemented by integrating the NeuroBayes output ot of the track expert starting from a
specific threshold ot > 0.5 for each jet.
\[ N_{\mathrm{track}}(H_b) = N_{\mathrm{track}}(\mathrm{jet}) \int_{o_t = 0.5}^{1} \mathrm{d}o_t(\mathrm{track}) \, \mathrm{pdf}(o_t) \]
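For a single jet the integral above reduces to a simple count, which the following sketch illustrates. It assumes that pdf(ot) is the empirical, normalized distribution of the track expert outputs within the jet and that the outputs are scaled to the interval [0, 1]. The function name is hypothetical.

```python
def count_b_hadron_track_candidates(track_outputs, threshold=0.5):
    """Number of tracks in a jet whose expert output exceeds the threshold.

    With an empirical per-jet pdf, N_track(jet) times the integral of pdf(o_t)
    above the threshold is just the number of tracks above the threshold.
    """
    return sum(1 for o in track_outputs if o > threshold)


# Hypothetical expert outputs for the tracks of one jet.
jet_track_outputs = [0.12, 0.55, 0.91, 0.30, 0.78]
n_good_tracks = count_b_hadron_track_candidates(jet_track_outputs)
```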
Vertex training. In table 5.8 the input variables of the vertex calibration are listed. The
output of NeuroBayes is plotted in figure 5.15 on the right.
The most important variable is the number of tracks which are connected to the secondary vertex.
This is caused by the large mass of the b hadron. The correlation of 77% to other variables,
especially the secondary vertex mass, confirms this statement. Further information contained in
this variable is the fact that the b hadron needs at least one additional weak decay compared to
the lighter quarks before it results in stable particles. This leads to an increased number of tracks

name                added significance   only this   loss when removed   correlation to others
vertexNtracks       125.22               125.22      32.57               77.0%
vertexJetEFrac      63.54                122.79      35.01               72.2%
vertexPVSig2d       40.39                90.93       32.84               83.0%
vertexMass          28.87                112.81      25.16               78.7%
vertexTrackEFrac    18.65                11.99       17.19               35.6%
vertexPVDist3d      10.09                33.78       5.46                95.5%
vertexJetDeltaR     6.80                 44.66       8.17                59.3%
vertexTrackDeltaR   5.57                 8.39        5.57                40.9%
vertexCategory      0.41                 66.40       0.39                69.1%
vertexPVDist2d      0.12                 37.39       0.12                96.1%
vertexPVSig3d       0.00                 91.15       0.00                100.0%
Table 5.8: Input variables of the vertex object classification. The columns show the relevance of
each variable.

[Figure 5.16 plot: CMS simulation, √s = 7 TeV; pythia6 QCD b-jet and non-b-jet. Number of events in flat bins versus N_track(SV).]

Figure 5.16: Secondary vertex object training: The most important input variable is the number of
tracks connected to a reconstructed secondary vertex. Its distribution is shown for vertices
in b-jets (red) and in non-b-jets (black). The bin with value zero contains only pseudo vertices.

connected to the secondary vertex. Figure 5.16 shows the distribution of this variable. The bin
with no tracks corresponds to pseudo vertices. If no secondary vertex is reconstructed, at least two
tracks with large impact parameter are combined into this kind of object.
Other important variables are the vertex mass itself and the vertex energy compared to the jet
energy. Both quantify the mass of the b hadron. Furthermore, the information on the lifetime of
the b hadron is covered by the significance, which measures how likely it is to have a secondary
vertex away from the primary vertex (vertexPVSig2d).
All this information leads to a very good classification of whether the reconstructed secondary
vertex is part of a b-jet. Many of the secondary vertices are classified with almost 100% and appear
in the last bin of the output distribution. The existence of a reconstructed secondary vertex is
therefore already a good b-jet tagger.
Electron training. In table 5.9 the input variables of the NeuroBayes electron calibration are
listed.

name             added significance   only this   loss when removed   correlation to others
eleSip3dSig      92.62                92.62       60.82               83.7%
elePtRel         19.91                18.71       12.08               66.9%
eleSip2dSig      13.09                69.81       12.82               83.9%
eleZpos          9.57                 9.35        6.39                39.1%
eleInvDeltaR     8.75                 6.12        6.58                28.7%
eleId            8.42                 7.23        5.70                26.9%
eleMom           8.07                 20.74       6.57                45.3%
eleChi2          6.78                 5.10        7.15                17.0%
eleEta           6.76                 17.79       5.83                68.8%
eleJetDeltaR     4.55                 13.17       1.33                95.1%
eleBrem          4.40                 10.43       4.63                44.9%
eleGSFDif        4.66                 4.06        4.51                44.3%
elePhi           4.07                 2.45        4.08                5.0%
eleJetPparFrac   0.98                 5.13        1.14                69.7%
eleEtaRel        0.76                 16.45       0.76                95.8%
eleJetPpar       0.00                 18.71       0.00                100.0%
Table 5.9: Input variables of the electron candidate object classification. The columns show the
relevance of each variable.
The electron candidates are more or less a subgroup of the track objects. So it is not surprising that
the three-dimensional significance of the signed impact parameter again contains the most important
information for the classification. The same arguments based on the lifetime of the b hadron apply
here. Again the other variables are more or less corrections to this powerful one.
But there is another interesting variable, which is more relevant for distinguishing electron
candidates from b-jets. This is the transverse momentum of the electron candidate relative to the
jet axis, pT,rel. Because of the leptonic decay of the b hadron into an electron, it is possible that
the electron carries much of the momentum of its mother particle. This leads to larger relative
momenta pT,rel. The distribution of this variable is shown in figure 5.17.
Figure 5.18 shows the output distributions of the electron classification on the left. Each event
in this plot corresponds to one electron candidate. The two classes, electron candidates from a
b-jet (red) and from another jet (black), show the expected behavior. There is a good separation
between signal and background.
We also expect two peaks in the signal distribution, as seen for the tracks. There are two effects
which shape the output distribution.
First, the electron reconstruction is difficult because of the multiplicity of tracks. Many of
these tracks correspond to pions. To reduce the contribution of misidentified particles, a selection
based on a good electron identification is needed. For this, additional electron quality variables
are used. Nevertheless, non-electron particles remain and form the peaking structure on the left.
The other effect is due to real electrons not coming from b hadrons. Because of the material in
the tracking detector, photons can create electron/positron pairs. These particles are part of the
jets and occur in b-jets and non-b-jets at rather large impact parameter values. Compared to the
track object NeuroBayes output distribution, the expected peak on the right is reduced.
The events in the left peak again correspond to tracks not coming from the b hadron.
Finally an additional input variable is constructed for the final b-jet tagger. This is implemented
by integrating the NeuroBayes output of the electron expert starting from a specific threshold

[Figure 5.17 plot: CMS simulation, √s = 7 TeV; pythia6 QCD b-jet and non-b-jet. Number of events in flat bins versus pT,rel(electron).]
Figure 5.17: Electron candidate object training: Distribution of the transverse momentum of the
electron relative to the jet axis.
[Figure 5.18 plots: CMS private work 2010, √s = 7 TeV. Left: number of electron candidates per 0.01 versus NeuroBayes output. Right: number of muons per 0.01 versus NeuroBayes output.]
Figure 5.18: NeuroBayes output of the lepton classification. Left: electron candidates. Right:
muons. The b-jet objects are plotted in red.

name              added significance   only this   loss when removed   correlation to others
muonSip3dSig      92.21                92.21       42.00               86.3%
muonChi2          53.05                68.84       49.89               25.2%
muonPtRel         24.56                42.32       8.88                85.3%
muonJetPparFrac   16.41                31.97       11.95               85.8%
muonEta           10.70                14.69       9.05                19.4%
muonEtaRel        5.13                 20.04       6.24                95.2%
muonJetDeltaR     5.82                 18.98       5.67                94.7%
muonPhi           3.02                 3.11        3.01                1.4%
muonMom           2.83                 22.36       2.84                88.2%
muonSip2dSig      2.51                 76.31       2.51                85.9%
muonJetPpar       0.00                 42.32       0.00                100.0%
Table 5.10: Input variables of the muon candidate object classification. The columns show the
relevance of each variable.

ot > 0.5 for each jet.
\[ N_{\mathrm{electron}}(H_b) = N_{\mathrm{electron}}(\mathrm{jet}) \int_{o_t = 0.5}^{1} \mathrm{d}o_t(\mathrm{electron}) \, \mathrm{pdf}(o_t) \]
Muon training. In table 5.10 the input variables of the NeuroBayes muon calibration are listed.
The most important variable is again the significance of the impact parameter of the muon. The
NeuroBayes output distribution is shown in figure 5.18 on the right. In contrast to the electron
case, the muons are easy to detect. Because of the large muon system of CMS a detailed muon
identification is not needed. More problematic is the extrapolation of the muons into the tracking
detector and the mapping to a jet. Here the reconstruction quality becomes an important variable.
The transverse momentum of the muon relative to the jet axis is important again. The argument for
this is the same as for the electrons, because the muons originate mostly from the weak decay of
the b hadron.

Jet training. Finally, the input variables of table 5.11 are used for the NB MC b-jet tagger. The
input variables are constructed from the NeuroBayes outputs for the different objects. For each
jet, the number of objects of each type found in the jet is determined. For the tracks and the
electrons, the good candidates are counted in addition.
For each jet, the NeuroBayes outputs oi are combined. Under the assumption that each output
is an independent estimate of the probability to be part of a b-jet, it is possible to combine the
values by multiplying the likelihood ratios Λ(oi) = k · oi/(1 − oi), where k is the ratio of the two
targets used for the calibration.
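\[ \Lambda(\text{b-jet} \mid o) = \prod_{i}^{N} \Lambda(o_i) \]
Finally a jet probability P(b-jet|o) is defined:
\[ P(\text{b-jet} \mid o) = \frac{\Lambda(\text{b-jet} \mid o)}{1 + \Lambda(\text{b-jet} \mid o)} \]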
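The combination of the per-object outputs into a jet probability can be written down in a few lines. This is a minimal sketch of the two formulas above, not the code used for the analysis. It assumes outputs scaled to [0, 1]. The function names and the clipping constant `eps`, which only guards against a division by zero, are hypothetical.

```python
def likelihood_ratio(o, k=1.0, eps=1e-12):
    """Lambda(o) = k * o / (1 - o), with o clipped away from 0 and 1."""
    o = min(max(o, eps), 1.0 - eps)
    return k * o / (1.0 - o)


def jet_probability(outputs, k=1.0):
    """Multiply the likelihood ratios of all objects and map back to a probability."""
    lam = 1.0
    for o in outputs:
        lam *= likelihood_ratio(o, k)
    return lam / (1.0 + lam)


# Two objects with output 0.8 each: Lambda = 4 * 4 = 16, P = 16 / 17.
p_jet = jet_probability([0.8, 0.8])
```

An output of 0.5 has a likelihood ratio of one for k = 1 and therefore leaves the combined probability unchanged.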
The independence assumption is not entirely correct for objects like ours, because of correlations
between them. To get the probability right, corrections have to be applied. Nevertheless, without

name              added significance   only this   loss when removed   correlation to others
jetTrackProb      196.07               196.07      58.38               90.6%
jetNSV            34.82                168.53      36.38               77.0%
jetNMuon          18.74                45.35       19.11               16.5%
jetVertexProb     15.74                50.93       15.05               35.1%
jetElectronProb   8.23                 61.38       5.46                82.6%
jetNTrack         4.31                 32.04       5.64                72.4%
jetNGoodTrack     3.97                 163.61      4.09                91.3%
jetNEle           3.63                 39.69       3.64                18.9%
jetMuonProb       2.26                 26.08       2.11                30.8%
jetNGoodEle       0.68                 52.67       0.68                82.5%
Table 5.11: Input variables of the jet classification. The columns show the relevance of each
variable.

any correction the variables have good discrimination power and can be used for the construction
of a b-jet tagger.
In table 5.11 the input variables of the NeuroBayes jet calibration are listed.
The most important input variable is the probability estimate calculated from the track objects.
This is not surprising, because it contains almost the whole lifetime information of the b hadron
and can be calculated for each jet. The other variables are more or less corrections to this main
input variable. Furthermore, the existence of a secondary vertex or a muon is important information
for the identification of a b-jet. Finally, let us look at the number of good tracks. Compared
to the total number of tracks in the jet, the correlation to the target is strongly increased. The
significance is of the order of the significance of the number of secondary vertices. The statement
that we count tracks from b hadrons seems to be true.
The NeuroBayes output is shown in figure 5.19. The output is plotted with a logarithmic scale on
the y-axis. Because of the powerful separation between the two classes, most of the jets are in a few
bins at low and high values. This was achieved using around 400 000 events for the calibration of
the NeuroBayes expert. The powerful separation indicates how simple the construction of a b-tagger
is. Improving it is more difficult.

Boost training. To improve the performance we need a boost training. Using the simplest boost
weight causes some problems. Because of the already powerful separation, most of the events
would get a very small weight (see 5.2.2). This makes a further separation very difficult. The
effective statistics after the weighting are small. To take full advantage of a boost
training, the number of events for the calibration must increase.
First, the probability interpretation of the NeuroBayes output must be tested. The right plot in
figure 5.19 shows that this property holds. The calculated purity for each bin agrees with
the diagonal within the statistical uncertainties. The probability interpretation is correct for the
given binning.
For the external boost a NeuroBayes training with the same setup as the previous one was executed.
As mentioned above, there are different approaches to implementing such a boost training. In this
section I will focus on two of them. The first tries to improve the b-jet classification
by looking in more detail into the b-jet distributions. The weights are only applied to the background
sample. The other ansatz also weights the signal sample but balances the effective statistics
to be of the same order as the real statistics.
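The test of the probability interpretation mentioned above, the purity per output bin compared with the diagonal, can be sketched as follows. This is an illustration on toy data, not the analysis code. It assumes outputs scaled to [0, 1] and a simple binomial uncertainty on the purity. The function name is hypothetical.

```python
import math
import random


def purity_per_bin(outputs, is_signal, n_bins=10):
    """Return (bin centre, purity, binomial error) for every non-empty output bin."""
    n_sig = [0] * n_bins
    n_all = [0] * n_bins
    for o, s in zip(outputs, is_signal):
        i = min(int(o * n_bins), n_bins - 1)
        n_all[i] += 1
        n_sig[i] += 1 if s else 0
    points = []
    for i in range(n_bins):
        if n_all[i] == 0:
            continue
        p = n_sig[i] / n_all[i]
        err = math.sqrt(p * (1.0 - p) / n_all[i])
        points.append(((i + 0.5) / n_bins, p, err))
    return points


# Toy sample in which the output is a true probability by construction.
random.seed(3)
toy_outputs = [random.random() for _ in range(100000)]
toy_truth = [random.random() < o for o in toy_outputs]
diagonal_check = purity_per_bin(toy_outputs, toy_truth)
```

If the output can be interpreted as a probability, every purity value lies on the diagonal, that is at its bin centre, within the quoted uncertainty.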

[Figure 5.19 plots: √s = 7 TeV. Left: number of jets per 0.01 versus NeuroBayes output (logarithmic y-axis). Right: purity of pythia6 QCD b-jets versus NeuroBayes output, compared with the diagonal.]
Figure 5.19: NeuroBayes output of the jet classification (left). Red: the b-jets, black: other jets.
There are no jets in the first 5% of the output distribution. No cut was applied here. B-jets without
a lepton and with a small lifetime look like a light jet. In the right plot the purity of the b-jet
distribution is shown. If the purity of each bin matches the value of the NeuroBayes output variable,
the probability interpretation is fulfilled. This property is required for the boost training.
Purity tagger. For the first we use the focusing function
\[ F_f^{\mathrm{pur}} = \frac{1}{1 - P(S \mid o_t)} \]
Thereby the weights for the target 1 events are always wS = 1. Taking the changed statistics for
the boost training into account brings us the following weights, which have to be applied to the
target 0 events.
\[ w_B = \frac{P(S \mid o_t)}{1 - P(S \mid o_t)} = \Lambda_t \, \frac{N_b(S)}{N_b(B)} = \frac{o_t}{1 - o_t} \, \frac{N_t(B)}{N_t(S)} \, \frac{N_b(S)}{N_b(B)} \]
Nb(S) and Nb(B) denote the numbers of events used for the boost training, while the numbers of
events used in the first training are denoted Nt(S) and Nt(B).
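The purity-focused weights can be sketched as a small function. It is an illustration only. It assumes an output ot scaled to [0, 1] and strictly below one, and it takes the event counts of the first training and of the boost training as plain numbers. The function and argument names are hypothetical.

```python
def purity_boost_weight(o_t, is_signal, n_t_sig, n_t_bkg, n_b_sig, n_b_bkg):
    """Boost weight of one event for the purity-focused training.

    Signal (target 1) events keep weight 1. Background (target 0) events get
    w_B = o_t / (1 - o_t) * N_t(B) / N_t(S) * N_b(S) / N_b(B).
    """
    if is_signal:
        return 1.0
    likelihood_ratio = o_t / (1.0 - o_t) * n_t_bkg / n_t_sig
    return likelihood_ratio * n_b_sig / n_b_bkg


# Background event with output 0.9 and equal sample sizes: weight 9.
w_bkg = purity_boost_weight(0.9, False, 1000, 1000, 1000, 1000)
# Background event with output 0.1: weight 1/9, so it is strongly suppressed.
w_low = purity_boost_weight(0.1, False, 1000, 1000, 1000, 1000)
```

Background events that the first expert already rejects with a low output are therefore almost removed from the boost training, which concentrates on the signal-like region.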
As a control variable, the output distribution of the unboosted training is added. This variable
should not affect the boost training and should have no correlation to the target.
An overview of the input variables is given in table 5.12. More detailed information about the
vertex is now the most important. The b-jet probability estimated from the track information
is also still an important variable. In the first training, not all of the information it contains
could be extracted and used for the classification. Figure 5.20 shows the distribution of this
variable as used for the first training. The background (black, simulated non-b-jets) and signal
(red, simulated b-jets) are plotted separately, and the good separation can be seen. To avoid
overtraining, NeuroBayes reduces the dependency on statistical fluctuations. Instead of the red
curve, the regularized blue curve is taken for the calibration. This procedure can cause some
information loss, especially if the statistics are small in some regions of the variable.
Applying the weights calculated for the boost training, the black curve is transformed into one
similar to the blue one. Because the boost training should focus on the purity region, the
statistics were increased, and we get the distribution shown on the right for this variable. With
the enlarged statistics NeuroBayes is able to see the differences between the two shapes, which were
not apparent in the first training due to the low statistics of the background in this region.
One can observe that ot (binned) gives a relevant contribution to the boost classification. This
should not happen if the calibration of the first expert is perfect. Looking at the distribution shows
us the cause of the remaining correlation to the target after the weighting (figure 5.21).
The target dependency occurs for large values of the output distribution, at almost one. All these

name              added significance   only this   loss when removed   correlation to others
jetVertexProb     20.21                20.21       15.65               48.1%
jetTrackProb      12.15                17.79       11.24               50.8%
ot (binned)       11.85                4.78        6.75                64.1%
jetNTracks        7.49                 7.83        6.67                25.4%
jetElectronProb   4.24                 5.41        4.96                22.3%
jetNGoodEle       4.68                 3.42        3.94                39.0%
jetNMuon          3.08                 11.93       3.51                57.2%
jetNGoodTrack     2.57                 8.47        2.71                41.7%
jetMuonProb       2.54                 4.54        2.60                17.3%
jetNSV            2.02                 2.05        2.12                25.3%
jetNEle           2.00                 3.77        2.00                37.3%
Table 5.12: Input variables of the boost training for the jet classification. The columns show the
relevance of each variable.

[Figure 5.20 plots: CMS simulation, √s = 7 TeV; pythia6 QCD b-jet and non-b-jet, with the regularisation curve in the left panel. Number of events in flat bins versus P(b-jet|tracks).]

Figure 5.20: The plots show the distribution of the track probability, which is used as an input
variable, for the unboosted (left) and the boosted (right) NeuroBayes training. The increase in
statistics and the reweighting yield a gain of information that improves the b-jet tagger.

[Figure 5.21 plot: CMS simulation, √s = 7 TeV; pythia6 QCD b-jet and non-b-jet, with the regularisation curve. Number of events in flat bins versus ot.]
Figure 5.21: The plot shows the distribution of the NeuroBayes output of the main expert, which
is used as an input variable for the boosted NeuroBayes training. The probability interpretation
was tested on a well-defined binning. The weighting allows us to see the structure within the bins.
[Figure 5.22 plots: CMS simulation, √s = 7 TeV. Left: number of jets per 0.01 versus NeuroBayes output. Right: ot2(jet) versus ot1(jet), with the diagonal.]
Figure 5.22: NeuroBayes output of the boost training for the jet classification. The b-jets are
shown in red, other jets in black.
events belong to one bin of the diagonalization histogram. NeuroBayes is not able to resolve events
in such detail.
Another fact is more worrying. The weights are large in this region. As shown before, there is
a problem with the correct error propagation for events with large weights. This can affect the
regularization of this particular input variable. For the final b-jet tagger I left this variable out.
Figure 5.22 shows the output distribution of the boosted NeuroBayes MC training on the left.
Compared to the unboosted NB MC we see a smaller separation between the two classes. This is
expected, because the information already exploited in the first training is not included in the
second one. The main advantage is that the two classifications are less correlated with each other.
In figure 5.22 on the right a scatter plot of the two variables is shown.
To get a combined NeuroBayes b-jet tagger which uses the results of the two calibrations, the
likelihood ratios of the single trainings can be multiplied (see section 5.2.2). This results in a final
powerful discriminator to identify b-jets. From now on I will call this b-jet tagger the NeuroBayes
combined purity tagger (NBcombPur). The performance of this tagger will be shown after the

name              added significance   only this   loss when removed   correlation to others
jetNGoodEle       15.72                15.72       16.21               23.4%
jetTrackProb      13.11                11.78       11.82               16.8%
jetVertexProb     10.10                11.12       9.67                12.9%
jetNTracks        7.07                 9.67        6.28                24.3%
jetMuonProb       5.65                 6.07        5.83                10.0%
ot (binned)       4.16                 3.83        4.15                29.2%
jetElectronProb   3.88                 4.52        4.06                8.0%
jetNSV            3.07                 0.16        2.80                26.1%
jetNMuon          2.05                 2.36        2.04                9.2%
jetNGoodTrack     2.02                 1.41        1.86                18.0%
jetNEle           1.29                 1.96        1.29                14.9%
Table 5.13: Input variables of the boost training for the jet classification. The columns show the
relevance of each variable.

introduction of another tagger optimized for the efficiency region.

Efficiency tagger. In addition I want to construct another b-jet tagger, which performs better
in the efficiency region. The focusing function is chosen in a way that the effective number
of events is conserved over the spectrum of the NeuroBayes output distribution ot. Therefore this
variable was studied to create an additional weight factor dependent on its shape.
Normally the signal and background events are weighted with a specific weight: wS = P(B|ot) and
wB = P(S|ot). The effective number of events in ot is therefore reduced by α = P(B|ot) pdf(ot|S) +
P(S|ot) pdf(ot|B). To balance this we need the following focusing function:
\[ F_f^{\mathrm{eff}} = \frac{\mathrm{pdf}(o_t)}{P(B \mid o_t)\,\mathrm{pdf}(o_t \mid S) + P(S \mid o_t)\,\mathrm{pdf}(o_t \mid B)} = \frac{P(S)\,P(B)}{P(S \mid o_t)\,P(B \mid o_t)} \]
The application of the weights wT = (1 − P(T|ot)) Ff with the targets T = S, B does not change
the effective number of events.
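The effect of this focusing function can be checked numerically. The sketch below assumes that the posterior P(S|ot) and the prior P(S) are known numbers, with P(B) = 1 − P(S). The function name is hypothetical. With these weights the summed weight in a narrow slice of ot is 2 P(S) P(B) times the number of events in the slice, independent of ot, so the weighted spectrum keeps the shape of the unweighted one.

```python
def efficiency_boost_weight(p_sig_given_o, is_signal, prior_sig=0.5):
    """w_T = (1 - P(T|o_t)) * F_f with F_f = P(S) P(B) / (P(S|o_t) P(B|o_t))."""
    prior_bkg = 1.0 - prior_sig
    p_bkg_given_o = 1.0 - p_sig_given_o
    focus = prior_sig * prior_bkg / (p_sig_given_o * p_bkg_given_o)
    base = p_bkg_given_o if is_signal else p_sig_given_o
    return base * focus


def summed_weight_in_slice(n_events, p_sig_given_o, prior_sig=0.5):
    """Expected sum of weights for n_events jets sharing the same posterior."""
    n_sig = n_events * p_sig_given_o
    n_bkg = n_events * (1.0 - p_sig_given_o)
    return (n_sig * efficiency_boost_weight(p_sig_given_o, True, prior_sig)
            + n_bkg * efficiency_boost_weight(p_sig_given_o, False, prior_sig))


# The same summed weight is obtained in a background-like and a signal-like slice.
low_slice = summed_weight_in_slice(1000.0, 0.1)
high_slice = summed_weight_in_slice(1000.0, 0.9)
```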
A NeuroBayes expert calibration was performed. The overview of the input variables is shown in
table 5.13.
The ordering of the input variables by relevance changed when the other focusing function was applied.
In particular, the electron information is more important.
The variables are less correlated with each other, which indicates that the first calibration found
some of the dependencies between the variables. The output variable of the former training is also
less important compared to the boost in the purity region, where we had the issue with large weights.
The NeuroBayes calibration results in the output distribution plotted in figure 5.23. The separation
seems to be less developed than for the former boost training. This is deceptive because of the
different weighting. The two output distributions are not comparable.
For the final b-jet tagger a combination with ot must also be done. Again the likelihood ratios of
the two trainings are multiplied as described in section 5.2.2. From now on I will call this b-jet
tagger the NeuroBayes combined efficiency tagger (NBcombEff).
Having two kinds of boosted NeuroBayes b-jet tagger available, it is interesting to compare
them. This comparison is done on an independent sample. For this we must produce so-called
performance plots.

[Figure 5.23 plots: CMS simulation, √s = 7 TeV. Left: number of jets per 0.01 versus NeuroBayes output. Right: ot2(jet) versus ot1(jet), with the diagonal.]

Figure 5.23: NeuroBayes output of the alternative boost training for the jet classification. The
b-jets are shown in red, other jets in black.

At CMS, performance plots are used in which the mistag rate is plotted against the b-jet efficiency.
Further comparisons using the purity are also possible. The already existing b-jet taggers from
CMS are presented in 4.3.9. For the comparison with the NeuroBayes b-jet taggers only the most
discriminating of the existing b-jet taggers is taken: the combined secondary vertex b-jet tagger.
Figure 5.24 shows the performance plot for all NeuroBayes b-jet taggers: the unboosted, the boosted
and the combined.
In the plot we see the already shown performance of the combined SV b-jet tagger and the
NeuroBayes b-jet tagger we wanted to improve. The results of the two different boost calibrations are
also plotted, called boost and alternative boost. As expected, the boost on the purity region
performs worse over a wide range. Most of the events were weighted to a very small number, so that
improvements in this region are impossible. The alternative boost is more sensitive to jets in this
region. After the combination with the former NeuroBayes b-jet tagger, the performance is improved
in the different regions according to their assignment. In the efficiency region there is not much
room for improvement. Only a tiny shift is visible. This differs in the purity region. Both combined
taggers gain in this region. The purity tagger, which is optimized for this region, gives the strongest
enhancement. On this sample we found an almost exponential separation over the whole range.
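The points of such a performance plot can be computed with a simple threshold scan. The sketch below is not the CMS tooling. It assumes that a jet is tagged when its discriminator value exceeds the threshold. The function name and the toy discriminator values are hypothetical.

```python
def performance_curve(b_jet_scores, other_jet_scores, thresholds):
    """Return (b-jet efficiency, mistag rate) for every threshold."""
    points = []
    for t in thresholds:
        efficiency = sum(1 for s in b_jet_scores if s > t) / len(b_jet_scores)
        mistag = sum(1 for s in other_jet_scores if s > t) / len(other_jet_scores)
        points.append((efficiency, mistag))
    return points


# Hypothetical discriminator values on an independent test sample.
b_scores = [0.95, 0.90, 0.80, 0.40]
other_scores = [0.05, 0.10, 0.20, 0.60]
curve = performance_curve(b_scores, other_scores, [0.3, 0.5, 0.85])
```

A tagger performs better than another one if, at the same b-jet efficiency, its mistag rate is lower.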

This brings us to the conclusion for the MC-based b-jet tagger. The main advantage of an MC
training is that signal and background are well defined. Regardless of which ratio of signal to
background events is chosen, the NeuroBayes classification gives an output with the best
discrimination power for these two classes. But to apply the expertise to data, the simulation
of our two classes must be of good quality. For the b-jets we can be quite confident that the
simulations describe data well. Admittedly, the production mechanism, production rate and
fragmentation are not well understood, but we have a good understanding of the lifetime, the mass
and the leptonic branching ratio, and these are the pieces of information we use for b-jet tagging.
For the background sample it is more difficult. We have already seen that noise tracks are not
well simulated. Furthermore, we have the underlying event and pile-up, which are hard to simulate.
To get all background sources into good shape, much work must be done, if it is possible at all.

[Figure 5.24 plot: CMS simulation, √s = 7 TeV. Non-b-jet efficiency ε(non-b-jet) on a logarithmic scale versus b-jet efficiency ε(b-jet) for the taggers combinedSV, NB b-jet, boost, alternative boost, b-jet comb Eff and b-jet comb Pur.]

Figure 5.24: Performance of the NeuroBayes b-jet taggers. All new taggers are plotted together
with the best existing tagger from CMS.

5.3.3 Comparison data to MC

Having a good b-jet tagger on MC does not guarantee a good performance on data. All input
variables have to be checked for how they compare with data. If there is good agreement, it is more
likely that the b-jet tagger works on data.
This is therefore one of the main tasks for the commissioning of the CMS experiment and of b-jet
tagging. If there is good agreement between simulations and data, we know that the detector is
well understood and usable for physics studies. Otherwise we have to restrict the analysis
to the known regions and study the misunderstood areas for their issues.
It is possible to use NeuroBayes for this comparison of data and Monte Carlo simulations.
NeuroBayes is set up with default parameters. The number of hidden nodes is chosen to be the number
of input nodes minus one. Each input node is fed by one of the input variables. NeuroBayes is used
in classification mode with the global preprocessing flag 422, which represents preprocessing with
the internal boost training. The maximum number of iterations is set to 100. The BFGS algorithm
is not applied. Each variable needs at least 2σ significance to be used for the classification.
For such a study a NeuroBayes expert is calibrated with the sample of simulated events as
one target and the events of the data sample as the other target. So the output distribution
discriminates between simulation and reality. If data and simulation agree well, the NeuroBayes
output distribution should be compatible with statistical fluctuations around the a priori fraction
of the data sample and the simulated sample.
The output distributions for all NeuroBayes comparisons can be found in appendix B.
Besides testing the quality of the simulation, NeuroBayes provides the tools to identify the variables
which cause differences between the two samples. It comes with an automatic calculation of the
relevance of each variable for separating the two classes (see 5.1). In our case we are pointed directly
to the variable with the largest discrepancy between data and simulation.
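The idea of treating data and simulation as the two targets can be illustrated without NeuroBayes. The sketch below simply computes the linear correlation of one variable to a data/MC label. It is an assumption-laden stand-in and does not reproduce the relevance calculation of NeuroBayes. The function name and the Gaussian toy samples are hypothetical.

```python
import math
import random


def correlation_to_target(values, targets):
    """Pearson correlation between a variable and a 0/1 target (MC = 0, data = 1)."""
    n = len(values)
    mean_v = sum(values) / n
    mean_t = sum(targets) / n
    cov = sum((v - mean_v) * (t - mean_t) for v, t in zip(values, targets))
    var_v = sum((v - mean_v) ** 2 for v in values)
    var_t = sum((t - mean_t) ** 2 for t in targets)
    if var_v == 0.0 or var_t == 0.0:
        return 0.0
    return cov / math.sqrt(var_v * var_t)


random.seed(7)
mc = [random.gauss(0.0, 1.0) for _ in range(2000)]
data_well_modelled = [random.gauss(0.0, 1.0) for _ in range(2000)]
data_shifted = [random.gauss(0.5, 1.0) for _ in range(2000)]
labels = [0] * 2000 + [1] * 2000

# Close to zero when the simulation describes the data, clearly non-zero otherwise.
corr_good = correlation_to_target(mc + data_well_modelled, labels)
corr_bad = correlation_to_target(mc + data_shifted, labels)
```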

[Figure 5.25 plots: CMS private work, √s = 7 TeV; data 38X and pythia6 QCD. Number of events in flat bins versus muonChi2 (left), d(jet,track) (middle) and η of the jet (right).]
Figure 5.25: Left: Flat distribution given by the NeuroBayes comparison for the muon χ2. Middle:
Flat distribution of the distance of a track to the jet axis. For large distances, effects of noise tracks
can be seen. Right: Comparison of the η distributions of data and Pythia 6 QCD MC. In the
barrel region |η| < 1.5 a good agreement is observed. In the forward region large differences can
be seen.
For our analysis we can use this feature to test how well the simulation describes data. This study was
done for all input variables and for different triggers, because they correspond to different momentum
ranges. The result is shown in tables 5.14 and 5.15. The order of the variables is the same as in
the tables above, but an abbreviation is used for the description. In the following I will use these
abbreviations. The entries represent the correlation coefficients to the target.
Looking at the results of the comparison, one notices the following: larger values
represent larger differences between data and simulation. This happens mostly for the muon
candidates, which implies that muons in jets are not well simulated. The main difference appears in
the χ2 values of the muon track fit (figure 5.25). The exact cause could not be identified, because
the muon objects contain different types of muons (see section 4.3.5) and it is possible that the
difference is caused by one single type. For a final clarification we have to wait for an update from
the CMS muon physics object group (POG).
Furthermore, we see an effect of the jet energy correction. Before the correction the discrepancy is
larger than after (jetEnergyUCorr vs. jetPt). This also affects the variables calculated relative to
the jet energy, like the momenta of the tracks (PtRel and Ppar). If these quantities are normalized
to the jet energy, the fraction looks quite fine (trackPtRelFrac and trackPparFrac). So I decided to
keep only the normalized ones.
Another issue can be seen in the signed impact parameter variables. For the tracks the
two-dimensional rφ calculation is more reliable than the three-dimensional one. Because of the large
correlation between them, I decided to take only the two-dimensional ones.
The most important difference can be seen in the distance between track and jet axis (figure 5.25).
Unfortunately the variable is multiplied by a factor of −1. Nevertheless we can see that in data
tracks with a very large distance to the jet axis were found. This could be caused by noise tracks
which are attached to the jet object and not simulated in Monte Carlo. To reduce this noise I
applied a cut at d(jet,track) > −0.1.
The ΔR distribution of the vertex to the jet axis, as well as that of the sum of the jet tracks, is
also not well simulated.
As already seen in the corrected jet spectrum of the transverse momentum, there is a discrepancy
for the low-energy jets. In addition, a difference at jet and track level is found in η. This is in
the forward region |η| > 1.5 (figure 5.25). For the construction of a b-jet tagger I will restrict
myself to the barrel region and pT > 84 GeV.

name              Jet15U   Jet30U   Jet50U   Jet70U   Jet100U   Jet140U
trackMom          3.0      2.3      2.3      2.1      2.2       1.6
trackEta          2.7      2.5      2.9      2.7      2.8       2.6
trackSip2dSig     2.1      1.2      1.0      1.4      1.2       1.1
trackSip3dSig     5.9      4.4      3.6      4.6      5.0       4.9
trackSip2d        1.3      1.2      1.4      1.8      1.6       2.1
trackSip3d        5.8      3.9      3.5      4.5      4.9       5.1
trackLxy          6.3      4.3      3.4      4.3      3.8       3.9
trackPtRel        5.3      5.2      4.9      5.1      5.1       4.7
trackPpar         3.0      2.3      2.3      2.1      2.2       1.6
trackJetDeltaR    1.4      1.9      1.8      2.6      2.9       3.0
trackJetDist      8.0      6.3      5.0      6.4      6.6       6.8
trackPtRelFrac    0.8      0.8      0.8      1.6      2.0       2.3
trackPparFrac     0.8      0.8      0.8      1.6      2.0       2.3
trackChi2         4.3      3.0      2.8      3.3      4.4       5.2
trackPxHits       1.5      2.6      2.6      2.9      3.1       3.9
trackHits         1.5      3.1      2.6      3.4      3.9       4.8
trackBDist        6.9      5.5      4.6      5.8      5.8       6.2
trackBdistSig     5.9      4.7      3.7      4.6      4.8       4.9
trackBweight      6.6      5.1      3.9      5.1      5.1       5.2
muonMom           1.5      5.0      4.7      3.8      2.6       2.1
muonEta           3.1      1.1      7.2      6.3      4.1       4.0
muonPhi           1.3      1.3      1.3      1.3      2.8       2.6
muonSip2dSig      2.4      2.3      1.4      1.0      1.0       1.5
muonSip3dSig      0.9      1.8      1.6      2.0      1.4       3.3
muonPtRel         6.9      7.6      9.3      6.9      6.3       6.3
muonEtaRel        5.8      5.8      5.9      4.5      5.7       4.8
muonJetDeltaR     5.8      4.9      4.1      3.2      3.5       3.8
muonJetPpar       6.9      7.6      9.3      6.9      6.3       6.3
muonJetPparFrac   1.4      4.7      5.4      4.3      3.0       2.8
muonChi2          7.4      8.1      10.5     9.5      9.8       11.3
eleMom            0.4      1.1      1.3      2.2      1.5       2.2
eleEta            2.5      0.9      0.4      1.6      1.4       1.4
elePhi            0.7      0.2      1.2      1.1      0.7       2.1
eleSip2dSig       1.1      0.8      1.1      1.6      1.8       2.7
eleSip3dSig       2.3      0.8      0.5      2.3      0.5       3.2
elePtRel          5.7      5.1      4.5      4.2      4.4       4.0
eleEtaRel         3.6      3.3      3.7      3.8      3.9       4.5
eleJetDeltaR      4.2      3.8      4.4      4.4      4.3       4.5
eleJetPpar        5.7      5.1      4.5      4.2      4.4       4.0
eleJetPparFrac    1.2      1.1      1.1      1.8      1.0       1.4
eleChi2           3.0      3.3      2.8      3.7      4.4       4.1
Table 5.14: Result of the data versus MC comparison. The single values represent the correlation
coefficients of the listed variables to the target (data/MC). Larger values mean larger differences
between data and simulation. In the last column some variables are marked with a small arrow.
This illustrates that the differences between data and MC increase for the different jet momenta.

name               Jet15U  Jet30U  Jet50U  Jet70U  Jet100U  Jet140U
eleId              1.2     1.5     1.9     2.2     2.2      1.8
eleZpos            3.7     3.0     3.9     3.8     2.5      1.6
eleInvDeltaR       5.3     3.2     4.1     3.0     3.3      3.1
eleGSFDif          1.1     1.5     0.3     1.1     0.2      0.5
eleBrem            1.1     0.6     0.9     1.4     1.7      2.2
vertexMass         1.7     2.9     3.2     3.1     3.7      3.4
vertexPVDist2d     2.9     1.5     3.8     2.5     2.7      4.9
vertexPVSig2d      1.8     1.4     2.0     1.3     0.6      1.4
vertexPVDist3d     1.7     1.0     2.5     2.1     2.4      4.5
vertexPVSig3d      1.8     1.8     2.0     1.3     0.4      1.4
vertexJetDeltaR    4.9     3.9     4.1     5.2     5.0      6.6
vertexJetEFrac     1.9     4.0     3.3     2.7     2.9      1.2
vertexNtracks      3.1     2.9     1.6     1.9     2.0      3.4
vertexTrackDeltaR  7.0     7.0     7.7     8.5     9.2      8.0
vertexTrackEFrac   2.4     3.5     4.1     3.5     3.2      3.7
vertexCategory     1.6     1.6     0.2     0.2     0.3      1.5
jetPt              2.8     0.6     0.2     0.2     0.2      0.4
jetEnergyUCorr     3.8     3.8     4.2     3.6     3.6      2.6
jetEta             3.8     3.9     4.3     3.9     3.8      3.1
jetPhi             1.3     1.7     2.0     2.0     2.2      2.3
jetNTrack          5.6     5.4     5.3     5.0     4.5      3.8
jetNSV             0.2     0.2     0.3     0.1     0.4      0.3
jetNEle            0.7     0.9     1.7     2.8     2.8      3.7
jetNMuon           0.5     0.5     0.7     0.8     0.8      0.8
jetCSV             4.7     4.8     4.2     4.7     4.7      5.7
jetBProb           2.6     1.9     1.6     1.8     2.7      4.2
jetSSV             0.2     0.5     0.9     0.4     0.7      0.6
jetSSVP            0.6     0.6     0.8     0.6     0.9      0.8
jetMuonIP          0.9     1.5     2.6     3.6     3.7      4.4
jetMuonPt          1.4     2.2     3.3     4.0     4.1      4.8
jetTCHE            4.7     2.9     3.0     3.0     3.0      3.4
jetTCHP            5.3     3.5     3.3     3.8     3.7      4.1
Table 5.15: Result of the data versus MC comparison. The single values represent the correlation coefficients of the listed variables to the target (data/MC). Larger values mean larger differences between data and simulation. In the last column some variables are marked with a small arrow. This illustrates that the differences between data and MC increase for the different jet momenta. The bold values point to unexpectedly large differences between data and simulation.

calibration  #t0 events  #t1 events  target 1 fraction  number of input vars  runtime   expected signal fraction in data
track        415465      225460      47.8%              19                    615.4 s   0.036
vertex       164335      104582      52.6%              11                    68.97 s   0.26
electron     175666      102353      51.4%              16                    144.69 s  0.041
muon         110388      73216       49.8%              11                    43.89 s   0.11
jet          117432      86813       51.0%              10                    180.3 s   0.033
boost        150077      173899      51.3%              10                    146.15 s
Table 5.16: Runtime of the different NeuroBayes calibrations. For the different trainings the number of events and the target 1 fraction are listed. Because of the weighting of the events this fraction does not correspond to the one expected from the number of events. The runtime column shows how long it takes to obtain the calibration of the NeuroBayes expert.

Finally we should discuss the existing b-jet taggers. As designed, the simple secondary vertex b-jet taggers are well understood and very robust. The combined secondary vertex tagger is not yet calibrated and therefore shows discrepancies in the comparison. The muon b-jet tagger depends on the jet momentum: for larger jet momenta the differences between data and simulation rise. To indicate this I added an arrow symbol in the last column of the table. The electron b-jet tagger uses only one variable of the electron properties, and this variable also looks good. For the jet probability tagger a larger discrepancy in the high pT region is found. The track counting taggers show differences, but of an acceptable size.
The complete overview of the input variables and how they compare between data and MC can be found in appendix A.

5.3.4 NeuroBayes data tagger (NBD)
In the last section we found a good agreement between data and simulations. This allows us to create another kind of b-jet tagger: a data based b-jet tagger. The idea is to be less dependent on the simulation of the background. The training is arranged in the same way as before, but for target 0 data samples are used. Target 1 is again a simulated sample of b-jets. The data include the correct background distributions, but also signal distributions. As shown in section 5.2.1, for such a setup it is also possible to transform the NeuroBayes output distribution to the signal probability if the data are well described by MC.
We will see that a special preprocessing is needed to make the distributions comparable. I will start by explaining the alternative b-jet tagger based on data samples (NBD), which has the same architecture as the classical MC based tagger (NBMC) described above. The NeuroBayes trainings are set up with the same settings as before. There are only two differences: for target 0 the data sample is used, and for technical reasons the amount of statistics has changed. Table 5.16 shows the statistics used for the calibrations.
In the following I start with the studies of the track level calibration. There I will point out the special issues which appear for data based trainings. I will suggest a solution to reduce this problem and go on with an updated setup for the rest of the calibrations.

Track. The calibration of the NeuroBayes expert results in the output distribution shown in figure 5.26 on the left. The result looks quite promising. The shape looks as we would expect it, with a prominent separation, which is caused by the large dependency on the track impact parameter. On the right side of the distribution we have tracks which mainly come from B hadrons; the signal events on the left correspond to charged particles generated in the hadronization process. The main variation in shape appears because of the differing amount of statistics used for the calibration of the expert and because of the signal events present at target 0 for the NBD training. Let us discuss the two distributions in more detail.

[Figure 5.26 plots: number of tracks per 0.01 versus NeuroBayes output, √s = 7 TeV.]
Figure 5.26: The output distributions of the data based training on track level on the left and for a classical MC based training on the right.

The first obvious difference between the MC training t1 and the data training t2 is the enlarged gap for large output values. The existence of signal events in target 0 leads to a shift of all output values. Using the probability transformation calculated in section 5.2.1, we find the following dependency between the output distributions of the two training scenarios:

$$o_d = \frac{o_{mc}\, f_{mc}/f_d}{1 - P(S) + o_{mc}\left(P(S)\,(f_{mc}+1) + f_{mc}/f_d - 1\right)}$$

Here o_d and o_mc are the output values of the two NeuroBayes experts, f_mc and f_d are the corresponding fractions of the training samples, $f = N(T_0)/N(T_1)$, and P(S) is the unknown signal fraction of the data sample. The enlarged gap on the right can be read off directly from the equation. Assuming a maximum value of o_mc = 1 for the MC based calibration, the maximum of the data based training is limited to

$$\max(o_d) = \frac{1}{P(S)\, f_d + 1}.$$
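The transformation between the two expert outputs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the outputs are taken as probabilities in [0, 1], f is read as N(T0)/N(T1) (the reading under which the quoted maximum follows from the formula), and the function names and example numbers are hypothetical, not part of the NeuroBayes package.

```python
def expected_data_output(o_mc, f_mc, f_d, p_s):
    """Map an MC-trained output o_mc to the expected data-trained output o_d.

    o_mc : output of the MC based expert, interpreted as a probability in [0, 1]
    f_mc : N(T0)/N(T1) of the MC based training sample (assumed definition)
    f_d  : N(T0)/N(T1) of the data based training sample (assumed definition)
    p_s  : signal fraction P(S) of the data sample used as target 0
    """
    numerator = o_mc * f_mc / f_d
    denominator = 1.0 - p_s + o_mc * (p_s * (f_mc + 1.0) + f_mc / f_d - 1.0)
    return numerator / denominator


def max_data_output(f_d, p_s):
    """Upper limit of o_d, reached for o_mc = 1."""
    return 1.0 / (p_s * f_d + 1.0)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the shift of the output values.
    f_mc, f_d, p_s = 1.0, 1.0, 0.25
    for o_mc in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(o_mc, expected_data_output(o_mc, f_mc, f_d, p_s))
    print("limit:", max_data_output(f_d, p_s))
```

For P(S) = 0 and equal sample fractions the mapping reduces to the identity, which is a quick sanity check of the formula.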
But we can also go a step further and transform the complete distribution of the MC based training to an expected distribution for the data based training. For a fair comparison we applied the two experts to another MC sample (Pythia6, QCD DiJet, CMSSW36X), which is statistically independent of the samples used for their calibrations. The result is shown in figure 5.27.
We now see a dominating structure produced by many tracks around 0.4. This is similar for the NBD distribution and the NBMC distribution. In shape there is a small difference between the two: the distribution consists of two peaking substructures.
The cause of this variation can be identified by looking at the distributions of the input variables. The movement of the substructure is an effect of the changed distribution of the transverse track momentum relative to the jet axis, pT,rel. The result of the former comparison is shown in figure 5.28 on the left. We already found the differences between data and MC in the data/MC comparison introduced in section 5.3.3. In data we have more tracks at larger values than in MC: the jets in data are broader than simulated in MC.

[Figure 5.27 plot: number of tracks per 0.01 versus NeuroBayes output for NBD and the NBMC expectation; CMS private work 2010, √s = 7 TeV.]
Figure 5.27: The output distribution of the data based training is plotted. In addition, a comparison of the distribution with the expectation from the NBMC training is shown. The NBMC output distribution (gray) was transformed by the function introduced in the text.

[Figure 5.28 plots: number of events in flat bins (left), purity per bin (middle) and extended purity per bin (right), √s = 7 TeV.]
Figure 5.28: The left plot shows the differences between data and MC. In data more tracks with large pT,rel were found than simulated in MC. This also affects the classification of b-jets. The plot in the middle shows the purity of tracks coming from b-jets as calculated from MC; on the right the extended purity of the NBD training is shown.

This leads to a different purity estimate by the tagger calibrations at large pT,rel values. The plot in the middle shows the purity as extracted for the MC based training. The relative number of tracks from b-jets increases for large values. For the NBD training we can make a similar plot, which does not correspond to the purity. In the following I will call it the extended purity α. Extended means that some signal events are present in the denominator, because of the construction of the tagger, where signal is trained against data:

$$\alpha = \frac{N_{MC}(S)}{N_d + N_{MC}(S)}$$

N_MC(S) is the number of signal events from MC simulation and N_d = N(B) + x N(S) is the number of events from the data sample. Compared to the real purity the shape is more uniform but should have the same variations at the same positions.
The right plot in figure 5.28 shows α for the pT,rel variable. The shape is not more uniform than in the middle plot, but is different in the large pT,rel value region. The fraction of signal events is smaller. Knowing the differences between data and MC this is not surprising: we have fewer events in this region in MC.
The question is now: how do such effects affect a b-jet tagger which is based on data? We still achieve a good discrimination power. The probability interpretation is also correct if we ask for objects as they are simulated in MC. It is true that the probability to find these tracks from a b-jet with large pT,rel is in reality smaller than expected from MC. But this does not mean that there are fewer tracks from b-jets in reality, only fewer tracks which are simulated.
With a just commissioned detector, it is not expected that everything is in a perfect state. Looking at physics which is well known from other experiments, we do not expect any new observation. Almost all discrepancies between data and MC can point to problems in the simulation, wrong assumptions for the resolution of detector components, or efficiency calculations. Therefore we are able to do a simple correction. Under the assumption that the fraction of our signal is correctly simulated, and only the absolute numbers are wrongly simulated, we can do a reweighting of the MC samples.
The weight factor can be extracted by looking at the purity P(MC|x_i) of the data/MC comparison for the variables where we expect a correctly simulated signal fraction. If we want to correct for more variables we have to account for the correlations between them. Doing a NeuroBayes classification gives us an estimate of the overall purity P(MC|o_t). The application of the weights is similar to a boost training where the weights for the data events are w_D = 1. This is the focusing function F_f = 1/P(MC|o_t). The weights for the MC events are:

$$w_{MC} = \frac{1 - P(MC|o_t)}{P(MC|o_t)}.$$
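A minimal sketch of how such MC weights could be computed from a data/MC classifier output is given below. It assumes that P(MC|o_t) is already available per event as a number between 0 and 1; the function names and the clipping parameter `eps` are illustrative choices and not part of the analysis described here.

```python
def mc_event_weight(p_mc, eps=1e-6):
    """Weight w_MC = (1 - P(MC|o_t)) / P(MC|o_t) for one MC event.

    p_mc : classifier estimate of P(MC | o_t) for this event
    eps  : clipping value (an assumption of this sketch) that keeps the
           weight finite when the estimate approaches 0 or 1
    """
    p = min(max(p_mc, eps), 1.0 - eps)
    return (1.0 - p) / p


def reweight_mc(p_mc_values):
    """Return one weight per MC event; data events keep the weight w_D = 1."""
    return [mc_event_weight(p) for p in p_mc_values]


if __name__ == "__main__":
    # Hypothetical purity estimates for four MC events.
    for p, w in zip((0.2, 0.5, 0.6, 0.8), reweight_mc((0.2, 0.5, 0.6, 0.8))):
        print(p, w)
```

Events in regions where MC is overrepresented (P(MC|o_t) > 0.5) receive a weight below one, and events in regions where data dominate are weighted up.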
A correct handling of these weights needs a detailed study of the input variables and the data. This is an ambitious goal that is not really achievable. A big step can nevertheless be made by applying weights extracted from an overall data/MC comparison. The weights can be calculated with a NeuroBayes expertise. The sample is then corrected on the level of the comparison. For all inclusive subsamples differences can still occur.
Finally we apply the weights calculated from an overall data/MC comparison for the construction of the NBD b-jet tagger. Figure 5.29 shows how the distributions compare after this correction. The two distributions become more similar. This means that it is worthwhile to correct the MC samples. On the other hand we still find some differences. There is a clear structure on the left, where more tracks appear than expected from MC. It seems that we found another discrepancy between data and MC. This points to differences between data and MC in the inclusive distributions of b-jets or non-b-jets.

[Figure 5.29 plot: number of tracks per 0.01 versus NeuroBayes output for NBD and the NBMC expectation; CMS private work 2010, √s = 7 TeV.]
Figure 5.29: The output distribution of the data based training is plotted. For the calibration a correction of the MC distributions was applied. In addition, a comparison of the distribution with the expectation from the NBMC training is shown. The NBMC output distribution (gray) was transformed by the function introduced in the text.

This implies two things for the data based b-jet tagger. If these discrepancies are caused by an insufficient simulation of the background processes, we have a strong argument to do this correction. We will profit from the fact that for the NBD we are independent of the background simulations. This leads to an improvement of the b-jet tagger.
On the other hand it is also possible that the b-jet simulations are inadequate. Then the data based training leads to a misinterpretation of the sample. The probability interpretation is only valid for MC b-jets. Furthermore, real but not simulated b-jets are treated as background. To solve this problem a more general model of the signal distributions has to be developed.
In the following I decided to add another preprocessing step, in which I calculate the weights and apply them for the correction. The very good knowledge of b hadrons and the excellent studies of b-jets at other experiments make me believe in good simulations of this inclusive class.
The additional preprocessing must be included for all data based b-jet tagging classifications.

Vertex. Figure 5.30 shows the output distributions of the secondary vertex classification for the NBD training and the expectation from the NBMC. The distributions look similar to what we expect. We see a large gap on the right, caused by the signal events present in the target 0 (black) distribution. The shift is larger than for the tracks. This is expected because a reconstructed secondary vertex is already a good indication of a b-jet. Therefore a large fraction of secondary vertices from b-jets is present in the target 0 sample. From MC simulations we expect 26% of b-jets.

[Figure 5.30 plots: number of vertices per 0.01 versus NeuroBayes output for NBD and the NBMC expectation; CMS private work 2010, √s = 7 TeV.]
Figure 5.30: A comparison of the NeuroBayes output distributions of the NBD vertex training with the expectations from the NBMC training is shown. On the right a correction is applied on the MC sample.

Further I will point to the number of tracks associated with the secondary vertex. The distribution of this variable illustrates the behavior in the NBD training very well. Figure 5.31 shows the purity of the variable as used for the NBMC training on the left and the same for the extended purity of the NBD training on the right. For all bins we see a shift to the center. Especially for the 100% purity bins on the right the shift caused by the purity extension is nicely visible.
The values of these bins can be taken to make a rough estimate of the signal fraction. The extended purity α is around 0.8. We assume that the bins contain only signal: α = 1/(P(S) + 1). This leads to a signal fraction of around 25%, which agrees with the expectations from MC.
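The arithmetic behind this rough estimate can be written out explicitly. The step below only inverts the relation quoted in the text; it assumes, as the text does, that the bins contain signal only, and additionally that the MC signal sample and the data sample enter the training with comparable size.

```latex
\alpha = \frac{1}{P(S) + 1}
\quad\Longrightarrow\quad
P(S) = \frac{1}{\alpha} - 1 = \frac{1}{0.8} - 1 = 0.25
```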
Leptons. Figure 5.32 shows the improvements we get after applying the corrections for the calibrations on lepton level. It is interesting that for the electron candidates the weighting is not needed to get a result similar to the expectations.
Jet. We have all NeuroBayes calibrations of the jet objects in a good state and it is possible to take them for the final jet classification. Again we apply the corrections. The weights were determined from the data/MC comparison. The result of the calibration can be seen in figure 5.33. The left plot shows the output distribution on the trained sample, while the right shows it on an independent sample. The method seems to work very well.
To finalize the data based b-jet tagger, the boost training was performed in a last step. The improvements are similar to those of the MC based training. The performance is plotted in figure 5.34.
In summary we created two different b-jet taggers. The first one is based on MC samples while the second uses data instead of background simulations. Both b-jet taggers are competitive with the existing ones or even better. More important are the comparisons between data and MC. With these studies we have a complete understanding of the tagger, which gives good confidence in its accurate functioning in further analyses.
In the following I will present a first use case of the new b-jet tagger in an inclusive b-jet cross section measurement. But further applications are also imaginable, especially for the data based b-jet tagger. Above all, in most analyses where signal has to be separated from a large amount of QCD background this ansatz is an interesting alternative to the usual approaches. In many cases the background distribution consists of many not perfectly known subprocesses. With the data based approach one is able to bypass a detailed study of the background and focus on the signal study.

[Figure 5.31 plots: purity per bin (left, CMS simulation) and extended purity per bin (right, CMS private work 2010) versus ntracks(SV), √s = 7 TeV.]
Figure 5.31: The left plot shows the purity of the number of tracks associated to the secondary vertex for the NBMC training. On the right the same is shown for the extended purity of the NBD training. Especially for the 100% purity bins on the right the shift caused by the purity extension is clearly visible.

[Figure 5.32 plots: number of electron candidates per 0.01 and number of muons per 0.01 versus NeuroBayes output for NBD and the NBMC expectation; CMS private work 2010, √s = 7 TeV.]
Figure 5.32: The output distributions of the data based training are plotted. The upper plots show the distributions when no correction is applied, on the left for electron candidates and on the right for muons. The lower plots show the same with corrections.

[Figure 5.33 plots: number of jets per 0.01 versus NeuroBayes output; CMS private work 2010, √s = 7 TeV.]
Figure 5.33: Left: the output distribution of the NBD training. This expertise applied to an independent sample results in the distribution plotted on the right. It is similar to the one expected from MC.

[Figure 5.34 plot: non-b-jet efficiency versus b-jet efficiency for NBD b-jet, NBD boost, NBD b-jet comb Pur and NBMC b-jet comb Pur; CMS simulation, √s = 7 TeV.]
Figure 5.34: Performance of the NeuroBayes data based b-jet tagger. Instead of simulated background events data are used. The performance of this new kind of b-jet tagger is comparable to an MC based one.

Chapter 6

b-jet cross section measurement

In this chapter I will present the analysis of the inclusive b-jet cross section measurement. As introduced in section 2.2, a measurement of this quantity is of large interest. It is important for searches for particles which decay into b-quarks. In the Standard Model we know three particles with this property: the t-quark, which decays to a b-quark with almost 100% probability, the Z boson, which is able to decay to a bb̄ pair, and the W boson, where such a decay is strongly suppressed. In addition, as a fourth particle of the Standard Model, the expected Higgs boson is able to decay into bb̄ pairs. For these and many of the new particles from models beyond the Standard Model the b-quark is an important indicator. Thus b-quark processes are also a large background source. With the results of a b-jet cross section measurement it is possible to scale the background for such analyses in a more reasonable way.
But the analysis of the b-jet quantity itself is also very interesting. Around 20 years ago, the same measurement was on the way to causing a sensation. The detectors at the hadron colliders Spp̄S and Tevatron found a difference between theory and experiment (see section 2.2.3). New physics models were discussed, but in the end a recalculation of the next-to-leading order predictions resolved this disparity, and the new predictions were confirmed by the Tevatron Run 2 experiments.
Among other things the old miscalculations were caused by an inadequate knowledge of the fragmentation functions and the parton distribution functions. These are still not completely understood today. A measurement of the b-jet cross section will give us a verification of the established model. Above all it is possible to test the QCD calculations in the transverse momentum range that is reached by the collisions at the LHC. Another discrepancy between experiment and theory would bring us closer to the discovery of physics beyond the Standard Model.
This chapter includes different approaches to such a measurement. In the first part I will describe the method as published in [CMS10e]. The last two sections will deal with updates done on the extended data and an alternative approach for a b-jet cross section measurement with the use of NeuroBayes.

6.1 Recent b cross section measurement at CMS

In the first part of this section I review the recent status of the CMS b-jet cross section measurement at an integrated luminosity of 60 nb−1 [CMS10e]. I performed this analysis together with colleagues from the CMS collaboration. Here the main parts are taken as published. Some information was added to bring it into the context of this thesis.
The review starts with the specification of the data collected at that time. The procedure of b-jet tagging is presented for a very small amount of data. The measurement of the b-jet purity and the b-jet efficiency follows. To get the final result, detector effects are unfolded. The uncertainties will be discussed.

6.1.1 Event and jet selections
The inclusive jet data were collected using a combination of Minimum Bias and single jet triggers (see section 4.1), which are used consecutively, each in the lowest pT range where it is fully efficient. Given the small amount of data events, a quality selection similar to that presented in section 4.5 was applied.
The pT spectra from individual triggers are normalized using luminosity estimates [CMS10h] and then combined into a continuous jet pT spectrum. The integrated luminosity corresponds to 60 nb−1. Only one trigger is used per pT bin, to simplify the analysis. The raw pT spectra are unfolded using the ansatz method [BBK71; FFF78], with the jet pT resolution obtained from MC. The uncertainty of the jet pT resolution is estimated using a comparison of the dijet pT balance between data and MC [CMS10f].

6.1.2 b-tagging
The b-jets are tagged using a secondary vertex high-purity tagger (SSVHP [CMS10a]). The secondary vertex is fitted with at least three charged particle tracks. A selection on the reconstructed 3D decay length significance is applied, corresponding to about 0.1% efficiency to tag light flavor jets and 60% efficiency to tag b-jets at pT = 100 GeV.
The b-tagging efficiency and the mistag rates from c-jet and light jet flavors are taken from the MC simulation and constrained by a data/MC scale factor determined from data. This b-tag efficiency measurement relies on semileptonic decays of b-hadrons, the kinematics of which allow for discrimination between b- and non-b-jets. Fits to the distribution of the relative transverse momentum of the muon with respect to the jet direction enable the extraction of the flavor composition of the data, and ultimately the efficiency for tagging b-jets. The mistag rate from light flavor jets is constrained separately by a study using a negative-tag discriminator [CMS10a].
The production cross section for b-jets is calculated as a double differential,

$$\frac{d^2\sigma_{b\text{-jets}}}{dp_T\,dy} = \frac{N_{tagged}\, f_b\, C_{smear}}{\epsilon_{jet}\,\epsilon_b\,\Delta p_T\,\Delta y\,\mathcal{L}},$$

where N_tagged is the measured number of tagged jets per bin, ΔpT and Δy are the bin widths in pT and y, f_b is the fraction of tagged jets containing a b-hadron, ε_b is the efficiency of tagging b-jets, ε_jet is the jet reconstruction efficiency and C_smear is the unfolding correction. ε_jet, ε_b and f_b are all calculated from MC in bins of reconstructed pT and y, for consistency with the data-based methods. The correction factor C_smear unfolds the measured pT back to particle level using the ansatz method, which is also used for the inclusive jet cross section measurement and described in [CMS10h].
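As an illustration, the double differential cross section for a single (pT, y) bin can be evaluated as in the following sketch. The function name, the argument names and all numbers in the usage example are hypothetical; the symbol L in the formula is taken here to be the integrated luminosity, so with the luminosity in nb−1 and ΔpT in GeV the result is in nb/GeV.

```python
def b_jet_cross_section(n_tagged, f_b, c_smear, eff_jet, eff_b, d_pt, d_y, lumi):
    """Double differential b-jet cross section for one (pT, y) bin.

    n_tagged : number of tagged jets in the bin
    f_b      : fraction of tagged jets containing a b-hadron (purity)
    c_smear  : unfolding correction factor
    eff_jet  : jet reconstruction efficiency
    eff_b    : b-tagging efficiency
    d_pt     : bin width in pT (GeV)
    d_y      : bin width in rapidity
    lumi     : integrated luminosity (e.g. in nb^-1)
    """
    return n_tagged * f_b * c_smear / (eff_jet * eff_b * d_pt * d_y * lumi)


if __name__ == "__main__":
    # Purely illustrative inputs, not values from the measurement.
    xsec = b_jet_cross_section(
        n_tagged=120, f_b=0.75, c_smear=0.95,
        eff_jet=0.99, eff_b=0.40,
        d_pt=10.0, d_y=1.0, lumi=60.0,
    )
    print(f"d2sigma/dpT/dy = {xsec:.4f} nb/GeV")
```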

b-tagging efficiency
The b-tagging efficiency with the selections used in this analysis is between 6% and 60% at pT > 18 GeV and |y| < 2.0. The efficiency rises at higher pT as the b-hadron proper time increases. The efficiencies estimated from MC are shown in Fig. 6.1. To smooth out statistical fluctuations, the b-tagging efficiency in each rapidity bin is fitted versus pT, and the fit result is used in the analysis.

[Figure 6.1 plot: b-tagging efficiency versus pT (GeV) for |y| < 0.5, 0.5 ≤ |y| < 1.0, 1.0 ≤ |y| < 1.5 and 1.5 ≤ |y| < 2.0; CMS simulation, √s = 7 TeV.]
Figure 6.1: b-tagging efficiency in different rapidity bins.

b-tagged sample purity
The b-tagged sample purity is estimated using two complementary approaches. In the first method, the invariant mass of the tracks associated to the secondary vertex, denoted secondary vertex mass, is computed after the SSVHP selection. A fit to the secondary vertex mass distribution is performed, taking the shapes for light, c- and b-jets from simulation and letting the relative normalisations for c- and b-jets float, while fixing the small contribution from light jets to the MC expectation (“template fit”). This fit allows for a robust estimate of the b-tagged sample purity and constrains the mistag rate uncertainty from c-jets. An example of the template fits is shown in Fig. 6.2.
In the second method the b-tagging efficiency ε_b as well as the mistag rates for light flavor ε_l and charm ε_c are estimated from MC. These are shown in Fig. 6.3. Multiplied by the expected relative fractions of b-jets F_b, c-jets F_c and light flavor jets F_l in the inclusive jet sample (without b-tagging), also shown in Fig. 6.3, the tag rates can be used to calculate the expected purity as

$$f_b = \frac{F_b\,\epsilon_b}{F_b\,\epsilon_b + F_c\,\epsilon_c + F_l\,\epsilon_l}.$$

The b-tagging efficiencies of c and light jets in Fig. 6.3 (left) are multiplied by their relative frequency to b-jets to illustrate the rough relative contributions of F_b ε_b, F_c ε_c and F_l ε_l to the b-tagged sample at pT ≈ 100 GeV. The resulting estimates of the b-tagged sample purity from data and from MC are shown in Fig. 6.4. The data and MC are found to be in good agreement, with an overall relative data/MC scale factor measured to be 0.976 ± 0.022 (0.996 ± 0.030) for b-jets in the pT range 18–220 GeV (18–84 GeV) and rapidity |y| < 2.0.
Given the good agreement between data and MC, the central values for the purity are taken from MC to properly take into account the pT and y dependence.
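The expected purity of the second method is a simple weighted ratio, sketched below. The function and argument names are illustrative, and the numbers in the usage example are hypothetical values chosen only to show the orders of magnitude involved; they are not taken from Fig. 6.3.

```python
def expected_purity(frac_b, frac_c, frac_l, eff_b, mistag_c, mistag_l):
    """Expected b-tagged sample purity f_b.

    frac_b, frac_c, frac_l : flavor fractions F_b, F_c, F_l of the inclusive
                             jet sample before b-tagging
    eff_b                  : b-tagging efficiency
    mistag_c, mistag_l     : mistag rates for charm and light flavor jets
    """
    tagged_b = frac_b * eff_b
    tagged_c = frac_c * mistag_c
    tagged_l = frac_l * mistag_l
    return tagged_b / (tagged_b + tagged_c + tagged_l)


if __name__ == "__main__":
    # Hypothetical inputs for one (pT, y) bin.
    purity = expected_purity(
        frac_b=0.03, frac_c=0.10, frac_l=0.87,
        eff_b=0.60, mistag_c=0.03, mistag_l=0.001,
    )
    print(f"expected purity: {purity:.3f}")
```

Even a small light flavor mistag rate matters because light jets dominate the inclusive sample, which is why the product of fraction and tag rate, rather than the tag rate alone, enters the formula.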

b-tagging uncertainty estimates
The leading uncertainties for the inclusive b-jet production are those coming from the jet energy scale, luminosity, b-tag efficiency, and mistag rates. The 11% luminosity uncertainty [CMS10g] cancels completely in the ratio to the inclusive jet pT spectrum, and the JEC uncertainty produces only a small residual uncertainty due to differences in pT spectra and jet fragmentation between inclusive jets and b-jets.
The leading remaining uncertainties for the ratio between b-jet and inclusive jet production are the b-tagging efficiency and the charm mistag rate, both of which are currently in essence statistical uncertainties from the data-based methods to constrain the b-tagging efficiency and the b-tagged sample purity, and the b-jet specific JEC. The light quark mistag rate has a significant contribution to the total uncertainty at high pT and forward rapidities, but is otherwise negligible due to the low mistag rate. The inclusive jet energy scale, on the other hand, only contributes at pT < 30 GeV, where the b-jet spectrum flattens while the inclusive jet spectrum is still exponentially falling.

[Figure 6.2 plot: number of jets per 0.25 GeV versus secondary vertex mass (GeV); data with b, c and light templates; |y| < 2.0, 37 ≤ pT < 56 GeV, χ2/NDF = 18.9/17; CMS preliminary, 60 nb−1, √s = 7 TeV.]
Figure 6.2: Example of secondary vertex mass fits.

[Figure 6.3 plots versus pT (GeV) in four rapidity bins: b-tagging efficiency with scaled charm and light mistag rates (left), flavor fractions (right); CMS simulation, √s = 7 TeV.]
Figure 6.3: The b-tagging efficiency and light, charm mistag rates from MC truth (left). Bottom, charm and light fractions of inclusive jets from MC truth (right).

[Figure 6.4 plots: b-tagged sample purity versus pT (GeV); left: data and MC for |y| < 2.0, Data/MC = 0.976 ± 0.022, χ2/NDF = 1.2/3, CMS preliminary, 60 nb−1; right: MC in four rapidity bins, CMS simulation.]
Figure 6.4: The b-tagged sample purity obtained using fits to secondary vertex mass (left). The b-tagged sample purity estimated using b-tagging efficiency and mistag rates from MC (right).
The b-tagging efficiency measurement relies on semimuonic decays of b-hadrons. The limiting factors for this measurement are the limited number of SSVHP tagged jets containing a muon, the uncertainty in the c- and light template shapes and the systematic uncertainty in generalizing the efficiency measured on semileptonically decaying b-jets to all b-jets. The obtained scale factor is 0.98 ± 0.08 (stat) ± 0.18 (syst) for jets with pT > 20 GeV and |y| < 2.4 [CMS10a].
The uncertainty on the b-tagging efficiency arising from poorly known relative contributions of flavor creation (FCR), flavor excitation (FEX) and gluon splitting (GS) has also been studied in detail. The relative angle ΔR between the b-hadrons is strongly dependent on the production mechanism. The b-hadrons produced by GS, in particular, tend to be close to each other in ΔR, which leads to a reduced efficiency of the SSVHP tagger. This uncertainty is estimated by varying the relative contributions in MC within ±50%, constrained by studies of the ratio between secondary vertex energy and b-jet energy, which is sensitive to the contributions of FCR+FEX (large ratio) compared to GS (small ratio). The b-tagging efficiency as a function of the ΔR distance between the b-jets is shown in Fig. 6.5 (left). The variation versus ΔR is observed to be up to 25%, but combined with the maximal variations of the GS and FCR+FEX by ±50% shown in Fig. 6.5 (right) this uncertainty is found to be less than 2%.
The b-tagging efficiency uncertainty is dominated by the statistical uncertainty in the data-driven method. The uncertainty is conservatively taken as the statistical uncertainty of 8% added in quadrature with the 18% systematic uncertainty and the 2% from the data/MC scale factor of 0.98 that is not applied in this analysis, giving 20% as the total systematic uncertainty for the b-tagging efficiency.
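The quadrature sum quoted above is easy to check numerically. The helper below is a generic sketch (the function name is an illustrative choice) that assumes the individual contributions are uncorrelated.

```python
import math


def add_in_quadrature(*components):
    """Combine uncorrelated relative uncertainties in quadrature."""
    return math.sqrt(sum(c * c for c in components))


if __name__ == "__main__":
    # b-tagging efficiency: 8% (stat), 18% (syst), 2% (scale factor not applied)
    print(f"{add_in_quadrature(8.0, 18.0, 2.0):.1f} %")
```

The three contributions combine to about 19.8%, which the text rounds to 20%. The total is driven almost entirely by the 18% systematic term.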

110CHAPTER6.BJETCROSSSECTIONMEASUREMENT
0.040.44CMS simulations=7 GeVCMS simulationGS:+50%/FEX:-50%/FC:-50%s=7 GeV
0.035SSVHP0.42GS:-50%/FEX:+50%/FC:+50%Probability0.430≤ pT|y(jet)|<2.0 < 45 GeV0.0330≤ pT|y(jet)|<2.0 < 45 GeV
b-tagging efficiency0.380.0250.360.020.340.0150.320.010.30.280.0050.26012345000.511.522.533.544.55
ΔR between B mesonsΔR between B mesonsFigure6.5:Theb-taggingefficiencyvariationversusΔRbetweenb-hadrons(left).Distributionof
ΔRbetweenb-hadronsfor±50%variationsofGSandFC+FEX(right).
It should be noted, however, that the robustness of the decay length observable can degrade at
pT > 200 GeV, which should be taken into account in future updates of the analysis that start to
probe this kinematic region.
An additional 10% uncertainty at pT > 200 GeV is taken into account for this, with the extra
uncertainty log-linearly reduced to 0% at pT = 100 GeV.
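One possible reading of the log-linear reduction described above is an interpolation that is linear in log(pT) between 0% at 100 GeV and 10% at 200 GeV. The sketch below assumes exactly that; the function name and the constant behaviour outside the two anchor points are assumptions made for illustration.

```python
import math

def extra_uncertainty(pt, pt_lo=100.0, pt_hi=200.0, max_unc=0.10):
    """Extra relative uncertainty, interpolated linearly in log(pT).

    Assumed behaviour: 0 below pt_lo, max_unc above pt_hi.
    """
    if pt <= pt_lo:
        return 0.0
    if pt >= pt_hi:
        return max_unc
    return max_unc * math.log(pt / pt_lo) / math.log(pt_hi / pt_lo)

for pt in (80.0, 100.0, 150.0, 200.0, 300.0):
    print(f"pT = {pt:5.0f} GeV: extra uncertainty = {100 * extra_uncertainty(pt):.1f}%")
```

With this choice the geometric mean of the two anchor points, about 141 GeV, receives half of the maximal extra uncertainty.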
The light quark mistag rate calculated by MC simulation has been validated on data by studies
using a negative-tag discriminator to within a systematic uncertainty of about 50% [CMS10a].
This uncertainty has been directly propagated to the light quark mistag rate used in the present
analysis. The resulting uncertainty is only a few percent across most of the kinematic range, but
grows up to 15% at high pT in the most forward rapidity bins.
The charm mistag rate is constrained by the secondary vertex mass template fits, whose results are
shown in Fig. 6.4 (left), with a data/MC scale factor of 0.976 ± 0.022. The template fit uncertainty
is conservatively taken as the statistical uncertainty of 2.2% added in quadrature with the 2.4%
from the data/MC scale factor of 0.976 that is not applied in this analysis, giving 3.3% as the total
systematic uncertainty for b-tagged sample purity. The systematic uncertainty for the template
fits due to fixing the light quark mistag rate to the MC prediction has been tested by varying
the light quark mistag rate by ±50% and was found to be negligible compared to the statistical
uncertainty. These studies constrain the charm mistag rate uncertainty to 20% or better, which
is then propagated into an uncertainty in the analysis. The resulting uncertainty is around 3–4%
and flat in pT and y.
The difference between the inclusive JEC and the b-JEC was studied by using the MC truth after
applying the standard inclusive JEC. The residual difference in MC is less than 1% at pT > 30 GeV
where the b-JEC uncertainty contributes most, and the difference in data could be expected to be
of the same magnitude. Due to the steeply falling pT spectrum, a 1% b-JEC uncertainty leads to
about 5% uncertainty on the ratio of b-jet and inclusive jet cross section. Here it is interesting to
note that direct measurements done at CDF using Z → bb̄ observed a relative b-jet scale of
0.971 ± 0.011 [D+08]. The significantly smaller relative b-jet correction expected at CMS can be
attributed to the Particle Flow reconstruction, which natively includes muons from semileptonic
decays and is more robust against differences in jet fragmentation than the calorimetric jets used
in the CDF measurement.

Figure 6.6: Leading sources of systematic uncertainty for the b-jet cross section measurement at
|y| < 0.5 (top left) and at 1.5 ≤ |y| < 2.0 (top right), and for the ratio of b-jet and inclusive jet
cross section measurements at |y| < 0.5 (bottom left), and 1.5 ≤ |y| < 2.0 (bottom right). The
11% luminosity uncertainty is not shown.

Figure 6.6 shows a summary of the leading sources of uncertainty for the b-jet cross section and for
the ratio of b-jet and inclusive jet cross sections. The contribution from luminosity uncertainty is
completely canceled out in the ratio, and the contributions from JEC and JER [CMS10f] are largely
reduced at pT > 20 GeV. The remaining leading systematics for the ratio are b-tagging efficiency,
relative b-jet scale and charm mistag rate, all contributing with similar weight and leading to a
flat total uncertainty of about 20% at pT > 20 GeV.
The reconstructed MC has been processed through the same analysis chain as the data, and the
results have been compared to the MC truth results. This closure test found overall agreement to
better than 1% (10%) at pT > 30 GeV (pT > 15 GeV) and |y| < 2.0. The worse closure test at low
pT can be explained by the large size (more than a factor of ten at pT < 20 GeV) of the b-tagging
correction at low pT, combined with relatively poor MC statistics (10% uncertainty at 10 GeV).

Figure 6.7: Measured b-jet cross section compared to the MC@NLO calculation, overlaid (left) and
as a ratio (right). The Pythia prediction is also shown, for comparison.

6.1.3 Measurement
The measured b-jet cross section is shown as a stand-alone measurement in Fig. 6.7 and as a
ratio to the inclusive jet pT spectrum in Fig. 6.8. The inclusive jet NLO theory prediction is
calculated with NLOJet++ [Nag02] using CTEQ6.6M PDF sets [P+02] and the fastNLO [KRW06]
implementation. The factorization and renormalization scales were set to μF = μR = pT. The
inclusive b-jet prediction is calculated with MC@NLO [FW02; FNW03] using the CTEQ6M PDF
set and the nominal b-quark mass of 4.75 GeV, giving a total b cross section of 238 μb. The
parton shower is modeled using Herwig 6.510 [M+92]. The results are compared to a NLO theory
prediction (MC@NLO) and to the Pythia MC (tune D6T [Fan07]), and are found to be in good
agreement with Pythia and in reasonable agreement with MC@NLO. The NLO calculation is found
to describe the overall fraction of b-jets at pT > 18 GeV and |y| < 2.0 well, but with significant
shape differences in pT and y.
Fitting the measured ratio of data to Pythia in the phase space window 30 < pT < 150 GeV and
|y| < 2.0 to a constant, we obtain a global scale factor of 0.99 ± 0.02 (stat) ± 0.21 (syst), where
the systematic uncertainty is a weighted average over all the bins contributing to the fit. The
fit has χ²/NDF = 43.4/47. Repeating the same fit for the ratio between reconstructed MC and
generator-level MC results in a scale factor of 1.009 ± 0.005 with χ²/NDF = 246/46, confirming
good closure of the analysis chain. Finally, the NLO/MC global scale factor is 1.04 ± 0.05.
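A fit of a constant to binned ratios with Gaussian uncertainties has a closed-form solution, the inverse-variance weighted mean. The sketch below illustrates this under that assumption; the function name and the example numbers are hypothetical and are not the measured ratios of this analysis.

```python
def fit_constant(values, errors):
    """Weighted least-squares fit of a constant to points with Gaussian errors."""
    weights = [1.0 / (e * e) for e in errors]
    c = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    c_err = (1.0 / sum(weights)) ** 0.5
    chi2 = sum(((v - c) / e) ** 2 for v, e in zip(values, errors))
    ndf = len(values) - 1  # one fitted parameter
    return c, c_err, chi2, ndf

# Hypothetical data/MC ratios and statistical errors, for illustration only.
ratios = [1.02, 0.97, 1.00, 0.99]
errors = [0.04, 0.05, 0.03, 0.06]
scale, scale_err, chi2, ndf = fit_constant(ratios, errors)
print(f"scale factor = {scale:.3f} +- {scale_err:.3f}, chi2/NDF = {chi2:.2f}/{ndf}")
```

The returned chi2/NDF plays the same role as the values quoted in the text: it measures how well a single global scale factor describes all bins.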
The total b cross section of 238 μb from the MC@NLO calculation has a sizable uncertainty from
the choice of renormalization scale between μR = 0.5 and μR = 2 (+40%, −25%), from CTEQ
PDF variations (+10%, −6%), and from the choice of b-quark mass between 4.5 GeV and 5.0 GeV
(+17%, −14%). The dominant scale uncertainty is overlaid as an uncertainty band around the
MC@NLO prediction in Figs. 6.7 (b) and 6.8.
The unfolding method used so far is difficult to reuse for the update. This is because of the steeply
falling pT spectrum, which is covered in the updated measurement. The pT distribution spans
many orders of magnitude, and in this situation the unfolding method leads to a bias towards
higher values. There are ongoing studies to solve these problems. The b-tagging measurements
can be updated to the 36/pb data set.

Figure 6.8: Measured b-jet cross section as a ratio to inclusive jet cross section. The NLO theory
and Pythia MC predictions are shown for comparison.

6.2 Update of the flavor content fitter
In this section I will present the results of the b-jet cross section measurement using a flavor content
fit to extract the fraction of b-jets in a so-called tagged sample. Tagged means that only jets with a
large probability to be a b-jet are selected. This is achieved by a cut on the significance of the
secondary vertex flight distance, i.e. on how likely it is that the secondary vertex stems from a
particle with a measurable lifetime. This is an important property of b-jets, because they
contain a decay vertex of long-lived B hadrons. To get the cross section right, an estimate
of the b-jet efficiency for the tagged sample is needed in addition.
In the following I will introduce the method of template fitting, which is used for the flavor
content fit. Thereafter I will specify the regions where these fits are applied and present the results
for the different intervals in transverse momentum and rapidity of the jet.

6.2.1 Template fit
Just for completeness I describe here the method of a binned log-likelihood fit. Given a histogram
with a known number of bins n_bins filled with the values of the variable of interest, the number of
entries d_i in each bin i follows a Poisson distribution. The goal of the binned log-likelihood fit is to
vary the parameters p of a given model F_i to their most probable values. This is done by a
minimization of the extended negative log-likelihood function with TMinuit. The statistical
uncertainties are calculated by Minos. The binned log-likelihood function, up to terms that do not
depend on the parameters, is shown in the following equation:

    -2 \log L = -2 \sum_{i=0}^{n_{bins}} \left[ d_i \cdot \log(F_{p,i}) - F_{p,i} \right]

There are different ways to choose the parametrization p. The parametrization must be chosen
depending on the information one is interested in. The first parametrization (p=0) estimates the
number of events for each template. For n_t templates we have the same number of free parameters
N_0 ... N_{n_t}.

    F_{0,i} = F_i(N_0 \ldots N_{n_t}) = \sum_{k=0}^{n_t} N_k t_{k,i}


Each parameter is an estimate of how many events of a template class k are in a given data sample.
The fit extracts the number of events and its statistical uncertainties.
The second parametrization (p=1) tries to extract the fractions f_k of events relative to the total
number of events N_tot in the given data sample. Therefore N_tot is one of the free parameters. A
parametrization with all fractions f_k is not possible. One fraction can be replaced by the others:
f_k = 1 − Σ_{j≠k} f_j, because the sum of all fractions has to be 1.
Another difficulty is that the fractions have natural limits. They have to be between 0 and 1.
The goal is to have a parametrization which can take this into account. It is therefore only possible
to estimate one of the fractions, r_0 ∈ {f_k}, directly at a time; each further parameter r_k is the
fraction of the events left over by the templates before it:

    F_{1,i} = F_i(N_{tot}, r_0 \ldots r_{n_t-1}) = N_{tot} \left( \sum_{k=0}^{n_t-1} r_k t_{k,i} \prod_{j=0}^{k-1} (1 - r_j) + t_{n_t,i} \prod_{j=0}^{n_t-1} (1 - r_j) \right)

In principle it is possible to estimate the values of one parametrization out of the parameters from
the other. Due to the asymmetric uncertainties of the values the error propagation is difficult and
an additional fit is more reasonable.
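The two parametrizations can be sketched in a few lines of Python. This is an illustrative sketch only and not the TMinuit-based fitter used in the analysis: the function names, the toy template shapes and the convention that every template is normalised to unit area are assumptions.

```python
import math

def model_counts(n_events, templates):
    """Parametrization p=0: F_i = sum_k N_k * t_{k,i} (templates normalised to 1)."""
    n_bins = len(templates[0])
    return [sum(n * t[i] for n, t in zip(n_events, templates)) for i in range(n_bins)]

def fractions_from_r(r):
    """Map bounded parameters r_k in [0, 1] to template fractions f_k that sum to 1."""
    fractions, remainder = [], 1.0
    for r_k in r:
        fractions.append(r_k * remainder)  # share of what the earlier templates left
        remainder *= 1.0 - r_k
    fractions.append(remainder)            # the last template takes the rest
    return fractions

def model_fractions(n_tot, r, templates):
    """Parametrization p=1: F_i = N_tot * sum_k f_k(r) * t_{k,i}."""
    return model_counts([n_tot * f for f in fractions_from_r(r)], templates)

def neg2_log_likelihood(data, expected):
    """-2 log L for Poisson bins, without parameter-independent terms.

    Assumes every expected bin content is positive.
    """
    return -2.0 * sum(d * math.log(f) - f for d, f in zip(data, expected))

# Toy templates (hypothetical shapes, each normalised to 1): b, c, light.
templates = [[0.1, 0.3, 0.6], [0.3, 0.4, 0.3], [0.7, 0.2, 0.1]]
f_b, f_c, f_light = fractions_from_r([0.7, 0.5])
expected = model_fractions(1000.0, [0.7, 0.5], templates)
print(f_b, f_c, f_light, neg2_log_likelihood(expected, expected))
```

Because every r_k is confined to [0, 1], all resulting fractions automatically stay between 0 and 1 and sum to 1, which is the motivation for the second parametrization given above. A minimizer would vary N_tot and the r_k until neg2_log_likelihood is smallest.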

6.2.2 pT/|y| binning
Choosing the pT and y binning for a differential jet cross section measurement has to take two
conflicting aspects into account. On the one hand, one wants a very fine binning in order
to have enough well defined points to fit the assumed function to. On the other hand, one has to
have enough statistics in each bin for reasonable template fits which yield the fraction of b-jets
with good enough precision.
It is also not necessary to run over all MC samples for creating the templates. We only used the
ones which influence the statistics by more than 0.1%. We also try to avoid isolated events with
very large weight. The different bins in pT and the MC sample selection for each bin are listed in
table 6.1. Furthermore, there is an additional binning into the barrel region |y| < 1.5 and the
forward region 1.5 ≤ |y| < 2.5 of the detector. Studies of a finer binning in rapidity need a merging
of the pT bins. For the update of the CMS Physics Analysis Summary [CMS10e] such a study was
done, but is not presented in this thesis.

6.2.3 Fit results
The flavor content fit is performed to measure the fraction of b-jets in a tagged sample. An enriched
b-jet sample was used. Therefore the jet has to pass the tight working point of the simple secondary
vertex high purity b-jet tagger, SSVHP > 2. This represents a cut on the significance of the secondary
vertex flight distance [Sch08]. It is required that the vertex is reconstructed with at least three
tracks. The selection results in a pure sample of b-jets with a small fraction of light jets and c-jets.
This selection is motivated by the aim to reduce the systematic uncertainties of the flavor content
fit using the secondary vertex mass mSV. Therefore we have to require a well understood secondary
vertex. Figure 6.9 shows the distribution of the simple secondary vertex high purity b-jet tagger and
the vertex mass as published for the CMS commissioning in [CMS10a]. There is a good agreement
between MC and data.
The flavor content fits are performed in the different bins of the transverse momentum pT. Three
templates for b-jets, c-jets and light jets were created for each pT/y region for the final result. As
a cross check the b-jet fraction was also estimated with two templates (b-jet and non-b-jet). The
results were the same within the statistical uncertainties. Table 6.2 shows the results of the flavor
content fits. The statistical uncertainties are also quoted. Furthermore, the result of the
three-template fit is plotted in figure 6.10. The fit result of each bin can be seen in appendix D.

MC samples: QCD0to5, QCD5to15, QCD15to30, QCD30to50, QCD50to80, QCD80to120, QCD120to170,
QCD170to300, QCD300to470, QCD470to600, QCD600to800, QCD800to1000, QCD1000to1400,
QCD1400to1800, QCD1800

pT range [GeV]       selected MC samples    trigger
37 ≤ pT < 43         x x x x x x            HLTJet15U
43 ≤ pT < 49         x x x x x x
49 ≤ pT < 56         x x x x x x
56 ≤ pT < 64         x x x x x x
64 ≤ pT < 74         x x x x x x
74 ≤ pT < 84         x x x x x x
84 ≤ pT < 97         x x x x x x            HLTJet30U
97 ≤ pT < 114        x x x x x x
114 ≤ pT < 133       x x x x x x            HLTJet50U
133 ≤ pT < 153       x x x x x x
153 ≤ pT < 174       x x x x x              HLTJet70U
174 ≤ pT < 196       x x x x x
196 ≤ pT < 220       x x x x x              HLTJet100U
220 ≤ pT < 245       x x x x x
245 ≤ pT < 272       x x x x x              HLTJet140U
272 ≤ pT < 300       x x x x x
300 ≤ pT < 330       x x x x x x x
330 ≤ pT < 362       x x x x x x x
362 ≤ pT < 1000      x x x x x x x          HLTJet180U
Table 6.1: Selected bins for the analysis

Figure 6.9: Distribution of the simple secondary vertex high purity b-jet tagger and the
corresponding vertex mass reconstructed with at least three tracks.

                     3 templates            2 templates
pT range [GeV]       fb       err(stat)     fb       err(stat)
37 ≤ pT < 43         0.718    0.014         0.719    0.014
43 ≤ pT < 49         0.722    0.017         0.721    0.017
49 ≤ pT < 56         0.747    0.019         0.750    0.019
56 ≤ pT < 64         0.761    0.022         0.761    0.022
64 ≤ pT < 74         0.745    0.027         0.742    0.027
74 ≤ pT < 84         0.815    0.037         0.815    0.038
84 ≤ pT < 97         0.741    0.013         0.741    0.013
97 ≤ pT < 114        0.719    0.016         0.720    0.016
114 ≤ pT < 133       0.693    0.008         0.691    0.008
133 ≤ pT < 153       0.713    0.012         0.713    0.012
153 ≤ pT < 174       0.720    0.011         0.717    0.011
174 ≤ pT < 196       0.702    0.015         0.702    0.015
196 ≤ pT < 220       0.729    0.016         0.727    0.016
220 ≤ pT < 245       0.689    0.024         0.690    0.023
245 ≤ pT < 272       0.678    0.026         0.685    0.025
272 ≤ pT < 300       0.671    0.035         0.671    0.034
300 ≤ pT < 330       0.710    0.044         0.712    0.045
330 ≤ pT < 362       0.735    0.061         0.755    0.065
362 ≤ pT < 1000      0.584    0.072         0.581    0.070
Table 6.2: Fractions of b-jets in a tagged jet sample extracted by fitting on the secondary vertex
mass distribution
Within its statistical limitation the fit results agree with the expected distribution. The expectation
values are calculated from the samples which were used to determine the templates for the fit.
For the high pT jets we see an overestimation of the b-jet fraction. To explain this, the spectrum
was studied as a function of the number of primary vertices in the event. The result of this
is shown in figure 6.11 for the bins with enough statistics.
Within the available statistics it is possible to argue that the flatter distribution is caused by the
existence of more than one primary vertex. These could be effects of an underlying event [CMS10i]
or additional proton-proton interactions (pileup). To finally clarify this issue, more statistics is
needed.
Finally we tried to study the dependence of the purity on the jet rapidity. To do so we had to
reduce the number of bins in pT to get sufficient statistics. The result is plotted in figure 6.12.
Again we find a good agreement between data and simulations.

6.2.4 Systematic uncertainties
In this section studies on the systematic uncertainties of the fitting procedure are presented.

Template statistics
The basic idea is that each template is a random realization of the true distribution for a given
number of events. Each bin content fluctuates around an unknown true mean μ. The fluctuations
follow a Poisson distribution. The disagreement between these values and the truth contributes as
a systematic effect to our fitting procedure.

Figure 6.10: Result for the pT spectrum determined by the flavor content fitter for |y| < 2.5. The
parametrization is chosen to measure directly the b-jet fraction for tagged jets (p=1). Only
statistical errors are shown in this figure.

Figure 6.11: Study of the dependency on pileup effects. The histograms show the b-jet fraction
of a tagged sample for different numbers of primary vertices. The statistics is too low to claim a
final conclusion.
(Panels: data with |y| < 2.2 for nPV = 1, nPV = 2 and nPV > 2, compared to MC DiJet 36X.)

Figure 6.12: The plot shows the measured dependencies of the b-jets in pT and |y| determined by
the flavor content fitter. There is an agreement between data and simulations.
To estimate this systematic uncertainty we vary the templates by replacing the content of each bin
with a random number following a Poisson distribution with the original bin value as its mean. These
new templates are fed to the flavor content fitter. This is done many times. The results of such a
variation can be seen in figure 6.13. The fit results vary around the original fit value. The width
of this distribution can be taken as an estimate for the systematic uncertainty σtempStat.
Figure 6.14 shows the systematic uncertainty σtempStat we assumed for the different bins. The
systematics in the forward region are higher due to the smaller statistics in this region. Furthermore
we see a rise with pT. This rise drops when a bin is reached where a different Monte Carlo sample
is used. This confirms the decision to use only the most relevant samples for a specific pT region.
Overall the effects of large weights contribute to this kind of systematic uncertainty.
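The template variation described above can be sketched as a small toy study. The code below is an assumption-laden illustration, not the analysis code: the real procedure refits with the flavor content fitter, whereas here a simple two-template least-squares estimate (the hypothetical fit_b_fraction) stands in for the fit, and all bin contents are invented.

```python
import math
import random

def poisson_sample(mean, rng):
    """Draw one Poisson-distributed integer (Knuth's method, fine for moderate means)."""
    limit, k, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def normalise(hist):
    total = float(sum(hist))
    return [x / total for x in hist]

def fit_b_fraction(data, b_template, other_template):
    """Stand-in for the real fit: least-squares b fraction for two templates."""
    d, b, o = normalise(data), normalise(b_template), normalise(other_template)
    num = sum((di - oi) * (bi - oi) for di, bi, oi in zip(d, b, o))
    den = sum((bi - oi) ** 2 for bi, oi in zip(b, o))
    return num / den

def template_stat_uncertainty(data, b_template, other_template, n_toys=200, seed=1):
    """Refit with Poisson-fluctuated templates; return mean and width of the results."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_toys):
        b_toy = [poisson_sample(x, rng) for x in b_template]
        o_toy = [poisson_sample(x, rng) for x in other_template]
        results.append(fit_b_fraction(data, b_toy, o_toy))
    mean = sum(results) / n_toys
    width = math.sqrt(sum((r - mean) ** 2 for r in results) / (n_toys - 1))
    return mean, width

# Hypothetical raw MC bin contents and a data histogram built with f_b = 0.7.
b_mc = [10, 30, 60, 100, 80, 40]
other_mc = [120, 80, 40, 20, 10, 5]
data = [1000 * (0.7 * b + 0.3 * o) for b, o in zip(normalise(b_mc), normalise(other_mc))]
nominal = fit_b_fraction(data, b_mc, other_mc)
toy_mean, sigma_temp_stat = template_stat_uncertainty(data, b_mc, other_mc)
print(f"nominal f_b = {nominal:.3f}, toys: mean = {toy_mean:.3f}, width = {sigma_temp_stat:.3f}")
```

The width of the toy results corresponds to σtempStat. Templates with fewer entries give a larger width, in line with the larger values observed in the forward region.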

c and light fractions
The shapes of the templates look very similar for c-jets and light jets. The flavor content fit was
performed with two and also with three templates. As shown in table 6.2 the results of both agree
for all fitting regions. The result for the b-jet fraction is independent of the usage of the two
or three template method.
Besides the measurement of the b-jet fraction it is also interesting to analyze the c-jet fraction.
With the result of the three template method we also get an estimate of this quantity. Figure 6.14
shows on the right the fraction of the c-jets f_c. To calculate the c-jet fraction in the sample the
fitted values must be multiplied by the non-b-jet fraction.

    f_c = r_c (1 - f_b)
Because of the selection of a pure b-jet sample the statistics for measuring this is poor
and we get large statistical errors. A detailed study on f_c is not part of this thesis. Nevertheless
the agreement between data and simulations within the statistical uncertainties leads to a small
systematic uncertainty from a possibly defective c-jet contribution. Using the three template method,
systematic uncertainties caused by this are covered by the study of the variations of the template
bin statistics.

Figure 6.13: The fit results of different fb estimations with varied templates describe nearly a
Gaussian distribution with the mean of the original value. The width of this can be taken as
systematic uncertainty σtempStat.
(Shown bin: |y| ≤ 1.5, 37 GeV < pT ≤ 43 GeV; 1001 entries, mean 0.7189, RMS 0.004289.)

Figure 6.14: The plot on the left shows the variance of the flavor content fit, if the templates were
varied within their statistical uncertainties. These contribute to the systematics. On the right
the fraction of c-jets fc is computed and compared to the MC expectations. The good agreement
allows a more strict estimate on the uncertainty caused by badly simulated c-jets.

Figure 6.15: Efficiencies for b-jets passing the SSVHP > 2.0 cut by pT and y. The values are
extracted from a Pythia 6 QCD Monte Carlo generator.

6.2.5 Tagging efficiencies from Monte Carlo simulation
As described above, the jets are required to have an SSVHP discriminator greater than 2.0. This is
the “tight” working point suggested by the b-tagging group. This cut rejects most of the light jets,
which might not be well simulated in MC and could introduce huge differences between the light
content of the data sample and the MC light template. But on the other hand one has to measure
the efficiency for a b-jet passing this discriminator requirement for each pT/|y| bin. Since the
statistics on data is too low for an efficiency measurement, this is done on MC.
The b-tagging efficiency ε_b is defined as:

    \varepsilon_b = \frac{N_b^{tagged}}{N_b^{tagged} + N_b^{dumped}}

where N_b^tagged is the number of b-jets passing the SSVHP > 2.0 cut and N_b^dumped is the number
that fail this cut. So one only has to count the number of b-jets passing the cut and the number
of those which are cut away.
Figure 6.15 shows the efficiency extracted from the Pythia 6 QCD Monte Carlo samples for the
different bins.
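The efficiency definition above is a simple counting ratio. The sketch below evaluates it for one bin; the function name, the example counts and the binomial uncertainty estimate are assumptions added for illustration (the binomial formula holds only for unweighted events).

```python
import math

def tagging_efficiency(n_tagged, n_dumped):
    """Return eps_b = N_tagged / (N_tagged + N_dumped) and a binomial uncertainty."""
    n_total = n_tagged + n_dumped
    eff = n_tagged / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)  # assumes unweighted MC events
    return eff, err

# Hypothetical counts of true b-jets in one pT/|y| bin of an MC sample.
eff, err = tagging_efficiency(n_tagged=450, n_dumped=550)
print(f"eps_b = {eff:.3f} +- {err:.3f}")
```

For MC samples with event weights, as used here for the different QCD samples, the counts would be replaced by sums of weights and the uncertainty would need a weighted treatment.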

Figure 6.16: Normalized distribution for the number of b-jets extracted by the flavor content fit.
The values are compared with the Pythia 6 QCD TuneZ2 expectations. There is good agreement.
6.2.6 Updated result
In the previous sections I showed the updates I made to the individual parts of the analysis. Now we
have to combine the results of all colleagues working on other parts of this analysis to get the final
b-jet cross section measurement. Due to these constraints, it was not possible to obtain this result
for this thesis.
Unfortunately the unfolding procedure does not work as expected for the increased statistics.
The steeply falling pT spectrum leads to a systematic effect while applying the ansatz fit. More
discussion in the collaboration and maybe an alternative approach are needed to produce a final
result.
Nevertheless I will present a picture of the measurement, where all inputs are combined, but
without performing the unfolding. Let us have a look at the differential b-jet cross section:

    \frac{d^2\sigma_{b\text{-jet}}}{dp_T\,dy} = \frac{N_{jet}\, f_b\, C_{unfold}}{\varepsilon_b\, L \cdot \Delta p_T\, \Delta y} = \frac{N_{b\text{-jet}}\, C_{unfold}}{L \cdot \Delta p_T\, \Delta y}
We have already measured the b-jet purity f_b and the number of jets N_jet. From our CMS
colleagues we get the results of the measurements of the integrated luminosity L. The b-jet
efficiency is still extracted from MC. Apart from the unfolding correction factor we are able to
create a normalized histogram for the number of b-jets. Figure 6.16 shows the determined values.
In addition the expected distribution calculated by Pythia 6 QCD TuneZ2 is drawn. No
disagreement between data and MC is found.
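How the inputs enter the differential cross section can be illustrated with a short function. It is a sketch under stated assumptions: the function name and all numerical inputs are hypothetical, and the unfolding correction is set to 1 because the unfolding is not performed here.

```python
def b_jet_cross_section(n_jet, f_b, eff_b, lumi, delta_pt, delta_y, c_unfold=1.0):
    """d^2 sigma / (dpT dy) = N_jet * f_b * C_unfold / (eps_b * L * dpT * dy)."""
    n_b_jet = n_jet * f_b / eff_b  # efficiency-corrected number of b-jets
    return n_b_jet * c_unfold / (lumi * delta_pt * delta_y)

# Hypothetical inputs for one bin: 5000 tagged jets, purity 0.72, efficiency 0.40,
# integrated luminosity 36 pb^-1, bin width 6 GeV in pT and 1.0 in y.
xsec = b_jet_cross_section(n_jet=5000, f_b=0.72, eff_b=0.40, lumi=36.0,
                           delta_pt=6.0, delta_y=1.0)
print(f"d2sigma/dpT dy = {xsec:.2f} pb/GeV")
```

With the luminosity given in pb^-1 and the bin width in GeV, the result is in pb/GeV. Leaving c_unfold at 1 corresponds to the normalized number of b-jets shown in figure 6.16.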
Finally I will give an outlook on the expected b-jet cross section distribution. As stated before, the
unfolding procedure is not yet completed. The plot in figure 6.17 shows the result with updated
statistics.
The result is compared with MC@NLO simulations. Due to the unfolding challenge, values are only
calculated up to a certain threshold.

Figure 6.17: The complete analysis of the b-jet cross section measurement was updated. Most
subparts are updated. The update of the purity measurement presented here is included. Once
all analysis parts are combined, a similar plot will go for publication.

6.3 NeuroBayes application

The methods presented for the b-jet cross section measurements do not yet use NeuroBayes at all.
There are different possibilities to do this analysis with the use of multivariate analysis techniques.
The obvious one is the use of the NeuroBayes b-jet tagger, which was presented in this thesis. In the
following I will describe how the flavor content fit method can be changed for the new b-jet tagger.

6.3.1 NeuroBayes template fit
The obvious application of the NeuroBayes b-jet tagger is to use the discriminator variable within
the flavor content fitter. There are different possibilities to include this tagger.
It is possible to change the target of the template fit. So far the distribution of the secondary
vertex mass was used. The secondary vertex mass is a reliable variable for analysis on early data.
It has an adequate separation between b-jets and non-b-jets. Furthermore it has an understandable
shape, which makes it easier to notice possible problems. For first data the secondary vertex mass
was a good choice.
Now we have studied all variables which are useful for b-jet tagging. The reconstruction software
has been calibrated and we found a good agreement between data and simulations (see section
5.3.3). This allows us to use another variable with more discrimination power instead of the
secondary vertex mass.
Furthermore we can use another variable for the selection of the subsample which is used for the
template fit. Therefore the simple secondary vertex b-tagger must be replaced by a new one. The
cut down of the sample was made to reduce a possible dependency on badly simulated background
distributions. But the application of this cut had its price. The sample was reduced not only
in light jets and c-jets, but also in b-jets. For the b-jet cross section analysis an estimate of the
b-tagging efficiency ε_b is needed. For the recent measurement we took the expectation value from
MC. This was necessary, because the accuracy we got from an efficiency estimate on data was not
good enough. We did not gain from such a measurement.
Now we have enough data for a reasonable analysis of the b-tagging efficiency. Such measurements
will be done at CMS coordinated by the b-tagging group (BTV POG) for all official CMS b-jet
taggers. Unfortunately the new NeuroBayes b-jet tagger is not yet official. A measurement of its
efficiency is planned at our institute in the future.

Figure 6.18: Exemplary result of the flavor content fitter. Target distribution is the NeuroBayes
b-jet tagger output. On the left the result of the fit to the data distribution of the three different
classes is shown. The right plot shows the ratio between fitted templates and data distribution.
(Shown bin: 114 GeV < pT ≤ 133 GeV, 0 < |y| ≤ 0.5.)
In the following I will present what happens if the flavor content fit (FCF) is applied to the whole
data sample without a preselection. The FCF is used in the usual settings with three templates
for light jets, c-jets and b-jets. For the templates the inclusive distributions of the NeuroBayes
combined b-jet tagger, which was introduced in section 5.3.2, are used. Figure 6.18 shows an
exemplary result of a specific pT/y bin, representative for all bins (see appendix E).
On the left the template distributions are plotted with a logarithmic scale on the y-axis. The
templates are scaled to the numbers extracted from the fit. The data points are plotted in black for
an easy comparison. It is already visible that in the central region we miss some jets. The plot on
the right confirms this. Here the ratio between data and the final fit distribution is plotted.
It seems that the MC distributions are not as well simulated as expected. For small values of
the NeuroBayes output distribution, o_t ≈ 0.05−0.1, we found an excess of jets in data. This is
in a region dominated by light jets. It seems that they appear at larger o_t values than expected.
Because of the large fraction of light jets this affects the fit of the other two templates. Jets from
the other two classes are needed to compensate the missing light jets. In our case this leads to
an overestimation of c-jets. In the central region we can see this. The fit expects more jets than
available in data. Finally the b-jet template fits the rest of the distribution not yet covered by the
c-template. The fitted fraction tends to be smaller than the expected values from MC.
Figure 6.19 shows the result of the flavor content fitter applied in all pT/y bins for b-jets. All fit
results lie below the expectations.
The simulations are not good enough for the application of the flavor content fitter at this level.
A more detailed study on inclusive light jet distributions is needed to identify the objects which
cause the shape differences between data and MC.

Figure 6.19: The flavor content fitter was used with the NeuroBayes b-jet tagger. The result of the
fit in different pT/y bins is shown. No preselection on the sample was applied. The NeuroBayes
variable is sensitive to differences between b-jets and non-b-jets. Due to the differences in the
inclusive shapes no convincing fit result is found. The poor performance of the fitting procedure
points to an insufficient simulation of the jet distributions.

Chapter 7

Conclusion

The modeling of b-quark production is one of the most challenging topics in elementary particle
physics. Although the underlying theory, QCD, was developed in the 1960s, it was not possible to
predict the correct heavy quark cross sections for a long time. Especially for b-quarks this led
to curious discrepancies between the results measured by experiments and insufficient
predictions. In the 1990s the experiments at the Tevatron (Run 1) as well as the experiments at
LEP claimed an excess in b-quark production. Acceptable QCD calculations were found only after
the revision of the next-to-leading order calculations, in which the expansion of large logarithmic
terms was incorporated, and after the improvement of the parametrization of the non-perturbative
parts. Following this, measurements at the CDF detector during Tevatron Run 2 confirmed
these calculations.
Today we redid the old CDF measurements at the CMS experiment. In contrast to the results
at that time, further improvements of the theory were included. At HERA the proton structure
function was measured in detail. This enables a further, more accurate verification of the theory.
The analysis published so far was done with very early CMS data with an integrated luminosity of
60/nb. The results were also presented in this thesis. We found an overall good agreement between
data and Pythia in the jet transverse momentum range 30 < pT < 150 GeV and rapidity |y| <
2.0, within about 2% statistical uncertainty and 21% systematic uncertainty. In comparison with
MC@NLO predictions we found significant differences in shape.
Furthermore this thesis presented the measurements of the b-jet purity for an update of the recent
CMS analysis. The update includes the experience of one year of data taking, which results in an
integrated luminosity of 36/pb. The result will be published in the near future.
I showed that the results from the recent and the updated measurement are based on the simple
secondary vertex b-jet tagger. This is a very robust tagger developed for the use on early data.
For further improvements of the b-jet cross section analysis it is recommended to move to a more
powerful tagger. For this a new b-jet tagger was constructed. This b-tagger uses the multivariate
analysis framework NeuroBayes.
The layout and the features of the NeuroBayes framework were described in detail in this thesis.
I presented new multivariate tools based on NeuroBayes, which allow us to compare data and
Monte Carlo simulations. These tools facilitate a quick and easy search for unexpected aspects
of the data. This comparison was done for jets in the b-jet specific phase space. A good overall
agreement was found.
This knowledge enabled the construction of a data-based b-jet tagger. With this specific approach
it is possible to ignore the possibly badly simulated background events from Monte Carlo. Instead,
this information is taken from the data sample.
I showed that it is further possible to correct for the small difference in the data/MC comparison.
The correction factor was calculated from output values calculated by the NeuroBayes expertise.
Further improvements were made by the so-called boost method, which optimizes the classification
for a pure b-jet selection.
The final data-based b-jet tagger was compared to existing b-taggers from the CMS collaboration.
This is usually done at the different working points called loose, medium and tight, which
correspond to the values of the b-tag efficiency calculated at mistag rates of 10%, 1% and 0.1%.
Compared to the official b-jet probability tagger (JBP) I found efficiency improvements of 3%, 7%
and 29%.

The new b-jet tagger is an additional tool for many analyses planned at the CMS experiment.
Especially the possibility to identify b-jets with a small rate of misidentification qualifies it for
studies of heavy particles which decay into b-quarks. For SUSY searches, exotic particles and top
physics the use of this tagger may play a decisive role.

The full output of the new b-jet tagger is a strongly discriminating variable for inclusive jet
distributions. A measurement of the b-jet appearance was therefore set up. A strong dependence
on the light-jet contribution was found. A final result requires more detailed studies of the
background distributions. With these, the measurement of the differential b-jet cross section
can be improved.

Appendix A

Distributions of b-jet tagging variables

The following plots contain the distributions of all variables used in this thesis. The variable
name is given on the x-axis, as defined in section 5.3. The plots appear in the order in which
the variables were introduced. The first plots show the data (black) and MC (red) distributions
of each variable. For a quick and easy comparison each variable is plotted several times: the
first column shows the classical histograms (partly with a logarithmic y-axis). The last three
columns show the result of the probability integral transform for data/MC, non-b-jet MC/b-jet MC
and data/b-jet MC.
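
The idea behind the probability integral transform can be sketched in a few lines: each value of the compared sample is mapped through the empirical cumulative distribution of a reference sample, so that the reference itself becomes flat and any deviation from flatness signals a difference between the two samples. This is a minimal illustration with an empirical CDF. The function name flat_bin_counts and the choice of 100 bins are assumptions, and the sketch does not reproduce the NeuroBayes implementation.

```python
import numpy as np

def flat_bin_counts(reference, sample, n_bins=100):
    """Histogram `sample` in bins in which `reference` is (nearly) flat."""
    ref = np.sort(np.asarray(reference, dtype=float))
    values = np.asarray(sample, dtype=float)
    # Empirical CDF of the reference, evaluated at each sample value.
    u = np.searchsorted(ref, values, side="right") / ref.size
    counts, _ = np.histogram(u, bins=n_bins, range=(0.0, 1.0))
    return counts

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mc = rng.normal(0.0, 1.0, 50_000)      # hypothetical reference sample
    data = rng.normal(0.1, 1.0, 50_000)    # hypothetical, slightly shifted
    print(flat_bin_counts(mc, data, n_bins=10))
```

If the two samples follow the same distribution, the counts scatter statistically around a constant. A slope or a peak in the transformed histogram points to a region where they differ.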

[Figures: distributions of the track variables, four panels per variable. Panel titles: "CMS private work 2010, √s = 7 TeV" and "CMS simulation, √s = 7 TeV". First panel: data and pythia 6 QCD TuneZ2, y-axis "N/N_total per bin". Remaining panels (NeuroBayes <phi-t> Teacher, y-axis "no. of events in flat bins"): data 38X and pythia6 QCD; pythia6 QCD b-jet and pythia6 QCD non-b-jet; data 38X and pythia6 QCD b-jet. Variables on the x-axis: trackMom, trackEta, trackSip2dSig, trackSip3dSig, trackSip2d, trackSip3d, trackLxy, trackPtRel, trackPpar, trackJetDeltaR, trackJetDist, trackPtRelFrac, trackPparFrac, trackChi2, trackPxHits, trackHits, trackBDist, trackBdistSig, trackBweight.]

[Figures: distributions of the secondary vertex variables, with the same four panels per variable (data and pythia 6 QCD TuneZ2 histogram, followed by three flat-bin panels from the NeuroBayes <phi-t> Teacher). Variables on the x-axis: vertexMass, vertexPVDist2d, vertexPVSig2d, vertexPVDist3d, vertexPVSig3d, vertexJetDeltaR, vertexJetEFrac, vertexNtracks, vertexTrackDeltaR, vertexTrackEFrac, vertexCategory.]

[Figures: distributions of the muon variables, with the same four panels per variable (data and pythia 6 QCD TuneZ2 histogram, followed by three flat-bin panels from the NeuroBayes <phi-t> Teacher). Variables on the x-axis: muonMom, muonEta, muonPhi, muonSip2dSig, muonSip3dSig, muonPtRel, muonEtaRel, muonJetDeltaR, muonJetPpar, muonJetPparFrac, muonChi2.]

136APPENDIXA.DISTRIBUTIONSOFB-JETTAGGINGVARIABLES

[Figure: distributions of the electron input variables eleMom, eleEta, elePhi, eleSip2dSig, eleSip3dSig, elePtRel and eleEtaRel. For each variable the panels show data versus Pythia 6 QCD Tune Z2 (N/N_total per bin), and the NeuroBayes Teacher plots (no. of events in flat bins) for data 38X versus pythia6 QCD, pythia6 QCD b-jet versus non-b-jet, and data 38X versus pythia6 QCD b-jet. CMS private work 2010 / CMS simulation, √s = 7 TeV.]


[Figure: distributions of the electron input variables eleJetDeltaR, eleJetPpar, eleJetPparFrac, eleChi2, eleId, eleZpos and eleInvDeltaR. For each variable the panels show data versus Pythia 6 QCD Tune Z2 (N/N_total per bin), and the NeuroBayes Teacher plots (no. of events in flat bins) for data 38X versus pythia6 QCD, pythia6 QCD b-jet versus non-b-jet, and data 38X versus pythia6 QCD b-jet. CMS private work 2010 / CMS simulation, √s = 7 TeV.]


[Figure: distributions of the electron input variables eleGSFDif and eleBrem. For each variable the panels show data versus Pythia 6 QCD Tune Z2 (N/N_total per bin), and the NeuroBayes Teacher plots (no. of events in flat bins) for data 38X versus pythia6 QCD, pythia6 QCD b-jet versus non-b-jet, and data 38X versus pythia6 QCD b-jet. CMS private work 2010 / CMS simulation, √s = 7 TeV.]

[Figure: distributions of the jet input variables jetPt (pT(jet)), jetEta (y(jet)), jetCSV (CSV), jetCSVMVA (CSVMVA), jetBProb (JBP), jetSSV (SSVE), jetSSVP (SSVP) and jetElectronIP (SETIP). For each variable the panels show data versus Pythia 6 QCD Tune Z2 (N/N_total per bin) and the NeuroBayes Teacher plot (no. of events in flat bins) for data 38X versus pythia6 QCD. CMS private work 2010, √s = 7 TeV.]


[Figure: distributions of the jet input variables jetElectronPt (SETPt), jetMuonIP (SMTIP), jetMuonPt (SMTPt), jetTCHE (TCHE) and jetTCHP (TCHP). For each variable the panels show data versus Pythia 6 QCD Tune Z2 (N/N_total per bin) and the NeuroBayes Teacher plot (no. of events in flat bins) for data 38X versus pythia6 QCD. CMS private work 2010, √s = 7 TeV.]

[Figure: distributions of the jet multiplicity variables jetNTrack, jetNSV, jetNEle and jetNMuon. For each variable the panels show data versus Pythia 6 QCD Tune Z2 (N/N_total per bin), and the NeuroBayes Teacher plots (no. of events in flat bins) for data 38X versus pythia6 QCD, pythia6 QCD b-jet versus non-b-jet, and data 38X versus pythia6 QCD b-jet. CMS private work 2010 / CMS simulation, √s = 7 TeV.]


[Figure: distributions of the jet input variables jetTrackProb (trackProb), jetVertexProb (vertexProb), jetElectronProb (electronProb), jetMuonProb (muonProb), jetNGoodTrack (nGoodTracks) and jetNGoodEle (nGoodElectrons). For each variable the panels show data versus Pythia 6 QCD Tune Z2 (N/N_total per bin), and the NeuroBayes Teacher plots (no. of events in flat bins) for data 38X versus pythia6 QCD, pythia6 QCD b-jet versus non-b-jet, and data 38X versus pythia6 QCD b-jet. CMS private work 2010 / CMS simulation, √s = 7 TeV.]

Appendix B

Results of data to Monte Carlo comparison

The figures show the output distributions of the NeuroBayes experts, which were calibrated to
compare physics objects from the detector (black) with simulated objects created with the Pythia
6 QCD Tune Z2 event generation [SMS06] (red). The numbers of QCD events and Monte Carlo
events are of the same order, so the a priori fraction is around 0.5. It is a good sign that
the separation of the two classes is small. Nevertheless, events with small values of the NeuroBayes
output variable represent a kind of event that is underestimated in the simulation. Events in the
region around 0.5 are well simulated, and events with larger values of the NeuroBayes output variable
are overestimated in the simulation. The width of the output distribution is a measure of the
MC quality. If it is too broad, the related input variables have to be examined in more detail.
For each type of physics object (tracks, secondary vertices, electron candidates, muons and jets)
the comparison was done in the six different trigger regions. The following plots show the
corresponding output distributions of the NeuroBayes experts.
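
The reading of the output variable described above can be illustrated with a short sketch. It is not the NeuroBayes implementation. It assumes that the output has been rescaled to the interval [0, 1], that it approximates the probability for an object to come from the simulation, and that the a priori fraction is 0.5 by default. The function names mc_to_data_ratio and output_width are hypothetical and chosen only for this illustration.

```python
import math


def mc_to_data_ratio(output, prior=0.5):
    """Ratio of simulated to measured event density implied by a classifier output.

    Assumption: `output` approximates P(simulation | event) on (0, 1) and
    `prior` is the a priori fraction of simulated events in the training sample.
    A value below 1 means this kind of event is underestimated in the simulation.
    """
    if not 0.0 < output < 1.0:
        raise ValueError("output must lie strictly between 0 and 1")
    posterior_odds = output / (1.0 - output)
    prior_odds = prior / (1.0 - prior)
    return posterior_odds / prior_odds


def output_width(outputs, prior=0.5):
    """RMS deviation of the outputs from the a priori fraction.

    Used here as a simple (assumed) measure of the width of the output
    distribution: 0 means data and simulation are indistinguishable.
    """
    return math.sqrt(sum((o - prior) ** 2 for o in outputs) / len(outputs))


if __name__ == "__main__":
    for o in (0.2, 0.5, 0.8):
        print(o, mc_to_data_ratio(o))
    print(output_width([0.4, 0.5, 0.6]))
```

With these assumptions an output of 0.5 corresponds to a ratio of 1 (well simulated), smaller outputs to ratios below 1 (underestimated in the simulation) and larger outputs to ratios above 1 (overestimated in the simulation).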

[Figure: six panels, one per trigger region, showing the NeuroBayes output (x-axis) against the no. of tracks / 0.01 (y-axis) for target 0 and target 1. CMS private work 2010, √s = 7 TeV.]
Figure B.1: Result of the comparison of data and MC for track objects
[Figure: six panels, one per trigger region, showing the NeuroBayes output (x-axis) against the no. of vertices / 0.01 (y-axis); legend: data CMSSW38X, pythia6 MC b-jet. CMS private work 2010, √s = 7 TeV.]
Figure B.2: Result of the comparison of data and MC for vertex objects

[Figure: six panels, one per trigger region, showing the NeuroBayes output (x-axis) against the no. of electron candidates / 0.01 (y-axis); legend: data CMSSW38X, pythia6 MC b-jet. CMS private work 2010, √s = 7 TeV.]
Figure B.3: Result of the comparison of data and MC for electron candidate objects
[Figure: six panels, one per trigger region, showing the NeuroBayes output (x-axis) against the no. of muons / 0.01 (y-axis); legend: data CMSSW38X, pythia6 MC b-jet. CMS private work 2010, √s = 7 TeV.]
Figure B.4: Result of the comparison of data and MC for muon objects

[Figure: six panels, one per trigger region, showing the NeuroBayes output (x-axis) against the no. of jets / 0.01 (y-axis); legend: data CMSSW38X, pythia6 MC b-jet. CMS private work 2010, √s = 7 TeV.]
Figure B.5: Result of the comparison of data and MC for jet objects

Appendix C

Dependency check

In section 5.3.4 the differences between the data and the MC tagger were discussed. To visualize
the dependency between the two approaches, the following equation was deduced:

\[
o_{t2} = \frac{o_{t1}\, f_1/f_2}{1 - P(S) + o_{t1}\left(P(S)\,(f_1 + 1) + f_1/f_2 - 1\right)}
\]
To check the correct implementation, the framework was tested. In the first test, two NeuroBayes
calibrations were trained on MC with different target 0 samples: once with background simulation
and once with complete data simulation. Here the expected P(S) = 0.036. The result is plotted in a
scatter plot (figure C.1).
For the second test, two NeuroBayes calibrations were trained on the same target but with different
sample fractions. The two outputs are again plotted against each other in a scatter plot. Here the
expected P(S) = 0. Figure C.1 confirms that the determined dependency is correct.
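
As an illustration, the dependency equation above can be evaluated numerically to obtain the expectation curve that the scatter plots are compared with. The following is a minimal sketch, not the code used in the analysis. The function name expected_ot2 is hypothetical, f1 and f2 are taken as defined in section 5.3.4, and the values used in the example call are chosen only to show the limiting case.

```python
def expected_ot2(o_t1, p_s, f1, f2):
    """Expected output o_t2 of the second calibration for a given output o_t1.

    Implements
        o_t2 = o_t1 * (f1/f2) / (1 - P(S) + o_t1 * (P(S) * (f1 + 1) + f1/f2 - 1))
    with p_s = P(S); f1 and f2 are the quantities defined in section 5.3.4.
    """
    ratio = f1 / f2
    denominator = 1.0 - p_s + o_t1 * (p_s * (f1 + 1.0) + ratio - 1.0)
    if denominator == 0.0:
        raise ZeroDivisionError("dependency equation is singular for these inputs")
    return o_t1 * ratio / denominator


if __name__ == "__main__":
    # Illustrative values only: with P(S) = 0 and f1 = f2 the curve is the diagonal.
    for o in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(o, expected_ot2(o, p_s=0.0, f1=1.0, f2=1.0))
```

For P(S) = 0 and f1 = f2 the expression reduces to o_t2 = o_t1, so the expectation coincides with the diagonal of the scatter plot.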
[Figure: two scatter plots of ot2 against ot1, left for tracks and right for electrons, each with the diagonal and the MC expectation drawn. CMS simulation, √s = 7 TeV.]
Figure C.1: Dependency check for two NeuroBayes calibrations trained on different scenarios. Left:
signal events were added to the target 0 sample. Right: only the fraction of the two samples
changes. The points follow the red line, which confirms the determined dependency equation.

Appendix D

Fit histograms of flavour content fitter

The following histograms show the fits of the flavour content fitter. Each plot corresponds to a
region in which the procedure was performed. The sequence starts with the low-pT bins in the barrel
region of the detector. After that, the results for the forward regions are shown.
[Figure: flavour content fits of the secondary vertex mass [GeV] (x-axis) against the number of jets / 0.31 GeV (y-axis), showing data together with the pythia6 QCD b-jet, c-jet and light jet contributions. Panels for |y| ≤ 1.5 in the bins 37 GeV < pT ≤ 43 GeV, 43 GeV < pT ≤ 49 GeV, 49 GeV < pT ≤ 56 GeV, 56 GeV < pT ≤ 64 GeV, 64 GeV < pT ≤ 74 GeV, 74 GeV < pT ≤ 84 GeV, 84 GeV < pT ≤ 97 GeV, 97 GeV < pT ≤ 114 GeV and 114 GeV < pT ≤ 133 GeV. CMS private work, 39 pb^-1, √s = 7 TeV.]

148APPENDIXD.FITHISTOGRAMSOFFLAVOURCONTENTFITTER
CMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeV
600datapythia6 QCD b-jet700datapythia6 QCD b-jetdatapythia6 QCD b-jet
pythia6 QCD c-jetpythia6 QCD light jet600pythia6 QCD c-jetpythia6 QCD light jet300pythia6 QCD c-jetpythia6 QCD light jet
500250133 GeV < pT≤|y|≤ 153 GeV 1.5500153 GeV < pT≤|y|≤ 174 GeV 1.5174 GeV < pT≤|y|≤ 196 GeV 1.5
400number of jets / 0.31 GeVnumber of jets / 0.31 GeVnumber of jets / 0.31 GeV20040030015030020010020010050100000152345101520678925301000015234510152067892530100001523451015206789253010
secondary vertex mass [GeV]secondary vertex mass [GeV]secondary vertex mass [GeV]CMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeV
300250400datapythia6 QCD b-jetdatapythia6 QCD b-jetdatapythia6 QCD b-jet
350pythia6 QCD c-jetpythia6 QCD light jet200pythia6 QCD c-jetpythia6 QCD light jet250pythia6 QCD c-jetpythia6 QCD light jet
300196 GeV < pT≤|y|≤ 220 GeV 1.5220 GeV < pT≤|y|≤ 245 GeV 1.5200245 GeV < pT≤|y|≤ 272 GeV 1.5
number of jets / 0.31 GeVnumber of jets / 0.31 GeVnumber of jets / 0.31 GeV150250150200100100150100505050000152345101520678925301000015234510152067892530100001523451015206789253010
secondary vertex mass [GeV]secondary vertex mass [GeV]secondary vertex mass [GeV]CMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeV
120160datapythia6 QCD b-jetdatapythia6 QCD b-jet60datapythia6 QCD b-jet
140pythia6 QCD c-jetpythia6 QCD light jet100pythia6 QCD c-jetpythia6 QCD light jetpythia6 QCD c-jetpythia6 QCD light jet
50120272 GeV < pT≤|y|≤ 300 GeV 1.580300 GeV < pT≤|y|≤ 330 GeV 1.5330 GeV < pT≤|y|≤ 362 GeV 1.5
number of jets / 0.31 GeVnumber of jets / 0.31 GeVnumber of jets / 0.31 GeV10040608030604020402010200001523451015206789secondary vertex mass [GeV]2530100001523451015206789secondary vertex mass [GeV]2530100001523451015206789secondary vertex mass [GeV]253010
CMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeV
100140datapythia6 QCD b-jetdatapythia6 QCD b-jet80datapythia6 QCD b-jet
pythia6 QCD c-jetpythia6 QCD light jet120pythia6 QCD c-jetpythia6 QCD light jet70pythia6 QCD c-jetpythia6 QCD light jet
80362 GeV < pT≤|y| 1000 GeV≤ 1.510037 GeV < pT1.5 < |y| ≤≤ 43 GeV 2.56043 GeV < pT1.5 < |y| ≤≤ 49 GeV 2.5
number of jets / 0.31 GeVnumber of jets / 0.31 GeVnumber of jets / 0.31 GeV506080406040304020202010000001523451015206789secondary vertex mass [GeV]253010001523451015206789secondary vertex mass [GeV]253010001523451015206789secondary vertex mass [GeV]253010
CMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeV
35datadatadata70pythia6 QCD c-jetpythia6 QCD b-jetpythia6 QCD c-jetpythia6 QCD b-jet25pythia6 QCD c-jetpythia6 QCD b-jet
30pythia6 QCD light jetpythia6 QCD light jetpythia6 QCD light jet605049 GeV < pT1.5 < |y| ≤≤ 56 GeV 2.52556 GeV < pT1.5 < |y| ≤≤ 64 GeV 2.52064 GeV < pT1.5 < |y| ≤≤ 74 GeV 2.5
number of jets / 0.31 GeVnumber of jets / 0.31 GeVnumber of jets / 0.31 GeV20154015301010205510000152345101520678925301000015234510152067892530100001523451015206789253010
secondary vertex mass [GeV]secondary vertex mass [GeV]secondary vertex mass [GeV]

149CMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeV
70datadatadata16pythia6 QCD c-jetpythia6 QCD b-jet100pythia6 QCD b-jetpythia6 QCD c-jet60pythia6 QCD c-jetpythia6 QCD b-jet
14pythia6 QCD light jetpythia6 QCD light jetpythia6 QCD light jet1274 GeV < pT≤1.5 < |y| 84 GeV≤ 2.58084 GeV < pT1.5 < |y| ≤≤ 97 GeV 2.55097 GeV < pT≤1.5 < |y| ≤ 114 GeV 2.5
number of jets / 0.31 GeVnumber of jets / 0.31 GeVnumber of jets / 0.31 GeV40106083040620420102000152345101520678925301000015234510152067892530100001523451015206789253010
secondary vertex mass [GeV]secondary vertex mass [GeV]secondary vertex mass [GeV]CMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeV
140250datapythia6 QCD b-jet120datapythia6 QCD b-jetdatapythia6 QCD b-jet
pythia6 QCD c-jetpythia6 QCD light jetpythia6 QCD c-jetpythia6 QCD light jet120pythia6 QCD c-jetpythia6 QCD light jet
100200114 GeV < pT≤ 133 GeV133 GeV < pT≤ 153 GeV100153 GeV < pT≤ 174 GeV
1.5 < |y| ≤ 2.5801.5 < |y| ≤ 2.51.5 < |y| ≤ 2.5
number of jets / 0.31 GeVnumber of jets / 0.31 GeVnumber of jets / 0.31 GeV15080606010040405020200001523451015206789secondary vertex mass [GeV]2530100001523451015206789secondary vertex mass [GeV]2530100001523451015206789secondary vertex mass [GeV]253010
CMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeV
8090datadatadata80pythia6 QCD c-jetpythia6 QCD b-jet70pythia6 QCD c-jetpythia6 QCD b-jet40pythia6 QCD c-jetpythia6 QCD b-jet
pythia6 QCD light jet35pythia6 QCD light jetpythia6 QCD light jet70174 GeV < pT≤1.5 < |y| ≤ 196 GeV 2.560196 GeV < pT≤1.5 < |y| ≤ 220 GeV 2.530220 GeV < pT≤1.5 < |y| ≤ 245 GeV 2.5
60number of jets / 0.31 GeV20number of jets / 0.31 GeV20number of jets / 0.31 GeV10
50255040204030153051010000152345101520678925301000015234510152067892530100001523451015206789253010
secondary vertex mass [GeV]secondary vertex mass [GeV]secondary vertex mass [GeV]CMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeV
1860datapythia6 QCD b-jet25datapythia6 QCD b-jetdatapythia6 QCD b-jet
16pythia6 QCD c-jetpythia6 QCD light jetpythia6 QCD c-jetpythia6 QCD light jetpythia6 QCD c-jetpythia6 QCD light jet
1450245 GeV < pT≤1.5 < |y| ≤ 272 GeV 2.520272 GeV < pT≤1.5 < |y| 300 GeV≤ 2.512300 GeV < pT≤1.5 < |y| ≤ 330 GeV 2.5
number of jets / 0.31 GeVnumber of jets / 0.31 GeVnumber of jets / 0.31 GeV4010153081062045102000152345101520678925301000015234510152067892530100001523451015206789253010
secondary vertex mass [GeV]secondary vertex mass [GeV]secondary vertex mass [GeV]CMS private work, 39 pb-1s = 7 TeVCMS private work, 39 pb-1s = 7 TeV
9datapythia6 QCD b-jet12datapythia6 QCD b-jet
pythia6 QCD c-jetpythia6 QCD c-jet8pythia6 QCD light jetpythia6 QCD light jet107330 GeV < pT≤ 362 GeV362 GeV < pT≤ 1000 GeV
61.5 < |y| ≤ 2.581.5 < |y| ≤ 2.5
number of jets / 0.31 GeVnumber of jets / 0.31 GeV564432210001523451015206789secondary vertex mass [GeV]2530100001523451015206789secondary vertex mass [GeV]253010


Appendix E

Fit histograms of NB flavour content fitter

The following histograms show some of the fits of the flavour content fitter. Each plot corresponds to a region in which the procedure was performed. The bins are chosen such that the effect of the insufficient distributions is visible across the whole pT/y phase space.
[Figures: NeuroBayes flavour content fits, two panels per region. CMS private work, 36 pb-1, √s = 7 TeV. Legend: data, pythia6 QCD b-jet, pythia6 QCD c-jet, pythia6 QCD light jet. x-axis: NeuroBayes b-jet tagger; y-axes: number of jets / 0.01 GeV (logarithmic scale) and number of jets/fit per 0.01 GeV.
Regions shown: 56 GeV < pT ≤ 64 GeV, 0 < |y| ≤ 0.5; 84 GeV < pT ≤ 97 GeV, 1 < |y| ≤ 1.5; 133 GeV < pT ≤ 153 GeV, 0.5 < |y| ≤ 1; 196 GeV < pT ≤ 229 GeV, 1.5 < |y| ≤ 2; 272 GeV < pT ≤ 300 GeV, 1 < |y| ≤ 1.5; 362 GeV < pT ≤ 1000 GeV, 2 < |y| ≤ 2.5.]


List of Figures

2.1 Eightfold way
2.2 electron scattering
2.3 normalized Drell-Yan spectrum
2.4 heavy quark FCR production mechanism
2.5 heavy quark FEX production mechanism
2.6 heavy quark GSP production mechanism
2.7 heavy quark production mechanism pT spectrum
2.8 UA1 b cross section measurement
2.9 Tevatron b cross section measurements
2.10 ZEUS b cross section measurements
2.11 CDF Run 2 b cross section measurement
2.12 CDF Run 2 bb̄ cross section measurement
3.1 LHC geographical view
3.2 3D model of the CMS detector
3.3 3D model of the CMS detector
4.1 CMS luminosity
4.2 primary vertex reconstruction
4.3 anti-kt jet algorithm
4.4 jet energy corrections
4.5 track counting b-jet tagger
4.6 jet probability b-jet tagger
4.7 soft muon b-jet tagger
4.8 simple secondary vertex b-jet tagger
4.9 performance of the b-jet tagger
4.10 jet fraction 36X
4.11 CMSSW 36X MC amount of statistics
4.12 jet fraction 38X
4.13 CMSSW 36X MC amount of statistics
4.14 trigger turn on
4.15 data statistics
4.16 comparison of the jet momentum spectrum with MC
5.1 Probability integral transformation
5.2 orthogonal polynomial fit
5.3 Probability integral transformation - target distributions
5.4 orthogonal polynomial fit


5.5 Standardization of input variable
5.6 Matrix of correlation coefficients
5.7 Diagonalization
5.8 Artificial neural network
5.9 purity interpretation of NeuroBayes output
5.10 large weight effects
5.11 architecture of the NeuroBayes b-jet tagger
5.12 internal boost shape differences
5.13 spectrum transformation fit
5.14 signed track impact parameter significance
5.15 NeuroBayes output track/vertex training NBMC
5.16 number of tracks connected to the secondary vertex
5.17 transverse momentum relative to the jet axis of the electron
5.18 NeuroBayes output lepton training NBMC
5.19 NeuroBayes output jet training NBMC
5.20 track probability for the boost training
5.21 ot after boost weighing
5.22 NeuroBayes output jet boost training NBMC
5.23 NeuroBayes output jet boost training NBMC
5.24 performance of the NBMC b-jet tagger
5.25 exemplary differences found by comparison
5.26 track data training output
5.27 data training track comparison
5.28 pT,rel discrepancy
5.29 data training track comparison
5.30 data training vertex comparison
5.31 vertex track number variation
5.32 data training lepton comparison
5.33 data training jet comparison
5.34 performance of the NBD b-jet tagger
6.1 b-tagging efficiency
6.2 secondary vertex mass fit
6.3 b-tagging efficiency
6.4 b-tagged sample purity
6.5 b-tagging efficiency variation
6.6 Leading sources of systematics uncertainty
6.7 Measured b-jet cross section
6.8 Measured b-jet cross section ratio
6.9 SSVP distribution
6.10 fcf: b-jet fraction for tagged jets in pT bins
6.11 fcf: primary vertex dependencies
6.12 fcf: b-jet fraction for tagged jets in pT/y bins
6.13 Template variation due to statistical fluctuations
6.14 fcf: systematics studies
6.15 SSVH efficiencies
6.16 fcf: normalized Nb

6.17 updated b-jet cross section
6.18 NB expert template fit
6.19 NeuroBayes flavour content fit result
B.1 data MC comparison: track
B.2 data MC comparison: vertex
B.3 data MC comparison: electron candidate
B.4 data MC comparison: muon
B.5 data MC comparison: jet
C.1 dependency check


Danksagung

Zuletztm¨ochteichnochalldenendanken,ohnediedieAnfertigungdieserArbeitnichtm¨oglich
are.w¨esengewDazumussmanwissen,dassvordreiJahrendieArbeitsgruppevonMichaelFeindtausschließlich
Datenanalysierte,dieamCDF-ExperimentdesTevatronsgesammeltwurden.Schwerpunktdieser
StudienwarenundsinddieSpektroskopie-Messungenvonb-Hadronenundsogenannteb-flavour-
Physik.IchselbsthabedortauchschonimRahmenmeinerDiplomarbeitorbitalangeregte
Zust¨andedesBdMesonsentdeckt.DerBeginnmeinerDoktorarbeitwargenauinderZeit,als
dieletztenVorbereitungenf¨urdieInbetriebnahmedesLHCdurchgef¨uhrtwurden.Nochinnerhalb
einesJahressolltediesesf¨urmichsehrfaszinierendeProjektstarten.
MichaelFeindtundG¨unterQuasterm¨oglichtenesmir,inderCMSKollaborationmitzuwirken.Es
wurdeeineneueArbeitsgruppeamInstitutf¨urexperimentelleKernphysik(ekp)unterderRegie
vonMichaelFeindtgegr¨undet,derenAufgabeesseinsollte,denhiervorgestelltenNeuroBayes
b-jettaggerzuentwickelnundwennm¨oglichzuetablieren.DieGruppeumfasstenebenmirnoch
hei.MartscDanielF¨urdieAnfangszeitsindvorallemArminScheurerundChristopheSaoutzuerw¨ahnen,dieuns
immerhilfsbereitzurSeitestandenundeserm¨oglichten,unsinderkomplexenWeltderCMS
Softwarezurechtzufinden.
FinanziertwurdeichzudieserZeitdurcheinStipendiumdesGraduiertenkollegs(GK)f¨ur
Teilchen-undAstroteilchenphysikderFakult¨atf¨urPhysikderUniversit¨atKarlsruhe.DasGK
unterst¨utzteseineMitgliederebensodurchdie¨UbernahmederKostenvonDienstreisenundstellt
eineVielzahlvonM¨oglichkeitenzurWeiterbildungbereit.Dadurchistesm¨oglich,effizientwis-
senschaftlichzuarbeiten.Diesem,denProfessorenundStudentenvertretern,derenEinsatzdas
GK¨uberhauptm¨oglichmachte,dankeichhiermit.InderHauptzeitwurdeichdannvomLand
Baden-W¨urttembergfinanziert.DiesemunddemBundesministeriumf¨urBildungundForschung
dankeichebenso.
After the familiarisation phase, we managed fairly quickly to present a new b-jet tagger for
CMS. The help of Christophe Saout deserves another mention here: his expertise got us over many
hurdles. This is all the more remarkable because, after Frank-Peter Schilling had ended his
b-tagging activities, he additionally had to represent the political interests of the ekp in
the CMS b-tagging group.
Filling the now vacant postdoc position for the ekp b-tagging group then turned out to be more
difficult than expected, which created a difficult situation from an organisational point of
view. I therefore thank Thomas Kuhr and Jeannine Wagner-Kuhr, who were available as contact
persons during this time and supported us in word and deed.
At the beginning of 2010 the position was filled by Jyothsna Komaragiri. This again gave us an
important representative in the CMS b-tagging group. It was not least through her commitment
that I was able to take part in the differential b-jet cross section measurement. I thank her
for that.
For the b-jet measurement, a working group of experts from jet physics and b physics was
formed. The analysis was initiated by the b-tagging conveners Wolfgang Adam and Andrea Rizzi.
I also thank Mikko Voutilainen for leading and organising this group. I likewise thank the
other members, above all Philipp Schieferdecker, Daniel Martschei, Hauke Held, Jyothsna
Komaragiri and Niki Saoulidou.
My doctoral work thus consisted of two major topics, b-tagging and the b-jet cross section.
For both analyses I used NeuroBayes, which is distributed by the company Phi-T. I hereby thank
the developers of the software, above all Martin Hahn, but also Daniel Martschei, who
repeatedly helped me with questions and problems concerning NeuroBayes.
This work could also not have been completed without the extensive help of my colleagues at
the ekp. Many conversations and discussions were needed. For that I would like to warmly thank
everyone at the ekp.
In particular, I would like to thank those who proofread the thesis: Jyothsna Komaragiri, Iris
Gebauer, Daniel Martschei, Sebastian Neubauer, Michael Feindt, Thomas Kuhr and Anže Zupanc.
I thank Prof. Dr. Thomas Müller for acting as co-referee and Prof. Dr. Ulrich Nierste for
acting as mentor.
I thank my doctoral supervisor, Prof. Dr. Michael Feindt, for allowing me to write this thesis
with him. I am grateful for the constant trust he placed in me and my work, and for the freedom
I was given to shape it into what it has finally become.
I thank Daniel Martschei for venturing into the CMS experiment together with me. I thank him
for the past years, in which, despite many ups and downs, we never lost our enjoyment of the
work and always supported each other.
To prepare for the examination in particle physics, I sat down with Susanne Mertens, Sebastian
Neubauer and Felix Wick and discussed the major topics and results of particle physics. I would
like to thank all three. I also thank the professors who conducted my examination. Besides the
referees, these are Prof. Dr. Wim de Boer, Prof. Dr. Johann Kühn and Prof. Dr. Gerd Schön.
I likewise thank my family for their support: my parents Marlies and Norbert Honc and my
parents-in-law Dina and Sepp Weintraut.
Most of all I thank my wife Lisa, whom I love infinitely, and without whose advice,
encouragement, humour and wise criticism I would not have got through the times of
ever-recurring frustration.
Thank you all once again for the success of this doctoral thesis.