Computational steering of CFD simulations on teraflop-supercomputers [Elektronische Ressource] / Petra Wenisch

Published 01 January 2008
Lehrstuhl für Bauinformatik
Fakultät für Bauingenieur- und Vermessungswesen
Technische Universität München

Computational Steering of CFD Simulations on Teraflop-Supercomputers

Petra Wenisch

Complete copy of the dissertation approved by the Fakultät für Bauingenieur- und Vermessungswesen of the Technische Universität München for the award of the academic degree of Doktor-Ingenieur.

Chair: Univ.-Prof. Dr.-Ing. habil. Michael Manhart
Examiners of the dissertation:
1. Univ.-Prof. Dr. rer. nat. Ernst Rank
2. Univ.-Prof. Dr. rer. nat. Ulrich Rüde, Friedrich-Alexander Universität Erlangen-Nürnberg

The dissertation was submitted to the Technische Universität München on 7 November 2007 and accepted by the Fakultät für Bauingenieur- und Vermessungswesen on 7 February 2008.

for Heike and Hella

Abstract

Computational methods have by now become established techniques of everyday workflow in civil engineering, especially in the field of structural mechanics. Another field of application with increasing importance is computational fluid dynamics (CFD), where it is most prominently used for simulations in the context of hydraulics or for investigations with respect to fluid-structure interactions such as wind loads on bridges and skyscrapers.
In building construction practice the use of simulations for the analysis of indoor air-flow is much less common, because especially during the planning phase of unique constructions like non-standard buildings, the limited time available to the engineer often does not allow detailed simulation series. Hence, these obviously helpful early simulation studies cannot be carried out at all and rules of thumb are still used instead. Should it turn out during the building phase that a redesign is inevitable, the costs are considerably higher than those for a redesign during the planning phase. Taken together, the need for simulations and the lack of time during the design phase led to a desire for an alternative tool giving the engineer the possibility of performing case studies of the qualitative flow behavior through short-cycle simulation runs. By providing means of monitoring and interactively steering the running simulation, the engineer can watch the developing flow, adjust parameters according to his needs, and receive instant feedback to his interactions. This kind of steerable simulation and on-the-fly visualization of simulation results represent the defining features of computational steering. With the general availability of affordable high-performance computing systems and perpetually increasing CPU power, computational steering applications have become possible even in fields as computation- and resource-intensive as computational fluid dynamics.
This thesis focuses on the design of a computational steering framework utilizing supercomputers for the numerical fluid dynamics simulations and high-end Virtual-Reality visualization workstations for online monitoring and steering of the simulation. The computational steering tool iFluids, which has been developed based on this framework, opens a whole range of new applications in different fields, not particularly restricted to civil engineering. iFluids represents an interactive tool as a means of exploratory short-term investigation for checking ventilation designs in their basic functionality and qualitative behavior, as such meeting at least part of the engineer's requirements during the design phase. In particular, the engineer can explore and watch a flow configuration develop during the running simulation and can interactively adjust global flow parameters, define or modify boundary conditions and even change the geometrical setup without the need to re-preprocess and rerun as with most other simulations. The adaptation of the flow as well as transient effects until a steady state is reached can be watched on a steering and visualization terminal on the fly. In this way, it is possible to perform virtual experiments to quickly gain an intuitive understanding of a given ventilation problem.
iFluids is based on the Lattice-Boltzmann method, which has only been applied for some twenty years in fluid dynamics research. It offers several significant advantages over other classical methods of tackling fluid dynamics problems. The present thesis describes the specializations and optimizations of the implementation of this method which provide the key to enabling the user to easily change boundary conditions, flow parameters, and the geometric layout of the simulated scene during the on-going computation.
However, the simulations often need to be run with a coarser discretization on current mid-level hardware such as workgroup clusters and, accordingly, are not always able to resolve all details of the fluid behavior. Additionally, the computational model must still be kept simple with respect to boundary conditions or turbulence models. Nevertheless, it enables small case-study simulations for feasibility analyses and for testing the behavior in question. The requirements arising when developing a full-grown computational steering application, and how these can be addressed, represent an important aspect on which the text specifically concentrates. In particular, details are given on how computational steering software can be developed on today's high-performance computers and high-end visualization facilities. Also, the currently achievable performance is benchmarked and limitations with regard to the current hardware technology are pointed out.

To demonstrate the applicability of the computational steering framework to the study of indoor ventilation systems, iFluids is used for analyzing a real-case operating room as an example. Results of the interactive simulation tool are compared with more detailed simulations to show what kind of statements can be made through using this tool. Finally, the applicability of the computational steering framework in other fields will be touched upon.

Summary (Zusammenfassung)

Computer-aided simulations have by now become established as standard tools of everyday work in civil engineering, particularly in structural mechanics. A further field of application in which numerical simulation is gaining importance in civil engineering is fluid dynamics. Here, however, the focus lies more on questions of hydraulics and hydrology, or on fluid-structure interactions such as those occurring in the flow around exposed structures like skyscrapers or bridges.
In structural engineering practice, the simulation of indoor air flows is still rather uncommon, since the relatively short design phase of larger structures usually does not leave enough time for detailed simulation series. Such computations are therefore carried out only very rarely, and planning mostly relies on empirical values alone. If, however, it turns out during construction that a revision of the concept is unavoidable, the associated costs are considerably higher than they would have been during the design phase.
The fundamental need for simulations, which persists despite the tight time frame, creates a desire among engineers for a simulation tool different from those available so far: such a tool must enable an engineer to carry out a series of case studies within a short time in order to determine the qualitative flow behavior. Through simultaneous visualization of current flow data and interactive control during the simulation, the user should be able to observe the temporal development of a flow and to intervene in the running simulation by adjusting parameters. The effects of these changes must then become immediately visible in the visualization running in parallel to the computation. This fusion of steerable simulation and simultaneous visualization is called computational steering. Owing to the computing power available today on high-performance computers and supercomputers, which continues to grow rapidly, it has by now become possible to realize computational steering even in fields as computationally intensive as fluid dynamics.

This thesis describes the general structure of a computational steering application and shows how such an application can be realized by using supercomputers for the numerical fluid simulation and Virtual-Reality visualization facilities for displaying the results and steering the computation. Using the example application iFluids, an interactive tool for investigating indoor air flow is explained that makes it possible, within a short time, to make qualitative statements about a ventilation system and to check its basic function. iFluids can thus serve the engineer as an aid during the planning and design phase that covers the essential part of the requirements mentioned above. During the running computation, one can observe the flow development and interactively change global flow parameters, boundary conditions or even the geometric setup of the problem. In contrast to most simulation tools available today, such interactions no longer require a restart of the computation. The adaptation of the flow as well as transient effects up to a possible equilibrium state can be viewed immediately on the steering and visualization facility. In this way, virtual experiments can be carried out with little time expenditure, yielding an intuitive impression of the flow behavior for a given ventilation problem.

iFluids computes the flow simulation by means of the Lattice-Boltzmann method, which is still a relatively young technique in this field. Compared to other numerical methods, its use offers several advantages for the computational steering application. This thesis therefore addresses the specialization and optimization of this method that make the interactive modification of boundary conditions, flow parameters and the geometric setup possible during the running simulation.
Particularly on less powerful computers such as workgroup clusters, the geometric objects usually have to remain less finely discretized in order to accelerate the simulations. As a result, frequently not all flow phenomena can be fully resolved. Simplified models often have to be used for the boundary conditions and for turbulence modeling as well. Nevertheless, this kind of simulation allows case studies that check the basic usability and functioning of planned ventilation systems. In addition, the thesis addresses the requirements that become apparent during the development of the computational steering application and shows how they can be met. In particular, it describes how computational steering software can be developed on today's supercomputers in interplay with modern visualization facilities. Finally, the computational performance currently achievable with iFluids is examined and the resulting limitations with regard to today's hardware are worked out.
To assess the applicability of this computational steering solution to indoor flow simulation, the example of a real operating room is simulated with iFluids. The results of the interactive simulation are compared with those of a conventional simulation in order to highlight what kind of statements can be made using computational steering. To demonstrate the flexibility of the resulting software, it is finally shown that iFluids is not limited to civil engineering problems but can cover a broad spectrum of very different applications.

Contents

1 Computational Steering
  1.1 Introduction
  1.2 Related Work
  1.3 Fields of Application
  1.4 Computational Steering in Civil Engineering

2 iFluids - Interactive Fluid Simulations
  2.1 Architectural Software Concept
  2.2 Interactive Numerical Kernel
  2.3 The Visualization and Steering Front-End
  2.4 Requirements of Interactive Simulation Software

3 Computational Fluid Dynamics Using the Lattice-Boltzmann Method
  3.1 Computational Fluid Dynamics
  3.2 Lattice-Boltzmann Method
  3.3 Implementation of the LBM Solver

4 Fluid Simulation on Supercomputers
  4.1 High-Performance Computing
  4.2 Hitachi SR8000-F1 System Architecture
  4.3 Parallelization of the Lattice-Boltzmann Solver
  4.4 Optimization of the Simulation Kernel
  4.5 Porting and Optimizing the Solver for SGI Altix Systems

5 Interactive Data Exploration
  5.1 Scientific Visualization
  5.2 Visualization within iFluids

6 Interactive Problem Definition and Grid Generation
  6.1 Steering of Global Simulation Parameters
  6.2 Interacting with the Geometric Model
  6.3 3D User Interface
  6.4 Grid Generation

7 Realization Aspects with Respect to Computational Steering
  7.1 Communication Layout
  7.2 Framework Performance
  7.3 Visualization and Steering on Multiple Clients

8 General Applicability - A Case Study
  8.1 Ventilation Systems of Operating Rooms
  8.2 Simulation Studies with Varying Grid Resolutions

9 Universal Applicability of iFluids - Computational Steering Framework
  9.1 Vascular reconstruction
  9.2 Extensions of iFluids for Blood Flow Simulations

10 Summary

Chapter 1

Computational Steering

This chapter shall serve as a short introduction to the topic of computational steering. First, the term computational steering is defined and an overview of the current state of the art is presented by providing sketches of a selection of related work. Then, possible fields of application are given by way of an example followed by a detailed discussion on the application range in civil engineering.

1.1 Introduction

Traditionally, computational engineering studies in high-performance computing are carried out in a sequence of steps. At the beginning one has to supply the geometric modeling (i.e. CAD, mesh or grid generation) and the definition of simulation specific start and boundary conditions, which has become practical on standard desktop machines. This is referred to as the pre-processing step which is followed by the actual simulation and a concluding post-processing step to extract and evaluate particular results (see Figure 1.1).

Figure 1.1: Iteration of work steps for optimization problems in engineering studies: In the traditional approach, the steps of pre-processing, simulation and evaluation of the simulation results are traversed sequentially and iterated in a drawn-out loop. Assuming the best-case scenario, potential waiting times for required resources have been ignored in this scheme. Furthermore it is assumed that a modification of the simulated model does not require the whole pre-processing procedure and is therefore cut short in each optimization cycle.

Regarding the computational effort, engineering problems still belong to the most challenging numerical simulations with substantial resource requirements that arise as more and more physical details are taken into consideration. Accordingly, it is often advantageous to perform the computation on a supercomputer or cluster. On this type of machine a simulation most commonly has to be submitted to a queuing system as a batch job which has to wait until the requested resources become available before it can be executed. At the end of the simulation, the results are usually stored on a file system and, occasionally, have to be transferred to an adequate post-processing front-end, which may be located on a different, more suitable system. There, the results obtained are finally evaluated in the post-processing step by means of appropriate analysing and visualization techniques. When studying several test cases of a simulation scenario this often tedious and lengthy chain of processes is rather inconvenient for the engineer without the possibility of immediate interaction with his experiments.
According to Johnson et al. (1999), the first published statements indicating the desire for computational steering, which integrates these single steps of a pipeline into one single process cycle, appeared in the late 1980s. McCormick et al. (1987) reflected that scientists want to be able to interact with simulations close to real-time. Correspondingly, Johnson and Parker (1994) have defined Computational Steering as the capacity to control all aspects of the computational science and engineering pipeline. This comprises the steps of pre-processing, computation, and post-processing as mentioned above (cf. Figure 1.2).

Figure 1.2: Closing the loop: Compared with the traditional form of sequential steps, as shown in Figure 1.1, the optimization of the three iteration steps is speeded up considerably by using a computational steering application. For one thing, preprocessing, computation and postprocessing can be done in parallel. For another, the visualization and computation times are reduced since these processes usually benefit from previous results and do not necessarily need to be restarted from the very beginning.

Since the articulation of these early visions, the power of computers has increased severalfold and was paralleled by the development of interactive simulation applications. However, most of these focused on the post-processing and visualization step and did not allow direct interaction with the computation (van Liere et al., 1996). As a matter of fact, computational steering, even today, is often misinterpreted as online monitoring with only basic functionalities like stopping, pausing and resuming a simulation. Despite offering the first step towards real computational steering, these basic features clearly fall short of representing a fully developed interaction with the simulation during runtime. Computational steering in its explicit meaning, however, enables the scientists to directly change some or all of the parameters of the simulation process during its execution and provides a front-end to analyze the effects of these interactions immediately (Mulder et al., 1999).

1.2 Related Work

The following section outlines some examples representing the state-of-the-art in the field of computational steering. These projects can be classified into libraries, problem solving environments and application frameworks.

Libraries

The gViz library (Brodlie et al., 2004) allows users to visualize data in a post-processing step or on-the-fly during a simulation. Both scenarios support the use of grid computing and collaborative working facilities. In addition to the visualization capabilities parameters can be sent to the underlying simulation in case the changing of parameters during runtime is supported by the simulation kernel.

In the RealityGrid (Brooke et al., 2003; Pickles et al., 2004) project an application is typically structured into the client, the simulation and the visualization module communicating by means of a steering library. The library is designed to simplify the changes required to make an application computationally steerable. A key requirement for extending the existing code in this way is the insertion of check- and break-points where modified parameters are fetched and the simulation has to be restarted, respectively. With RealityGrid simulations are either monitored in a read-only mode or steered by modifying parameters or previously registered data sets. The user may pause, resume, detach and stop the simulation or run it from a checkpoint. RealityGrid separates visualization and steering, which makes it possible to investigate current results with arbitrary hardware such as high-end visualization workstations, laptops or PDAs.
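
The idea of check-points at which a running simulation fetches modified parameters can be illustrated with a minimal sketch. It does not use the RealityGrid API; the function `run_steerable`, the parameter names `rate` and `target`, and the queue-based hand-over are all assumptions made purely for illustration, and the "solver step" is a toy relaxation standing in for a real numerical kernel.

```python
import queue


def run_steerable(steps, params, updates, checkpoint_every=10):
    """Toy time loop with steering check-points (hypothetical, illustrative only)."""
    history = []
    value = 0.0
    for step in range(steps):
        # Check-point: fetch every parameter change the steering client has queued.
        if step % checkpoint_every == 0:
            while True:
                try:
                    name, new_value = updates.get_nowait()
                except queue.Empty:
                    break
                params[name] = new_value
        # Stand-in for one solver iteration: relax towards the target value.
        value += params["rate"] * (params["target"] - value)
        history.append(value)
    return history


updates = queue.Queue()
params = {"rate": 0.5, "target": 1.0}
updates.put(("target", 2.0))  # steering command issued by a (hypothetical) client
history = run_steerable(20, params, updates)
```

In this sketch the loop is never restarted: the changed target is picked up at the next check-point and the computation simply continues from its current state, which is the behavior the steering libraries above aim to provide.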
Another library has been developed within the CAVEStudy project, cf. Renambot et al. (2001). In contrast to the libraries described above the original simulation code does not need to be adapted at all. Instead, information about a simulation's input and operations is placed in a file describing the way commands are issued. The CAVEStudy library enables visualization and remote steering within Virtual Reality (VR) environments like a cave.

Problem Solving Environments

As opposed to the computational steering libraries, SCIRun is a so-called problem solving environment (PSE) (Parker et al., 1999). A problem solving environment is a computational system that provides a complete and convenient set of high-level tools for solving problems from a specific domain (Abrams et al., 2007). In such PSEs applications are often composed through a visual programming interface similar to the widely-known AVS front-end (Advanced Visualization Systems Inc (AVS Inc., 2007)), for example. The user has to set up a network of modules and interacts via their corresponding graphical user interface. Several parameters can be changed during the simulation without the need to stop it. The affected module is re-executed and sends updated output to all connected modules, which react accordingly. Other changes with a deeper impact on the simulation require an automatic cancellation and restart of the simulation.
COVISE (Covise, 2007) is a collaborative visualization and simulation environment. COVISE rendering modules support a wide range of Virtual Reality environments for analyzing datasets intuitively. This distributed environment integrates simulation, post-processing and visualization functionalities, the different processing steps being represented by modules. In Wössner et al. (2005) these modules are extended to support the setup of an interactive CFD simulation. The main focus hereby is to investigate the feasibility of using a tangible interface as an intuitive input device. In a Virtual Reality environment obstacles attached with special markers are placed within the Virtual Reality setup. A set of cameras tracks the position of these markers to determine the position and orientation of the associated obstacles. If the representatives in the real world are moved, the scene has to be remeshed for the following simulation cycle. This has to be triggered by the user by pressing a button.

Application Frameworks

In Georgii and Westermann (2005) an approach to realize interactive simulation on a consumer PC is presented. Here, external forces can be applied to deformable bodies during runtime. The simulation is based on a multi-grid solver running on a PC's CPU while the render engine is run in parallel on the GPU¹ on its graphics card. Within a pre-processing step a certain scenario is set up once and, during runtime, the simulation engine consecutively displaces the underlying finite element mesh of the deformable body according to the user's interaction.

¹ The graphics processing unit (GPU) is the dedicated graphics rendering device on modern graphics boards which, in special applications, is also used for general purpose computations.

VFReal (Kühner, 2003) was a precursor application to this thesis approach making the first steps towards an interactive CFD simulation for indoor comfort studies. It is designed as a monolithic application running a Lattice-Boltzmann solver with on-the-fly visualization. The steering possibilities comprise the placement of several basic geometry primitives as fluid obstacles into the scene. However, the underlying simulation grid for these objects is required in a pre-generated form at simulation time.

The insights that have been made during the process of porting the VFReal application onto a supercomputer and connecting it to a Virtual Reality environment played an active part in developing the new approach presented in this thesis. In the present form the user can now interact with each of the three processing steps even when run distributed on a visualization and simulation machine. Boundary conditions can be set or adjusted interactively, as is common in offline preprocessing front-ends. In addition, the geometry of the simulated scene can be modified during runtime. In contrast to Wössner et al. (2005) arbitrary geometries can be inserted throughout the simulation run by loading from the file system without pre-meshing or any other kind of pre-processing. Regarding the simulation, the numerical model and method can be adjusted during execution time and, of course, interaction with the visualization of current results for post-processing issues is supported. The event-based framework works fully automatically, i.e. the interactions are incorporated without any extra actions like starting a remeshing process, restarting the simulation or updating visualization data. Further details of the application and its features are described in Chapter 2. In Borrmann (2007) this application framework was extended for a multiple client version to support collaborative engineering.
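
How a geometry change might be absorbed by a running grid-based simulation without remeshing or restarting can be sketched as follows. This is not the iFluids implementation: the cell flags `FLUID` and `SOLID`, the helper `insert_box`, and the event table are hypothetical names, and the assumption that obstacles are represented by flagging cells of a regular grid is made only for this illustration.

```python
FLUID, SOLID = 0, 1  # hypothetical cell flags on a regular grid


def make_grid(nx, ny):
    """Create an ny-by-nx grid in which every cell is fluid."""
    return [[FLUID] * nx for _ in range(ny)]


def insert_box(grid, x0, y0, x1, y1):
    """Flag cells in [x0, x1) x [y0, y1) as solid; return the number of changed cells."""
    changed = 0
    for y in range(max(y0, 0), min(y1, len(grid))):
        for x in range(max(x0, 0), min(x1, len(grid[0]))):
            if grid[y][x] != SOLID:
                grid[y][x] = SOLID
                changed += 1
    return changed


grid = make_grid(8, 4)
events = {3: (2, 1, 5, 3)}  # user interaction: a box obstacle arrives at step 3
solid_counts = []
for step in range(6):
    if step in events:
        insert_box(grid, *events[step])
    # A solver step would go here, treating SOLID cells as obstacles.
    solid_counts.append(sum(row.count(SOLID) for row in grid))
```

The time loop keeps running across the interaction; only the cell flags change, so the next solver step would already see the new obstacle.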

1.3 Fields of Application

Computational steering has a wide, and still increasing range of potential application fields. To show the high versatility of this method a few examples of different areas of interest benefiting from computational steering are presented below.

Non-Invasive Vascular Reconstruction

A relatively new field of application for computational steering can be found in medical engineering. The numerical simulation of blood flowing through blood vessels is a popular matter of interest in this respect. The non-invasive vascular reconstruction as discussed in Sloot et al. (2004) may serve as a typical example from this field.

Arteries and veins are increasingly affected by a growing number of vascular diseases. Two categories of vascular dysfunction have to be differentiated: aneurysms and stenoses. An aneurysmal disease refers to balloon-like swellings in the artery, whereas stenosis represents a narrowing or blockage of the artery. A vascular reconstruction intervention aims at treating these abnormal vessels through surgery. In the case of aneurysms this means adding shunts, bypasses, and placing stents or, for stenosis, applying thrombolysis techniques such as balloon angioplasty, bypasses, etc. It is obvious that finding the optimal treatment is far from trivial and a simulation tool to support the verification of a planned operation may prove to be a valuable supplement to classical approaches. A group of researchers under Prof. Sloot at the University of Amsterdam has developed a grid-based problem solving environment to test several treatments in this respect. However, they still have to process the whole sequential pipeline of pre-processing, computation and post-processing. Nevertheless, the application achieves almost real-time simulations.


Simulation of Manufacturing Systems

Recent developments in the simulation of manufacturing systems also offer possibilities for interactive simulation within a computational steering framework. For the optimization of manufacturing systems a traditional simulation cycle consists of preparing input variables, selecting simulation parameters and running the simulation, which is followed by reviewing the results after the computation. Because of the complex what-if-scenario analysis, several simulation cycles are needed before any initial results of sufficient interest or value are obtained. Therefore, Sudhir and Kesavadas (2000) propose to utilize computational steering concepts within an interactive virtual environment. In this way, on-the-fly visualization of results delivered by a manufacturing simulation could be used to allow for instant feedback from the system after modification within the virtual environment. A simple conceptual system has already been implemented to study this approach with promising initial results.

Helios - Computer Aided Lighting Technologies

To improve and speed up the developmental cycle, computer-aided technologies (CAx) are extensively used in automotive design and construction, and computational steering can often be helpful in various ways. One interesting application is, e.g., the design of automotive lighting in Computer Aided Lighting (CAL) systems, which can be integrated straightforwardly into the development process. CAL may be seen as a virtual lighting laboratory where the work is again divided into the three steps of pre-processing, simulation of the construction and post-processing. All of these issues are addressed by the CAL application Helios developed by Hella KG (Hella KG, 2007) for testing automotive lights in a simulated environment (Biermann and Kalze, 1996). The above-mentioned development cycle is repeated until the specific lighting requirements are met so that a prototype can be built. Integrating the virtual lighting laboratory into a computational steering framework could probably improve and speed up the computer-aided testing process and, moreover, would allow for an intuitive modification of lighting designs.

Interactive Physics-Based Simulations

A topic that is closely related to computational steering is the increasingly popular interactive physics-based simulation of real-time scenarios in computer games or other virtual worlds. This type of simulation does not quite satisfy the definition of computational steering, since in these worlds the goal is not the steering of the simulation itself. In fact, the actual intention is to simulate a real or fictitious world in a realistic way, i.e., to respect physical laws and the behavior of the scenario depending on all influences, also including the user in this world. The remarkably sophisticated genre of computer games is the most prominent representative of this category.


However, there are also academic examples such as the simulation of rivers, as described in Kipfer and Westermann (2006). Here, water can spring from several sources, flow over a height field to form rivers or to cluster into lakes, perpetually influenced by gravitation, wind and, of course, terrain obstacles. The simulation is based on smoothed-particle hydrodynamics with surface extraction running on the GPU.

Related examples can be found in Thürey et al. (2005) and Tölke (2006), where fluids with a free surface are animated based on the Lattice-Boltzmann method. Thürey et al. (2005) mainly present drops and fluid streams falling into a pool of fluid. The authors state that due to the use of the Lattice-Boltzmann method it is comparatively easy to set boundary conditions. Therefore, it is possible to interactively place drops of fluid above a bowl-shaped obstacle, into which these drops splash in complex shapes driven by gravity. Similarly, albeit not steerable, Tölke (2006) simulates the free surface flow over a barrage with its hydraulic jump and a surf wave.

Another spectacular example of a distributed and parallel graphics application utilizing Virtual Reality is the FlowVR application framework (Allard and Raffin, 2006). It is capable of simulating rigid bodies, mass-spring objects and an approximation of fluid behavior by an inviscid, multi-body simulation (Eulerian fluid) using a 32-processor cluster for computation and a 22-processor visualization cluster plus 5 FireWire cameras for tracking in VR. The position of the freely moving user is tracked to consider his influence due to collisions with the virtual bodies in the simulation.
1.4 Computational Steering in Civil Engineering

One aspect of this thesis is the evaluation of the applicability of computational steering for civil engineering problems. After presenting a series of general examples of computational steering, the following section concentrates on the field of civil engineering, which stands out from most other engineering disciplines because prototypes are not designed and built.

Industry sectors with large-scale production such as the automotive industry usually invest considerable amounts of time and money in the design phase of the prototype for a new product. In this planning and testing phase, extensive simulations are conducted, frequently leading to repeated re-designing of the concept. Nevertheless, prototypes are also built and tested. After a fairly long period, serial production commences and the costs of planning and design are recuperated by high sales figures.

In contrast, the specialty of civil engineering is the construction of unique copies. Therefore, the design phase has to be much shorter and less extensive to be profitable. As shown in Figure 1.3 the possible influence on the design of the project and its cost is significantly higher during the planning phase than during the construction phase. The development pattern of the project costs is the exact opposite. Therefore, a good and coherent concept for the planning phase is required, since later changes will cause difficulties and result in high additional costs (Seidenschwarz, 1997).

Figure 1.3: Influences on a building project during the planning and construction phases: This graph shows the high degree of influence on a planned construction project in the project preparation phase. The further the implementation of the project has proceeded, the less flexible is the design. Since changes at later phases result in higher additional costs, the importance of a well thought-out concept is evident (Seidenschwarz, 1997). (taken from Diederichs (1984))

Due to the lack of time for detailed simulations during the planning phase,
often only empirical rules are used for designing instead. Nevertheless, it is well
known that shortcomings still occur and that their belated elimination is difficult
and cost intensive. This leads to the desire for an interactive simulation tool
for preliminary investigations, which makes it possible to run short simulation
cycles to prove or find the basic concept, possibly followed by a few carefully
selected simulation setups for more detailed investigations. This is a classical
situation where computational steering can find its application.

Conceivable tasks in the field of civil engineering that might benefit from
interactive simulations are, for example, the analysis of the spread of pollution due
to natural winds, escape route simulation during various emergency scenarios,
traffic simulation, flood simulation, or the spread of fire and smoke in buildings.
The use case for this thesis is the simulation of fluid dynamics for indoor air
flow, e.g. for ventilation systems in rooms. In this instance, the points of interest
are whether the ventilation affects all parts of the room which need to be circulated,
and whether the flow velocities are small enough to prevent uncomfortable
air movements or even sickness of the occupants.
The computational steering framework developed in this thesis for the issues
of indoor CFD simulation is based on the Lattice-Boltzmann method (Succi, 2001;
Krafczyk, 2001). To shorten the computation cycle, the numerical model is simplified
compared to current state-of-the-art implementations, which usually include
advanced methods such as spatial and temporal adaptivity, multiple scales, and
boundary layer models. An additional, but also the most significant acceleration
is achieved by using supercomputers. Therefore, special emphasis is placed
on high-performance aspects of computational steering, especially with regard to
the systems available during this thesis, namely the Hitachi SR8000-F1 pseudo-vector
supercomputer and the SGI Altix 3700/4700 of the Leibniz Computing
Center (LRZ) in Munich.

The computational steering framework iFluids for indoor air-flow simulations
is introduced in chapter 2. Chapters 3–7 give a detailed introduction to the
development of a distributed computational steering framework, i.e. the theory of the
underlying Lattice-Boltzmann method, its parallelization and optimization for
supercomputers, followed by a description of the visualization and steering module
and the efficient coupling through a suitable communication concept. Subsequently,
the applicability of an interactive simulation tool in civil engineering is
investigated taking an operating theatre as the main example. Finally, the flexibility
of the modular software concept is demonstrated by applying the framework
to a completely different problem, namely the interactive blood flow simulation
within an artery.

Chapter 2

iFluids - Interactive Fluid Simulations

This chapter serves as an introduction to the computational fluid dynamics
application iFluids, which will only briefly be presented with regard to its
characteristic architecture and features to provide an overview of the main objective in
this thesis. The detailed description will be given in later chapters.

iFluids is an application which has been developed mainly as a tool for indoor
fluid flow simulations but it is also easy to extend for simulation studies in other
fields with a focus on geometric setup. What distinguishes it from other
computational fluid dynamics applications is its layout as a computational steering
framework for high-performance computers. Its user is able to visualize current,
near real-time simulation results on-the-fly and to interact with the simulation
while it is still running. Besides basic interaction options such as (re-)starting,
stopping, and pausing the computation, the user can modify the geometry of the
simulated scene as well as its boundary conditions in a standard desktop or a
high-end virtual reality user interface (Fig. 2.1). It is also possible to adapt the
computational kernel in terms of the numerical model or the best optimization
available for a particular hardware platform.

The following section describes the functional concept of iFluids. First, the
structural characteristics of the simulation kernel are given followed by a
presentation of the steering and visualization terminal. Finally, the special requirements
needed to achieve an interactive simulation tool providing the required rapid
responses to user interactions are summarized.

2.1 Architectural Software Concept

To utilize top-level hardware such as supercomputers for simulation and virtual
reality environments for visualization and steering in an efficient way, the iFluids
framework has been designed as a set of independent modules that can be run
(optimized) on many different platforms. An additional benefit of this architecture
is the possibility of running iFluids distributed on several inhomogeneous
hardware setups.


Figure 2.1: This sequence of snapshots shows the simulated development of air
flow in an office room over time. The first frame depicts the steady-state air
flow through the room and its underlying boundary conditions, which have been
defined beforehand through the application's VR user interface. The flow field is
visualized by streamlines while the boundary conditions on the walls are color-coded
from blue, representing resting air, to red for air moving with maximum
speed. From the left side of the room air enters through an open door streaming
directly to an open window in the opposite wall. In the second frame a desk
has been added to the scene and one can clearly see how the flow adjusts to the
new interior design after only a few seconds of simulation. This enables the
user to investigate a series of scenarios interactively and observe their impact on
the resulting flow configuration straight away. The investigating engineer can
not only change the external view of the scene but also employ different modes of
navigation such as walking and flying through the virtual room in his interactive
study.


The schematic diagram in Figure 2.2 shows the module concept of iFluids: the
visualization and steering front-end is run on suitable selected graphics hardware
ranging from standard laptops to high-end visualization environments, receiving
results from the simulation kernel that computes the current flow configuration
in the background. Optimally, the simulation kernel would not be executed
on the visualization machine but on an additional PC, cluster or supercomputer,
depending on the model size, level of detail, and the available hardware. The
graphics setups used during this thesis were the holobench at the Leibniz Computing
Center (LRZ), the Powerwall at the Lehrstuhl für Bauinformatik, and a
notebook with good 3D graphics capabilities. The simulation was either run on
a single Linux PC, the Bauinformatik Linux Cluster, or the Hitachi SR8000 supercomputer
at LRZ, depending on performance requirements and hardware availability.
In addition to data exploration, the user can, within the same front-end,
intuitively adapt the geometrical setup of the simulated model to precisely the
setup he is interested in. The steering terminal forwards the relevant interaction
events to the simulation kernel. The kernel immediately incorporates the changes
and sends back the adapted flow data set to the user front-end showing an initial
trend of the new configuration in the scene almost without delay. As mentioned
above, the simulation is preferably run on appropriate number crunching hardware,
which is most often different from the visualization and steering terminal.
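
The interaction cycle described above can be illustrated with a toy Python sketch. It is not the iFluids implementation: the class `SteeringKernel`, the queues `events` and `results`, the parameter `inflow`, and the scalar standing in for the flow field are all hypothetical names chosen for this example. The sketch only shows the control flow, namely that pending user modifications are applied before each time step, that a first trend is returned after a single step, and that results are otherwise sent at regular intervals.

```python
from queue import Queue, Empty


class SteeringKernel:
    """Toy stand-in for a simulation kernel that can be steered while it runs."""

    def __init__(self, events: Queue, results: Queue, send_every: int = 5):
        self.events = events          # interaction events from the front-end
        self.results = results        # data sets sent back to the front-end
        self.send_every = send_every  # regular transfer interval (in steps)
        self.inflow = 1.0             # hypothetical boundary-condition parameter
        self.state = 0.0              # hypothetical scalar "flow field"
        self.step_count = 0

    def _poll_events(self) -> bool:
        """Apply all pending user modifications; report whether any arrived."""
        changed = False
        while True:
            try:
                name, value = self.events.get_nowait()
            except Empty:
                return changed
            if name == "inflow":
                self.inflow = value
                changed = True

    def step(self) -> None:
        changed = self._poll_events()
        # Placeholder "solver": relax the state towards the current inflow value.
        self.state += 0.5 * (self.inflow - self.state)
        self.step_count += 1
        # After a modification a first trend is sent after a single step;
        # otherwise results are sent at regular intervals.
        if changed or self.step_count % self.send_every == 0:
            self.results.put((self.step_count, self.state))


events, results = Queue(), Queue()
kernel = SteeringKernel(events, results)
for _ in range(5):
    kernel.step()                 # regular result is sent after step 5
events.put(("inflow", 2.0))       # user interaction from the steering terminal
kernel.step()                     # first trend is sent right after step 6
sent = [results.get_nowait() for _ in range(results.qsize())]
```

In a real distributed setup the two queues would be replaced by a network communication layer between the front-end and the kernel.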

Figure 2.2: Module concept and data transfer: at the visualization client the
user can explore the continuously updated flow data interactively and control
the running simulation by modifying geometry, boundary conditions or general
flow parameters at the same time. The user's modifications are transmitted to
the solver, which immediately incorporates the new setup and, after a single time
step of computation, sends back a first trend of the new flow configuration almost
without delay. The CFD kernel continues the simulation and transfers current
data sets to the visualization and steering client in regular intervals until the
next user interaction occurs.

2.2 Interactive Numerical Kernel

The simulation kernel behind an interactive engineering application has to be
designed to meet several special requirements. To provide a convenient level of
interactive simulation a specially adapted numerical solver is needed, which has
to be able to take into account user modifications fully automatically. Furthermore,
the solver must be fast enough to generate useful information describing
the flow configuration within a short space of time and give at least an initial
trend of the flow behavior within a sub-second time frame. Needless to say, this
flexibility comes at a cost and is usually reflected in constraints such as a reduced
resolution and complexity of the model.

In order to support different simulation scenarios, the kernel should be able to
handle arbitrary geometric setups. In this respect iFluids covers a comparatively
wide range of applications as it has been used for the simulation of engine rooms
of freight ferries, offices, and blood vessels (cf. Fig. 2.3). It is accordingly
advantageous that no special data preparation like pre-generation of grids is necessary
to permit the immediate use of such geometries in the computational steering
framework.

Figure 2.3: Examples of geometries simulated with iFluids: the top picture
shows the model of an office, where the flow field develops within the extension
of the room. The picture bottom left shows a separator room of a big cargo ferry
(Flensburger Schiffbau Gesellschaft, 2007). Here, a comparatively complex geometry
dominates the scene. A completely different application of iFluids is
shown on the right, where an artery defines the region of blood flow in its interior
(data kindly provided by the University of Amsterdam: Section Computational
Science (2007)).

While the simulation is running, the user can enable or disable
a turbulence model and can change the simulated scene with regard to its geometry,
its boundary conditions and all relevant flow parameters. Advanced users
can also interactively tune the performance of the kernel by choosing adequate
optimization strategies depending on the hardware architecture in use.

The solver in iFluids is based on the Lattice-Boltzmann method, which has
emerged as a complementary technique for the computation of fluid flow phenomena
(see e.g. Succi (2001); Krafczyk (2001); Wolf-Gladrow (2000)). Typically,
the Lattice-Boltzmann method is implemented on Cartesian grids representing the
spatial discretization of the simulation domain, thus permitting a fast and fully
automatic grid generation. In each time step the Lattice-Boltzmann algorithm
computes the collision of microscopic, virtual particles modeled statistically by
a number of distribution functions at each grid point and migrates the distribution
functions of these particles to neighboring lattice sites in the so-called
propagation step. Fortunately, the computation of the collision does not require
any data exchange with neighboring grid nodes and only directly adjacent cells
are affected through propagated distributions. Therefore, the Lattice-Boltzmann
method allows for an efficient parallelization of the simulation kernel.

2.3 The Visualization and Steering Front-End

Within a computational steering application like iFluids the visualization and
steering front-end as the central interface between user and simulation is of
particular importance. To provide a natural and intuitive way to explore the simulation
data and to interact with the running simulation, the user interface for visualization
and steering has been combined into a single client front-end. Results
are visualized with both the geometry of the investigated scene and its boundary
conditions to allow for a better understanding and to support accurate and
well-directed interaction. In addition to the conventional desktop interface the
visualization client has been implemented to support virtual reality environments
with multiple projection screens and tracking of the positions of the user's head
and of input devices to allow a better immersion into the simulated scene.

Mercury's Open Inventor (Mercury Computer Systems, Inc., 2007b) is used
for visualization and scene-graph manipulation of object geometry and position,
navigation, and menu handling. It supports various modifications of objects,
such as translation, rotation, and scaling, as well as the transformation of mapped
data such as the seed points of a particle tracing for example. Data visualization is
realized with the help of the DataViz extension libraries of Open Inventor, which
provide a range of standard techniques like streamlines, iso-surfaces, and cross
sections.

The main task of the steering interface is to provide user interaction facilities to
manipulate the simulation run. Another optional feature allows the initial setup
for fluid parameters and start-up geometry to be predefined together with the
corresponding boundary conditions before the simulation starts. In this case, the
preprocessing module integrated in the computational steering framework can
be run as a stand-alone application (Kollinger, 2007).


During the simulation a context-based 3D menu for improved usability in
VR environments supports direct access to object parameters like boundary conditions
and intuitive interactions with post-processing objects (Marcheix, 2004).
New fluid obstacles can be imported from the file system and existing objects can
be transformed or removed from the simulated scene. Analogously, it is possible
to add and modify boundary conditions during the ongoing simulation.

The visualization and steering clients are again realized as encapsulated modules
interacting via clearly defined interfaces to simplify the exchange with other
visualization tools. This interface concept is also the prerequisite for extensions
such as the adaptation of the single-user application to collaborative multi-client
sessions.

Finally, the performance of the communication initiated by forwarding a user
interaction from the steering interface to the simulation kernel is of vital importance
for a responsive computational steering application.

2.4 Requirements of Interactive Simulation Software

As mentioned, computational steering may be defined as the fusion of the traditionally
separated steps of preprocessing, computation and postprocessing into
a closed loop. The computational steering application should enable a user to
run a simulation and monitor current results to estimate the state or trend of the
simulation. In addition, the minimum interaction options should comprise pausing,
stopping, and restarting the simulation. An advanced level of interaction is
provided by modifying some (fairly simple) simulation parameters. Finally, true
computational steering supports a multitude of repeated modifications without
the need to restart the simulation. This implies, in particular, that the geometry
of the simulated scene can be modified together with boundary conditions, and
general flow parameters. To realize a computational steering project of this
description, several fundamental prerequisites are identified, as described in brief
below.

• Numerical method: The underlying numerical method should allow for the
easy incorporation of user-initiated modifications. Accordingly, it should be
based on a grid or mesh that is suitable for fast generation and modification.
Furthermore, the feasibility of on-the-fly visualization of simulation results
must be ensured.

• Computation: To allow the user to watch the adaptation of the fluid to his
geometrical manipulations, the simulation needs to be fast enough. This
requires a highly optimized solver running preferably in parallel on adequate
hardware architecture.

• Data visualization: Current results have to be available and must be displayed
with minimum delay. The data visualization module should enable
interactive exploration of the resultant data. Displaying the geometry of the
simulated scene along with the simulation results makes for a better
understanding of the physical behavior.

• Steering and problem definition: Steering the application requires an
interface which should be combined with the visualization, preferably in a
single terminal to provide a more intuitive interaction with the simulation.
The main part of the steering process is the problem definition comprising
the geometric setup and manipulation as well as setting the boundary
condition.

• Communication: Regarding the usability of the computational steering
application, the communication coupling the modular building blocks has to
be flexible and efficient as the user not only expects to be shown continually
updated simulation results during data exploration but also wants to have
instant responses to his modifications.

In the following chapters the modules making up the computational steering
framework will be introduced with special attention paid to the requirements
listed above.

Chapter 3

Computational Fluid Dynamics Using the Lattice-Boltzmann Method

This chapter gives a short general introduction into computational fluid dynamics
(CFD) followed by a brief derivation of the Lattice-Boltzmann method (LBM)
which has been used for the numerical simulations in this thesis. The third section
of this chapter describes the simulation kernel implemented for the computational
steering project iFluids.

3.1 Computational Fluid Dynamics

The conventional form of describing fluid phenomena in CFD is based on the famous
Navier-Stokes equations. They can be summarized as two sets of equations
to express both mass and momentum conservation. Often, only the simplified
Navier-Stokes equations for describing an incompressible Newtonian fluid, i.e.
$\rho = \text{const}$, are used. They can be found in literature as

\[ \frac{\partial u_\alpha}{\partial x_\alpha} = 0 \tag{3.1} \]

\[ \frac{\partial u_\alpha}{\partial t} + u_\beta \frac{\partial u_\alpha}{\partial x_\beta}
   = -\frac{1}{\rho} \frac{\partial p}{\partial x_\alpha}
   + \nu \frac{\partial}{\partial x_\beta} \frac{\partial u_\alpha}{\partial x_\beta} \tag{3.2} \]

with the density $\rho$, pressure $p$, velocity $u$ and the kinematic viscosity $\nu$. Note
that terms containing repeated Greek indices have to be interpreted according to
the Einstein summation convention, i.e. $x_\alpha y_\alpha = \sum_{\alpha=1}^{D} x_\alpha y_\alpha$.
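
As a brief illustration of the summation convention, the continuity equation (3.1) can be written out for $D = 3$. The component names $(x_1, x_2, x_3)$ and $(u_1, u_2, u_3)$ are an assumed Cartesian labelling used only for this example:

\[ \frac{\partial u_\alpha}{\partial x_\alpha}
   = \sum_{\alpha=1}^{3} \frac{\partial u_\alpha}{\partial x_\alpha}
   = \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3} = 0 \]

In the same way, the convective term $u_\beta \, \partial u_\alpha / \partial x_\beta$ in (3.2) is summed over $\beta$ only, while $\alpha$ remains a free index that selects the momentum component.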
There are numerous methods of solving these equations numerically, some
classical approaches are for example (wikipedia, 2007a):

Finite Differences Method (FDM): Applying this method the derivatives in
the transport equation are approximated by Taylor expansions at discrete
points usually on a regular grid. Accordingly, the discretization error is
given by the difference between the values on the discrete grid and the exact
solution. It is possible to reduce this error by taking terms of higher order
in the Taylor approximation into consideration (Noll, 1993). A problem of
this method is that the solution of the difference equations is not necessarily
conservative, i.e. it can happen that inward and outward fluxes of a given
domain are not in balance (Schönung, 1990).

Finite Volume Method (FVM): FVM may be seen as the classical or standard
approach as it is used most often in commercial software and research codes.
Here, the computational domain is discretized into so-called control volumes
over which the differential equations are integrated. From these volume
integrals or their corresponding surface integrals one can derive balance
equations which guarantee a conservative discretization. The advantage
over the FDM lies in the conservative discretization, which allows
non-equidistant and curvilinear meshes (Noll, 1993; Schönung, 1990).

Finite Element Method (FEM): This method is particularly popular for structural
analysis of solids, but it is also applicable to fluid dynamics (Schönung,
1990). Instead of solving the partial differential equations directly, solutions
to a weak formulation of the equations are sought by minimizing an
integral. This method can be applied to unstructured grids consisting of
triangles or quadrangles, the latter also being preferred for fluid dynamic
problems.

The above-mentioned methods are all based on the discretization of the Navier-Stokes
differential equations. An alternative approach to simulate fluid dynamics
is the Lattice-Boltzmann method (LBM) (Wolf-Gladrow, 2000). Instead of solving
the Navier-Stokes equations directly, LBM can be seen as a discrete microscopic
model which conserves mass and momentum by construction. The corresponding
macroscopic quantities are obtained through a multi-scale analysis.

3.2 Lattice-Boltzmann Method

The Boltzmann equation is the central equation of transport theory in statistical
mechanics and is used, e.g., to describe the distribution of particles in a fluid. It
describes the time evolution of the distribution function $f(\mathbf{x}, \mathbf{u}, t)$ of the number
of particles in the phase-space volume $d^3x \, d^3u$ at time $t$, where $\mathbf{x}$ and $\mathbf{u}$ are
position and velocity, respectively. Considering changes in the particle distributions
due to external forces $F$ or through internal collisions $\Omega$ between particles, the
Boltzmann equation reads

\[ \frac{\partial f}{\partial t} + \frac{\partial f}{\partial x_\alpha} u_\alpha
   + \frac{\partial f}{\partial u_\alpha} F_\alpha = \Omega(f). \tag{3.3} \]

Beginning with a discretized version of the Boltzmann equation, the LBM approach
computes the dynamics of such statistical particle distributions for a discrete
number of velocities on a computational grid or lattice (cf. Fig. 3.1). The
statistical description also represents the main improvement of the LBM over its
historical origin, the Lattice-Gas automata (LGA)¹.

Figure 3.1: Two-dimensional lattice site in the LBM: This scheme shows nine
discrete velocity vectors for a grid point in a 2D lattice, which represent the 8
velocities to the neighboring nodes and the resting velocity 0 in the center of
the cell. This is just one example in a number of lattice models DkQb, the most
popular ones being D2Q9, D3Q15, and D3Q19. In this notation introduced
by Qian et al. (1992), k denotes the spatial dimension of simulation space and b
refers to the number of lattice velocities. In choosing an appropriate lattice model
it is important to bear in mind that a sufficient symmetry of the lattice is guaranteed,
otherwise the LBM cannot correctly reflect the Navier-Stokes equations
(Frisch et al., 1986).
By design, the LBM conserves the quantities of mass and momentum
to fulfill the hydrodynamic laws. The Lattice-Boltzmann algorithm computes the
collision of microscopic, virtual particles and updates the velocity distribution
functions in each simulation time-step followed by a propagation step where the
migration of these distribution functions to neighboring cells takes place. Typically,
the LBM is implemented on uniform Cartesian grids, which makes it particularly
well-suited for taking advantage of parallelization and/or vectorization
capabilities of high-performance supercomputers and allows complex
geometries to be handled.

Although the LBM represents a relatively modern approach it has already
been extended in many ways. There are (mainly research) codes for multiphase
or free surface flow (Ginzburg and Steiner, 2003; Thürey and Rüde, 2004; Tölke,
2001; He et al., 1999; Shan and Chen, 1993), thermal fluid simulations (van Treeck,
2004; Lallemand and Luo, 2003; Mezrhab et al., 2004; van Treeck et al., 2006),
acoustic simulations (Lallemand and Luo, 2003; Haydock and Yeomans, 2003;
Neuhierl, 2006), and medical simulations (Bernsdorf et al., 2006; Hirabayashi
et al., 2003; Artoli, 2003; Götz, 2006; Sloot et al., 2004). By now, LBM is a well
understood and accepted method of simulating fluid dynamics and is also realized
in the commercial product PowerFLOW developed by Exa Corporation (Exa
Corporation, 2007).

¹ A lattice gas automaton is a special type of cellular automaton, which is defined by a lattice
of cells with a local update rule determining each cell's state. This update rule is applied
simultaneously to all cells and only uses information of a cell's current state and that of certain
neighboring cells. Each cell can either be empty or occupied by a single particle, this restriction
being the characteristic difference from LBM.

As mentioned above, the LBM has been developed from the Lattice-Gas automata.
The main motivation for the transition from LGA to LBM was the desire
to remove the statistical noise by replacing the Boolean particle number in a lattice
direction by its ensemble average, the so-called density distribution function
(wikipedia, 2007b). This replacement has to be accompanied by a consecutive
modification of the discrete collision rules to a continuous function, the collision
operator.

There are numerous ways of introducing the Lattice-Boltzmann equation (LBE).
Following Chen and Doolen (1998), the starting point is the discrete version of
the kinetic Equation (3.3) for the particle distribution function neglecting external
forces:

\[ f_i(\mathbf{x} + \mathbf{e}_i \Delta x, t + \Delta t) = f_i(\mathbf{x}, t) + \Omega_i(f_i(\mathbf{x}, t)), \quad (i = 0, 1, \dots, N) \tag{3.4} \]

where $f_i$ represents the particle velocity distribution function along the $i$-th direction
(cf. Figure 3.1); $\Omega_i$ is the collision operator expressing the rate of change of
$f_i$ due to collision. $f_i(\mathbf{x}, t)$ is the probability density of particles in $\mathbf{x}$ at time $t$.
Therefore, the macroscopic density $\rho(\mathbf{x}, t)$ can be computed as the zero-th order
velocity moment

\[ \rho(\mathbf{x}, t) = \sum_{i=0}^{N} f_i(\mathbf{x}, t). \tag{3.5} \]

Moreover, the fluid momentum is the first order velocity moment

\[ \rho(\mathbf{x}, t) \, \mathbf{u}(\mathbf{x}, t) = \sum_{i=0}^{N} f_i(\mathbf{x}, t) \, \mathbf{c}_i \tag{3.6} \]

with the macroscopic velocity $\mathbf{u}(\mathbf{x}, t)$ and the mesoscopic lattice velocity $\mathbf{c}_i$.
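
The two moments (3.5) and (3.6) can be illustrated for a single lattice site with a short Python sketch. The ordering of the D2Q9 velocities in `C` and the sample distribution values in `f` are assumptions made for this example only; they are not taken from the iFluids kernel.

```python
# D2Q9 lattice velocities c_i in lattice units (rest particle first).
# The ordering is an assumption of this sketch.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]


def moments(f):
    """Return density rho and velocity u from the distributions f_i at one site."""
    rho = sum(f)                                                       # Eq. (3.5)
    momentum = [sum(fi * c[a] for fi, c in zip(f, C)) for a in range(2)]  # Eq. (3.6)
    u = [m / rho for m in momentum]
    return rho, u


# Arbitrary sample distributions with a net drift in the +x direction.
f = [0.4, 0.15, 0.1, 0.05, 0.1, 0.05, 0.05, 0.05, 0.05]
rho, u = moments(f)   # rho is approximately 1.0, u approximately [0.1, 0.0]
```

Both sums involve only the values stored at the site itself, which is why the macroscopic quantities are available locally at every grid point.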
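Applying the conservation of mass and momentum to the equations, two constraints
on the collision operator are found:

\[ \sum_{i=0}^{N} \Omega_i(f) = 0 \tag{3.7} \]

\[ \sum_{i=0}^{N} \mathbf{c}_i \, \Omega_i(f) = 0 \tag{3.8} \]

To transform the discrete LBE into a continuous equation accurate up to second
order in $\Delta t$, a Taylor expansion is applied:

\[
\begin{aligned}
f_i(\mathbf{x} + \mathbf{c}_i \Delta t, t + \Delta t) = {} & f_i(\mathbf{x}, t)
  + \frac{\partial f_i(\mathbf{x}, t)}{\partial x_\alpha} c_{i\alpha} \Delta t
  + \frac{\partial f_i(\mathbf{x}, t)}{\partial t} \Delta t \\
& + \frac{1}{2} \frac{\partial^2 f_i(\mathbf{x}, t)}{\partial x_\alpha \partial x_\beta} c_{i\alpha} c_{i\beta} \Delta t^2
  + \frac{1}{2} \frac{\partial^2 f_i(\mathbf{x}, t)}{\partial t^2} \Delta t^2 \\
& + \frac{\partial^2 f_i(\mathbf{x}, t)}{\partial t \, \partial x_\alpha} c_{i\alpha} \Delta t^2 + O(\Delta t^3)
\end{aligned} \tag{3.9}
\]

After that, the particle distribution functions are expanded by the equilibrium
distribution function $f_i^{eq}$ using the Chapman-Enskog multi-scale expansion, i.e.

\[ f_i = f_i^{eq} + \varepsilon f_i^{(1)} + \varepsilon^2 f_i^{(2)} + O(\varepsilon^3)
       = f_i^{eq} + \varepsilon f_i^{(neq)} \tag{3.10} \]

with the non-equilibrium distribution function $f_i^{(neq)} = f_i^{(1)} + \varepsilon f_i^{(2)} + O(\varepsilon^2)$.
In analogy to Equations (3.5) and (3.6), $f_i^{eq}$ should satisfy

\[ \sum_i f_i^{eq} = \rho, \qquad \sum_i f_i^{eq} \mathbf{c}_i = \rho \mathbf{u}, \tag{3.11} \]

requiring for the non-equilibrium parts $f_i^{(k)}$ with $k = \{1, 2\}$ that

\[ \sum_i f_i^{(k)} = 0, \qquad \sum_i f_i^{(k)} \mathbf{c}_i = 0. \tag{3.12} \]

Taylor expansion of $\Omega_i(f)$ about $f^{eq}$ assuming $f^{neq} \ll f^{eq}$ yields

\[ \Omega_i(f) = \Omega_i(f^{eq})
   + \sum_{j=0}^{N} \frac{\partial \Omega_i(f^{eq})}{\partial f_j} f_j^{neq}
   + \frac{1}{2} \sum_{j=0}^{N} \sum_{k=0}^{N} \frac{\partial^2 \Omega_i(f^{eq})}{\partial f_j \, \partial f_k} f_j^{neq} f_k^{neq}
   + O(|f^{neq}|^3) \tag{3.13} \]

From Equation (3.4) we see that $\Omega_i(f^{eq}) = 0$ for $\varepsilon \to 0$. By keeping only terms
linear in $f_i^{neq}$ Equation (3.13) can be simplified to the linearized collision operator

\[ \Omega_i(f) = \sum_{j=0}^{N} \frac{\partial \Omega_i(f^{eq})}{\partial f_j} f_j^{neq}
              = \sum_{j=0}^{N} M_{ij} (f_j - f_j^{eq}) \tag{3.14} \]

where $M_{ij} \equiv \partial \Omega_i(f^{eq}) / \partial f_j$ satisfying the constraints (Benzi et al., 1992)

\[ \sum_{j=0}^{N} M_{ij} = 0, \qquad \sum_{j=0}^{N} \mathbf{c}_i M_{ij} = 0. \tag{3.15} \]

Assuming that the local particle distribution relaxes to an equilibrium state with
a single rate $1/\tau$ we have $M_{ij} = -\frac{1}{\tau} \delta_{ij}$ and the BGK collision term (Bhatnagar
et al., 1954)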
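\[ \Omega_i = -\frac{1}{\tau} (f_i - f_i^{eq}), \tag{3.16} \]

which leads to the LBGK equation

\[ f_i(\mathbf{x} + \mathbf{e}_i, t + 1) = f_i(\mathbf{x}, t) - \frac{f_i - f_i^{eq}}{\tau}. \tag{3.17} \]

The following section shows that the macroscopic velocity $\mathbf{u}$ obtained from the
solution of this equation fulfills the Navier-Stokes equation up to second order
accuracy.

In addition to Equation (3.10), the Chapman-Enskog expansion is employed to
obtain

\[ \frac{\partial}{\partial t} = \varepsilon \frac{\partial}{\partial t_1} + \varepsilon^2 \frac{\partial}{\partial t_2}, \qquad
   \frac{\partial}{\partial x} = \varepsilon \frac{\partial}{\partial x_1}. \tag{3.18} \]

Combining the continuous Taylor-expanded LBE (3.9) and the linearized collision
operator (3.14) we get

\[ \frac{\partial f_i}{\partial t} + \frac{\partial f_i}{\partial x_\alpha} c_{i\alpha}
   + \frac{1}{2} \Delta t \left( \frac{\partial^2 f_i}{\partial x_\alpha \partial x_\beta} c_{i\alpha} c_{i\beta}
   + \frac{\partial^2 f_i}{\partial t^2} + 2 \frac{\partial^2 f_i}{\partial t \, \partial x_\alpha} c_{i\alpha} \right)
   = \frac{1}{\Delta t} \sum_{j=0}^{N} M_{ij} (f_j - f_j^{eq}). \tag{3.19} \]

Applying the Chapman-Enskog expansions (3.10) and (3.18) now leads to

\[ \frac{\partial f_i^{eq}}{\partial t_1} + \frac{\partial f_i^{eq}}{\partial x_{1\alpha}} c_{i\alpha}
   = \frac{1}{\Delta t} \sum_{j=0}^{N} M_{ij} f_j^{(1)} \tag{3.20} \]

to order $\varepsilon^0$ and to

\[ \frac{\partial f_i^{eq}}{\partial t_2} + \frac{\partial f_i^{(1)}}{\partial t_1} + \frac{\partial f_i^{(1)}}{\partial x_{1\alpha}} c_{i\alpha}
   + \frac{1}{2} \Delta t \left( \frac{\partial^2 f_i^{eq}}{\partial x_{1\alpha} \partial x_{1\beta}} c_{i\alpha} c_{i\beta}
   + \frac{\partial^2 f_i^{eq}}{\partial t_1^2} + 2 \frac{\partial^2 f_i^{eq}}{\partial t_1 \, \partial x_{1\alpha}} c_{i\alpha} \right)
   = \frac{1}{\Delta t} \sum_{j=0}^{N} M_{ij} f_j^{(2)} \tag{3.21} \]

to order $\varepsilon^1$. After some algebra the first order equation can be simplified to

\[ \frac{\partial f_i^{eq}}{\partial t_2}
   + \sum_{j=0}^{N} \left( \delta_{ij} + \frac{1}{2} M_{ij} \right)
     \left( \frac{\partial f_j^{(1)}}{\partial t_1} + \frac{\partial f_j^{(1)}}{\partial x_{1\alpha}} c_{i\alpha} \right)
   = \frac{1}{\Delta t} \sum_{j=0}^{N} M_{ij} f_j^{(2)}. \tag{3.22} \]

Now the zero-th order velocity moments of Equations (3.20) and (3.22) are calculated
and recombined with respect to their time scale, i.e.

\[ \frac{\partial \rho}{\partial t} + \frac{\partial \rho u_\alpha}{\partial x_\alpha} = 0, \tag{3.23} \]

which is the mass conservation satisfied by the LBE up to second order accuracy.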

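In the next step the first order velocity moments are calculated accordingly and after
recombining the scales we arrive at

\[ \frac{\partial \rho u_\alpha}{\partial t} + \frac{\partial}{\partial x_\beta} \Pi_{\alpha\beta} = 0 \tag{3.24} \]

with the momentum flux density tensor

\[ \Pi_{\alpha\beta} = \sum_{i=0}^{N} c_{i\alpha} c_{i\beta}
   \left[ f_i^{eq} + \sum_{j=0}^{N} \left( \delta_{ij} + \frac{1}{2} M_{ij} \right) \varepsilon f_j^{(1)} \right]. \tag{3.25} \]

Finally, we need to specify the equilibrium distributions corresponding to the
lattice structure. To simplify the derivation without losing generality, we take a
look at a two-dimensional square lattice (cf. Figure 3.1, Chen and Doolen (1998)).
Therefore, nine lattice velocities are defined:

\[
\begin{aligned}
\mathbf{e}_i &= \left( \cos\left(\tfrac{\pi}{2}(i-1)\right), \sin\left(\tfrac{\pi}{2}(i-1)\right) \right), && \text{for } i = 1, 2, 3, 4; \\
\mathbf{e}_i &= \sqrt{2} \left( \cos\left(\tfrac{\pi}{2}(i-1) + \tfrac{\pi}{4}\right), \sin\left(\tfrac{\pi}{2}(i-1) + \tfrac{\pi}{4}\right) \right), && \text{for } i = 5, 6, 7, 8; \\
\mathbf{e}_0 &= 0 && \text{for the zero-speed velocity.}
\end{aligned} \tag{3.26}
\]

According to Chen et al. (1992) the general form of the equilibrium distribution
function can be written to $O(u^2)$ as

\[ f_i^{eq} = \rho \left( a + b \, \mathbf{e}_i \cdot \mathbf{u} + c \, (\mathbf{e}_i \cdot \mathbf{u})^2 + d \, u^2 \right) \tag{3.27} \]

with the lattice constants $a$, $b$, $c$, and $d$. It is only for small Mach numbers and by
obeying the constraints in (3.11) that these constants can be determined as

\[ f_i^{eq} = \rho \, \omega_i \left( 1 + \frac{c_{i\alpha} u_\alpha}{c_s^2}
   + \frac{1}{2 c_s^2} \left( \frac{c_{i\alpha} c_{i\beta}}{c_s^2} - \delta_{\alpha\beta} \right) u_\alpha u_\beta \right) \tag{3.28} \]

with a sound speed $c_s = 1/\sqrt{3}$ and $\omega_0 = 4/9$, $\omega_{1..4} = 1/9$, and $\omega_{5..8} = 1/36$.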
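Inserting the equilibria into the flux tensor (3.25) we obtain

\[ \Pi^{(0)}_{\alpha\beta} = \sum_{i=0}^{N} c_{i\alpha} c_{i\beta} f_i^{eq} = p \, \delta_{\alpha\beta} + \rho u_\alpha u_\beta, \tag{3.29} \]

\[ \Pi^{(1)}_{\alpha\beta} = \sum_{i=0}^{N} \sum_{j=0}^{N} c_{i\alpha} c_{i\beta} \left( \delta_{ij} + \frac{1}{2} M_{ij} \right) \varepsilon f_j^{(1)}
   = \nu \left( \frac{\partial \rho u_\alpha}{\partial x_\beta} + \frac{\partial \rho u_\beta}{\partial x_\alpha} \right) \]

where the pressure $p = \rho / 3$ and the kinematic viscosity $\nu = (2\tau - 1)/6$.
The resulting momentum equation is now

\[ \rho \left( \frac{\partial u_\alpha}{\partial t} + u_\beta \frac{\partial u_\alpha}{\partial x_\beta} \right)
   = -\frac{\partial p}{\partial x_\alpha}
   + \nu \frac{\partial}{\partial x_\beta} \left( \frac{\partial \rho u_\alpha}{\partial x_\beta} + \frac{\partial \rho u_\beta}{\partial x_\alpha} \right), \tag{3.30} \]

which resembles the Navier-Stokes equation as long as the density variation $\delta\rho$ is
small enough.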
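
To make the collide-and-propagate structure of the LBGK equation (3.17) concrete, the following pure-Python sketch performs the update on a small periodic D2Q9 grid using the equilibrium (3.28). It is a minimal illustration under stated assumptions and not the iFluids kernel: lattice units with unit grid spacing and time step are assumed, the function names `equilibrium`, `lbgk_step` and `viscosity`, the velocity ordering in `E`, the periodic boundaries, and the shear-wave initial condition are all choices made for this example.

```python
import math

# D2Q9 velocities: rest, four axis directions, four diagonal directions.
E = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4   # weights omega_i
CS2 = 1 / 3                                 # squared sound speed c_s^2


def equilibrium(rho, ux, uy):
    """Equilibrium distributions of Eq. (3.28) for one lattice site."""
    usq = ux * ux + uy * uy
    feq = []
    for (ex, ey), w in zip(E, W):
        eu = ex * ux + ey * uy
        feq.append(rho * w * (1 + eu / CS2 + eu * eu / (2 * CS2 ** 2) - usq / (2 * CS2)))
    return feq


def lbgk_step(f, tau):
    """One collide-and-propagate update, Eq. (3.17), on a periodic grid.

    f[x][y] is the list of the nine distributions at site (x, y).
    """
    nx, ny = len(f), len(f[0])
    new = [[[0.0] * 9 for _ in range(ny)] for _ in range(nx)]
    for x in range(nx):
        for y in range(ny):
            fi = f[x][y]
            rho = sum(fi)
            ux = sum(v * e[0] for v, e in zip(fi, E)) / rho
            uy = sum(v * e[1] for v, e in zip(fi, E)) / rho
            feq = equilibrium(rho, ux, uy)
            for i, (ex, ey) in enumerate(E):
                post = fi[i] - (fi[i] - feq[i]) / tau          # purely local collision
                new[(x + ex) % nx][(y + ey) % ny][i] = post    # propagation to neighbor
    return new


def viscosity(tau):
    """Kinematic viscosity in lattice units, nu = (2 tau - 1) / 6."""
    return (2 * tau - 1) / 6


# Usage: a small periodic box with a sinusoidal shear wave as initial condition.
nx, ny, tau = 8, 8, 0.8
f = [[equilibrium(1.0, 0.05 * math.sin(2 * math.pi * y / ny), 0.0)
      for y in range(ny)] for x in range(nx)]
mass_before = sum(sum(site) for column in f for site in column)
for _ in range(10):
    f = lbgk_step(f, tau)
mass_after = sum(sum(site) for column in f for site in column)
```

The collision uses only the nine values stored at a site, and the propagation writes to directly adjacent sites only. The total mass is unchanged by the update, while the shear wave is damped by the viscosity set through `tau`.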

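3.3 Implementation of the LBM Solver

The Lattice-Boltzmann kernel implemented for the computational steering application
iFluids is a BGK solver based on a D3Q15 lattice.
The 15 velocities are defined as

\[
\mathbf{c}_{i=0,\dots,14} = c_s
\begin{pmatrix}
0 & 1 & -1 & 0 & 0 & 0 & 0 & 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
0 & 0 & 0 & 1 & -1 & 0 & 0 & 1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & -1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 & 1
\end{pmatrix},
\]

and the corresponding equilibria can be found to be

\[
\begin{aligned}
f_0^{eq} &= \frac{2}{9} \rho \left[ 1 - \frac{3}{2} \mathbf{u} \cdot \mathbf{u} \right], \\
f_i^{eq} &= \frac{1}{9} \rho \left[ 1 + 3 \, \mathbf{e}_i \cdot \mathbf{u} + \frac{9}{2} (\mathbf{e}_i \cdot \mathbf{u})^2 - \frac{3}{2} \mathbf{u} \cdot \mathbf{u} \right], && \text{for } i = 1, \dots, 6, \\
f_i^{eq} &= \frac{1}{72} \rho \left[ 1 + 3 \, \mathbf{e}_i \cdot \mathbf{u} + \frac{9}{2} (\mathbf{e}_i \cdot \mathbf{u})^2 - \frac{3}{2} \mathbf{u} \cdot \mathbf{u} \right], && \text{for } i = 7, \dots, 14,
\end{aligned}
\]

where $\mathbf{e}_i = \frac{1}{c_s} \mathbf{c}_i$.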
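
As a plausibility check of the D3Q15 velocity set and equilibria given above, the following Python sketch evaluates the equilibrium distributions for one site and recovers density and momentum from them, as required by the constraints (3.11). The function name `equilibrium` and the sample values for `rho` and `u` are assumptions of this example; the velocity columns and weights are those listed in this section.

```python
# Columns of the D3Q15 velocity matrix (dimensionless directions e_i).
E = [(0, 0, 0),
     (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1),
     (1, 1, 1), (-1, -1, -1), (1, 1, -1), (-1, -1, 1),
     (1, -1, 1), (-1, 1, -1), (1, -1, -1), (-1, 1, 1)]
# Weights: 2/9 for the rest particle, 1/9 for i = 1..6, 1/72 for i = 7..14.
W = [2 / 9] + [1 / 9] * 6 + [1 / 72] * 8


def equilibrium(rho, u):
    """D3Q15 equilibrium distributions for density rho and velocity u."""
    usq = sum(c * c for c in u)
    feq = []
    for e, w in zip(E, W):
        eu = sum(a * b for a, b in zip(e, u))
        feq.append(w * rho * (1 + 3 * eu + 4.5 * eu * eu - 1.5 * usq))
    return feq


# Sample macroscopic state (arbitrary values for illustration).
rho, u = 1.2, (0.03, -0.02, 0.01)
feq = equilibrium(rho, u)
density = sum(feq)                                                    # equals rho
momentum = [sum(f * e[a] for f, e in zip(feq, E)) for a in range(3)]  # equals rho * u
```

For the rest particle the scalar product $\mathbf{e}_0 \cdot \mathbf{u}$ vanishes, so the general expression reduces to the form given for $f_0^{eq}$.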
For high Reynolds numbers (see (3.31)) an optional turbulence model is integrated
into the solver to take unresolved sub-grid phenomena into account. To
decide whether a turbulence model is needed or not the Reynolds number must
be estimated as

\[ Re = \frac{u L}{\nu} \tag{3.31} \]

with $u$ the mean fluid velocity, $L$ the characteristic length, and $\nu$ the kinematic
fluid viscosity.

Laminar flow occurs at low Reynolds numbers (i.e. $Re < 2100$), while turbulent
flow occurs at high Reynolds numbers (i.e. $Re > 4000$). The transition between
laminar and turbulent flow is often indicated by a critical Reynolds number
which depends on the exact flow configuration.

Assuming a typical indoor ventilation setup, the air velocity can be estimated
as ≈ 0.13 m/s to keep the room climate comfortable. A ceiling height of 2.75 m is
taken to be the characteristic length, which leads to a Reynolds number of Re ≈
23800 for a kinematic fluid viscosity of 1.5 · 10⁻⁵ m²/s (at a temperature of 20 °C).
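
The estimate can be reproduced with a few lines of Python. The function names `reynolds_number` and `flow_regime` are hypothetical helpers for this example; the thresholds 2100 and 4000 are the ones quoted above and are only rough indicators, since the critical Reynolds number depends on the flow configuration.

```python
def reynolds_number(velocity, length, kinematic_viscosity):
    """Reynolds number Re = u L / nu, Eq. (3.31), for SI input values."""
    return velocity * length / kinematic_viscosity


def flow_regime(re):
    """Rough classification using the thresholds quoted in the text."""
    if re < 2100:
        return "laminar"
    if re > 4000:
        return "turbulent"
    return "transitional"


# Indoor ventilation example: u = 0.13 m/s, L = 2.75 m, nu = 1.5e-5 m^2/s.
re = reynolds_number(0.13, 2.75, 1.5e-5)   # roughly 2.38e4
regime = flow_regime(re)
```

With these values the flow is well inside the turbulent range, which motivates the optional turbulence model.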
A popular turbulence model is the Large Eddy Simulation (LES) using the
Smagorinsky sub-grid model. This model has also been implemented for the
Lattice-Boltzmann solver within the iFluids computational steering application.
The advantage of this model is its relatively low computational cost, compared
to the large parameter stability improvement of the solver. Simulations on
comparatively coarse grids, in particular, are made possible with this enhancement
(Thürey and Rüde, 2005).

The key idea of the Smagorinsky sub-grid model is the concept of an eddy
viscosity $\nu_t$ as a synthetic indicator of the damping effects on large scales caused
by small scales. $\nu_t$ is related to the local strain tensor as (Succi et al., 1995; Krafczyk,
2001; Hartmann, 2005)

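\[ \nu_t = C_s^2 \sqrt{S^2} \tag{3.32} \]

with

\[ S_{\alpha\beta} = \frac{1}{2} \left( \partial_\alpha u_\beta + \partial_\beta u_\alpha \right). \tag{3.33} \]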
$C_s$ is the empirical Smagorinsky constant. An obvious advantage of the LBM
using this turbulence model is the local availability of the strain tensor at each
lattice site, which is obtained by (cf. Thürey and Rüde (2005); Artoli (2003)):

\[ S_{\alpha\beta} = \sum_{i=0}^{14} e_{i\alpha} e_{i\beta} (f_i - f_i^{eq}). \tag{3.34} \]
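
The local character of Equation (3.34) can be illustrated with a short Python sketch that evaluates the sum for a single D3Q15 site. This is an illustration only and not the iFluids implementation: the helper names `equilibrium`, `second_moment_neq` and `magnitude`, the Frobenius norm used as a scalar measure, and the perturbed sample distributions are assumptions of this example. Any proportionality factors between this sum and the physical strain rate, as well as the eddy viscosity itself, are deliberately left out.

```python
import math

# D3Q15 directions e_i and weights as used for the equilibria in this section.
E = [(0, 0, 0),
     (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1),
     (1, 1, 1), (-1, -1, -1), (1, 1, -1), (-1, -1, 1),
     (1, -1, 1), (-1, 1, -1), (1, -1, -1), (-1, 1, 1)]
W = [2 / 9] + [1 / 9] * 6 + [1 / 72] * 8


def equilibrium(rho, u):
    """D3Q15 equilibrium distributions for density rho and velocity u."""
    usq = sum(c * c for c in u)
    feq = []
    for e, w in zip(E, W):
        eu = sum(a * b for a, b in zip(e, u))
        feq.append(w * rho * (1 + 3 * eu + 4.5 * eu * eu - 1.5 * usq))
    return feq


def second_moment_neq(f):
    """Sum over e_ia * e_ib * (f_i - f_i^eq) as in Eq. (3.34), for one site."""
    rho = sum(f)
    u = [sum(fi * e[a] for fi, e in zip(f, E)) / rho for a in range(3)]
    feq = equilibrium(rho, u)
    return [[sum(e[a] * e[b] * (fi - fq) for e, fi, fq in zip(E, f, feq))
             for b in range(3)] for a in range(3)]


def magnitude(s):
    """Frobenius norm of a 3x3 tensor (scalar measure assumed for this sketch)."""
    return math.sqrt(sum(s[a][b] ** 2 for a in range(3) for b in range(3)))


f_eq = equilibrium(1.0, (0.05, 0.0, 0.0))
s_eq = second_moment_neq(f_eq)        # vanishes when the site is at equilibrium
f_pert = list(f_eq)
f_pert[7] += 0.01                     # perturb two opposite diagonal directions
f_pert[8] += 0.01
s_pert = second_moment_neq(f_pert)    # non-zero, symmetric tensor
```

No finite-difference stencil over neighboring sites is needed, since only the distributions stored at the site itself enter the sum.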

Chapter 4

Fluid Simulation on Supercomputers

After a short introduction into High-Performance Computing (HPC) in general,
this chapter describes the optimization and parallelization strategies for the
Lattice-Boltzmann solver in the computational steering framework iFluids. The solver
has mainly been optimized for the Hitachi SR8000 system at LRZ but has
also been ported to SGI Altix 3700/4700 systems. Therefore, the architecture of
the Hitachi SR8000 and its features will be introduced first, followed by a detailed
discussion of the methods applied for parallelization and optimization. Then, the
architecture of the SGI Altix will be sketched to be able to motivate the main
differences in the parallelization and optimization approach. Finally, the simulation
performance of both machines will be compared.

4.1 High-Performance Computing

High-Performance Computing refers to the utilization of supercomputers for
performing numerical simulations. In the late 1980s, the US Government defined
supercomputers as systems having one or more processors capable of achieving
an aggregate performance of more than 100 MFlops¹ (Standish, 2006). In
an industrial sector developing so fast that its cumulative performance doubles
almost every year (see Figure 4.1) it is obvious that definitions like this are
untenable. Even today's PCs (2006) already reach up to 5 GFlops. Another approach
defines supercomputers as computing resources which provide more than an order
of magnitude higher computing power than is usually available on modern
PCs (Standish, 2006).

Even with this definition, HPC systems span the range from departmental
clusters of off-the-shelf workstations up to the largest and fastest supercomputers
in the world. Traditionally, HPC is used in scientific and engineering fields such
as Computational Fluid Dynamics, Molecular Dynamics, and Quantum Chromo-Dynamics.
Due to their scientific origin HPC codes are still mainly written in
Fortran, much fewer in C. Newer implementations also use object-oriented
languages such as C++ and even Java.

¹ MFlops = MegaFlops = 10⁶ Floating Point Operations per Second

Figure 4.1: This chart shows the performance development of the top 500
supercomputers during the last decade. The peak performance of the fastest
supercomputer on the list is marked by the red line, while the performance of the weakest
is represented by the purple line. The green line shows the accumulated peak
performance of all supercomputers on the list (taken from TOP500 (2007)).
Evidently, the performance of these systems nearly doubles every year.


As described in Standish (2006) and Dowd and Severance (1998), high-performance
computers are usually classified into three categories according to their
processing architecture:

• Vector computers: The CPUs of vector machines provide instructions
with vector operands that allow the CPU to efficiently exploit the vector
hardware with its multiple arithmetic units. By organizing memory in banks and
using fast memory technology running at CPU clock speed the bandwidth
between CPU and memory can be increased dramatically. Optimizing compilers
are able to transform certain data access patterns into series of vector
operations, thus allowing some legacy code to be used without change. The
downside of such sophisticated memory subsystems and custom processors
is their cost. Vector computers are 20 to 40 times more expensive per
peak MFlop than commodity processors. One also has to keep in mind that
vector systems are highly proprietary architectures and as such depend on
specialized software. Due to the resulting portability issues they suffer from
a restricted range of available software.

• Symmetric Multi Processor (SMP) and Shared-Memory Systems: SMP computers use commodity or RISC-based processors sharing the main memory of the system. By having multiple processors attempting to access memory concurrently, the memory bottleneck of commodity workstations is compounded, so that modern SMP designs arrange the connection between memory segments and CPU via so-called crossbars. Here, memory is laid out locally for each CPU, which other CPUs can access only remotely through the crossbar network with a small communication overhead. In addition, technical measures need to be implemented to ensure memory consistency between the processors' caches. Besides the increased bandwidth, the biggest advantage of this technology is that large amounts of memory are available to single processors. However, there is also a possible performance penalty with regard to data placement: for optimal performance a program does not only need a high locality of (cache) references, the amount of data moved between CPUs also has to be minimized due to the overhead of the cache-coherence mechanism.
•ciple,ClustersclustersofofferworkstationsultimateorPCsscalabilityor—ifyouDistributed-MemoryneedmoreCPUSystems:powerIn,prin-just
addanotherPCatcommoditycost2.However,thenumberofnetworking
componentsperCPUwillincreaselogarithmicallywiththetotalnumber
ofmachinesCPUsinwiththe1000scluster,ofandCPUswillareeventuallypossibleinthisdominateway.theThecost.Indownsidepractice,of
handleddistributed-memoryexplicitlybythemachinesprisogrammerthat.exploitationofparallelismmustbe

Besides the different kinds of high-performance resources in use, the performance gain achievable through parallelization depends on the simulation problem. As a quantitative measure the parallel speedup has been introduced, which refers to a factor indicating how much faster the parallel program version can be executed as compared to its serial counterpart. Accordingly, the speedup is defined by

$$ S(n_p) = \frac{t_1}{t_{n_p}} \tag{4.1} $$

where $t_1$ and $t_{n_p}$ represent the running time of the serial application on one processor and of the parallel version using $n_p$ processors, respectively. From Equ. (4.1) the parallel efficiency is derived

$$ E(n_p) = \frac{S(n_p)}{n_p} = \frac{t_1}{n_p \, t_{n_p}}. \tag{4.2} $$

It assesses the per-processor utilization for the parallel program as a fraction of the serial execution speed.

Different algorithms usually are not equally suited for parallelization, depending on their fraction of non-parallelizable code. To estimate the maximum theoretical speedup which can be achieved with a given amount of resources, Amdahl's Law can be used, namely

$$ S(n_p) = \frac{1}{\alpha + \frac{1-\alpha}{n_p}}. \tag{4.3} $$

² At the current state of the art the scalability in systems with a commodity interconnect (Gigabit Ethernet) is limited to approximately 1000 CPUs


Here, $n_p$ refers to the number of processors in use and $\alpha$ measures the serial, i.e. non-parallelizable, fraction of time of the code, which is usually non-trivial to determine. In the limit of an infinite number of processors this leads to a maximum speedup of

$$ S_{\max} = \frac{1}{\alpha}, \tag{4.4} $$

which cannot be exceeded.

In practice, the theoretical maximum speedup through parallelization is further impaired by the so-called parallel overhead due to thread creation and scheduling, communication, and synchronization. This overhead introduces further terms into Amdahl's Law. It then typically takes the form

$$ S(n_p) = \frac{1}{(\alpha + \beta) + \frac{1-\alpha}{n_p} + k \, n_p} \tag{4.5} $$

with positive parameters $\beta$ and $k$ which depend on the specific communication pattern of the application and the interconnect hardware. Both parameters worsen the speedup; the $k$ term in fact degrades performance for a sufficiently large $n_p$.

The quantities introduced above make it possible to describe the possible improvements due to parallelization. In addition, different kinds of resources can be validated with respect to their suitability for the specific numerical problem.
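The relations (4.1) to (4.5) can be explored numerically. The following is a minimal sketch, not part of iFluids; the function names and the parameter values (`alpha = 0.05`, `beta = 0.01`, `k = 1e-4`) are assumptions chosen purely for illustration. It shows that with a non-zero $k$ term the speedup reaches a maximum and then declines; minimizing the denominator of (4.5) places that maximum near $n_p = \sqrt{(1-\alpha)/k}$.

```python
import math


def speedup(t_serial, t_parallel):
    """Eq. (4.1): ratio of serial to parallel running time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_p):
    """Eq. (4.2): speedup per processor."""
    return speedup(t_serial, t_parallel) / n_p


def amdahl(n_p, alpha, beta=0.0, k=0.0):
    """Eq. (4.5); with beta = k = 0 this reduces to Eq. (4.3)."""
    return 1.0 / ((alpha + beta) + (1.0 - alpha) / n_p + k * n_p)


def best_processor_count(alpha, k):
    """n_p that minimizes the denominator of Eq. (4.5) (requires k > 0)."""
    return math.sqrt((1.0 - alpha) / k)


if __name__ == "__main__":
    alpha, beta, k = 0.05, 0.01, 1e-4  # assumed, illustrative values
    for n_p in (1, 8, 64, 512):
        print(n_p, round(amdahl(n_p, alpha), 2),
              round(amdahl(n_p, alpha, beta, k), 2))
    print("best n_p is roughly", round(best_processor_count(alpha, k)))
```

With these assumed values the ideal law (4.3) saturates towards $1/\alpha = 20$, whereas the overhead-aware form (4.5) is already lower at 512 processors than at 64.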

4.2 Hitachi SR8000-F1 System Architecture

In 2000, the Leibniz Computing Center (LRZ) in Munich installed the first Bavarian High-Performance Supercomputer, a Hitachi SR8000-F1 pseudo-vector machine. This system was targeted to serve as one of three federal German top-level compute servers. At the time of installation it was the fastest supercomputer in Europe and the first TFlop/s machine for general purpose research in the world. It ranked at position 5 of the Top500 Supercomputer List (TOP500, 2007) in June 2000. In early 2002 the installation was upgraded in a second installation phase and reached rank 14 of the Top500 list in June 2002 (TOP500, 2007) with a LINPACK performance of 1.65 TeraFlop/s (2 TeraFlop/s peak performance).

Although LRZ's Hitachi SR8000 was shut down in 2006, this machine and the optimizations with regard to the implementation of iFluids will be described in the following, because insights gained on this machine can easily be transferred to current vector or pseudo-vector machines. To this day, this class of supercomputers is still the best-suited for Lattice Boltzmann applications (Wellein et al., 2006).


Hardware Description

The Hitachi SR8000 at LRZ consisted of 168 nodes, each being a RISC-based³ Symmetric Multiprocessor⁴ derived from IBM's PowerPC architecture with eight compute CPUs and one CPU dedicated to the operating system. Each CPU offered a peak performance of 1.5 GFlop/s, producing a theoretical total of 12 GFlop/s per node (8 CPUs with 1.5 GFlop/s each), while the real application performance of the SR8000 reached on average 1.5 to 2.5 GFlop/s per node. The nodes were equipped with either 8 or 16 GBytes of shared memory and could be accessed with the (still) striking bidirectional bandwidth of 32 GBytes/s.

The SR8000 was a representative of the rather unusual pseudo-vector architecture. This type of machine enables the use of both the vector (PVP mode) and the (parallel) scalar programming paradigm (COMPAS mode) for suitably structured codes, especially codes designed for classical vector systems (Leibniz Rechenzentrum München, 2007). The different paradigms can easily be activated by the programmer through compiler directives. In the following these two special modes are described in more detail.
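The peak figures quoted for the SR8000 can be cross-checked with a few lines of arithmetic. The sketch below only combines numbers stated in the text; the variable names are illustrative, and it assumes that the 168-node count refers to the upgraded installation with 2 TFlop/s peak.

```python
CPUS_PER_NODE = 8            # compute CPUs per node (from the text)
PEAK_PER_CPU_GFLOPS = 1.5    # peak performance per CPU (from the text)
NODES = 168                  # node count at LRZ (from the text)
LINPACK_TFLOPS = 1.65        # LINPACK result of June 2002 (from the text)

# Peak per node and for the whole machine
node_peak_gflops = CPUS_PER_NODE * PEAK_PER_CPU_GFLOPS
system_peak_tflops = NODES * node_peak_gflops / 1000.0

# Fraction of peak reached by LINPACK and by real applications
linpack_fraction = LINPACK_TFLOPS / system_peak_tflops
app_fraction_low = 1.5 / node_peak_gflops
app_fraction_high = 2.5 / node_peak_gflops

if __name__ == "__main__":
    print(f"node peak:   {node_peak_gflops:.1f} GFlop/s")
    print(f"system peak: {system_peak_tflops:.3f} TFlop/s")
    print(f"LINPACK reaches {linpack_fraction:.0%} of peak")
    print(f"applications reach {app_fraction_low:.1%} to {app_fraction_high:.1%} of peak")
```

Under these assumptions the node count reproduces the quoted peak of about 2 TFlop/s, and typical applications used roughly an eighth to a fifth of the node peak.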

Co-Operative Micro Processors in Single Address Space

Co-Operative Micro Processors in Single Address Space (COMPAS) is Hitachi's name for the automatic distribution of computational work among the 8 CPUs of an SMP node by the compiler (see Figure 4.2) or manually by the user through insertion of parallelization directives. In order to achieve optimal performance when processing loops, even for loops with comparatively small granularity, a rapid simultaneous start-up of processes is provided. Cache coherency is guaranteed automatically when a fork or a join sequence is executed. This is important, because the data stored by a processor might be referred to by another processor executing a succeeding part (Tamaki et al., 1999).

Pseudo-Vector Processing

Traditionally, a vector CPU executes operations in a vector pipeline which delivers one or more memory references per cycle to the multi-element vector registers of the CPU. In this way, the arithmetic units are continuously fed with data, requiring, however, an expensive interleaved and pipelined memory subsystem.

Aiming particularly at scientific computing, Hitachi extended IBM's PowerPC RISC architecture to 160 floating point registers and added pre-load and pre-fetch capabilities to fully exploit the available memory bandwidth and thus alleviate the typical main deficit of RISC-based systems in comparison to vector CPUs. Hitachi refers to this extension as Pseudo-Vector Processing (PVP). Through its PVP features the SR8000 can schedule the fetching of memory in a pipelined manner timely before arithmetic execution, allowing for non-blocking execution in a manner comparable with vector processors.

³ A Reduced Instruction Set Computer (RISC) is based on a processor architecture with reduced chip complexity by using simpler instructions. Thus, the microcode layer with its associated overhead can be eliminated to improve performance (answers.com, 2007).
⁴ Symmetric Multiprocessing (SMP) is a multiprocessor computer architecture where two or more identical processors are connected to a single shared main memory (wikipedia, 2007d)

Figure 4.2: The parallel structure of the DO-loops is a fork join, which consists of a serial execution part and a parallel execution part appearing one after the other. The serial part is assigned to one thread and executed on one processor, the parallel parts are assigned to multiple threads and executed on multiple processors (taken from Tamaki et al. (1999)).
Pre-fetch transfers requested cache lines of data asynchronously from the main memory to the cache, whereas pre-load transfers requested element data, again asynchronously, from memory directly to the registers. Both pre-fetch and pre-load (see Fig. 4.3) are useful mainly for codes with small or no cache reuse ratio (for in-cache loads there is indeed a small performance penalty), i.e. memory-bound codes. Whether a pre-fetch or pre-load is more efficient depends on how the memory references are organized within the code.

Pre-fetch is the preferred method for referencing contiguous memory areas, whereas pre-load will give better performance for discontiguous (e.g. stride longer than 2) accesses. The reason for this is that, in the latter case, pre-fetch will induce transfers which for the most part deliver unreferenced data to the cache, while for contiguous data pre-fetch will deliver double the bandwidth of pre-load. The compiler's choice between pre-fetching and pre-loading can (and sometimes must) be overridden (or suppressed) by a directive in the code.

By accessing the cache when data is in the cache but accessing the main memory in a pseudo-vector manner when data is not in the cache, the PVP architecture provides a stable and high data-reference throughput (Tamaki et al., 1999).
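The stride threshold mentioned above can be made plausible with a small counting model. This is only a sketch under stated assumptions: a cache line is taken to hold 16 array elements (an assumed value, not a figure from the text), pre-fetch always moves whole lines, and pre-load moves exactly the referenced elements. The function name `prefetch_utilization` is hypothetical.

```python
def prefetch_utilization(stride, n_accesses=1024, line_elems=16):
    """Fraction of the elements moved by whole-line transfers that the
    loop actually references, for a constant access stride (in elements)."""
    touched_lines = {(i * stride) // line_elems for i in range(n_accesses)}
    moved_elems = len(touched_lines) * line_elems
    return n_accesses / moved_elems


def prefetch_vs_preload(stride, bandwidth_ratio=2.0):
    """Useful bandwidth of pre-fetch relative to pre-load, taking the
    statement that pre-fetch delivers double the bandwidth for contiguous
    data as the value of bandwidth_ratio."""
    return bandwidth_ratio * prefetch_utilization(stride)


if __name__ == "__main__":
    for stride in (1, 2, 4, 16):
        print(stride, prefetch_utilization(stride), prefetch_vs_preload(stride))
```

In this simplified model the two mechanisms break even at a stride of 2, and pre-load wins for longer strides, which is consistent with the rule of thumb quoted in the text. Real behavior also depends on latency hiding and register pressure, which the model ignores.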


Figure 4.3: Depending on the memory reference organization within the code, pre-fetch or pre-load is used for data access. Pre-fetch is efficient for contiguous data, since whole cache lines are transferred. With pre-load, single data elements are accessed directly in memory and copied into the CPU registers (taken from Tamaki et al. (1999)).

Figure 4.4: Pre-fetch and pre-load: On the left side of this figure the data use by pre-fetch is sketched, on the right its pre-load counterpart. Pre-fetch loads a complete cache line (of contiguous data) into the cache, from where data elements can be loaded and processed in vector manner by the CPU. With a pre-load, single data elements are delivered from the memory directly to the CPU (taken from Lanfear (2000)).


4.3 Parallelization of the Lattice-Boltzmann Solver

As described above, the Hitachi SR8000 offers possibilities of parallelization on several levels (cf. Figure 4.5). Between computing nodes the computation can be distributed using message passing, typically via the MPI library⁵. Within a node the workload can be distributed over the 8 computation CPUs via OpenMP or Hitachi's COMPAS mode. Finally, after careful manual optimization of the innermost loops the computation can be vectorized to fully exploit the PVP capabilities of the hardware. In this way, a natural hierarchy of parallelization methods is given by the SR8000 architecture, with MPI providing work sharing on the coarsest level.

Figure 4.5: Three levels of parallelization: Communication via MPI exchanges data between nodes of 8 CPUs each. On each node the computational work is processed in COMPAS mode by 8 threads in parallel in a shared-memory environment. Due to the pseudo-vectorization capabilities of the Hitachi SR8000, the loops can be executed vector-wise (taken from Lanfear (2000)).

In comparison to other approaches to simulate fluid dynamics, the Lattice-Boltzmann method is particularly well-suited for parallelization. As described in Chapter 3, the collision step can be computed without the need for interaction with neighboring grid points, since the collision operator requires only data which are locally available. Therefore, the calculation of this step can be conducted without communication between different nodes. In the propagation step the advection of particle distributions comprises only the migration of distributions from one grid point to its next neighbors corresponding to their directions and the Lattice-Boltzmann model in use. The required communication can be kept simple, since only cells at the borders of each domain boundary have to be exchanged (only point-to-point communication between domain boundaries is required).

⁵ The Message Passing Interface (MPI) is a standardized communication library (MPI-Forum, 2007) covering basic point-to-point (send/receive) and advanced (collective) functionality. It is the most common method of programming multi-processor systems with distributed memory. In basic message passing, the processes are coordinated by explicit communication, i.e., sending and receiving of data (Pacheco, 1996)

Domain Decomposition

Since MPI communication has been designed for parallelization on distributed-memory systems, its use, by nature, requires partitioning of the computational work through domain decomposition. An appropriate domain decomposition results in equal distribution of the workload over all nodes (load balancing), and tries to keep inter-node communication as efficient as possible, since here is the most sensitive spot for introducing performance-limiting latencies. Depending on the problem characteristics, domain decomposition can be far from trivial and, thus, has emerged into a research field in its own right (Domain Decomposition, 2007).

Specifically for the Lattice-Boltzmann method, several approaches for domain decomposition have been taken. An approach presented in Freudiger et al. (submitted 2007) is based on hierarchical grids and uses the free METIS library (METIS, 2007). METIS is a partitioning tool for irregular graphs and FE-meshes. For example, this library has been used by Schulz et al. (2002) for the simulation of porous media with the Lattice-Boltzmann method, since the data reference layout for cell information in their approach is based on lists because a major part of the simulation volume is no longer referenced. When using regular grids as in iFluids, cuboidal subdomains are often more advantageous. Satofuka and Nishioka (1999) have found that on a Hitachi SR2201 pseudo-vector machine a 2D domain decomposition into slices is more efficient than into boxes (the authors are not sure about the reasons for this). Figure 4.6 shows different cuboidal decompositions, which divide the domain along one, two, or three axes and demonstrate the impact on the number of a process's communication partners and on the amount of communication data. iFluids supports user-defined domain decomposition, i.e. one can define how many subdivisions should be created along each axis.

For determining the best domain decomposition one has to consider many aspects. The efficiency of the decomposition possibilities depends on factors like the total number of processes available, the reduced problem, whether it is possible to exploit optimization features within the hardware in use, and whether the memory layout influences performance (latency and bandwidth capabilities, distributed or shared memory). Furthermore, it depends on the MPI implementation (e.g. efficiency of derived data types or dependencies between communication partners) and, of course, on the characteristics of the simulation. The three-level parallelization particular to the Hitachi SR8000 suggested a standard decomposition into slices for iFluids.

As mentioned above, a decomposition aspect besides the communication volume is load balancing. It is clear that dividing a domain into regular cuboids does not necessarily achieve optimal load balance. Load balance, however, is not (yet) considered within the iFluids computational steering framework, because due to the often-changing geometry during the simulation this would require the domain to be re-decomposed after each geometry modification. Also, the idea of an adaptive decomposition was not considered within this thesis.
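The trade-off between communication volume and number of communication partners can be estimated with a short counting sketch. The function `comm_pattern` and the example sizes (a 256³ grid on 64 processes) are hypothetical and not taken from iFluids; the sketch considers an interior subdomain, counts only face layers for the volume, and reports the number of partners as an upper bound that includes all diagonal neighbors. Which diagonal partners are really needed depends on the lattice model in use.

```python
def comm_pattern(dims, splits):
    """Estimate communication of an interior subdomain.

    dims   -- global grid size (nx, ny, nz)
    splits -- number of subdomains along each axis
    Returns (partners, face_cells): an upper bound for the number of
    neighboring subdomains and the number of boundary cells on all cut faces.
    """
    local = [d // s for d, s in zip(dims, splits)]
    cut_axes = [axis for axis, s in enumerate(splits) if s > 1]
    # Neighbors differ by -1, 0 or +1 along every cut axis (minus the cell itself)
    partners = 3 ** len(cut_axes) - 1
    face_cells = 0
    for axis in cut_axes:
        area = 1
        for other in range(3):
            if other != axis:
                area *= local[other]
        face_cells += 2 * area  # two faces per cut axis
    return partners, face_cells


if __name__ == "__main__":
    grid = (256, 256, 256)
    for splits in ((64, 1, 1), (8, 8, 1), (4, 4, 4)):
        print(splits, comm_pattern(grid, splits))
```

For these assumed sizes, slices need 2 partners but exchange 131072 face cells, whereas cubes need up to 26 partners but exchange only 24576 face cells, which mirrors the tendency described for Figure 4.6.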

Figure 4.6: Different layouts of domain decomposition: The yellow-colored domains represent the communication partners of a singled-out process (colored blue) in an LBM simulation (e.g. the D3Q15 model). It is evident that the more cubical the domains are divided up, the smaller is the communication volume and the more communication partners are involved.

MPI Communication

After partitioning the computational work and assigning it to different processes, the next aspect to consider is the MPI layout for inter-process communication. To achieve the best possible communication performance between the SMP nodes of the Hitachi SR8000, its vendor-optimized MPI libraries are used. The interprocess communication consists of multiple sends and receives which have been implemented using non-blocking MPI routines to avoid waiting times as in the case of forced communication order. The data to be exchanged during a propagation step is illustrated for a 2D example in Figure 4.7. Usually, this data is not stored contiguously and has to be specially prepared for sending. There are basically three methods (when using C/C++) to prepare the sending process. One approach, for example, defines derived datatypes as an MPI struct. Here, the memory locations of the relevant data collected for a send are described once and can be used as a stencil for each call. On distributed-memory systems MPI copies these data into internal buffers for sending. Another possibility is to use MPI pack and MPI unpack, which is, however, less comfortable to implement. Likewise, the data in this case is also usually copied into internal buffers before sending. The third option is to manually copy the data into a continuous vector and send it to the communication partner. Luecke and Wang (2005) investigated three possibilities with regard to their performance and found the use of derived MPI datatypes most efficient. However, this result probably depends on the hardware and the MPI implementation. Within iFluids derived MPI data types are used as the standard method.

Domain decompositions into slices offer a fourth option of data transfer with overlapping domains (see Figure 4.8), as has been demonstrated by Pohl et al. (2004). For geometric dimensions of $dim_x \times dim_y \times dim_z$ and $N$ subdomain slices orthogonal to the x-axis this increases the total number of cells by a factor of $1 + \frac{2(N-1)}{dim_x}$. By overlapping the domains it is possible to exchange the complete data set of all particle distributions as a contiguous data set. Although the communication volume increases approximately by a factor of 2 in this approach, this ansatz still leads to an overall performance gain.

For additional parallelization within an MPI process, the main computing loops were distributed over the 8 CPUs of a node by insertion of COMPAS or OpenMP directives. For exploiting this additional option of parallelization, the code needed to be adapted as described along with further optimizations in the following section.

Figure 4.7: Data exchange in two dimensions: The red arrows represent distributions of domain boundary nodes which have to be sent to neighboring domains. As there is no simple linear data layout to store these distributions contiguously in memory, one has to design specific mechanisms for sending this data efficiently via MPI.
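The cell-count overhead of overlapping slices can be verified by counting layers directly. The sketch below assumes that every slice stores one extra layer per neighboring slice and that the x-extent is divisible by the number of slices; the function names are hypothetical. It compares the explicit count with the closed form 1 + 2(N−1)/dim_x.

```python
def total_layers_with_overlap(dim_x, n_slices):
    """Count x-layers stored by all slices, including one overlap layer
    towards each existing neighbor (assumes dim_x % n_slices == 0)."""
    base = dim_x // n_slices
    total = 0
    for s in range(n_slices):
        has_left = 1 if s > 0 else 0
        has_right = 1 if s < n_slices - 1 else 0
        total += base + has_left + has_right
    return total


def overlap_factor(dim_x, n_slices):
    """Closed form for the relative growth of the cell count."""
    return 1.0 + 2.0 * (n_slices - 1) / dim_x


if __name__ == "__main__":
    dim_x, n_slices = 240, 8  # assumed example sizes
    counted = total_layers_with_overlap(dim_x, n_slices) / dim_x
    print(counted, overlap_factor(dim_x, n_slices))
```

For the assumed example of 240 layers in 8 slices the overhead is below 6 %, which illustrates why the memory cost of the overlap is usually acceptable as long as the slices are not too thin.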

4.4 Optimization of the Simulation Kernel

Besides parallelizing the computational kernel, the main computing loops in their original form needed rewriting and manual optimization to fully exploit Hitachi's vectorizing and software-pipelining capabilities. Figure 4.9 shows the striking performance gain of an optimized version of the original straight-forward parallel Lattice-Boltzmann kernel that had been used in Kühner (2003).

Propagation and Collision

The LBM is usually divided into the steps of collision and propagation (see Chapter 3), which, at first glance, implies a code structure of separated loops for collision and propagation. To reduce the data transfer between main memory and processor and for better cache utilization, these steps can be fused into one loop (Wilke et al., 2003; Wellein et al., 2006). In order to simplify the code structure with respect to propagation, two arrays are used, which hold the distribution densities of successive time steps t and t+1. Correspondingly, there is no need to care about the order of updating the neighboring cells in the propagation step anymore, and a complex code structure can be avoided.

There are two possibilities of implementing the propagation and collision loop: the so-called pull and push version. In the push version collision is computed first and the new distributions are propagated to the neighboring cells. In contrast, the pull version collects the relevant distributions from the neighboring cells first and then computes the collision on this basis. For the modeling of boundary conditions the pull version is advantageous (see Crouse (2003)), but (as claimed by Pohl et al. (2004)) the push version performs better on the Hitachi SR8000 due to the different memory access patterns and is therefore implemented within the Lattice Boltzmann solver presented in this thesis.

Figure 4.8: Overlapping domains: Slice subdomains can be overlapped at their boundaries. In the figure white arrows depict distributions which are not available within a process and have to be received from the neighboring domain. There, the required distributions in question are marked red, while black refers to distributions which are computed within the domain independently. By exchanging the complete row before the last one (stored contiguously in memory) to the next domain, black distributions are overwritten by identical values and red ones fill in the gaps.

Figure 4.9: Comparison of performance before and after manual optimizations. By paying special attention to the capabilities and characteristics of the Hitachi SR8000 it was possible to achieve a considerable performance gain over a straight-forward parallel Lattice-Boltzmann implementation.
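The difference between the push and the pull variant of the fused loop can be illustrated on a toy problem. The following sketch is not the iFluids kernel: it assumes a one-dimensional periodic D1Q3 lattice, a BGK collision with an arbitrarily chosen relaxation rate `OMEGA`, and plain Python lists for the two arrays of time steps t and t+1. All names are hypothetical.

```python
W = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)  # D1Q3 weights
C = (0, 1, -1)                         # lattice velocities
OMEGA = 1.2                            # assumed BGK relaxation rate


def equilibrium(rho, u):
    """Second-order equilibrium distributions for D1Q3 (c_s^2 = 1/3)."""
    return [w * rho * (1.0 + 3.0 * c * u + 4.5 * (c * u) ** 2 - 1.5 * u * u)
            for w, c in zip(W, C)]


def collide(f_cell):
    """Purely local BGK collision of one cell."""
    rho = sum(f_cell)
    u = sum(c * fi for c, fi in zip(C, f_cell)) / rho
    feq = equilibrium(rho, u)
    return [fi - OMEGA * (fi - fe) for fi, fe in zip(f_cell, feq)]


def step_push(f_old):
    """Collide locally, then write the results to the neighbors in f_new."""
    n = len(f_old)
    f_new = [[0.0] * len(C) for _ in range(n)]
    for x in range(n):
        post = collide(f_old[x])
        for i, c in enumerate(C):
            f_new[(x + c) % n][i] = post[i]
    return f_new


def step_pull(f_old):
    """Read the distributions from the neighbors, then collide locally."""
    n = len(f_old)
    f_new = []
    for x in range(n):
        gathered = [f_old[(x - c) % n][i] for i, c in enumerate(C)]
        f_new.append(collide(gathered))
    return f_new


if __name__ == "__main__":
    f0 = [equilibrium(1.0 + 0.1 * (x % 3), 0.05) for x in range(12)]
    print(sum(map(sum, f0)), sum(map(sum, step_push(f0))), sum(map(sum, step_pull(f0))))
```

Because both variants read from one array and write to the other, the update order of the cells does not matter. The two variants differ only in whether the scattered memory accesses are writes (push) or reads (pull); applied repeatedly they produce the same sequence of states, shifted by half a time step.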

Data Layout and Access

Regarding the computation of the collision step it is advantageous to store all distributions of a cell contiguously in memory due to the line-fetching cache access. In C/C++ the last index of an array addresses memory locations in linear succession and should thus be traversed in the innermost loops. Correspondingly, the array layout reads f[x][y][z][i], where x, y and z are the dimensions of the simulation domain and i refers to the number of distributions per cell (cf. Fig. 4.10). For the immediately following propagation this data layout causes loads from memory locations distant from the location of the current cell distributions. Because of Hitachi's pre-fetch capabilities combined with its high bandwidth from the main memory this disadvantage can be compensated, however.

Figure 4.10: On the left the distribution functions of a lattice site in a 2D example have been sketched. Two different memory patterns of storing a node's distribution functions within an array are visualized on the right. The top pattern has the distribution functions for one node arranged consecutively, whereas the bottom layout stacks the arrays of the different distribution functions.


In trying to improve the efficiency of data access, pointer arithmetic has been eliminated as far as possible. For example, the three distinct indices x, y and z have been merged into a combined running loop index xyz, i.e., f[x][y][z][i] → fptr[xyz+i]. As a result, the compiler is able to analyze and optimize the simplified loop structure in this code version much more efficiently with regard to vectorization.
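The index merging can be sketched as follows. The helper names are hypothetical, and the sketch assumes that the running index xyz already denotes the offset of the first distribution of a cell, i.e. that it advances by the number of distributions q per cell; the text does not state how the offset is scaled in iFluids.

```python
def flat_index(x, y, z, i, ny, nz, q):
    """Offset of f[x][y][z][i] in a contiguous row-major (C-style) array."""
    return ((x * ny + y) * nz + z) * q + i


def flatten(f):
    """Copy a nested f[x][y][z][i] list into one linear list."""
    return [v for plane in f for row in plane for cell in row for v in cell]


def density_per_cell(fptr, n_cells, q):
    """Single loop over all cells using one running offset xyz."""
    densities = []
    xyz = 0
    for _ in range(n_cells):
        rho = 0.0
        for i in range(q):
            rho += fptr[xyz + i]
        densities.append(rho)
        xyz += q  # advance to the first distribution of the next cell
    return densities


if __name__ == "__main__":
    nx, ny, nz, q = 3, 4, 5, 9
    f = [[[[float(x + y + z + i) for i in range(q)]
           for z in range(nz)] for y in range(ny)] for x in range(nx)]
    fptr = flatten(f)
    print(fptr[flat_index(2, 3, 4, 8, ny, nz, q)], f[2][3][4][8])
    print(density_per_cell(fptr, nx * ny * nz, q)[:3])
```

The single loop over xyz replaces three nested loops with separate bounds, which is the kind of simple, long loop that a vectorizing compiler can analyze most easily.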

Vectorization and Software Pipelining

Pseudo-vectorization (Fig. 4.11) and software pipelining (Fig. 4.12) refer to special hardware capabilities of the Hitachi SR8000, which are not generally available on other architectures. By these two means code can be sped up considerably on the Hitachi, but only at the cost of careful code tuning and through assisting the compiler. To benefit from pseudo-vectorization the data layout has been designed such that loops operate on long linear arrays. In particular, long innermost loops are advantageous with regard to pseudo-vectorization (Hager et al., 2003).

To make the code accessible to software pipelining, conditional statements for handling the different boundary conditions have been removed. In case of a predominant number of fluid nodes, the if-statements get replaced by equivalent arithmetic floating point operations, i.e., Boolean expressions are mapped onto real-valued coefficient arrays for multiplication as shown in Figure 4.13. Because of the data locality of the collision (see Chapter 3) and the usage of two arrays for the time steps t and t+1 (see above), no data dependencies between two loop cycles occur, which would decrease the software pipelining's efficiency or even prohibit it completely.

This, of course, causes extra computational cost. Nevertheless, an amazing performance gain can still be achieved with this method, because the avoidance of the even costlier branching instructions, as introduced by the standard implementation of conditionals, turns out to be several times more effective.

Another optimization approach especially suited for fluid scenarios characterized by a high percentage of wall nodes (e.g. porous media) is to introduce lists storing nodes of the same type of boundary condition. This method requires extra memory and, additionally, the velocity distributions of one node and those of the next one in the list may be located far away from each other in memory. An additional performance gain apart from software pipelining and vectorization is achieved by these lists by skipping the large fraction of internal nodes in non-fluid volume areas. The simulation of blood flow within an artery (see Figure 2.3) may serve as an example which benefits strongly from this kind of optimization, as here the fraction of fluid nodes to all nodes within the bounding box typically lies in the range from 5% to 15%.
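The replacement of conditionals by arithmetic blending can be sketched on a simplified update rule. The example below is an assumption-laden illustration rather than the actual iFluids kernel: it distinguishes only two node types (fluid nodes that are relaxed towards equilibrium, and boundary nodes that are set to the equilibrium value), and all function names are hypothetical.

```python
def update_branching(f, feq, is_fluid, omega):
    """Reference version with an if-statement inside the loop."""
    out = []
    for fi, fe, fluid in zip(f, feq, is_fluid):
        if fluid:
            out.append(fi - omega * (fi - fe))  # relax towards equilibrium
        else:
            out.append(fe)                      # impose equilibrium value
    return out


def update_blended(f, feq, mask, omega):
    """Branch-free version: mask holds 1.0 for fluid nodes, 0.0 otherwise.

    Both candidate results are computed for every node and blended by
    multiplication, so the loop body contains no conditional jump."""
    return [m * (fi - omega * (fi - fe)) + (1.0 - m) * fe
            for fi, fe, m in zip(f, feq, mask)]


if __name__ == "__main__":
    f = [0.30, 0.25, 0.40, 0.10]
    feq = [0.28, 0.27, 0.35, 0.12]
    is_fluid = [True, False, True, True]
    mask = [1.0 if flag else 0.0 for flag in is_fluid]
    print(update_branching(f, feq, is_fluid, 1.5))
    print(update_blended(f, feq, mask, 1.5))
```

The blended version performs more floating point operations per node, which corresponds to the extra computational cost mentioned above, but every iteration executes the same instruction sequence and is therefore amenable to software pipelining.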


Figure 4.11: This picture compares the addition of a constant to a vector on a vector machine (left) and on a pseudo-vectorization architecture (right). With its special memory interconnect of pipelined and interleaved memory access the vector computer can load vectorial data to its registers very efficiently. Making use of these large registers, the vector addition is processed in parallel by multiple arithmetic units. Subsequently, the resulting data vector is stored in memory again (Dowd and Severance, 1998). In case of pseudo-vectorization a software-assisted pre-fetch function is used to enable a hardware-based memory look-ahead mechanism. In this way, waiting times between successive instructions are eliminated by pipelining data fetches from memory. In the CPU several arithmetic units process the vector addition in parallel. Finally, the results are stored to memory again (Lanfear (2000)).

Figure 4.12: Without software pipelining, each iteration of a loop has to be processed in serial and, only in principle, infinite resources (i.e., registers, arithmetic units, etc.) would allow all loop cycles to be conducted in one step as long as there is no data dependency between iteration steps. In reality, however, the loop cycles can be executed in a pipelined (partially overlapping) manner depending on the available resources and the data dependency between iteration steps (Lanfear (2000)).


Figure 4.13: This table shows in its left column various boundary conditions, while the three columns in the middle show whether the values of the distribution function f, the local macroscopic density ρ(f), and the local macroscopic velocity v(f) are directly available or whether they need to be computed at time step t. The last column shows for which boundary condition collision has to be computed or where only the equilibrium distributions (feq) have to be defined to get to time step t+1. Using the row with the velocity condition as an example, the table conveys that the distribution functions are available at time t and that the density ρ is a function of f, while the velocity v needs to be set to a certain value. In general, equilibrium distributions are functions of the local macroscopic density ρ and of the local macroscopic velocity v. In proceeding to time step t+1, all distribution functions are set to the equilibrium distributions. Finally, to avoid conditional if branches within loops over all grid points of a domain, a universal approach for all boundary conditions is shown in the last row. Depending on whether a boundary condition is set or not, a blending factor of 1 or 0 is introduced, respectively. These blending factors of each boundary condition are multiplied by the corresponding formulae for f, ρ, v, feq at time t and f at time t+1 to activate them if needed. To improve the clearness of presentation, the blending factors have been substituted by colored dots in the last row of the table.


4.5 Porting and Optimizing the Solver for SGI Altix Systems

To prove the portability of the computational framework, iFluids has been ported to an SGI Altix Linux system. This was done on the Altix 3700 machine at Sara Computing Center in Amsterdam, and the whole computational steering application was benchmarked on this system, which is very much different from the Hitachi architecture. Experiences made during the porting process could directly be applied on the new Altix 4700 supercomputer at the Leibniz Computing Center in Munich.

Hardware Description

The SGI Altix 3700 machine at Sara Computing Center consists of 416 Intel Itanium 2 CPUs (1.3 GHz, 3 MB Cache), and hosts 832 GB of main memory. Each CPU has access to 2 GB of local memory which can be accessed very fast. In addition, all CPUs of a node (theoretically up to 512 CPUs) can access the whole memory of this node via ccNUMA (cache-coherent Non Uniform Memory Access) links. However, memory access through ccNUMA is slower as compared to direct access of local memory⁶. The total performance of the Amsterdam SGI Altix 3700 is benchmarked as 2.2 peak TFlop/s.

Data Layout and Access

In contrast to the Hitachi SR8000, the performance on Altix systems can be increased further by a slightly modified data layout. Donath (2004) compared three layouts of data (f[x][y][z][i], f[i][x][y][z], and f[x][i][y][z]) and found f[x][i][y][z] performing best, because in the computation the density distributions labelled through i are located closer to each other. The same is true for the locations where the updated density distribution has to be copied to during the propagation.

Wilke et al. (2003) and Donath (2004), in addition, suggest grid merging and grid compression. Instead of using two grids for the time steps t and t+1, an interleaved grid is used to improve spatial locality (see Figure 4.14).

Grid compression, again, increases the spatial locality of memory access and saves main memory. Since for the propagation the computation only requires the direct neighbors, the idea is to shift the grid one unit in each direction. Therefore, one common grid can be used, extended by some ghost layers (see Figure 4.15). The direction of the shift then determines the sequence in which the cells will need to be updated to avoid overwriting cell information which is still required.

⁶ The ccNUMA link allows for data-transfer at a rate of 0.25 to 0.5% compared to local memory access. Depending on how many router hops need to be taken the latency is increased (about 50 cycles per hop). The main problem of this architecture is a potential congestion of the ccNUMA link in parallel programs
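The three data layouts compared by Donath (2004) differ only in how the four indices are mapped to a linear memory offset. The sketch below assumes C-style row-major storage and uses hypothetical function names and grid sizes; it computes, for each layout, how far apart two successive distributions i and i+1 of the same cell lie in memory, and how far apart the same distribution of two z-neighbors lies.

```python
def offset_xyzi(x, y, z, i, nx, ny, nz, q):
    """Layout f[x][y][z][i]: distributions of one cell are contiguous."""
    return ((x * ny + y) * nz + z) * q + i


def offset_ixyz(x, y, z, i, nx, ny, nz, q):
    """Layout f[i][x][y][z]: one complete grid per distribution."""
    return ((i * nx + x) * ny + y) * nz + z


def offset_xiyz(x, y, z, i, nx, ny, nz, q):
    """Layout f[x][i][y][z]: one y-z plane per distribution and x index."""
    return ((x * q + i) * ny + y) * nz + z


def strides(offset, nx, ny, nz, q):
    """Distance in elements between (i, i+1) and between (z, z+1)."""
    base = offset(1, 1, 1, 1, nx, ny, nz, q)
    stride_i = offset(1, 1, 1, 2, nx, ny, nz, q) - base
    stride_z = offset(1, 1, 2, 1, nx, ny, nz, q) - base
    return stride_i, stride_z


if __name__ == "__main__":
    nx, ny, nz, q = 64, 64, 64, 15  # assumed grid size and lattice model
    for name, fn in (("f[x][y][z][i]", offset_xyzi),
                     ("f[i][x][y][z]", offset_ixyz),
                     ("f[x][i][y][z]", offset_xiyz)):
        print(name, strides(fn, nx, ny, nz, q))
```

In this picture f[x][i][y][z] keeps the unit stride along z of the fully separated layout f[i][x][y][z], while the distributions of one cell are only ny·nz elements apart instead of nx·ny·nz, which is one way to read the statement that they are located closer to each other.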


Figure 4.14: A 2D example of grid merging is shown in this figure: To improve data locality in the propagation step, the two originally separate arrays for storing the distribution functions at time steps t and t+1 are stored as an interleaved array after merging.

dataFigureneeded4.15:forGridcollisioncomprandessionprisopagation,usedtoandfurthertosaveimprovememoryspatial.Thelocalityarrayofhold-the
ingthedistributionsattimestept(left)iscoloredblue,whiletheyellowarrayis
theraystararegetfortranslatedstoringbytheoneunitindistributionseachdirafterection,thenoprrelevantopagation.dataSinceistheoverwritten.twoar-
Forthefollowingtimestept+1(right)theshiftedarraysneedtobeprocessesin
.orderentdiffera

Parallelization Strategy

The hybrid parallelization model as shown in Figure 4.5 was used to optimize iFluids on the Hitachi SR8000, combining MPI and COMPAS parallelization. On the SGI Altix only MPI parallelization has been applied and, hence, the shared-memory capabilities of this supercomputer are only exploited where SGI's vendor-optimized MPI library can make use of this architecture.

Besides the usual pitfalls one has to keep in mind when using OpenMP, using it efficiently on the SGI Altix NUMA system requires careful programming. To achieve good performance one has to obey the first-touch placement rule when accessing memory or data. Since each processor has its own local memory, which can be accessed very fast, it is strongly recommended to place data that will mainly be used by one processor in that corresponding local memory. The first-touch placement pins the data to the local memory of the CPU which touches the data first (usually done during data initialization) (Wellein et al., 2006). Wellein et al. (2006) found good parallel efficiency for AMD Opteron NUMA architectures due to their separate paths to memory. Accordingly, on Intel Xeon multiprocessor systems the flat memory model causes a memory bottleneck. Bella et al. (2002) investigated the parallelization of the LBM using OpenMP on an SGI Origin 3200 and found a good speed-up behavior.

Performance Comparison SGI Altix Systems versus Hitachi SR8000

Fortunately, many code optimizations that had been implemented with the Hitachi architecture in mind also performed well on the Altix right from the start, since many optimization aspects are also valid for this system. An example is the fusion of collision and propagation into one loop (Wellein et al., 2006; Wilke et al., 2003). Altix-specific optimizations can be activated via option flags within iFluids. The performance of the application on both machines is compared in Figure 4.16. The performance gain achieved by using the Altix 3700 as compared to the SR8000 lies at about 70%, while both codes show a good speed-up on their machine.


Figure 4.16: This graph shows the performance of the Lattice-Boltzmann kernel running on Hitachi SR8000 (green) and SGI Altix 3700 (blue). Both codes show good speed-up behavior on the respective machine. The performance gain seen after porting the kernel to the Altix was about 70% when using 40 processors on both machines.


Chapter 5

Interactive Data Exploration

After the development of a fast CFD solver, the next step towards a computational steering application is the implementation of on-the-fly visualization of data (online monitoring), which will be covered in this chapter. First, the term scientific visualization will be introduced, followed by a brief description of the supported hardware within the computational steering framework iFluids and the presentation of the visualization module that has been developed.

5.1 Scientific Visualization

Scientific visualization refers to visualization of scientific data sets. In this context we will now focus on the generation of visual representations from the results of scientific simulations in the field of fluid dynamics.

With increasing computational resources the resulting data sets are growing as well. Especially simulations in the field of CFD frequently produce huge amounts of data that have to be postprocessed. Modern visualization techniques provide powerful tools for data exploration.

According to Bellemann (2003) one has to differentiate between static and dynamic environments. Static environments are used for investigating time-invariant data such as in classical postprocessing of precomputed data. In the case of dynamic environments an external process generates new data. For both cases interactive data exploration is a compulsory feature of modern visualization interfaces. It is important to note that in this context user interaction refers to the visualization process only and not to any computational steering interactions influencing a simulation. A prerequisite of achieving interactive data exploration is that the visualization is fast enough (i.e., at least 10 frames per second) without sacrificing important details. In addition, fast responses to user interactions are required to allow an accurate control of the visualization and to avoid uncertainty on the user's side.

A modern visualization environment should offer an intuitive means of interaction to permit a user to modify parameters controlling the presentation in order to be able to extract qualitative and quantitative information from the investigated data sets. Accordingly, the interaction methods should not require any explanation or user guides to avoid a possibly long familiarization phase (Norman, 1988; Sanders and McCormick, 1993).

The first breakthrough fulfilling the above requirements is due to Bryson and Levit (1992), who set up a virtual wind tunnel. They developed a virtual-reality environment to explore precomputed unsteady flow fields in 3D (cf. Fig. 5.1). Their virtual-reality environment allowed a user to explore flow data similar to the way it is done in a real wind tunnel by, e.g., placing smoke traces into the fluid to visualize streamlines. One important advantage of the virtual wind tunnel was the close observation of the flow without disturbances due to measurement facilities as in a real-life tunnel.

Figure 5.1: The virtual wind tunnel: Flow visualization around a space shuttle within a virtual-reality environment. Within this virtual wind tunnel the user wears a head-tracked stereo display, effectively displaying 3D information, and an instrumented glove for intuitive positioning of flow visualization tools. (taken from Bryson and Levit (1992))

5.2 Visualization within iFluids

As found by Bryson and Levit (1992), immersive visualization is the best suited method for intuitive data exploration. Therefore, the computational steering framework iFluids has been designed to allow connections to several virtual environments. Standard displays such as used for workstation PCs and laptops are also supported (see Fig. 5.2).

Virtual Environments

• Desktop System: In this case iFluids uses a conventional desktop monitor as a window to the virtual world. Traditionally, this is a flat projection of 3D graphics of a data set. Through the use of a set of LCD shutter glasses, visualization can be extended to support stereoscopic viewing. This method is called active stereo, as the display on the screen alternates between a left and right eye view of the 3D scene and, in accordance, the shutter glasses switch between opaque and transparent, alternating for the right and left eye. The switching occurs fast enough to let the brain fuse the two views to a stereoscopic view.

• Passive Stereo Projection: A passive stereo backprojection system installed at the Chair for Bauinformatik (TU München) has also been used for the computational steering setup. It consists of a special-made screen in front of two projectors which are equipped with two orthogonally oriented polarization filters. One projector displays the view for the right eye and the other the corresponding view for the left eye. When using orthogonally oriented polarization filter glasses, the two superimposed views from the projectors are filtered for the dedicated eye to receive the corresponding stereoscopic view.

• Holobench: The third integrated option for the interactive fluid simulation is offered by the holobench at the LRZ. This is a combination of two projection screens mounted at a right angle to form an L-shaped visualization workbench. Ideally, it is meant for creative teamwork for groups of 2 or 3 people. Again, with LCD shutter glasses a stereoscopic image generates active stereo. Additionally, the master user's head is tracked and the view of the virtual world is adapted accordingly to the position of this user. This enhances the impression of 3D objects in front of the spectator.

Figure 5.2: Supported Environments: Within the standard desktop environment (left) conventional 3D graphics and stereoscopic views are supported. A large passive stereo backprojection screen (middle) offers more immersive data exploration. One step further into immersive environments can be achieved by using a 2-screen holobench (right, taken from ConceptCar (2007)) together with head tracking and a tracked input device for improved intuitive interaction features.


Data Exploration

For a better understanding of the data and to assist the user's orientation in the virtual world, data representation objects and CAD-generated geometry of the simulated scene are visualized at the same time. Occlusion of objects is counteracted by providing a mode with transparent visualization.

An intuitive exploration of a scene is enabled via three navigation modes. The most unconstrained is the fly mode, in which a user can freely move through the scene in all directions and the viewpoint is adjusted accordingly. The walk mode is thought to support a realistic inspection of rooms and buildings in that the user's point of view is fixed to an adjustable z-level, whereas he can still freely move in x and y direction. In both modes the view representing the user's direction of gaze can be rotated freely so that observation from his current position is not otherwise constrained. The third navigation mode is the classical bird's eye view. Here the user's position is fixed and instead of modifying the user's point of view the scene is transformed by rotation, translation and zooming in and out. During the interactive data exploration the user can switch to any mode at any time, depending on his requirements.
For visualization of fluid data a series of standard techniques for visual analysis have been implemented:

• Vector planes and fields visualize the direction of the flow's velocity field at a location in space through an arrow, and the velocity value determines its length and color. Vector planes orthogonal to the room's axes can be inserted, translated and rotated interactively. Vector fields are displayed in a cuboid with interactively definable extensions.
• Streamlines are integrated paths along the velocity vectors of the flow field. The starting points of integration are inserted into the scene with the help of geometric seed objects such as rectangles, lines, or circles. These representatives can be transformed arbitrarily and the number of starting points can be adjusted. The velocity is represented through the coloring of the streamline along its path.
• Motion particles are animated particles moving along the streamlines from the location where they were inserted into the scene. Inserting the particles is done analogously to the insertion of streamline starting points.
• Isosurfaces can be used for the evaluation of scalar quantities. An isosurface is a surface displaying all points in a data field with a certain value. Within iFluids an isosurface can be applied to pressure and temperature fields or velocity components. The value determining the surface can be adjusted continuously and the quantity visualized can be changed at any time. The color at a particular point on the isosurface stands for the corresponding magnitude of the displayed value.
• Cross-sections also display scalar values using a color map. They can be inserted and transformed in the same way as vector planes.
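The streamline technique listed above can be illustrated with a minimal sketch. It is not the iFluids implementation: the function names, the midpoint integration scheme, and the analytic test field are assumptions chosen only to show how seed points on a line are traced through a velocity field.

```python
import math


def integrate_streamline(velocity, seed, step=0.1, n_steps=100):
    """Trace one streamline from `seed` using midpoint (second-order) steps.

    `velocity` is any callable mapping a point (x, y, z) to a velocity
    vector (u, v, w); in a real application it would interpolate grid data.
    """
    point = tuple(float(c) for c in seed)
    path = [point]
    for _ in range(n_steps):
        v_start = velocity(point)
        midpoint = tuple(p + 0.5 * step * v for p, v in zip(point, v_start))
        v_mid = velocity(midpoint)
        if math.sqrt(sum(c * c for c in v_mid)) < 1e-12:
            break  # stagnation point: the streamline ends here
        point = tuple(p + step * v for p, v in zip(point, v_mid))
        path.append(point)
    return path


def seed_line(start, end, count):
    """Evenly spaced starting points on a line seed object (count >= 2)."""
    return [
        tuple(s + (e - s) * i / (count - 1) for s, e in zip(start, end))
        for i in range(count)
    ]


def rotation_field(point):
    """Hypothetical test field: rigid rotation about the z-axis."""
    x, y, _ = point
    return (-y, x, 0.0)
```

Calling `integrate_streamline` once per point returned by `seed_line` yields one path per starting point; the speed at each path vertex could then be mapped to a color.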


All data representation objects can be added to or removed from the scene at any time and as often as needed. Figure 5.3 shows some screenshots of example visualizations.

Figure 5.3: Data exploration: On the left, cutting planes depict the x-component of the velocity field, while the right screenshot shows streamlines following the fluid flow. Within both screenshots the coloring ranges from blue to red representing the minimum and maximum values, respectively.

Besides the transformation, all parameters of the visualization objects can be adjusted interactively to facilitate easy and comfortable exploration of the data; in particular, a context-based menu has been implemented for controlling the different functions. For all these interactions different kinds of input devices can be used or combined with each other. So far, conventional desktop keyboards and mice are supported, as are 3D Spacemice and a Wand device, which is used for visualization and navigation especially in virtual reality environments (see Figure 5.4).

Figure 5.4: 3D Spacemouse and the Polhemus Stylus Wand

Within the computational steering application the data is continuously varying over time (dynamic environment). Therefore, the visualization module needs to update the data fields automatically to allow for an intuitive visualization and steering terminal. In the following the design of the visualization module will be presented.

Design of the Visualization Front-End

The visualization front end has been implemented using Mercury's OpenInventor 3D scene graph libraries (Mercury Computer Systems, Inc., 2007b). To guarantee smooth interactive data exploration, a multithreaded viewer is used (see Figure 5.5). Multithreading is necessary to allow interaction with data representation objects while, concurrently, the data is processed for visualization in a separate thread. This is especially important for time-varying data sets, which prevail in computational steering applications.

Figure 5.5: Design of the visualization module: The multithreaded viewer permits concurrent postprocessing of data and user interactions. The visualization thread communicates with the thread receiving the external flow data via a special interface. An internal mechanism takes care of the automatic update of the data representation objects in the visualization.

In addition, the multithreaded design needs to be extended with a mechanism to guarantee the automatic update of currently received external data. Therefore, the scene graph is built with special visualization nodes, which are connected to so-called Property Objects, to avoid copying data sets into these nodes. Through this connection the visualization objects listen on their property links and are automatically updated as soon as new data arrive. In contrast to the data representation objects in the scene graph, these property classes work with pointers and are notified of changes via calls of an update routine.
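The update mechanism described above resembles a listener (observer) pattern. The following sketch is only an illustration under that assumption: the class and method names are hypothetical and do not come from iFluids or OpenInventor. A property object holds a reference to the latest data set and notifies connected visualization nodes, which read the data through that reference instead of copying it.

```python
import threading


class PropertyObject:
    """Holds a reference to the most recent data set (hypothetical sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = None
        self._listeners = []

    def connect(self, listener):
        """Register a visualization node that wants to be notified."""
        self._listeners.append(listener)

    def set_data(self, data):
        """Called by the receiving thread when new external data arrive."""
        with self._lock:
            self._data = data  # store the reference only, no copy
        for listener in self._listeners:
            listener.update(self)

    def data(self):
        with self._lock:
            return self._data


class VisualizationNode:
    """Stand-in for a scene graph node listening on its property link."""

    def __init__(self, name, prop):
        self.name = name
        self.current = None
        self.updates = 0
        prop.connect(self)

    def update(self, prop):
        # A real node would rebuild its geometry here (e.g. an isosurface).
        self.current = prop.data()
        self.updates += 1
```

In this sketch the update routine runs in the thread that delivers the data; a real viewer would typically hand the redraw over to the rendering thread.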
The visualization front-end has been designed as a module with interfaces for data exchange and is therefore encapsulated from other structures. This, additionally, enables a connection of the visualization to other data services, as well as making the application ready for multi-client extensions (Borrmann et al., 2006).

Chapter 6

Interactive Problem Definition and Grid Generation

The previous chapters described the simulation kernel and the interactive data exploration of iFluids. Those are the central components of an online monitored simulation. To achieve real computational steering and thus allow the user to interact with the simulation during its execution, an additional steering environment is required. This chapter will therefore introduce the important aspects of steering in the context of a CFD simulation, which comprises basic, grid-independent steering options, the definition of boundary conditions as well as the online modification of the geometric model. Furthermore, the user interface of the interactive simulation will be briefly described. Since the incorporation of geometric modifications into the simulation model is central to enabling full interaction, the focus in the second part of this chapter will be put on a specially optimized grid generator.

6.1 Steering of Global Simulation Parameters

The simplest level of interaction with a simulation is to stop, pause or restart the simulation run. This can be extended rather easily to changes of global parameters and physical constants, e.g., the viscosity in the case of fluid simulations. At this point the interaction facilities of most other computational steering frameworks end. The main characteristic of these types of interaction is that their influence on the simulation is grid-independent. Some of the more sophisticated interaction types offered by iFluids are the possibility to activate turbulence modeling and to choose between several optimization strategies for the main computation routines so as to be able to adjust quickly and easily to the different strengths of the presently used hardware.

These basic interactions are accessible through a context-based 3D menu which can be used with stereoscopic visualization systems as well as with conventional desktop workstations. The supported input devices are conventional desktop mice with standard keyboards, the Polhemus Stylus wand, and 3D Spacemice, which have been described in Chapter 5.


6.2 Interacting with the Geometric Model

One of the key features of iFluids distinguishing it from other computational steering applications is the possibility of interactive modifications of the geometric setup and its boundary conditions during ongoing computation. The geometric model will now be introduced as a starting point of the description of this characteristic.

Geometric Model

The geometric model used in iFluids is based on boundary representation objects (Brep), stored in the STL stereolithography file format. This format contains a description of a triangulated surface, namely the coordinates of the vertices and the normal of each facet (wikipedia, 2007c). STL files can be obtained easily since this format is a standard export option in most CAD systems and, in addition, numerous tools allow converting most of the well-established formats such as 3DS, DXF, OBJ, IV, AM, and VRML into STL.
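To make the structure of such a file concrete, the following sketch reads the ASCII variant of STL, which lists each facet as a normal followed by three vertices. The function name and the sample data are illustrative assumptions; the sketch ignores binary STL and any iFluids-specific attributes, and it assumes that several objects appear as consecutive `solid ... endsolid` blocks, a convention not every STL reader accepts.

```python
def parse_ascii_stl(text):
    """Return {solid name: [(normal, (v0, v1, v2)), ...]} for ASCII STL text."""
    solids = {}
    name, normal, vertices = None, None, []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        keyword = tokens[0]
        if keyword == "solid":
            # Unnamed solids get a generated placeholder name.
            name = " ".join(tokens[1:]) or f"solid{len(solids)}"
            solids[name] = []
        elif keyword == "facet" and len(tokens) == 5 and tokens[1] == "normal":
            normal = tuple(float(t) for t in tokens[2:5])
            vertices = []
        elif keyword == "vertex":
            vertices.append(tuple(float(t) for t in tokens[1:4]))
        elif keyword == "endfacet":
            solids[name].append((normal, tuple(vertices)))
    return solids


# Hypothetical two-object file, each object consisting of a single facet.
SAMPLE = """solid desk
facet normal 0 0 1
outer loop
vertex 0 0 0
vertex 1 0 0
vertex 0 1 0
endloop
endfacet
endsolid desk
solid chair
facet normal 0 1 0
outer loop
vertex 0 0 0
vertex 0 0 1
vertex 1 0 0
endloop
endfacet
endsolid chair
"""
```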

Figure 6.1: Working place scenario stored in the STL file format: This figure shows triangulated surfaces stored in STL format. One file can contain several distinct objects, which can be loaded and stored as a group.

An STL file is able to hold the description of several distinct objects and as such permits grouping of objects. For example, an office working place can be described by a group of objects comprising a desk, a chair, a telephone, a desktop PC etc. (see Fig. 6.1). For use within iFluids (and between iFluids sessions) the standard STL file format can be supplemented by several fluid mechanical attributes without losing its compatibility to standard STL readers or import software. This has been achieved through the use of special magic keyword comments which are only evaluated by iFluids and otherwise ignored. With the functionalities of grouping objects and attaching corresponding fluid mechanical attributes it is possible to define a complete scenario of a room with doors, windows, ventilation facilities and furniture, which can be loaded at application start to generate the initial geometry setup and boundary conditions. This shortens the interactive initialization phase as compared to starting from scratch without any initial startup description, where the hardcoded startup is a simple cuboid room with a door and a window as inlet and outlet, respectively.

Offline Preprocessor iFluidsPre

To extend standard STL files with fluid mechanical attributes and to be able to group and split objects for the initial starting scene, the preprocessing tool iFluidsPre has been developed (Kollinger, 2007). iFluidsPre has been implemented as a module of the general-purpose 3D and VR visualization system Amira (Advanced 3D Visualization and Volume Modeling (Mercury Computer Systems, Inc., 2007a)). Within iFluidsPre a complete scene of a fluid simulation can be constructed. By loading STL files with the geometric configuration for an indoor simulation of a room, the fluid domain and its boundary condition is defined through the room's walls, complemented by more details such as the basic furnishing, ventilation facilities or other inner fluid obstacles.

As already mentioned, the room's geometric configuration is based on object groups stored in a single STL file. The objects building up the group in this context will be referred to as subobjects in the following. When double clicking on an object (group), iFluidsPre enters the so-called Subobject Mode of such a group and displays the individual subobjects in different colors as shown in Fig. 6.2.

Figure 6.2: A composition of facilities and devices needed in a surgery room is shown. The different colors indicate the individual subobjects of this grouped set of geometrical representations.


iFluidsPre allows the user to group several objects from different STL files and to store this group of objects in a new file. Within existing objects or subobjects the user can furthermore define a group of facets which can be stored as a new subobject. Besides defining new subobjects, it is also possible to merge several subobjects into one object or to delete unneeded subobjects or single facets.

As a module of Amira, iFluidsPre, in addition, offers Amira's full functionality with regard to modifying the geometry of facet surfaces. This comprises coarsening and refining areas of a facet mesh, removing selected triangles, as well as modifying the orientation of their normals or coordinates.

The main objective of iFluidsPre, however, is the assistance in defining fluid mechanical attributes for whole objects or groups of facets. Therefore, groups of facets can be defined and tagged by boundary conditions and material parameters (see Figure 6.3). Currently, boundary conditions of the following types can be chosen (one per group):
• At boundaries a no-slip condition states that the fluid's velocity vanishes along the boundary. Following He et al. (1997) this has been implemented as a bounce-back condition in the Lattice Boltzmann solver of iFluids.
• At frictionless boundaries a slip condition is implemented, which does not influence the velocity along the boundary but sets the velocity orthogonal to it to zero.
• At flow inlets as given, for example, at ventilation facilities, a velocity condition can be set. To determine the correct value for the Lattice Boltzmann Model, similarity considerations have to be taken into account, e.g., by using the Reynolds number (see Equation (3.31)).
• It is also possible to set a pressure condition at a boundary. In case the boundary face is orthogonal to a coordinate axis, the pressure condition can be enhanced by determining missing distributions following Zou and He (1997).
• In addition, a temperature and a temperature gradient condition can be attached to the surfaces of the investigated model to support currently ongoing developments of iFluids which will consider thermal phenomena for indoor fluid simulations and comfort studies (van Treeck et al., 2007).
• To be prepared for future extensions, stubs for a user defined condition have also been implemented.

Combinations of boundary conditions are naturally supported; however, the user himself must take care to limit himself to meaningful combinations. For example, one should consider that the combination of pressure and velocity cannot be set at the same boundary without already knowing the solution of the problem (Tölke, 2001). Furthermore, mass conservation must be guaranteed if only velocity conditions are used. Besides the boundary conditions already mentioned, further attributes and material parameters such as the specific heat or the surface coefficient of heat transfer can be set as well.
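Two of the points above, the similarity consideration for velocity inlets and the mass balance for pure velocity conditions, can be sketched in a few lines. The sketch assumes the standard definition Re = u L / ν; Equation (3.31) itself is not reproduced here, and all function names and numbers are illustrative, not taken from iFluids.

```python
def reynolds_number(velocity, length, viscosity):
    """Re = u * L / nu for a physical velocity, length and kinematic viscosity."""
    return velocity * length / viscosity


def lattice_velocity(reynolds, length_in_cells, lattice_viscosity):
    """Inlet velocity in lattice units that reproduces the given Reynolds number.

    Assumes the characteristic length is resolved by `length_in_cells` cells
    and that the lattice viscosity has already been chosen.
    """
    return reynolds * lattice_viscosity / length_in_cells


def net_volume_flux(openings):
    """Sum of (normal velocity into the domain * area) over all openings.

    Outflow carries a negative velocity; for an incompressible flow with
    velocity conditions only, the result should be (close to) zero.
    """
    return sum(velocity * area for velocity, area in openings)
```

A lattice velocity obtained this way should stay well below the lattice speed of sound, otherwise the resolution or the lattice viscosity has to be adapted.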

Figure 6.3: This figure shows the embedding of the iFluidsPre module into Amira. On the left an object set is displayed in a view where its subobjects are represented in different colors. The currently selected subobject (the yellow inlet of ventilation installed at the ceiling) is marked with facets outlined in red. On the right the current settings of the boundary conditions are shown and can be adjusted according to the user's wishes.


Interactive Geometry Modifications

As mentioned above, the key feature of iFluids is the possibility to modify the geometric model and to define or change associated boundary conditions during a running simulation. Concerning the interactions with the geometry itself, additional objects can be loaded (i.e., inserted) from additional STL files, which may optionally be prepared with iFluidsPre. The context-based 3D menu supports the loading process by displaying previews of the contents of the STL files within the currently selected directory. Objects or whole groups of objects loaded to the scene can be translated, scaled, distorted or rotated within the scene, or they can be added to or removed from it (see Figure 6.4). The same modifications can be done within object groups, i.e. subobjects can be transformed relative to other subobjects of their group, or can be deleted.

Figure 6.4: Interaction with the geometric model: The upper left screenshot shows the original object group. Its selection is represented by displaying a bounding box with draggers which enable control of various transformations. In the upper right screenshot a translation in the x-z-plane is shown. In the lower left the object group is distorted in several directions by using the scaling draggers, while the lower right picture demonstrates a rotation transformation.

Such interactions with the geometry are realized either by using OpenInventor draggers (see Figure 6.4) for all transformations with a conventional mouse or 3D Wand, or by using a 3D Spacemouse to access the objects directly for all possible transformations.

These modifications are immediately and automatically incorporated into the running simulation and the results of the simulation kernel adapt to the updated configuration, accordingly.


Interactive Definition of Boundary Conditions

Besides the modifications of the geometrical layout, a central aspect within the computational steering framework iFluids is how boundary conditions are defined or adjusted during a running simulation. The scene can be modified by rearranging inner objects with regions carrying certain boundary conditions, e.g. windows, doors or ventilation inlets in an office room.

The confining surfaces of the simulation domain can be marked with the attribute hull and will then be displayed in a transparent manner whenever they would cover the user's view onto inner parts of the scene as shown in Figure 6.5. As already mentioned, subobjects of a selected object are colored differently and, in addition, an informative pop-up message conveying the current properties of the picked subobject is displayed (see also Figure 6.5).

Figure 6.5: This figure shows a simple setup of an office room. The boundaries of the room are defined through walls, a door and a window, which are assembled into one group of subobjects. In addition, the walls have been marked with the hull attribute, which causes the display of only those parts of the room confinement which do not hide other parts of the scene. This effect is best observed by comparing the two different views of the same scene. Furthermore, the different colors of the subobjects are shown as well as the properties information of either the door (left screenshot) or the walls (right screenshot).

Frequently, boundary conditions are represented by geometries coplanar to faces of the basic geometry, e.g. windows in a wall. This modeling shortcut is made possible through assigning a priority level to these boundary conditions which can, so to speak, overwrite other settings with lower priority. However, the coplanarity of the facets involved usually makes it difficult for them to be selected unambiguously. Therefore, Explosion Mode can be activated, which temporarily moves apart or explodes all coplanar faces with respect to the object's center and in this manner allows an easy selection as shown in Figure 6.6.
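The priority rule for coplanar boundary conditions can be illustrated as follows. This is a minimal sketch with hypothetical names and a 2D cell set standing in for the wall face; it only shows that the condition with the higher priority wins wherever regions overlap, regardless of the order in which they are applied.

```python
def apply_conditions(regions):
    """Resolve overlapping boundary regions by priority.

    `regions` is an iterable of (name, priority, cells); the result maps
    every covered cell to the name of the highest-priority region. With
    equal priorities the region applied first keeps the cell (an
    assumption of this sketch).
    """
    owner = {}
    for name, priority, cells in regions:
        for cell in cells:
            current = owner.get(cell)
            if current is None or priority > current[1]:
                owner[cell] = (name, priority)
    return {cell: name for cell, (name, _) in owner.items()}


# Hypothetical example: a window lying in the plane of a wall.
WALL = ("wall", 0, {(i, j) for i in range(4) for j in range(3)})
WINDOW = ("window", 1, {(1, 1), (2, 1)})
```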
To set new boundary conditions or to modify existing ones, dialogs have been implemented, which can be displayed in either 2D or 3D, for either normal desktop environments or environments offering stereoscopic views. The dialogs offer the means of naming the boundary conditions, of choosing their type, and of setting the corresponding values. In addition, the above-described hull attribute can be set, and for overlapping boundary conditions a priority can be defined to specify which condition should overwrite others (cf. Fig. 6.7).

In comparison to the offline preprocessor iFluidsPre the only unsupported functions are coarsening or refining the faceted surface or changing the orientation of the triangle normals. The latter rely on Amira-internal calls and would have to be implemented separately.

Figure 6.6: Overlapping planes: These screenshots show exploded planes to simplify selection (left) and the correctly selected but partly hidden subobject (right).

6.3 3D User Interface

As already mentioned in the previous chapter (see Chapter 5), it is advantageous for computational steering applications to be run in a virtual-reality environment. To support an intuitive steering of the simulation in virtual reality a specially adapted user interface is required. Therefore, an immersive and intuitive 3D user interface avoiding classical 2D interaction has been developed. It serves as a context-controlled 3D console assisting a virtual fluid-flow experiment while still preserving the possibility to perform computational steering of the application via a standard 2D workstation mouse and keyboard.

To be able to stick to simple (and possibly also single) interaction devices such as mouse or wand, two major modes of operation have been introduced: The first is called direct mode and provides the possibility of direct interaction with and navigation within the scene by using the standard input devices Wand, Spacemouse, and desktop mouse. In the second mode (menu mode) a menu panel is displayed in the lower part of the viewport giving access to a variety of control possibilities.


Figure 6.7: This figure shows the dialog for setting or modifying boundary conditions of a selected object or subobject.

Direct Mode

The application is started in direct mode, where the user, again, can choose between two navigation and two selection modes. To explore the scene, he may fly through or examine it from a fixed point of view, for example, by rotating the corresponding 3D model. The first selection mode allows the user to choose between the different objects representing the configuration of the simulation, i.e. fluid obstacles and boundary conditions. In the second mode the user can switch the selection between the data visualization objects, i.e., cutting planes, streamlines, vector planes, and isosurfaces. Once an object has been selected, it may be moved to a different position. Depending on the current interaction (either navigating within the scene, or selecting data visualization objects or simulation objects, obstacles or boundary conditions), a context-based menu can be invoked, which appears with appropriate options.

Menu Mode

As already mentioned, there are three different menu categories according to the current context. The first one (Scene Menu) is called while exploring the scene in navigation mode. This menu offers the insertion of new fluid obstacles, new boundary conditions, data representations, or a legend as well as the modification of global fluid flow parameters.

When adding a new obstacle, for example, the menu allows browsing through the file system to load the geometric description of a CAD-generated object. The file browser makes use of the capabilities of the 3D visualization in that it shows a small rotating preview of the currently browsed object.

A context-based menu offers the advantage that the user can navigate directly through the menu without having to search for the respective submenu. This allows faster access and a more intuitive way of working as compared to user interfaces with deep menu trees. Furthermore, the menu information and graphical representation is reduced to its essential minimum to prevent annoying occlusions of the scene. If the selection of an option calls another menu branch, the preceding menu will be replaced with the new one. Since the menu can be activated or left at any time, we decided to display it view-fixed regardless of the user's current point of view, which may change during navigation. As shown in Figure 6.8, the menu has been arranged in front of the user similar to a console or panel. This preserves the view onto the scene even while controlling the menu in a fully immersive manner.

Another helpful detail is the highlighting of the currently selected menu item to give the user feedback on which option would be selected in the menu. The menu options can be selected either by using the arrow keys on the keyboard, rotations of the spacemouse knob, or by pointing at and clicking on them with a wand or the desktop mouse.

For complex settings, as is the case when defining boundary conditions, the 3D menu switches to a 2D shape (also view-fixed) to be able to see all settings options at one glance in an efficient way (see Fig. 6.7).

Figure 6.8: This screen capture shows the Scene Menu offering the user options to add obstacles or data representations, to control flow parameters and other configuration options of the simulation.

6.4 Grid Generation

As already mentioned, a key feature of iFluids making it stand out from other computational steering applications is the possibility to change the geometry and its associated boundary conditions during a simulation. The few applications which are also capable of handling interaction with the geometry are restricted to predefined and parameterized objects. Within iFluids arbitrary geometry can be loaded at any time into the currently investigated simulation scene without particular preparations and can be freely manipulated. This is due to a powerful grid generator which maps the geometry onto the computational grid of the Lattice Boltzmann simulation at comparatively little computational cost. In addition, the lattice sites of the computational grid contain object-specific information about the boundary conditions needed for the fluid simulation.

Voxelization

The process of converting the surface representation of a geometric object or scene to a 3D volume discretization on a grid of a given resolution is referred to as voxelization. The elementary unit volume within such a grid is called a voxel in analogy to the pixel in 2D-scanline conversion for raster graphics displays. Depending on the grid resolution and type of geometry, the voxelized object most often is not an equivalent representation of the original but only an approximation.

Even so, there are several fields of application in which a volume-based representation of an object is necessary or advantageous. For example, voxelization methods have been used for volume visualization (Jones, 1996), radiosity and ray tracing (Krumhauer et al., 1999), volumetric model repair (Kolb and John, 2001), and collision detection (Gibson, 1995). Furthermore, measurements in medical, physical, or engineering applications often result in volumetric data. Finally, certain types of numerical simulation are computed on Cartesian grids requiring the transformation of point or surface data into a volume representation.

For fluid simulations based on the Lattice-Boltzmann method Cartesian grids are used typically. The handling of complex CAD geometry data is thus comparatively simple, but brute-force voxelization of a scene with several 10 to 100 thousands of surface describing polygons remains a computation-intensive task. It is therefore worthwhile to try to speed up the voxelization process by applying a space-partitioning algorithm to generate a volume representation via an octree. An octree is a hierarchical data structure with one root node and, depending on a certain criterion, with either eight or no child nodes. The child nodes may themselves be father nodes to follow-up trees or end the tree as leaves if without children (see Fig. 6.9).

An octree is especially well suited for space partitioning problems in 3D space since it is obvious to identify the eight children of the root node with the eight Cartesian octants of an object's or a scene's bounding cube. Subdividing the octants which still contain parts of the geometry can be repeated a specified number of times, whereby the number of recursions determines the level of refinement of the octree (Fig. 6.9).

Figure 6.9: This scheme shows two refinement steps of how to get from a single root-node octant (left) to a level 2 octree (right).
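The recursive subdivision described above can be sketched as a small data structure. The class and function names are hypothetical, and the criterion deciding whether an octant still contains geometry is passed in as a callable, so the sketch makes no assumption about how iFluids tests octants against triangles.

```python
class Octant:
    """Axis-aligned cube with either eight children or none (a leaf)."""

    def __init__(self, origin, size, level=0):
        self.origin = origin  # corner with the smallest coordinates
        self.size = size      # edge length
        self.level = level
        self.children = []

    def subdivide(self):
        """Split the cube into its eight Cartesian octants."""
        half = self.size / 2
        ox, oy, oz = self.origin
        self.children = [
            Octant((ox + dx * half, oy + dy * half, oz + dz * half), half, self.level + 1)
            for dz in (0, 1)
            for dy in (0, 1)
            for dx in (0, 1)
        ]
        return self.children


def refine(octant, touches_geometry, max_level):
    """Subdivide only octants that still contain geometry, up to max_level."""
    if octant.level >= max_level or not touches_geometry(octant):
        return
    for child in octant.subdivide():
        refine(child, touches_geometry, max_level)


def leaves(octant):
    """Collect all leaves below (and including) the given octant."""
    if not octant.children:
        return [octant]
    return [leaf for child in octant.children for leaf in leaves(child)]
```

If every octant is reported as containing geometry, two refinement steps produce the full level 2 octree with 64 leaves; with sparse geometry most octants stay coarse.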

The voxelization algorithms used in iFluids work on triangulated surfaces as input, which most CAD systems offer as a standard export option. In iFluids' grid generation module the common STL stereolithography file format (wikipedia, 2007c) is supported. For a given surface the STL file contains a list of its triangles' vertex coordinates and normal vector.

There are a series of approaches to voxelization; the three most relevant methods have been presented, e.g., by Stolte and Kaufman (2001), Haumont and Warzee (2002), and Mundani (2006). The method of grid generation applied within the computational steering framework iFluids is related to these insofar as surfaces are converted into a volumetric representation using octree data structures. Although Stolte and Kaufman (2001) also concentrate on a fast conversion, their approach is limited to objects described analytically through an implicit definition. The other study by Haumont and Warzee (2002) uses polygons as input; however, their focus lies on model repair, visibility determination, collision detection, and complete interior/exterior classification of all voxels with very high computational cost. This leads to a voxelization process too complex for interactive use. The third method (Mundani, 2006) converts surfaces into an octree description using halfspace partitioning. This algorithm relies on convex objects defined by closed polygonal surfaces and, accordingly, is very efficient for basic geometrical shapes.

iFluids offers two methods of grid generation, one for creating a uniform Cartesian grid, the other to produce hierarchical Cartesian grids.

Uniform Cartesian Grids

The Lattice-Boltzmann solver (described in Chapter 3) currently used in iFluids is based on a uniform Cartesian grid. Accordingly, the grid generation process starts with an empty grid without any geometry data or boundary conditions. To map the geometrical data with its corresponding boundary condition information efficiently, a root octant is computed for each triangle. Depending on the size and orientation of the facet, these root octants are usually rather small. The size of the root octant and the grid resolution also define the level of refinement needed. It can be determined through

\[ \mathit{octlevel} = \log_2(\mathit{length}_{max}) + 1, \qquad (6.1) \]

where octlevel is the maximum level of refinement in the octree and length_max represents the maximum length of the bounding box of a facet in grid units. Note that this level of refinement creates sub-grid leaves as shown in Figure 6.10 to improve the geometry's approximation.

Figure 6.10: Recursive refinement for a 2D example: The black grid points are set to approximate the triangle while the blue grid points indicate which points would have been set additionally without the sub-grid leaves created by the extra refinement.


Figure 6.11 shows the octree structure for the described algorithm. For each triangle an individual octree has been created and recursively been refined.

Figure 6.11: In this example a Pentakis dodecahedron has been voxelized on a uniform Cartesian grid. For each triangle the generated root octants with their tree structure with 64x64x64 grid resolution are displayed.

Hierarchical Cartesian Grids

Lattice-Boltzmann solvers can also be performed on hierarchical grids as proposed by Crouse et al. (2003), Tölke et al. (2006), and Thürey (2007). An important option accessible with this kind of grid is the simple realization of patches of higher resolution around critical parts of the geometry. An extreme variant of this ansatz can be used for blood flow simulations as proposed in Götz (2006). The octree can be utilized to identify regions around the geometry (the artery with the interior flow) for setting up the compute cells, while the remaining grid outside these boxes is discarded. To also support such solvers, a modified algorithm has been developed. Here, the algorithm starts with one root octant for the whole simulation scene, which is refined to the same level of refinement as obtained by Eq. 6.1. In contrast to the algorithm described above, all triangles have to be tested for the root octant, and a list of triangle candidates intersecting the current octant is given to the next generation of child octants. In addition to the level of refinement, a second criterion to stop the refinement of the current octant is then an empty candidate list. Since the triangles have to be tested several times for intersection with the octants (although it may turn out that they have to be set by another part of the octree), this algorithm is slower than the above-described version for uniform Cartesian grids.

Figure 6.12 shows the octree structure, serving as the basis for the hierarchical grid. To obtain a grid which can be used for a Lattice-Boltzmann simulation, the octree constructed through this method needs to be smoothed, i.e., the level of refinement of neighboring cells must not differ by more than a given limit (often a limit of 1 is required).

Figure 6.12: Again, a Pentakis dodecahedron has been voxelized on a 64x64x64 grid. In contrast to Fig. 6.11, one root octant has been created for the whole object and was then recursively refined. The octree obtained can be used as a basis for a hierarchical Cartesian computational grid.

Performance Evaluation and Optimizations

In both algorithms described above, the main computation is the test of whether a triangle intersects an octant, lies completely in its interior, or outside. For this intersection test a highly optimized routine based on code developed by Akenine-Moeller (2001) has been used. This test routine relies on the separating axis theorem (Gottschalk et al., 1996; Möller and Haines, 1999; Eberly, 2000), which states that two convex polyhedra, A and B, are disjoint if they can be separated along either an axis parallel to a normal of a face of A or B, or along an axis formed from the cross product of an edge of A and an edge of B. To simplify these tests, the box and the triangle are transformed in a way that the center of the box coincides with the origin and the faces are axis-aligned. After the initial transformation, three tests are performed:

• Test the minimal bounding box of the triangle against the octant: Here it is checked whether the minimum and maximum coordinate components in x-, y-, and z-direction are located outside the octant's minima and maxima or not.
• Test the intersection of the triangle's plane against the octant: Here the plane of the triangle, defined by one vertex and the triangle normal, is tested against the octant according to Möller and Haines (1999) and Haines and Wallace (1994).
• The last test computes the nine cross products of all combinations of the edge vectors of the triangle and the normals of the octant. By projecting the coordinates of the triangle and the octant onto each of them, it is determined whether the projected triangle lies outside the projected octant or not.

When all tests are passed, the triangle intersects the octant and the next iteration of refinement can be started. The above tests can be implemented very efficiently (Akenine-Moeller, 2007). For the special case of iFluids the algorithm can be further optimized, since the normal of a triangle is already known and the octants are always axis-aligned boxes. Furthermore, vectors computed once for a triangle can be stored temporarily and reused during the following refinement steps.
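The three tests listed above can be sketched as follows. This is a plain, unoptimized illustration of a separating-axis triangle/box test and not the routine used in iFluids; the function names and the tuple-based vector representation are chosen here for the example only.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def tri_box_overlap(center, half, tri):
    """Return True if triangle `tri` (three vertices) overlaps the
    axis-aligned box with the given center and half edge lengths."""
    # Initial transformation: move the box center to the origin.
    v = [_sub(p, center) for p in tri]
    edges = [_sub(v[1], v[0]), _sub(v[2], v[1]), _sub(v[0], v[2])]

    # Test 1: bounding box of the triangle against the box.
    for i in range(3):
        lo = min(p[i] for p in v)
        hi = max(p[i] for p in v)
        if lo > half[i] or hi < -half[i]:
            return False

    # Test 2: plane of the triangle against the box.
    normal = _cross(edges[0], edges[1])
    plane_offset = _dot(normal, v[0])
    radius = sum(half[i] * abs(normal[i]) for i in range(3))
    if abs(plane_offset) > radius:
        return False

    # Test 3: nine axes from box normals crossed with triangle edges.
    for edge in edges:
        for unit in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            axis = _cross(unit, edge)
            proj = [_dot(axis, p) for p in v]
            radius = sum(half[i] * abs(axis[i]) for i in range(3))
            if min(proj) > radius or max(proj) < -radius:
                return False

    return True
```

The optimizations mentioned in the text (known triangle normal, reuse of per-triangle vectors across refinement steps) would replace the repeated cross products in this sketch.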

Compared to the fluid computations, the voxelization of even complex scenes with high grid resolutions takes only a comparatively short time (typically on the order of a second for 256^3 voxels and 10^6 facets). Additionally, voxelization is only required on the occasion of geometry or boundary condition changes due to user interaction. Still, one might want to speed up this process even more and therefore, after the optimization considerations regarding performance of the grid generator on a single processor, the time consumption of the voxelization process can be reduced through parallelization.

Regarding the technical realization of the implementation, one can choose between several ways of parallelization, such as the MPI message passing library or OpenMP compiler directives as examples of distributed or shared-memory approaches. Since parallelization via MPI always requires a certain effort for adaptation of the source code's structure and, especially on shared-memory architectures, tends to introduce additional communication overhead¹, plain shared-memory communication using the OpenMP programming paradigm was chosen on both machines, the Hitachi SR8000 and the SGI Altix 4700.

¹ If not perfectly optimized (zero-copy transfers), MPI communication is slowed down through additional buffer copying as compared to direct shared-memory communication (Gropp and Thakur, 2005).

The parallelization strategy itself firstly depends on whether a uniform Cartesian or a hierarchical Cartesian grid is used. Regarding uniform Cartesian grids, the different threads perform their work on the same, shared grid points and the triangles are distributed among them. In contrast, for the hierarchical Cartesian grid the triangles are shared and the grid is distributed to the threads. With respect to the former approach one has to keep in mind that multiple processes have to write onto the same grid (necessarily in an ordered manner), while the latter variant requires only simultaneous read access. In both cases data placement needs to be taken into account to minimize memory access penalties.

To evaluate the voxelization performance the Dragon test geometry has been used, which is quite popular in this context. The respective geometry is shown in Fig. 6.13 together with its voxel approximation computed by iFluids' grid generator. The performance of the two algorithms described above has been compared on an AMD Dual Opteron 2.4 GHz system for two different grid resolutions and varying numbers of triangles approximating the Dragon. The single processor voxelization times of both methods are shown in Fig. 6.14. Of course, the computation time of the voxel representation depends on the one hand on the number of triangles constituting the geometric object and, on the other hand, on the grid's fineness.

Finally, the first method has also been benchmarked in a parallelized version using two CPUs on the Opteron. Fig. 6.15 compares the voxelization times when using one or two CPUs for the same test case as in Fig. 6.14.

The performance evaluation reveals overwhelming results, particularly for the voxelization on uniform Cartesian grids. This enables iFluids to map even detailed and complex geometries onto fine grids within a fraction of a second. To optimize Lattice-Boltzmann solvers, especially the idea of using patches with higher resolution or boxes for a reduced grid could be of interest for interactive applications. Here, a combination of both voxelization methods would lead to the most performant voxelization within iFluids.

Figure 6.13: Here, the Dragon test geometry is shown, which has been used for performance evaluation of iFluids' voxelization. On the top the dragon geometry is shown, which is taken from The Stanford 3D Scanning Repository (2007) and defined by 499870 triangles. The bottom picture shows the voxel discretization of this model on a 512x512x512 grid.

Figure 6.14: Using the algorithm for hierarchical grids, the upper graphs (blue and green) show the voxelization time for the Dragon test geometry in dependence on the number of triangles for a grid resolution of 256x256x256 and 128x128x128, respectively. Only for uniform grids is it possible to use the faster algorithm, and the voxelization time can be reduced to about 22% for the finer (orange) and to 25% for the coarser (grey) grid.

Figure 6.15: The parallelization graph shown here demonstrates the speedup achieved by performing the voxelization with two as compared to one processor. For this benchmark the Dragon shown in Fig. 6.13 was used, composed of varying numbers of facets. The voxel grid's resolution was set to 512 points in each direction.

Chapter 7

Realization Aspects with Respect to Computational Steering

In the last four chapters the main components of the computational steering framework underlying iFluids have been described, namely the simulation kernel, the grid generator, and the visualization and steering client. To facilitate real computational steering on a high-end system configuration with a Teraflop supercomputer and an external visualization and steering front-end, these components need to be connected in an efficient way. This chapter will show how the modules have been connected with each other and evaluates the overall performance of the online steerable application iFluids.

7.1 Communication Layout

As described in Chapters 5 and 6, the visualization workstation provides the functionality to display the current simulation data and to interact with the simulated scene at the same time. In the following, focus will be put on how an appropriate design of communication between the individual modules can be laid out to meet these special technical challenges of computational steering.

When a user interaction has occurred, the corresponding modifications are sent to the simulation engine, where the information is incorporated into the simulation model immediately. On the supercomputer new results are computed based on the updated simulation configuration. As soon as they are available the data are sent to the visualization client, where the users can observe the adaptation of the fluid in a series of flow updates (see Fig. 7.1).

Figure 7.1: On the visualization and steering workstation the user can visualize the current flow data received from the simulation running concurrently on the supercomputer. On the occasion of user interactions the specific modifications are sent to the supercomputer, where they are incorporated into the simulation model immediately.

A closer view onto the details of the models and their data streams is shown in Figure 7.2.

Figure 7.2: On the left the steering and visualization front-end is shown. It consists of three threads, one for visualization (VIS), one for user interaction (STEER), and the communication (COM) thread. On the right the simulation master (MASTER) and the simulation slaves (SIM) are shown. Between COM and MASTER steering parameters and simulation results are exchanged, often in a distributed and heterogeneous hardware setup. Within the supercomputer simulation results of the slaves' computational domains are sent to the MASTER, which incorporates the steering parameters into the computational model and, accordingly, forwards decomposed grids containing all simulation data to the slaves.

On the visualization (VIS) and steering (STEER) side an additional communication thread (COM) has been included to uncouple the checking for incoming results and the sending of user modifications and to not interrupt steering and post-processing. To keep the data transfer as short and infrequent as possible, only modifications to the setup are forwarded. Therefore, the transmission process is not triggered until after the user has completed any modifications. Since the results are not necessarily sent at regular intervals either, the receipt of data is, in essence, an event-driven process in both directions. In contrast to this, Kühner (2003) used a communication layout with fixed intervals for sending and receiving, neglecting any occurring events.

The simulation master (MASTER) connecting the visualization and steering front-end to the simulation can be seen more or less as the core of the computational steering framework. When the master receives modifications due to user interaction, they are incorporated into the computational model, using the powerful grid generator described in Section 6.4. Then it performs the domain decomposition and sends the computational grid and all further necessary information to the simulation slaves (SIM). The results computed by the slaves are gathered by the master and are prepared for being sent back to the visualization and steering client.

The slave processes (SIM) on the supercomputer mainly perform the Lattice-Boltzmann computation. To achieve good performance, it is essential to use vendor-optimized intra-machine (and possibly inter-machine) MPI for the communication with other processes. As long as no interaction has occurred, the slaves send current results at user-defined, regular intervals and check for updated computational grids. In the event of interaction, a new grid is received and new results can be sent after just a few time-steps to give the user fast initial feedback depending on his manipulations. Consequently, the transmission intervals are not necessarily regular throughout the run.

Since the visualization is usually conducted on an external graphics workstation with a different hardware architecture from that of the supercomputer, an appropriate MPI derivative is needed for the inter-machine communication between master and visualization and steering front-end, which also enables the slaves to use vendor-optimized MPI. In this respect, the Globus MPICH-G2 (MPICH-G2, 2007) and PACX-MPI (Pacx-MPI, 2007) libraries have been tested within iFluids. Unfortunately, all efforts to port MPICH-G2 to the SR8000 with vendor-MPI enabled have not been successful. At the time the measurements were done for the SGI Altix, Globus MPICH-G2 was not yet available on this system. Therefore, all performance data presented in the following section are based on the PACX-MPI library. In this context it is important to note one drawback of PACX-MPI, namely the fact that it starts two extra MPI processes on each system for internal purposes. In the case of the Hitachi, which has only eight interactive nodes, only six nodes could be used for the application, accordingly. In addition, we observed that performance decreased by approximately 5% due to the overhead caused by PACX-MPI (Keller, 2005). In the absence of other competitive alternatives one has to simply accept this for the time being.
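The event-driven behavior of the COM thread described in this section can be illustrated with a minimal sketch. It is not the iFluids implementation: Python queues stand in for the MPI links, and all names (`com_loop`, `SteerSession`, the parameter `inlet_velocity`) are hypothetical. The sketch only shows the two properties stated above: modifications are forwarded once the user has completed them, and results are handled whenever they arrive rather than at fixed intervals.

```python
import queue
import threading


def com_loop(events, to_master, to_vis):
    """Hypothetical COM thread body: block until an event arrives, then route it."""
    while True:
        kind, payload = events.get()   # blocks; no fixed polling interval
        if kind == "stop":
            break
        if kind == "commit":           # STEER finished a set of modifications
            to_master.put(payload)
        elif kind == "result":         # MASTER delivered new simulation data
            to_vis.put(payload)


class SteerSession:
    """Collects user edits and hands them to COM only once they are complete."""

    def __init__(self, events):
        self._events = events
        self._pending = {}

    def modify(self, name, value):
        self._pending[name] = value    # nothing is transmitted yet

    def commit(self):
        if self._pending:              # forward modifications only
            self._events.put(("commit", dict(self._pending)))
            self._pending.clear()


def drain(q):
    """Return all items currently stored in queue `q`."""
    items = []
    while not q.empty():
        items.append(q.get_nowait())
    return items


def demo():
    events, to_master, to_vis = queue.Queue(), queue.Queue(), queue.Queue()
    com = threading.Thread(target=com_loop, args=(events, to_master, to_vis))
    com.start()
    steer = SteerSession(events)
    steer.modify("inlet_velocity", 0.24)
    steer.modify("inlet_velocity", 0.28)   # overwritten before the commit
    steer.commit()
    events.put(("result", "flow field after step 60"))
    events.put(("stop", None))
    com.join()
    return drain(to_master), drain(to_vis)
```

Routing both directions through one blocking event queue keeps VIS and STEER free of any waiting, which is the purpose the text gives for the separate COM thread.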

7.2 Framework Performance

After all involved modules are connected to the computational steering framework, the overall performance is investigated, and solutions for the problems detected are discussed in the following.

Computation and Communication Overlapping

As shown in Figure 7.2, the master process uncouples the communication between visualization and steering terminal and simulation, to avoid communication dependencies between simulation slaves and steering terminal. The decoupling of computation and communication is shown in Figure 7.3 for a trace collected with the Intel Trace Analyzer (intel, 2007).

Another important task for the master is to collect results from the slaves and send them, combined in a single message, to the visualization client to avoid additional latencies. This is especially important when the network connection between supercomputer and visualization client is limited by routers, firewalls, and slow connections, possibly even in competition with other users.

Figure 7.3: This trace depicts the distribution of computation and communication. In this case, five processes were recorded, namely visualization (process 0), simulation master (process 1), and three slaves (processes 2-4). The timeline is given on the abscissa and covers 11 time-steps starting from the application's initialization. Red marks represent MPI function calls while green shows periods of computation or other application-specific processing. Frequent checks for user interaction and time-consuming communication to an external machine can be taken off the computation processes by introducing a master node.

Finally, Figure 7.3 also reveals the main advantage of introducing a collector node: During the time-consuming transfer of results (from process 1 to 0), the slave processes (2-4) are able to overlap computation with communication as long as the computation time is longer than the communication time. It is also necessary to point out that using non-blocking MPI communication on the SR8000 (and several other MPI implementations, cf. White III and Bova (1999)) does not allow this overlap.

The impact of this fact on the performance of the application is also shown in Figure 7.4, which compares the performance of the application with and without this master node. The measurements have been carried out with the computation running on the Hitachi SR8000 at LRZ and the visualization on an external Dual Opteron Linux PC at the Chair for Bauinformatik.

Figure 7.4: The two graphs show the scaling behavior for various data update intervals in dependence on the number of computation nodes used. Performance is measured in MLup/s (million lattice site updates per second) at the visualization process. The left-hand panel refers to runs with a simulation master, where update intervals of 60 or more time steps show good scaling already. Runs with shorter intervals have a negative influence. On the right-hand panel, the same measurements were performed without a simulation master, and the application shows poor scaling efficiency (<50%) in all cases.

Furthermore, the right graph in Figure 7.4 shows the dependency on the data update intervals. True scaling of the application could only be achieved using a master node in combination with long update intervals of the visualized scene. The saturation of performance occurs because each slave has only a small amount of computational work, while the communication between master and visualization front-end takes more time than the computation by the slaves. After only a few cycles, the master is still transferring data while the slaves have to wait until they are able to send new results to their master. Evidently, this has a severe impact on the performance and may be identified as a network bottleneck, which will be discussed in more detail in the following section.

Network Bottleneck

When recognizing the impaired scaling behavior during the performance measurements with short update intervals, we looked for possible explanations and found that the network connection of the SR8000 to any other computer was fairly unsatisfactory. It turned out that the throughput on its single Gigabit Ethernet line reached only 230 MBits/sec at most. To get an idea of the influence of the network connection, the same measurements have also been performed on Sara's SGI Altix 3700.

To be able to compare these measurements, the offline performance (i.e., performance without a connection to a visualization and steering terminal) has been measured on both systems (see Chapter 4) and is shown again in Figure 7.5.

Figure 7.5: This graph shows the performance of the Lattice-Boltzmann kernel running on Hitachi SR8000 (green) and SGI Altix 3700 (blue). Both codes show good speed-up behavior on the respective machine. The performance gain seen after porting the kernel to the Altix was about 70% when using 40 processors on both machines.

For the setup at Sara, an external visualization client was connected to the Altix machine also via Gigabit Ethernet. It was synthetically benchmarked to a bandwidth of 720 Mbits/sec. Comparing the saturation performances on the SR8000 and the Altix 3700 showed a performance gain of 210% as compared to 70% (see Figure 7.6). This corresponds to data updates every 13 seconds on Hitachi's SR8000 and only 4 seconds on the SGI Altix for 380x355x122 grid points of the example room with dimensions of 7.6m x 7.1m x 2.24m. The graphs in Figure 7.6 also show that even with a higher performance of the computational kernel, data updates every 13 seconds cannot be surpassed with the network connection provided in the setup of the SR8000 with external visualization. Only the superior efficiency of the network at Sara is able to further decrease the data update time significantly.

To further improve the application with respect to the network bottleneck even for high update rates and high resolution grids, compressed data transfer between computation and visualization should be investigated. If possible, the amount of data sent to the visualization and steering front-end should be minimized. This is, for example, an option for porous media, where the boundary nodes need not be sent. Data reduction as suggested in Kühner et al. (2001) does not seem to solve the problem per se, since the message size can only be decreased by a small amount, inevitably accompanied by (considerable) computational costs.
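The quoted update times can be cross-checked against the measured bandwidths with a short calculation. The sketch below assumes that an update is dominated by the network transfer at the benchmarked peak throughput; the resulting payload per grid point is an inference from the numbers above and is not stated in the text. The function names are hypothetical.

```python
def implied_bytes_per_point(points, seconds, mbit_per_s):
    """Payload per grid point if the whole update time is spent on the wire."""
    return seconds * mbit_per_s * 1e6 / 8 / points


def transfer_seconds(points, bytes_per_point, mbit_per_s):
    """Time to ship one full result set at the given bandwidth."""
    return points * bytes_per_point * 8 / (mbit_per_s * 1e6)


GRID_POINTS = 380 * 355 * 122  # example room from the text

# SR8000: 13 s per update at 230 MBits/sec
sr8000_bytes = implied_bytes_per_point(GRID_POINTS, 13, 230)
# Altix 3700: 4 s per update at 720 MBits/sec
altix_bytes = implied_bytes_per_point(GRID_POINTS, 4, 720)
```

Both machines give roughly 22 to 23 bytes per grid point under this assumption, which supports reading the saturation as bandwidth-limited rather than compute-limited.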

Figure 7.6: These pairs of graphs show the performance gain in dependence on the number of time steps between result updates to the visualization for runs on the SR8000 and Altix 3700, respectively. Both machines show performance saturation when frequently updating results, because communication time then exceeds computation time. Updating less often than every 40 time steps quickly restores good scaling behavior. It is evident that the SGI can make much better use of its outgoing Gigabit Ethernet network than the older Hitachi system.

Still another way of bypassing the bottleneck would be to move the post-processing and graphics computation onto the supercomputer (maybe even in a parallel version). This, in fact, seems to be the most promising solution for the dilemma, but may not always be applicable. Especially on the Hitachi SR8000 this ansatz was infeasible, as this machine was not well-suited to support post-processing or visualization due to its bad performance in non-vectorizable code. An alternative solution would be a coupled system of a high-performance computer (well-suited for the Lattice-Boltzmann method, such as vector computers) and a high-end graphics workstation (such as an SGI Prism or a visualization cluster) which are connected via specialized and dedicated networks at the same computing center. Then, the visualization data could be transferred as a (compressed) video stream as suggested, e.g., in Heinzlreiter (2005). Although first projects in this direction have already started and products in this direction are already available (remote visualization services such as SGI OpenVizServer, IBM, HP, SUN, ..., VNC), they currently do not yet address all requirements sufficiently. For instance, immersive environments cannot be used with this approach yet.

7.3 Visualization and Steering on Multiple Clients

The modules building the computational steering framework are strictly coupled and implemented with interfaces that allow combining them and interacting with each other. Therefore, it is possible to exchange modules or insert additional modules, as has been done in Borrmann (2007). This example implemented an extension to iFluids to allow for collaborative engineering.

As shown in Figure 7.7, a collaboration server has been inserted between the simulation master and the visualization and steering client. After extending the communication module (COM) on the client side for some additional collaboration parameters, all other modules were reused. The collaboration server is designed in a way that enables the framework to be connected to different simulation servers at a time and to attach or detach an arbitrary number of clients during a collaborative session (see also Borrmann et al. (2006)).

Figure 7.7: For collaborative engineering a collaboration server has been inserted into the original layout shown in Fig. 7.2. In this way, multi-client computational steering is made possible.

Chapter 8

General Applicability - A Case Study

This chapter investigates the applicability of a computational steering environment for the study of indoor ventilation problems. To this end, a test case has been simulated as a batch job on a grid with high resolution to serve as a high-accuracy reference. It is then compared with simulations on (increasingly) coarser grids such as the resolutions used in interactive online simulations. This approach has been taken to help determine the minimum grid resolution needed for a preliminary study or estimation of an indoor air flow problem and the maximum grid resolution that can be handled interactively. The test case in this study has been the simulation of a ventilation system as it is used in a real setup of an operating room at the Klinikum Rechts der Isar in Munich.

8.1 Ventilation Systems of Operating Rooms

Modern operating rooms set high requirements with regard to aseptic conditions of the working place and, in particular, in the vicinity of the patient and the operation site. Especially for operations lasting several hours the risk of the patient's wound becoming infected by bacteria must be reduced as much as possible. Since a completely bacteria-free operating room is not feasible for technical reasons, a ventilation system is used for applying freshly filtered abacterial air onto the patient's wound to push the room's air aside. Therefore, the stream of filtered air should be directed straight onto the wound so as to avoid any contact with human beings or operating facilities as far as possible.

At the Klinikum Rechts der Isar, two operating rooms with different ventilation systems have been inspected and modeled accordingly. The two rooms (OP I and OP II) are of roughly similar size, namely 6.3m x 6.25m x 3.5m and 5.45m x 6.25m x 3.1m. Figures 8.1 and 8.2 show photographs of these rooms, while Figures 8.3 and 8.4 show a close-up of the different ventilation facilities of each room.

Figure 8.1: OP I: This operating room is equipped with a standard ventilation system entering from the ceiling. The photograph shows the setup just before an operation starts.

Figure 8.2: OP II: In contrast to OP I this room has ventilation systems entering from one wall. The photograph also shows the classical arrangement of the facilities before an operation.

Figure 8.3: OP I: Two rectangular ventilation inlets are placed at the ceiling around the operating table. The velocity of the instreaming air is adjusted to 0.24 m/s, which corresponds to a ventilated air volume of 2780 m³/h.

Figure 8.4: OP II: Almost the whole wall is used as the horizontal ventilation inlet. The air is blown with a velocity of 0.28 m/s, corresponding to a ventilation volume of 14035 m³/h.

During an operation three to five people are usually working together within the aseptic zone. Moreover, at least one table with surgical instruments is placed at the end of the patient's table. Because of the many people involved and, additionally, the lighting facilities placed close to the operative situs, the impact of the ventilation and, therefore, the bacteria concentration at the wound is not easily predictable. Typical operation scenes are shown for both rooms in Figures 8.5 and 8.6.

Figure 8.5: A standard scene during an operation in OP I. Here, a mobile ventilation facility is used in addition to the ceiling ventilation.

Figure 8.6: A standard scene during an operation in OP II. As opposed to OP I, the position of the operation table with the patient can be freely placed for an optimal ventilation.

8.2 Simulation Studies with Varying Grid Resolutions

The two operating rooms, OP I and OP II, have been modeled with their corresponding boundary conditions¹. By doing a series of simulations with different grid resolutions, the minimum resolution has been determined which is still needed to allow physically meaningful predictions but, at the same time, is still runnable as an interactive simulation. Figures 8.7 and 8.8 show the models of the operating rooms I and II during an operation, while Figures 8.9 and 8.10 give the simulation results of these scenes.

¹ The geometry of the operating room is based on models taken from the archives www.turbosquid.com, www.3dcafe.com, and www.lightscape.com/VRML/lib/or.wrl

Figure 8.7: This snapshot shows the simulated surgery scene in OP I.

Figure 8.8: Here, the simulation model for OP II during a surgery is shown.

After simulating the operation rooms OP I and OP II, the configuration of OP II with the additional ventilation device has been run with different grid resolutions. The simulation series shows that already a resolution of only 126x70x125 voxels results in non-negligible deviations. For a resolution of 180x100x179, i.e., one grid point every 3.5 cm, the results agree quite well with the results of the reference solution. After 2500 to 3000 Lattice-Boltzmann time steps the results oscillate around the steady state. Nevertheless, the qualitative behavior of the flow can already be predicted after about 1500 time steps.

Running this setup at the minimum resolution of 180x100x179 grid points on the SGI Altix 3700 (connected to a standard laptop for steering over Gigabit Ethernet as done in the benchmarks in Chapter 7), a data update after 60 Lattice-Boltzmann steps would arrive at the visualization and steering front-end with a rate of 1 frame/s. Therefore, 1500 time steps would take about 25 seconds to compute, after which the first reliable predictions can be made. This also marks the point in time when the first modifications to the setup are starting to make sense. After an additional 25 seconds, or 3000 Lattice-Boltzmann steps, the results allow the principal flow to be predicted.

Taking into account that even more powerful systems are already available (such as LRZ's SGI Altix 4700 together with their new visualization cluster and a dedicated 10 Gigabit Ethernet connection), these results seem quite satisfactory from an engineering standpoint. Accordingly, the bottom line of the investigation in this chapter is that a computational steering application such as the one developed and presented in this work can, in fact, be a helpful supplement in engineering practice to the current state of the art.
98

General

lityApplicabi

A

Case

Study

Figure8.9:ThisfigurepresentsthesimulationresultsforOPIIforagridres-
olutionof545x625x310.Thestreamlinesdisplaytheuidstreamingfromthe
ventilationinletonthebackwalltotheajardoorinthefront.Someoftheven-
tilationarrivesatthepatientssitus,butthemainuidstreamowsaroundthe
zone.gerysur

8.2.

Simulation

Studies

with

aryingV

Grid

Resolutions

theFigureceiling.8.10:ItThecanbeupperseeninsnapshotthissetupshowsthatthethefilterventilationedairinisOPIdeectedcomingthrfroughom
thelampsand,accordingly,thepatientswoundisnotaswellventilatedasin
OPdeviceII.hasThebeenbottominserted.snapshotThisreprdeviceesentscanabescenefreelywhereadjustedanasadditionalneededand,ventilationevi-
dently,isabletocompensatethemissingairowontotheoperationsitus.Both
simulationswererunwithagridresolutionof630x625x350.

99

Chapter 9

Universal Applicability of iFluids - Computational Steering Framework

This chapter shows the computational steering framework underlying iFluids being applied to a completely different field of application, thus demonstrating its capability and suitability to handle a whole variety of computational engineering problems. In this case, the framework was used for interactive blood flow simulation with respect to vascular reconstruction, which has been done in cooperation with the Institute for Computational Science at the University of Amsterdam.

9.1 Vascular reconstruction

A special field of expertise of the Institute for Computational Science at the University of Amsterdam is the research on computational hemodynamics as needed for vascular reconstruction simulations. Vascular reconstruction includes surgical operations like adding shunts, bypasses and placing stents (in the case of aneurysm) or applying thrombolysis techniques, balloon angioplasty, bypasses, etc. for a stenosis. Finding the best treatment is far from trivial, and a simulation tool to support the verification of the operation plan may serve as a good supplement to classical approaches. In collaboration with this institute the computational framework was applied to this kind of simulation.

To evaluate the flexibility of the computational steering framework, the aim was to allow, in principle, an online simulation of blood flow within a part of an artery. Again, the underlying simulation kernel was based on the Lattice-Boltzmann method but did not incorporate details of hemodynamics as described in Artoli (2003).

9.2 Extensions of iFluids for Blood Flow Simulations

In the adaption of the original standard fluid simulation to blood flow simulation, the layout of iFluids could be completely reused. Only two central changes had to be made. Firstly, the grid generator needed to be extended to provide a filling algorithm which also sets voxels in the interior of an object. This was necessary since, in contrast to the indoor simulations, the blood flow simulation takes place within the objects, i.e., the arteries and shunts, added to the scene. Secondly, for reasons of simulation efficiency the data layout needed to be adapted, because in a blood flow simulation the grid points most often are only sparsely populated: cells representing blood account for only a fraction in the range between 5% and 15% of the scene's bounding box volume (see Figure 9.1).

Figure 9.1: On the left this figure shows the triangulated surface of the aorta. On the right its voxel representation is depicted. The underlying grid resolution was 433x206x126, and 127432 voxels were set as boundary nodes (including boundary conditions like inlet and outlet) and 403734 as fluid nodes in the interior. Therefore, the percentage of relevant grid nodes is 4.73%. The voxelization was done in 0.68 sec using a single CPU on an SGI Altix 4700 (1.6 GHz).

To replicate the obstruction of a blood vessel and its surgical remedy, two simple forms of interaction during the blood flow simulation are supported. On the one hand, blockers can be added to the interior of the artery for narrowing or completely blocking the flow. On the other hand, artificial bypasses can be added to or removed from the artery, as can be seen in Figure 9.2.

Figure 9.2: This figure shows three screenshots taken during an interactive simulation session. On the left the unmodified artery with blood flow in its interior is shown. In the middle picture one branch of the aorta is blocked. Finally, on the right a bypass is inserted to supply the blocked part of the artery through this artificial vessel.

Grid Generation

Objects added to a scene during an indoor simulation are, by definition, interpreted as fluid obstacles, which may have boundary conditions on their surfaces, e.g., a personal computer as a heat source or the velocity boundaries of a tabletop ventilator. Therefore, the interior of the obstacles can be neglected. However, in the case of the blood flow simulation it is precisely the interior of the artery that is of interest, and the surrounding nodes are neglected. To cope with this situation the grid generator marks all nodes outside the artery as neglectable, the boundary nodes with their corresponding boundary conditions, and the interior as fluid nodes. An important extension in this respect is the capability of intersecting objects with each other. This, for instance, is needed when modeling the addition of a bypass, such that the boundary nodes of the bypass which extend into the aorta are removed on both sides and the aorta is opened where the bypass enters it. When adding a blocker, these boundary nodes are attributed a higher priority and are therefore not replaced by fluid nodes. This special form of intersection functionality is demonstrated in Figure 9.3 for a smaller part of the aorta shown in Figure 9.1, which has been extracted and scaled up for this purpose (see Figure 9.4).
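The intersection behaviour described above can be pictured as a priority rule applied wherever two objects claim the same grid node. The sketch below is one possible way to express that rule and is not the grid generator of iFluids. The node type names, their numeric priorities and the helper `merge_grids` are assumptions made for illustration.

```python
from enum import IntEnum


class Node(IntEnum):
    # Hypothetical priorities: the larger value wins when objects overlap.
    NEGLECTABLE = 0  # outside every vessel
    BOUNDARY = 1     # vessel wall, inlet or outlet
    FLUID = 2        # interior of a vessel; opens walls of other vessels
    BLOCKER = 3      # boundary nodes of a blocker; never replaced by fluid


def merge_grids(grid_a, grid_b):
    """Combine two voxelized objects node by node, keeping the higher priority."""
    return [max(a, b) for a, b in zip(grid_a, grid_b)]


N, B, F, X = Node.NEGLECTABLE, Node.BOUNDARY, Node.FLUID, Node.BLOCKER

# One row of nodes cut through the aorta and through a bypass entering it.
aorta = [N, B, F, F, B, N, N]
bypass = [N, N, N, B, F, F, B]

# The bypass wall inside the aorta (index 3) and the aorta wall inside the
# bypass (index 4) both become fluid, so the two vessels are connected.
with_bypass = merge_grids(aorta, bypass)

# A blocker placed in the aorta keeps its nodes although fluid overlaps them.
blocker = [N, N, X, X, N, N, N]
with_blocker = merge_grids(with_bypass, blocker)

print([node.name for node in with_bypass])
print([node.name for node in with_blocker])
```

A single ordering of node types keeps the merge independent of the order in which objects are added to the scene, which is convenient when the user adds and removes objects interactively.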


Figure 9.3: These two figures show the boundary conditions for a small part of the aorta. In the upper part the original aorta can be seen, while in the lower part a bypass has been added. The red color refers to the inlet condition and the green color to the outlet. Fluid nodes are colored blue, the wall nodes black, and the neglectable nodes outside the artery are depicted in grey. The triangulated aorta surface is shown in Figure 9.4.

Figure 9.4: This figure shows a small part of the aorta with streamlines visualizing the interior blood flow.


Data Layout

Since an artery is usually a branching structure with multiple and increasingly delicate branches, the fraction of fluid nodes within the bounding box ranges approximately between 5% and 15%. This fact allows a special form of optimization of the simulation kernel, which has been implemented as an additional option: it skips all non-fluid nodes and performs the Lattice-Boltzmann update on a list structure of fluid and boundary nodes only. Additionally, the network traffic between the supercomputer and an external visualization and steering terminal is reduced by transferring only these relevant nodes and their location within the simulation field. As stated in (Artoli, 2003), a Reynolds number of approximately 500 is sufficient for blood flow simulation within an abdominal aorta. To resolve this kind of flow situation for the exemplary artery shown in Figure 9.1, a moderately fine grid is sufficient. In this case the transferred data volume causes no bottleneck. The described aorta simulation could be run on an SGI Altix 3700, while the steering and visualization was performed on a laptop (Intel 2.13 GHz processor with ATI Mobility FireGL V5000). Interaction with the simulation and its response was, by subjective impression, swift and fluent.
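A list structure of fluid and boundary nodes can be sketched as follows. This is a generic illustration of indirect addressing on a sparse grid, not the iFluids kernel. The flag values, the two-dimensional field and the four-neighbour stencil are simplifying assumptions; a production Lattice-Boltzmann code would use a three-dimensional stencil such as D3Q19.

```python
# Node flags; names and values are illustrative, not taken from iFluids.
NEGLECTABLE, BOUNDARY, FLUID = 0, 1, 2

# Four axis-aligned neighbour directions keep the sketch short.
DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def build_node_list(flags):
    """Return the coordinates of all relevant nodes and a coordinate-to-slot map."""
    coords = [(x, y)
              for y, row in enumerate(flags)
              for x, flag in enumerate(row)
              if flag != NEGLECTABLE]
    slot = {c: i for i, c in enumerate(coords)}
    return coords, slot


def neighbour_table(coords, slot):
    """For every listed node, store the list slot of each neighbour (-1 if none)."""
    return [[slot.get((x + dx, y + dy), -1) for dx, dy in DIRECTIONS]
            for x, y in coords]


# A 6 x 6 field containing one horizontal vessel: wall, lumen, wall.
N, B, F = NEGLECTABLE, BOUNDARY, FLUID
flags = [[N] * 6, [N] * 6, [B] * 6, [F] * 6, [B] * 6, [N] * 6]

coords, slot = build_node_list(flags)
table = neighbour_table(coords, slot)

# The kernel loop touches only listed nodes; neglectable nodes are never visited.
visited = 0
for node, neighbours in enumerate(table):
    visited += 1  # a collide-and-stream step for `node` would go here

print(f"{visited} of {len(flags) * len(flags[0])} nodes updated")
```

The same `coords` list also gives the location information that has to accompany the node values when only relevant nodes are sent to the visualization client.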

Chapter 10

Summary

In this thesis the central aspects, problems, and principal approach to realizing a computational steering framework have been elaborated. In particular, the utilization of supercomputers for fluid simulations and of high-end visualization techniques such as immersive Virtual-Reality Environments for a remote visualization and steering front-end has been covered in detail. It has been shown that the iFluids computational steering framework enables a user to interact with the simulation during its execution without the need to interrupt or restart it. The main feature distinguishing it from all currently known computational steering approaches is its powerful interaction possibility with regard to the geometrical layout. It is possible to load arbitrary geometries exported from CAD systems or similar software from the file system without any special preparations. Other approaches, so far, support only predefined parameterized objects, if geometry can be changed significantly at all. Besides interacting with the geometry, the user can modify flow parameters and define new or change existing boundary conditions during runtime. For advanced users or benchmarking purposes even optimization options can be changed during runtime.

By using an integrated front-end for simultaneous visualization and steering, augmented through a Virtual-Reality Environment, the usage of this kind of interactive simulation tool becomes intuitive and allows the user to quickly gain insights even when studying complex flow phenomena.

The simulation kernel underlying the framework is based on the Lattice-Boltzmann method and has been optimized for two types of supercomputers that were available at the Leibniz Computing Center during the time of this thesis, namely the Hitachi SR8000 and the SGI Altix 3700¹. These machines represent the contrasting architectures of a pseudo-vector and a shared-memory system, respectively.

¹ The code has also been tuned for performance on the LRZ Linux Cluster, on which the application framework was mainly developed.

When running the simulation on the SGI Altix and the visualization on an external graphics workstation, both connected via Gigabit Ethernet, an updated data set could typically be received and visualized every 4 seconds. In comparison, the performance achievable with the SR8000 connected to a remote visualization was only about 25% of this value, even though the performance of the kernel running offline lay at about 60%. It turned out that the reason for this finding was a network bottleneck between visualization and computation, insofar as the Hitachi's outgoing network connection could only provide a maximum bandwidth of 230 MBits/sec.

The new system setup at the LRZ, consisting of an SGI Altix 4700 and a visualization system based on a SUN x4600 Multi-Core Opteron system connected through a dedicated 10 Gigabit Ethernet interface, promises an enormous performance gain for our computational steering application. Even so, data updates every 4 seconds already seem quite satisfactory compared to the usual waiting times of (several) minutes or even hours for general purpose commercial codes.

The applicability of the presented computational steering application has been tested by simulating real operating rooms modeled after two rooms at the Klinikum Rechts der Isar in Munich. These have been simulated with a very fine grid resolution to obtain a reference solution for comparison with interactive simulations on varying discretization grid sizes. It was found that already for moderately fine grids (e.g., a room of 6 m x 6 m x 3.5 m, discretized to one grid point every 3.5 cm) a good qualitative estimation of the flow can be made after 50 seconds (using a visualization laptop connected to the SGI Altix 3700 via Gigabit Ethernet). This enables an engineer to quickly test several setups and experiment with them interactively during the simulation within only a short amount of time. After examining the principal fluid behavior, a few carefully selected test cases can additionally be run in a more detailed offline simulation for quantitative analysis.
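Two of the figures quoted in this summary can be put into perspective with a short calculation: the size of the "moderately fine" room grid, and the ratio between the SR8000's outgoing bandwidth and a Gigabit Ethernet link. The sketch rests on two assumptions that are not stated in the text: the grid dimensions are obtained by rounding edge length over spacing, and the Gigabit link is taken at its nominal rate of 1000 MBits/sec.

```python
def grid_dimensions(room_cm, spacing_cm):
    """Number of grid points per edge, rounding edge length over spacing."""
    return tuple(round(edge / spacing_cm) for edge in room_cm)


# Room of 6 m x 6 m x 3.5 m with one grid point every 3.5 cm.
dims = grid_dimensions((600, 600, 350), 3.5)
total_points = dims[0] * dims[1] * dims[2]

# Nominal Gigabit Ethernet rate (assumption) versus the 230 MBits/sec quoted
# for the outgoing connection of the Hitachi SR8000.
GIGABIT_MBITS = 1000.0
SR8000_MBITS = 230.0
bandwidth_ratio = SR8000_MBITS / GIGABIT_MBITS

print(f"grid: {dims[0]} x {dims[1]} x {dims[2]} = {total_points} points")
print(f"SR8000 link / Gigabit link: {bandwidth_ratio:.0%}")
```

Under these assumptions the room grid has just under three million points, and the bandwidth ratio of 23% is of the same order as the reported performance of about 25%, which is consistent with the network being the limiting factor.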
To evaluate the flexibility of the iFluids computational steering framework it has been applied exemplarily to a simplified artery simulation, where the user is able to (partially) block an artery or add bypasses during simulation runtime. After only a few adaptations interactive simulations could be performed for adequate grid resolutions using the SGI Altix 3700 and a laptop (Intel 2.13 GHz processor with ATI Mobility FireGL V5000) for visualization.
It is hoped that this thesis has shown that a computational steering application as developed within this work can be a valuable enrichment for engineers during the construction design phase. However, there are still some open requirements with respect to computational steering. To be able to use a computational steering application like iFluids in real life on the supercomputer or cluster, the corresponding necessary resources on the machine must be available for exclusive interactive access during an engineer's working time. Especially in the case of a concurrent collaborative session, it is indispensable that the required resources are available at the time of appointment. This could be (and partially is already) realized through an enhanced scheduling system offering the possibility of reserving resources for a certain day and time of an appointment. This feature is referred to as advance reservation which, understandably, is only hesitantly used in computing centers and not made publicly available or accessible to the general user.

Furthermore, when working with large computational grids and correspondingly large visualization data sets, the communication between computation, visualization and steering front-end can become a bottleneck, especially between remote sites. To counter this network bottleneck, some sort of middleware would be needed which supports (remote) visualization on specialized hardware that is connected to the supercomputer with sufficient bandwidth and transfers precomputed visualization data to the client, perhaps as a simple video stream. Two final requirements in this respect would be that Virtual-Reality Environments with their special types of input device and collaborative engineering with independent views onto the simulated scene are also supported. First products or projects in this direction (SGI OpenVizServer, IBM, HP, SUN, ..., VNC) have started and are already becoming available. However, they currently do not yet address these requirements sufficiently.
Future developments of iFluids will investigate these current techniques of remote visualization in more detail. It will be interesting to find out how they could help to further improve the computational steering experience or at least partially solve the issues mentioned.

To extend the field of application for indoor simulations, the computational kernel of iFluids will be improved by using a more detailed physical model. In particular, this effort will be continued within the research project ComfSim (SIEMENS AG, 2006), which aims at developing an interactive CFD environment allowing an analysis of local thermal comfort by utilizing high-performance supercomputing facilities and Virtual-Reality techniques.

Bibliography

Abrams, M., Allison, D., Kafura, D., Ribbens, C., Rosson, M. B., Shaffer, C., and Watson, L. http://research.cs.vt.edu/pse/intro.html (2007).

Akenine-Moeller, T. Fast 3D triangle-box overlap testing. Journal of Graphics Tools, Vol. 6(1): pp. 29–33 (2001).

Akenine-Moeller, T. http://www.cs.lth.se/home/Tomas_Akenine_Moller/ (2007).

Allard, J. and Raffin, B. Distributed physical based simulations for large VR applications. In VR 06: Proceedings of the IEEE Virtual Reality Conference (VR 2006), p. 12. IEEE Computer Society, Washington, DC, USA (2006). ISBN 1-4244-0224-7. doi:http://dx.doi.org/10.1109/VR.2006.53.

answers.com. http://www.answers.com/topic/risc (2007).

Artoli, A. M. Mesoscopic Computational Haemodynamics. PhD thesis, Universiteit van Amsterdam (2003).

AVS Inc. http://www.avs.com (2007).

Bella, G., Filippone, S., Rossi, N., and Ubertini, S. Using OpenMP on a hydrodynamic Lattice-Boltzmann code. In Proceedings of EWOMP 2002 (2002).

Bellemann, R. Interactive Exploration in Virtual Environments. PhD thesis, University of Amsterdam (2003).

Benzi, R., Succi, S., and Vergassola, M. The lattice Boltzmann equation: theory and applications. Physics Reports, Vol. 222: pp. 145–197 (1992).

Bernsdorf, J., Harrison, S. E., Smith, S. M., Lawford, P. V., and Hose, D. R. Numerical simulation of clotting processes: a lattice Boltzmann application in medical physics. Math. Comput. Simul., Vol. 72(2-6): pp. 89–92 (2006). ISSN 0378-4754.

Bhatnagar, P., Gross, E., and Krook, M. A model for collision processes in gases. Physical Review, Vol. 94(3): pp. 511–525 (1954).

Biermann, G. and Kalze, F.-J. Helios - computer aided lighting, the path from simulation to prototype. In 29th ISATA Conference. Florence, Italy (1996). ISBN 0-7803-8431-8.


Borrmann, A. Computerunterstützung verteilt-kooperativer Bauplanung durch Integration interaktiver Simulationen und räumlicher Datenbanken. PhD thesis, Lehrstuhl für Bauinformatik, TU München (2007).

Borrmann, A., Wenisch, P., van Treeck, C., and Rank, E. Collaborative computational steering: Principles and application in HVAC layout. Integrated Computer-Aided Engineering (ICAE), Vol. 13(4): pp. 361–376 (2006).

Brodlie, K., Wood, J., Duce, D., and Sagar, M. gViz: Visualization and computational steering on the grid. In UK e-Science All Hands Meeting, pp. 54–60 (2004). ISBN 1-904425-21-6.

Brooke, J. E., Coveny, P. V., Harting, J., Jha, S., Pickles, S. M., Pinning, R. L., and Porter, A. R. Computational steering in RealityGrid. In UK e-Science All Hands Meeting (2003).

Bryson, S. and Levit, C. The virtual windtunnel. Computer Graphics and Applications, Vol. 12(4): pp. 25–34 (1992). ISSN 0272-1716.

Chen, H., Chen, S., and Matthaeus, W. H. Recovery of the Navier-Stokes equations using a lattice-gas Boltzmann method. Physical Review A, Vol. 45: pp. 5339–42 (1992).

Chen, S. and Doolen, G. D. Lattice Boltzmann method for fluid flows. Annual Review of Fluid Mechanics, Vol. 30: pp. 329–364 (1998).

ConceptCar. http://features.conceptcar.co.uk/psa-design-centre/design-centre-2.php (2007).

Covise. http://www.hlrs.de/organization/vis/covise (2007).

Crouse, B. Lattice-Boltzmann Strömungssimulationen auf Baumdatenstrukturen. PhD thesis, Lehrstuhl für Bauinformatik, TU München (2003).

Crouse, B., Krafczyk, M., Tölke, J., and Rank, E. A LB-based approach for adaptive flow simulations. International Journal of Modern Physics B, Vol. 17(1-2): pp. 109–112 (2003).

Diederichs, C. J. Kostensicherheit im Hochbau. Deutscher Consulting Verlag, Essen (1984).

Domain Decompostion. http://www.ddm.org (2007).

Donath, S. On Optimized Implementations of the Lattice Boltzmann Method on Contemporary Architectures. Bachelor's Thesis (2004). Supervisors: Wellein, Hager, Deserno, Zeiser.

Dowd, K. and Severance, C. High Performance Computing. O'Reilly & Associates, Inc., Sebastopol, CA, USA (1998). ISBN 156592312X.


Eberly, D. 3D game engine design: a practical approach to real-time computer graphics. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2000). ISBN 1-55860-593-2.

Exa Corporation. http://www.exa.com (2007).

Freudiger, S., Hegewald, J., and Krafczyk, M. A parallelization concept for a multi-physics lattice Boltzmann prototype based on hierarchical grids (submitted 2007).

Frisch, U., Hasslacher, B., and Pomeau, Y. Lattice-gas automata for the Navier-Stokes equation. Physical Review Letters, Vol. 56: pp. 1505–1508 (1986).

Georgii, J. and Westermann, R. Interactive simulation and rendering of heterogeneous deformable bodies. In Vision, Modeling and Visualization 2005 (2005).

Flensburger Schiffbau Gesellschaft. http://www.fsg-ship.de (2007).

Gibson, S. F. Beyond volume rendering: Visualization, haptic exploration, and physical modeling of voxel-based objects. In Visualization in Scientific Computing 95, pp. 10–24. Springer-Verlag, New York (1995).

Ginzburg, I. and Steiner, K. Lattice Boltzmann model for free-surface flow and its application to filling process in casting. J. Comput. Phys., Vol. 185(1): pp. 61–99 (2003). ISSN 0021-9991.

Gottschalk, S., Lin, M. C., and Manocha, D. OBBTree: a hierarchical structure for rapid interference detection. In SIGGRAPH 96: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, pp. 171–180. ACM Press, New York, NY, USA (1996). ISBN 0-89791-746-4.

Götz, J. Numerical Simulation of Bloodflow in Aneurysms using the Lattice Boltzmann Method. Master's thesis, Lehrstuhl für Informatik 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg (2006).

Gropp, W. and Thakur, R. An evaluation of implementation options for MPI one-sided communication. In Lecture Notes in Computer Science: Recent Advances in Parallel Virtual Machine and Message Passing Interface, Vol. 3666, pp. 415–424. Springer (2005).

Hager, G., Deserno, F., and Wellein, G. Pseudo-vectorization and RISC optimization techniques for the Hitachi SR8000 architecture. In High-Performance Scientific and Engineering Computing, Munich 2003 (2003). ISBN 3-540-00474-2.

Haines, E. and Wallace, J. Shaft culling for efficient ray-traced radiosity. In Eurographics Workshop on Rendering. Springer-Verlag, Berlin, Germany (1994).

Hartmann, H. Detailed Simulations of Liquid and Solid-Liquid Mixing: Turbulent agitated flow and mass transfer. PhD thesis, Technische Universiteit Delft (2005).


Haumont, D. and Warzee, N. Complete polygonal scene voxelization. Journal of Graphics Tools, Vol. 7(3): pp. 27–41 (2002).

Haydock, D. and Yeomans, J. M. Lattice Boltzmann simulations of attenuation-driven acoustic streaming. J. Phys. A: Math. Gen., Vol. 36: pp. 5683–5694 (2003).

He, X., Chen, S., and Zhang, R. A lattice Boltzmann scheme for incompressible multiphase flow and its application in simulation of Rayleigh-Taylor instability. J. Comput. Phys., Vol. 152(2): pp. 642–663 (1999). ISSN 0021-9991.

He, X., Zou, Q., Luo, L.-S., and Dembo, M. Analytic solutions of simple flows and analysis of nonslip boundary conditions for the lattice Boltzmann BGK model. Journal of Statistical Physics, Vol. 87(1-2): pp. 115–136 (1997). ISSN 0022-4715 (Print) 1572-9613 (Online).

Heinzlreiter, P. Interactive result visualization on the grid. In Grid Computing for Complex Problems, pp. 20–21. VEDA, VEDA (2005). ISBN 80-969202-1-9.

Hella KG. http://www.hella.com (2007).

Hirabayashi, M., Ohta, M., Rüfenacht, D. A., and Chopard, B. Characterization of flow reduction properties in an aneurysm due to a stent. Phys. Rev. E, Vol. 68(2): p. 021918 (2003).

intel. http://www.intel.com/software/products/cluster/tcollector/index.htm (2007).

Johnson, C. R. and Parker, S. G. A computational steering model for problems in medicine. In Supercomputing 94, pp. 540–549. IEEE Press (1994).

Johnson, C. R., Parker, S. G., Hansen, C. D., Kindlmann, G., and Livnat, Y. Interactive simulation and visualization. IEEE Computer, Vol. 32(12): pp. 59–65 (1999).

Jones, M. W. The production of volume data from triangular meshes using voxelisation. Computer Graphics Forum, Vol. 15(5): pp. 311–318 (1996).

Krafczyk, M. Die Gitter-Boltzmann-Methode: Von der Theorie zur Anwendung. Habilitation thesis, Lehrstuhl für Bauinformatik, TU München (2001).

Keller, R. Personal communication (2005).

Kipfer, P. and Westermann, R. Realistic and interactive simulation of rivers. In GI 06: Proceedings of the 2006 conference on Graphics interface, pp. 41–48. Canadian Information Processing Society, Toronto, Ont., Canada (2006). ISBN 1-56881-308-2.

Kolb, A. and John, L. Volumetric model repair for virtual reality applications. pp. 249–256. EUROGRAPHICS (2001).


Kollinger, M. Definition strömungsmechanischer Randbedingungen für interaktive CFD Simulationen. Diploma thesis, Lehrstuhl für Bauinformatik, TU München (2007).

Krumhauer, P., Tsygankov, M., Reich, C., and Evgrafov, A. Efficient volume rendering using octree space subdivision. In Visual Data Exploration and Analysis VI, Vol. 3643, pp. 211–219. The International Society for Optical Engineering (1999).

Kühner, S. Virtual Reality-basierte Analyse und interaktive Steuerung von Strömungssimulationen im Bauwesen. PhD thesis, Lehrstuhl für Bauinformatik, TU München (2003).

Kühner, S., Rank, E., and Krafczyk, M. Efficient reduction of 3D simulation results based on spacetree data structures for data analysis in Virtual Reality environments. In Applied Virtual Reality in Engineering and Construction. Goteborg, Sweden (2001).

Lallemand, P. and Luo, L.-S. Theory of the lattice Boltzmann method: Acoustic and thermal properties in two and three dimensions. Physical Review E, Vol. 68(036706) (2003).

Lanfear, T. SR8000 concept. http://research.ac.upc.edu/HPCseminar/SEM9900/SR8000concept.ppt (2000).

Leibniz Rechenzentrum München. http://www.lrz-muenchen.de (2007).

van Liere, R., Mulder, J. D., and van Wijk, J. J. In, pp. 696–702 (1996).

Luecke, G. and Wang, Y. Sending non-contiguous data in MPI programs. Technical Report, Iowa State University (2005).

Marcheix, L. A 3D User Interface for a Virtual Environment. Diploma thesis, Lehrstuhl für Bauinformatik, TU München (2004).

McCormick, B. H., DeFanti, T. A., and Brown, M. D. Special issue on visualization in scientific computing. Computer Graphics, Vol. 21(6) (1987).

Mercury Computer Systems, Inc. http://www.amiravis.com (2007a).

Mercury Computer Systems, Inc. http://www.tgs.com (2007b).

METIS. http://glaros.dtc.umn.edu/gkhome/views/metis (2007).

Mezrhab, A., Bouzidi, M., and Lallemand, P. Hybrid lattice-Boltzmann finite-difference simulation of convective flows. Computers and Fluids, Vol. 33: pp. 623–641 (2004).

Möller, T. and Haines, E. Real-time rendering. A. K. Peters, Ltd., Natick, MA, USA (1999). ISBN 1-56881-101-2.


MPI-Forum. http://www.mpi-forum.org/docs/docs.html (2007).

MPICH-G2. http://www3.niu.edu/mpi (2007).

Mulder, J. D., van Wijk, J. J., and van Liere, R. A survey of computational steering environments. Future Gener. Comput. Syst., Vol. 15(1): pp. 119–129 (1999). ISSN 0167-739X. doi:http://dx.doi.org/10.1016/S0167-739X(98)00047-8.

Mundani, R.-P. Hierarchische Geometriemodelle zur Einbettung verteilter Simulationsaufgaben. PhD thesis, Technische Universität München (2006).

Neuhierl, B. Mehrfeldsimulation von Strömungsvorgängen, strömungsakustischen Phänomenen und Schallwellenausbreitung mit der Lattice-Boltzmann-Methode (2006). Invited talk, chair seminar, Lehrstuhl für Bauinformatik, TU München.

Noll, B. Numerische Strömungsmechanik: Grundlagen. Springer-Verlag (1993). ISBN 3-540-56712-7.

Norman, D. A. The Psychology of Everyday Things. Basic Books, New York (1988). ISBN 0-465-06709-3.

Pacheco, P. S. Parallel programming with MPI. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1996). ISBN 1-55860-339-5.

Pacx-MPI. http://www.hlrs.de/organization/pds/projects/pacx-mpi (2007).

Parker, S. G., Miller, M., Hansen, C. D., and Johnson, C. R. Computational steering and the SCIRun integrated problem solving environment. In Dagstuhl 97, Scientific Visualization, pp. 257–266. IEEE Computer Society, Washington, DC, USA (1999). ISBN 0-7695-0505-8.

Pickles, S. M., Haines, R., Pinning, R. L., and Porter, A. R. Computational steering in RealityGrid. In UK e-Science All Hands Meeting (2004). ISBN 1-904425-21-6.

Pohl, T., Thürey, N., Deserno, F., Rüde, U., Lammers, P., Wellein, G., and Zeiser, T. Performance evaluation of parallel large-scale Lattice Boltzmann applications on three supercomputing architectures. In Proceedings of the IEEE/ACM SC2004 Conference (Supercomputing Conference 04, Pittsburgh, 06.-12.11.2004), pp. 1–13 (2004). ISBN 0-7695-2153-3.

Qian, Y.-H., d'Humières, D., and Lallemand, P. Lattice BGK model for Navier-Stokes equation. Europhysics Letters, Vol. 17: pp. 479–484 (1992).

Renambot, L., Bal, H. E., Germans, D., and Spoelder, H. J. W. CAVEStudy: An infrastructure for computational steering and measuring in virtual reality environments. Cluster Computing, Vol. 4(1): pp. 79–87 (2001).

Sanders, M. S. and McCormick, E. J. Human Factors in Engineering and Design. McGraw-Hill (1993). ISBN 0-07-054901-X.


Satofuka, N. and Nishioka, T. Parallelization of lattice Boltzmann method for incompressible flow computations. Computational Mechanics, Vol. 23: pp. 164–171 (1999).

Schönung, B. E. Numerische Strömungsmechanik: Inkompressible Strömungen mit komplexen Berandungen. Springer-Verlag (1990). ISBN 3-540-53137-8.

Schulz, M., Krafczyk, M., Tölke, J., and Rank, E. Parallelization strategies and efficiency of CFD computations in complex geometries using Lattice-Boltzmann methods on high-performance computers. In High-Performance Scientific and Engineering Computing, Proceedings of the 3rd International FORTWIHR Conference on HPSEC, pp. 115–122 (2002).

University of Amsterdam: Section Computational Science. http://www.science.uva.nl/research/scs/index.html (2007).

Seidenschwarz, W. Nie wieder zu teuer!: 10 Schritte zum Marktorientierten Kostenmanagement. Schäffer-Poeschel Verlag, Stuttgart (1997). ISBN 3-7910-1019-0.

Shan, X. and Chen, H. Lattice Boltzmann model for simulating flows with multiple phases and components. Phys. Rev. E, Vol. 47(3): pp. 1815–1819 (1993).

SIEMENS AG. Interaktive, lokale Komfortsimulation in einer VR-Umgebung. Pictures of the Future, issue spring 2006 (2006).

Sloot, P. M. A., Tirado-Ramos, A., Hoekstra, A. G., and Bubak, M. An interactive grid environment for non-invasive vascular reconstruction. In 2nd International Workshop on Biomedical Computations on the Grid (BioGrid 04) in conjunction with Fourth IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2004). Chicago, Illinois, USA (2004). ISBN 0-7803-8431-8.

Standish, R. Introduction to high performance computing. http://www.ac3.edu.au/edu/hpc-intro/ (2006).

Stolte, N. and Kaufman, A. Novel techniques for robust voxelization and visualization of implicit surfaces. Graph. Models, Vol. 63(6): pp. 387–412 (2001). ISSN 1524-0703.

Succi, S. The Lattice Boltzmann Equation for Fluid Dynamics and Beyond. Oxford University Press (2001). ISBN 0-19-850398-9.

Succi, S., Amati, G., and Benzi, R. Challenges in lattice Boltzmann computing. Journal of Statistical Physics, Vol. 81: pp. 5–16 (1995).

Sudhir, A. and Kesavadas, T. Computational steering of manufacturing steering using virtual reality. In ICRA 00: International Conference on Robotics and Automation, Vol. 3, pp. 2654–2658. San Francisco, CA, USA (2000). ISBN 0-7803-5886-4.


Tamaki, Y., Sukegawa, N., Ito, M., Tanaka, Y., Fukagawa, M., Sumimoto, T., and Ioki, N. Node architecture and performance evaluation of the Hitachi super technical server SR8000. In 12th International Conference on Parallel and Distributed Computing Systems, pp. 487–493 (1999). ISBN 1-880843-9-3.

The Stanford 3D Scanning Repository. http://graphics.stanford.edu/data/3Dscanrep (2007).

Thürey, N. Physically based Animation of Free Surface Flows with the Lattice Boltzmann Method. PhD thesis, University of Erlangen-Nuremberg (2007).

Thürey, N. and Rüde, U. Free surface Lattice-Boltzmann fluid simulations with and without level sets. In MPI for Computer Science: VMV04 Proceeding, pp. 199–208. Max Planck Center for Visual Computing and Communication (2004).

Thürey, N. and Rüde, U. Technical report on turbulent free surface flows with the Lattice Boltzmann method on adaptively coarsened grids. Technical Report, Lehrstuhl für Informatik 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg (2005).

Thürey, N., Rüde, U., and Körner, C. Interactive free surface fluids with the Lattice Boltzmann method. Technical Report, Lehrstuhl für Informatik 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg (2005).

Tölke, J. Gitter-Boltzmann-Verfahren zur Simulation von Zweiphasenströmungen. PhD thesis, Lehrstuhl für Bauinformatik, TU München (2001).

Tölke, J. Modellvergleich Surfwelle: Lattice-Boltzmann Methoden. In Berichte des Lehrstuhls und der Versuchsanstalt für Wasserbau und Wasserwirtschaft, 104, p. 260 (2006).

Tölke, J., Freudiger, S., and Krafczyk, M. An adaptive scheme using hierarchical grids for lattice Boltzmann multi-phase flow simulations. Computers and Fluids, Vol. 35(8-9): pp. 820–830 (2006).

TOP500. http://www.top500.org (2007).

van Treeck, C. Gebäudemodell-basierte Simulation von Raumluftströmungen. PhD thesis, Lehrstuhl für Bauinformatik, Technische Universität München (2004).

van Treeck, C., Rank, E., Krafczyk, M., Toelke, J., and Nachtwey, B. Extension of a hybrid thermal LBE scheme for Large-Eddy simulations of turbulent convective flows. Computers and Fluids, Vol. 35(8-9): pp. 863–871 (2006).

van Treeck, C., Wenisch, P., Borrmann, A., Pfaffinger, M., Wenisch, O., and Rank, E. ComfSim - Interaktive Simulation des thermischen Komforts in Innenräumen auf Höchstleistungsrechnern. Bauphysik, Vol. 29(1): pp. 2–7 (2007).


Wellein, G., Lammers, P., Hager, G., Donath, S., and Zeiser, T. Towards optimal performance for lattice Boltzmann applications on terascale computers. In Parallel Computational Fluid Dynamics: Theory and Applications, Proceedings of the 2005 International Conference on Parallel Computational Fluid Dynamics, pp. 31–40 (2006).

White III, J. B. and Bova, S. W. Where's the overlap? An analysis of popular MPI implementations. In MPI Developers Conference (1999).

wikipedia. http://en.wikipedia.org/wiki/Computational_fluid_dynamics (2007a).

wikipedia. http://en.wikipedia.org/wiki/Lattice_Boltzmann (2007b).

wikipedia. http://en.wikipedia.org/wiki/STL_(file_format) (2007c).

wikipedia. http://en.wikipedia.org/wiki/Symmetric_multiprocessing (2007d).

Wilke, J., Pohl, T., Kowarschik, M., and Rüde, U. Cache Performance Optimizations for Parallel Lattice Boltzmann Codes. Lecture Notes in Computer Science (LNCS) Vol. 2790. In Proc. of the EuroPar-03 Conf., pp. 441–450. Springer (2003).

Wolf-Gladrow, D. A. Lattice-Gas Cellular Automata and Lattice Boltzmann Models: An Introduction. Springer-Verlag (2000). ISBN 3-540-66973-6.

Wössner, U., Becker, M., and Lang, U. Tangible interfaces for interactive flow simulation. In The 2nd Russian-German Advanced Research Workshop on Computational Science and High Performance Computing (2005).

Zou, Q. and He, X. On pressure and velocity boundary conditions for the lattice Boltzmann BGK model. Physics of Fluids, Vol. 9(6): pp. 1591–1598 (1997).

Acknowledgments


Thisthesisemergedduringmyworkasascientificassistantatthechairfor
BauinformatikattheTUM¨unchenandincooperationwiththeLeibnizComput-
ingCenter,Munich.MyresearchprojectwasfundedbytheCompetenceNet-
workforTechnical,ScientificHigh-PerformanceComputinginBavaria(KON-
WIHR).Inaddition,theHPCEuropeprogramsupportedaresearchvisitofsix
weekstotheUniversityofAmsterdam,Netherlands.

Firstofall,IwouldliketothankmypromotorProf.ErnstRankwhogaveme
theopportunitytoworkonthetopicofcomputationalsteeringandwritingmy
tivethesisatdiscussionhisinstitute.partnerHeandhasalwayssupportedbeenmeabygivingchallengingmetheand,frthereedomefore,toconstrintensifyuc-
myhavinginterbeenestsinabletovariousvisitarmanyeas.Ialsointernationalwanttoexprconferessences,mytogratitudemeetnumerwithrousegardinterto-
field.estingBesidespeople,randeseartoch,keepProf.myselfRankgaveupdatedmetortheecentopportunitydevolpmentstoinfurthermyreseardevelopch
ormyselfganizingwithanrespEliteecttoMastersmyprteachingogramandabilites,severalsupervisionotherdutiesoftostudentssmoothasthewellwayas
toIaowesuccessfulmanyandthanksefficienttoProf.PostdocUlrichperiod.R¨ude,mythesissecondexaminer.He
inspiredme—especiallyinthelastonetotwoyearsofmywork—inwhich
dirintoectioncomputerIwouldsciencelikertoeseardevelopch,whichmyself.hasHebecomealsoagavemainmeiaspectnterofestingmyinsightsthesis.
Inparticular,IwouldliketothankhimforarrangingthecontacttoProf.Peter
SlootsgroupinAmsterdam.
forIalsohostingwantmetoduringexpressmymyreseargratitudechvisittoinProf.AmsterPeterdamSlootatPrandof.AlfoSlootsnsHoekstraInstitute
forthingsfromComputationalanumberScienceofinterandestingforgivingpeopleinmehisthegroup.opportunitytolearnsomany
IlearningwouldliketoLattice-Boltzmannthankallthetheory.peopleForemost,whoIneedsupportedtorefermetoduringAlfonsthetimeHoekstra,of
leftwhowasjustexplainedalgebrato.meJosthewholeDerksen,storyJonasoftheTolke,LBMandtheoryThomas—andZeisereverythinghavealsothatbeenwas
veryForpatienttheirveryandkindhelpfulinsupportofansweringmymywork,whichquestions.causedextraeffortespecially
duetotheinteractivewayofusingsupercomputers,IwouldliketothankOliver
WMatthiasenisch,BrehmReinholdattheBader,LeibnizHelmutComputHellering,IreneCenterinGeiseler,Munich,LeonharasdwellasScheckWillemand
Vermin,LauraLeistikovandHuubStoffersattheSaraComputingCenterinAm-
sterCenterdaminandErlangen.ThomasIalsoZeiser,wantGeortogthankHagerT,andimothyGerharLanfeardWforelleintheattheinsightandComputingdis-
cussionsonHitachissupercomputerfeaturesandmanycluesonhowtoprogram
efficentcodeonthismachine.
tar,ILaurhaveentalwaysMarcheixenjoyedNikolaworkingCenic,withmyChristianstudents:Liefhold,DanielandAlfrMichaeleider,MihaKollingerGan-,

118

Acknowledgments

andIamthankfulfortheirsupportinteachingandimplementing.
mospherThankseattoourallmyinstitute.colleguesIespwhoeciallywerewantrtoesponsibleemphaziseforthethetruefriendlyfriendshipworkingIen-at-
likejoyedto(andexprstillesstoenjoy)ChristophwithmyvanrTroommateeeck,UliwhohasHeisserer.supportedSpecialmethanksasmyIgrwouldoup
leaderandhassharedhisexperienceinrunningLBMsimulations.
atTtheoenableKlinikumtherechtsmodelingderandIsar,rRainerunningofBurthegkartleadsimulationsmethroftheoughtheoperatingroomsroomsand
demonstratedtheinportanceoftheinstalledventilationsystems.Iwouldliketo
thankAlthoughhimfortherhisehaveenthusiasm,beengrmanyeatpeoplesupport,whoandhissupportedinterestme,inmythemostwork.impor-
tantbackinghasbeenmyhusbandOliver.Hewasmymaincooperationpartner
atprtheoblemsLeibnizandtotakeComputingcareofCenterthe,newestalwaysonsoftwartheespotupdates.tohelpHemealsowithsupportedhardwarmee
ingduringandconstrimplementing,uctivelectorwas.anexcellentdiscussionpartnerandmymostdiscern-

List of Publications


1. Wright, H., Crompton, R. H., Kharche, S., and Wenisch, P.: Steering and Visualization: Enabling Technologies for Computational Science. Future Generation Computer System (submitted).
2. Wenisch, P., van Treeck, C., Scheck, L., and Rank, E.: Computational Steering: Interactive Flow Simulation in Civil Engineering. Inside, Vol. 5(2) (2007).
3. van Treeck, C., Wenisch, P., Borrmann, A., Pfaffinger, M., Egger, M., and Rank, E.: Computational Steering of Thermal Comfort Perception. In 2nd GACM Colloquium on Computational Mechanics. TUM, Munich, Germany (2007).
4. van Treeck, C., Wenisch, P., Borrmann, A., Pfaffinger, M., Egger, M., and Rank, E.: Utilizing High Performance Supercomputing Facilities for Interactive Thermal Comfort Assessment. In Proc. 10th Int. IBPSA Conference Building Simulation. Beijing, China (2007).
5. Wenisch, P., van Treeck, C., Borrmann, A., Rank, E., and Wenisch, O.: Computational Steering on Distributed Systems: Indoor Comfort Simulations as a Case Study of Interactive CFD on Supercomputers. International Journal of Parallel, Emergent and Distributed Systems, Vol. 22(4): pp. 275–291 (2007).
6. Wenisch, P.: Interactive Fluid Simulations: Computational Steering on Supercomputers. In Science and Supercomputing in Europe - report 2006: pp. 453–460 (2007). ISBN 978-88-86037-19-8.
7. van Treeck, C., Wenisch, P., Borrmann, A., Pfaffinger, M., Wenisch, O., and Rank, E.: Comfsim interaktive Simulation des thermischen Komforts in Innenräumen auf Höchstleistungsrechnern. Bauphysik, Vol. 29(1): pp. 2–7 (2007).
8. van Treeck, C., Wenisch, P., Borrmann, A., Wenisch, O., Kuehner, S., Toelke, J., Krafczyk, M., and Rank, E.: Computational Steering of Lattice-Boltzmann based CFD Simulations in Virtual Reality (hlrbi). In Research projects HLRB I (Hitachi SR8000) (2006).
9. van Treeck, C., Wenisch, P., Borrmann, A., Pfaffinger, M., Egger, M., Wenisch, O., and Rank, E.: Towards Interactive Indoor Thermal Comfort Simulation. In Proceedings of ECCOMAS CFD 06, European Conf. on Computational Fluid Dynamics, Egmond aan Zee, The Netherlands (2006).
10. van Treeck, C., Wenisch, P., Borrmann, A., Pfaffinger, M., Wenisch, O., and Rank, E.: Comfsim interaktive Simulation des thermischen Komforts in Innenraeumen auf Hoechstleistungsrechnern. In Tagungsband BauSIM 2006, pp. 205–207. IBPSA Germany, München, Germany (2006). ISBN 978-3-00-019823-6.
11. Borrmann, A., Wenisch, P., van Treeck, C., and Rank, E.: Collaborative Computational Steering: Principles and Application in HVAC Layout. Integrated Computer-Aided Engineering (ICAE), Vol. 13(4): pp. 361–376 (2006).


12. Wenisch, P., Wenisch, O., and Rank, E.: Harnessing High-Performance Computers for Computational Steering. In Lecture Notes in Computer Science: Recent Advances in Parallel Virtual Machine and Message Passing Interface, Vol. 3666, pp. 536–543. Springer (2005).
13. Borrmann, A., Wenisch, P., van Treeck, C., and Rank, E.: Collaborative HVAC Design using Interactive Fluid Simulations: A geometry-focused Collaboration Platform. In Proceedings of the 12th International Conference on Concurrent Engineering. Fort Worth, Texas, USA (2005).
14. Wenisch, P., Borrmann, A., Rank, E., van Treeck, C., and Wenisch, O.: Collaborative and Interactive CFD Simulation using High Performance Computers. In 18th Symposium AG Simulation (ASIM) and EuroSim, pp. 145–151. SCS Publishing-House e.V. Erlangen, Erlangen, Germany (2005). ISBN 3-936150-41-9.
15. Rank, E., Borrmann, A., Duester, A., Niggl, A., Nuebel, V., Romberg, R., Scholz, D., van Treeck, C., and Wenisch, P.: From Adaptivity to Computational Steering: The long Way of Integrating Numerical Simulation into Engineering Design Processes. In ADMOS 2005. CIMNE, Barcelona, Spain (2005).
16. Wenisch, P., Wenisch, O., and Rank, E.: Optimizing an Interactive CFD Simulation on a Supercomputer for Computational Steering in a Virtual Reality Environment. In High Performance Computing in Science and Engineering, pp. 83–93. Springer (2005).
17. Borrmann, A., Wenisch, P., van Treeck, C., and Wenisch, O.: Eine verteilte Architektur fuer synchrones kooperatives Arbeiten mit einer interaktiven Stroemungssimulation. In 16. Forum Bauinformatik. Shaker Verlag, Aachen, Germany (2004). ISBN 3-8322-3233-8.
18. Wenisch, P., van Treeck, C., and Rank, E.: Interactive Indoor Air Flow Analysis using High Performance Computing and Virtual Reality Techniques. In 9th International Conference on Air Distribution in Rooms (RoomVent 2004). Coimbra, Portugal (2004).
19. Wenisch, P. and Wenisch, O.: Fast Octree-based Voxelisation of 3D Boundary Representation-Objects. Technical Report, Lehrstuhl fuer Bauinformatik, Technische Universität München (2004).
20. Wenisch, P.: Kopplung von Hochleistungsrechner und Virtueller Realitaet. KONWIHR Quartl, Vol. 37 (2003).
21. Hardt, P., Kuehner, S., Rank, E., and Wenisch, O.: Interactive CFD Simulations by Coupling Supercomputers with Virtual Reality. Inside, Vol. 1(2): pp. 12–13 (2003).
22. Hardt, P., Kuehner, S., Wenisch, O., and Rank, E.: Interactive CFD Simulation by Coupling Supercomputers with Virtual Reality. In High Performance Computing in Science and Engineering. Springer (2004). ISBN 3-540-44326-6.

23. Kuehner, S., Hardt, P., Krafczyk, M., and Rank, E.: Computational Steering of a Lattice-Boltzmann based CFD-solver in Virtual Reality. In Conference on Construction Applications of Virtual Reality. Virginia, USA (2003).
24. Hardt, P. and Crouse, B.: Präprozessor für einen computergestützten Windkanal. In 14. Forum Bauinformatik, Fortschritt-Berichte Vol. 4, pp. 165–172. VDI Verlag, Bochum, Germany (2002). ISBN 3-18-318104-5.
25. Hardt, P.: Entwicklung eines Moduls zur Definition von stroemungsmechanischen Randbedingungen als Attribute von 3D CAD-Geometrien. Diploma thesis, Lehrstuhl fuer Bauinformatik, TU Muenchen (2001).