High Performance Computing on GPUs
Mark Silberstein, Technion
Slides include some material from the GPGPU tutorial at SIGGRAPH 2007:

Outline
● Motivation
● Stream programming
– Simplified HW and SW model
– Simple GPU programming example
● Increasing stream granularity
– Using shared memory
– Matrix multiplication
● Improving performance
● A real-life example

Disclaimer
This lecture will discuss GPUs from the parallel computing perspective,
since I am NOT an expert in graphics hardware.

Why GPUs-II

Is it a miracle? NO!
● Architectural solution prefers parallelism over
single thread performance!
● Example problem – I have 100 apples to eat
1) “high performance computing” objective: optimize
the time of eating one apple
2) “high throughput computing” objective: optimize
the time of eating all apples
● The 1st option has been exhausted!!!
● Performance = parallel hardware + scalable
parallel program!
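
The apple example can be made concrete with a little arithmetic. The sketch below is only an illustration: the function name `time_to_eat_all` and all the numbers (10 s and 30 s per apple, 50 eaters) are made-up assumptions, not figures from the lecture. It compares one fast eater against many slower eaters working in parallel.

```python
import math


def time_to_eat_all(n_apples: int, n_eaters: int, seconds_per_apple: float) -> float:
    """Total time when n_eaters work in parallel, each eating one apple at a time."""
    rounds = math.ceil(n_apples / n_eaters)  # how many apples the busiest eater gets
    return rounds * seconds_per_apple


# Hypothetical numbers, chosen only for illustration.
# Option 1: a single eater optimized for the time of eating one apple.
fast_single = time_to_eat_all(100, n_eaters=1, seconds_per_apple=10.0)
# Option 2: many eaters, each three times slower per apple.
slow_parallel = time_to_eat_all(100, n_eaters=50, seconds_per_apple=30.0)

print(fast_single)    # 1000.0
print(slow_parallel)  # 60.0
```

Under these assumed numbers each individual apple takes longer in the parallel case, yet all 100 apples are finished much sooner, which is the throughput objective.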

Why not in CPUs?
● Not applicable to general purpose computing
● Complex programming model
● Still immature
– Platform is a moving target
● Vendor-dependent architectures
● Incompatible architectural changes from generation to generation
– Programming model is vendor dependent
● AMD (ATI) – Close To Metal (CTM)
● Intel (Larrabee) – nobody knows

Simple stream programming model

Generic GPU hardware/software model
● Massively parallel processor: many concurrently running
threads (thousands)
● Threads access global GPU memory
● Each thread has a limited number of private registers
● Caching: two options
– Not cached (latency hidden through time-slicing)
– Cached with unknown cache organization, but optimized
for 2D spatial locality
● Single Program Multiple Data (SPMD) model
– The same program, called a kernel, is executed on
different data
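
To make the SPMD idea concrete, here is a minimal CPU-side emulation in Python. It is not GPU code: the helper `launch`, the kernel `scale_kernel`, and the list standing in for global GPU memory are all hypothetical names introduced for illustration. On a real GPU the kernel instances would run concurrently; the loop here runs them one after another.

```python
from typing import Callable, List


def launch(kernel: Callable[..., None], n_threads: int, *args) -> None:
    """Emulate a kernel launch: run the same program once per thread index.

    A GPU would execute these instances concurrently; this sketch is serial.
    """
    for tid in range(n_threads):
        kernel(tid, *args)


def scale_kernel(tid: int, data: List[float], factor: float) -> None:
    # The thread index is the only thing that differs between threads;
    # it selects which piece of the data this instance works on.
    x = data[tid]            # read from the emulated global memory
    data[tid] = x * factor   # x plays the role of a private register


global_memory = [1.0, 2.0, 3.0, 4.0]
launch(scale_kernel, len(global_memory), global_memory, 2.0)
print(global_memory)  # [2.0, 4.0, 6.0, 8.0]
```

The kernel never loops over the data itself. Which element it touches is decided entirely by its thread index, so adding more data means launching more threads rather than changing the program.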

How we design an algorithm
● Problem: compute the element-wise product of two vectors
A[10000] and B[10000] and store it in C[10000]
● Think data-parallel: same set of operations
(kernel) applied to multiple data chunks
– Apply fine-grained parallelization (caution here: see
a few slides ahead)
● Thread creation is cheap
● The more threads the better
● Idea: one thread multiplies 2 numbers
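
The "one thread multiplies 2 numbers" design could be sketched as follows. This is a serial Python emulation under stated assumptions, not the lecture's own code: the name `multiply_kernel`, the input values, and the plain `for` loop standing in for a launch of 10000 GPU threads are all illustrative.

```python
N = 10000  # vector length from the problem statement


def multiply_kernel(tid, a, b, c):
    # One logical thread handles exactly one element.
    c[tid] = a[tid] * b[tid]


# Illustrative input data (assumed, not from the slides).
a = [float(i) for i in range(N)]
b = [2.0] * N
c = [0.0] * N

# Emulated launch of N threads; a GPU would run them concurrently.
for tid in range(N):
    multiply_kernel(tid, a, b, c)

print(c[:4])     # [0.0, 2.0, 4.0, 6.0]
print(c[N - 1])  # 19998.0
```

Each kernel instance writes a distinct element of C, so no two threads ever need to coordinate, which is what makes this decomposition so easy to parallelize.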