Performance Optimization
Results of a performance optimization study on both PowerPC and Core Duo machines. Each of the same two test functions was run 100 times, and the best time for each was recorded as changes were made to the code and compiler flags.
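The best-of-100 measurement described above can be sketched roughly as follows. This is a minimal illustration, not the harness used in the study: the helper name `bestTimeMs` and the choice of `std::chrono::steady_clock` are assumptions.

```cpp
#include <cassert>
#include <chrono>
#include <limits>

// Hypothetical helper (not from the study): call `fn` `runs` times and
// return the fastest single-run wall-clock time in milliseconds.
template <typename Fn>
double bestTimeMs(Fn&& fn, int runs = 100) {
    assert(runs > 0);
    using Clock = std::chrono::steady_clock;
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < runs; ++i) {
        const auto start = Clock::now();
        fn();
        const auto stop = Clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (ms < best) {
            best = ms;  // keep only the fastest run
        }
    }
    return best;
}
```

Taking the minimum rather than the mean filters out runs that were slowed by other activity on the machine, which is presumably why the best time was the one recorded.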
The “Sum” test sums 10,000 vectors (c = a + b).
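As a rough sketch of what such a test might look like in its scalar baseline form: the four-float `Vector` layout and the function name `sumTest` are assumptions made for illustration, not the study's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: four floats, matching the width of an AltiVec register.
struct Vector {
    float x, y, z, w;
};

// Component-wise addition; marked inline so the call can be folded into the loop.
inline Vector operator+(const Vector& a, const Vector& b) {
    return Vector{a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}

// Hypothetical test body: c[i] = a[i] + b[i] for every element.
void sumTest(const std::vector<Vector>& a,
             const std::vector<Vector>& b,
             std::vector<Vector>& c) {
    assert(a.size() == b.size());
    c.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        c[i] = a[i] + b[i];
    }
}
```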
The “Diffuse” test runs a fluid diffusion pass on a 2D array of vectors.
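The exact diffusion formula is not given here, so the following is only a sketch under stated assumptions: a Jacobi-style pass in which each interior cell becomes a weighted blend of itself and the sum of its four neighbors. The `Grid` type, the `rate` parameter, the update formula, and the signature of `getNeighborSum()` are all hypothetical; only the function name appears in the results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: four floats per vector.
struct Vector {
    float x, y, z, w;
};

inline Vector operator+(const Vector& a, const Vector& b) {
    return Vector{a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}

inline Vector operator*(const Vector& a, float s) {
    return Vector{a.x * s, a.y * s, a.z * s, a.w * s};
}

// Hypothetical row-major 2D grid of vectors, zero-initialized.
struct Grid {
    std::size_t width, height;
    std::vector<Vector> cells;

    Grid(std::size_t w, std::size_t h)
        : width(w), height(h), cells(w * h, Vector{0.0f, 0.0f, 0.0f, 0.0f}) {}

    Vector& at(std::size_t x, std::size_t y) { return cells[y * width + x]; }
    const Vector& at(std::size_t x, std::size_t y) const {
        return cells[y * width + x];
    }
};

// Sum of the four direct neighbors; valid for interior cells only.
inline Vector getNeighborSum(const Grid& g, std::size_t x, std::size_t y) {
    return g.at(x - 1, y) + g.at(x + 1, y) + g.at(x, y - 1) + g.at(x, y + 1);
}

// One assumed diffusion pass: out = (in + rate * neighborSum) / (1 + 4 * rate).
// Border cells of `out` are left untouched.
void diffusePass(const Grid& in, Grid& out, float rate) {
    assert(in.width == out.width && in.height == out.height);
    const float scale = 1.0f / (1.0f + 4.0f * rate);
    for (std::size_t y = 1; y + 1 < in.height; ++y) {
        for (std::size_t x = 1; x + 1 < in.width; ++x) {
            out.at(x, y) = (in.at(x, y) + getNeighborSum(in, x, y) * rate) * scale;
        }
    }
}
```

In this form the inner expression is a multiply followed by an add, which is the shape that AltiVec's `vec_madd` (a * b + c) computes in a single instruction.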
PowerPC (G5 1.8 GHz)

| Change | Sum (ms) | Diffuse (ms) |
|---|---|---|
| Baseline | 28 | 48 |
| Switch to vFloat type | 68 | 116 |
| 'inline' Vector ctor | 69 | 128 |
| AltiVec Vector functions | 27 | 62 |
| 'inline' AltiVec functions | 25 | 58 |
| 'inline' getNeighborSum() | 25 | 38 |
| Hand tune diffuse with vec_madd | n/a | 23 |
| -mtune=G5 | 24 | 22 |
| -ffast-math=16 | 24 | 22 |
| -falign-loops=16 | 24 | 22 |