13.07.2015 Views

časopisem LEO Express - Vlaky.net



trivial. However, other benchmark programs run longer, and the performance overhead reaches 32.1% in the worst case. Thus, we believe that addressing the performance issue of low-leakage caches is worthwhile.

[Figure 1: Normalized Execution Time of Cache decay. x-axis: benchmark programs (f177.mesa, f179.art, f183.equake, f188.ammp, i164.gzip, i175.vpr, i176.gcc, i181.mcf, i197.parser, i256.bzip2, Average); y-axis: normalized execution time.]

Next, we present the number of data L1 (DL1) cache misses in Figure 2. The y-axis shows the number of DL1 misses normalized to that of a non-optimized conventional cache. A value above one indicates additional misses, that is, sleep-misses caused by Cache decay. The figure shows that performance degradation grows in proportion to the number of sleep-misses. If we can reduce these misses, we can improve performance.

[Figure 2: Normalized DL1 Misses. x-axis: benchmark programs (same set as Figure 1); y-axis: normalized DL1 misses. The plot carries one value label, 11.7, beyond the 0.0 to 4.0 axis range.]

3.2 Sleep-Miss Density

In general, memory references have spatial locality. Therefore, sleep-miss accesses are expected to exhibit spatial locality as well. We refer to the frequency of sleep-miss accesses to each cache line as sleep-miss density (SMD). The SMD of line i is defined as follows:

$$\mathrm{SMD}_i = \frac{N_{\text{sleep-miss(line-}i\text{)}}}{N_{\text{sleep-miss(avg)}}} \qquad (1)$$

where $N_{\text{sleep-miss(line-}i\text{)}}$ is the total number of sleep-miss accesses that occurred at cache line i, and $N_{\text{sleep-miss(avg)}}$ is the average number of sleep-miss accesses over all cache lines. For example, a cache line with an SMD of 2.0 causes twice as many sleep-miss accesses as the average.

[Figure 3: Sleep-miss Density (f183.equake, i181.mcf). x-axis: cache-line index (from 0 to 1023); y-axis: SMD.]

Based on the setup described in Section 2, we measured SMD_i for each line.
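As an illustration of Equation (1), the following minimal Python sketch computes the SMD of every cache line from a list of per-line sleep-miss counts. The function names, the zero-miss convention, and the sample counts are assumptions made for this example and do not come from the paper. The second helper is likewise hypothetical: it shows one way to ask what fraction of lines sits at or above a given SMD and what share of all sleep-misses those lines carry.

```python
from typing import List, Sequence, Tuple


def sleep_miss_density(miss_counts: Sequence[int]) -> List[float]:
    """Return SMD_i = N_sleep-miss(line-i) / N_sleep-miss(avg) for each line."""
    if not miss_counts:
        raise ValueError("miss_counts must not be empty")
    average = sum(miss_counts) / len(miss_counts)
    if average == 0:
        # Assumption: with no sleep-misses at all the ratio is undefined,
        # so this sketch reports 0.0 for every line.
        return [0.0] * len(miss_counts)
    return [count / average for count in miss_counts]


def share_at_or_above(
    miss_counts: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (fraction of lines with SMD >= threshold,
    fraction of all sleep-misses caused by those lines)."""
    smd = sleep_miss_density(miss_counts)
    hot_lines = [i for i, density in enumerate(smd) if density >= threshold]
    total = sum(miss_counts)
    line_share = len(hot_lines) / len(miss_counts)
    miss_share = sum(miss_counts[i] for i in hot_lines) / total if total else 0.0
    return line_share, miss_share


if __name__ == "__main__":
    # Made-up counts for an 8-line toy cache (not measured data).
    counts = [1, 1, 1, 1, 2, 2, 0, 8]
    print(sleep_miss_density(counts))
    print(share_at_or_above(counts, 4.0))
```

In this toy input the average is 2.0 sleep-misses per line, so the last line has an SMD of 4.0: it is one of eight lines yet accounts for half of all sleep-misses.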
Figure 3 shows the SMD of each cache line for two benchmark programs. The x-axis shows the cache-line index in the assumed 32KB 32-way cache. For i181.mcf, many cache lines have an SMD value of around 1.0; in fact, the SMD value of every line is smaller than 1.6. On the other hand, for f183.equake, some cache lines show a much higher SMD. Figure 4 presents the breakdown of cache lines in terms of the SMD. Five programs, f179.art, f188.ammp, i175.vpr, i197.parser and i256.bzip2, show the same characteristics as f183.equake. Namely, the SMD value of almost all the cache lines is less than 1.0, while that of a few lines (less than 10%) is equal to or greater than 4.0. Figure 5 reports the breakdown of sleep-miss accesses, that is, how much of the sleep-miss accesses is attributable to cache lines with different SMD values. For all benchmark programs, the cache lines with a higher SMD (equal to or greater than 1.0) dominate the total sleep-miss accesses.

From the observations above, we can conclude that in many cases a small number of cache lines are responsible for the majority of sleep-miss accesses. For instance, in f183.equake, the cache lines with SMD≥4.0 are only 7.7% of all lines, but they account for 75.2% of sleep-miss accesses. On average over all benchmarks, 2.6% of cache lines have SMD≥4.0, and they cause 25.2% of total sleep-
