Computer Organization and Architecture
1. Download the memory mountain program from /pub/cis450/programs/mountain.tgz and run it on your favorite Linux system in the lab or at home:
a. tar xvzf mountain.tgz
b. cd mountain
c. make clean
d. make (ignore the warning regarding the volatile variable)
e. ./mountain > output.txt
2. Open the output data set, output.txt, in your favorite spreadsheet program (e.g., Microsoft Excel) and create a 3D graph showing the performance for different stride lengths and working set sizes. This graph should be similar to the one shown on the cover of your textbook. Try to record a sample when the system is lightly loaded to avoid interference from other processes executing at the same time.
3. Use the results to estimate the sizes of the L1 and L2 (and, if it exists, L3) caches on your system.
4. Finally, roughly estimate the access times to different parts of the memory hierarchy, in CPU cycles, to read a 4-byte word from:
a. The on-chip L1 d-cache.
b. The on/off-chip L2 cache.
c. The integrated/off-chip L3 cache (if it exists on your system).
d. Main memory.
Hint: See practice problem 6.19 (1st Edition) or 6.22 (2nd Edition).
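The estimate in step 4 reduces to a unit conversion: divide the measured read throughput by the word size to get words per second, then divide the clock rate by that figure. The following is a minimal Python sketch, assuming the throughput is reported in MB/s with 1 MB = 10^6 bytes and that the clock rate is known. The function name cycles_per_word is illustrative and not part of the mountain program.

```python
def cycles_per_word(throughput_mb_per_s, clock_ghz, word_bytes=4):
    """Approximate CPU cycles per word read at a given throughput.

    Assumes 1 MB = 10**6 bytes; use 2**20 if your output uses binary MB.
    """
    words_per_second = throughput_mb_per_s * 1e6 / word_bytes
    cycles_per_second = clock_ghz * 1e9
    return cycles_per_second / words_per_second


# Example: a level sustaining 2400 MB/s on a 2.53 GHz machine.
print(round(cycles_per_word(2400, 2.53), 1))  # prints 4.2
```

Because this figure is derived from throughput, treat it as a rough estimate rather than a measured load latency.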
Hints:
1. Suppose that we log on to a Linux machine; for example, we use PuTTY to connect to cougar.cs.ksu.edu.
2. To determine the machine's canonical (standard) name, type the command: hostname. In this case, the canonical name is cougar.cs.ksu.edu; cislinux is an alias.
3. Run the mountain program and redirect the output to output.txt: ./mountain > output.txt
4. Graph your output after importing the output.txt file into a spreadsheet.
5. Estimate the cache sizes: L1 is about 32 KB, L2 about 256 KB, and L3 about 8-12 MB. Throughput from the L2 cache is about 2400 MB/s, or 600 million 32-bit integers per second. Since the machine runs at roughly 2.53 GHz (2.53 billion cycles per second), it takes about 2.53 billion / 600 million = 4.2 cycles to read each integer from the L2 cache. Repeat the same calculation for the other levels of the hierarchy.
6. To check your answer, issue the command: cat /proc/cpuinfo > cpuinfo.txt
processor : 0 to 23 (24 cores)
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5649 @ 2.53GHz
cache size : 12288 KB
7. Look up the specifications for the given CPU online. A Google search for "Intel Xeon E5649 specification" yields:
Level 1 cache size
24 x 32 KB 4-way set associative instruction caches
24 x 32 KB 8-way set associative data caches
Level 2 cache size
24 x 256 KB 8-way set associative caches
Level 3 cache size
12 MB 16-way set associative shared cache
Note: 32 KB L1 data cache for each core, 256 KB L2 cache for each core, 12 MB L3 cache shared by all cores.
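To pull the relevant fields out of cpuinfo.txt without reading it by hand, a short Python sketch such as the following could be used. It assumes the usual "key : value" layout of /proc/cpuinfo, with one block per processor entry; the helper name summarize_cpuinfo is hypothetical. In the listing above, the "cache size" field matches only the 12 MB L3 cache, so the L1 and L2 sizes still have to come from the processor specification.

```python
def summarize_cpuinfo(text):
    """Return (processor entry count, model name, cache size) from cpuinfo text."""
    count = 0
    model_name = None
    cache_size = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines carry no "key : value" pair
        key, value = key.strip(), value.strip()
        if key == "processor":
            count += 1
        elif key == "model name" and model_name is None:
            model_name = value
        elif key == "cache size" and cache_size is None:
            cache_size = value
    return count, model_name, cache_size


# Small sample in the same layout as /proc/cpuinfo (two entries only).
sample = """processor : 0
model name : Intel(R) Xeon(R) CPU E5649 @ 2.53GHz
cache size : 12288 KB

processor : 1
model name : Intel(R) Xeon(R) CPU E5649 @ 2.53GHz
cache size : 12288 KB
"""
print(summarize_cpuinfo(sample))
```

To apply it to the saved file, read cpuinfo.txt into a string and pass that string to summarize_cpuinfo in place of sample.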