
ENEL 369: Introduction to Computer Architecture
Lab 10 - for the week of April 3

Author: Steve Norman
Paper copies handed out: Tuesday, April 4, 2000
Last modified: Mon Apr 3 22:33:29 MDT 2000

This Lab is Important, but won't be marked

It would be impossible to make this assignment due on April 10 and have it marked and returned by TAs before classes end. To avoid having to return the assignment during the exam period, and also to give the TAs a bit of a break, this lab won't be marked.

However, this lab covers material that will definitely be on the final exam.

Solutions for all exercises will be posted on the Web.



Updates / Corrections

If corrections or clarifications are necessary after the paper version of this handout is printed, they will appear in this space on the Web version of this handout.



Exercise A: Introduction to endianness

Read This First

Endianness is an important issue related to memory systems, but hasn't been discussed in lectures. Endianness refers to the organization of bytes within a memory word. In a little-endian system with 32-bit words, bits 31-0 of the word are stored in bytes as follows:
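Figure: 32-bit little-endian word (A is the address of the word)
   address:  A+0    A+1    A+2    A+3
   bits:     7-0    15-8   23-16  31-24
A big-endian arrangement of a 32-bit word is as follows:
Figure: 32-bit big-endian word (A is the address of the word)
   address:  A+0    A+1    A+2    A+3
   bits:     31-24  23-16  15-8   7-0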
Consider the following sequence of MIPS instructions, and assume that $a0 points to a word in the data segment of a program:
   ori     $t0, $zero, 0x9
   sb      $t0, 0($a0)
   ori     $t0, $zero, 0xa
   sb      $t0, 1($a0)
   ori     $t0, $zero, 0xb
   sb      $t0, 2($a0)
   ori     $t0, $zero, 0xc
   sb      $t0, 3($a0)
   lw      $s0, 0($a0)
On a little-endian machine, the above sequence would put 0x0c0b0a09 in $s0, but on a big-endian machine, the sequence would put 0x090a0b0c in $s0.
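
The same idea can be sketched in C. This is only an illustration, not part of the lab: the function names load_word_little and load_word_big are made up here, and each one models what lw would return on a machine of the given endianness for four bytes stored at consecutive addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models lw on a little-endian machine: the byte at the lowest
   address supplies bits 7-0 of the word. (Hypothetical helper.) */
uint32_t load_word_little(const uint8_t *mem)
{
    assert(mem != NULL);
    return (uint32_t)mem[0]
         | ((uint32_t)mem[1] << 8)
         | ((uint32_t)mem[2] << 16)
         | ((uint32_t)mem[3] << 24);
}

/* Models lw on a big-endian machine: the byte at the lowest
   address supplies bits 31-24 of the word. (Hypothetical helper.) */
uint32_t load_word_big(const uint8_t *mem)
{
    assert(mem != NULL);
    return ((uint32_t)mem[0] << 24)
         | ((uint32_t)mem[1] << 16)
         | ((uint32_t)mem[2] << 8)
         | (uint32_t)mem[3];
}
```

With the bytes 0x09, 0x0a, 0x0b, 0x0c stored in that order, load_word_little gives 0x0c0b0a09 and load_word_big gives 0x090a0b0c, matching the values for $s0 given above.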

Here are some notes about endianness on various computer systems:

What to do

Assume that $a0 points to a word in the data segment of a big-endian MIPS-like computer. What will be in $s0-$s3 just after the following instructions have been executed?
  lui     $t0, 0x6181
  ori     $t0, $t0, 0x7191
  sw      $t0, ($a0)
  lb      $s0, 0($a0)
  lb      $s1, 1($a0)
  lb      $s2, 2($a0)
  lb      $s3, 3($a0)
Repeat the exercise, assuming this time that the machine is little-endian.
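
A detail to keep in mind for this exercise is that lb sign-extends the loaded byte to 32 bits (lbu is the instruction that zero-extends). The following is a minimal C sketch of that extension; the function name sign_extend_byte is invented for illustration, and the sample values in the note below are deliberately not the ones from the exercise.

```c
#include <stdint.h>

/* Returns the 32-bit pattern lb would place in a register for the
   given byte: bit 7 of the byte is copied into bits 31-8.
   (Hypothetical helper, not part of the lab files.) */
uint32_t sign_extend_byte(uint8_t b)
{
    uint32_t word = b;
    if (b & 0x80u)
        word |= 0xFFFFFF00u;
    return word;
}
```

For example, sign_extend_byte(0x7f) is 0x0000007f, while sign_extend_byte(0x80) is 0xffffff80.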



Exercise B: Endianness and binary files

Read This First

One area where endianness is of major concern to applications programmers is a situation where a binary file is written by one computer system and read by another.

Before introducing binary file operations, let's very briefly review text file operations. Consider a C program using fprintf to write an int to a text file. Suppose that the character set in use is ASCII, that fp is of type FILE* and has been opened for output, and that i is an int:

   i = 12345;
   fprintf(fp, "%d\n", i);
The fprintf function converts the computer's internal representation of 12345 to a sequence of bytes: the ASCII code for '1', the ASCII code for '2', and so on. These five bytes followed by the code for '\n' are written to the file. So the effect of the function call is to put this sequence of bytes in the file:
00110001
00110010
00110011
00110100
00110101
00001010
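
To check this byte sequence without creating a file, one option is to format into a character array with snprintf, which performs the same conversion as fprintf. The sketch below is illustrative only; the name format_int_line is an assumption, not something from the lab directory.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats value exactly as fprintf(fp, "%d\n", value) would, but into
   buf instead of a file. Returns the number of bytes produced, not
   counting the terminating '\0'. (Hypothetical helper.) */
int format_int_line(char *buf, size_t size, int value)
{
    return snprintf(buf, size, "%d\n", value);
}
```

Called with 12345 and a buffer of 16 chars, the function returns 6, and on an ASCII system the first six bytes of the buffer are the six bit patterns listed above.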

Unlike a text file, a binary file is not necessarily a sequence of character codes. The C view of a binary file is a sequence of bytes that might be character codes, pieces of instructions, pieces of integers or pieces of floating-point numbers, or that might have some other meaning. The key C library functions for binary file input and output are fread and fwrite. The following code demonstrates writing the bit pattern for an int to a binary file. Again suppose that fp is of type FILE* and has been opened for output:

   i = 12345;
   fwrite((void *) &i, sizeof(int), 1, fp);
The arguments are
  1. The address of the first byte to be copied to the file.
  2. The size, in bytes, of the data items to be copied to the file.
  3. The number of data items to be written to the file.
  4. fp, which specifies the file.
The 32-bit representation of 12345 is
   0000_0000_0000_0000_0011_0000_0011_1001
fwrite simply copies a sequence of bytes from memory to a file. So on a 32-bit big-endian machine, the call to fwrite will put the following sequence of bytes in the file:
00000000
00000000
00110000
00111001
But on a little-endian machine the bytes would be written in this order:
00111001
00110000
00000000
00000000
Note that in both cases the bytes are totally different from what is produced by the call to fprintf.
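
One way to see which order a particular machine uses is to copy the bytes of an int out of memory, which is what fwrite sees. The following is a rough sketch; both function names are invented here, and the code assumes only that an int is at least two bytes wide.

```c
#include <stddef.h>
#include <string.h>

/* Copies the bytes of value into out in memory order, lowest address
   first. This is the order fwrite would send them to a file.
   (Hypothetical helper.) */
void int_bytes_in_memory_order(int value, unsigned char out[sizeof(int)])
{
    memcpy(out, &value, sizeof value);
}

/* Returns 1 if the byte at the lowest address of an unsigned int
   holds the least significant bits, 0 otherwise. (Hypothetical helper.) */
int host_is_little_endian(void)
{
    unsigned int one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);
    return first == 1;
}
```

On a little-endian host, the first two bytes produced for 12345 are 0x39 and 0x30; on a big-endian host those are the last two bytes, in the opposite order.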

The function to read from a binary file is fread.
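
fread takes the same four arguments as fwrite, except that the first is the address where the bytes read from the file should be stored; it returns the number of items actually read. Here is a small sketch of a write followed by a read on the same machine. The function name round_trip_int is made up, and tmpfile is used only to avoid choosing a file name.

```c
#include <stdio.h>

/* Writes value to a temporary binary file with fwrite, then reads it
   back with fread into *result. Returns 0 on success, -1 on failure.
   (Hypothetical helper, not one of the lab programs.) */
int round_trip_int(int value, int *result)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    int ok = fwrite(&value, sizeof value, 1, fp) == 1;
    rewind(fp);                       /* go back to the first byte */
    ok = ok && fread(result, sizeof *result, 1, fp) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}
```

Because the writer and reader here have the same endianness, the value read back always equals the value written.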

It should be clear that endianness can cause problems if a binary file is written using fwrite on a machine with one endianness and read using fread on a machine with the opposite endianness.

What to do

Make a copy of the directory /local/courses/enel369/lab10/exB

There are two C programs in the directory. The program binwrite.c generates a small binary file with the following format: the first four bytes are an unsigned int giving the number of array elements stored in the file, and the remaining bytes are elements from an array of unsigned ints, four bytes per element. The program binread.c reads a file in the same format and displays the number of elements and array contents.

Read the two programs carefully to get an idea of how they work.

There are two data files in the directory: x86_output is an output file produced by binwrite.c on a (little-endian) x86-based Linux system; sun_output is an output file produced by binwrite.c on a (big-endian) Sun workstation.

Build an executable from binread.c. Run it using x86_output as the input file; then run it with sun_output as the input file. The difference in behaviour will be obvious.

Make a copy of binread.c and modify it so that it can correctly read the data in sun_output. Do not modify the calls to fread; instead use operations such as shifts and bitwise ands to rearrange the bytes within variables.

Closing Remark

Storing numbers in binary files is somewhat more efficient than using text files, because binary files are smaller and because with binary files no time is spent converting base two numbers to sequences of base ten digits and vice versa.

However, endianness is only one of the many issues that can arise when reading and writing binary files on different platforms. Two of the many other issues are variations in the size of integer types (16-bit, 32-bit, or 64-bit?) and old, non-standard formats for floating-point numbers.



Exercise C: Tracing cache hits and misses

What to do

Do Exercises 7.7, 7.8, 7.20 and 7.21 from the textbook.

You will have to do a little reading to find out what a fully-associative cache is, because fully-associative caches were not covered in lectures.

Note that the addresses are word addresses, not byte addresses. The largest address is 56, so you can assume that addresses are six bits wide. So, for example, in Exercise 7.7 the tag will be the first two bits of the address and the cache index will be the last four bits.
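
The split described for Exercise 7.7 can be written as two small C functions. This is a sketch under the assumptions stated above (six-bit word addresses, two-bit tag, four-bit index); the function names are invented for illustration.

```c
/* Index for a six-bit word address: the last (low) four bits.
   (Hypothetical helper.) */
unsigned cache_index(unsigned word_addr)
{
    return word_addr & 0xFu;
}

/* Tag for a six-bit word address: the first (high) two bits.
   (Hypothetical helper.) */
unsigned cache_tag(unsigned word_addr)
{
    return (word_addr >> 4) & 0x3u;
}
```

For example, address 56 is 111000 in base two, so its tag is 3 (11) and its index is 8 (1000).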



Exercise D: How many bits of storage are needed for a cache?

What to do

Do Exercise 7.9 from the textbook.



Exercise E: Tag and index sizes

What to do

  1. Consider the cache of Figure 7.19 in the textbook, which can hold 1024 32-bit words. Consider modifying it so that it is still four-way set-associative, but holds 16384 32-bit words in four-word blocks. How will 32-bit addresses be broken into tag, index, block offset and byte offset? (That is, how many bits will be used for each of tag, index, block offset and byte offset?)
  2. Now consider a computer with 64-bit byte addresses and 64-bit memory words. (These sizes are what you would find in today's big Unix servers and in the PCs of the not-too-distant future.) Consider a cache that is two-way set-associative, stores words in four-word blocks, and holds 256K (= 256 * 1024 = 262144) bytes. How will 64-bit addresses be broken into tag, index, block offset and byte offset?
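
If you want to check your answers, the field widths can be computed mechanically. The sketch below is not from the textbook: the struct and function names are invented, and it assumes that every size involved is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Number of address bits used for each field. (Hypothetical type.) */
struct cache_fields {
    unsigned tag, index, block_offset, byte_offset;
};

/* Base-two logarithm of n, which must be a power of two. */
static unsigned log2_exact(uint64_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    unsigned bits = 0;
    while (n > 1) {
        n >>= 1;
        bits++;
    }
    return bits;
}

/* Splits an address of address_bits bits for a set-associative cache
   with the given total data capacity, word size, block size and
   associativity. (Hypothetical helper.) */
struct cache_fields split_address(unsigned address_bits,
                                  uint64_t capacity_bytes,
                                  unsigned word_bytes,
                                  unsigned words_per_block,
                                  unsigned ways)
{
    struct cache_fields f;
    uint64_t block_bytes = (uint64_t)word_bytes * words_per_block;
    uint64_t sets = capacity_bytes / (block_bytes * ways);

    f.byte_offset = log2_exact(word_bytes);
    f.block_offset = log2_exact(words_per_block);
    f.index = log2_exact(sets);
    f.tag = address_bits - f.index - f.block_offset - f.byte_offset;
    return f;
}
```

As a check on the method, take the unmodified cache of Figure 7.19 and assume it uses one-word blocks: split_address(32, 1024 * 4, 4, 1, 4) gives 256 sets, so an 8-bit index, no block offset bits, a 2-bit byte offset and a 22-bit tag.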
