I. Convert the decimal integer 477 to a normalized FPN with β = 2.
II. Convert the decimal fraction 3/5 to a normalized FPN with β = 2.
III. Let x = β^e, e ∈ Z, L < e < U, be a normalized FPN in F, and let x_L, x_R ∈ F be the two normalized FPNs adjacent to x such that x_L < x < x_R. Prove that x_R − x = β(x − x_L).
IV. By reusing your result from II, find the two normalized FPNs adjacent to x = 3/5 under the IEEE 754 single-precision format. What are fl(x) and the relative roundoff error?
V. If the IEEE 754 single-precision format did not round numbers to the nearest, but simply dropped excess bits, what would the unit roundoff be?
VI. How many bits of precision are lost in the subtraction 1 − cos x when ?
VII. Suggest at least two ways to compute 1 − cos x that avoid the catastrophic cancellation caused by subtraction.
The above seven questions are worth 3, 4, 7, 6, 3, 3, and 4 points, respectively, for a total of 30 points.
2 C++ programming
(A) (10 points) By programming in C++, print values of the functions in (1) at 101 equally spaced points covering the interval [0.99,1.01]. Calculate each function in a straightforward way without rearranging or factoring. Note that the three functions are theoretically the same, but the computed values might be very different. Plot these functions near 1.0 using a magnified scale for the function values to see the variations involved. Discuss what you see. Which one is the most accurate? Why?
(B) (10 points) Consider a normalized FPN system F with the characterization β = 2, p = 3, L = −1, U = +1. Answer the following by programming in C++:
• compute UFL(F) and OFL(F) and output them as decimal numbers;
• enumerate all numbers in F and verify the corollary on the cardinality of F in the summary handout;
• plot F on the real axis;
• enumerate all the subnormal numbers of F;
• plot the extended F on the real axis.