$30
GOAL: The main purpose of this exercise is to make sure everyone has access to the necessary computation and software tools for this class: a C compiler, the graphing program gnuplot, and for future use the symbolic and numeric evaluator Mathematica (easy way to check algebra and do amazing graphics). In this class, there will be help to make sure everyone’s laptop is functional and have access to the Shared Computer Cluster.
1 Background
1.1 Numerical Calculations
LANGUAGES & BUILT IN DATA FORMATS: The primary language at present used for high performance numerical calculations is C or C++. The standard environment is the Unix or Linux operating system. For this reason, C and Unix tools will be emphasized. We will introduce some common tools such as Makefiles and gnuplot.
That said, modern software practices take advantage of a (vast) variety of high level languages as well. You also may use a bit of two symbolic or interpreted languages, Mathematica or Python respectively, because of the power they have to prototype, test, and visualize simple algorithms. Little prior knowledge except familiarity with C will be assumed.
In large scale computing all data must be represented somehow—for numerical computing, this is often as a floating point number. Floats don’t cover every possible real number (there are a lot of them between negative and positive infinity, after all). They can’t even represent the fraction 1/3 exactly. Nonetheless, they can express a vast expanse of numbers both in terms of precision as well as the orders of magnitude they span.
As you’ll learn in this exercise, round off error, stability, and accuracy will always be issues you should be aware of. As a starting point, you should be aware of how floating points are represented. Each language has some standard built in data formats. Two common ones are 32 bit floats and 64 bit doubles, in C lingo. To get an idea of how these formats work on a bitby-bit level, give a look at https://en.wikipedia.org/wiki/Single-precision floating-point format and https://en.wikipedia.org/wiki/Double-precision floating-point format.
You’ll notice we’re not shy about hopping off to the web to supplement the information we’ve put in this document and will discuss in class, and we expect you to do the same when you need to. Searching the web is part of this course.
As a last remark before we hop into some math, bear in mind that data types keep evolving. Big Data applications (deep learning!) are now using smaller 16 bit or even 8 bit floats for some applications. Why? Images often use 8 bit integer RGB formats. Some very demanding high precision science and engineering applications use 128 bit floats (quad precession). There are lots of tricks in code. Symbolic codes represent some numbers like π and e as a special token since there are no finite bit representations! See for more about these issues: https://en.wikipedia.org/wiki/Floating point.
1.2 Finite Differences
A common operation in calculus is the derivative: the local slope of a function. With pencil and paper, it’s straightforward to evaluate derivative analytically for well known functions. For example:
) (1)
On a computer, however, it’s a bit of a non-trivial exercise to perform the analytic derivative (you’d need a text parser, you’d need to encode implementations of many functions... there’s a reason there are only a few very powerful analytic tools, such as Mathematica, that handle this). In numerical work the standard method is to approximate the limit,
(2)
with a finite but small enough difference h. The good news is this is completely general... the bad news is this is only an approximation and it is prone to errors. Due to round off, it’s dangerous for h to get close to zero. 0/0 is an ill-defined quantity!
One approximation of a derivative is the forward finite difference, which should look familiar:
(3)
Two other methods are the backward difference
(4)
and average or central difference. The point of this exercise is to implement different types of differences, as well as test the effect of the step size h.
2 Written Exercise
Computer Engineering is link between mathematics (logic) and hardware (physics). The kind of math stuff you need is very practical part often submerged in math courses in too much formalism. You learn this language by speaking it. To make this link in the current exercise, first to how you expand a function in a Taylor series for small h. Most often the quadratic approximation is enough,
f(x + h) ' f(x) + hf0(x) + (h2)/2f00(x) + ··· (5)
for small h.
More generally we can write more and more term (if you care!)
) (6)
This means the error using 3 term scales like h4 or using 4 term is h5 etc.
2.1 Part 1: Averaging forward and backward Differences
Calculate using pencil and paper the following expressions,
a = ∆hf(x) = [f(x + h) − f(x)]/h =? + O(?) b = ∆ehf(x) = [f(x) − f(x − h)]/h =? + O(?)
(a + b)/2 = (1/2)[f(x + h) − f(x − h)]/h =? + O(?)
A = ∆2hf(x) = [f(x + 2h) − f(x)]/(2h) =? + O(?)
B = ∆e2hf(x) = [f(x) − f(x − 2h)]/(2h) =? + O(?)
(7)
using the Taylor series expression above in powers of h. Note the third expression in Eq. 7 is just the average of the first two. This is the central difference which cancels all odd h powers in the expansion (or odd terms for the approximate derivative.)
HINT: Do this the easy way! The quadratic approximation is enough for this.
2.2 Part 2: Double averaging
For extra credit combine the 4 expression of the derivative a,b,A,B with magic weights to reduce the error to h5.
) (8)
This a clever average of two central difference! Do the algebra to get the magic formula in Eq. 8. It is ok (as I did) to use Mathematica or some symbolic program to get the actual expression for the error term to be ( ! It is smart to cheat but also good to do it out my hand to really understand what you are doing and to exercise your brain.
3 Programming Exercises
3.1 Part 1: Simple C Exercise
For this first exercise we’ve included the shell of a program below; it’s your job to fill in the missing bits. The purpose of this program is to look at the forward, backward, and central difference of the function sin(x) at the point x = 1 as a function of the step size h. You should also print the exact derivative cos(x) at x = 1 in each column.
#include <iostream>
#include <iomanip> #include <cmath> using namespace std;
double f(double x) {
return sin(x);
}
double derivative(double x) {
return cos(x);
}
double forward_diff(double x, double h) {
return (f(x+h)-f(x))/h;
}
double backward_diff(double x, double h) { // return the backward difference.
}
double central_diff(double x, double h) { // return the central difference.
}
int main(int argc, char** argv)
{ double h; const double x = 1.0;
// Set the output precision to 15 decimal places (the default is 6)!
// Loop over 17 values of h: 1 to 10^(-16). for (h = /*...*/; h /*...*/; h *= /*...*/)
{
// Print h, the forward, backward, and central difference of sin(x) at 1, // as well as the exact derivative cos(x) at 1.
// Should you use spaces or tabs as delimiters? Does it matter?
cout << /*...*/ << cos(1.0) << "\n";
} return 0;
}
Don’t be afraid to search online for any information you don’t know! I’m not good at programming, I’m good at Googling and I’m good at debugging. You should name your C++ program findiff.cpp.
You can compile it with:
g++ -O2 findiff.cpp -o findiff
3.2 Part 2: Plotting using gnuplot
You have all of this data, now what? To visualize how the finite differences for different h compares with the analytic derivative, we can plot the data using the program gnuplot. This is a basic tool and with scripts you can design useful general for plotting and fitting.
Of courses there are many plotting and data analysis programs. You may want to use others. Personally I am trying to interface with Mathematica. But in the high performance computing environment dumping data in to simple files and using a gnuplot is sometime useful because it alway there. Using the output file you generated with C plot the relative error in the forward, backward, and central difference as a function of h. This is similar to what is being plotted on the right hand side of Fig. 3 in the Lecture notes. As a reminder, the relative error is defined as:
|approximate − exact|
(9) |exact|
Nice to set x and y labels on your graph, and titles for each curve so you know what is plotted.
By default gnuplot will output to the screen. You’ll want to submit an image at the end of the day; the commands set terminal and set output will be helpful in this regard! As an FYI: while it’s best to play with making plots in the gnuplot terminal, it can get annoying to do everything there! gnuplot can just run a script file:
gnuplot -e "load \"[scriptname].gp\""
Where you should replace [scriptname] with, well, the name of your plotting script!