>nm125_3 At x=0.100000000000, f3(x)=1.051709180756e+00; f4(x)=1.084166666667e+00 At x=0.010000000000, f3(x)=1.005016708417e+00; f4(x)=1.008341666667e+00 At x=0.001000000000, f3(x)=1.000500166708e+00; f4(x)=1.000833416667e+00 At x=0.000100000000, f3(x)=1.000050001667e+00; f4(x)=1.000083334167e+00 At x=0.000010000000, f3(x)=1.000005000007e+00; f4(x)=1.000008333342e+00 At x=0.000001000000, f3(x)=1.000000499962e+00; f4(x)=1.000000833333e+00 At x=0.000000100000, f3(x)=1.000000049434e+00; f4(x)=1.000000083333e+00 At x=0.000000010000, f3(x)=9.999999939225e-01; f4(x)=1.000000008333e+00 At x=0.000000001000, f3(x)=1.000000082740e+00; f4(x)=1.000000000833e+00 At x=0.000000000100, f3(x)=1.000000082740e+00; f4(x)=1.000000000083e+00 At x=0.000000000010, f3(x)=1.000000082740e+00; f4(x)=1.000000000008e+00 At x=0.000000000001, f3(x)=1.000088900582e+00; f4(x)=1.000000000001e+00
%nm125_3.m: reduce the roundoff error using Taylor series f3=@(x)(exp(x)-1)/x; % LHS of Eq.(1.2.22) f4=@(x)((x/4+1)*x/3)+x/2+1; % RHS of Eq.(1.2.22) x=0; tmp=1; for k1=1:12 tmp=tmp*0.1; x1=x+tmp; fprintf('At x=%14.12f, ', x1) fprintf('f3(x)=%18.12e; f4(x)=%18.12e', f3(x1),f4(x1)); end
1.3 Toward Good Program
Among the various criteria about the quality of a general program, the most important one is how robust its performance is against the change of the problem properties and the initial values. A good program guides the program users who do not know much about the program and at least give them a warning message without runtime error for their minor mistake. There are many other features that need to be considered, such as user friendliness, compactness and elegance, readability. But, as far as the numerical methods are concerned, the accuracy of solution, execution speed (time efficiency), and memory utilization (space efficiency) are of utmost concern. Since some tips to achieve the accuracy or at least to avoid large errors (including overflow/underflow) are given in the last section, we will look over the issues of execution speed and memory utilization.
1.3.1 Nested Computing for Computational Efficiency
The execution speed of a program for a numerical solution depends mostly on the number of function (subroutine) calls and arithmetic operations performed in the program. Therefore, we like the algorithm requiring fewer function calls and arithmetic operations. For instance, suppose we want to evaluate the value of a polynomial
(1.3. 1)
It is better to use the nested (computing) structure (as follows) than to use the above form as it is
(1.3.2)
Note that the numbers of multiplications needed in Eqs. (1.3.2) and (1.3.1) are 4 and (4 + 3 + 2 + 1 = 9), respectively. This point is illustrated by the script “nm131_1.m”, where a polynomial polyval()
’. Interested readers could run this script to see that the nested multiplication (Eq. (1.3.2)) and ‘ polyval()
’ (fabricated in a nested structure) take less time than the plain method (Eq. (1.1)), while ‘ polyval()
’ may take longer time because of some overhead time for being called.
%nm131_1.m: nested multiplication vs. plain multiple multiplication N=1000000+1; a=[1:N]; x=1; tic % initialize the timer p=sum(a.*x.̂[N-1:-1:0]); % Plain multiplication p time_plain=toc % Operation time for the sum of multiplications tic, pn=a(1); for i=2:N % Nested multiplication pn = pn*x +a(i); end pn, time_nested=toc % Operation time for the nested multiplications tic, polyval(a,x) time_polyval=toc % Operation time for using polyval()
Programming in a nested structure is not only recommended for time‐efficient computation, but also may be critical to the solution. For instance, consider a problem of finding the value
(1.3.3)
%nm131_2_1.m: nested structure lam=100; K=155; p=exp(-lam); tic S=0; for k=1:K p=p*lam/k; S=S+p; end S time_nested=toc
|
%nm131_2_2.m: not nested structure lam=100; K=155; tic S=0; for k=1:K p=lam̂k/factorial(k); S=S+p; end S*exp(-lam) time_not_nested=toc
|
The above two scripts are made for computing Eq. (1.3.3). Noting that this sum of Poisson probability distribution [W-4] is close to 1 for such a large K, we can run them to find that one works fine, while the other gives a quite wrong result. Could you tell which one is better?
1.3.2 Vector Operation vs. Loop Iteration
It is time‐efficient to use vector operations rather than loop iterations to perform a repetitive job for an array of data. The following script “nm132_1.m” compares a loop iteration and a vector operation (for computing the sum of 105 numbers) in terms of the execution speed. Could you tell which one is faster?
%nm132_1.m: Vector operation vs. Loop iteration N=1e5; th=[0:N-1]/50000*pi; tic s1=sin(th(1)); for i=2:N, s1= s1+sin(th(i)); end % Loop iteration time_loop=toc, s1 tic s2=sum(sin(th)); % Vector operation time_vector=toc, s2
As a more practical example, let us consider a problem of finding the discrete‐time Fourier transform (DtFT) [W-2 of a given sequence x[n]:
(1.3.4)
%nm132_2.m: Vector operation vs. Loop iteration N=1000; x=rand(1,N); kk=[-100:100]; W=kk*pi/100; % Frequency range % for for loop tic for k =1:length(W) X_for1(k)=0; %zeros(size(W)); for n=1:N, X_for1(k) = X_for1(k) +x(n)*exp(-j*W(k)*(n-1)); end end time_loop_loop=toc % for vector loop tic X_for2 =0 ; %zeros(size(W)); for n=1:N X_for2 = X_for2 +x(n)*exp(-j*W*(n-1)); end time_vector_loop=toc % Vector operation tic nn=[1:N].'; X_vec = x*exp(-j*(nn-1)*W); time_vector_vector=toc discrepancy1= norm(X_for1-X_vec) discrepancy2= norm(X_for2-X_vec)
The above script “nm132_2.m” compares a vector operation vs. a loop iteration for computing the DtFT in terms of the execution speed. Could you tell which one is faster?
1.3.3 Iterative Routine vs. Recursive Routine
In this section, we compare an iterative routine and a recursive routine performing the same job. Consider the following two functions ‘ fctrl1(n)
’ /
‘ fctrl2(n)
’, whose common objectives is to get the factorial of a given nonnegative integer k.
(1.3. 5)
They differ in their structure. While ‘ fctrl1()
’ uses a for
loop structure, ‘ fctrl2()
’