$30
1 Time-domain model identification via Hankel matrix analysis
1.1 Background
Consider a multi-input/multi-output, discrete-time linear system with m outputs and q inputs,
xk+1 = Axk + Buk
yk = Cxk
where the state dimension is ns so A ∈ Rns×ns, B ∈ Rns×q and C ∈ Rm×ns. It is assumed the feedthrough matrix D ∈ Rm×q is zero. We will assume the state-space matrices have real elements. The rth column of the pulse response sequence {hk}, where hk ∈ Rm×q, is obtained when x0 = 0 and the input is given by the sequence in which the rth element of u0 is 1 and all other elements are zero, and then the subsequent input samples are all zero (the “0” index indicates t = 0),
0
...
0
u0 = 1, (u0)r = 1
0
...
0 uk = 0, k = 1,2,3,...
The pulse response is the discrete-time analog of the impulse response of a continuous-time system. It is straightforward to show h0 = 0, hk = CAk−1B, k = 1,2,3,...
{u1,u2,...,u2n−1} terms in the pulse response,
h1 h2 h3 2 h3 h4 h
Hn = h3 h4 h5
... ... ...
hn hn+1 hn+2
···
···
···...
···
hn
hn+1 hn+2 ∈ Rmn×nq.
...
h2n−1
(1)
The pulse response sequence can be organized into an mn × nq Hankel matrix, denoted Hn, by using the
A distinguishing feature of the Hankel matrix is that the elements of each anti-diagonal block are the same. By virtue of it’s definition, the Hankel matrix can be expressed as the following product,
Hn = OnCn,
where you will recognize the observability and controllability matrices,
.
Next, we make the following assumption: rank On = ns and rank Cn = ns. In other words, On and Cn are full rank and in this case
rank Hn = ns,
that is, the state dimension is equal to the rank of the Hankel matrix constructed from the pulse response. This assumption on the rank not only requires that enough points be available so that the column and row dimensions of Hn are no less than ns, but it also requires that certain “structural” properties of the system be satisfied –these details are discussed in the notes under the subjects of observability and controllability. If the state-space system undergoes a change of coordinates using T ∈ Rns×ns, detT = 06 ,
zk+1 = T−1ATzk + T−1Buk
yk = CTzk
Note that On and Cn transform to
On →7 O˜n = OnT, Cn 7→ C˜n = T−1Cn
Of course, the impulse response remains the same and so does the Hankel matrix Hn = OnCn = O˜nC˜n.
Now consider the situation in which the pulse response samples are provided. In this case, Hn can be formed from {hk}, k = 0,1,2,... , according to (1). Compute an SVD of Hn,
,
where U1 ∈ Rmn×ns, Σns ∈ Rns×ns, V1 ∈ Rnq×ns. On and Cn can be determined from the SVD, however, the factorization is not unique since it is possible to group terms into On or into Cn, for example,
On = U1 and Cn = ΣnsV1T, or On = U1Σns and Cn = V1T,
or , or , for any invertible W.
A specific choice of factorization just corresponds to a specific choice of coordinates in which the state-space matrices are to be expressed. Whatever factorization is used, C is selected to be the first m rows of On and B is selected to be first q columns of Cn. To recover A, a new Hankel matrix is computed,
.
Once On and Cn have been determined from Hn using your desired factorization, a left inverse of On, denoted
†
On, and a right inverse of Cn, denoted Cn†, are used to compute A in the same coordinates as C and B,
A = On†H˜nCn†.
Although left and right inverses are not unique (unless the matrix is, in fact, invertible, in which case the left and right inverses are equal to the usual inverse) the product On†H˜nCn† is always unique so it does not matter which left or right inverse is used. The SVD of Hn conveniently allows specification of left and right inverses using the matrices from the decomposition. For example, if On = U1 and , then and .
In practice, the Hankel matrix is full rank so the “best” way to recover a model is to analyze the singular values of Hn and make a judgement on a suitable model order. For example, if there is a notable “jump” in the singular values (like in Fig. 3), then a sensible approach is to retain the largest singular values and truncate those after the “jump”. Thus, although Hn is full rank, it is approximated by a rank ns matrix, where ns is the selected model order. In other words, ns is not a priori specified –it is determined from analysis of the data, and once determined, Hn is replaced by the “optimal” rank ns approximation
σ1 0 ··· 0
0 σ2
, where Σns = ... ...
0 σns
where U1 and V1 represent the first ns columns of U and V , respectively (where U and V are the unitary matrices from an SVD of Hn). Lastly, often times an analytical model of the system is available and will provide insight into an expected model order. It is important to reconcile what an analytical model yields versus what the test data suggests since resolving a discrepancy can often lead to greater insight into the system.
1.2 A note on the Gramian relations
Recall the observability and controllability gramians using the first n samples of the impulse response are defined as
If we make the following choices for On and Cn,
then the gramians are balanced, in other words, Go(n) = Gc(n) = Σns, because
Now if the system is asymptotically stable we can take the limit p → ∞, then
(2)
(3)
and these gramians satisfy the corresponding discrete-time Lyapunov equations,
ATGo(∞)A − Go(∞) = −CTC
AGc(∞)AT − Gc(∞) = −BBT.
It’s not possible to let p → ∞ since only a finite amount is data is collected, however, if the Hankel matrix is constructed from essentially the entire transient pulse response data record, then for all practical purposes
Go(n) ≈ Go(∞) and Gc(n) ≈ Gc(∞).
Thus, these coordinates represent the traditional “balanced realization.” It is not necessary to decompose the
Hankel matrices so that balanced coordinates are used. Recall that
1.3 Impulse response data
Data sets can be downloaded from the course website. The data sequences are associated with a sample period of ts = 1/40 second. You need to write up a report based on the modeling and analysis of a two-input/two-system that produced this input-output data. The data sets you need are called u1_impulse.mat and u2_impulse.mat. The following code loads the pulse response data and graphs it. Note that only one input channel is “on” in each data set. In this problem, m = 2 (two outputs) and q = 2 (two inputs). The state dimension is not known, though, with the objective being the determination of a state-space discrete-time system that can replicate the observed data. Note that we seek an asymptotically stable model because the impulse response data decays to zero. Also note that when using data sets from physical systems that rank Hn = n because inevitably there is some noise in the measurement data and furthermore the system may be nonlinear even though it can be well-approximated with a linear model. Thus, it appears that you would need an n-dimensional system to model the data, but a lower rank approximation almost always produces a “better” model. For example, asymptotic stability of the identified model is not explicitly enforced and using high model dimensions can actually produce unstable models.
clear close all load u1_impulse.mat
y11 = u1_impulse.Y(3).Data; y21 = u1_impulse.Y(4).Data; u1 = u1_impulse.Y(1).Data; %%% note that the pulse magnitude is 5 [m,mi] = max(u10); %%% find index where pulse occurs load u2_impulse.mat
y12 = u2_impulse.Y(3).Data; y22 = u2_impulse.Y(4).Data; u2 = u2_impulse.Y(2).Data;
%%% remove any offsets in output data using data prior to pulse application y11 = y11 - mean(y11([1:mi-1])); y12 = y12 - mean(y12([1:mi-1])); y21 = y21 - mean(y21([1:mi-1])); y22 = y22 - mean(y22([1:mi-1]));
%%% rescale IO data so that impulse input has magnitude 1 y11 = y11/max(u1); y12 = y12/max(u2); y21 = y21/max(u1); y22 = y22/max(u2); u1 = u1/max(u1); u2 = u2/max(u2);
ts = 1/40; %%%% sample period
N = length(y11); %%%% length of data sets t = [0:N-1]*ts - 1;
figure(1); subplot(311) plot(t,u1,’b*’,’LineWidth’,2) ylabel(’$u_1$ (volts)’,’FontSize’,14,’Interpreter’,’Latex’); grid on axis([-0.2 2 -0.1 1.1])
subplot(312) plot(t,y11,’r*’,’LineWidth’,2) ylabel(’$y_1$ (volts)’,’FontSize’,14,’Interpreter’,’Latex’); grid on axis([-0.2 2 -0.1 0.1])
subplot(313) plot(t,y21,’r*’,’LineWidth’,2) ylabel(’$y_2$ (volts)’,’FontSize’,14,’Interpreter’,’Latex’); grid on xlabel(’second’,’FontSize’,14) set(gca,’FontSize’,14) axis([-0.2 2 -0.1 0.1])
figure(2); subplot(311) plot(t,u2,’b*’,’LineWidth’,2) ylabel(’$u_2$ (volts)’,’FontSize’,14,’Interpreter’,’Latex’); grid on axis([-0.2 2 -0.1 1.1])
subplot(312) plot(t,y12,’r*’,’LineWidth’,2) ylabel(’$y_1$ (volts)’,’FontSize’,14,’Interpreter’,’Latex’); grid on axis([-0.2 2 -0.1 0.1])
subplot(313)
plot(t,y22,’r*’,’LineWidth’,2)
ylabel(’$y_2$ (volts)’,’FontSize’,14,’Interpreter’,’Latex’); grid on xlabel(’second’,’FontSize’,14) set(gca,’FontSize’,14) axis([-0.2 2 -0.1 0.1])
This Matlab code produces Figs. 1 and 2. At the kth sample time, the y1 and y2 values from Fig. 1 form the first column of hk ∈ R2×2, and the y1 and y2 values from Fig. 2 form the second column of hk. The singular values of Hn, n ∈ {20,40,80}, are shown in Fig. 3 where there appear to be seven dominant singular values regardless of the dimension of Hn (assuming enough points are used). This suggests that a reasonable initial choice of model order is ns = 7. You will select a handful of model orders and derive the corresponding state-space models from the Hankel matrix factorization and then compare your models in a few different ways.
Task #1: Model identification
Complete the following:
Figure 1: System response when u1 is a unit impulse and u2 is zero.
Figure 2: System response when u2 is a unit impulse and u1 is zero.
Figure 3: Singular values of H20 (blue), H40 (red) and H80 (green). Only the first 40 singular values are shown for each case.
1. Construct H100 and graph the singular values. Compute models from H100 with the following state dimensions: ns ∈ {6,7,10,20}. There is no need to print out the state-space matrices since you will compute the model properties below. Calculate the max(abs(eig(A))) to confirm that all models are asymptotically stable.
2. Simulate the impulse response of each model and compare it to the measurement data. Use a separate figure for each case (model response versus data). Which model order has an inferior reproduction of the pulse response data? The remaining models are essentially indistinguishable, though, and could be used as a valid model for the system.
3. Another way to compare the model to the data is to compare the model frequency response to the empirical frequency response obtained from the pulse response data. The model frequency response is given by
C(ejωtsI − A)−1B + D,
where ts = 1/40 is the sample period in seconds. The frequency ω is in the interval [0,ωnyq], where ωnyq is the Nyquist frequency (equal to one half the sampling frequency, in other words, ωnyq = 20 hertz). Note that when evaluating the formula, ω should have units of radians-per-second. At each frequency, the frequency response is a 2 × 2 matrix with complex-valued elements. Plot the magnitude and phase of the frequency response of each model with ns ∈ {6,7,10,20}. There will be a total of eight figures because each scalar input-output channel has a magnitude and phase plot (all magnitudes of the (1,1) channel will be compared in a single figure, all phases of the (1,1) channel will be compared in a single figure, and so forth). Don’t forget to label your figures.
4. The pulse response data can be used to directly estimate the frequency response without the need for a parametric model. For example, let y11 be the impulse response data associated with output y1 due to a unit pulse applied by input u1 (where the data sequence is u1). Then, the frequency response associated with this channel can be estimated using the following code:
y11f = fft(y11)./fft(u1); N = length(y11f);
om = [0:N-1]/(ts*N); %%%% frequency vector in hertz
Estimate the empirical frequency response magnitude and phase and overlay these results on the model plots from the previous part. Thus, in each of the eight figures generated in the previous part, you will have 5 traces: 4 from the models and one from the empirical frequency response estimate. You should observe that three of the models match up very well with the empirical frequency response results.
Task #2: Transmission zeros of the MIMO model and zeros of each channel
Consider the model in which ns = 7 for this task. This model provides a very accurate reproduction of the pulse response data and, furthermore, the model frequency response is very close to the empirical frequency response. You can confirm the normal rank of
is 9, that is, rank S(α) = 9 for almost all α ∈ C. A transmission zero of the system occurs at z ∈ C when rank S(z) < 9. In this case, there exist non-zero vectors c ∈ C7 and w ∈ C2 such that
Let the input be uk = wzk, k = 0,1,2,... . The state vector xk = czk satisfies the difference equation with this input because
c .
xk uk
Furthermore, the output is zero for all samples because
yk .
The zeros can be computed from a generalized eigenvalue problem,
.
See the Matlab eig command for more information on solving this generalized eigenvalue problem. Answer the following:
1. Show that there are five finite transmission zeros of the ns = 7 model, with the remaining four zeros given as “inf.” The inf zeros can be ignored. Label the zeros from largest to smallest magnitude. Note |z1| 1 is called an unstable zero because the input it generates is unbounded, however, |zp| < 1, p = 2,3,4,5, are asymptotically stable zeros because the inputs they generate decay to zero, i.e. .
2. Graph the eigenvalues and transmission zeros of the ns = 7 model in the complex plane. Draw a circle of radius 1 (use the Matlab command axis square so the aspect ratio is one-to-one) and note that if your model is asymptotically stable (it should be) then the eigenvalues will lie within the unit circle. Denote the eigenvalues with an “x” and the transmission zeros with a “o.”
3. A discrete-time eigenvalue λd can be converted into its “equivalent” continuous-time eigenvalues according to λd = eλcts, where λc is the continuous-time eigenvalue. Note that imaginary part of λc is not uniquely defined, so just choose the smallest value for the imaginary part of λc (this choice is justified as long as there is no aliasing of resonant frequencies during the data collection). From the set of λc, how many damped oscillators are in your model and what are their approximate natural frequencies? How do the frequencies compare with certain features in frequency response graphs?
4. For the ns = 7 model, graph the eigenvalues and transmission zeros calculated for each individual channel. Note that the eigenvalues will be the same in all cases so the transfer functions in each channel are different because their zeros are different. Reconcile the channel frequency responses with the pole-zero plots, i.e. the presence or lack of the resonances and so forth. Furthermore, treating each channel as a separate system, what are its Hankel singular values? You should observe that each channel, considered in isolation, can actually be described with a lower order model whose dimension is consistent with the near pole-zero cancellations that you observe in each channel.
5. Now return to the Hankel matrix and select ns = 8. This model is over-parameterized in some sense because its pulse response and frequency response are essentially identical to those associated with the ns = 7 model. Create another pole-zero plot for each channel and show that the added pole is accompanied by zero in close proximity and that this occurs for all channels. This creates essentially a pole-zero cancellation in each channel so that the “extra” state dimension in the ns = 8 case does not have much impact on the input-output properties of the system compared to the ns = 7 model.
Task #3: Block diagram from analysis of individual channels
From the analysis of the two input-two output system, there appear to be two oscillators in the system. Furthermore, there are three poles on the positive real axis (near 0.8). These poles correspond to low-pass filters. The poles of each oscillator are distinct (despite the fact that they are quite close) so give each complex conjugate pair of oscillator poles a distinct label like “OSC1” and “OSC2”. Also label the three distinct real poles as “LP1”, “LP2” and “LP3”. Now based on which poles are not canceled by zeros in the individual channels (from the previous Task results), draw a block diagram showing how OSC1, OSC2, LP1, LP2 and LP3 are “connected” to form the two input-two output system. Can you uniquely determine the topology? If not, what parts cannot be resolved? Finally, some channels have a zero at s = 1 –this zero can be combined with one of the low-pass poles to create a high-pass filter. Which one of the low-pass poles (LP1 and/or LP2 and/or LP3?) can be paired with this zero to make the high-pass filter and how are the effects of the high-pass filter evident in the frequency response graphs?
2 Correlation functions
A pulse input to a system is not always ideal in order to determine its impulse response because disturbances and measurement noise also contribute to the observed outputs, and because of limitations on how large the input magnitude can be, it may not be possible to increase the pulse height to a level where the system’s response is dominated by the effect of the pulse input. Thus, it is advantageous to use inputs which are persistent since it is possible to transfer more energy to the system when limits on the input magnitude are present, as is always the case in any real test environment. A persistent input which is quite useful in this regard is “white noise.” Before defining white noise, the auto- and cross-correlation functions are defined for discrete-time persistent signals uk ∈ Rn and yk ∈ Rm,
Auto-correlation of u : Rn×n (4)
Auto-correlation of y : , (5)
Cross-correlation : . (6)
The index k is called the “lag index” which corresponds to the time lag tsk, where ts is the sample period for the data. It is assumed the correlation functions are independent of the time origin, i.e. the assumption that u and y are stationary. The input signal u is “zero mean, unit variance white noise” when
. (7)
In this case, different scalar channels of u are uncorrelated for all lag factors k and, furthermore, each scalar channel has only non-zero correlation with itself when k = 0.
Suppose u and y are the input-output sequences associated with an asymptotically stable linear time-invariant system, then, u and y are related via convolution,
∞ ∞
yk = X hk−lul = X hluk−l
l=−∞ l=−∞
where hk is the system’s pulse response function. In this case, the cross-correlation reduces to
This result demonstrates that a “virtual” test can be conducted: if input u produces output y, then input Ruu produces output Ryu. Importantly, if u is zero-mean, unit variance white noise, then Ruu is given by (7) and so Ryu[k] = hk.
Task #4: Impulse response identification from white noise inputs
Download the data set u_rand.mat. This data set consists of 10 minutes of input-output data. You can assign labels to the different elements of the inputs and outputs as follows:
load u_rand.mat y1 = u_rand.Y(3).Data; y2 = u_rand.Y(4).Data; u1 = u_rand.Y(1).Data; u2 = u_rand.Y(2).Data; ts = 1/40; N = length(y1); t = [0:N-1]*ts - 1;
The sample period remains ts = 1/40 second for this data. Its necessary to assess the statistical properties of the input signals. We can establish that each channel is zero-mean, uncorrelated white noise. Answer the following,
1. Verify that each input sequence is approximately zero mean.
2. Determine and graph estimates of the four entries of Ruu for indices k ∈ [−200,200]. Plot the entries versus lag factor τ ∈ [−5,5] second (which corresponds to the indices k ∈ [−200,200]).
3. Show that
Thus, the variance of each input channel is 4, not 1.
4. Estimate Ryu[k] for τ ∈ [−0.2,2] second and then graph the first column of Ryu normalized by the variance of the first channel of u and graph the second column of Ryu normalized by the variance of the second channel of u. Compare your results to the experimental impulse response obtained from the pulse response experiments -the results should be close!
3 System norms
Just norms were defined for vectors and matrices, it is possible to define norms for systems and in this case, the input “vectors” are actually drawn from some class of signals. Some system norms are induced norms, and others are not. One major difference between signal and system norms compared to norms on finite dimensional vector spaces is that on finite dimensional vector spaces (including matrices) all norms are equivalent in the sense that a vector that is small in one norm will also be small in another norm. This does not generalize to norms on infinite dimensional vector spaces, i.e. linear spaces of signals and systems.
3.1 H2 norm
The H2 norm of a linear, time-invariant system, is defined as the square root of the integral of the square of the Frobenious norm of the system’s impulse response. In other words, if “P” represents a continuous-time LTI system, then
Z ∞
kPk2H2 = kh(t)k2F dt, h = continuous-time impulse response
−∞
when the integral exists, otherwise the norm is infinite. If the system is causal, then the lower limit of integration can be set to 0. Necessary and sufficient conditions for the integral to exist when P is a continuous-time system (as opposed to a discrete-time system) are that P is asymptotically stable and the direct feedthrough term D in the state-space realization must be zero. The latter condition is needed because if D 6= 0 then impulses appear in h, and impulses do not have finite energy in the continuous-time case.
For causal discrete-time systems, the expression is modified accordingly,
∞ kPk2H2 = Xkhkk2F , hk = discrete-time pulse response. (8)
k=0
In the discrete-time case D need not be zero, however, we will assume D = 0 here since it simplifies the formula (also, the systems you have identified have the property D = 0). Note that (8) can be manipulated as follows, assuming the system is causal, asymptotically stable, and has the state-space realization xk+1 = Axk + Buk, yk = Cxk,
(9)
or, alternatively,
(10)
where Go(∞) and Gc(∞) are the observability and controllability gramians that can be computed from the Lyapunov equations (2), (3).
The most relevant interpretation of the H2 norm from an engineering perspective is as follows: it is the root-mean-square (RMS) measurement of the output y when each channel of the input u is a zero-mean, white signal with autocorrelation
function
The RMS value of a discrete-time signal v is defined as
. (11)
Despite the usage of the norm notation, the RMS value of a signal is not a norm (“transient” non-zero signals have zero RMS value, however, the other norm properties are satisfied). Nevertheless, it is a very useful measure of a signal’s “size”, especially when the signal is “random.”
To make the connection between the RMS value of the output y and the H2 norm definition, note that kykRMS can also be expressed as
.
Since Ryy[0] has special significance, compute it:
!
This result is general, however, when u is zero-mean, unit variance white noise, in other words
then Ryy[0] reduces to
∞
Ryy[0] = X hqhTq .
q=−∞
The RMS value of the output is
kyk2RMS = tr Ryy[0]
!
Task #5: H2 norm analysis of identified model
You will use the u_rand.mat data set for this analysis, too. You have already verified
.
This shows the input channels are not unit variance, however, because the variances are both approximately 4, you can scale the input and output data by a factor 2 so that each input channel now has unit variance. You also showed that the individual channels of u have no correlation with each other. Compute and compare the following:
1. the RMS value of the scaled output data y,
2. kPkH2 , where P is your 7-state model derived from the Hankel matrix analysis; use both (9) and (10) for the system norm calculation,
3. approximate kPkH2 from (8) using the experimental pulse response data only (no model).
These calculations should yield essentially the same values.
3.2 H∞ norm
The H2 norm of a system is of great importance in control systems design because a typical objective is to design the controller so that the H2 norm of certain closed-loop input-output channels in minimized (say, from some force disturbances that are modeled as broadband white noise inputs to the motion of a vibration-sensitive instrument; in this case it is very natural to consider the H2 norm of those IO channels as something to be minimized with feedback). Although the H2 can be interpreted as an induced norm, the signal norms for the inputs and outputs are different: “energy” norm for the inputs and the maximum absolute value in time norm for the outputs. The H2 norm does not satisfy the submultiplicative property. In other words, there are examples in which two systems, denoted P1 and P2, have the property kP2·P1kH2 kP2kH2kP1kH2 , where P2 ·P1 denotes the series connection of these systems. This property makes the H2 norm unsuitable for addressing another critical aspect of control systems design, namely, the robustness of the controller to perturbations of the plant dynamics.
The H∞ norm of an asymptotically stable linear system P is defined as
, (12)
where P(jω) represents the system’s frequency response function and σ represents the maximum singular value. The “sup” operation over frequency returns the least upper bound of σ(P(jω)). In many cases, “sup” can be replaced with “max”. If the system is not asymptotically stable then kPkH∞ = ∞.
It is easily seen from its definition that the H∞ norm is submultiplicative. Furthermore, it can be shown that the H∞ norm is an induced norm on an appropriate space of input and output functions so the submultiplicative property follows from this as well. There are several reasons why the H∞ norm is useful in the robustness analysis of feedback systems. First, the fact that this system norm has strong connections with the frequency response makes it convenient to use in modeling the deviations of the system from a “nominal” model. For example, you may test a system under a variety of operating conditions and find that you can associate multiple frequency responses with the system –it just depends on what conditions the system is operating under. One modeling approach is to define a nominal model and a permissible level of deviation from the nominal model, quantified by the H∞ norm, so that all empirical frequency responses are included in the set of models generated by the nominal model along with a permissible deviation. Thus, a model of the following form may be generated:
P(jω) = (I + w(jω)∆(jω))P0(jω). (13)
In this example the “true” plant is described as a multiplicative perturbation of a nominal plant P0. The perturbation is an asymptotically stable, but unknown, system with k∆kH∞ < 1. The scalar weight w is what controls the “size” of the perturbation: if |w(jω)| << 1 at certain frequencies, then P ≈ P0 at those frequencies; on the other hand, if |w(jω)| 1 at other frequencies, then we are admitting we really don’t know the “true” plant model at those frequencies despite the fact that the nominal model P0 is well-defined. This is a very natural way to model a system and will be appreciated by those who have some test and measurement experience, especially with regard to empirical frequency response estimates. A second point demonstrating the usefulness of H∞ norm comes from the fact that a controller, denoted G, will stabilize all plants P of the form (13) if and only if kw(I + P0G)−1kH∞ ≤ 1, which may be interpreted as a bound on the frequency response of a weighted closed-loop transfer function obtained with the nominal model. This conclusion is possible only because the H∞ norm is submultiplicative. This is the subject of robust control and is covered in more detail in courses like MAE273.
State-space computation of kPkH∞ –continuous-time case
The definition of the H∞ norm suggests that a search over frequency can be performed. This is possible, however a more elegant way uses a state-space model of the system. First, consider an asymptotically stable continuous-time system “P” given by x˙ = Ax + Bu, y = Cx + Du. The frequency response function is P(jω) = C(jωI − A)−1B + D and by definition
.
Note that and so the state-space system that produces this expression for its frequency response is given by z˙ = −A∗z + C∗u˜, y˜ = −B∗z + D∗u˜. The frequency response product is created by a series connection of these two systems (the physical system P and the non-physical one), so a state-space realization can be determined for the series connection as follows:
u
(14)
y˜ u
The frequency response from u to y˜ produces (P(jω))∗P(jω). Let the “system” from u to y˜ be denoted L. We also introduce a positive parameter γ whose role will be clarified in a moment. We multiply each output channel of y˜ by 1/γ2. For a fixed γ, suppose at some frequency ω0 the largest eigenvalue associated with the frequency response of is equal to 1, and has corresponding eigenvector v. Recall that the frequency response of L is hermitian at each frequency so all eigenvalues of L(jω) are real. In other words, suppose , and hence v, for some v 6= 0. We surmise
Importantly, in this case there exists the following input-output relationship,
y˜(t) = cos(!0t)v v
Since there is a forced solution in which the input is equal to the output, we can “close the loop” and such a solution will still exist for the closed-loop system,
y˜(t) = cos(!0t)v
Thus, the closed-loop system must have an eigenvalue at jω0 since a sustained oscillation is possible in the absence of external excitation. Conversely, if the closed-loop system has an eigenvalue jω0, then it can be shown that the open loop system frequency response must have an eigenvalue of 1. The closed-loop “A” matrix, denoted Aclp(γ), is a function of γ and is computed by setting u y˜ in the state-space equations (14). The first expression to manipulate is the output equation,
γ2u D∗Du
|{z} y˜
where Dγ is defined as and is assumed to be invertible (we will show why Dγ can always be assumed to be invertible). Now substitute this expression for u into the differential equation part of the state-space discription,
u
In conclusion, kPk ≥ γ ⇐⇒ there exists a purely imaginary eigenvalue(s) of Aclp(γ)
H∞
or equivalently, kPk < γ ⇐⇒ there exists no purely imaginary eigenvalues of Aclp(γ)
H∞
Thus, instead of searching over frequency for the largest singular value of P(jω), we search over γ and determine whether
Aclp has imaginary eigenvalues. A bisection procedure on γ can compute kPk to arbitrary accuracy.
H∞
A final word is necessary on the assumed invertibility of Dγ. Since limω→∞ P(jω) = D, then a lower bound for kPk
H∞ is σ(D). In other words, in the bisection search over γ we can always take γ σ(D) which guarantees invertibility of Dγ.
State-space computation of kPkH∞ –discrete-time case
The frequency response associated with the discrete-time system xk+1 = Axk + Buk, yk = Cxk + Duk, is
and the H∞ norm of the system is defined as
.
The problem with extending the continuous-time analysis to the discrete-time case is due to the fact that it is not immediately apparent how to manipulate into another discrete-time system with an equivalent frequency response (the problem is how to deal with e−jωts) so another approach will be taken.
Consider a partitioned matrix M in which M11 ∈ Cm×p, M21 ∈ Cn×p, M12 ∈ Cm×n and M22 ∈ Cn×n. Also consider Q ∈ Cn×n. Imagine “connecting” the matrices in the following manner:
where the vectors have the following dimensions y ∈ Cm, u ∈ Cp, z ∈ Cn and w ∈ Cn. This relationship reduces to
y , (15)
where it is assumed that I − M22Q is invertible. The frequency response function for the continuous-time system x˙ =
Ax + Bu, y = Cx + Du is
where upon comparison with (15) we observe that the frequency response can be interpreted in this framework by selecting M11 = D, M12 = C, M21 = B, M22 = A, and Q = jω1 I. Similarly, the frequency response function associated with the discrete-time system xk+1 = Axk + Buk, y = Cxk + Duk can be computed by setting M11 = D, M12 = C, M21 = B,
M22 = A, and Q = e jωt1 s I. Thus, the frequency responses can be pictorially represented,
The frequency response of the discrete-time system can be related to the frequency response of an “equivalent” continuoustime system by noting that for any ω ∈ [0,ωnyq) (note that ωnyq is excluded), ejωts is uniquely related to ν ∈ R according to the bilinear transformation,
(16)
but since 1/ejωts appears in the diagram, we need to derive a relation between it and 1/jν,
(17)
The continuous-time system has no direct physical meaning -it is merely constructed so that it’s frequency response function at any given “frequency” ν has a unique match with the physical discrete-time system’s frequency response.
The “equivalent” continuous-time system can be determined by noting that (17) also has the following pictorial representation,
If we insert the right-hand diagram into the diagram for the discrete-time frequency response, it can be rearranged as shown on the right by eliminating the internal z and w signals,
where the “continuous-time” state-space matrices are computed to be
(18)
Thus, in order to compute the H∞ norm for the discrete-time system, we “convert” the state-space matrices into the continuoustime form (18) and then apply the bisection search procedure over γ using Aclp(γ) formed from the matrices in (18). If H∞ norm is γ0 then Aclp(γ0) has one or more eigenvalues that are purely imaginary. The frequency at which the maximum is achieved, however, needs to be determined so let ν0 be the imaginary part of one of the purely imaginary eigenvalues of Aclp(γ0). Compute the corresponding ω0 from (16) to determine the frequency at which the largest singular value of the frequency response occurs. The only frequency not covered by the search procedure is ω = ωnyq so the discrete-time frequency response is just checked separately for this case.
Task #6: H∞ norm analysis of identified model
Complete the following:
1. Write a Matlab function that accepts as inputs the state-space matrices of a continuous-time system, upper and lower limits for the γ search, and a tolerance, and then returns the H∞ norm of the system computed to within the specified tolerance, and the approximate frequency at which the maximum gain is achieved.
2. Write a Matlab function that accepts as inputs the state-space matrices of a discrete-time system, upper and lower limits for the γ search, a tolerance, and the sample period ts associated with the system, and then returns the H∞ norm of the system computed to within the specified tolerance, and the approximate frequency at which the maximum gain is achieved. Note that this Matlab function can call the Matlab function you wrote for the continuous-time case.
3. Compute the H∞ norm of your identified discrete-time model using the Matlab functions, including the frequency at which it is achieved.
4. Compute the discrete-time frequency response of the identified model on a frequency grid in the interval [0,ωnyq] and then plot the singular values of the frequency response on this frequency grid. Overlay on top of the singular values, the singular values that you compute from the empirical frequency response data. You should observe that the model singular values are very close to the empirical singular values and that the H∞ norm computed from the model locates the magnitude and corresponding frequency where the maximum singular value achieves its largest value.