## Latin American applied research

*Print version* ISSN 0327-0793

### Lat. Am. appl. res. v.35 n.2 Bahía Blanca abr./jun. 2005

**An NIIR structure using HL CPWL functions**

**L. R. Castro ^{1}, J. L. Figueroa^{2}, and O. E. Agamennoni^{3}**

^{1} *Dto. de Matemática, Univ. Nac. del Sur, B8000CPB Bahía Blanca, Argentina* lcastro@uns.edu.ar

^{2} *Dto. de Ing. Eléctrica y de Comp. - C.O.N.I.C.E.T., Univ. Nac. del Sur, B8000CPB - Bahía Blanca, Argentina* figueroa@uns.edu.ar

^{3} *Dto. de Ing. Eléctrica y de Comp. - C.I.C., Univ. Nac. del Sur, B8000CPB - Bahía Blanca, Argentina* oagamen@uns.edu.ar

*Abstract*: In this paper we present a nonlinear infinite impulse response (NIIR) model structure for black-box identification of nonlinear dynamic systems. The proposed structure supports an identification algorithm in which the degrees of freedom of the nonlinear output error (NOE) model can be easily increased or decreased during the identification process. This property is attractive for finding an appropriate NIIR model while avoiding overfitting. It is achieved by using High Level Canonical Piecewise Linear (HL CPWL) functions with an increasing (or decreasing) grid division, so the algorithm may start from a linear estimate of the model. The parameters of the HL CPWL functions are updated using a simple algorithm based on a modified steepest descent method with an independently adaptive learning rate.

*Keywords*: Nonlinear Identification. NIIR Model. PWL functions.

**I. INTRODUCTION**

The main problem in system identification is to find a good model structure. Allowing the structure to move from a linear model to a nonlinear one during the identification process makes this problem much harder, since the set of nonlinear models is richer than the set of linear ones (Sjöberg and Ngia, 1998). If a nonlinear finite impulse response (NFIR) structure is used, the model order evaluation problem may be effectively addressed by using regularization theory (Poggio and Girosi, 1990). This is possible because NFIR model structures reduce the computational complexity: the identification algorithm can consider more parameters than needed and drive some of them to zero through the regularization process. If a Wiener-like model structure is used, an aggregation approach can be easily implemented, as in the Korenberg algorithm (Korenberg and Paarmann, 1991). In the neural networks literature there exist growing and pruning methods to deal with the size of a neural network during the training process (Haykin, 1994). If NIIR model structures are used, the problem becomes much more difficult due to the mathematical complexity and the computational cost involved in the identification process.

In this paper we present an NIIR model structure that uses High Level Canonical Piecewise Linear (HL CPWL) functions to develop a nonlinear output error (NOE) identification algorithm. The main feature of this algorithm is its simple mechanism for increasing or decreasing the approximation capabilities of the model while retaining the approximation already achieved when moving from one grid division to another. In this way, it is possible to start the identification with a linear approximation and then progressively increase the degrees of freedom of the model in order to reduce the mismatch to an acceptable value. Conversely, a reduced model may be evaluated to alleviate overfitting.

The paper is organized as follows. In Section II, we present the identification algorithm and analyze its advantages and drawbacks; in Section III we develop an example of the proposed methodology and finally, in Section IV we draw some conclusions and comments about future work. In order to be self-contained, in Appendix A we give a brief introduction to HL CPWL functions and their main properties.

**II. CPWL IIR IDENTIFICATION**

**A. From linear to nonlinear**

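Let us suppose that we want to identify a system given an output vector $\mathbf{y}$ corresponding to an input $\mathbf{u}$. If $\hat{\mathbf{y}}$ is the estimated vector, let us define

$$\mathbf{u}^{k} = [u_1, u_2, \ldots, u_k], \tag{1}$$

$$\hat{\mathbf{y}}^{k} = [\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_k]. \tag{2}$$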

It is well known (see (Sjöberg *et al.*, 1995), for example) that a general black-box model is given by

$$\hat{y}_k = f(\varphi_k, \theta), \tag{3}$$

where $\varphi_k = \varphi(\mathbf{u}^{k}, \hat{\mathbf{y}}^{k-1})$ is the regression vector and $\theta$ is the vector of parameters associated with the function $f$ used to approximate the system's nonlinearity. Therefore, the model is defined once $f$ and the regression vector $\varphi_k$ are chosen.

Following this idea, we propose a regression vector given by

$$\varphi_k = [u_k, u_{k-1}, \ldots, u_{k-M}, \hat{y}_{k-1}, \ldots, \hat{y}_{k-N}], \tag{4}$$

with *M*, *N* fixed.

Then our model is defined as follows

$$\hat{y}_k = f_{pwl}(\varphi_k), \tag{5}$$

where the function $f_{pwl}$ used to approximate the nonlinearity of the model is an HL CPWL function defined, as in Eq. (19), by

$$f_{pwl}(\varphi_k) = \mathbf{c}^T \Lambda(\varphi_k), \tag{6}$$

and $\hat{y}_r$, $r = 0, \ldots, N$, are initialization values. This model is pictured in Fig. 1.

Figure 1. NIIR HL CPWL model.
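The output-error recursion behind Eqs. (4) and (5) can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation. It assumes that the regression vector collects the M + 1 most recent inputs and the N most recent model outputs (consistent with m = M + N + 1). The names `regressor` and `simulate_noe` are hypothetical, `f` stands for any scalar map (for instance an HL CPWL function), and taking the first max(M, N) outputs as initialization values is a simplification made here to keep the indices valid.

```python
from typing import Callable, List, Sequence


def regressor(u: Sequence[float], y_hat: Sequence[float],
              k: int, M: int, N: int) -> List[float]:
    """Build [u_k, ..., u_{k-M}, y_hat_{k-1}, ..., y_hat_{k-N}] (assumed form)."""
    past_inputs = [u[k - i] for i in range(M + 1)]
    past_outputs = [y_hat[k - j] for j in range(1, N + 1)]
    return past_inputs + past_outputs


def simulate_noe(f: Callable[[List[float]], float], u: Sequence[float],
                 M: int, N: int, y_init: Sequence[float]) -> List[float]:
    """Run the model in output-error mode: past *model* outputs are fed back."""
    start = max(M, N)  # first index at which the regressor is fully defined
    if len(y_init) != start:
        raise ValueError("y_init must contain max(M, N) initialization values")
    y_hat = list(y_init)
    for k in range(start, len(u)):
        y_hat.append(f(regressor(u, y_hat, k, M, N)))
    return y_hat


if __name__ == "__main__":
    # Hypothetical linear map used only to exercise the recursion.
    linear = lambda phi: 0.5 * phi[0] + 0.5 * phi[1]
    print(simulate_noe(linear, [1.0, 1.0, 1.0, 1.0], M=0, N=1, y_init=[0.0]))
```

Because the measured output never enters the regressor, any change in the parameters of `f` affects all later estimates, which is what makes NOE identification harder than the NFIR case.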

The domain of the function $f_{pwl}$ is a compact set $S \subset \mathbb{R}^m$, $m = M + N + 1$, defined as follows

$$S = \{\mathbf{x} \in \mathbb{R}^m : a_i \le x_i \le a_i + \delta\, ndiv,\ i = 1, \ldots, m\}, \tag{7}$$

$\delta$ being the fixed grid size and *ndiv* the number of divisions. So each interval $[a_i, a_i + \delta\, ndiv]$ of the domain S defined by Eq. (7) is divided into *ndiv* subintervals of equal length $\delta$. As a consequence, when the grid size $\delta$ decreases, the number of divisions *ndiv* increases.

According to Appendix A, the set defined by Eq. (7) is partitioned into polyhedral regions using a simplicial boundary configuration. The *f _{pwl}* constructed using the methodology described in Appendix A is linear on each simplex and continuous on the adjacent boundaries of the simplices.

In the methodology proposed above, it is possible to start the identification process with a linear approximation to the system. Once the parameters are optimized, the number of divisions *ndiv* may be increased in order to obtain a better piecewise linear approximation. Conversely, it is possible to go from a fine approximation to a coarser one by decreasing the value of *ndiv*. This modeling facility not only yields a better piecewise linear approximation but also makes it possible to prevent overfitting.

Therefore, when using HL CPWL functions as nonlinear approximators, the parameter *ndiv* gives a natural ordering of the models, since it allows going from a simple model to a more complex one. The advantages of using this kind of model were pointed out in (Sjöberg and Ngia, 1998, Ch. 1).

Fig. 2 (a) and (b) illustrate the idea of approximating a nonlinear quadratic function using HL CPWL functions with increasing values of *ndiv* (*ndiv* = 2, 4). As the value of *ndiv* increases, the HL CPWL function approximates the nonlinear one more accurately. The simplices determined on the region S by the different numbers of divisions *ndiv* can also be seen on the *XY* plane.

Figure 2. HL CPWL approximation for (a) *ndiv* = 2 and (b) *ndiv* = 4.
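The effect shown in Fig. 2 can be reproduced numerically with a small sketch. The code below is an illustration under stated assumptions, not the HL CPWL toolbox: it interpolates a function on the unit square, with each grid cell split into two triangles along its diagonal, and works directly with vertex values instead of the HL CPWL basis coefficients. The function names and the test function g(x, y) = x² + y² are hypothetical choices.

```python
from typing import Callable


def pwl_interp_2d(g: Callable[[float, float], float], ndiv: int,
                  x: float, y: float) -> float:
    """Piecewise linear interpolation of g on [0, 1]^2 over a simplicial grid."""
    delta = 1.0 / ndiv
    i = min(int(x / delta), ndiv - 1)   # cell index along x
    j = min(int(y / delta), ndiv - 1)   # cell index along y
    s = x / delta - i                   # local coordinates in [0, 1]
    t = y / delta - j
    x0, y0 = i * delta, j * delta
    f00, f10 = g(x0, y0), g(x0 + delta, y0)
    f01, f11 = g(x0, y0 + delta), g(x0 + delta, y0 + delta)
    if s >= t:   # triangle with vertices (0,0), (1,0), (1,1) in local coordinates
        return f00 + s * (f10 - f00) + t * (f11 - f10)
    # triangle with vertices (0,0), (0,1), (1,1) in local coordinates
    return f00 + t * (f01 - f00) + s * (f11 - f01)


def max_error_2d(g: Callable[[float, float], float], ndiv: int,
                 samples: int = 40) -> float:
    """Largest absolute interpolation error over a uniform sample of the square."""
    worst = 0.0
    for a in range(samples + 1):
        for b in range(samples + 1):
            x, y = a / samples, b / samples
            worst = max(worst, abs(g(x, y) - pwl_interp_2d(g, ndiv, x, y)))
    return worst


if __name__ == "__main__":
    quadratic = lambda x, y: x * x + y * y
    for ndiv in (2, 4):
        print(ndiv, max_error_2d(quadratic, ndiv))
```

For this particular quadratic the error inside a cell is δ²[s(1 - s) + t(1 - t)] in the local coordinates (s, t), so doubling *ndiv* divides the maximum error by four.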

**B. Identification algorithm**

Let $(u_k, y_k)_{1 \le k \le L}$ be the input/output vectors and $\mathbf{c}^{d-1,*}$ the $(ndiv + 1)^{M+N+1}$-dimensional vector of parameters for a given number of divisions $ndiv = 2^{d-1}$, $d \in \mathbb{N}$ (if $d = 1$ we have a linear approximation). For $ndiv = 2^{d}$ we find a new $(ndiv + 1)^{M+N+1}$-dimensional vector of parameters using a least squares approximation technique on the new set of vertices of the region S, and denote it by $\mathbf{c}^{d,r}$, $r = 0$.
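The refinement step and the growth of the parameter vector can be sketched in one dimension. The code below is only an illustration under assumptions: it stores the function by its vertex values rather than by the HL CPWL coefficients c, and it obtains the finer representation by evaluating the coarse function at the new vertices instead of solving the least squares problem used in the paper. All function names are hypothetical.

```python
from typing import List, Tuple


def num_parameters(ndiv: int, m: int) -> int:
    """Number of grid vertices, and hence parameters, for an m-dimensional domain."""
    return (ndiv + 1) ** m


def eval_pwl_1d(values: List[float], a: float, delta: float, x: float) -> float:
    """Evaluate the 1-D piecewise linear function given by its vertex values."""
    ndiv = len(values) - 1
    i = min(max(int((x - a) / delta), 0), ndiv - 1)
    s = (x - a) / delta - i
    return values[i] + s * (values[i + 1] - values[i])


def refine_1d(values: List[float], a: float,
              delta: float) -> Tuple[List[float], float]:
    """Double ndiv (halve delta) while keeping the represented function unchanged."""
    new_delta = delta / 2.0
    new_ndiv = 2 * (len(values) - 1)
    new_values = [eval_pwl_1d(values, a, delta, a + j * new_delta)
                  for j in range(new_ndiv + 1)]
    return new_values, new_delta


if __name__ == "__main__":
    coarse = [0.0, 1.0, 0.5]                 # ndiv = 2 on [0, 2]
    fine, fine_delta = refine_1d(coarse, 0.0, 1.0)
    print(fine, fine_delta)
    print([num_parameters(n, 2) for n in (1, 2, 4, 8)])
```

The refined vertex values describe exactly the same function as the coarse ones, so the optimization at the new level starts from the approximation already achieved. The parameter count, however, grows as (ndiv + 1)^m.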

Now we update the vector of parameters $\mathbf{c}^{d,r}$, $r \ge 1$, using an iterative algorithm that minimizes the square error $E^r$ between the system output $\mathbf{y}$ and the estimate $\hat{\mathbf{y}}^{r}$ at iteration $r$. The expression of this error in the variables $\mathbf{c}^{d,r}$ can be written, using Eq. (6), as follows

$$E^r = \sum_{k=1}^{L} \left( y_k - (\mathbf{c}^{d,r})^T \Lambda(\varphi_k) \right)^2. \tag{8}$$

In order to minimize Eq. (8) we use the following modified steepest-descent algorithm. The required components of the gradient vector $\nabla E^r$ are given by

(9) |

and the vector of parameters $\mathbf{c}^{d,r}$ is updated using the formula

$$\mathbf{c}^{d,r} \leftarrow \mathbf{c}^{d,r-1} + \Delta\mathbf{c}^{d,r}, \tag{10}$$

where each component of $\Delta\mathbf{c}^{d,r}$ is adaptively updated using the following algorithm

, (11) |

where the learning rates lr^{r} are modified as described below and the momentum μ is fixed.

(12) |

*inc* > 1 and *dec* < 1 being real, fixed, positive constants.
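As an illustration of this kind of update, the sketch below implements a steepest-descent iteration with momentum and one learning rate per parameter. It is not a transcription of Eqs. (11) and (12): the specific rule used here (multiply a learning rate by `inc` when the corresponding gradient component keeps its sign, by `dec` when the sign flips) and the form of the momentum term are assumptions chosen for the example, as are the function name and the default constants.

```python
from typing import Callable, List, Sequence


def adaptive_descent(grad: Callable[[List[float]], List[float]],
                     c0: Sequence[float], lr0: float = 0.05, mu: float = 0.5,
                     inc: float = 1.05, dec: float = 0.7,
                     iters: int = 500) -> List[float]:
    """Steepest descent with momentum and independently adapted learning rates."""
    c = list(c0)
    lr = [lr0] * len(c)          # one learning rate per parameter
    step = [0.0] * len(c)        # previous update, used by the momentum term
    g_prev = [0.0] * len(c)
    for _ in range(iters):
        g = grad(c)
        for j in range(len(c)):
            if g[j] * g_prev[j] > 0.0:      # same sign: speed up (assumed rule)
                lr[j] *= inc
            elif g[j] * g_prev[j] < 0.0:    # sign flip: slow down (assumed rule)
                lr[j] *= dec
            step[j] = mu * step[j] - (1.0 - mu) * lr[j] * g[j]
            c[j] += step[j]
        g_prev = g
    return c


if __name__ == "__main__":
    # Hypothetical badly scaled quadratic E(c) = (c0 - 1)^2 + 10 (c1 + 2)^2.
    gradient = lambda c: [2.0 * (c[0] - 1.0), 20.0 * (c[1] + 2.0)]
    print(adaptive_descent(gradient, [0.0, 0.0]))
```

Adapting each learning rate separately lets poorly scaled directions progress at their own pace, which is the practical motivation for an independently adaptive rate.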

From the formulation, the local convergence of the method to a minimum follows immediately. The drawback is that the minimum reached may be a local one rather than the global one. In addition, the large number of parameters generated by the HL CPWL approximation when the number of divisions of the region S increases is at present a limitation of the method.

In spite of this, the advantages of using HL CPWL functions, enumerated below, make this identification structure worth defining.

1. The computation of the gradient is linear in the parameters and straightforward, since the approximation has already been computed in the previous step.

2. The canonical HL CPWL approximation uses the least number of parameters, in the sense that any other PWL approximation has a greater or equal number of parameters (see (Julián *et al.*, 1999; Julián, 1999)).

3. A very efficient method for computing the HL CPWL approximation (see (Julián, 1999; Julián *et al.*, 1999; Julián *et al.*, 2000)) has been implemented in the MATLAB environment for both HL CPWL and orthonormal HL CPWL functions (Julián, 2000).

**III. EXAMPLE**

We consider the well known nonlinear system due to Narendra and Parthasarathy (Narendra and Parthasarathy, 1990) given by

, (13) |

with **u** a random signal with uniform distribution. According to the proposed methodology, the regressor was defined with one input and one delayed output, *i.e.* $\varphi_k = [u_k\ \ \hat{y}_{k-1}]$.

We first generated a linear ARX model of the system given by Eq. (6). In Fig. 3 it is possible to see this linear approximation in a PWL format.

Figure 3. Linear approximation using HL CPWL representation.

In order to improve the model performance, we increased the number of divisions to *ndiv* = 2 and optimized the parameters as described in Section IIB. Consequently, a new set of HL CPWL functions was obtained (see Fig. 4). We then repeated the process using *ndiv* = 4. As can be seen in Fig. 5, the approximation improves rapidly.

Finally, the number of divisions of S was increased to *ndiv* = 8. The new HL CPWL approximation can be seen in Fig. 6.

The approximation to the nonlinear system improves quickly as the number of divisions of the set S increases. This is clearly shown in Fig. 7 and Fig. 8. Fig. 7 depicts the RMS error of the parameter optimization *versus* the number of iterations for each number of divisions. The error decreases sharply each time the number of divisions is increased. Fig. 8 shows the approximation and validation errors (*i.e.* the error on the data used for approximation and the error on data not used for approximation, respectively) for the ARX and the NIIR HL CPWL models. There is a significant reduction of both the approximation and the validation errors as the number of divisions increases.

Figure 4. NIIR HL CPWL approximation using *ndiv* = 2.

Figure 5. NIIR HL CPWL approximation using *ndiv* = 4.

Figure 6. NIIR HL CPWL approximation using *ndiv* = 8.

Figure 7. RMSE approximation error for the NIIR HL CPWL models using *ndiv* = 2,4,8.

**IV. CONCLUSIONS**

In this paper an NOE identification algorithm based on HL CPWL functions is presented. The main advantages of the algorithm are the following. First, the algorithm might be easily implemented in microelectronics due to the efficient computation of the HL CPWL functions and of the gradient. Second, the mechanism for increasing or decreasing the degrees of freedom of the model is simple and retains the model approximation already achieved.

The parameters of the HL CPWL function for a given number of divisions could be straightforwardly evaluated from the previous ones. This would avoid using the least squares methodology mentioned in Section IIB. This is the focus of our future work.

Furthermore, the potential of our approach has been illustrated with a simulation example.

**APPENDIX A. HL CPWL FUNCTIONS**

**Definition A.1** *A function* $f : S \subset \mathbb{R}^m \to \mathbb{R}$, *where* $S$ *is a compact set, is PWL if and only if it satisfies the following*

*(i) The domain* $S$ *is divided into a finite number of polyhedral regions* $R^{(1)}, R^{(2)}, \ldots, R^{(N)}$ *such that* $S = \bigcup_{i=1}^{N} R^{(i)}$*, by a finite set of boundaries*

$$H = \{H_i \subset S,\ i = 1, 2, \ldots, h\}, \tag{14}$$

*such that each boundary is an* $(m - 1)$*-dimensional hyperplane (or a subset of the hyperplane)*

$$H_i = \{\mathbf{x} \in S : \alpha_i^T \mathbf{x} - \beta_i = 0\}, \tag{15}$$

*where* $\alpha_i \in \mathbb{R}^m$ *and* $\beta_i \in \mathbb{R}$ *for* $i = 1, 2, \ldots, h$*, and cannot be covered by an* $(m - 2)$*-dimensional hyperplane*^{1}.

Figure 8. Approximation and validation errors for the NIIR HL CPWL model.

*(ii)* $f$ *is represented by an affine mapping of the form*

$$f^{(i)}(\mathbf{x}) = J^{(i)}\mathbf{x} + w^{(i)}, \tag{16}$$

*for any* $\mathbf{x} \in R^{(i)}$*;* $J^{(i)} \in \mathbb{R}^{1 \times m}$ *is the Jacobian of the region* $R^{(i)}$ *and* $w^{(i)} \in \mathbb{R}$.

*(iii)* $f$ *is continuous on any boundary between two adjacent regions, i.e.*

$$J^{(p)}\mathbf{x} + w^{(p)} = J^{(q)}\mathbf{x} + w^{(q)}, \tag{17}$$

*for any* $\mathbf{x} \in R^{(p)} \cap R^{(q)}$.

If S is defined as in Eq. (7), the space of all continuous PWL mappings defined over the domain S partitioned with a simplicial boundary configuration *H* is denoted by *PWL _{H}* [S] and it is a linear vector space with the sum and multiplication of functions by a scalar defined as usual.

A basis for this space, constructed in (Julián *et al.*, 1999) by nesting absolute value functions, can be expressed in vector form as

$$\Lambda = \left[ \Lambda^{0T}, \Lambda^{1T}, \ldots, \Lambda^{mT} \right]^T, \tag{18}$$

where $\Lambda^{i}$ is the vector containing the generating functions defined in (Julián *et al.*, 1999) with $i$ nesting levels. Accordingly, any $f_p \in PWL_H[S]$ can be written as

$$f_p(\mathbf{x}) = \mathbf{c}^T \Lambda(\mathbf{x}), \tag{19}$$

where $\mathbf{c} = \left[ \mathbf{c}^{0T}, \mathbf{c}^{1T}, \ldots, \mathbf{c}^{mT} \right]^T$, and every vector $\mathbf{c}^{i}$ is a parameter vector associated with the vector function $\Lambda^{i}$.

Then the HL CPWL functions defined on S uniformly approximate any continuous function $g : S \to \mathbb{R}$. The HL CPWL approximation to the nonlinear function $g$ is defined (cf. (Julián *et al.*, 1999; Julián, 1999)) as the function $f_{pwl} \in PWL_H[S]$ that satisfies

$$f_{pwl}(\mathbf{v}^{j}) = g(\mathbf{v}^{j}), \tag{20}$$

$\mathbf{v}^{j}$ being the vertices of the simplicial partition $H$ of the domain S. If $g(\cdot)$ is Lipschitz continuous with Lipschitz constant $L$ and the modeling error is defined as

$$\varepsilon = \max_{\mathbf{x} \in S} \left| g(\mathbf{x}) - f_{pwl}(\mathbf{x}) \right|, \tag{21}$$

then we have that

$$\varepsilon \le \delta L. \tag{22}$$
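The bound of Eq. (22) can be checked numerically in one dimension. The sketch below is an illustration only: it assumes the modeling error is the largest absolute difference between g and its vertex interpolant over the domain, and it uses g(x) = sin(x) on [0, 3], whose Lipschitz constant is L = 1. The function name `max_interp_error` and the sampling resolution are hypothetical choices.

```python
import math
from typing import Callable, Tuple


def max_interp_error(g: Callable[[float], float], a: float, b: float,
                     ndiv: int, samples: int = 2000) -> Tuple[float, float]:
    """Return (sampled maximum error, grid size delta) for vertex interpolation."""
    delta = (b - a) / ndiv
    worst = 0.0
    for n in range(samples + 1):
        x = a + (b - a) * n / samples
        i = min(int((x - a) / delta), ndiv - 1)     # interval containing x
        s = (x - a) / delta - i                     # local coordinate
        left, right = g(a + i * delta), g(a + (i + 1) * delta)
        f = left + s * (right - left)               # PWL value at x
        worst = max(worst, abs(g(x) - f))
    return worst, delta


if __name__ == "__main__":
    lipschitz = 1.0   # |d/dx sin(x)| <= 1
    for ndiv in (1, 2, 4, 8):
        err, delta = max_interp_error(math.sin, 0.0, 3.0, ndiv)
        print(ndiv, err, delta * lipschitz)
```

The bound holds because the interpolant at x is a convex combination of the values of g at two vertices, each of which lies within δ of x.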

In order to obtain an orthonormal basis, it is necessary to define an inner product on $PWL_H[S]$. If $V_S$ is the set of vertices of S and $f$, $g$ belong to $PWL_H[S]$, then

$$\langle f, g \rangle = \sum_{\mathbf{v}_i \in V_S} f(\mathbf{v}_i)\, g(\mathbf{v}_i), \tag{23}$$

defines an inner product, and so the space $PWL_H[S]$ becomes a Hilbert space. The new basis elements are linear combinations of the elements of (18), that is

$$\Upsilon(\mathbf{x}) = T \Lambda(\mathbf{x}), \tag{24}$$

and the matrix $T$ may be obtained using the Gram-Schmidt procedure as given in (Julián *et al.*, 2000).

Also, the HL CPWL functions of this class can uniformly approximate any continuous function $g : S \to \mathbb{R}$. To find the required approximation, we use a routine of (Julián, 2000) that finds a vector of parameters $\mathbf{c}$ that is the solution of the least squares problem $\min_{\mathbf{x}} \|A\mathbf{x} - \mathbf{b}\|_2$, where $A = \Upsilon^T(X)$, $X$ is the input matrix and $\mathbf{b}$ is the output to be approximated, in sparse format. In accordance with (Julián *et al.*, 2000), the HL CPWL approximation of the nonlinear function $g$ is defined as the function $f_{pwl} \in PWL_H[S]$ satisfying $f_{pwl} = A\mathbf{c}$.

^{1} We say that a boundary *B* is *covered* by a hyperplane *H* if and only if *B* ⊂ *H*.

**ACKNOWLEDGMENT**

This work was partially supported by grant 24/K011, SEG-CyT, UNS.

**REFERENCES**

1. Haykin, S. *Neural Networks: a comprehensive foundation*. Macmillan, New York (1994).

2. Julián, P. "A toolbox for the piecewise approximation of multidimensional functions". http://www.pedrojulian.com (2000).

3. Julián, P., A. Desages and M. B. D'Amico. "Orthonormal high level canonical PWL functions with applications to model reduction". *IEEE Trans. on Circ. and Syst.*, **47**, 702-712 (2000).

4. Julián, P., A. Desages and O. Agamennoni. "High level canonical piecewise linear representation using a simplicial partition". *IEEE Trans. on Circ. and Syst.*, **46**, 463-480 (1999).

5. Julián, P. "A high level canonical piecewise linear representation: theory and applications". PhD thesis, Universidad Nacional del Sur, Bahía Blanca, Argentina. UMI Dissertation Services, Michigan, USA (1999).

6. Korenberg, M. J. and L. D. Paarmann. "Orthogonal approaches to time-series and system identification". *IEEE Signal Processing Magazine*, 29-43 (1991).

7. Narendra, K. S. and K. Parthasarathy. "Identification and control of dynamical systems using neural networks". *IEEE Trans. on Neural Networks*, **1**, 4-27 (1990).

8. Poggio, T. and F. Girosi. "Regularization algorithms for learning that are equivalent to multilayer networks". *Science*, **247**, 978-982 (1990).

9. Sjöberg, J. and S. H. Ngia. "Neural nets and related model structures for nonlinear system identification". In *Nonlinear modeling: advanced black-box techniques*, J. A. K. Suykens and J. Vandewalle, Eds. Kluwer Academic Publishers, 1-28 (1998).

10. Sjöberg, J., Q. Zhang, L. Ljung, A. Benveniste, B. Delyon, P. Glorennec, H. Hjalmarsson and A. Juditsky. "Nonlinear black-box modeling in system identification: a unified overview". *Automatica*, **31**(12), 1691-1724 (1995).