Latin American applied research
versión ISSN 0327-0793
Lat. Am. appl. res. v.35 n.2 Bahía Blanca abr./jun. 2005
A method solving an inverse problem with unknown parameters from two sets of relative measurements
J. R. Vega1, G. L. Frontini2, L. M. Gugliotta1, and G. E. Eliçabe2
1 Instituto de Desarrollo Tecnológico para la Industria Química (INTEC), CONICET and Univ. Nac. del Litoral, 3000 Santa Fe, Argentina
2 Instituto de Investigación en Ciencia y Tecnología de Materiales (INTEMA), Univ. Nac. de Mar del Plata, 7600 Mar del Plata, Argentina.
Abstract ¾ This work deals with an ill-posed inverse problem in which a distribution function, f(x), is estimated from two independent sets of non-negative relative measurements. Each measurement set is modeled through a Fredholm equation of the first kind, with unknown parameters in its kernel. While the first measurement model only includes a scalar unknown parameter, p0, the second model contains a vector of unknown parameters, p. The proposed method consists of the following steps: (i) to obtain a first estimate of f(x) and p0 from the first measurement; (ii) to estimate the vector p from the second measurement and the previous estimate of f(x); and (iii) to estimate an improved f(x) by simultaneously using both measurements and the estimated parameters in a unique combined problem. The proposed algorithm is evaluated through a numerical example for simultaneously estimating the particle size distribution and the refractive index of a polymer latex, from combined measurements of elastic light scattering and turbidity.
Keywords ¾ Inverse Problem. Parameter Estimation. Combined Measurements. ELS. Turbidity.
Inverse problems are frequently present in most of the measurement systems that include non-ideal devices, or when only indirect measurements are available. Indirect measurements arise when a physical property of a sample, f(x), must be estimated from the measurement of a different physical quantity, g(y). The estimation problem consists in finding f(x) from g(y), on the basis of theoretical measurement models that relate these variables. Often, this kind of inverse problem is numerically 'ill-conditioned'; i.e., small changes in the measured variable (for example, originated by different noise levels), may lead to large changes in the estimated variables (Kirsch, 1996). As a consequence, simulation of different f(x) can generate almost the same g(y), thus complicating the estimation problem.
Regularization methods replace the unstable original problem by a similar one, but stable or well-conditioned; and they usually include adjustable parameters, a priori knowledge of the solution, or some smoothness condition (Kirsch, 1996; Engl et al., 1996; Hansen, 1994). Often, when a regularization condition is included, then several solutions can be attained. Typically, a trade-off solution must be selected: (i) a strong regularization modifies the original problem, with the advantage of leading to a smooth solution; and (ii) a weak regularization keeps the original problem almost unchanged, but can originate oscillating solutions. In general, all regularization method includes (at least) an adjustable parameter, which is usually selected by the user on the basis of the obtained solutions. Alternatively, the regularization parameter can be automatically estimated through some numerical methods, as for example the Generalized Cross Validation (GCV) technique (Golub et al., 1979), or the L-curve technique (Hansen, 1994).
Combination of two or more independent sets of measurements (e.g., obtained from two different equipments), allows increasing the information content in the problem, and can contribute to improve the quality of the estimates. The numerical treatment of the combined problem is simple when the indirect measurements are absolute (i.e., when all proportionality constants are known), and the involved mathematical models are linear (Eliçabe and Frontini, 1996). However, if the model related to the measured physical quantities includes unknown proportionality constants, then the measurements are relative, and a 'normalization' parameter is required to adequately combine all sets of measurements in a unique problem (Frontini and Eliçabe, 2000). In such cases, the resulting combined problem may be nonlinear, and some alternative algorithms have been proposed to solve it. For example, Frontini and Eliçabe (2000) developed a method to estimate in a single step f(x) and the normalization parameter. Alternatively, Vega et al. (2001) and Vega et al. (2003), proposed a "two-step" procedure that involves the resolution of two linear estimation problems. In the first step, the normalization parameter is estimated on the basis of conciliating the two independent measurements; and in the second step, the sought f(x) is obtained after numerical inversion of the combined problem.
The estimation problem becomes more difficult when the mathematical model is not exactly known, or when some parameters are unknown. Typically, several linear inverse problems with unknown kernel parameters, may be described through the following Fredholm equation of the first kind:
where f(x) is an unknown function of the distributed variable, x;) represents the indirect measurement of f(x), at each measurement point, y; the kernel A contains a set of unknown parameters, p(y); and k is a proportionality constant that adequately scales the model predictions to the real measurements. The constant k includes the detector gain and it is often unknown. Moreover, sometimes such constant may depend on the unknown f(x) distribution (Vega et al., 2003). In practice, both continuous variables (x and y) of Eq. (1) are limited to a finite range of values.
In the case of only one unknown parameter in the kernel (i.e., when p = p0 = constant), the estimation of f(x) and p0 from Eq. (1) has already been considered (Frontini and Fernández Berdaguer, 2003). In this work, a more general case when the unknown parameters depend on the independent measurement variable [i.e., p = p(y)], will be investigated. The next section details the problem statement and proposes a method for estimating both f(x) and p(y). In section III, a numerical example is considered to evaluate the proposed method. The example simulates the estimation of the particle size distributions (PSD) of a polymer latex from turbidity (T) and elastic light scattering (ELS) measurements, when the refractive index of the polymer particles is unknown.
A. Formulation of the Discrete Combined Problem
Assume that each independent variable (x and y) of Eq. (1) is discretized at regular intervals Δx and Δy, respectively. Then, the discrete version of Eq. (1) can be interpreted as a set of 'm' algebraic equations (one for each y value), in 'n' unknowns (one for each x value). Thus, Eq. (1) can be written in the following vector equation:
|g = k A(p) f||(2)|
where the vector g (m×1) contains the measurements; the matrix A (m×n) is theoretically known, but it has 'r' unknown parameters represented by the vector p (r×1); and the vector f (n×1) represents the unknown ordinates of the distribution. Usually, n < m and r £ m.
Consider two different sets of measurements g1(y1) and g2(y2), obtained from two different equipments. The following is assumed: (1) both measurements are non-negative; (2) in measurement g1(y1), the kernel A1 has only one unknown parameter, p0; and (3) in measurement g2(y2), the kernel A2 has one unknown parameter for each value of the independent variable y2, p(y2). Then, the discrete equations for the measurements are represented by:
|g1 = k1 A1(p0) f||(3.a)|
|g2 = k2 A2(p) f||(3.b)|
where k1 and k2 are normally unknown proportionality constants. Eqs. (3) can be rewritten as:
|g1 = A1(p0) f||(4.a)|
|g2 = K A2(p) fr||(4.b)|
where fr = k1 f, and K = k2/k1 is the 'normalization' parameter defined in Vega et al. (2001), and Vega et al. (2003). Vector and matrix dimensions in Eqs. (4) are: g1 (m1×1), g2 (m2×1), A1 (m1×n), A2 (m2×n), fr (n×1), and p (m2×1).
Equations (4) can be combined in a unique expression, as follows:
|g = A(K, p0, p)fr||(5)|
where g [(m1+m2)×1] is the vector of combined measurements; A [(m1+m2)×n] is the combined matrix; and g1,max, g2,max are the maximum values of g1 and g2, respectively. The scaling by g1,max, g2,max in Eqs. (6) is used to treat the inversion problem of Eq. (5) as if the noise variances were the same for both sets of measurements (Eliçabe and Frontini, 1996).
The problem to be solved consists in finding K, p0, p, and fr, from the measurements g1 and g2 (although distribution f is the original unknown, the estimation of fr is thoroughly equivalent because f and fr are proportional each other through the scaling factor k1). In principle, all unknowns could be estimated through some numerical methods (e.g., a nonlinear optimization routine) applied to the nonlinear system of Eqs. (5, 6). In practice, however, such possibility is discarded because most problems are extremely ill-posed, and solutions obtained through standard methods would be rather poor. Alternatively, specific numerical methods for ill-posed problems should be applied. Even though a specific inversion method is available, only Eq. (4.a) could be solved. In fact, Eq. (4.a) can be seen as a set of m1 nonlinear algebraic equations that includes n+1 ( < m1) unknowns; whereas Eq. (4.b) expands in a set of m2 nonlinear algebraic equations in n + m2 + 1 ( > m2) unknowns. Thus, Eq. (4.b) could only be solved if an estimate of fr or p is available.
B. Estimation of the unknown parameters 'p'
Assume that an acceptable estimate of fr () is available. Since vector p affects g2, Eq. (4.b) can be used to calculate p, provided that K is known. Even when K is unknown, Eq. (4.b) allows estimating p in the two following cases:
Case 1: for any p1 ¹ p, then A2(p)fr ¹ qA2(p1)fr, regardless of the selected scalar constant q. In such case, p(y2) mainly affects the shape of g2(y2), and K can be eliminated through an adequate normalization of Eq. (4.b), as follows:
where ||.||1 stands for the standard 1-norm of a vector (i.e., ||a||1 = S |ai|). Equation (7) expands in a set of m2 nonlinear algebraic equations in m2 unknowns (the components of p).
Case 2: (i) for some p1 ¹ p, there is a scalar constant q that verifies: A2(p)fr @ qA2(p1)fr; and (ii) p0 is a known component of p(y2), i.e.: at a given y2 = y2,0, p0 = p(y2,0). In such case, p(y2) mainly affects the magnitude (but not the shape) of g2(y2). In this case, a first estimate of K can be calculated from Eq. (4.b) evaluated at y2 = y2,0, yielding:
Then, Eq. (8) is introduced into Eq. (4.b), and can be obtained by solving:
Equation (9) can be directly solved for the remaining m2 - 1 unknown components of p. Both Eq. (7) and Eq. (9) can be solved through a classical estimation algorithm for a set of nonlinear algebraic equations, such as the Gauss-Newton or the Levenberg-Marquard methods (see for example, Dennis, 1977; and Moré, 1977).
C. The Proposed Estimation Method
To estimate all unknowns, the following four-steps procedure is proposed.
Step 0 (initial conditions). Solve the inverse problem of Eq. (4.a), to find a first estimate of fr (), and the estimate of p0 (). Only the measurement g1 is required. The method given by Frontini and Fernández Berdaguer (2003) can be applied.
Step 1 (estimate of p). Obtain an estimate of p (), by solving Eq. (7) in Case 1, or Eq. (9) in Case 2. The first estimate of fr, and the measurement g2 are required.
Step 2 (estimate of K). Estimate the normalization parameter K that conciliates Eqs. (4), on the basis of the p and fr estimates given in the previous steps. The method proposed by Vega et al. (2003) can be applied, which provides the following solution:
Step 3 (estimate of fr). Estimate an improved fr distribution (), by numerically inverting Eqs. (5, 6), with the estimates of p0, p, and K given in the previous steps.
The proposed numeric method is schematically represented in Fig. 1. Notice that: (i) each step can be sequentially implemented and solved; (ii) in practice, the main outputs of this algorithm are the estimates of p and fr; (iii) only one of Eq. (7) or Eq. (9) must be implemented in Step 1, depending on the corresponding Case; and (iv) for inverting Eqs. (5, 6), the estimate of p0 is only used in Case 1 if p0 does not belong to vector p; whereas in Case 2, p0 is directly included in the estimate of vector p.
Figure 1. The proposed calculation scheme.
III. NUMERICAL EXAMPLE
To evaluate the proposed estimation method, a simulated numerical example is presented. "Synthetic" or simulated examples are ideal for investigating data treatment procedures, because the solutions are a priori known. The basic example already investigated by Vega et al. (2003) shall be analyzed, but now under the assumption of several unknown parameters in the mathematical model.
Consider a polymer latex sample, which basically consists of spherical particles suspended in an aqueous medium. Typically, the particle diameters (D) of a latex ranges from several nanometers to some micrometers, with concentrations of c.a. 109 part./cm3. Since particles differ in their sizes, a whole size distribution is present in the sample. In general, the number PSD, f(D), represents the amount of particles present in a given interval of diameters (D + dD).
The analysis of a polystyrene (PS) polymer latex with water as the dispersion medium, and a global concentration of 5×109 part/cm3 is investigated. The a priori known (discrete and bimodal) PSD consists of 56 evenly spaced points in the diameter range [50, 600] nm. It was obtained by combining two normal-logarithmic distributions, with mean diameters of 200 and 400 nm, standard deviation of 0.15 and 0.075, and number fractions of 0.85 and 0.15, respectively.
Two experiments based on ELS and T measurements are simulated. In the ELS experiment, a monocromatic light beam (of wavelength λ0 = 632.8 nm) falls onto the diluted particle sample, and a light intensity Is(θ) is scattered by the particles at each angle, θ. In ELS, the measurement is the intensity light ratio I(θ) = Is(θ) / Is(θ0), with respect to a reference angle, θ0; and it can be calculated using the Mie theory, as follows (Bohren and Huffman, 1983):
where CI(θ, D) represents the light intensity scattered by a particle of diameter D at the angle θ, and it is calculated through the Mie scattering theory (Bohren and Huffman, 1983); kI is an unknown constant; and np(λ0) [= np,0 = 1.5729] represents the particle refractive index at the experiment wavelength. A range of measurement angles [θmin,, θmax] = [12o, 150o] is considered, with measurements taken each 2o. The relative measurements I(θ) are commonly reported in angular light scattering determinations (Glatter et al., 1985).
In the T experiment, Ii(λ) is the intensity at the light source, It(λ) is the intensity at the light detector, and λ is the wavelength of the incident light in vacuum. The measurement is the turbidity spectrum, τ(λ), which is defined as the light intensity attenuation experienced by the beam of light traversing the diluted sample; i. e., (λ) = ln[Ii(λ) / It(λ)]. For spherical particles, τ(λ) is related to f(D) according to Mie theory (Bohren and Huffman, 1983), as follows:
where Qext is the extinction efficiency of a particle of diameter D at λ; kτ is a constant that depends on the optical path length; and np(λ) represents the particle refractive index at each λ. Upper and lower limits for λ [λmin, λmax] = [306, 701] nm are also considered, with measurements taken each 5 nm.
For a given system, kτ could be in principle exactly calculated as kτ = π /4, where is the optical path length (basically, the thickness of the cell). This is not the case for kI, however, that depends on the PSD through the following theoretical relationship (Vega et al., 2003): .
Consider now the discrete PSD (with the diameter axis sampled each 10 nm), described by:
f = [f(D1) f(D2) ... f(Dn)]T (13)
Assume the ELS and T measurements respectively represented by the following two vectors:
with kI = k1 = 2.0252×10-10, and kτ = k2 = 0.19635 cm; thus K = 9.6953×108 cm.
After calculating the discrete version of Eqs. (11) and (12), a problem thoroughly equivalent to that of Eqs. (3) is obtained. The unknown parameters are np(λ); and np(λ0) is one of the np(λ), because λ0 belongs to the selected range of λ.
Measurements of ELS and T were simulated according to Eqs. (11) and (12). A random Gaussian noise of standard deviation equal to 0.1% of the maximum of each measurement was added. Both measurements are represented in Fig. 2 as continuous functions (thick trace), although they have a finite number of points (m1 and m2) indicated in the figures. To quantify the effects of the parameters on the measurements, uncertainties of ±1% around the true values of np,0 and np(λ) were simulated, and the corresponding measurements are represented in thin traces in Fig. 2. The simulation results indicate that a constant error in np(λ) mainly modifies the magnitude of τ(λ), with a slight effect on its shape. Thus, the parameter estimation problem corresponds to Case 2, and np(λ) should be estimated through Eq. (9).
Figure 2. Sensitivity of: (a) ELS measurement [to ±1% in np,0]; (b) T measurement [to ±1% in np(λ)].
Throughout this work, the Phillips-Tikhonov regularization method was applied to solve all inversion problems. In that method, if A is an ill-conditioned matrix, a regularized pseudo-inverse of A may be computed through: A[-1] = (AT A + α H)-1 AT, where H is a selected regularization matrix, and α is the adjustable regularization parameter (Phillips, 1962; Tikhonov and Arsenin, 1977).
The algorithm detailed in the previous section was applied. The main simulation results are presented in Fig. 3. From the ELS measurement, I(θ), the method developed by Frontini and Fernández Berdaguer (2003) was applied to solve the Step 0 of the algorithm. The following was obtained: = 1.5757, and the first PSD estimate (fr,0), indicated as in Fig. 3. Despite clearly differs from the true PSD, Eq. (11) is fulfilled by the estimates and , thus revealing the ill-posed characteristic of the inverse problem. Also, the first estimate of the normalization parameter ( = 9.6881×106) was obtained after evaluating Eq. (8) at λ0 = 632.8 nm.
Figure 3. (a) The true refractive index, np(λ), and its estimate, . (b) The true PSD, f(D); and its estimates: , , and .
Based on and , Eq. (9) was used to estimate np(λ); and from Eq. (10), = 9.6543×106 was obtained. Then, Eqs. (5, 6) were inverted to obtain the estimated PSD, . These estimates are represented in Fig. 3. For comparison, the PSD estimate when all parameters are exactly known is also represented in Fig. 3 as . This estimate was obtained by solving Eqs. (5, 6); i.e., Step 3 of the algorithm. Notice that the refractive index estimate, , resulted slightly higher than the true value, as a consequence of the erroneous , but the shape of the true np(λ) is closely recuperated. Thus, if a better initial estimate of np,0 were available, then a practically exact estimation of the refractive index function would be obtained. The accuracy of the estimated np(λ) ranged from 0.15 % to 0.35 % with respect to the true value.
Due to the biased , the solution of the combined problem in Step 3 originates an erroneous K estimate, even worse than the initial estimate, . However, the PSD estimate resulted somewhat better than the original .
To evaluate the quality of the PSD estimates, the following performance indexes were defined:
where Dw(f) is the weight average diameter of the PSD, that can be calculated as:
The errors corresponding to the three PSD estimates are indicated in Table 1. The error of the initial estimate is relatively high mainly because the first peak of resulted shifted with respect to the true PSD. On the other hand, the errors of and are practically equal, despite np(λ) was not exactly estimated. In all cases, however, an accurate mean diameter was obtained, with relative errors lower than 1%.
Table 1. PSD estimation errors [Dw(f) = 325.0 nm]
Although the proposed algorithm only considers the sequential steps above commented, some additional simulations were implemented to explore a further improve of the estimates. The obtained was used to recalculate np,0, but no meaningful change in its value was observed. Then, was injected into Eq. (9) instead of ; but the new estimate of resulted practically coincident with its original estimate. Thus, at least for this specific problem, iterations do not allow an improvement of the solution.
A numerical method for solving an ill-conditioned inverse problem with unknown kernel parameters was presented. The algorithm assumes two available sets of indirect measurements to be known, and it consists of four sequential steps that are based on the mathematical model of the measurements. While the first measurement set is unaffected by the unknown kernel parameters, each measurement point of the second set directly depends on a different unknown parameter. Also, the method considers the more general case when the measurements are not absolute, i.e. when unknown proportionality constants are included.
If the problem were well conditioned, then no uncertainty in Step 0 would be present; and after Step 1 the problem should become completely solved. This is not the case for ill-conditioned problems, however. In such cases, the outputs of Step 0 can be erroneous, and therefore Step 1 will not provide exact estimates of the parameters. The inclusion of Steps 2 and 3 in the algorithm allows increasing the information content in the problem, by combining the measurements.
The algorithm can be applied in the more general case of indirect measurements, and when typically unknown proportionality constants are present. If such constants were a priori well-known, Eq. (9) should be replaced by Eq. (4.b), and Eq. (10) should be discarded.
Each step of the proposed algorithm can be sequentially implemented. However, the estimation of K after Step 2 should coincide with its initial estimate given by Eq. (8). Any detected differences between those values could in principle be used to implement an iterative procedure that allows improving the quality of the estimates. Such methodology should be further investigated to establish conditions that guarantee convergence of the iterations.
Extension of this algorithm for including measurements of other physical variables, or for incorporating a third set of measurements, seems to be straightforward. Inclusion of a new measurement provided by a different device will require the development of its corresponding mathematical model. Inclusion of a third measurement set will require the estimation of two normalization parameters.
The performance of the algorithm was successfully evaluated through a numerical example. The simulated example is technologically important, because the particle refractive index is often unknown. Moreover, in some cases, the estimation of the particle refractive index may be the key problem to be solved, regardless of the PSD, and to this unique effect, the proposed algorithm may also be applied.
1. Bohren, C. F. and D. R. Huffman, Absorption and Scattering of Light by Small Particles, John Wiley & Sons, New York (1983). [ Links ]
2. Dennis, J. E., Nonlinear Least Squares, State of the Art in Numerical Analysis, D. Jacobs, Ed., Academic Press, London (1977). [ Links ]
3. Engl, H. W., M. Hanke and A. Neubauer, Regularization of Inverse Problems, Kluwer Academic Publishers, Dordrecht, The Netherlands (1996). [ Links ]
4. Eliçabe, G. E. and G. L. Frontini, "Determination of the particle size distribution of latex using a combination of elastic light scattering and turbidimetry: A simulation study", J. Coll. and Interf. Sci., 181, 669-672 (1996). [ Links ]
5. Frontini, G. L. and G. E. Eliçabe, "Combination of indirect measurements to improve signal estimation", Lat. Amer. Appl. Res., 30, 99-105 (2000). [ Links ]
6. Frontini, G. L. and E. M. Fernández Berdaguer, "Inversion of elastic light scattering measurements to determine refractive index and particle size distribution of polymeric emulsions", Inverse Probl. Eng., 11, 329-340 (2003). [ Links ]
7. Glatter, O., M. Hofer, C. Jorde and W. Eigner, "Interpretation of elastic light-scattering data in real space", J. Coll. and Interf. Sci., 105, 577-586 (1985). [ Links ]
8. Golub, G. H., M. Heath and G. Wahba, "Generalized cross validation as a method for choosing a good parameter", Technometrics, 21, 215-223 (1979). [ Links ]
9. Hansen, C., "Regularization tools: A Matlab package for analysis and solution of discrete ill-posed problems", Numerical Algorithms, 6, 1-35 (1994). [ Links ]
10. Kirsch, A., An Introduction to the Mathematical Theory of Inverse Problems, Springer-Verlag, New York (1996). [ Links ]
11. Moré J. J., The Levenberg-Marquardt Algorithm: Implementation and Theory, Numerical Analysis, G. Watson, Ed., Lecture Notes in Mathematics 630, Springer Verlag, Berlin (1977). [ Links ]
12. Phillips, D. L., "A technique for the numerical solution of certain integral equations of the first kind", J. AMC, 9, 84 (1962). [ Links ]
13. Tikhonov, A. N. and V. Arsenin, Solution of Ill-posed Problems, Wiley, New York (1977). [ Links ]
14. Vega, J. R., G. L. Frontini, L. M. Gugliotta and G. E. Eliçabe, "Estimation of the normalization parameter required to combine measurements in an inverse problem", Proc. IX Reunión de Trabajo en Procesamiento de la Información y Control, Santa Fe, Argentina, 150-154 (2001). [ Links ]
15. Vega, J. R., G. L. Frontini, L. M. Gugliotta and G. E. Eliçabe, "Particle size distribution by combined elastic light scattering and turbidity measurements. A novel method to estimate the required normalization factor", Part. & Part. System Charact., 20, 361-369 (2003). [ Links ]