Antenna Beamforming using Support Vector Machines

== Abstract ==

Support Vector Machines (SVMs) offer improved generalization performance over other classical optimization techniques. Here, we introduce an SVM-based approach to linear array processing and beamforming. The development of a modified cost function is presented, and it is shown how it can be applied to the problem of linear beamforming. Finally, comparison examples are included to show the validity of the new minimization approach.

== Introduction ==

Support Vector Machines (SVMs) have shown a clear advantage in prediction, regression, and estimation over classical approaches in a wide range of applications, due to their improved generalization performance. We introduce here a framework for applying the SVM approach to linear antenna array processing.

Antenna array signal processing involves complex signals, for which a complex formulation of the SVM is needed. We obtain this formulation by introducing the real and imaginary parts of the error into the primal optimization and then proceeding as usual to solve a complex-valued constrained optimization problem. We formulate the algorithm using a modified cost function which includes the numerical regularization that is unavoidable in the optimization. Adjusting the parameters of this cost function improves the robustness of the method against thermal and multiuser noise.

The resulting algorithm is a natural counterpart of the real-valued Support Vector Regressor and can be immediately applied to array signal processing.

We introduce the derived formulation as the optimizer for a linear beamformer. Several experiments illustrate the advantage of the SVM over minimum mean squared error (MMSE) based algorithms, owing to its robustness against outliers and its improved generalization ability. The first experiments compare the behavior of both algorithms in an environment in which interfering signals are close to the desired ones, thus producing non-Gaussian noise. The last experiment illustrates the improved generalization ability of the SVM when small data sets are used for training, which is common in communications applications.

== The Support Vector Approach and the Cost Function ==

Let <math>{\bf x}[n]</math> be the spatially sampled data. A linear beamformer can be written as

<math>d[n]={\bf w}^T {\bf x}[n]+\epsilon[n]</math>

where <math>{\bf x}[n]</math> is the vector of the <math>M</math> array output elements and <math>\epsilon[n]</math> is the output error.

The coefficients <math>{\bf w}</math> are usually estimated through the minimization of a certain cost function of <math>\epsilon[n]</math>.
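
The following minimal numpy sketch (our own illustrative naming, not part of the original derivation) evaluates this model for a block of snapshots: the beamformer output <math>{\bf w}^T {\bf x}[n]</math> and the corresponding output errors with respect to a desired signal <math>d[n]</math>.

<syntaxhighlight lang="python">
import numpy as np

def beamformer_output(w, X):
    """Beamformer output d_hat[n] = w^T x[n] for every snapshot.

    w : (M,) array of beamformer weights (real or complex).
    X : (M, N) array whose n-th column is the array snapshot x[n].
    """
    return w @ X                      # (N,) vector of outputs w^T x[n]

def output_error(w, X, d):
    """Output error eps[n] = d[n] - w^T x[n] for a desired signal d."""
    return d - beamformer_output(w, X)
</syntaxhighlight>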

The SVM approach can be applied to the adjustment of this model. The main idea of the SVM is to obtain the solution with the minimum norm of <math>{\bf w}</math>. Due to the minimization of the weight vector norm, the solution is regularized in the sense of Tikhonov [Tikhonov77], which improves the generalization performance. For the solution not to be trivial, the minimization must be subject to the constraints

<math>\begin{align}
d[n]-{\bf w}^T {\bf x}[n] &\leq \varepsilon + \xi_n\\
-d[n]+{\bf w}^T {\bf x}[n] &\leq \varepsilon + \xi'_n\\
\xi_n, \xi'_n &\geq 0
\end{align}</math>

where <math>\xi_n</math> and <math>\xi'_n</math> are the "slack" variables or losses. The optimization minimizes a cost function over these variables. The parameter <math>\varepsilon</math> allows those <math>\xi_n</math> or <math>\xi'_n</math> for which the error is less than <math>\varepsilon</math> to be zero. This is equivalent to the minimization of the so-called <math>\varepsilon</math>-insensitive or Vapnik loss function [Vapnik98], given by

<math>L_{\varepsilon}(e)=
\begin{cases}
|e|-\varepsilon & |e|>\varepsilon\\
0 & |e|\leq\varepsilon
\end{cases}</math>
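
As a small illustration (the function name and array conventions are our own), the <math>\varepsilon</math>-insensitive loss can be evaluated element-wise as:

<syntaxhighlight lang="python">
import numpy as np

def vapnik_loss(e, eps):
    """epsilon-insensitive (Vapnik) loss: zero inside the eps tube, |e| - eps outside."""
    return np.maximum(np.abs(e) - eps, 0.0)
</syntaxhighlight>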

The functional to be minimized is then

<math>L_p=\|{\bf w}\|^2+C\sum_n L_\varepsilon(\xi_n,\xi'_n)</math>

subject to <math>\xi_n, \xi'_n \geq 0</math>, where <math>C</math> is the trade-off between the minimization of the norm (to improve the generalization ability) and the minimization of the errors [Vapnik98].
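
To make the roles of the two terms concrete, the following sketch (our own naming, assuming the real-valued case) evaluates this functional for a given weight vector, applying the <math>\varepsilon</math>-insensitive loss to the per-snapshot errors:

<syntaxhighlight lang="python">
import numpy as np

def primal_objective(w, X, d, eps, C):
    """Primal functional L_p = ||w||^2 + C * sum of eps-insensitive losses (real-valued case).

    X : (M, N) matrix of snapshots, d : (N,) desired outputs.
    """
    e = d - w @ X                                  # per-snapshot errors eps[n]
    losses = np.maximum(np.abs(e) - eps, 0.0)      # eps-insensitive losses
    return w @ w + C * np.sum(losses)
</syntaxhighlight>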

The optimization of the above constrained problem through the Lagrange multipliers <math>\alpha_i</math>, <math>\alpha'_i</math> leads to the dual formulation [Smola98c]

<math>L_d=-({\boldsymbol \alpha}-{\boldsymbol \alpha'})^T{\bf R}\,({\boldsymbol \alpha}-{\boldsymbol \alpha'})+({\boldsymbol \alpha}-{\boldsymbol \alpha'})^T{\bf d}-\varepsilon\,({\boldsymbol \alpha}+{\boldsymbol \alpha'})^T{\bf 1}</math>

which is maximized with respect to <math>(\alpha_i-\alpha'_i)</math>, where <math>{\bf d}</math> is the vector of desired outputs <math>d[n]</math>.

The dual involves the Gram matrix <math>{\bf R}</math> of the dot products of the data vectors <math>{\bf x}[n]</math>. This matrix may be singular, which produces an ill-conditioned problem. To avoid this numerical inconvenience, a small diagonal matrix <math>\gamma {\bf I}</math> is added to the Gram matrix prior to the numerical optimization.
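
The following sketch (again our own naming, real-valued case) shows how the regularized Gram matrix and the dual functional above can be evaluated for a given set of multipliers; an actual SVM solver would maximize this quantity subject to the usual box constraints on the multipliers:

<syntaxhighlight lang="python">
import numpy as np

def dual_objective(alpha, alpha_p, X, d, eps, gamma):
    """Regularized dual functional L_d for multipliers alpha and alpha' (real-valued case).

    X : (M, N) matrix of snapshots, d : (N,) desired outputs.
    """
    beta = alpha - alpha_p                          # alpha - alpha'
    R = X.T @ X + gamma * np.eye(X.shape[1])        # Gram matrix plus gamma*I regularization
    return -beta @ R @ beta + beta @ d - eps * np.sum(alpha + alpha_p)
</syntaxhighlight>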

We present here a modified derivation of the SVM regressor which leads to a more convenient equivalent cost function (see the figure below):

<math>L_{R}(e)=
\begin{cases}
0 & |e|<\varepsilon\\
\frac{1}{2\gamma}(|e|-\varepsilon)^2 & \varepsilon \leq |e| \leq e_C\\
C(|e|-\varepsilon)-\frac{1}{2}\gamma C^2 & e_C \leq |e|
\end{cases}</math>

where <math>e_C=\varepsilon+\gamma C</math>.

[Figure: Cost function applied to the SVM beamformer.]
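
A minimal element-wise sketch of this piecewise cost (our own illustrative implementation) is:

<syntaxhighlight lang="python">
import numpy as np

def robust_cost(e, eps, gamma, C):
    """Piecewise cost L_R(e): zero inside the eps tube, quadratic up to e_C, linear beyond."""
    e_C = eps + gamma * C
    abs_e = np.abs(np.atleast_1d(e))
    cost = np.zeros(abs_e.shape)
    quad = (abs_e >= eps) & (abs_e <= e_C)          # quadratic (thermal-noise) zone
    lin = abs_e > e_C                               # linear (outlier) zone
    cost[quad] = (abs_e[quad] - eps) ** 2 / (2.0 * gamma)
    cost[lin] = C * (abs_e[lin] - eps) - 0.5 * gamma * C ** 2
    return cost
</syntaxhighlight>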

This cost function provides a functional that is numerically regularized by the matrix <math>\gamma {\bf I}</math>. As can be seen, the cost function is quadratic for the data which produce errors between <math>\varepsilon</math> and <math>e_C</math>, and linear for errors above <math>e_C</math>. Thus, one can adjust the parameter <math>e_C</math> to apply a quadratic cost to the samples which are mainly affected by thermal noise (i.e., for which the quadratic cost is the Maximum Likelihood criterion). The linear cost is then applied to the samples which are outliers [Huber72], [Muller99]. With a linear cost function, the contribution of the outliers to the solution does not depend on their error values, but only on their signs, thus avoiding the bias that a quadratic cost function produces.

Finally, we generalize the derivation to the complex-valued case, as is necessary for array processing.
