
Binet-Cauchy Kernels on Dynamical Systems and its Application to the Analysis of Dynamic Scenes

Vishwanathan S.V.N., Smola Alexander J., Vidal René

presented by Tron Roberto

July 31, 2006

Outline:

Introductory material

Binet-Cauchy Kernels

Dynamical systems

Linear dynamical systems

Connections with existing kernels

Non-linear dynamical systems

Applications and results

Conclusion


Support Vector Machines (SVMs)

Classification problem

Given $m$ observations $(x_i, y_i) \in \mathcal{X} \times \mathcal{Y}$, with $\mathcal{X} \subseteq \mathbb{R}^d$ and $\mathcal{Y} = \{-1, 1\}$, find $f : \mathcal{X} \to \mathcal{Y}$.

Support Vector Machines use

$$f(x) = \operatorname{sign}(\langle w, x \rangle + b) \tag{1}$$

The hyperplane $H(w, b) = \{x \mid \langle w, x \rangle + b = 0\}$ splits $\mathcal{X}$ into two half-spaces and is given by

$$\min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i \quad \text{subject to} \quad y_i(\langle w, x_i \rangle + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad \forall i = 1, \dots, m \tag{2}$$



$H(w, b)$ depends only on some of the points (called support vectors)

⇒ efficient computations in training



Kernel methods

SVMs assume linearly separable classes. What can be done if the separating surface is not linear?

Idea: use $\Phi : \mathcal{X} \to \mathcal{H}$ to map the points $x_i$ into a higher (possibly infinite) dimensional feature space where the $\Phi(x_i)$ are linearly separable.

$f$ and $\|w\|^2$ can be expressed in terms of the dot product $\langle \Phi(x), \Phi(x') \rangle$

⇒ avoid explicit computation of the mapping $\Phi(x)$

⇒ Kernel trick: the classification can be done by replacing $\langle \Phi(x), \Phi(x') \rangle$ with the kernel function $k(x, x')$
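As a small illustration of the kernel trick, the sketch below compares an explicit feature map with the corresponding kernel. The quadratic kernel $k(x, y) = \langle x, y \rangle^2$ on $\mathbb{R}^2$ and the names `phi` and `k_poly2` are assumptions chosen for the example; they do not come from the slides.

```python
import numpy as np

def phi(x):
    """Explicit feature map of the homogeneous quadratic kernel on R^2."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def k_poly2(x, y):
    """Kernel k(x, y) = <x, y>^2, evaluated without ever building phi."""
    return float(np.dot(x, y)) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

explicit = float(np.dot(phi(x), phi(y)))  # dot product in feature space
implicit = k_poly2(x, y)                  # same value via the kernel
```

Both quantities agree up to rounding, which is why a classifier that only needs dot products never has to evaluate $\Phi$.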



Compound matrices

Let $A \in \mathbb{R}^{m \times n}$. Given $q \le \min(m, n)$, define

$$I_q^n = \{ i = (i_1, i_2, \dots, i_q) : 1 \le i_1 < \dots < i_q \le n, \; i_k \in \mathbb{N} \}$$

and likewise $I_q^m$.

The compound matrix of order $q$, $C_q(A)$, is defined as

$$[C_q(A)]_{i,j} = \det\big(A(i_k, j_l)\big)_{k,l=1}^{q} \quad \text{where } i \in I_q^m \text{ (rows) and } j \in I_q^n \text{ (columns)} \tag{3}$$

The size of $C_q(A)$ is $\binom{m}{q} \times \binom{n}{q}$.

Two particular cases:

if $q = m = n$, $C_q(A) = \det(A)$

if $q = 1$, $C_1(A) = A$
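The definition can be made concrete with a short NumPy sketch. It assumes that the rows of $C_q(A)$ enumerate the $q$-subsets of row indices and the columns enumerate the $q$-subsets of column indices, both in lexicographic order. The function name `compound` is illustrative.

```python
from itertools import combinations
import numpy as np

def compound(A, q):
    """q-th compound matrix: determinants of all q x q minors of A."""
    m, n = A.shape
    rows = list(combinations(range(m), q))  # q-subsets of row indices
    cols = list(combinations(range(n), q))  # q-subsets of column indices
    C = np.empty((len(rows), len(cols)))
    for a, r in enumerate(rows):
        for b, c in enumerate(cols):
            C[a, b] = np.linalg.det(A[np.ix_(r, c)])
    return C

A = np.arange(1.0, 13.0).reshape(3, 4)    # arbitrary 3 x 4 example
C2 = compound(A, 2)                       # 3 x 6 matrix of 2 x 2 minors
S = np.array([[2.0, 1.0], [1.0, 3.0]])    # square example with det(S) = 5
```

The two special cases can be read off directly: `compound(A, 1)` reproduces `A`, and `compound(S, 2)` is the 1 x 1 matrix holding `det(S)`.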



Binet-Cauchy kernels

Theorem (Binet-Cauchy)

Let $A \in \mathbb{R}^{l \times m}$ and $B \in \mathbb{R}^{l \times n}$. For all $1 \le q \le \min(m, n, l)$ we have

$$C_q(A^\top B) = C_q(A)^\top C_q(B)$$

Theorem (Binet-Cauchy kernels)

Let $A, B \in \mathbb{R}^{n \times k}$. For all $1 \le q \le \min(n, k)$ the two kernels

$$k(A, B) = \operatorname{tr}\big(C_q(A^\top B)\big) \tag{4}$$

$$k(A, B) = \det\big(C_q(A^\top B)\big) \tag{5}$$

are well defined and positive semi-definite.
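The Binet-Cauchy identity and the two kernels can be checked numerically. The sketch below is illustrative only: `compound`, `trace_kernel` and `det_kernel` are hypothetical helper names, and the random matrices are arbitrary test data.

```python
from itertools import combinations
import numpy as np

def compound(A, q):
    """Matrix of all q x q minors of A (index sets in lexicographic order)."""
    m, n = A.shape
    rows = list(combinations(range(m), q))
    cols = list(combinations(range(n), q))
    return np.array([[np.linalg.det(A[np.ix_(r, c)]) for c in cols] for r in rows])

def trace_kernel(A, B, q):
    """k(A, B) = tr C_q(A^T B)."""
    return float(np.trace(compound(A.T @ B, q)))

def det_kernel(A, B, q):
    """k(A, B) = det C_q(A^T B)."""
    return float(np.linalg.det(compound(A.T @ B, q)))

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
B = rng.standard_normal((5, 3))
q = 2

lhs = compound(A.T @ B, q)                 # C_q(A^T B)
rhs = compound(A, q).T @ compound(B, q)    # C_q(A)^T C_q(B)
```

Because $C_q(A^\top B)$ factors as $C_q(A)^\top C_q(B)$, both kernels are inner-product-like in the compound matrices, which is the reason `trace_kernel(A, A, q)` and `det_kernel(A, A, q)` are non-negative.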


Dynamical systems and trajectories

Definitions

The state of a dynamical system is defined by some $x \in \mathcal{H}$, where $\mathcal{H}$ is an RKHS with kernel $\kappa(\cdot, \cdot)$. The evolution of a linear dynamical system is denoted as:

$$x_A(t) = A(x, t) \quad \text{for } t \in \mathcal{T} \tag{6}$$

$A : \mathcal{H} \times \mathcal{T} \to \mathcal{H}$ is a set of linear operators indexed by $\mathcal{T}$, the set of times of measurement.

The pair $(x, A)$ identifies the trajectory of a dynamical system, defined as the map:

$$\operatorname{Traj}_A : \mathcal{H} \to \mathcal{H}^{\mathcal{T}}, \quad \operatorname{Traj}_A(x)(t) = x_A(t) \tag{7}$$



Dynamical systems and trajectories

Comparison approaches

Two approaches:

Compare parameters: useful when a suitable parametrization exists. But systems with different parameters can behave almost identically.

Example

The two systems

$$x \leftarrow a(x) = |x|^p \quad \text{and} \quad x \leftarrow b(x) = \min(|x|^p, |x|) \tag{8}$$

give identical trajectories as long as $p > 1$ and $|x| < 1$.

Compare trajectories: independent of the parametrization, can take initial conditions into account. Requires a similarity measure.
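The example of the two maps $a$ and $b$ can be reproduced in a few lines. This is a minimal sketch: the helper `trajectory`, the exponent $p = 2$ and the starting points are arbitrary choices made for illustration.

```python
def a(x, p):
    """First system: x <- |x|^p."""
    return abs(x) ** p

def b(x, p):
    """Second system: x <- min(|x|^p, |x|)."""
    return min(abs(x) ** p, abs(x))

def trajectory(step, x0, p, n_steps):
    """Iterate a one-dimensional map and record every state."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(step(xs[-1], p))
    return xs

# Inside the unit interval |x|^p < |x| for p > 1, so b never takes the |x| branch.
same_inside = trajectory(a, 0.9, 2.0, 10) == trajectory(b, 0.9, 2.0, 10)

# Outside the unit interval the two systems separate immediately.
differ_outside = trajectory(a, 1.5, 2.0, 3) != trajectory(b, 1.5, 2.0, 3)
```

The two systems have different functional forms yet produce the same observations for $|x| < 1$, which is the argument for comparing trajectories rather than parameters.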



Trace kernels on dynamical systems

Applying the trace kernel to trajectories involves the computation of

$$\operatorname{tr}\big(\operatorname{Traj}_A(x)^\top \operatorname{Traj}_{A'}(x')\big) = \sum_{t \in \mathcal{T}} \kappa\big(x_A(t), x'_{A'}(t)\big) \tag{9}$$

To ensure convergence, a popular solution is to adopt discounting schemes:

$$\mu(t) = c\, e^{-\lambda t} \tag{10}$$

$$\mu(t) = \delta_\tau(t) \tag{11}$$

Definition (Trace kernel for trajectories)

The trace kernel on trajectories is defined as

$$k\big((x, A), (x', A')\big) = \mathbb{E}_{t \sim \mu(t)}\big[\kappa\big(x_A(t), x'_{A'}(t)\big)\big] \tag{12}$$
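A hedged sketch of the discounted trace kernel for scalar discrete-time systems follows. It assumes the exponential discount with $c = 1$ (so the sum is not normalized to a probability), truncates the sum after `n_steps` terms, and uses a Gaussian base kernel; all function names and the two example maps are hypothetical.

```python
import math

def rbf(u, v, gamma=1.0):
    """Gaussian base kernel kappa on scalar states (an illustrative choice)."""
    return math.exp(-gamma * (u - v) ** 2)

def discounted_trace_kernel(step_a, x0, step_b, y0, lam, n_steps, kappa=rbf):
    """Truncated sum over t = 0..n_steps-1 of exp(-lam * t) * kappa(x_t, y_t)."""
    total, x, y = 0.0, x0, y0
    for t in range(n_steps):
        total += math.exp(-lam * t) * kappa(x, y)
        x, y = step_a(x), step_b(y)
    return total

def halve(x):
    return 0.5 * x

def square(x):
    return x * x

k_same = discounted_trace_kernel(halve, 1.0, halve, 1.0, lam=0.5, n_steps=50)
k_diff = discounted_trace_kernel(halve, 1.0, square, 0.5, lam=0.5, n_steps=50)
geometric = sum(math.exp(-0.5 * t) for t in range(50))
```

For identical trajectories the base kernel equals 1 at every step, so `k_same` reduces to the geometric sum of the discount weights, while any mismatch lowers the value.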


Trace kernels on dynamical systems

This kernel can be further specialized:

on model parameters

$$k(A, A') = \mathbb{E}_{x, x'}\big[k\big((x, A), (x', A')\big)\big] \tag{13}$$

on initial conditions

$$k(x, x') = \mathbb{E}_{A, A'}\big[k\big((x, A), (x', A')\big)\big] \tag{14}$$



Determinant kernel on dynamical systems

Definition (Determinant kernel for trajectories)

As in the previous case, the determinant kernel is defined as

$$k\big((x, A), (x', A')\big) = \mathbb{E}_{t \sim \mu(t)}\big[\det\big(\operatorname{Traj}_A(x)^\top \operatorname{Traj}_{A'}(x')\big)\big] \tag{15}$$

For the determinant to exist, $\mu(t)$ must have finite support. Then the kernel reduces to

$$k\big((x, A), (x', A')\big) = \det(K) \tag{16}$$

where $K_{i,j} = \kappa\big(x_A(t_i), x'_{A'}(t_j)\big)$.

This kernel, following an information-theoretic interpretation, measures the statistical dependence between two sequences.
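With finite support, the determinant kernel is the determinant of a cross Gram matrix between the sampled states of the two trajectories. The sketch below assumes the states are stored as columns of `X` and `Y` and uses a linear base kernel; `determinant_kernel` and `linear` are illustrative names.

```python
import numpy as np

def determinant_kernel(X, Y, kappa):
    """det(K) with K[i, j] = kappa(X[:, i], Y[:, j]).

    Columns of X and Y are the states of the two systems at the
    finitely many sampling times t_1, ..., t_T.
    """
    T = X.shape[1]
    K = np.array([[kappa(X[:, i], Y[:, j]) for j in range(T)] for i in range(T)])
    return float(np.linalg.det(K))

def linear(u, v):
    """Linear base kernel, chosen only to make the result easy to check."""
    return float(u @ v)

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 3))   # three 3-dimensional states of system 1
Y = rng.standard_normal((3, 3))   # three 3-dimensional states of system 2
k_xy = determinant_kernel(X, Y, linear)
```

With the linear kernel and square `X`, `Y`, the Gram matrix is `X.T @ Y`, so the result factors as `det(X) * det(Y)`.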


Linear dynamical systems - Discrete time

ARMA model

$$y_t = C x_t + w_t \quad \text{where } w_t \sim \mathcal{N}(0, R), \; y_t \in \mathbb{R}^m \tag{17}$$

$$x_{t+1} = A x_t + v_t \quad \text{where } v_t \sim \mathcal{N}(0, Q), \; x_t \in \mathbb{R}^n \tag{18}$$

In this case, closed-form solutions can be found for both the trace and the determinant kernels.



Trace kernel on linear dynamical systems

With the assumptions $\mu(t) = e^{-\lambda t}$ and $\kappa(y_t, y'_t) = y_t^\top W y'_t$, with $W$ positive semi-definite, the trace kernel can be written as follows.

With the same realization of noise for both systems:

$$k\big((x_0, A, C), (x_0', A', C')\big) = x_0^\top M x_0' + \frac{1}{1 - e^{-\lambda}} \operatorname{tr}(Q M + W R) \tag{19}$$

With different noise realizations:

$$k\big((x_0, A, C), (x_0', A', C')\big) = x_0^\top M x_0' \tag{20}$$

where $M$ satisfies the Sylvester equation

$$M = e^{-\lambda} A^\top M A' + C^\top W C' \tag{21}$$

Other closed forms exist when independence from initial conditions is desired and when the model is fully observable ($C = I$, $R = 0$).
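The closed form $x_0^\top M x_0'$ can be compared against a direct truncated sum for noise-free systems. This is a sketch under stated assumptions: the Sylvester equation is solved by vectorization with a Kronecker product (a generic technique, not necessarily the one used by the authors), the matrices are random test data rescaled so that the discounted sum converges, and all names are hypothetical.

```python
import numpy as np

def solve_discounted_sylvester(A, A2, C, C2, W, lam):
    """Solve M = exp(-lam) * A.T @ M @ A2 + C.T @ W @ C2.

    Uses vec(P M Q) = kron(Q.T, P) vec(M) with column-major vec.
    """
    n, n2 = A.shape[0], A2.shape[0]
    rhs = C.T @ W @ C2
    K = np.eye(n * n2) - np.exp(-lam) * np.kron(A2.T, A.T)
    vec_m = np.linalg.solve(K, rhs.flatten(order="F"))
    return vec_m.reshape((n, n2), order="F")

def truncated_trace_kernel(x0, A, C, y0, A2, C2, W, lam, n_steps):
    """Noise-free sum over t of exp(-lam t) (C x_t)^T W (C2 y_t)."""
    total, x, y = 0.0, x0, y0
    for t in range(n_steps):
        total += np.exp(-lam * t) * ((C @ x) @ W @ (C2 @ y))
        x, y = A @ x, A2 @ y
    return float(total)

rng = np.random.default_rng(0)
n, m, lam = 3, 2, 0.1
A = rng.standard_normal((n, n))
A2 = rng.standard_normal((n, n))
A = 0.9 * A / np.linalg.norm(A, 2)      # spectral norm 0.9 guarantees convergence
A2 = 0.9 * A2 / np.linalg.norm(A2, 2)
C, C2 = rng.standard_normal((m, n)), rng.standard_normal((m, n))
W = np.eye(m)
x0, y0 = rng.standard_normal(n), rng.standard_normal(n)

M = solve_discounted_sylvester(A, A2, C, C2, W, lam)
k_closed = float(x0 @ M @ y0)
k_series = truncated_trace_kernel(x0, A, C, y0, A2, C2, W, lam, n_steps=300)
```

The Kronecker system has $n^2 \times n^2$ unknowns, so this approach is only reasonable for small state dimensions; a dedicated Sylvester solver would be preferable in practice.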


Determinant kernel on linear dynamical systems

With the same assumptions used for the trace kernel, the determinant kernel can be expressed in closed form:

$$k\big((x_0, A, C), (x_0', A', C')\big) = \det(W)\, \det\big(C M C'^\top\big) \tag{22}$$

where

$$M = e^{-\lambda} A M A'^\top + x_0 x_0'^\top \tag{23}$$

N.B.

1. $k$ is identically zero if $C$ or $M$ does not have full rank (e.g. $m > n$)

2. if we consider $\mathbb{E}_{x_0, x_0'}[k]$ to obtain independence from initial conditions, a closed form is possible only for $m = n = 1$



Linear dynamical systems - Continuous time

The evolution of an LTI system can be described with the model

$$\frac{d}{dt} x(t) = A x(t) \tag{24}$$

which has solution

$$x(t) = \exp(At)\, x(0) \tag{25}$$

Using the exponential discount $\mu(t) = e^{-\lambda t}$, the trace kernel becomes

$$k\big((x_0, A), (x_0', A')\big) = x_0^\top M x_0' \tag{26}$$

where

$$\Big(A - \frac{\lambda}{2} I\Big)^\top M + M \Big(A' - \frac{\lambda}{2} I\Big) = -W \tag{27}$$
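The continuous-time closed form can be checked against numerical quadrature. The sketch assumes $M = \int_0^\infty e^{-\lambda t} \exp(A t)^\top W \exp(A' t)\, dt$; differentiating the integrand and integrating over $[0, \infty)$ gives $(A - \tfrac{\lambda}{2} I)^\top M + M (A' - \tfrac{\lambda}{2} I) = -W$, which is the equation solved below. The example matrices are arbitrary stable, diagonalizable choices and all names are hypothetical.

```python
import numpy as np

def solve_continuous_M(A, A2, W, lam):
    """Solve (A - lam/2 I)^T M + M (A2 - lam/2 I) = -W by vectorization."""
    n, n2 = A.shape[0], A2.shape[0]
    P = (A - 0.5 * lam * np.eye(n)).T
    Q = A2 - 0.5 * lam * np.eye(n2)
    # vec(P M) = kron(I, P) vec(M) and vec(M Q) = kron(Q^T, I) vec(M)
    K = np.kron(np.eye(n2), P) + np.kron(Q.T, np.eye(n))
    vec_m = np.linalg.solve(K, -W.flatten(order="F"))
    return vec_m.reshape((n, n2), order="F")

def trajectory(A, x0, ts):
    """x(t) = exp(A t) x0 for all t in ts, assuming A is diagonalizable."""
    w, V = np.linalg.eig(A)
    c = np.linalg.solve(V, x0)
    return ((V * c) @ np.exp(np.outer(w, ts))).real  # one column per time

A = np.array([[-0.5, 1.0], [-1.0, -0.5]])
A2 = np.array([[-0.3, 0.2], [0.0, -0.8]])
W = np.eye(2)
lam = 0.1
x0, y0 = np.array([1.0, 0.5]), np.array([-0.4, 2.0])

M = solve_continuous_M(A, A2, W, lam)
k_closed = float(x0 @ M @ y0)

# Trapezoidal quadrature of exp(-lam t) x(t)^T W x'(t) on a long, fine grid.
ts = np.linspace(0.0, 40.0, 40001)
X, Y = trajectory(A, x0, ts), trajectory(A2, y0, ts)
f = np.exp(-lam * ts) * np.einsum("it,it->t", X, W @ Y)
dt = ts[1] - ts[0]
k_quad = float(dt * (f.sum() - 0.5 * (f[0] + f[-1])))
```

The agreement between `k_closed` and `k_quad` depends on the integrand having decayed by the end of the grid, which holds here because both systems are stable.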



Connections with existing kernels

Subspace angles

Given an ARMA model without noise, the Hankel matrix

$$Z = \begin{bmatrix} y_0 & y_1 & y_2 & \cdots \\ y_1 & y_2 & \cdots & \\ \vdots & & \ddots & \end{bmatrix} = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \end{bmatrix} \begin{bmatrix} x_0 & x_1 & x_2 & \cdots \end{bmatrix} = O \begin{bmatrix} x_0 & x_1 & x_2 & \cdots \end{bmatrix} \tag{28}$$

lives in a subspace spanned by the columns of $O$.

The kernel

$$k\big((A, C), (A', C')\big) = \prod_{i=1}^{n} \cos^2(\theta_i) = \det(Q^\top Q')^2 \tag{29}$$

measures the angle between subspaces. Here $Q$ and $Q'$ denote orthonormal bases of the column spaces of $O$ and $O'$ (for instance from a QR decomposition), and $\theta_i$ are the principal angles between the two subspaces.
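A minimal sketch of the subspace-angle kernel follows. It assumes the infinite observability matrix is truncated to `depth` blocks and that orthonormal bases are obtained by a reduced QR decomposition; the truncation depth, the random systems and the function names are illustrative assumptions.

```python
import numpy as np

def observability(A, C, depth):
    """Stack C, CA, ..., CA^(depth-1): a truncated observability matrix."""
    blocks, P = [], np.eye(A.shape[0])
    for _ in range(depth):
        blocks.append(C @ P)
        P = A @ P
    return np.vstack(blocks)

def subspace_angle_kernel(A, C, A2, C2, depth=10):
    """det(Q^T Q2)^2 for orthonormal bases of the two column spaces."""
    Q, _ = np.linalg.qr(observability(A, C, depth))
    Q2, _ = np.linalg.qr(observability(A2, C2, depth))
    return float(np.linalg.det(Q.T @ Q2) ** 2)

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
A2 = rng.standard_normal((3, 3))
A = 0.9 * A / np.linalg.norm(A, 2)
A2 = 0.9 * A2 / np.linalg.norm(A2, 2)
C, C2 = rng.standard_normal((2, 3)), rng.standard_normal((2, 3))

k_ab = subspace_angle_kernel(A, C, A2, C2)

# The singular values of Q^T Q2 are the cosines of the principal angles.
Qa, _ = np.linalg.qr(observability(A, C, 10))
Qb, _ = np.linalg.qr(observability(A2, C2, 10))
cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
```

Since the squared determinant equals the product of squared singular values, `k_ab` matches `np.prod(cosines ** 2)`, and comparing a system with itself gives 1.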


Connections with existing kernels

Cepstral coefficients

Let $H(z)$ be the transfer function of an ARMA model without noise. The cepstrum is defined as

$$\log\big(H(z) H^*(z^{-1})\big) = \sum_{n \in \mathbb{Z}} c_n z^{-n} \tag{30}$$

The Martin kernel

$$k\big((A, C), (A', C')\big) = \sum_{n=1}^{\infty} n\, c_n^*\, c_n' \tag{31}$$

is equivalent to the Subspace Angles approach (De Cock and De Moor, 2002).


Connections with existing kernels

Random walks

A random walk on a graph can be seen as an ARMA model without noise

$$y_t = C x_t \tag{32}$$

$$x_{t+1} = A x_t \tag{33}$$

Assume an equiprobable distribution over the starting node:

$$k(G_1, G_2) = \frac{1}{|V_1||V_2|} \mathbf{1}^\top M \mathbf{1} \tag{34}$$

$$M = e^{-\lambda} A_1^\top M A_2 + C_1^\top W C_2 \tag{35}$$

Equivalent to the geometric graph kernel of Gärtner et al. (2003).
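The random-walk kernel can be sketched with a simple fixed-point iteration on the equation for $M$. Several choices below are assumptions made for illustration: $A$ is taken as the column-stochastic transition matrix of the graph, $C$ as a label-indicator matrix (labels by nodes), $W$ as the identity, and the two small graphs and their labels are invented test data.

```python
import numpy as np

def transition_matrix(adj):
    """Column-stochastic random-walk matrix: column j spreads mass from node j."""
    return adj / adj.sum(axis=0, keepdims=True)

def graph_kernel(adj1, C1, adj2, C2, W, lam, n_iter=200):
    """1^T M 1 / (|V1| |V2|) with M = exp(-lam) A1^T M A2 + C1^T W C2."""
    A1, A2 = transition_matrix(adj1), transition_matrix(adj2)
    const = C1.T @ W @ C2
    M = np.zeros_like(const)
    for _ in range(n_iter):          # converges because exp(-lam) < 1
        M = np.exp(-lam) * A1.T @ M @ A2 + const
    n1, n2 = adj1.shape[0], adj2.shape[0]
    return float(np.ones(n1) @ M @ np.ones(n2)) / (n1 * n2)

# Triangle graph and 4-node path graph, two node labels each (hypothetical data).
adj1 = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
adj2 = np.array([[0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 1.0, 0.0],
                 [0.0, 1.0, 0.0, 1.0], [0.0, 0.0, 1.0, 0.0]])
C1 = np.array([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
C2 = np.array([[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]])
W = np.eye(2)

k12 = graph_kernel(adj1, C1, adj2, C2, W, lam=1.0)
k21 = graph_kernel(adj2, C2, adj1, C1, W, lam=1.0)
```

With a symmetric $W$ the kernel is symmetric in its two arguments, so `k12` and `k21` coincide.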


Connections with existing kernels

Diffusion on graphs

A continuous diffusion process can be seen as the continuous LTI system

$$\frac{d}{dt} x(t) = L x(t) \tag{36}$$

Assume the same Laplacian matrix, different initial conditions, and $\mu(t) = \delta_\tau(t)$:

$$[K]_{i,j} = k\big((L, e_i), (L, e_j)\big) = \big[\exp(L\tau)^\top \exp(L\tau)\big]_{i,j} \tag{37}$$

If the graph is undirected, $L = L^\top$ and

$$K = \exp(2 L \tau) \tag{38}$$

Same as the diffusion kernel of Kondor and Lafferty (2002).
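For an undirected graph the two expressions for $K$ can be compared numerically. The sketch assumes the sign convention $L = \text{adjacency} - \text{degree}$, so that $\dot{x} = L x$ describes diffusion; this convention, the example graph and the function name are assumptions, not taken from the slides.

```python
import numpy as np

def diffusion_kernel(adj, tau):
    """Return exp(L tau)^T exp(L tau) and exp(2 L tau) for a symmetric L."""
    L = adj - np.diag(adj.sum(axis=1))        # assumed convention: L = A - D
    w, V = np.linalg.eigh(L)                  # L = V diag(w) V^T
    exp_L = (V * np.exp(w * tau)) @ V.T       # exp(L tau)
    K = exp_L.T @ exp_L
    exp_2L = (V * np.exp(2.0 * w * tau)) @ V.T
    return K, exp_2L

# 4-node cycle graph as a small undirected example.
adj = np.array([[0.0, 1.0, 0.0, 1.0],
                [1.0, 0.0, 1.0, 0.0],
                [0.0, 1.0, 0.0, 1.0],
                [1.0, 0.0, 1.0, 0.0]])
K, exp_2L = diffusion_kernel(adj, tau=0.3)
```

Both matrices agree, and $K$ is symmetric positive definite because it is the exponential of a symmetric matrix.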


Connections with existing kernels

Polygon comparison

Burkhardt (2004) uses feature functions $\Phi(x, p)$ where $p$ is a polygon and $x$ is a vertex index.

Dynamics of the system: $x \leftarrow (x \bmod n) + 1$

Compare two polygons through their trajectories $(\Phi(1, p), \dots, \Phi(n, p))$.

Equiprobable distribution on initial conditions (starting points on each polygon):

$$k(p, p') = \frac{1}{n n'} \sum_{x=1}^{n} \sum_{x'=1}^{n'} \langle \Phi(x, p), \Phi(x', p') \rangle \tag{39}$$

Equivalent to Burkhardt (2004) but can be generalized to any kernel.


Non-linear dynamical systems

The time evolution of a fully observable non-linear dynamical system is governed by

$$x_{t+1} = f(x_t) \tag{40}$$

Find a mapping $\Phi : \mathcal{X} \to \mathcal{H}$ which linearizes the system,

$$\Phi(x_{t+1}) = A\, \Phi(x_t) \tag{41}$$

and apply the same tools.

N.B.

the space of possible solutions has to be restricted by imposing constraints on the structure of $A$

the same constraints appear on $M$ while solving the Sylvester equations



Dynamic textures<br />

Setup<br />

<str<strong>on</strong>g>Binet</str<strong>on</strong>g>-<str<strong>on</strong>g>Cauchy</str<strong>on</strong>g> <str<strong>on</strong>g>Kernerls</str<strong>on</strong>g><br />

<strong>on</strong> <strong>Dynamical</strong> <strong>Systems</strong><br />

Vishwanathan S.V.N.<br />

Smola A. J., Vidal R.<br />

The proposed kernels have been applied to the problem of comparing dynamic textures, using ARMA models.

System identification is performed using the sub-optimal closed-form solution of Doretto et al. (2003).

Dataset of 150 sequences (8-bit grayscale, 115x170 px, 120 frames each).

Sequences 65 to 80 cannot be modeled as dynamic textures.
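The identification step can be sketched as follows. This is a simplified SVD-based procedure in the spirit of the closed-form solution mentioned above, not a reproduction of it: mean subtraction and noise-covariance estimation are omitted, and the function name identify_lds and the synthetic data are assumptions.

```python
import numpy as np


def identify_lds(Y, n):
    """Estimate (A, C, X) from Y, a (pixels x frames) matrix with one
    vectorized frame per column; n is the chosen state dimension."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n]                 # output matrix with orthonormal columns
    X = s[:n, None] * Vt[:n, :]  # estimated states, one per column
    # Least-squares fit of X[:, t+1] ~ A @ X[:, t].
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    return A, C, X


# Synthetic noise-free sequence from a known 2-state system (illustrative).
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.5]])
C_true = rng.standard_normal((6, 2))
x = np.array([1.0, 1.0])
frames = []
for _ in range(20):
    frames.append(C_true @ x)
    x = A_true @ x
Y = np.stack(frames, axis=1)  # shape (6 pixels, 20 frames)

A_est, C_est, X_est = identify_lds(Y, n=2)
```

The estimated A is only determined up to a change of basis of the state space, so it should be compared with the true dynamics through basis-invariant quantities such as its eigenvalues.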



Dynamic textures

Results


Figure: Binet-Cauchy kernel. Figure: Martin kernel.

Darker area ⇒ low kernel value ⇒ similar sequences

The value of λ does not affect the clustering

Sequences 65-80 are recognized as novel



Dynamic textures

Parameters interpretation


Initial condition of the system x_0: discriminates between sequences with the same foreground (dynamic behaviour) but different backgrounds (contained in the initial condition).

Exponential decay λ: sets the relative weight of short-range and long-range interactions.

Dynamics (A, C): systems A and A′ with different dimensionality can be compared if the outputs y_t, y′_t belong to the same space. This allows comparing various levels of detail in the parametrization of the same system.
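The role of the three parameters can be made concrete with a small sketch. It assumes a noise-free, discrete-time trace kernel of the form k = Σ_t e^{-λt} y_t · y′_t with identity weighting, which is a simplification chosen here for illustration; all function names and numerical values are hypothetical. The closed form solves the Sylvester-type equation M = e^{-λ} Aᵀ M A′ + Cᵀ C′ by vectorization, and the two systems deliberately have different state dimensions (2 and 3) but share a 4-dimensional output space.

```python
import numpy as np


def trace_kernel(A1, C1, x1, A2, C2, x2, lam):
    """Closed form of sum_t exp(-lam * t) * (y_t . y'_t) for noise-free systems."""
    n1, n2 = A1.shape[0], A2.shape[0]
    Q = C1.T @ C2
    # Solve M = exp(-lam) * A1.T @ M @ A2 + Q using
    # vec(A X B) = (B.T kron A) vec(X) with column-major vec.
    K = np.exp(-lam) * np.kron(A2.T, A1.T)
    m = np.linalg.solve(np.eye(n1 * n2) - K, Q.flatten(order="F"))
    M = m.reshape((n1, n2), order="F")
    return x1 @ M @ x2


def trace_kernel_sum(A1, C1, x1, A2, C2, x2, lam, steps=400):
    """Same quantity by brute-force truncated summation over time."""
    total = 0.0
    for t in range(steps):
        total += np.exp(-lam * t) * ((C1 @ x1) @ (C2 @ x2))
        x1, x2 = A1 @ x1, A2 @ x2
    return total


rng = np.random.default_rng(1)
theta = 0.3
A1 = 0.8 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])  # 2 states
A2 = np.diag([0.9, 0.5, 0.2])                          # 3 states
C1 = rng.standard_normal((4, 2))                       # both systems output
C2 = rng.standard_normal((4, 3))                       # 4-dimensional vectors
x1 = np.array([1.0, 0.5])
x2 = np.array([0.2, -1.0, 0.7])
lam = 0.1

k_closed = trace_kernel(A1, C1, x1, A2, C2, x2, lam)
k_sum = trace_kernel_sum(A1, C1, x1, A2, C2, x2, lam)
```

In this sketch the initial conditions enter only through the final bilinear form, so changing x1 or x2 changes the kernel value without re-solving for M, while a larger lam discounts later time steps more strongly.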



Video clips clustering

Clustering of random video clips (120 frames each).

The kernel has been used to compute the k nearest neighbors of each clip, and LLE (Locally Linear Embedding) has been used for clustering and embedding in 2D space.
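The neighbor search can be run directly on a kernel matrix, as in the sketch below, which uses the kernel-induced distance d²(i, j) = K_ii + K_jj − 2 K_ij. The function name knn_from_kernel and the toy linear kernel are assumptions for illustration, and the LLE embedding step itself is not shown.

```python
import numpy as np


def knn_from_kernel(K, k):
    """Indices of the k nearest neighbors of each item, given a symmetric
    positive semi-definite kernel matrix K."""
    diag = np.diag(K)
    # Squared distance in feature space: K_ii + K_jj - 2 * K_ij.
    d2 = diag[:, None] + diag[None, :] - 2.0 * K
    d2 = np.maximum(d2, 0.0)      # clip tiny negatives from round-off
    np.fill_diagonal(d2, np.inf)  # an item is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


# Toy check with a linear kernel on four points of the real line.
pts = np.array([[0.0], [1.0], [3.0], [10.0]])
K = pts @ pts.T
neighbors = knn_from_kernel(K, k=1)
```

With a kernel on dynamical systems, K would instead hold the pairwise kernel values between the identified models of the clips.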



Conclusion


General framework which includes previous kernels as particular cases

Can be applied to a wide class of dynamical systems (linear/non-linear, discrete-time/continuous-time)

Computationally efficient closed-form solution for linear dynamical systems

Good results on dynamic texture comparison and video clip clustering

