Thursday, January 5, 2023

Conditional expectation

Let $X$ be a sample space, $F$ a sigma algebra on $X$, and $G$ a sub-sigma algebra of $F$.

Conditional expectation is an orthogonal projection of random variables, $E: L^2(X,F) \to L^2(X,G)$. The operator extends to $L^1(X,F)$ and is a Markov operator.
 
How is it related to conditional probability?
 
Let $A$ be an event, $\{B_i\}$ be a decomposition of $X$ and $G$ be the sub-sigma algebra generated by $\{B_i\}$. The conditional expectation of the indicator function $1_A$ is a linear combination of $1_{B_i}$. Here the coefficients are the conditional probabilities $P(A|B_i)$! In this sense, the conditional expectation packages all the conditional probabilities together.
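
Concretely, assuming each $P(B_i) > 0$, the claim reads
$$ E[ 1_A \mid G ] = \sum_i P(A \mid B_i) \, 1_{B_i} , \qquad P(A \mid B_i) = \frac{P(A \cap B_i)}{P(B_i)} , $$
and one can verify it directly: the right-hand side is $G$-measurable, and integrating it over each $B_i$ gives $P(A \cap B_i)$, as required of $E[1_A \mid G]$.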
 
More generally, given a $C^*$-subalgebra $B$ of a $C^*$-algebra $A$ having the same unit, a conditional expectation is a positive, surjective, unital operator $T : A \to B$ such that $T^2 = T$ (viewing $T$ as a map from $A$ into itself).
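
A standard finite-dimensional example (in the sense of the definition above): compressing a matrix to its diagonal,
$$ T : M_n(\mathbb{C}) \to D_n , \qquad T\big( (a_{ij}) \big) = \mathrm{diag}(a_{11}, \dots, a_{nn}) , $$
is positive, surjective, unital and idempotent, hence a conditional expectation onto the diagonal subalgebra $D_n$.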
 
The rotation algebra $A_\theta$ is the universal $C^*$-algebra generated by unitaries $u$ and $v$ with $uv = e^{2 \pi i \theta} vu$. The Cuntz algebra $\mathcal{O}_n$ is the universal $C^*$-algebra generated by $n \geq 2$ isometries $S_1, \dots, S_n$ with $\sum_{i=1}^n S_i S_i^* = 1$.
 
Conditional expectations are used to prove the simplicity of the rotation algebra (for irrational $\theta$) and of the Cuntz algebra.
 
References:
* Functional Analysis for Probability and Stochastic Processes, Adam Bobrowski

Saturday, April 12, 2014

Two little remarks on spheres

1. "Take a sphere of radius $R$ in $N$ dimensions, $N$ large; then most points inside the sphere are in fact very close to the surface." David Ruelle

Let $R > 0$ and $0< \alpha <1$. Let $$B(R)=\{ (x_1, \cdots, x_N) : x_1^2 + \cdots + x_N^2 \leq R^2 \} . $$ The ratio of the volume of $B(\alpha R)$ to the volume of $B(R)$ is $\alpha^N$, and $\lim_{N \to \infty} \alpha^N = 0$. This means that "a full sphere of high dimension has all its volume within a "skin" of size $\varepsilon$ near the surface" (Collet and Eckmann). This phenomenon seems to be related to the concentration of measure and other probabilistic perspectives.
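
A quick numerical illustration (a minimal sketch; the choice $\alpha = 0.99$ is mine):

```python
# Fraction of the N-dimensional ball's volume lying outside the concentric
# ball of radius alpha*R is 1 - alpha^N, which tends to 1 as N grows.
alpha = 0.99  # inner ball reaching 99% of the way to the surface
for N in (3, 10, 100, 1000):
    print(f"N = {N:4d}: volume fraction in the outer 1% skin = {1 - alpha**N:.6f}")
```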


2. A Hermitian matrix $A$ is positive if and only if all of its eigenvalues are nonnegative. We write $A \geq 0$.

A 2-by-2 positive matrix is of the form $$A = \begin{bmatrix} t+x & y + iz \\ y-iz & t-x \end{bmatrix}$$ where $t \geq 0$, $x, y, z \in \mathbb{R}$ and $x^2 + y^2 + z^2 \leq t^2$.

The matrix $$ \begin{bmatrix} t_0+x_0 & y_0 + i z_0 \\ y_0-i z_0 & t_0-x_0 \end{bmatrix} $$ can be identified with the ball
$$ B(x_0,y_0,z_0;t_0) := \{ (x,y,z) \in \mathbb{R}^3 : (x-x_0)^2 + (y-y_0)^2 + (z-z_0)^2 \leq t_0^2 \} . $$

Let $A_j = \begin{bmatrix} t_j +x_j & y_j + i z_j \\ y_j - i z_j & t_j - x_j \end{bmatrix}$ for $j=1,2$. We have the following equivalence:
$$\begin{array}{rl}
& A_1 \geq A_2 \\
\Longleftrightarrow & (x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 \leq (t_1 - t_2)^2 \\
& \text{ and } t_1 \geq t_2 \\
\Longleftrightarrow & B(x_2,y_2,z_2;t_2) \subset B(x_1,y_1,z_1;t_1) .
\end{array} $$
In other words, the ordering of the matrices corresponds to the inclusion of the balls!
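
A numerical check of this equivalence (a minimal sketch; the helper functions and tolerances are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def matrix(t, x, y, z):
    return np.array([[t + x, y + 1j * z],
                     [y - 1j * z, t - x]])

def ordered(p1, p2):
    # A_1 >= A_2 iff A_1 - A_2 is positive semidefinite
    diff = matrix(*p1) - matrix(*p2)
    return bool(np.all(np.linalg.eigvalsh(diff) >= -1e-12))

def ball_included(p1, p2):
    # B(x2,y2,z2;t2) is inside B(x1,y1,z1;t1) iff dist(centers) + t2 <= t1
    (t1, x1, y1, z1), (t2, x2, y2, z2) = p1, p2
    dist = np.sqrt((x1 - x2)**2 + (y1 - y2)**2 + (z1 - z2)**2)
    return bool(dist + t2 <= t1 + 1e-12)

for _ in range(1000):
    p1 = (rng.uniform(0, 2), *rng.uniform(-1, 1, 3))
    p2 = (rng.uniform(0, 2), *rng.uniform(-1, 1, 3))
    assert ordered(p1, p2) == ball_included(p1, p2)
print("matrix ordering agreed with ball inclusion on all samples")
```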


Tuesday, February 4, 2014

Matrix multiplication as a convolution


1. The product of two power series/polynomials is $$ \begin{align*} & (a_0 + a_1 x + a_2 x^2 + \cdots) (b_0 + b_1 x + b_2 x^2 + \cdots)  \\ = & c_0 + c_1 x + c_2 x^2 + \cdots  \end{align*}$$ The coefficient sequence, given by $$c_n = \sum_{k=0}^n a_k b_{n-k},$$ is sometimes called the Cauchy product. This is a convolution.
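
For instance, a minimal sketch with NumPy:

```python
import numpy as np

# (1 + 2x + 3x^2)(4 + 5x): multiplying polynomials convolves their coefficients
a = [1, 2, 3]
b = [4, 5]
print(np.convolve(a, b))  # [ 4 13 22 15], i.e. 4 + 13x + 22x^2 + 15x^3
```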

2. Let $G$ be a finite group and $\mathbb{C}[G]$ be the group algebra with complex coefficients. Let $$ f = \sum_x f(x) x, \quad g = \sum_y g(y) y $$ be two elements of $\mathbb{C}[G]$. The product is $$ f g = \sum_z \Big( \sum_{x y = z} f(x) g(y) \Big) z. $$ When $f$ and $g$ are treated as functions $f, g : G \to \mathbb{C}$, $$ (f g) (z) = \sum_{x y = z} f(x) g(y)$$ is a convolution. In fact, $\mathbb{C}[G]$ is an example of a convolution algebra, namely $L^1 (G)$ with counting measure.
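
For a concrete instance, take the cyclic group $G = \mathbb{Z}/n\mathbb{Z}$, where the product $xy$ is addition mod $n$ and the group convolution becomes the familiar circular convolution (a sketch; the names are my own):

```python
import numpy as np

def group_convolve(f, g, n):
    # (f g)(z) = sum_{x y = z} f(x) g(y); in Z/nZ the condition is x + y = z (mod n)
    fg = np.zeros(n)
    for x in range(n):
        for y in range(n):
            fg[(x + y) % n] += f[x] * g[y]
    return fg

n = 5
rng = np.random.default_rng(1)
f, g = rng.standard_normal(n), rng.standard_normal(n)
# agrees with circular convolution computed via the FFT
assert np.allclose(group_convolve(f, g, n),
                   np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g))))
print(group_convolve(f, g, n))
```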

3. Let $J = \{1, \dots, n \}$ and $G' = J \times J = \{ ( i, j ) : 1 \leq i, j \leq n \}$. $G'$ is a groupoid: not every pair of elements of $G'$ can be composed; $(i, j)$ can be composed with $(j', k)$ only when $j = j'$, in which case $(i, j) (j, k) = (i, k).$ 
Let us consider the groupoid convolution algebra $\mathbb{C}[G']$ as above. The convolution in this case is then $$(f g) (i,k) = \sum_{ (i, j) (j, k) = (i, k) } f(i, j) g(j, k).$$ If we rewrite this in another format, it should look more familiar: $$(AB)_{ik} = \sum_{j=1}^n A_{ij} B_{jk}. $$ This is matrix multiplication.
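
Written as code, the groupoid convolution is literally the triple loop of matrix multiplication (a minimal sketch):

```python
import numpy as np

def groupoid_convolve(A, B):
    # (f g)(i, k) = sum over composable pairs (i, j)(j, k) = (i, k)
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for k in range(n):
            C[i, k] = sum(A[i, j] * B[j, k] for j in range(n))
    return C

A = np.arange(9.0).reshape(3, 3)
B = np.arange(9.0, 18.0).reshape(3, 3)
assert np.allclose(groupoid_convolve(A, B), A @ B)
```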

Sunday, July 7, 2013

!

"Take a sphere of radius $R$ in $N$ dimensions, $N$ large; then most points inside the sphere are in fact very close to the surface."

Saturday, June 22, 2013

From combinatorics to entropy

From combinatorics to entropy:
Let $N = n_1 + \cdots + n_k$ and $p_i = \frac{n_i}{N} $. Then
$$\log \left( \frac{N!}{n_1 ! \cdots n_k ! } \right) \approx - N \sum_i p_i \log p_i $$
by Stirling's formula.
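
Spelling out the step: using $\log n! \approx n \log n - n$ and $\sum_i n_i = N$,
$$ \begin{align*} \log \left( \frac{N!}{n_1 ! \cdots n_k !} \right) & \approx (N \log N - N) - \sum_i (n_i \log n_i - n_i) \\ & = - \sum_i n_i \log \frac{n_i}{N} = - N \sum_i p_i \log p_i . \end{align*} $$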

I wonder if this was the first time ever in human history that such an expression $$\sum_i p_i \log p_i $$ appeared! Entropy is often too abstract for me; the approximation above is a link between counting combinations and entropy, and it seems to provide the most concrete grasp of it.
This is the genius of Boltzmann, Maxwell and Gibbs, which led to the development of statistical mechanics.

Energy, entropy, free energy, enthalpy, the Legendre transform, etc. are still difficult for me to understand.



Rényi/generalized entropy (note the sign convention: the Rényi entropy is usually written $H_q = \frac{1}{1-q} \log \sum_i p_i^q = -D_q$):
$$D_q = \frac{1}{q-1} \log \sum_i p_i^q $$
It is related to the generalized dimension $\dim_q(\mu)$ and the $L^q$-spectrum $\tau(q) := \liminf_{r \to 0} \frac{1}{\log r} \log \sup \sum_i \mu(B_i)^q$, where the supremum is taken, say, over families of disjoint balls $B_i$ of radius $r$ centered in the support of $\mu$. With $p_i = \mu(B_i)$ at scale $r$,
$$ \dim_q (\mu) = \lim_{r \to 0} \frac{D_q}{\log r} = \frac{\tau(q)}{q-1} .$$
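
As $q \to 1$, $H_q$ recovers the Shannon entropy $-\sum_i p_i \log p_i$; here is a minimal numerical sketch (the distribution is my own example):

```python
import numpy as np

p = np.array([0.5, 0.25, 0.125, 0.125])

def renyi(q):
    # H_q = log(sum_i p_i^q) / (1 - q)
    return np.log(np.sum(p ** q)) / (1 - q)

shannon = -np.sum(p * np.log(p))
for q in (0.9, 0.99, 0.999, 1.001):
    print(f"q = {q}: H_q = {renyi(q):.6f}   (Shannon = {shannon:.6f})")
```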

Thursday, May 9, 2013

Fourier coefficients as eigenvalues/spectrum

In this post I want to make a connection between Fourier coefficients and eigenvalues/spectrum. Let me put the claim up front:
If $ \lambda= \hat{f} (n)$ for some $n$, then $ f - \lambda \delta_0$ is not invertible with respect to the convolution product.
Please feel free to jump directly to the end for the explanation.

Let $\mathbb{T}= [0,1]$ be identified with the circle $\{ z \in \mathbb{C} : |z| =1 \} = \{ e^{2 \pi i t} : t \in [0,1]\} $. An integrable periodic function $f \in L^1 (\mathbb{T})$ can be associated with its decomposition as a Fourier series
$$ f(t) \sim \sum_{n=-\infty}^\infty \hat{f} (n) e^{2 \pi i n t}, $$
where the Fourier coefficients are defined by
$$ \hat{f} (n) := \int_0^1 f(t) e^{-2 \pi i n t} dt . $$
Intuitively, the Fourier coefficients $\hat{f} (n)$ tell us the weights of the components $e^{2 \pi i n t}$ within $f$; the sequence $ \hat{f}$ is the spectrum of the signal $f$.

In linear algebra, the spectrum of a square matrix $A$ (or more generally a bounded linear operator) is the set $\{ \lambda \in \mathbb{C} : A- \lambda I $  is not invertible $\} $; for matrices this is exactly the set of eigenvalues. Extending this, the spectrum of an element $a$ in a unital Banach algebra $A$ is $\{\lambda \in \mathbb{C}: a - \lambda 1$ is not invertible $\}$. Is there any way to connect this spectrum with the spectrum of a function?

Let us recall the definition of convolution of two functions:
$$f *g (x) := \int_0^1 f(x-t) g(t) dt .$$
It has the following property: $\widehat{f*g} (n) = \hat{f} (n) \hat{g} (n) $. Note that $L^1(\mathbb{T}) $ is an algebra with respect to this convolution product (not with respect to the usual pointwise multiplication)! However, $L^1(\mathbb{T})$ does not have an identity element (is there a function $g$ such that $f * g = f$ for all $f$?), so we cannot talk about invertibility in this algebra. (Hence in harmonic analysis we need something called an approximate identity as a substitute, sometimes appearing as a summability kernel in the classical setting.) For convenience, let us move to a larger algebra $M(\mathbb{T})$ which contains $L^1(\mathbb{T})$.
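
As a numerical sanity check of $\widehat{f*g} (n) = \hat{f} (n) \hat{g} (n)$ (a sketch; on a grid of $N$ points the integrals become circular Riemann sums):

```python
import numpy as np

N = 256
t = np.arange(N) / N
f = np.cos(2 * np.pi * t) + 0.5 * np.sin(4 * np.pi * t)
g = np.exp(np.cos(2 * np.pi * t))

# Discretized f*g: circular convolution times the mesh size 1/N
conv = np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real / N
# Discretized Fourier coefficients: DFT times 1/N
fhat, ghat, chat = (np.fft.fft(h) / N for h in (f, g, conv))
print(np.abs(chat - fhat * ghat).max())  # ~ 1e-16
```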

$M(\mathbb{T})$ is the algebra of regular Borel measures on $\mathbb{T}$. A function $f \in L^1(\mathbb{T})$ is identified with the measure $f(t)dt \in M(\mathbb{T})$. The convolution product and Fourier coefficients of a measure are extended as follows:
$$ \begin{align*} f * \mu (x) :=& \int_0^1 f(x-t) d \mu(t) , \\  \mu * \nu (E) :=&  \int_0^1 \int_0^1 1_E (x+y)  d\mu(x) d\nu(y) \\  =&    \int_0^1 \mu(E-y) d\nu(y) \end{align*} $$
and
$$ \hat{\mu} (n) := \int_0^1 e^{-2 \pi i n t} d\mu(t) $$
where $1_E$ is the indicator function of $E$. The identity $\widehat{\mu * \nu} (n) = \hat{\mu} (n) \hat{\nu} (n)$ holds analogously.

Now we have an identity element in $M(\mathbb{T})$, namely the Dirac measure $\delta_0$:
$$ f* \delta_0 (x) = \int_0^1 f(x-t) d \delta_0(t) = f(x) .$$
Its Fourier transform is $\hat{\delta}_0 (n) = \int_0^1 e^{-2 \pi i n t} d\delta_0 (t) = 1$ for all $n$.

Finally, we can prove our claim by establishing its contrapositive.
Suppose $f - \lambda \delta_0$ is invertible; then $$ (f - \lambda \delta_0) * h = \delta_0 $$ for some $h \in M(\mathbb{T}) $. Taking Fourier transforms we have $$ (\hat{f} (n) - \lambda ) \hat{h} (n) = 1 $$ for all $n$. This implies that $ \lambda \neq \hat{f} (n) $ for all $n$.


Remark:
1. Originally I wanted to prove the converse as well, but it involves some theory of the Gelfand transform.

2. To be frank, this may not be the best way of presenting $\hat{f} (n)$ as eigenvalues. A more proper explanation would be along the lines of the eigenspace decomposition of the regular representation
$$ \begin{align*} \pi : L^1 (\mathbb{T}) &\to B(L^2(\mathbb{T})) \\ \pi(f) g &:= f *g . \end{align*} $$ In this case we can really claim that $\lambda = \hat{f} (n)$ for some $n$ if and only if $\lambda$ is an eigenvalue of $\pi(f)$.
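
Indeed, for $e_n(t) := e^{2 \pi i n t}$, a one-line computation (substituting $s = x - t$ and using periodicity) exhibits the eigenvectors:
$$ (\pi(f) e_n)(x) = \int_0^1 f(x-t) e^{2 \pi i n t} dt = e^{2 \pi i n x} \int_0^1 f(s) e^{-2 \pi i n s} ds = \hat{f}(n) e_n(x) , $$
and since $\{e_n\}$ is an orthonormal basis of $L^2(\mathbb{T})$, the eigenvalues of $\pi(f)$ are exactly the values $\hat{f}(n)$.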

3. I hope this post is still an interesting observation, though, and that it will arouse your interest in the links between Fourier analysis, the spectral theorem and representation theory. The interplay of the group action with the differential operator is behind the scenes. One may also ponder this question: why is $e^{2 \pi i n t}$ so special?

Personally, my motivation for this investigation is the chain of generalizations of the notion of spectrum, from eigenvalues to the spectrum of a ring and of a $C^*$-algebra!