# Tag Archives: matrix

## SL(2,ℝ) is the Commutator Subgroup of GL(2,ℝ)

Here is a proof of the above fact.

Let $N$ be the commutator subgroup of the general linear group $GL(2,\mathbb R)$; i.e.,

$N=\langle ABA^{-1}B^{-1}:A,B\in GL(2,\mathbb R)\rangle$.

First, it is clear that $N$ is contained in the special linear group $SL(2,\mathbb R)$, since $\det(ABA^{-1}B^{-1})=1$ for any $A,B\in GL(2,\mathbb R)$. Next, we claim that $N$ contains all matrices

$\begin{pmatrix} 1 & b\\ 0 & 1\end{pmatrix}$.

This follows from noting that

$\begin{pmatrix} 1 & b\\ 0 & 1\end{pmatrix}=\begin{pmatrix} 2 & 0\\ 0 & 1\end{pmatrix}\begin{pmatrix} 1 & b\\ 0 & 1\end{pmatrix}\begin{pmatrix} 2 & 0\\ 0 & 1\end{pmatrix}^{-1}\begin{pmatrix} 1 & b\\ 0 & 1\end{pmatrix}^{-1}$.

By taking transposes, it also follows that $N$ contains all matrices

$\begin{pmatrix} 1 & 0\\ c & 1\end{pmatrix}$.

Further, $N$ contains all matrices

$\begin{pmatrix} a & 0\\ 0 & 1/a\end{pmatrix}$

since

$\begin{pmatrix} a & 0\\ 0 & 1/a\end{pmatrix}=\begin{pmatrix} a & 0\\ 0 & 1\end{pmatrix}\begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix} a & 0\\ 0 & 1\end{pmatrix}^{-1}\begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix}^{-1}$

for any $a\neq 0$.

Now let

$\begin{pmatrix} a & b\\ c & d\end{pmatrix}\in SL(2,\mathbb R)$.

Then $ad-bc=1$. Using the above results,

$\begin{pmatrix} a & b\\ c & d\end{pmatrix}=\begin{pmatrix} 1 & 0\\ c/a & 1\end{pmatrix}\begin{pmatrix} 1 & ab\\ 0 & 1\end{pmatrix}\begin{pmatrix} a & 0\\ 0 & 1/a\end{pmatrix}\in N$

if $a\neq 0$, and

$\begin{pmatrix} a & b\\ c & d\end{pmatrix}=\begin{pmatrix}0&1\\-1&0\end{pmatrix}\begin{pmatrix}1&-d/b\\ 0&1\end{pmatrix}\begin{pmatrix}1&0\\ ab&1\end{pmatrix}\begin{pmatrix}1/b&0\\ 0&b\end{pmatrix}\in N$

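if $b\neq 0$ (which is automatic when $a=0$, as then $-bc=1$). The first factor in the latter product lies in $N$ because its inverse does: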

\begin{aligned}&\begin{pmatrix} 0 & -1\\ 1 & 0\end{pmatrix}\\=&\begin{pmatrix}x&y\\0&-x-y\end{pmatrix}\begin{pmatrix}-x-y&0\\ x&y\end{pmatrix}\begin{pmatrix}x&y\\0&-x-y\end{pmatrix}^{-1}\begin{pmatrix}-x-y&0\\ x&y\end{pmatrix}^{-1}\\ \in &\ N\end{aligned}

for any $x,y$ with $x$, $y$ and $x+y$ all non-zero. Since $a$ and $b$ cannot both vanish, this covers every case. Thus $SL(2,\mathbb R)\subseteq N$, i.e., $N=SL(2,\mathbb R)$.
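The identities used in this proof are easy to spot-check with exact rational arithmetic. The following is a minimal sketch, not part of the original argument: the helper names `mat`, `mul`, `inv` and `commutator` are made up for illustration, and the sample values of $a,b,c,d,x,y$ are arbitrary choices.

```python
from fractions import Fraction as F

def mat(a, b, c, d):
    """2x2 matrix ((a, b), (c, d)) with exact rational entries."""
    return ((F(a), F(b)), (F(c), F(d)))

def mul(A, B):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inv(A):
    """Inverse of an invertible 2x2 matrix via the adjugate."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def commutator(A, B):
    """Return A B A^{-1} B^{-1}."""
    return mul(mul(A, B), mul(inv(A), inv(B)))

# diag(a, 1/a) as a commutator (sample value a = 3).
a = F(3)
diag_check = commutator(mat(a, 0, 0, 1), mat(0, 1, 1, 0)) == mat(a, 0, 0, 1 / a)

# ((0, -1), (1, 0)) as a commutator (sample values x = 1, y = 2).
x, y = F(1), F(2)
J = commutator(mat(x, y, 0, -x - y), mat(-x - y, 0, x, y))

# Factorisation of an element of SL(2, R) with a != 0 (here ad - bc = 1).
a, b, c, d = F(2), F(3), F(1), F(2)
first = mul(mul(mat(1, 0, c / a, 1), mat(1, a * b, 0, 1)), mat(a, 0, 0, 1 / a))
first_check = first == mat(a, b, c, d)

# Factorisation with b != 0 (here a = 0, so bc = -1).
a, b, c, d = F(0), F(2), F(-1, 2), F(5)
second = mul(
    mul(mat(0, 1, -1, 0), mat(1, -d / b, 0, 1)),
    mul(mat(1, 0, a * b, 1), mat(1 / b, 0, 0, b)),
)
second_check = second == mat(a, b, c, d)
```

All three flags come out `True` and `J` equals `mat(0, -1, 1, 0)`. This only tests particular values, of course; the proof above is what establishes the identities in general.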

Filed under Linear Algebra

## Some Interesting Linear Algebra Proofs

Below are some cute linear algebra results and proofs cherrypicked from various sources. All the standard hypotheses (on the base field, the size of the matrices, etc.) that make the claims valid are assumed. The list will likely be updated.

Fact 1. Let $A,B,X$ be matrices. If $AX=XB$, then $p(A)X=Xp(B)$ for any polynomial $p$.

Proof. We have $A^2X=A(AX)=A(XB)=(AX)B=(XB)B=XB^2$. By induction, $A^nX=XB^n$ for every integer $n\ge 0$, and the result follows by linearity. $\square$

Remark. Note that $X$ need not be square, let alone invertible.

Fact 2. Let $x_0,\dots,x_n$ be distinct. Then the Vandermonde matrix

$\displaystyle V=\begin{pmatrix} 1 & x_0 & \cdots & x_0^n\\ 1 & x_1 & \cdots & x_1^n\\ \vdots & \vdots & \ddots & \vdots\\ 1 & x_n & \cdots & x_n^n\end{pmatrix}$

is invertible.

Proof. It suffices to show that the kernel of the linear transformation $f(x)=Vx$ is trivial. If $a=(a_0,\dots,a_n)$ is in the kernel, then $p(X)=a_0+a_1X+\cdots+a_nX^n=0$ for each $X\in\{x_0,\dots,x_n\}$. Since $p$ has degree at most $n$ but $n+1$ distinct roots, this forces $p(X)$ to be identically zero. Thus $a=0$. $\square$
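As a quick sanity check of Fact 2, one can compute a Vandermonde determinant exactly and see that it vanishes once two nodes coincide. This is a minimal sketch: the function names `vandermonde` and `det` are illustrative, and the determinant is computed by plain Gaussian elimination over the rationals rather than by any method from the proof.

```python
from fractions import Fraction

def vandermonde(xs):
    """Rows (1, x, x^2, ..., x^n) for each node x in xs."""
    n = len(xs)
    return [[Fraction(x) ** j for j in range(n)] for x in xs]

def det(M):
    """Determinant by Gaussian elimination with exact fractions."""
    M = [row[:] for row in M]          # work on a copy
    n = len(M)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            result = -result           # a row swap flips the sign
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return result

distinct = det(vandermonde([0, 1, 2]))   # distinct nodes
repeated = det(vandermonde([1, 2, 1]))   # a repeated node
```

Here `distinct` is non-zero and `repeated` is zero, in line with the statement.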

Fact 3. A matrix is diagonalisable iff its minimal polynomial decomposes into distinct linear factors.

Proof. A matrix is diagonalisable iff every Jordan block has size $1$. Since the multiplicity of an eigenvalue in the minimal polynomial corresponds to the size of the largest Jordan block, the result follows. $\square$

Corollary. Idempotent matrices are diagonalisable. Moreover, the rank of an idempotent matrix is equal to the algebraic multiplicity of the eigenvalue $1$.

Fact 4. If $A^nx=0$ and $A^{n-1}x\neq 0$, then $x,Ax,\dots,A^{n-1}x$ are linearly independent.

Proof. Note that $\ker(A^n)\supseteq\ker(A^{n-1})$. Let $x\in\ker(A^n)\setminus\ker(A^{n-1})$ and suppose that $a_0x+a_1Ax+\cdots+a_{n-1}A^{n-1}x=0$. Multiplying both sides by $A^{n-1}$ kills every term except $a_0A^{n-1}x$, so $a_0=0$. Multiplying successively by $A^{n-2},\dots,A,I$ then gives $a_1=\cdots=a_{n-1}=0$, as desired. $\square$

Corollary 1. If $\ker(A^n)\neq\ker(A^{n-1})$, then $\dim(\ker(A^j))\ge j$ for $j=1,\dots,n$.

Proof. If $x\in\ker(A^n)\setminus\ker(A^{n-1})$, then $A^{n-1}x,\dots,A^{n-j}x\in\ker(A^j)$, and these $j$ vectors are linearly independent by Fact 4. $\square$

Corollary 2. If $A$ is $n\times n$, and $\ker(A^n)\neq\ker(A^{n-1})$, then $\dim(\ker(A^j))=j$ for each $0\le j\le n$. In particular, $A$ is similar to the nilpotent Jordan block of size $n$.

Fact 5. If $f$ is linear on $V$, then $V\cong\ker(f)\oplus f(V)$.

Proof. The short exact sequence

$0\to\ker(f)\hookrightarrow V\twoheadrightarrow f(V)\to 0$

is split. So the result follows by the splitting lemma. $\square$

Corollary. If $W\subseteq V$ is a subspace, then $V\cong W\oplus W^\perp$.

Fact 6. If $A$ is $n\times n$, then $r(A)\ge n-k$, where $k$ is the algebraic multiplicity of the eigenvalue $0$ of $A$.

Proof. Since the nullity of $A$ is the geometric multiplicity of the eigenvalue $0$, and the geometric multiplicity of an eigenvalue is at most its algebraic multiplicity, we get $r(A)=n-n(A)\ge n-k$. $\square$

Fact 7. The number of distinct eigenvalues of $A$ is at most $r(A)+1$.

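Proof. Let $A$ be $n\times n$ and let $k$ be the algebraic multiplicity of the eigenvalue $0$. The non-zero eigenvalues, counted with algebraic multiplicity, number $n-k$, which is at most $r(A)$ by Fact 6. So $A$ has at most $r(A)$ distinct non-zero eigenvalues, plus possibly the eigenvalue $0$. $\square$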

Filed under Linear Algebra

## Discriminants and Lattices

Let $K=\mathbb Q(\alpha)$ be a quadratic number field. For $a,b\in K$, recall that the discriminant $\Delta(a,b)$ is defined as

$\displaystyle \Delta(a,b):=\left|\begin{matrix} a^{(1)} & a^{(2)}\\ b^{(1)} & b^{(2)}\end{matrix}\right|^2$

where $a^{(1)}, a^{(2)}$ are the Galois conjugates of $a$ and $b^{(1)}, b^{(2)}$ are those of $b$. For any $\beta\in K$ we define its discriminant to be $\Delta(\beta):=\Delta(1,\beta)$.

Write $a=a_1+a_2\alpha$ and $b=b_1+b_2\alpha$. Then

$\left(\begin{matrix} a\\ b\end{matrix}\right)=\underbrace{\left(\begin{matrix} a_1 & a_2\\ b_1 & b_2\end{matrix}\right)}_{A}\left(\begin{matrix} 1\\\alpha\end{matrix}\right)$

If $\alpha,\bar\alpha$ are the Galois conjugates of $\alpha$, then

$\Delta(a,b)=\left|\begin{matrix} a_1+a_2\alpha & a_1+a_2\bar\alpha\\ b_1+b_2\alpha & b_1+b_2\bar\alpha\end{matrix}\right|^2=\left|\begin{matrix} a_1 & a_2\\ b_1 & b_2\end{matrix}\right|^2\left|\begin{matrix} 1 & 1\\\alpha & \bar\alpha\end{matrix}\right|^2$

$\therefore\boxed{\Delta(a,b)=(\det A)^2\Delta(\alpha)}$

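Now suppose that $\alpha$ is an algebraic integer, so that $\mathbb Z[\alpha]=\mathbb Z+\mathbb Z\alpha$, and that $a,b\in\mathbb Z[\alpha]$, so that $A$ has integer entries. Suppose further that $\mathbb Z[\alpha]=a\mathbb Z+b\mathbb Z$. Then $\mathbb Z[\alpha]$ is spanned by $\{a,b\}$, so there are integers $p,q,r,s$ such that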

$\underbrace{\left(\begin{matrix} p & q\\ r & s\end{matrix}\right)}_{M}\left(\begin{matrix} a\\ b\end{matrix}\right)=\left(\begin{matrix} 1\\\alpha\end{matrix}\right)$

So we have

$MA\left(\begin{matrix} 1\\\alpha\end{matrix}\right)=\left(\begin{matrix} 1\\\alpha\end{matrix}\right)$.

Lemma. If $P$ is a $2\times 2$ matrix with integer coefficients and $w=(1,\alpha)^T$ with $\alpha\not\in\mathbb Q$, then $Pw=w$ if and only if $P=I$, the $2\times 2$ identity matrix.

Proof. This follows from the $\mathbb Z$-linear independence of $\{1,\alpha\}$. More concretely,

$\underbrace{\left(\begin{matrix} s & t\\ u & v\end{matrix}\right)}_{P}\left(\begin{matrix}1\\\alpha\end{matrix}\right)=\left(\begin{matrix} 1\\\alpha\end{matrix}\right)\Rightarrow\begin{cases}s+t\alpha=1\\ u+v\alpha=\alpha\end{cases}$

$\therefore s=1,\ t=0,\ u=0,\ v=1\Rightarrow P=I$. $\square$

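Thus $MA=I$, so that $\det(M)\det(A)=1$. But $\det(M)$ and $\det(A)$ are integers. Hence $|\det(M)|=|\det(A)|=1$, i.e. $\Delta(a,b)=\Delta(\alpha)$. Conversely, if $\Delta(a,b)=\Delta(\alpha)$, then $\det(A)=\pm 1$ (as $\Delta(\alpha)\neq 0$), so $A^{-1}$ has integer entries and $(1,\alpha)^T=A^{-1}(a,b)^T$ shows that $1,\alpha\in a\mathbb Z+b\mathbb Z$. Thus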

Fact. $\{a,b\}\subset\mathbb Z[\alpha]$ spans $\mathbb Z[\alpha]$ if and only if $\Delta(a,b)=\Delta(\alpha)$.

Note that all of the above arguments generalize to arbitrary number fields.

A nice corollary:

Corollary. $(a,b)$ and $(c,d)$ generate $\mathbb Z^2$ (as a group) if and only if

$\left|\begin{matrix} a & b\\ c & d\end{matrix}\right|=\pm 1$.

In other words, two bases generate the same lattice only if their fundamental parallelograms have equal areas.
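To make the boxed formula and the spanning criterion concrete, here is a minimal sketch for the special case $\alpha=\sqrt d$ with $d$ a squarefree integer. That choice of $\alpha$ is an assumption made for this example only. Elements $p+q\sqrt d$ are stored as integer pairs `(p, q)`, and the helper names `qmul`, `qconj`, `disc` and `spans` are hypothetical.

```python
def qmul(u, v, d):
    """(u0 + u1*sqrt(d)) * (v0 + v1*sqrt(d)) as a pair of integers."""
    return (u[0] * v[0] + d * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def qconj(u):
    """Galois conjugate: sqrt(d) -> -sqrt(d)."""
    return (u[0], -u[1])

def disc(a, b, d):
    """Delta(a, b): square of the determinant with rows (a, a') and (b, b')."""
    ab = qmul(a, qconj(b), d)               # a * b'
    ba = qmul(qconj(a), b, d)               # a' * b
    det = (ab[0] - ba[0], ab[1] - ba[1])    # a*b' - a'*b
    square = qmul(det, det, d)
    assert square[1] == 0                   # the discriminant is rational
    return square[0]

def spans(a, b, d):
    """Criterion from the Fact: {a, b} spans Z[sqrt(d)] iff the discriminants agree."""
    return disc(a, b, d) == disc((1, 0), (0, 1), d)

d = 2
delta_alpha = disc((1, 0), (0, 1), d)       # Delta(sqrt(2))
a, b = (1, 1), (2, 3)                       # det A = 1*3 - 1*2 = 1
c, e = (1, 0), (0, 2)                       # det A = 2
```

With these values, `disc(a, b, d)` equals `delta_alpha`, so `spans(a, b, d)` holds, while `disc(c, e, d)` is $2^2$ times `delta_alpha` and `spans(c, e, d)` fails, matching $\Delta(a,b)=(\det A)^2\Delta(\alpha)$.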

Filed under Geometry, Linear Algebra, Number Theory

## Commuting Matrices

This post arose from an attempt to solve a question in this past Waterloo pure mathematics PhD comprehensive exam.

Let $k$ be an algebraically closed field. Let $\mathcal M_n(k)$ denote the set of all $n\times n$ matrices with entries in $k$. Let

$\displaystyle J_\lambda:=\left(\begin{matrix}\lambda & 1 & \cdots & 0\\ 0 & \lambda & \cdots & 0\\\vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \lambda\end{matrix}\right)\in\mathcal M_n(k)$

be a Jordan block. Let $e_1,\dots,e_n$ be the standard basis vectors, i.e. the $j$-th component of $e_i$ is $\delta_{ij}$. Note the action of $J_0$ on the basis vectors: $J_0e_i=e_{i-1}$ for each $i$ (where we take $e_0=0$).

Suppose $P\in\mathcal M_n(k)$ commutes with $J_\lambda$. Then

$Pe_{i-1}=PJ_0e_i=P(J_\lambda-\lambda I)e_i=(J_\lambda-\lambda I)Pe_i=J_0Pe_i$

i.e. if $P_i:=Pe_i$ is the $i$-th column of $P$, then $P_{i-1}=J_0P_i$ for each $i$. Thus

$P=(P_1\mid\cdots\mid P_n)=(J_0^{n-1}P_n\mid\cdots\mid P_n)=(J_0^{n-1}\mid\cdots\mid I)P_n$

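i.e. if $P_n=(a_1,\dots,a_n)^T$, then $P=a_1J_0^{n-1}+\cdots+a_nI$, a polynomial in $J_0$. Further, since $J_0=J_\lambda-\lambda I$, it follows that $P$ is a polynomial in $J_\lambda$. The converse is clear, since any polynomial in $J_0$ commutes with $J_0$ and hence with $J_\lambda=J_0+\lambda I$. So we deduce: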

Fact 1. $P$ commutes with $J_\lambda$ iff $P$ is a polynomial in $J_0$.

Fact 2. $P$ commutes with $J_\lambda$ iff $P$ is a polynomial in $J_\lambda$.

If we denote

$\mathcal C(A):=\{B\in\mathcal M_n(k): AB=BA\}$

for $A\in\mathcal M_n(k)$, then we’ve just shown

\begin{aligned}\mathcal C(J_\lambda)&=\{f(J_\lambda):f\in k[X]\}\\ &=\{f(J_0):f\in k[X]\}\\ &=\mathcal C(J_0)\end{aligned}

Now let $A\in\mathcal M_n(k)$ have minimal and characteristic polynomial $(X-\lambda)^n$. This means the Jordan normal form of $A$ is $J_\lambda$. So there exists an invertible matrix $M$ such that $A=MJ_\lambda M^{-1}$. Thus

\begin{aligned}\mathcal C(A)&=\{P\in\mathcal M_n(k): PA=AP\}\\ &=\{P\in\mathcal M_n(k):PMJ_\lambda M^{-1}=MJ_\lambda M^{-1}P\}\\ &=\{P\in\mathcal M_n(k): M^{-1}PMJ_\lambda=J_\lambda M^{-1}PM\}\\ &=\{P\in\mathcal M_n(k): M^{-1}PM=f(J_\lambda)\text{ for some }f\in k[X]\}\\ &=\{P\in\mathcal M_n(k): P=Mf(J_\lambda)M^{-1}\text{ for some }f\in k[X]\}\\ &=\{P\in\mathcal M_n(k): P=f(MJ_\lambda M^{-1})\text{ for some }f\in k[X]\}\\ &=\{P\in\mathcal M_n(k): P=f(A)\text{ for some }f\in k[X]\}\\ &=\{f(A): f\in k[X]\}\end{aligned}

Thus

Fact 3. If $A$ has minimal and characteristic polynomial $(X-\lambda)^n$, then $P$ commutes with $A$ iff $P$ is a polynomial in $A$.
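The description of the matrices commuting with a Jordan block can be tried out on a small case. The sketch below rebuilds $P=a_1J_0^{n-1}+\cdots+a_nI$ from a prescribed last column and checks that it commutes with $J_\lambda$. The function names and the sample values ($n=3$, $\lambda=4$, last column $(5,7,11)^T$) are illustrative choices, not taken from the exam question.

```python
def jordan_block(lam, n):
    """n x n Jordan block with eigenvalue lam and ones on the superdiagonal."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, k):
    """A^k for k >= 0, with A^0 the identity."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result

def from_last_column(coeffs):
    """P = a_1 J_0^{n-1} + ... + a_n I for coeffs = [a_1, ..., a_n]."""
    n = len(coeffs)
    J0 = jordan_block(0, n)
    P = [[0] * n for _ in range(n)]
    for k, a in enumerate(coeffs, start=1):
        term = matpow(J0, n - k)
        P = [[P[i][j] + a * term[i][j] for j in range(n)] for i in range(n)]
    return P

coeffs = [5, 7, 11]
P = from_last_column(coeffs)
J = jordan_block(4, 3)
commutes = matmul(P, J) == matmul(J, P)
last_column = [row[-1] for row in P]

# A diagonal matrix with distinct entries is not of this form and fails to commute.
D = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
diagonal_commutes = matmul(D, J) == matmul(J, D)
```

Here `P` is upper triangular and constant along each diagonal, its last column is `coeffs`, and `commutes` is `True` while `diagonal_commutes` is `False`.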

Filed under Linear Algebra

## Möbius Transformations and Cross-Ratios

A Möbius transformation is a map $f:\mathbb C_\infty\to\mathbb C_\infty$ of the form

$\displaystyle f(z)=\frac{az+b}{cz+d},\quad a,b,c,d\in\mathbb C,\quad ad-bc\neq 0$

where $\mathbb C_\infty:=\mathbb C\cup\{\infty\}$ is the extended complex plane and the ‘point at infinity’ $\infty$ is defined so that

1. if $c\neq 0$ then $f(\infty)=a/c$ and $f(-d/c)=\infty$;
2. if $c=0$ then $f(\infty)=\infty$.

The following video gives a very illuminating illustration. The sphere in the video is called the Riemann sphere, which in a sense ‘wraps up’ the extended complex plane into a sphere. Each point on the sphere corresponds to a unique point on the plane (i.e. there is a bijection between points on the extended plane and points on the sphere), with the ‘light source’ being the point at infinity. This bijective correspondence is the main reason for including the point at infinity.

According to the video, any Möbius transformation can be generated by four basic types of map, namely translations, dilations, rotations and inversions:

1. Translation: $f(z)=z+b$, $b\in\mathbb C$
2. Dilation: $f(z)=az$, $a\in\mathbb R$, $a>0$
3. Rotation: $f(z)=e^{i\theta}z$, $\theta\in[0,2\pi)$
4. Inversion: $f(z)=1/z$

Exercise. Show that any Möbius transformation is some composition of these operations.

The Möbius transformations in fact form a group $\mathcal M$ under composition which acts on $\mathbb C_\infty$. Moreover, we have a surjective homomorphism

$\displaystyle\begin{matrix}\hbox{GL}_2(\mathbb C)\to\mathcal M,\quad\left(\begin{matrix} a & b\\ c & d\end{matrix}\right)\mapsto\displaystyle\frac{az+b}{cz+d}\end{matrix}$.

Möbius transformations exhibit very interesting properties, some of which are:

Proposition 1. Given distinct $z_1,z_2,z_3\in\mathbb C_\infty$, there is a unique Möbius map $f\in\mathcal M$ such that

$\displaystyle f:\left(\begin{matrix} z_1\\ z_2\\z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$

Proof. It is not difficult to work out that the unique $f$ is given by

$\displaystyle f(z)=\frac{z-z_1}{z-z_3}\frac{z_2-z_3}{z_2-z_1}$. $\square$

Proposition 2. The action of $\mathcal M$ on $\mathbb C_\infty$ is sharply triply transitive: if $z_1,z_2,z_3\in\mathbb C_\infty$ are distinct and $w_1,w_2,w_3\in\mathbb C_\infty$ are distinct, then there exists a unique $f\in\mathcal M$ such that $f(z_i)=w_i$ for $i=1,2,3$.

Proof. By proposition 1, there is a unique $g\in\mathcal M$ such that

$\displaystyle g:\left(\begin{matrix} z_1\\ z_2\\ z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$

and a unique $h\in\mathcal M$ such that

$\displaystyle h:\left(\begin{matrix} w_1\\ w_2\\ w_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$.

Then $f=h^{-1}\circ g$ is the unique map satisfying the required property. $\square$
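The construction $f=h^{-1}\circ g$ can be carried out with $2\times 2$ matrices, using the homomorphism from $\hbox{GL}_2(\mathbb C)$ described earlier. The following is a minimal sketch under simplifying assumptions: all six points are finite rationals, `None` stands in for $\infty$, and the helper names are made up for illustration.

```python
from fractions import Fraction as F

def to_zero_one_inf(z1, z2, z3):
    """Matrix of z -> (z - z1)(z2 - z3) / ((z - z3)(z2 - z1)), for finite points."""
    return ((z2 - z3, -z1 * (z2 - z3)), (z2 - z1, -z3 * (z2 - z1)))

def apply(m, z):
    """Evaluate the Moebius map with matrix m at a finite point z."""
    (a, b), (c, d) = m
    z = F(z)
    den = c * z + d
    return None if den == 0 else (a * z + b) / den   # None plays the role of infinity

def compose(m, n):
    """Matrix of the composition m after n, i.e. the matrix product m * n."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(m):
    """Adjugate matrix; it induces the inverse map because scalars act trivially."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))

zs = (0, 2, 5)        # sample source points
ws = (1, 3, -4)       # sample target points
g = to_zero_one_inf(*zs)
h = to_zero_one_inf(*ws)
f = compose(inverse(h), g)
images = [apply(f, z) for z in zs]
```

For these sample points `images` equals `[1, 3, -4]`, i.e. $f(z_i)=w_i$ as Proposition 2 predicts. Handling $\infty$ among the $z_i$ or $w_i$ would need extra case distinctions that this sketch leaves out.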

The cross-ratio $[z_1,z_2,z_3,z_4]$ of four distinct points $z_1,z_2,z_3,z_4\in\mathbb C_\infty$ is defined to be the unique $\lambda\in\mathbb C_\infty$ such that if $f\in\mathcal M$ is the unique map satisfying

$\displaystyle f:\left(\begin{matrix} z_1\\ z_2\\z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$

then $f(z_4)=\lambda$, i.e.

$\displaystyle [z_1,z_2,z_3,z_4]=\frac{z_1-z_4}{z_3-z_4}\frac{z_2-z_3}{z_2-z_1}$.      $(*)$

One nice thing about cross-ratios is that they are preserved by Möbius transformations.

Proposition 3. If $f\in\mathcal M$, then $[z_1,z_2,z_3,z_4]=[f(z_1),f(z_2),f(z_3),f(z_4)]$.

Proof. Let $g\in\mathcal M$ such that

$\displaystyle g:\left(\begin{matrix} z_1\\ z_2\\ z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$.

Then $[z_1,z_2,z_3,z_4]=g(z_4)$. Likewise, if $h\in\mathcal M$ satisfies

$\displaystyle h:\left(\begin{matrix} f(z_1)\\ f(z_2)\\ f(z_3)\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$

then $h\circ f=g$ by proposition 1, so $[f(z_1),f(z_2),f(z_3),f(z_4)]=h(f(z_4))=g(z_4)=[z_1,z_2,z_3,z_4]$. $\square$

From $(*)$ we observe that some permutations of $1,2,3,4$ leave the value of the cross-ratio $[z_1,z_2,z_3,z_4]$ invariant, e.g. $(1\ 3)(2\ 4)\in S_4$ is one such. What about the others?

Let $S_4$ act on the indices of the cross-ratio $[z_1,z_2,z_3,z_4]$. The permutations $\sigma\in S_4$ that fix $[z_1,z_2,z_3,z_4]$ form the stabiliser subgroup of $S_4$ of this action. Using transitivity and invariance (propositions 2 and 3), the orbit of $[z_1,z_2,z_3,z_4]$ is just the different assignments of the values $0,1,\infty$ to $z_1,z_2,z_3$; i.e. the distinct cross-ratios that we get by permuting the indices are just

$[z_1,z_2,z_3,z_4]$, $[z_1,z_3,z_2,z_4]$, $[z_2,z_1,z_3,z_4]$,

$[z_2,z_3,z_1,z_4]$, $[z_3,z_1,z_2,z_4]$, $[z_3,z_2,z_1,z_4]$.

Let $f\in\mathcal M$ such that

$\displaystyle f:\left(\begin{matrix} z_1\\ z_2\\z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$.

Then writing $\lambda=f(z_4)=[z_1,z_2,z_3,z_4]$ and evaluating $(*)$ at the images $0,1,\infty,\lambda$ shows that the values of the above cross-ratios are, in the order listed,

$\displaystyle \lambda,\frac{\lambda}{\lambda-1}, 1-\lambda,\frac{\lambda-1}{\lambda},\frac{1}{1-\lambda},\frac 1\lambda$

and they form the subgroup of $\mathcal M$ that fixes the set $\{0,1,\infty\}$. This group is isomorphic to $S_3$.

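Since the orbit has six elements (for generic $\lambda$) and $|S_4|=24$, the orbit-stabiliser theorem shows that there are in fact exactly four permutations $\sigma\in S_4$ such that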

$[z_1,z_2,z_3,z_4]=[z_{\sigma(1)},z_{\sigma(2)},z_{\sigma(3)},z_{\sigma(4)}]$

and they form a subgroup of $S_4$ that is isomorphic to the Klein four-group

$V_4=\{e, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$.
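The count of permutations fixing the cross-ratio can be confirmed by brute force over all of $S_4$. The sketch below uses formula $(*)$ with exact fractions. The four sample points are an arbitrary choice giving a generic value of $\lambda$, and permutations are written $0$-indexed as tuples, so `(1, 0, 3, 2)` stands for $(1\ 2)(3\ 4)$.

```python
from fractions import Fraction as F
from itertools import permutations

def cross_ratio(z1, z2, z3, z4):
    """Formula (*) for four distinct finite points."""
    return F((z1 - z4) * (z2 - z3), (z3 - z4) * (z2 - z1))

points = (2, 3, 5, 7)                 # sample points, chosen arbitrarily
lam = cross_ratio(*points)

# Cross-ratio of the permuted points, for every permutation of the indices.
value_of = {p: cross_ratio(*(points[i] for i in p)) for p in permutations(range(4))}

stabiliser = {p for p, v in value_of.items() if v == lam}
orbit = set(value_of.values())
expected_orbit = {lam, 1 / lam, 1 - lam, 1 / (1 - lam), lam / (lam - 1), (lam - 1) / lam}
```

For these points `lam` is $-5$, `orbit` has six elements and agrees with `expected_orbit`, and `stabiliser` consists of the identity together with the three double transpositions, i.e. the Klein four-group.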

Filed under Algebra, Complex Analysis