SL(2,\mathbb R) is the Commutator Subgroup of GL(2,\mathbb R)

Here is a proof of the above fact.

Let N be the commutator subgroup of the general linear group GL(2,\mathbb R); i.e.,

N=\langle ABA^{-1}B^{-1}:A,B\in GL(2,\mathbb R)\rangle.

First, it is clear that N is contained in the special linear group SL(2,\mathbb R), since \det(ABA^{-1}B^{-1})=1 for any A,B\in GL(2,\mathbb R). Next, we claim that N contains all matrices

\begin{pmatrix} 1 & b\\ 0 & 1\end{pmatrix}.

This follows from noting that

\begin{pmatrix} 1 & b\\ 0 & 1\end{pmatrix}=\begin{pmatrix} 2 & 0\\ 0 & 1\end{pmatrix}\begin{pmatrix} 1 & b\\ 0 & 1\end{pmatrix}\begin{pmatrix} 2 & 0\\ 0 & 1\end{pmatrix}^{-1}\begin{pmatrix} 1 & b\\ 0 & 1\end{pmatrix}^{-1}.

By taking transposes, it also follows that N contains all matrices

\begin{pmatrix} 1 & 0\\ c & 1\end{pmatrix}.

Further, N contains all matrices

\begin{pmatrix} a & 0\\ 0 & 1/a\end{pmatrix}

since

\begin{pmatrix} a & 0\\ 0 & 1/a\end{pmatrix}=\begin{pmatrix} a & 0\\ 0 & 1\end{pmatrix}\begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix} a & 0\\ 0 & 1\end{pmatrix}^{-1}\begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix}^{-1}

for any a\neq 0.

Now let

\begin{pmatrix} a & b\\ c & d\end{pmatrix}\in SL(2,\mathbb R).

Then ad-bc=1. Using the above results,

\begin{pmatrix} a & b\\ c & d\end{pmatrix}=\begin{pmatrix} 1 & 0\\ c/a & 1\end{pmatrix}\begin{pmatrix} 1 & ab\\ 0 & 1\end{pmatrix}\begin{pmatrix} a & 0\\ 0 & 1/a\end{pmatrix}\in N

if a\neq 0, and

\begin{pmatrix} a & b\\ c & d\end{pmatrix}=\begin{pmatrix}0&1\\-1&0\end{pmatrix}\begin{pmatrix}1&-d/b\\ 0&1\end{pmatrix}\begin{pmatrix}1&0\\ ab&1\end{pmatrix}\begin{pmatrix}1/b&0\\ 0&b\end{pmatrix}\in N

if b\neq 0. (Since ad-bc=1, a and b cannot both vanish, so one of the two decompositions always applies.) In the second decomposition, the first factor lies in N because N, being a subgroup, contains the inverse of each of its elements, and

\begin{pmatrix} 0 & -1\\ 1 & 0\end{pmatrix}=\begin{pmatrix}x&y\\0&-x-y\end{pmatrix}\begin{pmatrix}-x-y&0\\ x&y\end{pmatrix}\begin{pmatrix}x&y\\0&-x-y\end{pmatrix}^{-1}\begin{pmatrix}-x-y&0\\ x&y\end{pmatrix}^{-1}\in N

for any x,y with x\neq 0, y\neq 0 and x+y\neq 0 (e.g. x=y=1). Thus SL(2,\mathbb R)\subseteq N, i.e., N=SL(2,\mathbb R).
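
For the skeptical reader, here is a symbolic sanity check of the three commutator identities and the a\neq 0 decomposition above, a minimal sketch using sympy (the variable names are mine):

```python
import sympy as sp

a, b, c, x, y = sp.symbols('a b c x y', nonzero=True)

def comm(A, B):
    """The commutator A B A^{-1} B^{-1}."""
    return (A * B * A.inv() * B.inv()).applyfunc(sp.simplify)

# Upper unitriangular matrices are commutators:
D2 = sp.Matrix([[2, 0], [0, 1]])
U = sp.Matrix([[1, b], [0, 1]])
assert comm(D2, U) == U

# diag(a, 1/a) is a commutator:
S = sp.Matrix([[0, 1], [1, 0]])
Da = sp.Matrix([[a, 0], [0, 1]])
assert comm(Da, S) == sp.Matrix([[a, 0], [0, 1/a]])

# The rotation by 90 degrees is a commutator whenever x, y, x + y != 0:
P = sp.Matrix([[x, y], [0, -x - y]])
Q = sp.Matrix([[-x - y, 0], [x, y]])
assert comm(P, Q) == sp.Matrix([[0, -1], [1, 0]])

# The a != 0 decomposition of an element of SL(2, R), with d = (1 + b*c)/a:
lower = sp.Matrix([[1, 0], [c/a, 1]])
upper = sp.Matrix([[1, a*b], [0, 1]])
diag = sp.Matrix([[a, 0], [0, 1/a]])
target = sp.Matrix([[a, b], [c, (1 + b*c)/a]])
assert (lower * upper * diag - target).applyfunc(sp.simplify) == sp.zeros(2, 2)
```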

Some Interesting Linear Algebra Proofs

Below are some cute linear algebra results and proofs cherry-picked from various sources. All the standard hypotheses (on the base field, the sizes of the matrices, etc.) that make the claims valid are assumed. The list will likely be updated.

Fact 1. Let A,B,X be matrices. If AX=XB, then p(A)X=Xp(B) for any polynomial p.

Proof. We have A^2X=A(AX)=A(XB)=(AX)B=(XB)B=XB^2. By induction, A^nX=XB^n for any n\in\mathbb N. Hence the result follows. \square

Remark. Note that X need not be square, let alone invertible.
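
A quick numerical illustration with a non-square intertwiner (a sketch; the matrices are my own arbitrary choices):

```python
import numpy as np

A = np.diag([1.0, 2.0, 3.0])             # 3x3
B = np.diag([1.0, 2.0])                  # 2x2
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])               # 3x2, certainly not invertible
assert np.allclose(A @ X, X @ B)         # AX = XB

def p(M):
    """p(t) = t^3 - 2t + 5 evaluated at a square matrix M."""
    return M @ M @ M - 2 * M + 5 * np.eye(M.shape[0])

assert np.allclose(p(A) @ X, X @ p(B))   # p(A)X = Xp(B)
```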

Fact 2. Let x_0,\dots,x_n be distinct. Then the Vandermonde matrix

\displaystyle V=\begin{pmatrix} 1 & x_0 & \cdots & x_0^n\\ 1 & x_1 & \cdots & x_1^n\\ \vdots & \vdots & \ddots & \vdots\\ 1 & x_n & \cdots & x_n^n\end{pmatrix}

is invertible.

Proof. It suffices to show that the kernel of the linear map x\mapsto Vx is trivial. If a=(a_0,\dots,a_n) is in the kernel, then the polynomial p(X)=a_0+a_1X+\cdots+a_nX^n vanishes at each of x_0,\dots,x_n. Since \deg(p)\le n but p has n+1 distinct roots, p is identically zero. Thus a=0. \square
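
In sympy one can also check the classical factorisation \det(V)=\prod_{i>j}(x_i-x_j), which is non-zero exactly when the nodes are distinct (a sketch; the nodes below are my choice):

```python
import sympy as sp

xs = [0, 1, 2, 5]                        # distinct nodes
n = len(xs)
V = sp.Matrix(n, n, lambda i, j: xs[i]**j)
assert V.det() != 0                      # invertible since the nodes are distinct
# The classical factorisation of the Vandermonde determinant:
assert V.det() == sp.prod(xs[i] - xs[j] for i in range(n) for j in range(i))
```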

Fact 3. A matrix is diagonalisable iff its minimal polynomial decomposes into distinct linear factors.

Proof. A matrix is diagonalisable iff every Jordan block has size 1. Since the multiplicity of an eigenvalue in the minimal polynomial is the size of the largest Jordan block for that eigenvalue, the result follows. \square

Corollary. Idempotent matrices are diagonalisable: an idempotent satisfies A^2=A, so its minimal polynomial divides X(X-1). Moreover, the rank of an idempotent matrix is equal to the algebraic multiplicity of the eigenvalue 1.
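
A sympy spot-check of the corollary on a small idempotent (a sketch; the matrix is arbitrary):

```python
import sympy as sp

P = sp.Matrix([[1, 1], [0, 0]])
assert P * P == P                        # idempotent, so its minimal polynomial
assert P.is_diagonalizable()             # divides X(X - 1)
eigs = P.eigenvals(multiple=True)        # eigenvalues with algebraic multiplicity
assert P.rank() == sum(1 for v in eigs if v == 1)
```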

Fact 4. If A^nx=0 and A^{n-1}x\neq 0, then x,Ax,\dots,A^{n-1}x are linearly independent.

Proof. Note that the hypotheses say precisely that x\in\ker(A^n)\setminus\ker(A^{n-1}). Suppose that a_0x+a_1Ax+\cdots+a_{n-1}A^{n-1}x=0. Multiplying both sides by A^{n-1} kills every term except a_0A^{n-1}x, so a_0=0; multiplying by A^{n-2} then gives a_1=0, and so on. Hence every a_i=0, as desired. \square

Corollary 1. If \ker(A^n)\neq\ker(A^{n-1}), then \dim(\ker(A^j))\ge j for j=1,\dots,n.

Proof. If x\in\ker(A^n)\setminus\ker(A^{n-1}), then A^{n-1}x,\dots,A^{n-j}x\in\ker(A^j), and these j vectors are linearly independent by Fact 4. \square

Corollary 2. If A is n\times n, and \ker(A^n)\neq\ker(A^{n-1}), then \dim(\ker(A^j))=j for each 0\le j\le n. Indeed, once two consecutive kernels in the chain \ker(A^0)\subseteq\ker(A^1)\subseteq\cdots coincide, the chain stabilises; so all n inclusions up to \ker(A^n) are strict, and n strict jumps inside an n-dimensional space force \dim(\ker(A^j))=j. In particular, A is similar to the nilpotent Jordan block of size n.
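
Fact 4 in action for the nilpotent 3\times 3 Jordan block (a sketch):

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])          # A^3 = 0 but A^2 != 0
x = np.array([0.0, 0.0, 1.0])            # A^2 x = e_1 != 0, A^3 x = 0
K = np.column_stack([x, A @ x, A @ A @ x])
assert np.linalg.matrix_rank(K) == 3     # x, Ax, A^2 x are linearly independent
```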

Fact 5. If f is a linear map on V, then V\cong\ker(f)\oplus f(V).

Proof. The short exact sequence

0\to\ker(f)\hookrightarrow V\twoheadrightarrow f(V)\to 0

is split, since every short exact sequence of vector spaces splits. So the result follows by the splitting lemma. \square

Corollary. If W\subseteq V is a subspace of an inner product space, then V\cong W\oplus W^\perp. (Take f to be the orthogonal projection of V onto W.)
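
The dimension count behind Fact 5, checked with sympy on a rank-one map (a sketch; the matrix is arbitrary):

```python
import sympy as sp

F = sp.Matrix([[1, 2, 3],
               [2, 4, 6]])               # a linear map Q^3 -> Q^2 of rank 1
assert len(F.nullspace()) + F.rank() == F.cols   # dim ker(f) + dim f(V) = dim V
```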

Fact 6. If A is n\times n, then r(A)\ge n-k, where k is the algebraic multiplicity of the eigenvalue 0 of A.

Proof. Since the nullity of A is the geometric multiplicity of the eigenvalue 0, and the geometric multiplicity of an eigenvalue is at most its algebraic multiplicity, we get r(A)=n-n(A)\ge n-k. \square

Fact 7. The number of distinct eigenvalues of A is at most r(A)+1.

Proof. Eigenvectors corresponding to distinct eigenvalues are linearly independent, and every eigenvector v for a non-zero eigenvalue \lambda lies in the column space of A, since v=A(v/\lambda). Hence A has at most r(A) distinct non-zero eigenvalues, and allowing for the eigenvalue 0 as well gives the bound r(A)+1. \square
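
Both facts checked on a small example mixing a nilpotent block with a non-zero eigenvalue (a sketch; the matrix is my choice):

```python
import sympy as sp

A = sp.Matrix([[0, 1, 0],
               [0, 0, 0],
               [0, 0, 2]])               # eigenvalues 0 (multiplicity 2) and 2
mults = A.eigenvals()                    # {eigenvalue: algebraic multiplicity}
assert A.rank() >= A.rows - mults.get(0, 0)      # Fact 6: 2 >= 3 - 2
assert len(mults) <= A.rank() + 1                # Fact 7: 2 <= 2 + 1
```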

Discriminants and Lattices

Let K=\mathbb Q(\alpha) be a quadratic number field. For a,b\in K, recall that the discriminant \Delta(a,b) is defined as

\displaystyle \Delta(a,b):=\left|\begin{matrix} a^{(1)} & a^{(2)}\\ b^{(1)} & b^{(2)}\end{matrix}\right|^2

where a^{(1)}, a^{(2)} are the Galois conjugates of a and b^{(1)}, b^{(2)} are those of b. For any \beta\in K we define its discriminant to be \Delta(\beta):=\Delta(1,\beta).

Write a=a_1+a_2\alpha and b=b_1+b_2\alpha. Then

\left(\begin{matrix} a\\ b\end{matrix}\right)=\underbrace{\left(\begin{matrix} a_1 & a_2\\ b_1 & b_2\end{matrix}\right)}_{A}\left(\begin{matrix} 1\\\alpha\end{matrix}\right).

If \alpha,\bar\alpha are the Galois conjugates of \alpha, then

\Delta(a,b)=\left|\begin{matrix} a_1+a_2\alpha & a_1+a_2\bar\alpha\\ b_1+b_2\alpha & b_1+b_2\bar\alpha\end{matrix}\right|^2=\left|\begin{matrix} a_1 & a_2\\ b_1 & b_2\end{matrix}\right|^2\left|\begin{matrix} 1 & 1\\\alpha & \bar\alpha\end{matrix}\right|^2

\therefore\boxed{\Delta(a,b)=(\det A)^2\Delta(\alpha)}
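
Here is a symbolic verification of the boxed formula for K=\mathbb Q(\sqrt 2), a sketch in sympy (the choice \alpha=\sqrt 2 and the integer coordinates are mine):

```python
import sympy as sp

alpha = sp.sqrt(2)
conj = lambda t: sp.sympify(t).subs(sp.sqrt(2), -sp.sqrt(2))   # Galois conjugation

def disc(u, v):
    """Delta(u, v) = det([[u, u'], [v, v']])^2."""
    return sp.expand(sp.Matrix([[u, conj(u)], [v, conj(v)]]).det() ** 2)

a1, a2, b1, b2 = 3, 1, 5, 2              # arbitrary integer coordinates
a = a1 + a2 * alpha
b = b1 + b2 * alpha
A = sp.Matrix([[a1, a2], [b1, b2]])
assert disc(a, b) == A.det() ** 2 * disc(1, alpha)
```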

Now suppose that \mathbb Z[\alpha]=a\mathbb Z+b\mathbb Z. Then \mathbb Z[\alpha] is spanned by \{a,b\}, so there are integers p,q,r,s such that

\underbrace{\left(\begin{matrix} p & q\\ r & s\end{matrix}\right)}_{M}\left(\begin{matrix} a\\ b\end{matrix}\right)=\left(\begin{matrix} 1\\\alpha\end{matrix}\right)

So we have

MA\left(\begin{matrix} 1\\\alpha\end{matrix}\right)=\left(\begin{matrix} 1\\\alpha\end{matrix}\right).

Lemma. If P is a 2\times 2 matrix with integer entries and w=(1,\alpha)^T with \alpha\not\in\mathbb Q, then Pw=w if and only if P=I, the 2\times 2 identity matrix.

Proof. This follows from the \mathbb Z-linear independence of \{1,\alpha\}. More concretely,

\underbrace{\left(\begin{matrix} s & t\\ u & v\end{matrix}\right)}_{P}\left(\begin{matrix}1\\\alpha\end{matrix}\right)=\left(\begin{matrix} 1\\\alpha\end{matrix}\right)\Rightarrow\begin{cases}s+t\alpha=1\\ u+v\alpha=\alpha\end{cases}

\therefore s=1,\ t=0,\ u=0,\ v=1\Rightarrow P=I. \square

Thus MA=I, so that \det(M)\det(A)=1. But \det(M) and \det(A) are integers. Hence |\det(M)|=|\det(A)|=1, i.e. \Delta(a,b)=\Delta(\alpha). Conversely, if \Delta(a,b)=\Delta(\alpha), then (\det A)^2=1 (note \Delta(\alpha)\neq 0 since \alpha\neq\bar\alpha), so A^{-1} has integer entries and \{a,b\} spans \mathbb Z[\alpha]. Thus

Fact. \{a,b\}\subset\mathbb Z[\alpha] spans \mathbb Z[\alpha] if and only if \Delta(a,b)=\Delta(\alpha).

Note that all of the above arguments generalise to arbitrary number fields.

A nice corollary:

Corollary. (a,b) and (c,d) generate \mathbb Z^2 (as a group) if and only if

\left|\begin{matrix} a & b\\ c & d\end{matrix}\right|=\pm 1.

In other words, two bases generate the same lattice only if their fundamental parallelograms have equal areas.
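
For instance (a small sketch; the basis is my choice), (2,1) and (1,1) generate \mathbb Z^2 since the determinant is 1, and unimodularity is visible in the integrality of the inverse:

```python
import sympy as sp

M = sp.Matrix([[2, 1], [1, 1]])
assert M.det() in (1, -1)
# det M = +-1 means M^{-1} is an integer matrix, so the standard basis
# vectors are integer combinations of the rows of M:
assert all(entry.is_integer for entry in M.inv())
```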

Commuting Matrices

This post arose from an attempt to solve a question in this past Waterloo pure mathematics PhD comprehensive exam.

Let k be an algebraically closed field. Let \mathcal M_n(k) denote the set of all n\times n matrices with entries in k. Let

\displaystyle J_\lambda:=\left(\begin{matrix}\lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1\\ & & & \lambda\end{matrix}\right)\in\mathcal M_n(k)

be a Jordan block. Let e_1,\dots,e_n be the standard basis vectors, i.e. the j-th component of e_i is \delta_{ij}. Note the action of J_0 on the basis vectors: J_0e_i=e_{i-1} for each i (where we take e_0=0).

Suppose P\in\mathcal M_n(k) commutes with J_\lambda. Then

Pe_{i-1}=PJ_0e_i=P(J_\lambda-\lambda I)e_i=(J_\lambda-\lambda I)Pe_i=J_0Pe_i

i.e. if P_i:=Pe_i is the i-th column of P, then P_{i-1}=J_0P_i for each i. Thus

P=(P_1\mid\cdots\mid P_n)=(J_0^{n-1}P_n\mid J_0^{n-2}P_n\mid\cdots\mid P_n)

i.e. if P_n=(a_1,\dots,a_n)^T, then P=a_1J_0^{n-1}+\cdots+a_nI, a polynomial in J_0. Further, since J_0=J_\lambda-\lambda I, it follows that P is a polynomial in J_\lambda. So we deduce:

Fact 1. P commutes with J_\lambda iff P is a polynomial in J_0.

Fact 2. P commutes with J_\lambda iff P is a polynomial in J_\lambda.

(The ‘if’ directions are immediate: any polynomial in J_0=J_\lambda-\lambda I commutes with J_\lambda.)

If we denote

\mathcal C(A):=\{B\in\mathcal M_n(k): AB=BA\}

for A\in\mathcal M_n(k), then we’ve just shown

\begin{aligned}\mathcal C(J_\lambda)&=\{f(J_\lambda):f\in k[X]\}\\ &=\{f(J_0):f\in k[X]\}\\ &=\mathcal C(J_0)\end{aligned}

Now let A\in\mathcal M_n(k) have minimal and characteristic polynomial (X-\lambda)^n. This means the Jordan normal form of A is J_\lambda. So there exists an invertible matrix M such that A=MJ_\lambda M^{-1}. Thus

\begin{aligned}\mathcal C(A)&=\{P\in\mathcal M_n(k): PA=AP\}\\ &=\{P\in\mathcal M_n(k):PMJ_\lambda M^{-1}=MJ_\lambda M^{-1}P\}\\ &=\{P\in\mathcal M_n(k): M^{-1}PMJ_\lambda=J_\lambda M^{-1}PM\}\\ &=\{P\in\mathcal M_n(k): M^{-1}PM=f(J_\lambda)\text{ for some }f\in k[X]\}\\ &=\{P\in\mathcal M_n(k): P=Mf(J_\lambda)M^{-1}\text{ for some }f\in k[X]\}\\ &=\{P\in\mathcal M_n(k): P=f(MJ_\lambda M^{-1})\text{ for some }f\in k[X]\}\\ &=\{P\in\mathcal M_n(k): P=f(A)\text{ for some }f\in k[X]\}\\ &=\{f(A): f\in k[X]\}\end{aligned}

Thus

Fact 3. P commutes with A iff P is a polynomial in A.
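
A symbolic check of Fact 1 for n=3: solving PJ_0=J_0P entry by entry forces P to be upper-triangular Toeplitz, i.e. a polynomial in J_0 (a sketch in sympy; the symbol names are mine):

```python
import sympy as sp

J = sp.Matrix([[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]])               # the nilpotent Jordan block J_0
p = sp.symbols('p0:9')
P = sp.Matrix(3, 3, p)
sol = sp.solve(list(P * J - J * P), p)   # nine linear equations in the entries
Pc = P.subs(sol)                         # the general matrix commuting with J_0
assert Pc == Pc[0, 0] * sp.eye(3) + Pc[0, 1] * J + Pc[0, 2] * J**2
```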

Möbius Transformations and Cross-Ratios

A Möbius transformation is a map f:\mathbb C_\infty\to\mathbb C_\infty of the form

\displaystyle f(z)=\frac{az+b}{cz+d},\quad a,b,c,d\in\mathbb C,\quad ad-bc\neq 0

where \mathbb C_\infty:=\mathbb C\cup\{\infty\} is the extended complex plane and the ‘point at infinity’ \infty is defined so that

  1. if c\neq 0, then f(\infty)=a/c and f(-d/c)=\infty;
  2. if c=0, then f(\infty)=\infty.

The following video gives a very illuminating illustration. The sphere in the video is called the Riemann sphere, which in a sense ‘wraps up’ the extended complex plane into a sphere. Each point on the sphere corresponds to a unique point on the plane (i.e. there is a bijection between points on the extended plane and points on the sphere), with the ‘light source’ being the point at infinity. This bijective correspondence is the main reason for including the point at infinity.

According to the video any Möbius transformation can be generated by the four basic ones: translations, dilations, rotations and inversions:

  1. Translation: f(z)=z+b, b\in\mathbb C
  2. Dilation: f(z)=az, a>0
  3. Rotation: f(z)=e^{i\theta}z, \theta\in[0,2\pi)
  4. Inversion: f(z)=1/z

Exercise. Show that any Möbius transformation is some composition of these operations.

The Möbius transformations in fact form a group \mathcal M under composition which acts on \mathbb C_\infty. Moreover, we have a surjective homomorphism

\displaystyle \hbox{GL}_2(\mathbb C)\to\mathcal M,\quad\left(\begin{matrix} a & b\\ c & d\end{matrix}\right)\mapsto\left(z\mapsto\frac{az+b}{cz+d}\right).

Möbius transformations exhibit very interesting properties, some of which are:

Proposition 1. Given distinct z_1,z_2,z_3\in\mathbb C_\infty, there is a unique Möbius map f\in\mathcal M such that

\displaystyle f:\left(\begin{matrix} z_1\\ z_2\\z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)

Proof. It is not difficult to work out that f must be given by

\displaystyle f(z)=\frac{z-z_1}{z-z_3}\cdot\frac{z_2-z_3}{z_2-z_1},

and uniqueness holds because a Möbius map fixing 0, 1 and \infty is the identity. \square
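
The formula is easy to test numerically (a sketch; the sample points are arbitrary and all finite):

```python
def to_zero_one_inf(z, z1, z2, z3):
    """The Mobius map of Proposition 1, for finite z1, z2, z3."""
    return (z - z1) / (z - z3) * (z2 - z3) / (z2 - z1)

z1, z2, z3 = 2 + 1j, -1 + 0j, 3 - 2j
assert abs(to_zero_one_inf(z1, z1, z2, z3)) < 1e-12      # z1 -> 0
assert abs(to_zero_one_inf(z2, z1, z2, z3) - 1) < 1e-12  # z2 -> 1
# z3 -> infinity: the denominator z - z3 vanishes there.
```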

Proposition 2. The action of \mathcal M on \mathbb C_\infty is sharply triply transitive: if z_1,z_2,z_3\in\mathbb C_\infty are distinct and w_1,w_2,w_3\in\mathbb C_\infty are distinct, then there exists a unique f\in\mathcal M such that f(z_i)=w_i for i=1,2,3.

Proof. By proposition 1, there is a unique g\in\mathcal M such that

\displaystyle g:\left(\begin{matrix} z_1\\ z_2\\ z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)

and a unique h\in\mathcal M such that

\displaystyle h:\left(\begin{matrix} w_1\\ w_2\\ w_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right).

Then f=h^{-1}\circ g is the unique map satisfying the required property. \square 

The cross-ratio [z_1,z_2,z_3,z_4] of four distinct points z_1,z_2,z_3,z_4\in\mathbb C_\infty is defined to be the unique \lambda\in\mathbb C_\infty such that if f\in\mathcal M is the unique map satisfying

\displaystyle f:\left(\begin{matrix} z_1\\ z_2\\z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)

then f(z_4)=\lambda, i.e.

\displaystyle [z_1,z_2,z_3,z_4]=\frac{z_1-z_4}{z_3-z_4}\frac{z_2-z_3}{z_2-z_1}.      (*)

One nice thing about cross-ratios is that they are preserved by Möbius transformations.

Proposition 3. If f\in\mathcal M, then [z_1,z_2,z_3,z_4]=[f(z_1),f(z_2),f(z_3),f(z_4)].

Proof. Let g\in\mathcal M such that

\displaystyle g:\left(\begin{matrix} z_1\\ z_2\\ z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right).

Then [z_1,z_2,z_3,z_4]=g(z_4). Likewise, if h\in\mathcal M satisfies

\displaystyle h:\left(\begin{matrix} f(z_1)\\ f(z_2)\\ f(z_3)\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)

then h\circ f=g by proposition 1, so [f(z_1),f(z_2),f(z_3),f(z_4)]=h(f(z_4)) =g(z_4)=[z_1,z_2,z_3,z_4]. \square
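
A numerical check of the invariance (a sketch; the Möbius map and the four points are arbitrary choices):

```python
def cross_ratio(z1, z2, z3, z4):
    return (z1 - z4) / (z3 - z4) * (z2 - z3) / (z2 - z1)

def f(z, a=2 + 1j, b=0j, c=1 + 0j, d=3j):
    """An arbitrary Mobius map; ad - bc = -3 + 6j != 0."""
    return (a * z + b) / (c * z + d)

zs = [1 + 2j, -2 + 1j, 0.5 - 1j, 3 + 0j]
assert abs(cross_ratio(*zs) - cross_ratio(*(f(z) for z in zs))) < 1e-12
```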


From (*) we observe that some permutations of 1,2,3,4 leave the value of the cross-ratio [z_1,z_2,z_3,z_4] invariant, e.g. (1\ 3)(2\ 4)\in S_4 is one such. What about the others?

Let S_4 act on the indices of the cross-ratio [z_1,z_2,z_3,z_4]. The permutations \sigma\in S_4 that fix [z_1,z_2,z_3,z_4] form the stabiliser of the cross-ratio under this action. By transitivity and invariance (Propositions 2 and 3), the orbit of [z_1,z_2,z_3,z_4] consists of the values obtained by assigning 0,1,\infty to z_1,z_2,z_3 in the six possible ways; i.e. the distinct cross-ratios that we get by permuting the indices are just

[z_1,z_2,z_3,z_4], [z_1,z_3,z_2,z_4], [z_2,z_1,z_3,z_4],

[z_2,z_3,z_1,z_4], [z_3,z_1,z_2,z_4], [z_3,z_2,z_1,z_4].

Let f\in\mathcal M such that

\displaystyle f:\left(\begin{matrix} z_1\\ z_2\\z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right).

Then writing \lambda=f(z_4)=[z_1,z_2,z_3,z_4] shows that the values of the above cross-ratios are (not necessarily in this order—too lazy to work out the precise order)

\displaystyle \lambda,\frac 1\lambda, 1-\lambda,\frac{1}{1-\lambda},\frac{\lambda}{\lambda-1},\frac{\lambda-1}{\lambda}

and, viewed as functions of \lambda, they form the subgroup of \mathcal M that fixes the set \{0,1,\infty\}. This group is isomorphic to S_3.
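
One can confirm numerically that these six maps are closed under composition, hence form a group (a sketch; the test point is arbitrary):

```python
maps = [
    lambda t: t,
    lambda t: 1 / t,
    lambda t: 1 - t,
    lambda t: 1 / (1 - t),
    lambda t: t / (t - 1),
    lambda t: (t - 1) / t,
]
lam = 0.3 + 0.7j                          # a generic test point
values = [m(lam) for m in maps]
for f in maps:
    for g in maps:
        composed = f(g(lam))              # (f o g)(lam) is again one of the six
        assert any(abs(composed - v) < 1e-9 for v in values)
```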

So, by the orbit-stabiliser theorem, there are in fact 24/6=4 permutations \sigma\in S_4 such that

[z_1,z_2,z_3,z_4]=[z_{\sigma(1)},z_{\sigma(2)},z_{\sigma(3)},z_{\sigma(4)}]

and they form a subgroup of S_4 that is isomorphic to the Klein four-group

V_4=\{e, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}.
