# Monthly Archives: December 2014

## Commuting matrices

This post arose from an attempt to solve a question in this past Waterloo pure mathematics PhD comprehensive exam.

Let $k$ be an algebraically closed field. Let $\mathcal M_n(k)$ denote the set of all $n\times n$ matrices with entries in $k$. Let

$\displaystyle J_\lambda:=\left(\begin{matrix}\lambda & 1 & \cdots & 0\\ 0 & \lambda & \cdots & 0\\\vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \lambda\end{matrix}\right)\in\mathcal M_n(k)$

be a Jordan block. Let $e_1,\dots,e_n$ be the standard basis vectors, i.e. the $j$-th component of $e_i$ is $\delta_{ij}$. Note the action of $J_0$ on the basis vectors: $J_0e_i=e_{i-1}$ for each $i$ (where we take $e_0=0$).

Suppose $P\in\mathcal M_n(k)$ commutes with $J_\lambda$. Then

$Pe_{i-1}=PJ_0e_i=P(J_\lambda-\lambda I)e_i=(J_\lambda-\lambda I)Pe_i=J_0Pe_i$

i.e. if $P_i:=Pe_i$ is the $i$-th column of $P$, then $P_{i-1}=J_0P_i$ for each $i$. Thus

$P=(P_1\mid\cdots\mid P_n)=(J_0^{n-1}P_n\mid\cdots\mid P_n)=(J_0^{n-1}\mid\cdots\mid I)P_n$

i.e. if $P_n=(a_1,\dots,a_n)^T$, then $P=a_1J_0^{n-1}+\cdots+a_nI$, a polynomial in $J_0$. Further, since $J_0=J_\lambda-\lambda I$, it follows that $P$ is a polynomial in $J_\lambda$. Conversely, every polynomial in $J_\lambda$ (equivalently, in $J_0$) commutes with $J_\lambda$. So we deduce:

Fact 1. $P$ commutes with $J_\lambda$ iff $P$ is a polynomial in $J_0$.

Fact 2. $P$ commutes with $J_\lambda$ iff $P$ is a polynomial in $J_\lambda$.
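The column recursion $P_{i-1}=J_0P_i$ can be tried out on a small example. The following is only a sketch with integer entries; the helper names and the sample column `a` are made up for illustration. It rebuilds $P$ from its last column, compares it with $a_1J_0^{n-1}+\cdots+a_nI$, and checks that it commutes with a Jordan block.

```python
def jordan_block(lam, n):
    """n x n Jordan block: lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def from_last_column(a):
    """Rebuild P from its last column a using P_{i-1} = J_0 P_i."""
    n = len(a)
    cols = [list(a)]
    for _ in range(n - 1):
        # J_0 shifts the entries of a column up by one and appends a zero
        cols.insert(0, cols[0][1:] + [0])
    # cols[j] is the j-th column; transpose to get the rows of P
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def polynomial_in_J0(a):
    """a_1 J_0^{n-1} + ... + a_n I for a = (a_1, ..., a_n)."""
    n = len(a)
    J0 = jordan_block(0, n)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # J_0^0 = I
    total = [[0] * n for _ in range(n)]
    for coeff in reversed(a):  # a_n goes with I, a_{n-1} with J_0, ...
        total = [[total[i][j] + coeff * power[i][j] for j in range(n)]
                 for i in range(n)]
        power = matmul(power, J0)
    return total

a = [2, -1, 5, 3]               # sample last column (a_1, ..., a_n)
P = from_last_column(a)
J = jordan_block(7, len(a))     # sample eigenvalue 7

print(P == polynomial_in_J0(a))          # the two constructions agree
print(matmul(P, J) == matmul(J, P))      # P commutes with J_lambda
```

Here $P$ comes out upper triangular and constant along each diagonal, which is exactly what a polynomial in $J_0$ looks like.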

If we denote

$\mathcal C(A):=\{B\in\mathcal M_n(k): AB=BA\}$

for $A\in\mathcal M_n(k)$, then we’ve just shown

\begin{aligned}\mathcal C(J_\lambda)&=\{f(J_\lambda):f\in k[X]\}\\ &=\{f(J_0):f\in k[X]\}\\ &=\mathcal C(J_0)\end{aligned}

Now let $A\in\mathcal M_n(k)$ have minimal and characteristic polynomial $(X-\lambda)^n$. This means the Jordan normal form of $A$ is $J_\lambda$. So there exists an invertible matrix $M$ such that $A=MJ_\lambda M^{-1}$. Thus

\begin{aligned}\mathcal C(A)&=\{P\in\mathcal M_n(k): PA=AP\}\\ &=\{P\in\mathcal M_n(k):PMJ_\lambda M^{-1}=MJ_\lambda M^{-1}P\}\\ &=\{P\in\mathcal M_n(k): M^{-1}PMJ_\lambda=J_\lambda M^{-1}PM\}\\ &=\{P\in\mathcal M_n(k): M^{-1}PM=f(J_\lambda)\text{ for some }f\in k[X]\}\\ &=\{P\in\mathcal M_n(k): P=Mf(J_\lambda)M^{-1}\text{ for some }f\in k[X]\}\\ &=\{P\in\mathcal M_n(k): P=f(MJ_\lambda M^{-1})\text{ for some }f\in k[X]\}\\ &=\{P\in\mathcal M_n(k): P=f(A)\text{ for some }f\in k[X]\}\\ &=\{f(A): f\in k[X]\}\end{aligned}

Thus

Fact 3. $P$ commutes with $A$ iff $P$ is a polynomial in $A$.
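Fact 3 can be sanity-checked by brute force in a tiny case. The sketch below is an illustration, not part of the argument: it works with $3\times 3$ matrices over $\mathbb F_3$ and takes $A$ to be the companion matrix of $X^3-1=(X-1)^3$. The field $\mathbb F_3$ is not algebraically closed, but the only eigenvalue of this $A$ already lies in it, so the reasoning above applies. The choice of `p`, `n` and `A` is an assumption made to keep the search small.

```python
from itertools import product

p, n = 3, 3  # illustrative choice: 3x3 matrices over the field F_3

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n)) % p
                       for j in range(n)) for i in range(n))

def lincomb(coeffs, mats):
    """Linear combination of matrices with coefficients reduced mod p."""
    return tuple(tuple(sum(c * M[i][j] for c, M in zip(coeffs, mats)) % p
                       for j in range(n)) for i in range(n))

I = tuple(tuple(int(i == j) for j in range(n)) for i in range(n))
ZERO = tuple(tuple(0 for _ in range(n)) for _ in range(n))

# Companion matrix of X^3 - 1, which equals (X - 1)^3 in characteristic 3.
A = ((0, 0, 1), (1, 0, 0), (0, 1, 0))
A2 = matmul(A, A)

# Minimal polynomial check: (A - I)^2 != 0 but (A - I)^3 == 0.
N = lincomb((1, -1), (A, I))
N2 = matmul(N, N)
N3 = matmul(N2, N)

# All matrices commuting with A, found by exhaustive search.
commutant = set()
for entries in product(range(p), repeat=n * n):
    P = tuple(tuple(entries[i * n + j] for j in range(n)) for i in range(n))
    if matmul(P, A) == matmul(A, P):
        commutant.add(P)

# All polynomials c0*I + c1*A + c2*A^2 (higher powers reduce to these).
polys = {lincomb(c, (I, A, A2)) for c in product(range(p), repeat=n)}

print(len(commutant), commutant == polys)
```

In this case the commutant consists of the circulant matrices, which are precisely the polynomials in the cyclic shift $A$.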


## Permuting the units mod n

Let $n>2$ be an integer and $a\in(\mathbb Z/n\mathbb Z)^\times$. We have a permutation map on $(\mathbb Z/n\mathbb Z)^\times$ given by $\pi(a):x\mapsto ax$. Then it is easy to see that

$\pi:(\mathbb Z/n\mathbb Z)^\times\to S_{\phi(n)}$

is an injective homomorphism, where $\phi$ is Euler’s totient function. This is the key idea in the proof of Cayley’s theorem, but we are going to do something a bit different.

Let’s look at $\pi(a)$ in the disjoint cycle notation. One of the cycles is $(1\ a\ a^2 \cdots a^{k-1})$, where $k$ is the order of $a$ modulo $n$. What about the others? In fact, every cycle is of the form $(b\ ba\ ba^2\cdots ba^{k-1})$ for some $b\in(\mathbb Z/n\mathbb Z)^\times$. So every cycle has length $k=\hbox{ord}_n(a)$. And the number of cycles is $\phi(n)/k$.

Let’s look at the sign $\varepsilon(\pi(a))$ of $\pi(a)$. There are $\phi(n)/k$ cycles each of length $k$, and each cycle of length $k$ can be written as a product of $k-1$ transpositions, so

$\varepsilon(\pi(a))=\displaystyle (-1)^{(k-1)\phi(n)/k}=(-1)^{\phi(n)/k}\quad(*)$

where the last equality is obtained by looking at the cases when $k$ is even/odd (and using the fact that $\phi(n)$ is even).

Remark. $(*)$ is saying that the number of transpositions in $\pi(a)$ has the same parity as the number of disjoint cycles in it.
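The cycle structure and the formula $(*)$ are easy to test by machine for small $n$. This is a minimal sketch; the function names are made up for illustration. It computes the cycles of $x\mapsto ax$ directly, gets the sign from the cycle lengths, and compares it with $(-1)^{\phi(n)/k}$.

```python
from math import gcd

def units(n):
    return [x for x in range(1, n) if gcd(x, n) == 1]

def order(n, a):
    """Multiplicative order of a modulo n (a must be a unit)."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

def cycle_lengths(n, a):
    """Lengths of the disjoint cycles of x -> a*x on the units mod n."""
    seen, lengths = set(), []
    for b in units(n):
        if b in seen:
            continue
        length, x = 0, b
        while x not in seen:     # walk the cycle (b, ba, ba^2, ...)
            seen.add(x)
            x = a * x % n
            length += 1
        lengths.append(length)
    return lengths

def sign_from_cycles(n, a):
    # a cycle of length L is a product of L - 1 transpositions
    return (-1) ** sum(L - 1 for L in cycle_lengths(n, a))

def sign_from_formula(n, a):
    phi = len(units(n))
    return (-1) ** (phi // order(n, a))

print(cycle_lengths(7, 2), sign_from_cycles(7, 2))   # [3, 3] 1
print(cycle_lengths(7, 3), sign_from_cycles(7, 3))   # [6] -1
```

For example, modulo $7$ the element $2$ has order $3$ and gives two $3$-cycles (an even permutation), while the primitive root $3$ gives a single $6$-cycle (an odd permutation).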

Hence $f=\varepsilon\circ\pi$, being a composition of homomorphisms, is a homomorphism from $(\mathbb Z/n\mathbb Z)^\times$ to $\{\pm 1\}$. It might be interesting to know for which $n$ we get all of $\{\pm 1\}$ as $\hbox{im}(f)$, and for which $n$ we just get the trivial group $\{1\}$.

Fact.  $\hbox{im}(f)=\{\pm 1\}$ if and only if $\phi(n)/\hbox{ord}_n(a)$ is odd for some $a\in(\mathbb Z/n\mathbb Z)^\times$.

Corollary. If there exists $a\in(\mathbb Z/n\mathbb Z)^\times$ such that $v_2(\hbox{ord}_n(a))=v_2(\phi(n))$, then there are exactly $\phi(n)/2$ such $a$'s. ($v_2(m)$ is the largest integer $j$ such that $2^j\mid m$.)

Proposition. If $n$ is a power of two other than $2^2=4$, then $\hbox{im}(f)=\{1\}$.

Proof. We want to show that $\phi(n)/\hbox{ord}_n(a)$ is even for each $a\in(\mathbb Z/n\mathbb Z)^\times$. If $n=2^k$ then the integers coprime to $n$ are precisely the odd integers, so $\phi(n)=2^{k-1}$. Since $\hbox{ord}_n(a)$ divides $\phi(n)$, we have $\hbox{ord}_n(a)=2^j$ for some $j\le k-1$. If $j<k-1$ there is nothing to prove. Otherwise, suppose that $j=k-1$. This means $a$ is a primitive root modulo $n$. A well-known theorem says that primitive roots exist only for $2,4,p^k,2p^k$ for $p$ an odd prime. Hence $n$ must be $2$ or $4$, contrary to assumption. $\square$

On the other hand:

Proposition. If there is a primitive root modulo $n$, then $\hbox{im}(f)=\{\pm 1\}$.

Proof. Let $\phi(n)=2^qm$, where $m$ is odd. Let $a$ be a primitive root modulo $n$. Then $\hbox{ord}_n(a^m)=2^q$, so $\phi(n)/\hbox{ord}_n(a^m)=m$ is odd, and so $f(a^m)=-1$. $\square$

So both cases in fact occur infinitely often.
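The two propositions and the corollary can be explored numerically. The sketch below uses illustrative helper names and an arbitrary range $3\le n\le 60$. It computes $\hbox{im}(f)$ from $f(a)=(-1)^{\phi(n)/\hbox{ord}_n(a)}$, sorts the moduli into the two cases, and counts the $a$ with $f(a)=-1$.

```python
from math import gcd

def units(n):
    return [x for x in range(1, n) if gcd(x, n) == 1]

def order(n, a):
    """Multiplicative order of a modulo n (a must be a unit)."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

def f(n, a):
    """Sign of x -> a*x on the units mod n, via formula (*)."""
    return (-1) ** (len(units(n)) // order(n, a))

def image_of_f(n):
    return {f(n, a) for a in units(n)}

def has_primitive_root(n):
    phi = len(units(n))
    return any(order(n, a) == phi for a in units(n))

def minus_one_count(n):
    """Number of units a with f(a) = -1."""
    return sum(1 for a in units(n) if f(n, a) == -1)

LIMIT = 61  # arbitrary search bound for this experiment
full = [n for n in range(3, LIMIT) if image_of_f(n) == {1, -1}]
trivial = [n for n in range(3, LIMIT) if image_of_f(n) == {1}]

print("full image:   ", full)
print("trivial image:", trivial)
```

In this range the moduli with $\hbox{im}(f)=\{\pm 1\}$ turn out to be exactly those that have a primitive root, and for each of them exactly half of the units map to $-1$, as the corollary predicts. This is only an observation about the range searched, not a proof.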

I don’t know whether it might be possible to completely classify the integers based on $\hbox{im}(f)$, nor do I know what any of this actually means, but perhaps it is something worth pondering.


## Möbius transformations and cross-ratios

A Möbius transformation is a map $f:\mathbb C_\infty\to\mathbb C_\infty$ of the form

$\displaystyle f(z)=\frac{az+b}{cz+d},\quad a,b,c,d\in\mathbb C,\quad ad-bc\neq 0$

where $\mathbb C_\infty:=\mathbb C\cup\{\infty\}$ is the extended complex plane and the ‘point at infinity’ $\infty$ is defined so that

1. if $c\neq 0$ then $f(\infty)=a/c$ and $f(-d/c)=\infty$;
2. if $c=0$ then $f(\infty)=\infty$.

The following video gives a very illuminating illustration. The sphere in the video is called the Riemann sphere, which in a sense ‘wraps up’ the extended complex plane into a sphere. Each point on the sphere corresponds to a unique point on the plane (i.e. there is a bijection between points on the extended plane and points on the sphere), with the ‘light source’ being the point at infinity. This bijective correspondence is the main reason for including the point at infinity.

According to the video any Möbius transformation can be generated by the four basic ones: translations, dilations, rotations and inversions:

1. Translation: $f(z)=z+b$, $b\in\mathbb C$
2. Dilation: $f(z)=az$, $a\in\mathbb R$, $a>0$
3. Rotation: $f(z)=e^{i\theta}z$, $\theta\in[0,2\pi)$
4. Inversion: $f(z)=1/z$

Exercise. Show that any Möbius transformation is a composition of these four operations.

The Möbius transformations in fact form a group $\mathcal M$ under composition which acts on $\mathbb C_\infty$. Moreover, we have a surjective homomorphism

$\displaystyle\begin{matrix}\hbox{GL}_2(\mathbb C)\to\mathcal M,\quad\left(\begin{matrix} a & b\\ c & d\end{matrix}\right)\mapsto\displaystyle\frac{az+b}{cz+d}\end{matrix}$.
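The homomorphism property says that multiplying matrices corresponds to composing maps. Here is a small sketch of that, using exact rational arithmetic instead of complex numbers and a string sentinel `INF` for the point at infinity; both are choices made for this illustration only.

```python
from fractions import Fraction

INF = "inf"  # sentinel for the point at infinity (an assumption of this sketch)

def mobius(m, z):
    """Apply z -> (a z + b)/(c z + d) for m = ((a, b), (c, d)), ad - bc != 0."""
    (a, b), (c, d) = m
    if z == INF:
        # f(inf) = inf if c = 0, and a/c otherwise
        return INF if c == 0 else Fraction(a) / c
    num, den = a * z + b, c * z + d
    # den = 0 happens exactly at z = -d/c, which is sent to infinity
    return INF if den == 0 else Fraction(num) / den

def matmul2(m1, m2):
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

A = ((1, 2), (3, 4))   # z -> (z + 2)/(3z + 4)
B = ((0, 1), (1, 1))   # z -> 1/(z + 1)
AB = matmul2(A, B)

for z in [0, 1, -1, Fraction(-7, 4), Fraction(5, 3), INF]:
    print(z, mobius(AB, z), mobius(A, mobius(B, z)))
```

The sample points include $\infty$, the pole $-1$ of the second map and the pole $-7/4$ of the composite, so the special cases in the definition are exercised too.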

Möbius transformations exhibit very interesting properties, some of which are:

Proposition 1. Given distinct $z_1,z_2,z_3\in\mathbb C_\infty$, there is a unique Möbius map $f\in\mathcal M$ such that

$\displaystyle f:\left(\begin{matrix} z_1\\ z_2\\z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$

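Proof. When $z_1,z_2,z_3$ are all finite, it is not difficult to work out that the unique $f$ is given by

$\displaystyle f(z)=\frac{z-z_1}{z-z_3}\frac{z_2-z_3}{z_2-z_1}$.

If one of the $z_i$ is $\infty$, drop the two factors containing it (i.e. take the limit $z_i\to\infty$). $\square$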

Proposition 2. The action of $\mathcal M$ on $\mathbb C_\infty$ is sharply triply transitive: if $z_1,z_2,z_3\in\mathbb C_\infty$ are distinct and $w_1,w_2,w_3\in\mathbb C_\infty$ are distinct, then there exists a unique $f\in\mathcal M$ such that $f(z_i)=w_i$ for $i=1,2,3$.

Proof. By proposition 1, there is a unique $g\in\mathcal M$ such that

$\displaystyle g:\left(\begin{matrix} z_1\\ z_2\\ z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$

and a unique $h\in\mathcal M$ such that

$\displaystyle h:\left(\begin{matrix} w_1\\ w_2\\ w_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$.

Then $f=h^{-1}\circ g$ is the unique map satisfying the required property. $\square$
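The construction $f=h^{-1}\circ g$ can be carried out with $2\times 2$ matrices. The sketch below assumes all six points are finite rationals, represents $\infty$ by a string sentinel, and inverts a matrix only up to a scalar (via the adjugate), which is enough because scalar multiples define the same map. All names and sample points are illustrative.

```python
from fractions import Fraction

INF = "inf"  # sentinel for the point at infinity (an assumption of this sketch)

def mobius(m, z):
    """Apply z -> (a z + b)/(c z + d) for m = ((a, b), (c, d))."""
    (a, b), (c, d) = m
    if z == INF:
        return INF if c == 0 else Fraction(a) / c
    num, den = a * z + b, c * z + d
    return INF if den == 0 else Fraction(num) / den

def matmul2(m1, m2):
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def adjugate(m):
    """Inverse of m up to the scalar factor 1/det."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))

def to_zero_one_inf(z1, z2, z3):
    """Matrix of the map from Proposition 1 (three distinct finite points)."""
    return ((z2 - z3, -z1 * (z2 - z3)), (z2 - z1, -z3 * (z2 - z1)))

def triple_to_triple(zs, ws):
    """Matrix of f = h^{-1} o g sending zs[i] to ws[i]."""
    g = to_zero_one_inf(*zs)
    h = to_zero_one_inf(*ws)
    return matmul2(adjugate(h), g)

zs, ws = (2, 5, -1), (1, 4, -2)   # sample triples
F = triple_to_triple(zs, ws)
print([mobius(F, z) for z in zs])  # should reproduce ws
```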

The cross-ratio $[z_1,z_2,z_3,z_4]$ of four distinct points $z_1,z_2,z_3,z_4\in\mathbb C_\infty$ is defined to be the unique $\lambda\in\mathbb C_\infty$ such that if $f\in\mathcal M$ is the unique map satisfying

$\displaystyle f:\left(\begin{matrix} z_1\\ z_2\\z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$

then $f(z_4)=\lambda$, i.e.

$\displaystyle [z_1,z_2,z_3,z_4]=\frac{z_1-z_4}{z_3-z_4}\frac{z_2-z_3}{z_2-z_1}$.      $(*)$

One nice thing about cross-ratios is that they are preserved by Möbius transformations.

Proposition 3. If $f\in\mathcal M$, then $[z_1,z_2,z_3,z_4]=[f(z_1),f(z_2),f(z_3),f(z_4)]$.

Proof. Let $g\in\mathcal M$ such that

$\displaystyle g:\left(\begin{matrix} z_1\\ z_2\\ z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$.

Then $[z_1,z_2,z_3,z_4]=g(z_4)$. Likewise, if $h\in\mathcal M$ satisfies

$\displaystyle h:\left(\begin{matrix} f(z_1)\\ f(z_2)\\ f(z_3)\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$

then $h\circ f$ sends $z_1,z_2,z_3$ to $0,1,\infty$, so $h\circ f=g$ by the uniqueness in Proposition 1. Hence $[f(z_1),f(z_2),f(z_3),f(z_4)]=h(f(z_4))=g(z_4)=[z_1,z_2,z_3,z_4]$. $\square$
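Proposition 3 is easy to test on an example using formula $(*)$. This is a minimal sketch with exact rationals; the sample map $z\mapsto(2z+1)/(z+3)$ and the sample points are arbitrary choices, picked so that no point is sent to $\infty$.

```python
from fractions import Fraction

def cross_ratio(z1, z2, z3, z4):
    """Formula (*) for four distinct finite points."""
    return Fraction((z1 - z4) * (z2 - z3), (z3 - z4) * (z2 - z1))

def f(z):
    """Sample Mobius map z -> (2z + 1)/(z + 3), with ad - bc = 5."""
    return Fraction(2 * z + 1, z + 3)

points = [Fraction(0), Fraction(1), Fraction(2), Fraction(5)]
before = cross_ratio(*points)
after = cross_ratio(*[f(z) for z in points])
print(before, after)   # -5/3 -5/3
```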

From $(*)$ we observe that some permutations of $1,2,3,4$ leave the value of the cross-ratio $[z_1,z_2,z_3,z_4]$ unaltered, e.g. $(1\ 3)(2\ 4)\in S_4$ is one such. What about the others?

Let $S_4$ act on the indices of the cross-ratio $[z_1,z_2,z_3,z_4]$. The permutations $\sigma\in S_4$ that fix $[z_1,z_2,z_3,z_4]$ form the stabiliser subgroup of $S_4$ of this action. Using transitivity and invariance (propositions 2 and 3), the orbit of $[z_1,z_2,z_3,z_4]$ is just the different assignments of the values $0,1,\infty$ to $z_1,z_2,z_3$; i.e. the distinct cross-ratios that we get by permuting the indices are just

$[z_1,z_2,z_3,z_4]$, $[z_1,z_3,z_2,z_4]$, $[z_2,z_1,z_3,z_4]$,

$[z_2,z_3,z_1,z_4]$, $[z_3,z_1,z_2,z_4]$, $[z_3,z_2,z_1,z_4]$.

Let $f\in\mathcal M$ such that

$\displaystyle f:\left(\begin{matrix} z_1\\ z_2\\z_3\end{matrix}\right)\mapsto\left(\begin{matrix} 0\\ 1\\ \infty\end{matrix}\right)$.

Then writing $\lambda=f(z_4)=[z_1,z_2,z_3,z_4]$ shows that the values of the above cross-ratios are (not necessarily in this order—too lazy to work out the precise order)

$\displaystyle \lambda,\frac 1\lambda, 1-\lambda,\frac{1}{1-\lambda},\frac{\lambda}{\lambda-1},\frac{\lambda-1}{\lambda}$

and they (as functions of $\lambda$) form the subgroup of $\mathcal M$ that fixes the set $\{0,1,\infty\}$. This group is isomorphic to $S_3$.
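For the record, the matching between the six cross-ratios and the six expressions in $\lambda$ can be checked on sample points. The pairing in the dictionary below is my own working, not taken from elsewhere, and the check is a numerical one on one set of points rather than a proof.

```python
from fractions import Fraction

def cross_ratio(z1, z2, z3, z4):
    """Formula (*) for four distinct finite points."""
    return Fraction((z1 - z4) * (z2 - z3), (z3 - z4) * (z2 - z1))

z1, z2, z3, z4 = Fraction(0), Fraction(1), Fraction(2), Fraction(5)  # sample points
lam = cross_ratio(z1, z2, z3, z4)

# (computed cross-ratio, claimed expression in lambda), keyed by index order
pairing = {
    "1234": (cross_ratio(z1, z2, z3, z4), lam),
    "1324": (cross_ratio(z1, z3, z2, z4), lam / (lam - 1)),
    "2134": (cross_ratio(z2, z1, z3, z4), 1 - lam),
    "2314": (cross_ratio(z2, z3, z1, z4), (lam - 1) / lam),
    "3124": (cross_ratio(z3, z1, z2, z4), 1 / (1 - lam)),
    "3214": (cross_ratio(z3, z2, z1, z4), 1 / lam),
}

for order, (value, expected) in pairing.items():
    print(order, value, value == expected)
```

So, in the order listed above, the six cross-ratios are $\lambda$, $\frac{\lambda}{\lambda-1}$, $1-\lambda$, $\frac{\lambda-1}{\lambda}$, $\frac{1}{1-\lambda}$, $\frac 1\lambda$, at least for these sample points.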

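Since the orbit has $6$ elements (for a generic value of $\lambda$) and $|S_4|=24$, the orbit-stabiliser theorem says the stabiliser has order $24/6=4$. So there are in fact four permutations $\sigma\in S_4$ such that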

$[z_1,z_2,z_3,z_4]=[z_{\sigma(1)},z_{\sigma(2)},z_{\sigma(3)},z_{\sigma(4)}]$

and they form a subgroup of $S_4$ that is isomorphic to the Klein four-group

$V_4=\{e, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$.
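The stabiliser can also be found by running through all $24$ permutations. A minimal sketch, with sample points chosen so that the six cross-ratio values are distinct; permutations are written 0-indexed as tuples, so `(1, 0, 3, 2)` stands for $(1\ 2)(3\ 4)$.

```python
from fractions import Fraction
from itertools import permutations

def cross_ratio(z1, z2, z3, z4):
    """Formula (*) for four distinct finite points."""
    return Fraction((z1 - z4) * (z2 - z3), (z3 - z4) * (z2 - z1))

z = (Fraction(0), Fraction(1), Fraction(2), Fraction(5))  # sample points
base = cross_ratio(*z)

def permuted(s):
    """Cross-ratio with the indices permuted by s (a tuple of 0..3)."""
    return cross_ratio(*(z[i] for i in s))

stabiliser = [s for s in permutations(range(4)) if permuted(s) == base]
values = {permuted(s) for s in permutations(range(4))}

print(stabiliser)
print(len(values))
```

For these points the search returns the identity together with the three double transpositions, and six distinct values in total.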