## Fast Perfect Square Test in Python

I was reading this stack overflow post on fast ways to check if a number is a perfect square in Python. It seems that my own version from here based on binary search might be faster than the proposed solutions. (It passes this test in 0.007768 seconds on my laptop.) I just wanted to leave it here in case people find it useful.

def is_square(num):
"""Return True iff num is a perfect square."""
if num <= 0:
return num == 0
lo = 10**((len(str(num))-1)//2)
hi = lo * 10
while lo <= hi:
mid = lo + (hi-lo)//2
if mid * mid == num:
return True
if mid * mid < num:
lo = mid + 1
else:
hi = mid - 1
return False


Filed under Number Theory

## A Multi-dimensional AM-GM Inequality

We give below a re-statement and a generalisation of the final inequality from this post.

Theorem 1. For a vector $v \in \mathbb{R}^{d}$, let ${\rm AM}(v)$ (resp. ${\rm GM}(v)$) denote the arithmetic (resp. geometric) mean of the entries of $v$. Similarly, for a matrix $A \in \mathbb{R}^{m \times n}$, let ${\rm AM}(A)$ (resp. ${\rm GM}(A)$) denote the vector consisting of the arithmetic (resp. geometric) means of the columns of $A$. Let $A \in \mathbb{R}^{m \times n}$ consist of non-negative entries. Then

$\displaystyle {\rm AM}({\rm AM}(A)) \ge {\rm GM}({\rm AM}(A^{\intercal})) \ge {\rm AM}({\rm GM}(A))$

Proof. We have

\begin{aligned} {\rm AM}({\rm AM}(A)) &= \frac{1}{n} \sum_{j = 1}^{n} \frac{1}{m} \sum_{i = 1}^{m} A_{i, j} \\ (\hbox{by AM-GM}) &\ge \left(\prod_{j = 1}^{n} \frac{1}{m} \sum_{i = 1}^{m} A_{i, j}\right)^{1 / n} \\ &= {\rm GM}({\rm AM}(A^{\intercal})) \\ (\hbox{by H\"{o}lder}) &\ge \frac{1}{m} \sum_{i = 1}^{m} \left(\prod_{j = 1}^{n} A_{i, j}\right)^{1 / n} \\ &= {\rm AM}({\rm GM}(A)), \end{aligned}

as desired. $\square$

The above proof can be re-written as follows

\begin{aligned} {\rm AM}_{n}({\rm AM}_{m}(A)) &= \frac{1}{n} \sum_{j = 1}^{n} \frac{1}{m} \sum_{i = 1}^{m} A_{i, j} \\ (\hbox{by AM-GM}) &\ge \left(\prod_{j = 1}^{n} \frac{1}{m} \sum_{i = 1}^{m} A_{i, j}\right)^{1 / n} \\ &= {\rm GM}_{n}({\rm AM}_{m}(A)) \\ (\hbox{by H\"{o}lder}) &\ge \frac{1}{m} \sum_{i = 1}^{m} \left(\prod_{j = 1}^{n} A_{i, j}\right)^{1 / n} \\ &= {\rm AM}_{m}({\rm GM}_{n}(A)), \end{aligned}

where we have used ${\rm AM}_{i}$ (resp. ${\rm GM}_{i}$) to represent the arithmetic (resp. geometric) mean corresponding to the dimension $i \in \{m, n\}$. The usefulness of this notation is that it directly generalises the above theorem to higher dimensions. E.g. for a $3$-dimensional tensor $A \in \mathbb{R}^{m \times n \times p}$ with non-negative entries, we have

\begin{aligned} {\rm AM}_{p}({\rm AM}_{n}({\rm AM}_{m}(A))) &= \frac{1}{p} \sum_{k = 1}^{p} \frac{1}{n} \sum_{j = 1}^{n} \frac{1}{m} \sum_{i = 1}^{m} A_{i, j, k} \\ (\hbox{by AM-GM}) &\ge \left(\prod_{k = 1}^{p} \frac{1}{n} \sum_{j = 1}^{n} \frac{1}{m} \sum_{i = 1}^{m} A_{i, j, k}\right)^{1 / p} \\ &= {\rm GM}_{p}({\rm AM}_{n}({\rm AM}_{m}(A))) \\ (\hbox{by H\"{o}lder}) &\ge \frac{1}{n} \sum_{j = 1}^{n}\left(\prod_{k = 1}^{p} \frac{1}{m} \sum_{i = 1}^{m} A_{i, j, k}\right)^{1 / p} \\ &= {\rm AM}_{n}({\rm GM}_{p}({\rm AM}_{m}(A))) \\ (\hbox{by H\"{o}lder}) &\ge \frac{1}{n} \sum_{j = 1}^{n} \frac{1}{m} \sum_{i = 1}^{m} \left(\prod_{k = 1}^{p} A_{i, j, k}\right)^{1 / p} \\ &= {\rm AM}_{n}({\rm AM}_{m}({\rm GM}_{p}(A))) \\ (\hbox{by AM-GM}) &\ge \left(\prod_{j = 1}^{n} \frac{1}{m} \sum_{i = 1}^{m} \left(\prod_{k = 1}^{p} A_{i, j, k}\right)^{1 / p}\right)^{1 / n} \\ &= {\rm GM}_{n}({\rm AM}_{m}({\rm GM}_{p}(A))) \\ (\hbox{by H\"{o}lder}) &\ge \frac{1}{m} \sum_{i = 1}^{m} \left(\prod_{j = 1}^{n} \left(\prod_{k = 1}^{p} A_{i, j, k}\right)^{1 / p}\right)^{1 / n} \\ &= {\rm AM}_{m}({\rm GM}_{n}({\rm GM}_{p}(A)))\\ (\hbox{by AM-GM}) &\ge \left( \prod_{i = 1}^{m} \left(\prod_{j = 1}^{n} \left(\prod_{k = 1}^{p} A_{i, j, k}\right)^{1 / p}\right)^{1 / n}\right)^{1 / m} \\ &= {\rm GM}_{m}({\rm GM}_{n}({\rm GM}_{p}(A))) \end{aligned}

i.e. ignoring the $A$,

\begin{aligned} {\rm AM}_{p} \circ {\rm AM}_{n} \circ {\rm AM}_{m} &\ge {\rm GM}_{p} \circ {\rm AM}_{n} \circ {\rm AM}_{m} \\ &\ge {\rm AM}_{n} \circ {\rm GM}_{p} \circ {\rm AM}_{m} \\ &\ge {\rm AM}_{n} \circ {\rm AM}_{m} \circ {\rm GM}_{p} \\ &\ge {\rm GM}_{n} \circ {\rm AM}_{m} \circ {\rm GM}_{p} \\ &\ge {\rm AM}_{m} \circ {\rm GM}_{n} \circ {\rm GM}_{p} \\ &\ge {\rm GM}_{m} \circ {\rm GM}_{n} \circ {\rm GM}_{p} \end{aligned}

In general, we have the following result.

Theorem 2. Given a sequence of operations $*_{\sigma(1)} \circ \cdots \circ *_{\sigma(n)}$ acting on a $d_{1} \times \cdots \times d_{n}$ tensor $A = (A_{i_{1}, \dots, i_{n}})$ with non-negative entries, where $\sigma \in S_{n}$, each $*$ corresponds to ${\rm AM}$ or ${\rm GM}$, and $*_{j}(A)$ corresponds to applying $*$ along the $j$-th coordinate of $A$, i.e.

\displaystyle \begin{aligned} {\rm AM}_{j}(A) &= \frac{1}{d_{j}} \sum_{i_{j}} A_{i_{1}, \dots, i_{n}} \\ {\rm GM}_{j}(A) &= \left(\prod_{i_{j}} A_{i_{1}, \dots, i_{n}}\right)^{\frac{1}{d_{j}}}, \end{aligned}

the following operations on the sequence preserve a non-increasing order:

1. Replace any ${\rm AM}$ by a ${\rm GM}$.
2. Swap any ${\rm GM}$ with its immediate successor ${\rm AM}$.

Remark. Note for example that although

\begin{aligned} {\rm AM}_{n} \circ {\rm GM}_{p} \circ {\rm AM}_{m} &\ge \begin{cases} {\rm AM}_{n} \circ {\rm AM}_{m} \circ {\rm GM}_{p} \\ {\rm GM}_{n} \circ {\rm GM}_{p} \circ {\rm AM}_{m} \end{cases} \\ &\ge {\rm GM}_{n} \circ {\rm AM}_{m} \circ {\rm GM}_{p} \end{aligned}

the relationship between ${\rm AM}_{n} \circ {\rm AM}_{m} \circ {\rm GM}_{p}$ and ${\rm GM}_{n} \circ {\rm GM}_{p} \circ {\rm AM}_{m}$ isn’t clear from the above.

Filed under Algebra, Combinatorics

## Sums of Function over Subsets of Natural Numbers

In this post we analysed sums of the form

$S_{k}(n) = 1^{k} + \cdots + n^{k}.$

We will generalise this to sums of the form

$S_{f}(n) := f(1) + \cdots + f(n)$

for suitable functions $f: \mathbb{N} \to \mathbb{N}$.

Theorem. For $n \in \mathbb{N}$, define $[n] := \{1, \dots, n\}$. Suppose that $f: \mathbb{N} \to \mathbb{N}$ satisfies

$\displaystyle \sum_{j \in [n]} f(j) < 2^{n} - 1$

for some $n \in \mathbb{N}$. Then there exist disjoint non-empty subsets $A, B \subseteq [n]$ such that

$\displaystyle \sum_{j \in A} f(j) = \sum_{j \in B} f(j)$.

First Proof. Each non-empty subset $T \subseteq [n]$ corresponds to a sum

$\displaystyle S_{f}(T) := \sum_{j \in T} f(j) \in [S_{f}(n)]$.

Since $S_{f}(n) < 2^{n} - 1$—the total number of non-empty subsets $T \subseteq [n]$—there must be two non-empty subsets $A, B \subseteq [n]$ with $S_{f}(A) = S_{f}(B)$. Excluding the common elements from both sides gives the desired claim. $\square$

Second Proof. We consider expressions of the form

$S(c_{1}, \dots, c_{n}) := c_{1} f(1) + \cdots + c_{n} f(n)$,

where $c_{i} \in \{\pm 1\}$ for each $i$. Then $S$ takes $2^{n}$ possible values in $[-S_{f}(n), S_{f}(n)]$, which are all congruent modulo $2$. Hence, the number of distinct values that $S$ takes is at most

$\displaystyle \left \lceil \frac{2 S_{f}(n) + 1}{2} \right \rceil = S_{f}(n) + 1$.

Since this is less than $2^{n}$, we can find $c_{j}, c_{j}' \in \{\pm 1\}$ for $j \in [n]$ such that

$\displaystyle \sum_{j=1}^{n} c_{j} f(j) = \sum_{j=1}^{n} c_{j}' f(j) \Leftrightarrow \sum_{j=1}^{n} (c_{j} - c_{j}') f(j) = 0$

with $c_{j} \neq c_{j}'$ for some $j$.

Note that $c_{j} - c_{j}' \in \{0, \pm 2\}$ for all $j$. Let $A = \{j: c_{j} - c_{j}' = 2\}$ and $B=\{j: c_{j} - c_{j}' = -2\}$. Then $A$ and $B$ are non-empty and disjoint, $A \cup B \subseteq \{1, \dots, n\}$, and

$\displaystyle\sum_{j \in A} f(j) = \sum_{j \in B} f(j)$,

as desired. $\square$

Remark. This is tight: taking $f(j) = 2^{j - 1}$ shows that

$\displaystyle \sum_{j \in [n]} f(j) = 2^{n} - 1$

and

$\displaystyle \sum_{j \in A} f(j) \neq \sum_{j \in B} f(j)$

for any two non-empty subsets $A, B \subseteq [n]$.

Filed under Combinatorics

## A Bound on the Prime Counting Function

The idea for this post arose from this quora answer.

Let $\pi(n)$ denote the number of primes up to $n$. It is well-known that

$\displaystyle\pi(n)\sim\frac{n}{\log n}$,

which is the celebrated prime number theorem. Although “elementary” proofs exist, most proofs of the prime number theorem are technical. Nevertheless, we can obtain some simple bounds as follows.

Let $p_1,\ldots,p_k$ denote fixed prime numbers $\leq n$. By the inclusion-exclusion principle, the number of positive integers $\leq n$ not divisible by any of the $p_i$ is

\displaystyle\begin{aligned}&n-\sum_{i}\left\lfloor\frac{n}{p_i}\right\rfloor+\sum_{i

Therefore, we have the following (rather weak) bound for the prime-counting function $\pi(n)$:

$\boxed{\displaystyle \pi(n)< n\left(1-\frac{1}{p_1}\right)\cdots\left(1-\frac{1}{p_k}\right) + k + 2^{k-1}}$.

Nonetheless, this implies the following result.

Theorem. $\displaystyle\lim_{n\to\infty}\frac{\pi(n)}{n}=0$.

For example, let’s say we want to find all $n$ such that at least 20% of the natural numbers below $n$ are prime. Then

$\displaystyle\pi(n)\geq\frac n5$.

If $p_1,\ldots,p_k$ are the prime numbers $\leq n$, then this means that

$\displaystyle n\left(1-\frac{1}{p_1}\right)\cdots\left(1-\frac{1}{p_k}\right) + k + 2^{k-1}>\frac n5$
$\displaystyle\iff n\left(\frac 15-\frac{\phi(p_k!)}{p_k!}\right).

The left-hand side is positive for $k=6$, for which the inequality has the solution $n<6041035/1657\approx 4638.78$. Trying the values in this range we obtain the following set of solutions

$n\in[2,330]\cup[331,335]\cup[337,340]\cup[349,350]\cup[353,355]\cup[359,360]$

In general, if we want to find all $n$ such that $\pi(n)\geq\varepsilon n$, then we start by finding the least $k$ such that the $k$-th prime $p_k$ satisfies

$\displaystyle\left(1-\frac{1}{p_1}\right)\cdots\left(1-\frac{1}{p_k}\right)=\frac{\phi(p_k!)}{p_k!}<\varepsilon$.

(Note that such a $k$ is guaranteed to exist since the left-hand side approaches $0$ as $k\to\infty$, since the harmonic series diverges.) Then we can bound $n$ using

$\displaystyle n<\frac{k+2^{k-1}}{\varepsilon-\frac{\phi(p_k!)}{p_k!}}$.

(Observe also that $\displaystyle\frac{\phi(n!)}{n!}<\frac{\phi((n-1)!)}{(n-1)!}\Leftrightarrow n$ is prime.)

Filed under Number Theory

## SL(2,IR) is the Commutator Subgroup of GL(2,IR)

Here is a proof of the above fact.

Let $N$ be the commutator subgroup of the general linear group $GL(2,\mathbb R)$; i.e.,

$N=\langle ABA^{-1}B^{-1}:A,B\in GL(2,\mathbb R)\rangle$.

First, it is clear that $N$ is contained in the special linear group $SL(2,\mathbb R)$, since $\det(ABA^{-1}B^{-1})=1$ for any $A,B\in GL(2,\mathbb R)$. Next, we claim that $N$ contains all matrices

$\begin{pmatrix} 1 & b\\ 0 & 1\end{pmatrix}$.

This follows from noting that

$\begin{pmatrix} 1 & b\\ 0 & 1\end{pmatrix}=\begin{pmatrix} 1 & b\\ 0 & b\end{pmatrix}\begin{pmatrix} 1 & 1\\ 0 & 1\end{pmatrix}\begin{pmatrix} 1 & b\\ 0 & b\end{pmatrix}^{-1}\begin{pmatrix} 1 & 1\\ 0 & 1\end{pmatrix}^{-1}$.

By taking transposes, it also follows that $N$ contains all matrices

$\begin{pmatrix} 1 & 0\\ c & 1\end{pmatrix}$.

Further, $N$ contains all matrices

$\begin{pmatrix} a & 0\\ 0 & 1/a\end{pmatrix}$

since

$\begin{pmatrix} a & 0\\ 0 & 1/a\end{pmatrix}=\begin{pmatrix} a & 0\\ 0 & 1\end{pmatrix}\begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix} a & 0\\ 0 & 1\end{pmatrix}^{-1}\begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix}^{-1}$

for any $a\neq 0$.

Now let

$\begin{pmatrix} a & b\\ c & d\end{pmatrix}\in SL(2,\mathbb R)$.

Then $ad-bc=1$. Using the above results,

$\begin{pmatrix} a & b\\ c & d\end{pmatrix}=\begin{pmatrix} 1 & 0\\ c/a & 1\end{pmatrix}\begin{pmatrix} 1 & ab\\ 0 & 1\end{pmatrix}\begin{pmatrix} a & 0\\ 0 & 1/a\end{pmatrix}\in N$

if $a\neq 0$, and

$\begin{pmatrix} a & b\\ c & d\end{pmatrix}=\begin{pmatrix}0&1\\-1&0\end{pmatrix}\begin{pmatrix}1&-d/b\\ 0&1\end{pmatrix}\begin{pmatrix}1&0\\ ab&1\end{pmatrix}\begin{pmatrix}1/b&0\\ 0&b\end{pmatrix}\in N$

if $b\neq 0$, and the latter since

\begin{aligned}&\begin{pmatrix} 0 & -1\\ 1 & 0\end{pmatrix}\\=&\begin{pmatrix}x&y\\0&-x-y\end{pmatrix}\begin{pmatrix}-x-y&0\\ x&y\end{pmatrix}\begin{pmatrix}x&y\\0&-x-y\end{pmatrix}^{-1}\begin{pmatrix}-x-y&0\\ x&y\end{pmatrix}^{-1}\\ \in &\ N\end{aligned}

for any $x,y,x+y\neq 0$. Thus $SL(2,\mathbb R)\subseteq N$, i.e., $N=SL(2,\mathbb R)$.