Blog of Desvl

A blog mainly on graduate mathematics.

The Structure of SL_2(F_3) as a Semidirect Product

2023-11-12 06:12:19

Introduction

Let $\mathbb{F}_3$ be the field of three elements and $SL_2(\mathbb{F}_3)$ be the group of $2 \times 2$ matrices over $\mathbb{F}_3$ with determinant $1$. In this post we show that $SL_2(\mathbb{F}_3)$ is a semidirect product of $H_8$ and $\mathbb{Z}/3\mathbb{Z}$.

First of all we determine the cardinality of $SL_2(\mathbb{F}_3)$. To do this, we consider $GL_2(\mathbb{F}_3)$ and notice that $SL_2(\mathbb{F}_3)$ is the kernel of the determinant map.

To determine $|GL_2(\mathbb{F}_3)|$, fix a basis of $\mathbb{F}_3 \oplus \mathbb{F}_3$ and let $A$ be the matrix representation of an element of $GL_2(\mathbb{F}_3)$. The first column of $A$ has $3^2-1$ choices: we only exclude $(0,0)^T$. The second column has $3^2-3$ choices: we exclude the $3$ scalar multiples of the first column to prevent linear dependence. Therefore $|GL_2(\mathbb{F}_3)|=(3^2-1)(3^2-3)=48$. Next we consider the exact sequence

$$1 \to SL_2(\mathbb{F}_3) \to GL_2(\mathbb{F}_3) \xrightarrow{\det} \mathbb{F}_3^\ast \to 1.$$

We get $|SL_2(\mathbb{F}_3)|=|GL_2(\mathbb{F}_3)|/|\mathbb{F}_3^\ast|=48/2=24$.
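The counting argument above is easy to sanity-check by brute force. Here is a small Python sketch (standard library only) that enumerates all $2\times 2$ matrices over $\mathbb{F}_3$:

```python
from itertools import product

# Enumerate all 2x2 matrices (a, b, c, d) over F_3 and count those
# with non-zero determinant (GL_2) and with determinant 1 (SL_2).
def det(m):
    a, b, c, d = m
    return (a * d - b * c) % 3

matrices = list(product(range(3), repeat=4))
GL = [m for m in matrices if det(m) != 0]
SL = [m for m in matrices if det(m) == 1]

print(len(GL), len(SL))  # 48 24
```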

We immediately think about the possibility that $SL_2(\mathbb{F}_3)\cong \mathfrak{S}_4$. Is that the case?

$24 = 3 \times 8$

As $SL_2(\mathbb{F}_3)$ has order $24$, we naturally examine the elements of order $2$, $3$ and $4$ in order to understand the structure of the group we are looking at.

The element of order 2

There are ${4 \choose 2}/2!=3$ elements of order $2$ in $\mathfrak{A}_4$, namely the products of two disjoint $2$-cycles. However, how many elements of order $2$ are there in $SL_2(\mathbb{F}_3)$? Let $A$ be such an element; then $A^2 = I$. Therefore every element of order $2$ is annihilated by the polynomial

$$f(X)=X^2-1=(X-1)(X+1).$$

If $A \in SL_2(\mathbb{F}_3)$ is of order $2$, then the minimal polynomial of $A$ divides $f(X)$ and is not $X-1$, hence is either $X+1$ or $X^2-1$. The second case is impossible, because then $f(X)$ would be the characteristic polynomial of $A$, so $A$ would have eigenvalues $1$ and $-1$ and thus determinant $-1$. We get

Proposition 1. The only element of order $2$ in $SL_2(\mathbb{F}_3)$ is $A=-I$. In particular, $SL_2(\mathbb{F}_3)$ is not isomorphic to $\mathfrak{S}_4$, which has nine elements of order $2$.
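Proposition 1 can also be confirmed by direct enumeration, encoding matrices as tuples $(a,b,c,d)$; this is only a sanity check of the argument above:

```python
from itertools import product

# Multiplication of 2x2 matrices over F_3, encoded as tuples (a, b, c, d).
def mul(m, n):
    a, b, c, d = m
    e, f, g, h = n
    return ((a*e + b*g) % 3, (a*f + b*h) % 3, (c*e + d*g) % 3, (c*f + d*h) % 3)

I = (1, 0, 0, 1)
SL = [m for m in product(range(3), repeat=4) if (m[0]*m[3] - m[1]*m[2]) % 3 == 1]

# Elements of order exactly 2: A != I and A^2 = I.
order2 = [m for m in SL if m != I and mul(m, m) == I]
print(order2)  # [(2, 0, 0, 2)], i.e. -I is the only element of order 2
```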

Determine the group using Sylow theory

Checking elements of order $2$ did not come out of nowhere. Since $24=2^3 \cdot 3$, it makes sense to look at the $2$-Sylow and $3$-Sylow subgroups of $SL_2(\mathbb{F}_3)$. Sylow's theorem ensures that there is a subgroup of order $3$, which can only be $\mathbb{Z}/3\mathbb{Z}$. We have also determined that the only subgroup of order $2$ is $\{-I,I\}$. Next we determine a subgroup of order $8$, i.e. a $2$-Sylow subgroup.

Elements of order 4

To study elements of order $4$, we immediately consider the polynomial

$$g(X)=X^4-1=(X-1)(X+1)(X^2+1).$$

Let $A \in SL_2(\mathbb{F}_3)$ be an element of order $4$. Then $g(A)=0$. Since $A^2 \ne I$, the minimal polynomial of $A$ does not divide $X^2-1$, so it is divisible by $h(X)=X^2+1$. Notice that $h(X)$ is irreducible in $\mathbb{F}_3[X]$ (it has no root in $\mathbb{F}_3$), and the minimal polynomial of a $2\times 2$ matrix has degree at most $2$; therefore $h(X)$ is the minimal polynomial of $A$. Since the degree of $h$ is $2$, we also see $h(X)$ is the characteristic polynomial of $A$.

From this polynomial we see that $\mathrm{tr}(A)=0$. Combining this with the fact that $\det A=1$, we can easily deduce that the elements of order $4$ consist of $\pm i$, $\pm j$, $\pm k$, where

$$i=\begin{pmatrix}0 & -1 \\ 1 & 0\end{pmatrix},\qquad j=\begin{pmatrix}1 & 1 \\ 1 & -1\end{pmatrix},\qquad k=\begin{pmatrix}-1 & 1 \\ 1 & 1\end{pmatrix}.$$

We in particular have $i^3=i^{-1}=-i$, $j^3=j^{-1}=-j$ and $k^3=k^{-1}=-k$. Furthermore, $k=ij=-ji$. These identities ring a bell: the quaternions. We therefore have the quaternion group lying in $SL_2(\mathbb{F}_3)$ as a $2$-Sylow subgroup:

$$H_8=\{\pm I, \pm i, \pm j, \pm k\}.$$

Is there any other $2$-Sylow subgroup? The answer is no. To see this, let $H'$ be another $2$-Sylow subgroup. Then there exists some $g \in SL_2(\mathbb{F}_3)$ such that $H'=gH_8g^{-1}$. But conjugation preserves the orders of elements, and every element of $SL_2(\mathbb{F}_3)$ of order dividing $4$ already lies in $H_8$; hence $gH_8g^{-1}=H_8$.

Proposition 2. The quaternion group $H_8$ embeds into $SL_2(\mathbb{F}_3)$ as the unique $2$-Sylow subgroup. In particular, $SL_2(\mathbb{F}_3)$ has no element of order $8$.
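The quaternion relations and the count of order-$4$ elements can be checked mechanically; the tuples below encode one standard choice of representatives for $i$, $j$, $k$ satisfying the stated relations:

```python
from itertools import product

# Multiplication of 2x2 matrices over F_3, encoded as tuples (a, b, c, d).
def mul(m, n):
    a, b, c, d = m
    e, f, g, h = n
    return ((a*e + b*g) % 3, (a*f + b*h) % 3, (c*e + d*g) % 3, (c*f + d*h) % 3)

negI = (2, 0, 0, 2)
i = (0, 2, 1, 0)  # [[0, -1], [1, 0]] mod 3
j = (1, 1, 1, 2)  # [[1, 1], [1, -1]] mod 3
k = mul(i, j)     # [[-1, 1], [1, 1]] mod 3

# Quaternion relations: i^2 = j^2 = k^2 = -I and ji = -k.
assert mul(i, i) == mul(j, j) == mul(k, k) == negI
assert mul(j, i) == mul(negI, k)

# There are exactly six elements of order 4 (those with A^2 = -I).
SL = [m for m in product(range(3), repeat=4) if (m[0]*m[3] - m[1]*m[2]) % 3 == 1]
order4 = [m for m in SL if mul(m, m) == negI]
print(len(order4))  # 6
```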

An element of order 3

Let $A \in SL_2(\mathbb{F}_3)$ be an element of order $3$. Then its minimal polynomial $m(X)$ divides $X^3-1=(X-1)^3$ (we are in characteristic $3$). Since $A-I \ne 0$ and $m(X)$ has degree at most $2$, we must have $m(X)=(X-1)^2=X^2+X+1$. It follows that the characteristic polynomial of $A$ is also $X^2+X+1$. In particular, the trace of $A$ is $-1=2$. We can then choose

$$A=\begin{pmatrix}1 & 1 \\ 0 & 1\end{pmatrix}.$$

Therefore $K=\{I,A,A^2\}$ is a $3$-Sylow subgroup of $SL_2(\mathbb{F}_3)$. It is not unique: for example, one can also consider the subgroup generated by the transpose of $A$.

Conclusion

Notice that $H_8 \cap K = \{I\}$, because $\gcd(|H_8|,|K|)=\gcd(8,3)=1$. Therefore the map $H_8 \times K \to H_8K$ given by $(x,y) \mapsto xy$ is injective, so $|H_8K|=|H_8||K|=24=|G|$ and hence $H_8K=G$. Since $H_8$ is normal (being the unique $2$-Sylow subgroup), we may write $G=H_8 \rtimes K \cong H_8 \rtimes \mathbb{Z}/3\mathbb{Z}$.
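The whole decomposition can be confirmed by brute force with the same tuple encoding and a standard choice of generators: $H_8$ is stable under conjugation, meets $K$ trivially, and $H_8K$ exhausts the group. This is only a sanity check of the argument above:

```python
from itertools import product

def mul(m, n):
    a, b, c, d = m
    e, f, g, h = n
    return ((a*e + b*g) % 3, (a*f + b*h) % 3, (c*e + d*g) % 3, (c*f + d*h) % 3)

def inv(m):
    a, b, c, d = m  # for det = 1 the inverse is [[d, -b], [-c, a]]
    return (d, (-b) % 3, (-c) % 3, a)

def neg(m):
    return tuple((-x) % 3 for x in m)

SL = [m for m in product(range(3), repeat=4) if (m[0]*m[3] - m[1]*m[2]) % 3 == 1]
I, i, j = (1, 0, 0, 1), (0, 2, 1, 0), (1, 1, 1, 2)
k = mul(i, j)
H8 = {I, neg(I), i, neg(i), j, neg(j), k, neg(k)}
A = (1, 1, 0, 1)          # the chosen element of order 3
K = {I, A, mul(A, A)}

assert all({mul(mul(g, h), inv(g)) for h in H8} == H8 for g in SL)  # H8 is normal
assert H8 & K == {I}                                                # trivial intersection
assert {mul(h, x) for h in H8 for x in K} == set(SL)                # H8 * K is the whole group
print("decomposition verified")
```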

A Separable Extension Is Solvable by Radicals Iff It Is Solvable

2023-10-21 16:39:01

Introduction

Polynomials are of great interest in various fields, such as analysis, geometry and algebra. Given a polynomial, we try to extract as much information as possible. For example, we certainly want to find its roots. However, this is not always realistic. The Abel-Ruffini theorem states that it is impossible to solve polynomials of degree $\ge 5$ by radicals in general. For example, one can always solve $x^n-1=0$ by radicals for arbitrary $n$, but $x^5-x-1=0$ cannot be solved by radicals over $\mathbb{Q}$. Galois showed that the crux of solvability lies in the structure of the Galois group: whether or not it is solvable in the group-theoretic sense.

In this post, we will explore the theory of solvability in the modern sense, considering extensions of arbitrary characteristic rather than solely number fields over $\mathbb{Q}$.

Solvable Extensions

Definition 1. Let $E/k$ be a separable and finite field extension, and $K$ the smallest Galois extension of $k$ containing $E$. We say $E/k$ is solvable if $G(K/k)$ (the Galois group of $K$ over $k$) is solvable.

Throughout we will deal with separable extensions, because without this assumption one would be dealing with normal extensions instead of Galois extensions, although one would arrive at a similar result.

Proposition 1. Let $E/k$ be a separable extension. Then $E/k$ is solvable if and only if there exists a solvable Galois extension $L/k$ such that $k \subset E \subset L$.

Proof. If $E/k$ is solvable, it suffices to take $L$ to be the smallest Galois extension of $k$ containing $E$. Conversely, suppose $L/k$ is solvable and Galois with $k \subset E \subset L$. Let $K$ be the smallest Galois extension of $k$ containing $E$, so that $k \subset E \subset K \subset L$. Then $G(K/k) \cong G(L/k)/G(L/K)$ is a homomorphic image of $G(L/k)$ and hence solvable. $\square$

Next we introduce an important concept concerning field extensions.

Definition 2. Let $\mathcal{C}$ be a certain class of extension fields $F \subset E$. We say that $\mathcal{C}$ is distinguished if it satisfies the following conditions:

  1. Let $k \subset F \subset E$ be a tower of fields. The extension $k \subset E$ is in $\mathcal{C}$ if and only if $k \subset F$ is in $\mathcal{C}$ and $F \subset E$ is in $\mathcal{C}$.
  2. If $k \subset E$ is in $\mathcal{C}$ and if $F$ is any given extension of $k$, and $E,F$ are both contained in some field, then $F \subset EF$ is in $\mathcal{C}$ too. Here $EF$ is the compositum of $E$ and $F$, i.e. the smallest field that contains both $E$ and $F$.
  3. If $k \subset F$ and $k \subset E$ are in $\mathcal{C}$ and $F,E$ are subfields of a common field, then $k \subset FE$ is in $\mathcal{C}$.

When dealing with several extensions at the same time, it is often a good idea to consider the class of extensions they belong to. For example, the class of Galois extensions is not distinguished, because it fails condition 1: if $k \subset F \subset E$ and $E/k$ is Galois, then $E/F$ is Galois but $F/k$ need not be (normality can fail). This is one reason we need the fundamental theorem of Galois theory, a.k.a. the Galois correspondence: not all intermediate subfields give Galois subextensions. The class of separable extensions, however, is distinguished. We introduce this concept because:

Proposition 2. Solvable extensions form a distinguished class of extensions. (N.B. these extensions are finite and separable by default.)

Proof. We verify the three conditions of definition 2. To make the proof easier, however, we first verify condition 2.

Step 1. Let $E/k$ be solvable. Let $F$ be a field containing $k$ and assume $E, F$ are subfields of some algebraically closed field. We need to show that $EF/F$ is solvable. By proposition 1, there is a solvable Galois extension $K/k$ such that $k \subset E \subset K$. Then $KF/F$ is Galois and $G(KF/F)$ is isomorphic (via restriction) to a subgroup of $G(K/k)$. Therefore $KF/F$ is a solvable Galois extension, and since $F \subset EF \subset KF$, proposition 1 implies that $EF/F$ is solvable.

Step 2. Consider a tower of extensions $k \subset F \subset E$. Assume first that $E/k$ is solvable. Then there exists a solvable Galois extension $K/k$ containing $E$; since $K \supset F$, proposition 1 shows that $F/k$ is solvable. The extension $E/F$ is also solvable, because $EF=E$ and we are back in the situation of step 1 with base field $F$.

Conversely, assume that $E/F$ is solvable and $F/k$ is solvable. We will find a solvable extension $M/k$ containing $E$. Let $K/k$ be a solvable Galois extension such that $K \supset F$; then $EK/K$ is solvable by step 1. Let $L$ be a solvable Galois extension of $K$ containing $EK$. If $\sigma$ is any embedding of $L$ over $k$ in a given algebraic closure, then $\sigma K = K$ and hence $\sigma L$ is a solvable extension of $K$. [This sentence deserves some explanation. Notice that $L/k$ is not necessarily Galois, so $\sigma$ is not necessarily an automorphism of $L$, and $\sigma L \ne L$ in general. However, since $K/k$ is Galois, the restriction of $\sigma$ to $K$ is an automorphism, so $\sigma K = K$. The extension $\sigma L/\sigma K$ is solvable because $\sigma L$ is isomorphic to $L$ over $\sigma K = K$.]

We let $M$ be the compositum of all the extensions $\sigma L$, for $\sigma$ ranging over the embeddings of $L$ over $k$. Then $M/k$ is Galois, and so is $M/K$ [note: this is the standard property of normal extensions; besides, $M/k$ is finite]. We have $G(M/K) \subset \prod_{\sigma}G(\sigma L/K)$, a product of solvable groups, so $G(M/K)$ is solvable, meaning $M/K$ is a solvable extension. We also have a surjective homomorphism $G(M/k) \to G(K/k)$ (given by $\sigma \mapsto \sigma|_K$), so $G(M/k)$ has a solvable normal subgroup $G(M/K)$ with solvable quotient $G(K/k)$; hence $G(M/k)$ is solvable. Since $E \subset M$, we are done.

Step 3. If $F/k$ and $E/k$ are solvable and $E,F$ are subfields of a common field, we need to show that $EF/k$ is solvable. By step 1, $EF/F$ is solvable. By step 2 (applied to the tower $k \subset F \subset EF$), $EF/k$ is solvable. $\square$

Solvable By Radicals

Definition 3. Let $F/k$ be a finite and separable extension. We say $F/k$ is solvable by radicals if there exists a finite extension $E$ of $k$ containing $F$, admitting a tower decomposition

$$k = E_0 \subset E_1 \subset E_2 \subset \dots \subset E_m = E$$

such that each step $E_{i+1}/E_i$ is one of the following types:

  1. It is obtained by adjoining a root of unity.
  2. It is obtained by adjoining a root of a polynomial $X^n-a$ with $a \in E_i$ and $n$ prime to the characteristic.
  3. It is obtained by adjoining a root of an equation $X^p-X-a$ with $a \in E_i$, if the characteristic is $p>0$.

For example, $\mathbb{Q}(\sqrt{-2})/\mathbb{Q}$ is solvable by radicals. We consider the polynomial $f(x)=x^2-2x+3$, whose roots are $x_1=1-\sqrt{-2}$ and $x_2=1+\sqrt{-2}$. Let us look at the question from the point of view of field theory. Notice that

$$f(x)=(x-1)^2+2.$$

Therefore $f(x)=0$ is equivalent to $(x-1)^2=-2$. Then $x-1=\sqrt{-2}$ and $x-1=-\sqrt{-2}$ are two equations that make perfect sense in $\mathbb{Q}(\sqrt{-2})$, and we obtain the desired roots. The field gives us the liberty of basic arithmetic, and the radical extension gives us the means to look for a radical root.
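As a quick numerical check (standard library only), the two numbers $1\pm\sqrt{-2}$ found by completing the square really are roots of $f$:

```python
import cmath

f = lambda x: x * x - 2 * x + 3

# The two roots 1 ± sqrt(-2), computed with complex arithmetic.
r1 = 1 - cmath.sqrt(-2)
r2 = 1 + cmath.sqrt(-2)

assert abs(f(r1)) < 1e-12
assert abs(f(r2)) < 1e-12
```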

It is immediate that the class of extensions solvable by radicals is a distinguished class.

In general, we are adjoining an "$n$-th root of something". However, when the characteristic of the field is positive, there are complications: for example, adjoining a $p$-th root of an element in a field of characteristic $p>0$ does not behave well (such an extension is inseparable). Therefore we need to take good care of that. The second and third types are nods to Kummer theory and Artin-Schreier theory respectively, which are deduced from the multiplicative and additive forms of Hilbert's theorem 90. We interrupt the post to introduce the respective theorems.


Let $K/k$ be a cyclic extension of degree $n$, that is, $K/k$ is Galois and $G(K/k)$ is cyclic. Suppose $G(K/k)$ is generated by $\sigma$. Then we have the celebrated “Theorem 90”:

Theorem 1 (Hilbert's theorem 90, multiplicative form). Notation as above, let $\beta \in K$. Then the norm $N_{k}^{K}(\beta)=1$ if and only if there exists an element $\alpha \ne 0$ in $K$ such that $\beta = \alpha/\sigma\alpha$.

To prove this, we need Artin's theorem on the linear independence of characters. With this, we see that the second type of extension in the definition of solvability by radicals is cyclic.

Theorem 2. Let $k$ be a field, $n$ an integer $>0$ prime to the characteristic of $k$, and assume that there is a primitive $n$-th root of unity in $k$.

  1. Let $K$ be a cyclic extension of degree $n$. Then there exists $\alpha \in K$ such that $K = k(\alpha)$ and $\alpha$ satisfies an equation $X^n-a=0$ for some $a \in k$.
  2. Conversely, let $a \in k$. Let $\alpha$ be a root of $X^n-a$. Then $k(\alpha)$ is cyclic over $k$ of degree $d|n$, and $\alpha^d$ is an element of $k$.

All in all, theorem 2 states that an $n$-th root of $a$ yields a cyclic extension. Note that we cannot drop the assumption that $n$ is prime to the characteristic of $k$. When this is not the case, we use the Artin-Schreier theorem instead.
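For a concrete instance of theorem 2, here is a small check over $k=\mathbb{F}_7$, which contains a primitive cube root of unity because $3 \mid 7-1$ (the specific values are illustrative, not part of the theorem):

```python
p = 7  # k = F_7; 2^3 = 8 = 1 mod 7, so 2 is a primitive 3rd root of unity
assert pow(2, 3, p) == 1 and 2 != 1

cubes = sorted({pow(x, 3, p) for x in range(p)})
assert cubes == [0, 1, 6]  # the cubes in F_7

# a = 6 is a cube: X^3 - 6 splits over F_7 (so d = 1 divides n = 3).
assert [x for x in range(p) if pow(x, 3, p) == 6] == [3, 5, 6]

# a = 2 is not a cube: X^3 - 2 has no root in F_7, hence is irreducible
# (a cubic with no root over a field is irreducible); adjoining a root
# gives a cyclic extension of degree 3, with alpha^3 = 2 in F_7.
assert [x for x in range(p) if pow(x, 3, p) == 2] == []
```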

Theorem 3 (Hilbert’s theorem 90, additive form). Let $K/k$ be a cyclic extension of degree $n$. Let $\sigma$ be the generator of $G(K/k)$. Let $\beta \in K$. The trace $\mathrm{Tr}_k^K(\beta)=0$ if and only if there exists an element $\alpha \in K$ such that $\beta = \alpha-\sigma\alpha$.

This theorem requires another application of the independence of characters.

Theorem 4 (Artin-Schreier). Let $k$ be a field of characteristic $p$.

  1. Let $K$ be a cyclic extension of $k$ of degree $p$. Then there exists $\alpha \in K$ such that $K=k(\alpha)$ and $\alpha$ satisfies an equation $X^p-X-a=0$ with some $a \in k$.
  2. Conversely, given $a \in k$, the polynomial $f(X)=X^p-X-a$ either has one root in $k$, in which case all its roots are in $k$, or it is irreducible. In the latter case, if $\alpha$ is a root then $k(\alpha)$ is cyclic of degree $p$ over $k$.

In other words, instead of looking at $p$-th roots in a field of characteristic $p$ (where $X^p-1=(X-1)^p$ has only the trivial root), we look at roots of $X^p-X-a$, which still yield cyclic extensions.
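The dichotomy in theorem 4(2) is easy to observe for $p=3$ by brute force over the prime field (a small illustrative check, not part of the proof):

```python
p = 3

def roots_in_Fp(a):
    """Roots of X^p - X - a in the prime field F_p."""
    return [x for x in range(p) if (x**p - x - a) % p == 0]

# a = 0: every element of F_3 is a root (x^3 = x by Fermat's little theorem),
# so all roots of X^3 - X lie in the base field.
assert roots_in_Fp(0) == [0, 1, 2]

# a = 1: X^3 - X - 1 has no root in F_3, hence is irreducible there
# (a cubic with no root over a field is irreducible), and a root
# generates a cyclic extension of degree 3.
assert roots_in_Fp(1) == []
```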


Now we are ready for the core theorem of this post.

Theorem 5. Let $E$ be a finite separable extension of $k$. Then $E$ is solvable by radicals if and only if $E/k$ is solvable.

Proof. First assume that $E/k$ is solvable. Then there exists a finite solvable Galois extension of $k$ containing $E$; call it $K$. Let $m$ be the product of all primes $l$ such that $l \ne \mathrm{char}\,k$ but $l \mid [K:k]$. Let $F=k(\zeta)$ where $\zeta$ is a primitive $m$-th root of unity. Then $F/k$ is abelian, and it is solvable by radicals by definition.

Since solvable extensions form a distinguished class, we see $KF/F$ is solvable. There is a tower of subfields between $F$ and $KF$ such that each step is cyclic of prime order: every solvable group admits a tower of subgroups with cyclic quotients of prime order, and we apply the Galois correspondence. By theorems 2 and 4, $KF/F$ is solvable by radicals, because the extensions of prime order are exactly those handled by these two theorems. It follows that $E/k$ is solvable by radicals: $KF/F$ is solvable by radicals and $F/k$ is solvable by radicals $\implies$ $KF/k$ is solvable by radicals $\implies$ $E/k$ is solvable by radicals, because $k \subset E \subset KF$.


The elaboration of the "if" part is as follows. In order to prove $E/k$ is solvable by radicals, we exhibit a much bigger field $KF$ containing $E$ such that $KF/k$ is solvable by radicals. First, there exists a finite solvable Galois extension $K/k$ containing $E$. Next we define a cyclotomic extension $F/k$ with two intentions:

  1. $F/k$ should be solvable by radicals.
  2. $F$ contains enough primitive roots of unity, so that we can use theorem 2 freely.

To reach these two goals, we put $F=k(\zeta)$, where $\zeta$ is a primitive $m$-th root of unity and $m$ is the product of the distinct primes dividing $[K:k]$ other than the characteristic of $k$. This choice certainly ensures that $F/k$ is solvable by radicals. For the second goal, we need to look at the subfields between $F$ and $KF$. Let $k = K_0 \subset K_1 \subset \dots \subset K_n = K$ be a tower of field extensions such that every step $K_{i+1}/K_i$ is of prime degree [this is possible due to the solvability of $K/k$]. These primes can only be factors of $[K:k]$. The lifted tower $F=K_0F \subset K_1F \subset \dots \subset K_nF=KF$ introduces no new primes. Why do we consider prime factors of $[K:k]$? Say $[K_{i+1}F:K_iF] = \ell$ is a prime number. If $\ell=\mathrm{char}\,k$, then we can use theorem 4. Otherwise we still have $\ell \mid [K:k]$, so we use theorem 2. However, this theorem requires a primitive $\ell$-th root of unity to lie in $K_iF$. Our choice of $m$ and $\zeta$ guarantees this, because $\ell \mid m$ and therefore a primitive $\ell$-th root of unity exists in $F$. We could make $m$ bigger, but there is no need. The "only if" part does nearly the same thing, with the chain of reasoning adjusted.


Conversely, assume that $E/k$ is solvable by radicals. For any embedding $\sigma$ of $E$ in $E^{\mathrm{a}}$ over $k$, the extension $\sigma E/k$ is also solvable by radicals. Hence the smallest Galois extension $K$ of $k$ containing $E$, being the compositum of $E$ and its conjugates, is solvable by radicals. Let $m$ be the product of all primes, unequal to the characteristic, dividing the degree $[K:k]$, and again let $F=k(\zeta)$ where $\zeta$ is a primitive $m$-th root of unity. It will suffice to prove that $KF$ is solvable over $F$: it then follows that $KF$ is solvable over $k$, and hence $G(K/k)$ is solvable, because it is a homomorphic image of $G(KF/k)$. But $KF/F$ can be decomposed into a tower of extensions such that each step is of prime degree and of one of the types described in theorems 2 and 4, and the corresponding roots of unity lie in the field $F$. Hence $KF/F$ is solvable, proving the theorem. $\square$

Picard's Little Theorem and Twice-Punctured Plane

2023-09-18 19:47:20

Introduction

Let $f:\mathbb{C} \to \mathbb{C}$ be a holomorphic function. By Liouville's theorem, if $f(\mathbb{C})$ is bounded, then $f$ has to be a constant function. However, there is a much stronger result: if $f(\mathbb{C})$ omits two distinct points of $\mathbb{C}$, then $f$ is constant. In other words, if $f$ is non-constant, then the equation $f(z)=a$ has a solution for every $a \in \mathbb{C}$ with at most one exception. To get a feeling for this, if $f$ is a non-constant polynomial, then $f(z)=a$ always has a solution (the fundamental theorem of algebra). If, for example, $f(z)=\exp(z)$, then $f(z)=a$ has no solution only when $a=0$.

The proof will not be easy; it does not follow from a few lines of obvious observations, whether by elementary or advanced approaches. In this post we follow the latter, by studying the twice-punctured plane $\mathbb{C}\setminus\{0,1\}$. To be specific, without loss of generality we can assume that $0$ and $1$ are not in the range of $f$, so that $f(\mathbb{C}) \subset \mathbb{C}\setminus\{0,1\}$. We then use advanced tools to study $\mathbb{C}\setminus\{0,1\}$ in order to reduce the question to Liouville's theorem, by constructing a bounded holomorphic function related to $f$.

We will find a holomorphic covering map $\lambda:\mathfrak{H} \to \mathbb{C}\setminus\{0,1\}$ and then replace $\mathfrak{H}$ with the unit disc $D$ using the Cayley transform $z \mapsto \frac{z-i}{z+i}$. Then the aforementioned $f$ will be lifted to a holomorphic function $F:\mathbb{C} \to D$, which has to be constant due to Liouville’s theorem, and as a result $f$ is constant.
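That the Cayley transform really sends $\mathfrak{H}$ into the unit disc is elementary ($|z-i|<|z+i|$ exactly when $\Im z>0$); here is a quick random numerical check of that fact (standard library only):

```python
import random

def cayley(z):
    """The Cayley transform H -> D, z -> (z - i)/(z + i)."""
    return (z - 1j) / (z + 1j)

random.seed(0)
for _ in range(1000):
    # Sample points of the upper half plane and check they land in the disc.
    z = complex(random.uniform(-10, 10), random.uniform(1e-6, 10))
    assert abs(cayley(z)) < 1
```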

With these being said, we need analytic continuation theory to establish the desired $\lambda$, and on top of that, (algebraic) topology will be needed to justify the function $F$.

Analytic Continuation

For a concrete example of analytic continuation, I recommend this post on the Riemann $\zeta$ function. In this post, however, we focus only on its basic language, enough to state the later content in terms of analytic continuation.

Our continuation is always established “piece by piece”, which is the reason we formulate continuation in the following sense.

Definition 1. A function element is an ordered pair $(f,D)$ where $D$ is an open disc and $f \in H(D)$. Two function elements $(f_1,D_1)$ and $(f_2,D_2)$ are direct continuations of each other if $D_1 \cap D_2 \ne \varnothing$ and $f_1=f_2$ on $D_1 \cap D_2$. In this case we write

$$(f_1,D_1) \sim (f_2,D_2).$$

The notion of ordered pair may ring a bell of sheaf and stalk. Indeed some authors do formulate analytic continuation in this language, see for example Principles of Complex Analysis by Serge Lvovski.

The relation $\sim$ is by definition reflexive and symmetric, but not transitive. To see this, let $\omega$ be a primitive $3$rd root of unity. Let $D_0, D_1, D_2$ be open discs of radius $1$ with centres $\omega^0,\omega^1,\omega^2$. Since each $D_i$ is simply connected and does not contain $0$, we can pick $f_i \in H(D_i)$ such that $f_i(z)^2=z$, with $(f_0,D_0) \sim (f_1,D_1)$ and $(f_1,D_1) \sim (f_2,D_2)$; but on $D_0 \cap D_2$ one has $f_2 =-f_0 \ne f_0$. Indeed there is nothing mysterious: we are rephrasing the fact that the square root function cannot be defined on a region containing $0$.

Definition 2. A chain is a finite sequence $\mathscr{C}$ of discs $(D_0,D_1,\dots,D_n)$ such that $D_{i-1} \cap D_i \ne \varnothing$ for $i=1,\dots,n$. If $(f_0,D_0)$ is given and if there exist function elements $(f_i,D_i)$ such that $(f_{i-1},D_{i-1}) \sim (f_i,D_i)$ for $i=1,\dots,n$, then $(f_n,D_n)$ is said to be the analytic continuation of $(f_0,D_0)$ along $\mathscr{C}$.

A chain $\mathscr{C}=(D_0,\dots,D_n)$ is said to cover a curve $\gamma$ with parameter interval $[0,1]$ if there are numbers $0=s_0<s_1<\dots<s_n=1$ such that $\gamma(0)$ is the centre of $D_0$, $\gamma(1)$ is the centre of $D_n$, and

$$\gamma([s_i,s_{i+1}]) \subset D_i \qquad (i=0,\dots,n-1).$$

If $(f_0,D_0)$ can be continued along this $\mathscr{C}$ to $(f_n,D_n)$, we call $(f_n,D_n)$ an analytic continuation of $(f_0,D_0)$ along $\gamma$; $(f_0,D_0)$ is then said to admit an analytic continuation along $\gamma$.

Either way, it is not necessarily true that $(f_0,D_0) \sim (f_n,D_n)$. However, uniqueness of $(f_n,D_n)$ is always guaranteed. We sketch the proof of uniqueness below.

Lemma 1. Suppose that $D_0 \cap D_1 \cap D_2 \ne \varnothing$, $(f_0,D_0) \sim (f_1,D_1)$ and $(f_1,D_1) \sim (f_2,D_2)$; then $(f_0,D_0) \sim (f_2,D_2)$.

Proof. By assumption, $f_0=f_1$ in $D_0 \cap D_1$, and $f_1=f_2$ in $D_1 \cap D_2$. It follows that $f_0=f_2$ in $D_0 \cap D_1 \cap D_2$, which is open and non-empty. Since $f_0$ and $f_2$ are holomorphic in the connected open set $D_0 \cap D_2$ and agree on a non-empty open subset of it, the zero set of $f_0-f_2$ is not discrete; therefore $f_0-f_2$ vanishes everywhere on $D_0 \cap D_2$. $\square$

Theorem 1. If $(f,D)$ is a function element and $\gamma$ is a curve which starts at the centre of $D$, then $(f,D)$ admits at most one analytic continuation along $\gamma$.

Sketch of the proof. Let $\mathscr{C}_1=(A_0,A_1,\dots,A_m)$ and $\mathscr{C}_2=(B_0,B_1,\dots,B_n)$ be two chains that cover $\gamma$. If $(f,D)$ can be analytically continued along $\mathscr{C}_1$ to a function element $(g_m,A_m)$ and along $\mathscr{C}_2$ to $(h_n,B_n)$, then $g_m=h_n$ in $A_m \cap B_n$.

We are also given partitions $0=s_0<s_1<\dots<s_m=s_{m+1}=1$ and $0=t_0<t_1<\dots<t_n=t_{n+1}=1$ such that

$$\gamma([s_i,s_{i+1}]) \subset A_i, \qquad \gamma([t_j,t_{j+1}]) \subset B_j \qquad (0 \le i \le m,\ 0 \le j \le n),$$

and function elements $(g_i,A_i) \sim (g_{i+1},A_{i+1})$ and $(h_j,B_j) \sim (h_{j+1},B_{j+1})$ for $0 \le i \le m-1$ and $0 \le j \le n-1$, with $g_0=h_0=f$. The proof is established by showing that the continuation is compatible with intersecting intervals, where lemma 1 is used naturally. To be specific: if $0 \le i \le m$ and $0 \le j \le n$, and if $[s_i,s_{i+1}] \cap [t_j,t_{j+1}] \ne \varnothing$, then $(g_i, A_i) \sim (h_j,B_j)$.

The Monodromy Theorem

The monodromy theorem asserts that on a simply connected region $\Omega$, a function element $(f,D)$ with $D \subset \Omega$ extends to all of $\Omega$, provided $(f,D)$ can be continued along all curves in $\Omega$. To prove this we need the homotopy properties of analytic continuation and of simply connected spaces.

Definition 3. A simply connected space is a path-connected topological space $X$ with trivial fundamental group $\pi_1(X,x_0)=\{e\}$ for all $x_0 \in X$.

The following fact is intuitive and will be used in the monodromy theorem.

Lemma 2. Let $X$ be a simply connected space and let $\gamma_1$ and $\gamma_2$ be two curves $[0,1] \to X$ with $\gamma_1(0)=\gamma_2(0)$ and $\gamma_1(1)=\gamma_2(1)$. Then $\gamma_1$ and $\gamma_2$ are homotopic (with endpoints fixed).

Proof. Let $\gamma_2^{-1}$ be the curve defined by $\gamma_2^{-1}(t)=\gamma_2(1-t)$. Then $\gamma_1 * \gamma_2^{-1}$ is a closed curve based at $\gamma_1(0)$, so

$$[\gamma_1 * \gamma_2^{-1}] = e,$$

where $e$ is the identity of $\pi_1(X,\gamma_1(0))$. Hence $\gamma_1 \simeq (\gamma_1 * \gamma_2^{-1}) * \gamma_2 \simeq \gamma_2$. $\square$

Next we prove the two-point version of the monodromy theorem.

Monodromy theorem (two-point version). Let $\alpha,\beta$ be two points of $\mathbb{C}$ and let $(f,D)$ be a function element where $D$ is centred at $\alpha$. Let $\{\gamma_t\}$ be a family of curves from $\alpha$ to $\beta$, indexed by a homotopy $H(s,t):[0,1] \times [0,1] \to \mathbb{C}$. If $(f,D)$ admits an analytic continuation along each $\gamma_t$, to an element $(g_t,D_t)$, then $g_1=g_0$.

In brief, analytic continuation is faithful along homotopy classes. By being indexed by $H(s,t)$ we mean that $\gamma_t(s)=H(s,t)$. We need the uniform continuity of $H(s,t)$.

Proof. Fix $t \in [0,1]$. By definition, there is a chain $\mathscr{C}=(A_0,\dots,A_n)$ which covers $\gamma_t$, with $A_0=D$, such that $(g_t,D_t)$ is obtained by continuation of $(f,D)$ along $\mathscr{C}$. There are numbers $0=s_0<\dots<s_n=1$ such that

$$E_i := \gamma_t([s_i,s_{i+1}]) \subset A_i \qquad (i=0,\dots,n-1).$$

For each $i$, define

$$d_i = \operatorname{dist}(E_i,\ \mathbb{C}\setminus A_i).$$

Each $d_i$ makes sense and is positive, because $E_i$ is compact and $A_i$ is open. Pick any $\varepsilon \in (0,\min_i d_i)$. Since $H(s,t)$ is uniformly continuous, there exists a $\delta>0$ such that

$$|\gamma_t(s)-\gamma_u(s)| < \varepsilon \qquad \text{whenever } |t-u|<\delta,\ s \in [0,1].$$

We claim that $\mathscr{C}$ also covers every such $\gamma_u$. Indeed, pick any $s \in [s_i,s_{i+1}]$. Then $\gamma_u(s) \in A_i$ because

$$|\gamma_u(s)-\gamma_t(s)| < \varepsilon < d_i \quad\text{and}\quad \gamma_t(s) \in E_i.$$

Therefore by theorem 1, $g_t=g_u$. Thus for each $t \in [0,1]$ there is an interval $I_t$ around $t$ such that $g_u=g_t$ for all $u \in [0,1] \cap I_t$. Since $[0,1]$ is compact, finitely many $I_t$ cover $[0,1]$; since $[0,1]$ is connected, after finitely many steps we reach $g_0=g_1$. $\square$

Monodromy theorem. Suppose $\Omega$ is a simply connected open subset of the plane, $(f,D)$ is a function element with $D \subset \Omega$, and $(f,D)$ can be analytically continued along every curve in $\Omega$ that starts at the centre of $D$. Then there exists $g \in H(\Omega)$ such that $g(z)=f(z)$ for all $z \in D$.

Proof. Let $\gamma_0$ and $\gamma_1$ be two curves in $\Omega$ from the centre $\alpha$ of $D$ to some point $\beta \in \Omega$. The two-point monodromy theorem and lemma 2 ensure that these two curves lead to the same element $(g_\beta,D_\beta)$, where $D_\beta \subset \Omega$ is a disc centred at $\beta$. If $D_{\beta_1}$ intersects $D_\beta$, then $(g_{\beta_1},D_{\beta_1})$ can be obtained by continuing $(f,D)$ to $\beta$, then along the segment connecting $\beta$ and $\beta_1$. By the definition of analytic continuation, $g_{\beta_1}=g_\beta$ in $D_{\beta_1} \cap D_\beta$. Therefore the definition

$$g(z) = g_\beta(z) \qquad (z \in D_\beta,\ \beta \in \Omega)$$

is consistent and gives the desired holomorphic extension of $f$. $\square$

Modular Function

Let $\mathfrak{H}$ be the open upper half plane. We will find a function $\lambda \in H(\mathfrak{H})$ whose image is $E=\mathbb{C} \setminus\{0,1\}$ and which in fact realises $\mathfrak{H}$ as the (holomorphic) covering space of $E$. The function $\lambda$ is called a modular function.

As usual, consider the action of $G=SL(2,\mathbb{Z})$ on $\mathfrak{H}$ given by

$$\varphi(z) = \frac{az+b}{cz+d}, \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}).$$

Definition 4. A modular function is a holomorphic (or meromorphic) function $f$ on $\mathfrak{H}$ which is invariant under a non-trivial subgroup $\Gamma$ of $G$. That is, for every $\varphi \in \Gamma$, one has $f \circ \varphi=f$.

In this section, we consider the subgroup

$$\Gamma = \left\{ z \mapsto \frac{az+b}{cz+d} \;:\; a,d \text{ odd},\ b,c \text{ even},\ ad-bc=1 \right\}.$$

It has a fundamental domain

$$Q = \{ z : \Im z > 0,\ -1 \le \Re z < 1,\ |2z+1| \ge 1,\ |2z-1| > 1 \}.$$

Basically, $Q$ is bounded by the two vertical lines $x=-1$ and $x=1$, and below by the two semicircles of diameter $1$ centred at $x=-\frac{1}{2}$ and $x=\frac{1}{2}$; only the left part of the boundary belongs to $Q$. The term fundamental domain is justified by the following theorem.

Theorem 4. Let $\Gamma$ and $Q$ be as above.

(a) Let $\varphi_1,\varphi_2$ be two distinct elements of $\Gamma$, then $\varphi_1(Q) \cap \varphi_2(Q) = \varnothing$.

(b) $\bigcup_{\varphi \in \Gamma}\varphi(Q)=\mathfrak{H}$.

(c) $\Gamma$ is generated by the two elements

$$\sigma(z) = \frac{z}{2z+1}, \qquad \tau(z) = z+2.$$

Sketch of the proof. Let $\Gamma_1$ be the subgroup of $\Gamma$ generated by $\sigma$ and $\tau$, and show (b'):

$$\bigcup_{\varphi \in \Gamma_1}\varphi(Q) = \mathfrak{H}.$$

Then (a) and (b') together imply that $\Gamma_1=\Gamma$, and (b) is proved. To prove (a), one may replace $\varphi_1$ by the identity element and discuss the relationship between $c$ and $d$ for $\varphi_2=\begin{pmatrix}a & b \\ c & d \end{pmatrix}$. To prove (b'), one needs to notice that

$$\Im \varphi(z) = \frac{\Im z}{|cz+d|^2} \qquad \text{for } \varphi = \begin{pmatrix}a & b \\ c & d \end{pmatrix}.$$

Write $\Sigma=\bigcup_{\varphi \in \Gamma_1}\varphi(Q)$. For $w \in \mathfrak{H}$, pick $\varphi_0 \in \Gamma_1$ that maximises $\Im\varphi_0(w)$; it remains to show that $z=\varphi_0(w) \in Q$, and therefore $w \in \varphi_0^{-1}(Q) \subset \Sigma$.
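The identity $\Im\varphi(z)=\Im z/|cz+d|^2$ used in the sketch is easy to sanity-check numerically; the unimodular matrices below are arbitrary examples chosen for illustration (not necessarily in $\Gamma$):

```python
import random

def phi(m, z):
    a, b, c, d = m
    return (a * z + b) / (c * z + d)

# A few integer matrices with determinant 1.
mats = [(1, 1, 0, 1), (0, -1, 1, 0), (2, 1, 1, 1), (1, 0, 2, 1)]

random.seed(1)
for (a, b, c, d) in mats:
    assert a * d - b * c == 1
    for _ in range(100):
        z = complex(random.uniform(-5, 5), random.uniform(0.1, 5))
        lhs = phi((a, b, c, d), z).imag
        rhs = z.imag / abs(c * z + d) ** 2
        assert abs(lhs - rhs) < 1e-9  # Im phi(z) = Im z / |cz+d|^2
```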


We are now allowed to introduce the modular function.

Theorem 5. Notation being above, there exists a function $\lambda \in H(\mathfrak{H})$ such that

(a) $\lambda \circ \varphi = \lambda$ for every $\varphi \in \Gamma$.

(b) $\lambda$ is one-to-one on $Q$.

(c) $\lambda(\mathfrak{H})=\lambda(Q)=E=\mathbb{C}\setminus\{0,1\}$.

(d) $\lambda$ has the real axis as its natural boundary. That is, $\lambda$ has no holomorphic extension to any region that properly contains $\mathfrak{H}$.

Proof. Consider the right half of the interior of $Q$:

$$Q_0 = \{ z : \Im z > 0,\ 0 < \Re z < 1,\ |2z-1| > 1 \}.$$

This is a simply connected region with simple boundary. There is a continuous function $h$ which is one-to-one on $\overline{Q}_0$ and holomorphic in $Q_0$, such that $h(Q_0)=\mathfrak{H}$, $h(0)=0$, $h(1)=1$ and $h(\infty)=\infty$. This is a consequence of conformal mapping theory.

The Schwarz reflection principle extends $h$ to a continuous function on $\overline{Q}$ which is a conformal mapping of $Q^\circ$ (the interior of $Q$) onto the plane minus the non-negative real axis, by the formula

$$h(z) = \overline{h(-\bar{z})}.$$
Note the extended $h$ is one-to-one on $Q$, and $h(Q)$ is $E$ defined in (c).

On the boundary of $Q$, the function $h$ is real. In particular,

$$h(-1+iy) = h(\tau(-1+iy)) = h(1+iy) \qquad (0<y<\infty),$$

and that

$$h(z) = h(\sigma(z)) \qquad \text{for } z \text{ on the semicircle } |2z+1|=1,\ \Im z > 0.$$
We now define

$$\lambda(z) = h(\varphi^{-1}(z)) \qquad \text{for } \varphi \in \Gamma \text{ and } z \in \varphi(Q).$$

This definition makes sense because, by theorem 4, for each $z \in \mathfrak{H}$ there is one and only one $\varphi \in \Gamma$ such that $z \in \varphi(Q)$. Properties (a), (b) and (c) follow immediately.

Notice that $\lambda$ is continuous on

$$Q \cup \tau^{-1}(Q) \cup \sigma^{-1}(Q)$$

(by the boundary identities for $h$ above), and therefore on an open set $V$ containing $Q$. Cauchy's theorem then shows that $\lambda$ is holomorphic in $V$. Since $\mathfrak{H}$ is covered by the union of the sets $\varphi(V)$ for $\varphi \in \Gamma$, and since $\lambda \circ \varphi = \lambda$, we conclude that $\lambda \in H(\mathfrak{H})$.

Finally, the set of all numbers $\varphi(0)=b/d$ is dense in the real axis. If $\lambda$ could be analytically continued to a region properly containing $\mathfrak{H}$, the zeros of $\lambda$ would have a limit point in that region, which is impossible since $\lambda$ is not constant. $\square$

We are now ready for the pièce de résistance of this post.

Picard’s Little Theorem

Theorem (Picard). If $f$ is an entire function and if there are two distinct complex numbers $\alpha$ and $\beta$ that are not in the range of $f$, then $f$ is constant.

The proof is established by considering an analytic continuation of a function $g$ associated with $f$. The continuation originates at the origin and is justified by the monodromy theorem. Then, by the Cayley transform, we find that the range of $g$ is bounded; hence $g$ is constant, and so is $f$.

Proof. First of all, notice that we may assume $\alpha=0$ and $\beta=1$ without loss of generality, for otherwise we can replace $f$ with $(f-\alpha)/(\beta-\alpha)$. The range of $f$ then lies in the set $E$ of theorem 5. There is a disc $A_0$ with centre at $0$ such that $f(A_0)$ lies in a disc $D_0 \subset E$.

For every disc $D \subset E$, there is an associated region $V \subset \mathfrak{H}$ such that $\lambda$ in theorem 5 is one-to-one on $V$ and $\lambda(V)=D$; each such $V$ intersects at most two of the domains $\varphi(Q)$. Corresponding to each choice of $V$, there is a function $\psi \in H(D)$ such that $\psi(\lambda(z))=z$ for all $z \in V$.

Now let $\psi_0 \in H(D_0)$ be the function such that $\psi_0(\lambda(z))=z$ as above. Define $g(z)=\psi_0(f(z))$ for $z \in A_0$. We claim that $g(z)$ can be analytically continued to an entire function.

If $D_1$ is another disc in $E$ with $D_0 \cap D_1 \ne \varnothing$, we can choose a corresponding $V_1$ so that $V_0 \cap V_1 \ne \varnothing$, where $V_0$ is the region chosen for $D_0$. Then $(\psi_0,D_0)$ and $(\psi_1,D_1)$ are direct analytic continuations of each other. We can continue this procedure to find a direct analytic continuation $(\psi_{i+1},D_{i+1})$ of $(\psi_i,D_i)$ with $V_{i+1} \cap V_i \ne \varnothing$. Note $\psi_i(D_i) \subset V_i \subset \mathfrak{H}$ for all $i$.

Let $\gamma$ be a curve in the plane which starts at $0$. The range of $f \circ \gamma$ is a compact subset of $E$ and therefore $\gamma$ can be covered by a chain of discs, say $A_0,\dots,A_n$, so that each $f(A_i)$ is in a disc $D_i \subset E$. By considering function elements $\{(\psi_{i},D_i)\}$, composing with $f$ on each $D_i$ (this is safe because $f$ is entire), we get an analytic continuation of $(g,A_0)$ along the chain $(A_0,\dots,A_n)$. Note $\psi_i \circ f(A_i) \subset \psi_i(D_i) \subset \mathfrak{H}$ again.

Since $\gamma$ is arbitrary, we have shown that $(g,A_0)$ can be analytically continued along every curve in the plane. The monodromy theorem then implies that $g$ extends to an entire function, proving the claim made before.

Note the extended $g$ has range inside $\mathfrak{H}$ on every disc $A_i$, so $g(\mathbb{C}) \subset \mathfrak{H}$. It follows that the Cayley transform

$$h(z)=\frac{g(z)-i}{g(z)+i}$$

has range in the unit disc. By Liouville’s theorem, $h$ is a constant function. Thus $g$ is constant too.

Now we move back to $f$ by looking at $A_0$. Since $g=\psi_0 \circ f$ is constant and $\psi_0$ is one-to-one on $f(A_0)$, the image $f(A_0)$ has to be a singleton, i.e. $f$ is constant on the non-empty open set $A_0$. Representing $f$ as a power series on a disc lying inside $A_0$, we see $f$ has to be constant everywhere. $\square$
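The mapping property of the Cayley transform used above is easy to sanity-check numerically. A quick illustration (not part of the proof): a point with positive imaginary part is closer to $i$ than to $-i$, so its image has modulus less than $1$.

```python
# The Cayley transform w = (z - i)/(z + i) maps the upper half-plane
# into the open unit disc: for Im z > 0, z is closer to i than to -i.

def cayley(z: complex) -> complex:
    return (z - 1j) / (z + 1j)

# sample points with positive imaginary part
samples = [1j, 0.5 + 2j, -3 + 0.001j, 100 + 1j, -0.7 + 0.3j]
for z in samples:
    assert abs(cayley(z)) < 1, z
print("all sample points land in the unit disc")
```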

Note we have also seen along the way that the range of a non-constant entire function cannot lie in a half-plane. But this weaker fact is subsumed by Picard’s theorem, since the complement of a half-plane already contains two distinct points.

Reference

  • Walter Rudin, Real and Complex Analysis.
  • Tammo tom Dieck, Algebraic Topology.

SL(2,R) As a Topological Space and Topological Group

2023-08-13 01:28:22

Introduction

There are a lot of important linear algebraic groups that are widely used in mathematics, physics and industry. Some of them have nice visualisations. For example, it is widely known that $SU(2) \cong S^3$ and $SO(3) \cong \mathbb{RP}^3$. The group $SL(2,\mathbb{R})$ is no less important, but a visualisation of this group is not so easy to find. In this post we show that

$$SL(2,\mathbb{R}) \cong S^1 \times D,$$

where $D$ is the open unit disk. In other words, $SL(2,\mathbb{R})$ can be considered as a donut: not the shell of it ($S^1 \times S^1$) but the “content” or “flesh” of it. More formally, the interior of a solid torus.

The related core theory can be found in Iwasawa decomposition, but to access it we need Lie group and Lie algebra theories, which involves differential geometry and certainly goes beyond the scope of this post. Interested readers can refer to Lie Groups Beyond an Introduction chapter 6 for Iwasawa decomposition theory.

Immediate topological consequences

Before we establish the homeomorphism

$$SL(2,\mathbb{R}) \cong S^1 \times D,$$
we first see what we can derive from it.

  • Is $SL(2,\mathbb{R})$ compact?

No. Since $D$ is not compact, $S^1 \times D$ cannot be compact.

  • What is the fundamental group of $SL(2,\mathbb{R})$?

Notice that $S^1 \times D$ (strong) deformation retracts onto $S^1 \times \{0\} \cong S^1$. Therefore $\pi_1(SL(2,\mathbb{R})) = \pi_1(S^1)=\mathbb{Z}$.

  • Connectedness of $SL(2,\mathbb{R})$?

It is connected because $S^1$ and $D$ are connected. It is not simply connected because the fundamental group is not trivial.

  • What is the dimension of $SL(2,\mathbb{R})$ as a manifold?

The dimension is $3$.

The Iwasawa decomposition

If we jump directly to the conclusion without mentioning Lie theory, the decomposition seems to come out of nowhere. Instead of defining the groups $K$, $A$ and $N$ that appear later and checking that there is no discrepancy, we deduce the decomposition without Lie theory, by considering the action of $SL(2,\mathbb{R})$ on the upper half plane: a group action is likely to reveal more information about the group.

Consider the group action of $SL(2,\mathbb{R})$ on the upper half plane

$$\mathfrak{H} = \{z \in \mathbb{C} : \operatorname{Im}(z) > 0\}$$

given by

$$\sigma(z) = \frac{az+b}{cz+d}, \qquad \sigma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{R}),\ z \in \mathfrak{H}.$$

Up to an explosion of calculation, one can indeed verify that this is a group action and in particular

$$\operatorname{Im}(\sigma(z)) = \frac{\operatorname{Im}(z)}{|cz+d|^2} > 0.$$
As one may guess, it is not wise to continue without investigating the action first, or we will be lost in calculation. We first show that this action is transitive by showing that for any $z=x+yi \in \mathfrak{H}$, there is some $\sigma \in SL(2,\mathbb{R})$ such that $\sigma(z)=i$:

$$\frac{az+b}{cz+d}=i \iff az+b=i(cz+d) \iff \begin{cases} ax+b=-cy, \\ ay=cx+d. \end{cases}$$

Let’s play around with the last linear equation system. We can put $c=0$ and $a=\frac{1}{\sqrt{y}}$ so that $b=-\frac{x}{\sqrt{y}}$ and $d=\sqrt{y}$. That is,

$$\sigma = \begin{pmatrix} \frac{1}{\sqrt{y}} & -\frac{x}{\sqrt{y}} \\ 0 & \sqrt{y} \end{pmatrix}, \qquad \sigma(x+yi)=i.$$
We have therefore proved:

The action of $SL(2,\mathbb{R})$ on $\mathfrak{H}$ is transitive.

Proof. For any $z,z’ \in \mathfrak{H}$, there exists $\sigma$ and $\sigma’$ such that $\sigma(z)=i$ and $\sigma’(z’)=i$. Then $\sigma’^{-1}(\sigma(z))=z’$, i.e. $\sigma’^{-1}\sigma$ sends $z$ to $z’$. $\square$
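The matrix constructed above is concrete enough to check by machine. A quick numerical sketch (assuming the standard Möbius action): it has determinant $1$ and sends $x+yi$ to $i$.

```python
import math

def moebius(m, z):
    # action of a 2x2 real matrix on a point of the upper half-plane
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def sigma_for(x, y):
    # the matrix [[1/sqrt(y), -x/sqrt(y)], [0, sqrt(y)]] constructed above
    s = math.sqrt(y)
    return ((1 / s, -x / s), (0.0, s))

for x, y in [(0.0, 1.0), (3.0, 2.0), (-1.5, 0.25)]:
    (a, b), (c, d) = sigma_for(x, y)
    assert abs(a * d - b * c - 1) < 1e-12                             # determinant 1
    assert abs(moebius(sigma_for(x, y), complex(x, y)) - 1j) < 1e-12  # z is sent to i
```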

By working around $i$ on $\mathfrak{H}$ we can save ourselves a lot of trouble. It is then desirable to find the stabiliser of $i$.

The stabiliser of $i \in \mathfrak{H}$ is $SO(2) \cong S^1$.

Proof. Suppose $\sigma=\begin{pmatrix} a & b \\c & d \end{pmatrix}$ stabilises $i$. Then first of all we have

$$\frac{ai+b}{ci+d}=i.$$

Then

$$b+ai = i(d+ci) = -c+di.$$

It follows that

$$a=d, \qquad b=-c, \qquad 1 = ad-bc = a^2+b^2.$$

Therefore $\sigma \in O(2) \cap SL(2) = SO(2)$ as expected. $\square$
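Conversely, one can verify numerically that rotation matrices fix $i$ (with the convention $\begin{pmatrix}\cos t & \sin t \\ -\sin t & \cos t\end{pmatrix}$; the opposite sign convention works just as well):

```python
import math

def moebius(m, z):
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def rot(t):
    return ((math.cos(t), math.sin(t)), (-math.sin(t), math.cos(t)))

for t in [0.0, 0.3, 1.0, 2.5, -1.2]:
    assert abs(moebius(rot(t), 1j) - 1j) < 1e-12  # SO(2) stabilises i
```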

With these being said, the action of $SL(2,\mathbb{R})$ on $\mathfrak{H}$ splits into $SO(2)$, which fixes $i$, and the rest, which actually move it. In other words, $SL(2,\mathbb{R})/SO(2) \cong \mathfrak{H}$ as a $2$-manifold (the action is in fact by isometries of the hyperbolic metric). We now single out the effective part of the group action. For $\sigma \in SL(2,\mathbb{R})$, suppose that $\sigma(i)=x+iy$. Then

$$\begin{pmatrix} \sqrt{y} & \frac{x}{\sqrt{y}} \\ 0 & \frac{1}{\sqrt{y}} \end{pmatrix}(i) = x+iy = \sigma(i).$$
Let $B$ be the upper triangular matrices in $SL(2,\mathbb{R})$ with positive diagonal elements. Then it is elements in $B$ that actually move things. According to this classification, we have obtained a decomposition

The matrix multiplication map $B \times SO(2) \to SL(2,\mathbb{R})$ is surjective.

Proof. Notice that every element of $B$ can be written in the form

$$\lambda_{x,y} = \begin{pmatrix} \sqrt{y} & \frac{x}{\sqrt{y}} \\ 0 & \frac{1}{\sqrt{y}} \end{pmatrix}, \qquad x \in \mathbb{R},\ y>0,$$

for which $\lambda_{x,y}(i)=x+yi$.
For any $\sigma \in SL(2,\mathbb{R})$, suppose $\sigma(i)=x+iy$, then $\sigma(i)=\lambda_{x,y}(i)$, therefore $\lambda_{x,y}^{-1}\sigma(i)=i$, i.e. $\lambda_{x,y}^{-1}\sigma \in SO(2)$, i.e. $\lambda_{x,y}^{-1}\sigma$ is a stabiliser of $i$. The product $\sigma = \lambda_{x,y}(\lambda_{x,y}^{-1}\sigma)$ always lies in the image of $B \times SO(2)$. $\square$
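The proof is effectively an algorithm: given $\sigma$, compute $\sigma(i)=x+yi$, form $\lambda_{x,y}$, and the remaining factor $\lambda_{x,y}^{-1}\sigma$ lands in $SO(2)$. A numerical sketch (the sample matrix is arbitrary, chosen with determinant $1$):

```python
import math

def moebius(m, z):
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def mat_mul(p, q):
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

sigma = ((2.0, 1.0), (3.0, 2.0))       # determinant 2*2 - 1*3 = 1
z = moebius(sigma, 1j)
x, y = z.real, z.imag
s = math.sqrt(y)
lam_inv = ((1 / s, -x / s), (0.0, s))  # inverse of lambda_{x,y}
k = mat_mul(lam_inv, sigma)

# k stabilises i, so it must be a rotation matrix
(a, b), (c, d) = k
assert abs(a - d) < 1e-12 and abs(b + c) < 1e-12
assert abs(a * a + b * b - 1) < 1e-12
```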

We can decompose $B$ further:

Let $N$ be the group of upper triangular matrices in $SL(2,\mathbb{R})$ with $1$ on the diagonal and let $A$ be the group of diagonal matrices in $SL(2,\mathbb{R})$ with positive diagonal entries. Then $B=NA$. Let $K=SO(2) \subset SL(2,\mathbb{R})$; we have obtained the so-called Iwasawa decomposition:

The multiplication map

$$N \times A \times K \to SL(2,\mathbb{R}), \qquad (n,a,k) \mapsto nak$$

is a diffeomorphism onto.
Proof. It only remains to show injectivity. Suppose $n_1a_1k_1=n_2a_2k_2$. Applying both sides to $i$ we obtain $n_1a_1(i)=n_2a_2(i)$. Suppose

$$n_j = \begin{pmatrix} 1 & x_j \\ 0 & 1 \end{pmatrix}, \qquad a_j = \begin{pmatrix} \sqrt{y_j} & 0 \\ 0 & \frac{1}{\sqrt{y_j}} \end{pmatrix}, \qquad j=1,2.$$
Then we have $n_1a_1(i)=x_1+y_1i=n_2a_2(i)=x_2+y_2i$. It follows that $x_1=x_2$ and $y_1=y_2$, i.e. $n_1=n_2$ and $a_1=a_2$ and therefore $k_1=k_2$. $\square$

By investigating $N$ and $A$ further we obtain

The group $SL(2,\mathbb{R})$ is homeomorphic to $S^1 \times D$.

Proof. Notice that $N$ is homeomorphic to $\mathbb{R}$ and $A$ is homeomorphic to $\mathbb{R}_{>0}\cong \mathbb{R}$, so $N \times A \cong \mathbb{R}^2 \cong D$, while $K = SO(2) \cong S^1$. $\square$
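The decomposition also gives explicit coordinates: writing $n(x)$, $a(y)$, $k(t)$ for the three factors, the parameters $(x,y,t)$ can be recovered from the product, witnessing injectivity. A numerical round trip (conventions as in the proofs above):

```python
import math

def mat_mul(p, q):
    return tuple(
        tuple(sum(p[i][m] * q[m][j] for m in range(2)) for j in range(2))
        for i in range(2)
    )

def n(x):
    return ((1.0, x), (0.0, 1.0))

def a(y):
    s = math.sqrt(y)
    return ((s, 0.0), (0.0, 1 / s))

def k(t):
    return ((math.cos(t), math.sin(t)), (-math.sin(t), math.cos(t)))

x0, y0, t0 = 0.7, 2.0, 0.4
m = mat_mul(mat_mul(n(x0), a(y0)), k(t0))

# recover (x, y) from m(i) = n a (i) = x + yi, since k fixes i
(aa, bb), (cc, dd) = m
z = (aa * 1j + bb) / (cc * 1j + dd)
x1, y1 = z.real, z.imag
# recover t from k = (n a)^{-1} m = a^{-1} n^{-1} m
kk = mat_mul(mat_mul(a(1 / y1), n(-x1)), m)
t1 = math.atan2(kk[0][1], kk[0][0])

assert abs(x1 - x0) < 1e-12 and abs(y1 - y0) < 1e-12 and abs(t1 - t0) < 1e-12
```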

Notice the order of $N,A,K$ does not matter much: $NAK$, $KAN$, $ANK$, $KNA$ give the same decomposition result. This is because $AN=NA$, and for $nak \in SL(2,\mathbb{R})$ we have $(nak)^{-1}=k^{-1}a^{-1}n^{-1}$, which lies in the image of $K \times A \times N$ under matrix multiplication.

Immediate group-theoretical consequences

With the full Iwasawa decomposition in mind, we can scratch the surface of the rather complicated $SL(2,\mathbb{R})$.

The only continuous homomorphism of $SL(2,\mathbb{R})$ to $\mathbb{R}$ is trivial.

Proof. Let $f:SL(2,\mathbb{R}) \to \mathbb{R}$ be such a map. We have $f(kan)=f(k)+f(a)+f(n)$. We need to show that $f(k)=f(a)=f(n)=0$.

First of all, since $K$ is a compact subgroup of $SL(2,\mathbb{R})$, its image in $\mathbb{R}$ has to be a compact subgroup, namely $\{0\}$; hence $f(k)=0$ for all $k \in K$. On the other hand, $f$ on $A$ and $N$ can be described more explicitly. For $A$, we see $\begin{pmatrix}r & 0 \\ 0 & \frac{1}{r} \end{pmatrix} \mapsto r \mapsto \log{r}$ yields an isomorphism of $A$ and $\mathbb{R}$, in both the algebraic and the topological sense. For $N$ on the other hand, we immediately have an isomorphism $\begin{pmatrix}1 & x \\ 0 & 1\end{pmatrix} \mapsto x$. Therefore the image of $f$ on $A$ and $N$ can be realised as $u\log{r}$ and $vx$ for some $u,v \in \mathbb{R}$. We use conjugation between $A$, $N$ and $K$ to determine $u$ and $v$. Notice that

$$\begin{pmatrix} r & 0 \\ 0 & \frac{1}{r} \end{pmatrix}\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}\begin{pmatrix} \frac{1}{r} & 0 \\ 0 & r \end{pmatrix} = \begin{pmatrix} 1 & r^2x \\ 0 & 1 \end{pmatrix};$$

applying $f$ on both sides, we have

$$u\log r + vx - u\log r = vr^2x, \quad\text{i.e.}\quad vx = vr^2x \text{ for all } r>0,\ x \in \mathbb{R},$$

hence $v=0$. For $u$, we consider the conjugate relation

$$\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} r & 0 \\ 0 & \frac{1}{r} \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{r} & 0 \\ 0 & r \end{pmatrix}.$$

Applying $f$ on both sides, and recalling that $f$ vanishes on the rotation matrix $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \in K$, we obtain

$$u\log r = u\log\frac{1}{r} = -u\log r,$$

hence $u=0$.
This proves the triviality of $f$. $\square$
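The two conjugation identities carrying this proof can be checked directly. A quick sketch, with $a(r)=\operatorname{diag}(r,1/r)$, $n(x)$ unipotent upper-triangular, and $w=\begin{pmatrix}0 & -1 \\ 1 & 0\end{pmatrix} \in K$:

```python
def mat_mul(p, q):
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def close(p, q, eps=1e-12):
    return all(abs(p[i][j] - q[i][j]) < eps for i in range(2) for j in range(2))

def n(x):
    return ((1.0, x), (0.0, 1.0))

def a(r):
    return ((r, 0.0), (0.0, 1.0 / r))

w = ((0.0, -1.0), (1.0, 0.0))
w_inv = ((0.0, 1.0), (-1.0, 0.0))

r, x = 3.0, 0.5
# a(r) n(x) a(r)^{-1} = n(r^2 x): conjugation by A rescales N
assert close(mat_mul(mat_mul(a(r), n(x)), a(1.0 / r)), n(r * r * x))
# w a(r) w^{-1} = a(1/r): an element of K inverts A
assert close(mat_mul(mat_mul(w, a(r)), w_inv), a(1.0 / r))
```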

Let $f:SL(2,\mathbb{R}) \to GL(n,\mathbb{R})$ be a continuous homomorphism, then $f(SL(2,\mathbb{R})) \subset SL(n,\mathbb{R})$.

Proof. Consider the sequence of group homomorphisms

$$SL(2,\mathbb{R}) \xrightarrow{\ f\ } GL(n,\mathbb{R}) \xrightarrow{\ \det\ } \mathbb{R}^\ast.$$

Since $SL(2,\mathbb{R})$ is connected, we see $\det\circ f(SL(2,\mathbb{R}))$ is connected, thus lying in $\mathbb{R}_{>0}$. We can then modify the sequence a little bit:

$$SL(2,\mathbb{R}) \xrightarrow{\ f\ } GL(n,\mathbb{R}) \xrightarrow{\ \det\ } \mathbb{R}_{>0} \xrightarrow{\ \log\ } \mathbb{R}.$$

The map $\log \circ \det \circ f$ is a continuous homomorphism sending $SL(2,\mathbb{R})$ to $\mathbb{R}$, which is trivial, and therefore

$$\det(f(\sigma)) = 1 \quad\text{for all } \sigma \in SL(2,\mathbb{R}).$$
This proves our assertion. $\square$

There is still a lot one can do without much Lie theory, using little more than Haar measure theory. The reader is advised to try this exercise set to see, for example, that the “volume” of $SL(2,\mathbb{R})/SL(2,\mathbb{Z})$ is $\zeta(2)$. In the references / further reading section the reader will also find a way to show that $SL(2,\mathbb{Z})\backslash SL(2,\mathbb{R})/SO(2,\mathbb{R})$ has volume $\frac{\pi}{3}$.

References / Further Reading

Important Posts of This Blog

2023-08-04 22:49:21

This post collects the top 5 most popular content according to Google Search Console.

irreducible representations of so(3)…

The group $SO(3)$ is one of the most “realistic” Lie groups, as it describes all 3D rotations in real world. In the post Irreducible Representations of SO(3) and the Laplacian, we compute all of its irreducible representations, using the theory of Laplacian and harmonic polynomials. This is indeed not an easy job, as it shows the “hard” side of linear algebra.

fourier transform of sinx/x…

The Fourier transforms of $\frac{\sin x}{x}$ and $\left(\frac{\sin{x}}{x}\right)^2$ are important but not easy to compute. In this post The Fourier transform of sinx/x and (sinx/x)^2 and more we did the computation by extensively using contour integration. Along the journey, we also review important concepts in complex analysis.

fréchet derivative…

Fréchet derivative generalises the concept of derivative to topological vector spaces of arbitrary dimension. Most importantly, a derivative should often be understood as a linear operator rather than a number or a matrix, as shown in the post A Brief Introduction to Fréchet Derivative.

fourier transform of e^-ax^2…

In this post The Fourier Transform of exp(-cx^2) and Its Convolution, we compute the Fourier transform of $\exp(-cx^2)$ in two ways: a differential equation and the Gaussian integral. We also find the convolution quite easy to compute once we utilise the Fourier transform.

Artin's Theorem of Induced Characters

2023-07-18 03:39:59

Introduction

When studying a linear space some of whose subspaces are known, we are interested in the contribution of these subspaces, studying their sum or (inner) direct sum whenever possible. This philosophy can be applied to many other fields.

In the context of representation theory, say we are given a finite group $G$ with a subgroup $H$; we want to know how a character of $H$ is related to a character of $G$, through induction if possible. Next we state the content of this post more formally.

Let $G$ be a finite group with distinct irreducible characters $\chi_1,\dots,\chi_h$. A class function $f$ on $G$ is a character if and only if it is a linear combination of the $\chi_i$’s with non-negative integer coefficients. We denote the space of characters by $R^+(G)$. However, $R^+(G)$ lacks a satisfying algebraic structure, as one is not even allowed to freely do subtraction. For this reason, we extend the coefficients to all integers, by defining

$$R(G) = \mathbb{Z}\chi_1 \oplus \cdots \oplus \mathbb{Z}\chi_h.$$
An element of $R(G)$ is called a virtual character because when one coefficient of some $\chi_i$ is negative, it cannot be a character in the usual sense. Note that $R(G)$ is a finitely generated free abelian group, hence we are free to do subtraction in the normal sense.

Besides, since the product of two characters is still a character, we see $R(G)$ is a ring; indeed it is a commutative subring of the ring $F_\mathbb{C}(G)$ of class functions of $G$ over $\mathbb{C}$. Furthermore, we actually have $F_\mathbb{C}(G) \cong \mathbb{C} \otimes R(G)$.

Let $H$ be a subgroup of $G$. Then the operations of restriction and induction define homomorphisms $\mathrm{Res}:R(G) \to R(H)$ and $\mathrm{Ind}:R(H) \to R(G)$. By extending Frobenius reciprocity linearly, we still find that $\mathrm{Res}$ and $\mathrm{Ind}$ are adjoint to each other. We also notice that the image of $\mathrm{Ind}:R(H) \to R(G)$ is an ideal of $R(G)$. This is because, for any $\varphi \in R(H)$ and $\psi \in R(G)$, one has the projection formula

$$\mathrm{Ind}(\varphi)\cdot\psi = \mathrm{Ind}\bigl(\varphi \cdot \mathrm{Res}(\psi)\bigr).$$
But being an ideal should not be the end of our story. We want to know what happens if we consider more than one subgroup. For example, since every group is the union of its cyclic subgroups, what if we consider all cyclic subgroups of $G$? We are also interested in how all these ideals work together. This is where Artin’s theorem comes in.
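The projection formula can be tested on the smallest nonabelian example, $H=\mathfrak{A}_3$ inside $G=\mathfrak{S}_3$, using the standard induced-character formula $\mathrm{Ind}_H^G\varphi(g)=\frac{1}{|H|}\sum_{x \in G,\ x^{-1}gx \in H}\varphi(x^{-1}gx)$. A sketch for illustration (this example is not from the post):

```python
import cmath
from itertools import permutations

def compose(p, q):
    # (p o q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    r = [0] * 3
    for i, v in enumerate(p):
        r[v] = i
    return tuple(r)

def sign(p):
    s = 1
    for i in range(3):
        for j in range(i + 1, 3):
            if p[i] > p[j]:
                s = -s
    return s

G = list(permutations(range(3)))
H = [p for p in G if sign(p) == 1]                # A3 inside S3
w = cmath.exp(2j * cmath.pi / 3)
c = (1, 2, 0)                                     # a 3-cycle
phi = {(0, 1, 2): 1, c: w, compose(c, c): w.conjugate()}  # nontrivial character of A3
psi = {g: sign(g) for g in G}                     # the sign character of S3

def induce(f, g):
    # Ind_H^G f(g) = (1/|H|) * sum over x in G with x^{-1} g x in H
    conj = (compose(inverse(x), compose(g, x)) for x in G)
    return sum(f[t] for t in conj if t in f) / len(H)

# projection formula: Ind(phi) * psi == Ind(phi * Res(psi)) pointwise on G
phi_res_psi = {h: phi[h] * psi[h] for h in H}
for g in G:
    assert abs(induce(phi, g) * psi[g] - induce(phi_res_psi, g)) < 1e-12
```

Here $\mathrm{Ind}\,\varphi$ is the $2$-dimensional irreducible character of $\mathfrak{S}_3$, with values $2$, $-1$, $0$ on the identity, the $3$-cycles and the transpositions respectively.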

Artin’s Theorem - Statement and a Concrete Example

Artin’s Theorem. Let $X$ be a family of subgroups of a finite group $G$. Let $\mathrm{Ind}:\oplus_{H \in X}R(H) \to R(G)$ be the homomorphism defined by the family of $\mathrm{Ind}_H^G$, $H \in X$. Then the following statements are equivalent:

(i) $G$ is the union of the conjugates of all $H \in X$. Equivalently, for any $\sigma \in G$, there is some $H \in X$ such that $H$ contains a conjugate of $\sigma$.

(ii) The cokernel of $\mathrm{Ind}:\bigoplus_{H \in X}R(H) \to R(G)$ is finite.

Example. Put $G=D_4$, the dihedral group consisting of the rotations ($\sigma$) and flips ($\tau$) of the square. We write

$$D_4 = \{1,\sigma,\sigma^2,\sigma^3,\tau,\tau\sigma,\tau\sigma^2,\tau\sigma^3\}, \qquad \sigma^4=\tau^2=1,\ \tau\sigma\tau^{-1}=\sigma^{-1}.$$
In this example we take $X=\{\langle\sigma\rangle,\langle\tau\rangle,\langle\tau\sigma\rangle\}$. First of all we put down the character table of $G$:

The character tables of the elements of $X$ are not difficult to write down, as they are character tables of cyclic groups.

Instead of writing something like $\mathrm{Ind}_{\langle\sigma\rangle}^{D_4}\chi_1^\sigma=\chi_1+\chi_4$ manually for all characters, we put all of them in an induction-restriction table:

which yields a matrix naturally:

How to read the induction-restriction table? For example, the first column is $\left(\langle \mathrm{Ind}_{\langle\sigma\rangle}^{D_4}\chi_1^\sigma,\chi_j\rangle\right)_j$. Since $\mathrm{Ind}_{\langle\sigma\rangle}^{D_4}\chi_1^\sigma=\chi_1+\chi_4$, the column becomes $(1,0,0,1,0)$. On the other hand, the rows are given by inner products with restrictions. For example, since $\chi_4^\sigma$ appears once in $\mathrm{Res}_{\langle\sigma\rangle}^{D_4}\chi_5$, we have $\langle\chi_4^\sigma,\mathrm{Res}_{\langle\sigma\rangle}^{D_4}\chi_5\rangle=1$ and therefore $T_{54}=1$. Induction and restriction coincide up to a transpose, which is another way to illustrate Frobenius reciprocity.

We obtain the induction map explicitly:

where the basis of $R(D_4)$ is $\chi_1,\dots,\chi_5$ and the basis of $R(\langle\sigma\rangle) \oplus R(\langle\tau\rangle) \oplus R(\langle\tau\sigma\rangle)$ is given by the second row of the induction-restriction table. By doing Gaussian elimination of rows and columns of $T$ (over $\mathbb{Z}$), i.e. changing the basis for $\mathbb{Z}^5$ and $\mathbb{Z}^8$, the matrix $T$ is reduced to the form

The image of $U$ is $\mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z} \oplus 2\mathbb{Z}$, hence the cokernel of the induction map is

$$\mathrm{coker}(\mathrm{Ind}) \cong \mathbb{Z}^5/\left(\mathbb{Z}^4 \oplus 2\mathbb{Z}\right) \cong \mathbb{Z}/2\mathbb{Z},$$
which is certainly finite. One can also verify that $X$ satisfies (i).
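Instead of row-reducing by hand, one can compute the order of the cokernel as the gcd of all $5\times 5$ minors of $T$, which equals the product $d_1\cdots d_5$ of the invariant factors. The matrix below is an induction-restriction matrix for this $X$ written down under one consistent labeling of the characters (an assumption: the labeling may differ from the blog's lost table, but the cokernel is the same):

```python
from itertools import combinations, permutations
from math import gcd, prod

# Rows: chi_1..chi_5 of D4; columns: the 4 characters of <sigma>,
# then the 2 of <tau>, then the 2 of <tau sigma> (one consistent labeling).
T = [
    [1, 0, 0, 0, 1, 0, 1, 0],
    [1, 0, 0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 1, 1, 1, 1],
]

def perm_sign(p):
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def det(m):
    # Leibniz formula; fine for 5x5 integer matrices
    n = len(m)
    return sum(perm_sign(p) * prod(m[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

g = 0
for cols in combinations(range(8), 5):
    minor = [[row[c] for c in cols] for row in T]
    g = gcd(g, abs(det(minor)))

# g = d1 * ... * d5 = order of the cokernel
assert g == 2
print("cokernel is Z/2Z")
```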

Proof of Artin’s Theorem

(i) => (ii)

Consider the exact sequence

$$\bigoplus_{H \in X}R(H) \xrightarrow{\ \mathrm{Ind}\ } R(G) \longrightarrow \mathrm{coker}(\mathrm{Ind}) \longrightarrow 0.$$
To show that $\mathrm{coker}(\mathrm{Ind})$ is finite (it is a finitely generated abelian group to begin with), it suffices to show that it vanishes after tensoring with $\mathbb{Q}$; in other words, that

$$\mathbb{Q}\otimes\bigoplus_{H \in X}R(H) \longrightarrow \mathbb{Q}\otimes R(G)$$

is a surjective map, i.e. it has trivial cokernel. This is equivalent to the surjectivity of the $\mathbb{C}$-linear map

$$\bigoplus_{H \in X}\mathbb{C}\otimes R(H) \longrightarrow \mathbb{C}\otimes R(G).$$

By Frobenius reciprocity, this is in turn equivalent to the injectivity of the restriction map

$$\mathbb{C}\otimes R(G) \longrightarrow \bigoplus_{H \in X}\mathbb{C}\otimes R(H).$$
Notice that $\mathbb{C} \otimes R(G)$ is the space of class functions of $G$. For a class function $f$ of $G$, if its restriction to each $H \in X$ is $0$, then by (i) every element of $G$ is conjugate to an element of some $H \in X$; since $f$ is constant on conjugacy classes, $f$ is $0$ everywhere.

(ii) => (i)

Let $S$ be the union of the conjugates of the subgroups $H \in X$. An element of the image of $\mathrm{Ind}$ has the form $g=\sum_{H \in X}\mathrm{Ind}_H^G(f_H)$, and since induced class functions vanish off the conjugates of the subgroup they are induced from, $g$ always vanishes on $G \setminus S$. If (ii) holds, then

$$\mathbb{C}\otimes\bigoplus_{H \in X}R(H) \longrightarrow \mathbb{C}\otimes R(G)$$

is a surjective map. Therefore every class function of $G$, i.e. every element of $\mathbb{C} \otimes R(G)$, vanishes on $G \setminus S$; taking the constant function $1$ forces $G \setminus S$ to be empty, i.e. $G=S$.

References / Further Reading