1.3 Permutations

Definition 1.7

A permutation of a set X is a bijection from X to X.
If \(X=\{1,2,\ldots,n\}\) we write \(S_n\) for the set of all permutations of X, and call \(S_n\) the symmetric group on n letters.

Sometimes, like when \(X=\{1,2,3,\ldots,n\}\), a set comes with a natural order. Then a permutation \(\sigma\) can be thought of as a re-ordering of X, since the elements \(\sigma(1),\sigma(2),\ldots,\sigma(n)\) are the numbers 1, 2, …, n written in some different order.

If \(\sigma\) and \(\tau\) are permutations we will often write \(\sigma \circ \tau\) as \(\sigma \tau\), and refer to it as the product of \(\sigma\) and \(\tau\) instead of the composition.

1.3.1 Two row notation

To record an element \(\sigma\in S_n\), we need to say what \(\sigma(i)\) is for \(1\leq i \leq n\). One way to do this is two row notation in which we write the numbers 1 to n in a row and then write \(\sigma(i)\) beneath i: \[ \begin{pmatrix} 1 & 2 & \cdots & n-1 & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n-1) & \sigma(n) \end{pmatrix} \] Because elements of \(S_n\) are bijections, the numbers appearing on the bottom row are just 1, 2, …, n written in a possibly different order.

The inverse of a bijection is again a bijection so if \(\sigma \in S_n\) then \(\sigma^{-1} \in S_n\). If we have \(\sigma\) in two row notation it is easy to find its inverse: since \(\sigma^{-1}\) must send \(\sigma(i)\) to i, we just swap the two rows and then rearrange the columns so that the top row is in the correct order. For example, if we begin with \[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 4 & 5 & 1 & 2 \end{pmatrix} \] then on swapping the rows we get \[ \begin{pmatrix} 3 & 4 & 5 & 1 & 2 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix} \] and rearranging gives \[ \sigma^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 5 & 1 & 2 & 3 \end{pmatrix}.\]

Theorem 1.5 \(|S_n|=n!\).

Proof. Induction on n. When \(n=1\) there is a unique bijection \(\{1\}\to\{1\}\), namely the identity map, so \(|S_1|=1=1!\) as required.

For the inductive step, note that the number of elements of \(S_n\) is the number of different ways to order the elements \(1,2,\ldots,n\). An ordering of \(1,2,\ldots,n\) is the same thing as an ordering of \(1,2,\ldots,n-1\) with n inserted into one of n positions, so the number of possible orderings is n times the number of orderings of \(1,\ldots,n-1\), which is \((n-1)!\) by the inductive hypothesis. So \(|S_n|=n\times (n-1)!=n!\), completing the inductive step.

1.3.2 Cycles

Definition 1.8

Let \(a_0,\ldots,a_{m-1}\) be distinct elements of \(\{1,2,\ldots,n\}\). Then \((a_0,\ldots,a_{m-1})\) is the permutation in \(S_n\) such that \(a_i \mapsto a_{i+1}\) for \(0\leq i < m-1\), \(a_{m-1}\mapsto a_0\), and if \(x \neq a_1,\ldots,a_m\) then \(x \mapsto x\). Such a permutation is called an m-cycle
A permutation which is an m-cycle for some m is called a cycle.
Two cycles \((a_0,\ldots,a_{m-1})\) and \((b_0,\ldots,b_{l-1})\) are called disjoint if no \(a_i\) is equal to any \(b_j\).

Figure 1.7: Illustration of a 5-cycle. \(a_0,a_1,a_2,a_3,a_4\) are shown in a circle with an arrow pointing from one to the next, then from \(a_4\) back to \(a_0\)

Not every permutation is a cycle, e.g. \[ \begin{pmatrix} 1&2&3&4\\ 2&1&4&3 \end{pmatrix}. \] On the other hand, as we will prove in this section, any permutation can be written as a product of disjoint cycles: you can check that the permutation above is equal to \((1,2)(3,4)\). To give the proof, we need some further ideas related to permutations. First we are going to define powers of permuations in such a way that the usual exponent laws apply.

Definition 1.9 Let \(m\in\mathbb{Z}\) and \(\sigma\in S_n\). Then \[ \sigma^m = \begin{cases} \sigma \circ \cdots \circ \sigma \;\; (m \text{ times}) & m>0 \\ \id & m=0 \\ \sigma^{-1}\circ \cdots \circ \sigma^{-1} \;\; (-m \text{ times}) & m<0 \end{cases} \]

It’s slightly tedious, but you can check that for any \(a,b\in\mathbb{Z}\) we have \[ \sigma^a \sigma^b = \sigma^{a+b}.\] There is a proof of a more general result in the section of these notes on groups.

Definition 1.10 Let \(x \in \{1,2,\ldots,n\}\) and \(\sigma \in S_n\). The orbit of x under \(\sigma\), written \(\operatorname{orb}(x)\), is \[ \operatorname{orb}(x) = \{\sigma^m(x) : m \in \mathbb{Z}\} .\]

Despite appearances, \(\operatorname{orb}(x)\) is a finite set, because it is a subset of \(\{1,\ldots,n\}\). We can say a little about what \(\operatorname{orb}(x)\) looks like using the following result:

Lemma 1.2 Let \(x \in \{1,\ldots,n\}\) and \(\sigma \in S_n\). Then there is a whole number \(r>0\) such that \(\sigma^r(x)=x\).

Proof. The elements \(\sigma^m(x)\) for \(m=0,1,2,\ldots\) can’t all be different, so there must exist \(i<j\) such that \(\sigma^i(x)=\sigma^j(x)\). Then \(\sigma^{-i}\sigma^i (x) = \sigma^{-i}\sigma^j (x)\), so \(x =\sigma^{j-i}(x)\) and we can take \(r=j-i\).

Theorem 1.6 Every element of \(S_n\) can be written as a product of disjoint cycles.

To help keep track of what is happening in the proof we use the running example of

\[\sigma = \begin{pmatrix}1&2&3&4&5&6&7&8&9\\ 4&6&9&8&3&1&7&2&5 \end{pmatrix}\]

Note first of all that \(\operatorname{orb}(1) = \operatorname{orb}(4)= \operatorname{orb}(8) = \operatorname{orb}(2) = \operatorname{orb}(6) = \{1, 4, 8, 2, 6\}\), \(\operatorname{orb}(3) = \operatorname{orb}(9) =\operatorname{orb}(5) = \{3, 9, 5\}\), and \(\operatorname{orb}(7) = \{7\}\).

Proof. Let \(1 \leq x \leq n\). By Lemma 1.2 there is a positive number r such that \(\sigma^r(x) = x\). Let r(x) be the smallest such positive number. In the running example, r(1) = r(4) = r(8) = r(2) = r(6) = 5, r(3) = r(9) = r(5) = 3, and r(7) = 1. Let \(c_x = (x, \sigma(x), \ldots, \sigma^{r-1}(x))\), a cycle in \(S_n\). Again, in the example \(c_1 = (1, 4, 8, 2, 6) = c_4\), \(c_7 = (7)\), \(c_5 = (5, 3, 9)\).

Let \(y \in \operatorname{orb}(x)\). We will show that \(c_x(y) = \sigma(y)\). There is an integer m such that \(y = \sigma^m(x)\). We can divide \(m\) by r(x) to get a quotient q and a remainder r such that \(0 \leq r < r(x)\):

\[m = q r(x) + r\]

Then \(y = \sigma^m(x) = \sigma^{qr(x) + r}(x) = \sigma^r(\sigma^{qr(x)}(x))\). Since \(\sigma^{r(x)}(x) = x\) we get \(\sigma^{q r(x)}(x) = x\) and so \(y = \sigma^r(x)\).

In the running example, take x = 8, so r(x) = 5, and \(y = \sigma^{-42}(x) = 1 \in \operatorname{orb}(x)\). Then \(-42 = -9 \times 5 + 3\) so q = -9, r = 3. The cycle \(c_8 = (8, 2, 6, 1, 4)\). We have \(\sigma(y) = 4\) and \(c_8(y) = 4\).

Recall that \(c_x = (x, \sigma(x), \ldots, \sigma^{r(x)-1}(x))\). Since \(y = \sigma^r(x)\) is one of the elements in this cycle and since \(\sigma^{r(x)}(x) = x\) it follows straightaway that \(c_x(y) = \sigma(y)\).

Next we will show that if two orbits \(\operatorname{orb}(a)\) and \(\operatorname{orb}(b)\) have non-empty intersection, they are equal. If \(z\) is in their intersection then \(z = \sigma^{u}(a) = \sigma^v(b)\) for some \(u, v \in \mathbb{Z}\). We then have \(a = \sigma^{v-u}(b) \in \operatorname{orb}(b)\). Now every element \(\sigma^m(a)\) of \(\operatorname{orb}(a)\) equals \(\sigma^{v-u+m}(b)\), which is an element of \(\operatorname{orb}(b)\). It follows \(\operatorname{orb}(a) \subseteq \operatorname{orb}(b)\). A similar argument shows \(\operatorname{orb}(b) \subseteq \operatorname{orb}(a)\), so \(\operatorname{orb}(a) = \operatorname{orb}(b)\).

The list \(\operatorname{orb}(1), \operatorname{orb}(2), \ldots, \operatorname{orb}(n)\) may contain some repetitions - in our running example, as we have seen, \(\operatorname{orb}(6)\) and \(\operatorname{orb}(4)\) are equal (and so are many others). Remove any repetitions from the list to leave \(\operatorname{orb}(x_1), \ldots, \operatorname{orb}(x_k)\). In the running example, this leaves us with orb(1), orb(3), orb(7) so k=3, \(x_1, = 1, x_2 = 3, x_3 = 7\). The cycles \(c_{x_1}, \ldots, c_{x_k}\) are disjoint because the orbits \(\operatorname{orb}(x_i)\) and \(\operatorname{orb}(x_j)\) are distinct for \(i\neq j\), so by the previous paragraph they have no elements in common. In the running example these cycles are (1, 4, 8, 2, 6), (3, 9, 5), and (7).

We will show that \(\sigma = c_{x_1}\cdots c_{x_k}\), a product of disjoint cycles. Let \(1\leq y \leq n\); we must show that the product of the \(c_{x_i}\)s sends \(y\) to \(\sigma(y)\). We know that \(y \in \operatorname{orb}(y)\) and that \(\operatorname{orb}(y)\) is equal to \(\operatorname{orb}(x_m)\) for some m. Then \(y\) (and \(\sigma(y)\), also an element of \(\operatorname{orb}(x_m)\)) does not belong to any of the other orbits \(\operatorname{orb}(x_i)\) for \(i\neq m\). Thus each \(c_{x_i}\) for \(i\neq m\) satisfies \(c_{x_i}(y)=y\) and \(c_{x_i}(\sigma(y)) = \sigma(y)\), while as we saw earlier \(c_{x_i}(y) = \sigma(y)\). It follows \((c_{x_1}\cdots c_{x_k})(y) = \sigma(y)\), as required.

The theorem gives us a way of expressing a given permutation as a product of disjoint cycles: first we find the orbits, then each orbit gives rise to a cycle and the product of these cycles is equal to our original permutation. In the running example of the theorem, we get \(\sigma = (1,4,8,2,6)(3,9,5)(7)\). Similarly if

\[\sigma = \begin{pmatrix}1&2&3&4&5&6\\4&1&6&2&3&5\end{pmatrix}\]

then \(\sigma(1)=4, \sigma(4) = 2, \sigma(2)=1\), so the first cycle is (1, 4, 2), and \(\sigma(3) = 6, \sigma(6) = 5, \sigma(5)=3\), so the next cycle is (3, 6, 5). This exhausts all the number 1, 2, 3, 4, 5, 6, so it must be \(\sigma = (1, 4, 2)(3, 6, 5)\).

1.3.3 Transpositions and the sign of a permutation

Definition 1.11 A transposition is a 2-cycle.

Lemma 1.3 If \(a_0, \ldots, a_{m-1}\) are distinct then \[(a_0, a_1, \ldots, a_{m-1}) = (a_0,a_1)(a_1, a_2)\cdots (a_{m-2},a_{m-1})\]

Proof. Induction on m. For \(m=2\) there is nothing to prove. For the inductive step, we can assume that \((a_0,a_1)\cdots (a_{m-2}, a_{m-1}) = (a_0,\ldots,a_{m-1})\) and we must show that if we multiply this on the right by \((a_{m-1}, a_m)\) we get \((a_0,\ldots,a_m)\).

Consider the permutation \[ (a_0,\ldots,a_{m-1})(a_{m-1},a_m).\] Every number not an \(a_i\) is sent to itself by this permutation. Every \(a_i\) for \(i \leq m-2\) is sent to \(a_{i+1}\), since it is fixed by the 2-cycle. \(a_{m-1}\) is sent to \(a_m\) by the 2-cycle and this is fixed by the m-cycle. \(a_m\) is sent to \(a_{m-1}\) by the 2-cycle and then to \(a_0\) by the m-cycle. Summarizing, \(a_i\) goes to \(a_{i+1}\) for \(i<m\) and \(a_m\) goes to \(a_0\), hence this permutation is the m+1 cycle \((a_0,\ldots,a_m)\).

Proposition 1.3 Every permutation is equal to a product of transpositions.

Proof. Let \(\sigma\) be any permutation. Theorem 1.6 says we can express \(\sigma\) as a product of disjoint cycles. The previous lemma says we can express each of these cycles as a product of transpositions. This expresses \(\sigma\) as a product of transpositions.

Definition 1.12 A permutation is even if it can be written as a product of an even number of transpositions, and odd if it can be written as an odd number of transpositions.

For example, the identity permutation \(\id = (1,2)(1,2)\) so it is even. It follows straight from the definition that an even permutation multiplied by another even permutation is even, even times odd is odd, odd times even is odd, and odd times odd is even.

It’s not clear however that a permutation couldn’t be odd and even at the same time.

Theorem 1.7 Every permutation is either odd or even, but not both.

Proof. (Sketch). First we know from the previous proposition that every permutation can be written as a product of transpositions, so the only problem is to prove that it is not possible to find two expressions for a given permutation, one using a product \(s_1 s_2 \cdots s_{2m+1}\) of an odd number of transpositions and one using a product \(t_1 t_2 \cdots t_{2l}\) of an even number of transpositions. The steps are as follows:

If this were the case then we would have \(\id = t_{2l}\cdots t_1 s_1 \cdots s_{2m+1}\), expressing the identity as a product of an odd number of transpositions, so it is enough to prove this impossible.
Observe that any transposition \((i,j)\) can be written \[ (j-1,j)\cdots (i+2,i+3)(i+1,i+2) \circ (i,i+1) \circ (i+1,i+2)(i+2,i+3)\cdots (j-1,j)\] where we have assumed without loss of generality that \(i<j\). This is a product of an odd number of adjacent transpositions, i.e. transpositions of the form \((k,k+1)\).
So it is enough to show that the identity cannot be written as a product of an odd number of adjacent transpositions.
Consider polynomials in the n variables \(x_1,\ldots,x_n\). If you have such a polynomial \(f(x_1,\ldots,x_n)\) and a permutation \(\sigma \in S_n\), you get a new polynomial \(\sigma(f)\) by replacing each \(x_i\) in f with \(x_{\sigma(i)}\). For example, if \(\sigma = (1,2,3)\) and \(f=x_1+x_2^2\) then \(\sigma(f)=x_2+x_3^2\). Define \[ \Delta = \prod_{1\leq i<j \leq n}(x_i-x_j)\]
Observe that for any k, \((k,k+1)(\Delta) = -\Delta\), and so any product of an odd number of adjacent transpositions sends \(\Delta\) to \(-\Delta\) too.
Get a contradiction from the fact that \(\id(\Delta) = \Delta\).

Proposition 1.4

The identity permutation is even.
An m-cycle is even if m is odd and odd if m is even.

Proof.

\(\id = (1,2)(1,2)\).
This follows from 1.3.

Definition 1.13 The sign of a permutation \(\sigma\), written \(\sgn(\sigma)\), is \[ \sgn(\sigma) = \begin{cases} \phantom{-}1 & \sigma \text{ is even} \\ -1 & \sigma \text{ is odd.} \end{cases} \]

Thus \(\sgn (a_0,\ldots,a_{m-1}) = (-1)^{m-1}\). Sign has a nice property:

Proposition 1.5 For any permutations \(\sigma\) and \(\tau\) we have \(\sgn(\sigma \tau) = \sgn(\sigma)\sgn(\tau)\).

Proof. This is just a restatement of the fact that an even permutation composed with an even permutation is even, an odd permutation composed with an odd permutation is even, and an odd permutation composed with an even permutation is odd.

This lets us find the sign of an arbitrary permutation: first express it as a product of cycles (they don’t have to be disjoint), then use this result and the fact that even length cycles are odd and odd length cycles are even.