Hilbert’s Basis Theorem

If you have had some experience with rings and commutative algebra in the past, you may have encountered the idea of a Noetherian ring. This post will not deal with explaining Noetherian rings or their importance; instead I would recommend several resources which undoubtedly handle it much more comprehensively than I can:

Today we will prove a fundamental result regarding Noetherian rings, which can be traced back to an 1890 result of Hilbert. In what follows, we will use commutative rings with identity, although analogous results exist in other cases.

Theorem: Let R be a Noetherian ring. Then R[X] is Noetherian, and via induction, R[X_1,\dots,X_k] is Noetherian as well. 

Proof: Assume towards contradiction that R[X] was not Noetherian. Using the Axiom of Choice, we may select a sequence of polynomials f_0,f_1,\dots, f_k,\dots such that f_k\in R[X]\setminus (f_0,\dots,f_{k-1}) and is of minimal degree in R[X]\setminus(f_0,\dots,f_{k-1}). For each k, let a_k be the leading coefficient of f_k, and consider the chain of ideals

(a_0)\subset (a_0,a_1)\subset\dots\subset (a_0,a_1,\dots,a_k)\subset\dots\subset R.

Now R is Noetherian, so this chain stabilizes for some integer N. Thus a_{N+1}\in (a_0,\dots,a_N), so that a_{N+1}=\sum_{i=0}^Nr_ia_i for some r_i\in R. Now consider the polynomial


First, this is well defined, since the degree sequence of f_0,f_1,\dots is non-decreasing, so each summand is multiplied by a nonnegative power of x. Next, this polynomial lies outside of the ideal (f_0,\dots,f_N) since f_{N+1} lies outside of this ideal. Finally, note that by choice of the r_i, the degree of this polynomial is strictly less than the degree of f_{N+1}. This contradicts our original choice of f_{N+1}, and hence it must be that R[X] is Noetherian.


Some concluding remarks. First, using induction and the fact that R[X_1,\dots,X_k]\cong (R[X_1,\dots,X_{k-1}])[X_k], one can show that a polynomial ring in finitely many variables over a Noetherian ring is also Noetherian. Second, note that this proof is entirely non-constructive, and tells us nothing about how to get a generating set for an ideal. To do this we must turn to the idea of Grobner bases, which are interesting in their own right but will not be discussed here. As always, thanks for reading!

Posted in Ring and Field Theory | Leave a comment

“One Line” Algebra Qual Problems

Today I’m going to try to distill some solutions to previous algebra qual problems into one sentence answers. This will likely mean that the solutions are missing details, but my hope is to be able to give a decent overview of how the solution would go. My goal in doing this is to develop my intuition for the main ideas used in these solutions, and also to work on being able to talk about mathematics succinctly (there is also a “puzzle” element to finding the shortest way to express the solution which I have enjoyed). Apologies in advance if this ends up being more confusing than helpful.

Problem 1 [Wisconsin Algebra Qual, Jan 2007]: Let F be a field of characteristic 0 and let f\in F[x] be an irreducible polynomial of degree >1 with splitting field E\subset F.  Define \Omega=\{\alpha\in E:f(\alpha)=0\}.

  • (a) Let \alpha\in \Omega and m a positive integer. If g is the minimal polynomial of \alpha^m over F, show that the roots of g are \{\beta^m:\beta\in \Omega\}.
  • (b) Now fix \alpha\in \Omega and assume r\alpha\in \Omega for some r\in F. Show that for all i\geq0, \beta r^i\in \Omega.
  • (c) If r is as in part (b), conclude r is a root of unity.
  • (d) If \alpha and r are as in part (b) and if m is the order of r, show that f(X)=g(X^m), where g is the minimal polynomial of \alpha^m over F.

Solution 1:

  • (a) We know \alpha satisfies g(X^m) so f divides g(X^m) and \{\beta^m:\beta\in \Omega\} is contained in the set of roots of g; further, g(X) divides \prod_{\beta\in\Omega}(X-\beta^m) as the latter polynomial is fixed by any Galois action and hence lies in F[X], which shows the reverse inclusion.
  • (b) The Galois action acts transitively on \Omega and preserves roots of f; a homomorphism mapping \alpha to \alpha r will map \alpha r to \alpha r^2 and a homomorphism mapping \alpha to \beta will send \alpha r^i to \beta r^i.
  • (c) Polynomials can only have finitely many roots so \{\alpha\cdot r^i:i\geq 0\} is finite.
  • (d) As in part (a), f divides g(X^m); both polynomials have the same degree as |\Omega|=m\cdot|\{\beta^m:\beta\in \Omega\}| since r is an mth root of unity.


Problem 2 [Atiyah MacDonald, Ch1 Ex 7]: Let A be a ring in which every element x satisfies x^n=1 for some n (depending on x). Show that every prime ideal in A is maximal.

Solution 2: If I is prime then A/I is an integral domain and we may apply the cancellation law on \overline{x}^n=\overline{x} to see that \overline{x} is invertible and hence A/I is a field, showing I is maximal.


Problem 3 [Multiple sources, none of which I can remember specifically]: Show that if F is a finite field, then |F|= p^k for some prime p.

Solution 3: The characteristic of F must be some prime p, so that F forms a vector space over \mathbb{F}_p which, after choosing a basis and counting, must have order p^k for some k\geq1.


Problem 4 [MSU Algebra Exam, Fall 2016]: If G is a finite p group and N\subset G is normal, show that N\cap Z(G)\neq \{1\}.

Solution 4: N is a union of conjugacy classes, so we may let G act on N via conjugation and examine orbits to see |N|=p^k=1+\sum_{conjugacy classes}|C_g|; each |C_g| is a power of p and as the identity element has a trivial conjugacy class we must have other elements with trivial conjugacy classes which are precisely elements in the center of G.


Problem 5 [MSU Algebra Exam, Fall 2016]: Show that there are only finitely many simple groups G containing a subgroup of index k.

Solution 5: If H\subset G is not normal and has index k, then the action of G on the cosets of H gives a nontrivial map from G to S_k which cannot be injective if G is too big.

Posted in Group Theory, Ring and Field Theory | Tagged | Leave a comment

A good title for this blog post exists, but I don’t know what it is

Today we will look at one of my favorite topics in combinatorics: the probabilistic method. Championed by Paul Erdős in the 1940’s, the probabilistic method has since become a staple in extremal and Hungarian-style combinatorics.

While the applications of the probabilistic method are complex and far reaching, the fundamental idea of the method relies on a quite simple observation which we can exhibit with my favorite mathematical tool: jelly beans.

Suppose we have a glass jar of jelly beans, and inside we can see that 9 of which are red and 1 of which is green. If we drew a jelly bean at random, what is the probability of selecting the green jelly bean? We haven’t gotten to the probabilistic method yet; this is just your run of the mill probability question. The answer is of course 10\%; if we ran this experiment 10 times, we would expect to have drawn 1 green jelly bean. Now let’s turn this experiment on its head.

Suppose we have an opaque jar that is filled with jelly beans. The jar contains some mix of red and green jelly beans. It might be all red beans, all green beans, or some combination of the two; we have no way of determining whether or not a green jelly bean exists in this jar without some more information. But now here’s the kicker: with this jar comes a note that simply says “if we draw a jelly bean at random from this jar, the probability of selecting a green jelly bean is nonzero.” With this simple note we gain an important piece of information: there must be a green jelly bean in this jar if the probability of drawing one is nonzero. We cannot look too far into the note. We don’t know if there are a plethora of green jelly beans or just one. We don’t know if the green jelly beans are pristine, kidney shaped pieces of apple flavored heaven or misshapen, moldy lumps of questionable origin. But we know that a green jelly bean exists. And that is the crux of the probabilistic method.

The Probabilistic Method: Suppose we have a group of combinatorial objects C, and we want to determine whether or not one of these objects has the property P. If we select an object randomly (or following any arbitrary probability distribution)  from C and this object has P with nonzero probability, then an object in C with property P exists.

With this simple introduction to the method in hand, let’s try to see the power of it. For this we will introduce the Ramsey Numbers R(k,k). (Note: Ramsey numbers exist in a more general form than discussed here. I would suggest reading more at the link or in any graph theory book if you find this interesting; for now we will focus on the specific type of Ramsey numbers so that we may highlight the probabilistic method).

For a positive integer k, the Ramsey number R(k,k) is defined as the minimal positive integer such that any graph on R(k,k) vertices either contains a complete graph on k vertices or an independent graph on k vertices. (For those new to graph theory, a “complete graph” is one in which every possible edge exists; an independent graph is one which contains no edges). For example, we know R(3,3)>5, since the cycle graph


does not contain a triangle or a set of three vertices which are pairwise disconnected. A straightforward use of the pidgeonhole principle can be used to show that this bound is tight, i.e. R(3,3)=6. However, when k is large, much less is known about R(k,k). Mathematicians have proven R(4,4)=18 and, beyond that, we can only bound the values of R(k,k):

k Best known bound on R(k,k)
5 [43,49]
6 [102,165]
7 [205,540]
8 [282,1870]
9 [565,6588]
10 [798,23556]

Ramsey numbers are a classic example of something that is fairly easy to describe, yet notoriously hard to compute. According to Erdős himself,

“Suppose aliens invade the earth and threaten to obliterate it in a year’s time unless human beings can find the Ramsey number for red five and blue five [ R(5,5) ]. We could marshal the world’s best minds and fastest computers, and within a year we could probably calculate the value. If the aliens demanded the Ramsey number for red six and blue six [ R(6,6) ], however, we would have no choice but to launch a preemptive attack.”

In the remainder of the post we will show what is, in some sense,  the best general lower bound for R(k,k). (Below we will be using graphs whose vertices are labeled; this allows us to not have to worry about graph isomorphisms and such. One can quickly see that the results are identical for unlabeled graphs)

Theorem: For all k\geq 3, we have R(k,k)>2^{k/2-1}.

Proof: First, what does it mean to give a lower bound for R(k,k)? R(k,k) is the minimal value such that any graph on R(k,k) has a certain property. To bound R(k,k) from below by 2^{k/2-1}, one approach can be to show that there exists a graph on 2^{k/2-1} vertices without this property. Hence it is natural to consider the probabilistic method.

Consider a random graph G on n vertices. By “random” we mean each edge in G exists with probability 1/2 (this is the Erdős–Rényi model with parameter p=1/2), so that all of the possible graphs on n labeled vertices have an equal chance of being selected.

For any subgraph H on k vertices in our random graph, the probability that H is a complete graph is 2^{-\binom{k}{2}} (recall each edge appears with probability 1/2, so to get the probability of all \binom{k}{2} possible edges appearing we raise 1/2 to the power \binom{k}{2}). There are \binom{n}{k} subgraphs of G of size k, and hence the probability that G contains a complete subgraph of size k is


By the exact same reasoning (replace “every edge in my subgraph exists” with “no edge in my subgraph exists”), the probability that G contains an independent subgraph of size k is also \binom{n}{k}\cdot2^{-\binom{k}{2}}. Therefore the total probability that G satisfies the Ramsey property (either containing a complete subgraph or an independent subgraph) can be bounded above by simply adding these two probabilities (to get the exact probability we would have to subtract the probability of both events occurring), to get

Probability that G has the Ramsey property \leq 2\cdot\binom{n}{k}\cdot2^{-\binom{k}{2}}.

In particular, we have

Probability that G does not have the Ramsey property \geq 1-2\cdot\binom{n}{k}\cdot2^{-\binom{k}{2}}.

Thus to apply the probabilistic method, we want to find a value of n for which \geq 1-2\cdot\binom{n}{k}\cdot2^{-\binom{k}{2}}>0.

To do this, we apply the simple bound \binom{n}{k}<n(n-1)\dots(n-k+1)<n^k, which turns our desired inequality into


If n\leq 2^{k/2-1}, we get

2n^k\leq  2^{k^2/2-k+1},

which is certainly less than 2^{\tfrac{k(k-1)}{2}} for k\geq 3 as desired.

To summarize: if n\leq 2^{k/2-1}, then the probability that a randomly selected graph G on n vertices does not have the Ramsey property is strictly positive. Hence the probabilisitc method tells us that such a graph exists for all such n. Thus for all graphs to satisfy the Ramsey property, we may conclude that R(k,k) must be strictly greater than 2^{k/2-1}.      \square

It is worth pointing out that more subtle lower bounds for R(k,k) exist. However, quite remarkably, this “simple” lower bound does quite well in that there is no known lower bound for R(k,k) of the form c^k for any constant c>\sqrt{2}. So asymptotically the probabilistic method does quite well for our current techniques and tools.

It is also worth pointing out that there is a lot of work to do in bounding the Ramsey numbers. For one, the use of the probabilistic method tells us nothing about what kind of graphs are problematic when it comes to the Ramsey property. It is possible that there is some pattern out there, but because we worked with random graphs it is hopeless for us to find it with this method. Additionally, the best known upper bound for R(k,k) is the central binomial coefficient R(k,k)\leq \binom{2r-2}{r-1}, which grows asymptotically as \frac{4^{r-1}}{\sqrt{\pi(r-1)}}. That is much bigger than our lower bound, and leaves a lot of room for error!

As always, thanks for reading!

Posted in Combinatorics | Tagged , | Leave a comment

Groups of order 4,312 are not simple: An application of Sylow Theory

Today we will solve a problem from Wisconsin’s 2012 Algebra Qualifying Exam. The main tools used in the problem are Sylow Theory and group actions; we will take these tools for granted, but for great introductions to each topic I would recommend Keith Conrad’s notes here and here. The solution as written is not the cleanest; instead I have tried to add in some of my thoughts when solving the problem to highlight why the solution is (somewhat) natural to stumble upon. The problem is in three parts:

Problem: Let G be a finite group of order 4,312 = 2^3\cdot 7^2\cdot 11.

  • Show that G has a subgroup of order 77.

  • Show that G has a subgroup of order 7 whose normalizer has index dividing 8.

  • Conclude that G is not simple.

Let’s do this. A natural place to start the first part is with Sylow Theory. Since 77=7\cdot 11, we might try to show that a Sylow-11 subgroup is unique and hence normal, and then form a product subgroup with an element of order 7 to get the desired subgroup.

Thus let n_{11} be the number of Sylow-11 subgroups. By the third Sylow Theorem, n_{11}\equiv 1 \mod 11 and n_{11} divides 2^3\cdot 7^2. By simply checking all possibilities, we conclude n_{11}\in \{1,56\}. If n_{11}=1, then there is a unique Sylow-11 subgroup which is normal. We may then take a subgroup of size 7 (which exists by Cauchy’s theorem or by the extended first Sylow Theorem), form a product subgroup, and create a subgroup of size 77. Otherwise, we have n_{11}=56. In this case we note that (miraculously) |G|=56\cdot 77. Furthermore, the second Sylow Theorem tells us that all Sylow-11 subgroups are conjugate to one another; hence we consider the action of G on a Sylow-11 subgroup via conjugation. The orbit of this action has size 56, and thus the Orbit-Stabilizer Theorem tells us that the stabilizer subgroup has size 77 as desired. Thus in either case we have found a subgroup of size 77, and we move to part 2. (NB: The fact that there are possibly 56 Sylow-11 subgroups and that 4312=56\cdot 77 is really just a happy accident that makes for a good test problem; it would be hard to plan for this from the outset, and hence this is somewhat of a “black magic” solution; however, it does follow quite organically from the consideration of Sylow Theory (and highlights the power of the theory) which is partly why I like this problem so much).

For part two, we need a subgroup of size 7 with a large normalizer. Recall the definition of the normalizer of a subgroup K as

N_K=\{g\in G:gKg^{-1}=K\}.

Given that we just found a subgroup H of size 77, the obvious choice is to explore this subgroup a little bit more. Now H has size 77=7\cdot 11; a simple application of the third Sylow Theorem shows that H has a unique Sylow-7 subgroup K, and hence K is normal in H (in fact, one may show that H must be cyclic of order 77 by pursuing the Sylow Theory a little bit more, but that is irrelevant to the current problem). So we have a chain of subgroups

K\subset H\subset N_K\subset G.

Thus we already know 77 divides |N_K|. Now K is a p-group of size 7. Again by Sylow Theory we know that K must be contained in a Sylow-7 subgroup, say Syl_7. Now |Syl_7|=7^2; a simple fact of p-groups is that groups of order p^2 are abelian, and hence Syl_7 must also normalize K (in fact it centralizes it, but again, not important). So we also have Syl_7\subset N_K. So |N_K| is divisible by both 77 and 49, and hence 539 divides |N_K|. Looking at indices, we get |G:N_K| divides 8, as desired.

For the last part we are supposed to conclude the G is not simple. To do this, we must find a normal subgroup contained in G. The easiest normal subgroups to find in groups are kernels of homomorphisms; given that this is the last part of a problem, we should try to use the previous part to construct some sort of homomorphism with a kernel and we will be done.

Let us take the normalizer subgroup N_K which we found above. There are two natural homomorphisms we can consider when given a subgroup such as N_K, both arising from natural group actions. One is given by conjugation on N_K, and the other is the multiplication action on cosets of N_K. Since we have no idea about the number of subgroups which are conjugate to N_K, but we do know that there are relatively few cosets of N_K, let’s try the action on the cosets. Say that |G:N_K|=8, so that there are 8 cosets of N_K (the proof with fewer cosets is identical). Then the action via left multiplication of G on the set of cosets gives a map G\to S_8, the symmetric group on 8 symbols.  If this map was injective then we could identify G with a subgroup of S_8. But 11 divides |G|, and 11 does not divide |S_8|=8!; hence we would arrive at a contradiction. So this map cannot be injective, and hence must have a nontrivial kernel. Thus we are done, right? Well, not quite. It could be that the entire homomorphism was trivial, and hence the kernel is G itself. But this means that N_K must be equal to G (why?). In particular, the normalizer of our original subgroup K is all of G, and hence K itself is normal. So in any case G has a proper normal subgroup, and hence G cannot be simple.

This concludes this problem, which I hope exhibits the strength of the Sylow Theorems and also the importance of keeping natural group actions in the back of your head when tackling problems of this type. As always, thanks for reading!

Posted in Group Theory, Uncategorized | Tagged | Leave a comment

Cauchy’s Theorem

The famous Lagrange’s Theorem states that, in any finite group G, the order of any element g\in G divides the order of G. A natural follow up to this theorem is whether or not the converse holds. That is, if k\in \mathbb{N} divides the order of G, does there exist an element g\in G of order k? Unfortunately the converse is, in general, false. Examples are easy to come by; take, for instance, (\mathbb{Z}/2\mathbb{Z})^j and look for an element of order 2^i, i>1.

While the converse to Lagrange’s Theorem is false in general, Cauchy’s Theorem tells us that there are special circumstances in which the converse is true. Namely, if G is a finite group and p is a prime number dividing |G|, then we can guarantee that some element in G has order p. This means, for example, that in  the symmetric group S_n we can find an element of order p for all p\leq n (it is easy to come up with the elements here, just take a p-cycle. In general Cauchy’s Theorem does not tell you how to find such elements).

There are several proofs of Cauchy’s Theorem available. The standard first proof uses induction on the size of G, and while it is instructive, I find it to be a bit messy. Below I reproduce a (in my opinion) more elegant proof using group actions. The original version of this proof is due to James McKay.

Proof of Cauchy’s Theorem: Let G be a finite group, p a prime with p dividing |G|. Consider the set

S=\{(g_1,g_2,\dots,g_p):g_i\in G, g_1g_2\dots g_p=1\}.

In other words, S consists of tuples from G whose ordered product equals the identity. Note that once we choose g_1,\dots,g_{p-1}, the element g_p is forced to equal (g_1\cdot\dots\cdot g_{p-1})^{-1}, and thus |S|=|G|^{p-1}.

Now we make the following observation: if g_1g_2\dots g_p=1, then any cyclic permutation of the product must also yield the identity (to see this, use the relation that if ab=1 then ba=1 and induct). Thus S admits an action by the cyclic group C_p. If s\in S, then by the Orbit-Stabilizer Theorem the orbit of s must have size 1 or p. What does it mean for s to have an orbit of size 1? If this is the case, then it must be that s=(g_1,\dots,g_p) with g_1=\dots=g_p; in otherwords, s represents an element g with g^p=1. As p is prime, if we can find such a g which is not the identity element, we will have found our element of order p.

To finish, we use the fact that the orbits of a group action partition the set on which the group is acting. Thus we have


We know that (1,\dots,1) is contained in S since 1^p=1, and thus there is one orbit of size 1. Thus

|S|=1+\sum |O|.

Now |O|\in \{1,p\} as mentioned previously. Further, |S|=|G|^{p-1} is divisible by p, and thus the right hand side above is divisible by p. It follows immediately that the number of orbits of size 1 is a multiple of p, and as we have already found one, there must be at least p. Thus we can guarantee an element of order p as desired.

As always, thanks for reading!

Posted in Group Theory | Tagged | Leave a comment

What are Permutation Patterns?

Today we examine a topic known as combinatorial pattern avoidance. There has been an explosion of interest in this field of late; see, for instance, the international permutation patterns conference, or any recent issues of your favorite discrete mathematics journal. I was first introduced to this field by Bruce Sagan at Michigan State’s SURIEM, an outstanding research program for undergraduate mathematicians.

We start from the ground up. For a positive integer n, let [n]=\{1,2,3\dots, n\}. A permutation \pi is a bijection from {}[n] to itself. So, for instance, we could have \pi: [4]\to[4] with \pi(1)=2, \pi(2)=1,\pi(3)=4,\pi(4)=3. Writing a permutation in this form is incredibly unwieldy, and hence we will use the one line notation \pi=\pi_1\pi_2\dots\pi_n, where \pi_i=\pi(i). Continuing our example from above, we would have \pi=2143. (Those familiar with these constructs may ask why we are not using the cycle notation; we will be using permutations purely as combinatorial strings and disregarding their group structure, hence the form). Given a permutation \pi=\pi_1\dots\pi_n, we say that \pi has length n.

Now given any string of positive integers w=w_1\dots w_k, we may standardize w to create st(w) by replacing the smallest letter in w with a 1, the second smallest letter in w with a 2, and in general the ith smallest letter with i. For instance, w=3581 standardizes to st(w)=2341. Two words which have the same standardization are said to be order isomorphic or in the same relative ordering, for reasons which become clear upon some thought. (Really what we are doing here is choosing a standard representative for strings based on their relative ordering; for instance, we want all three letter strings which go Small, Big, Medium to be considered the same, as they all standardize to 132.

We may now define containment and avoidance. Given a permutation \pi of length n and a permutation \sigma of length k, with k\leq n, we say that \pi contains \sigma if there exist indices i_1<i_2<\dots<i_k such that \pi_{i_1}\pi_{i_2}\dots\pi_{i_k} standardizes to \sigma. If this is not the case, then we say \pi avoids \sigma. When we ask questions such as “does \pi contain \sigma?” we will refer to \sigma as the pattern, hence the name of the field.  

Let’s have a few examples. Let \pi=43521, \tau=123, and \sigma = 231. Then I claim \pi contains \sigma but avoids \tau. I would encourage you to try and verify this for yourself before reading on. For verification, note that \pi contains the  subsequence 452, which standardizes to \sigma = 231. However, \pi does not contain any increasing subsequence of length 3, and hence \pi cannot contain \tau = 123. There are a couple remarks to be made following this example. First, note that \pi contains several subsequences which standardize to \sigma, namely 452, 451, 352, and 351. For now, we won’t worry too much about counting how many subsequences standardize to our pattern, and just focus on whether or not the pattern is contained at all. The second pedagogical remark is simply to reinforce that the subsequences taken above need not be adjacent in the original permutation, they just need to be taken in their original ordering. The subject of vincular pattern avoidance deals with placing additional restrictions on the containment, such as forcing the subsequence to be consecutive. But that is a topic for another day.

Now that we have defined the notion of permutation patterns and their containment, I would like to briefly end by listing some questions that are studied in this field, and some reasons for why this field is of interest in the first place. The list is in no particular order. I have denoted reasons of interest by “ROI” and questions by “Q”

  •  ROI: A stack is a data storage unit studied and used frequently in computer science. One operation you can try to use them for is to sort a string into increasing order; it turns out that the stack sortable permutations are exactly those which avoid the pattern 231. This is a result due to Knuth, and this result is largely credited for the explosion of interest in this area.
  • Q: Fix a permutation \sigma of length k. As an explicit function of n, how many permutations of length n avoid sigma? Remarkably, if \sigma is any permutation of length 3, then the answer is given by the Catalan numbers.
  • Q: If we cannot give an explicit formula to the question above, can we give asymptotics? One of the biggest open problems in the field currently is to find a tight growth rate for the number of permutations avoiding the pattern 1324.
  • ROI: Permutation patterns are beginning to be used in dynamical modeling. See, for instance, “The Frequence of Pattern Occurrence in Random Walks” by S. Elizalde and M. Martinez.
  • Q: What is the computational complexity of determining whether or not a given permutation contains another permutation? What if we restrict the class of permutations we examine?
  • Q: Let \Pi denote the set of all permutations of any arbitrary (finite) length, and put an ordering on \Pi by \sigma\leq \pi if and only if \pi contains \sigma as a pattern. Then this can readily be checked to be a partially ordered set (poset) known as the permutation pattern poset. This has been an object of intense study recently; a large open problem is finding the Mobius function of this poset.

I think that this is a good list for now. If you are interested in more, here is a decent place to start. Alternatively, come back to my blog in a week or so! I will plan on proving some more results from this area, starting with the Catalan result in the second bullet point above.

As always, thanks for reading!

Posted in Combinatorics, Patterns | Tagged | Leave a comment

On the geometric mean of the binomial coefficients

Today we prove a remarkable asymptotic result regarding the binomial coefficients. Namely, let G_n=\left(\prod_{k=1}^n\binom{n}{k}\right)^{\tfrac{1}{n+1}}. We will show

\displaystyle \lim_{n\to\infty}G_n^{\frac{1}{n}}=\sqrt{e}.

Before getting to the proof, I should say that this is most definitely not my creation; I believe the first publication of this proof may be credited to Polya (but I am not sure). In any case, I was shown this proof at MSU in my Putnam training seminar. It was presented by Dr. Ignacio Uriarte-Tuero. Any unintended errors introduced are mine alone.

To start, we use the definition of \binom{n}{k} to write

\begin{aligned} G_n^{n+1}&=\prod_{k=1}^n\binom{n}{k}\\&=\frac{(n!)^{n+1}}{\prod_{k=1}^n(k!)^2}\end{aligned}

Now for 1\leq j\leq n, we want to examine the power of j which appears in the fraction. So, for instance, n appears n+1 times in the numerator and twice in the denominator from (n!)^2, n-1 appears n+1 times in the numerator and four times in the denominator from (n!)^2 and ((n-1)!)^2, etc. We get the formula


We now multiply by 1 in a clever way. Noting that \sum_{k=1}^n(n+1-2k)=0 we obtain


as the denominator has total exponent 0. We can now write


We want to examine G_n^{\frac{1}{n}}. Taking logarithms, we get

\frac{1}{n}\ln(G_n)=\frac{1}{n} \sum_{k=1}^n(1-\frac{2k}{n+1})\ln(1-\frac{k}{n+1})

We now recognize this expression as a Riemann sum of the function g(x)=(1-2x)\ln(1-x) on the interval {}[0,1]. (Ok, it is not exactly a Riemann sum; we should be dividing by n+1, not n. But in the limit this will not matter.) Taking limits, it follows that

\lim_{n\to\infty}\frac{\ln(G_n)}{n}=\lim_{n\to\infty}\frac{1}{n} \sum_{k=1}^n(1-\frac{2k}{n+1})\ln(1-\frac{k}{n+1})=\int_0^1(1-2x)\ln(1-x)dx

Evaluating the integral is now easy via integration by parts; it turns out that


and thus \lim_{n\to\infty}\ln(G_n^{\frac{1}{n}})=\frac{1}{2} and \lim_{n\to\infty} G_n^{\frac{1}{n}}=\sqrt{e}.

And we’re done! As always, thanks for reading!

Posted in Asymptotics, Combinatorics, Contest Math | Leave a comment