Difference between revisions of "Catalan Numbers"

From Math Images
Ballot Sequence: Consider a sequence of 2n items with n A’s and n B’s. If there are no more B’s than A’s at any point in the sequence, then the number of such sequences is the nth Catalan number.
So far we have seen a number of applications of Catalan numbers and how they are related to each other. In fact, Catalan numbers arise in over 600 examples.
However the examples vary, the applications of Catalan numbers are equivalent to one another. Recall the ballot sequence, which says that "In a sequence of 2n items with n +'s and n -'s, if there are no more -'s than +'s at any point in the sequence (in other words, every partial sum of the sequence is nonnegative), then the number of such sequences is the nth Catalan number."
Think about the parentheses. If there are more closed parentheses than open parentheses somewhere in the sequence, then it will not make sense.
Any problem that follows this rule could be solved by Catalan numbers. If interested, you could create your own example! Here is one example. A class of 400 college students is voting for their class president. 200 students vote for A, and 200 students vote for B. If B never has more votes than A at any point in the voting process, can you tell the number of sequences in which the votes could appear?

Revision as of 12:18, 6 July 2012

Worm and Apple
Field: Algebra
Image Created By: Phoebe Jiang


This greedy little worm wants to eat the poor apple. He can only go east and north in this 8 by 8 grid. Since there is a stain on the grid, he cannot pass above the diagonal connecting the worm and the apple. How many ways could he get there? The main image shows only one way of reaching the apple.
This is a very famous grid problem in combinatorics, which could be solved by Catalan numbers.

Basic Description

Catalan numbers grow rapidly. The first several Catalan numbers are listed below.

n    C_n
0    1
1    1
2    2
3    5
4    14
5    42
6    132
7    429
8    1430
9    4862
10   16796
11   58786
12   208012
13   742900
14   2674440
15   9694845
16   35357670

A more explicit and detailed description is given in the A More Mathematical Explanation section.

Why It's Interesting


The first person who discovered Catalan numbers was Leonhard Euler. In 1751, Euler discussed the number of ways to cut a polygon with lines into triangles without any of the lines intersecting in his letter to Christian Goldbach, a German mathematician.

It was a French and Belgian mathematician, Eugène Charles Catalan, who described this number sequence in a well-defined formula, and introduced this subject to solve parentheses expressions.

Before Catalan, the Mongolian mathematician Minggatu was the first person in China to establish and apply what would later be known as Catalan numbers. In the 1730s, he introduced this sequence of numbers and used it to express series expansions of \sin(ma), where m = 2, 3, 4, 5, 10, 100, 1000, and 10000. This topic was included in his book, Ge Yuan Mi Lu Jie Fa (The Quick Method for Obtaining the Precise Ratio of Division of a Circle).


In this section we will consider the 10 most representative examples of applications of Catalan numbers that arise in a variety of combinatorial problems. Examples are indicated with bullet points.

Stacking Coins

  • We are going to stack coins on a bottom row that consists of n consecutive coins. Coins may not be placed beside the bottom row, so every higher coin must rest on two coins below it. How many ways are there to stack coins on the n coins?
n: The number of coins in the bottom row.
Solution: C_n.

Balanced Parentheses

  • We want to group a string of parentheses. Each open parenthesis must have a matching closed parenthesis. Therefore, "(( )( ))" is valid, but ")( )) ((" and "( ))( ) (" are not. How many valid ways are there to arrange n pairs of parentheses?
n: The number of pairs of parentheses.
Solution: C_n.

n = 0:
Do Nothing!
1 solution
n = 1:
( )
1 solution
n = 2:
(( ))
( )( )
2 solutions
n = 3:
( (( )) )
( ( )( ) )
(( )) ( )
( ) (( ))
( )( )( )
5 solutions
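
The listing above can be reproduced mechanically. The following is a minimal Python sketch, not part of the original page: the function name balanced_parentheses is a hypothetical helper, and the rule it applies is simply that a closed parenthesis may be added only while fewer closed than open parentheses have been written.

```python
def balanced_parentheses(n):
    """Return all valid strings made of n pairs of parentheses."""
    results = []

    def build(prefix, opened, closed):
        # opened / closed count the "(" and ")" already placed in prefix.
        if opened == n and closed == n:
            results.append(prefix)
            return
        if opened < n:
            # We may still open a new pair.
            build(prefix + "(", opened + 1, closed)
        if closed < opened:
            # We may close only a pair that is currently open.
            build(prefix + ")", opened, closed + 1)

    build("", 0, 0)
    return results


if __name__ == "__main__":
    for n in range(4):
        strings = balanced_parentheses(n)
        print(n, len(strings), strings)
```

For n = 3 this produces the five strings in the same order as the listing, and the counts for n = 0, 1, 2, 3 are 1, 1, 2, 5.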

  • Many other applications are equivalent to balanced parentheses, and here is one example. If we want to connect 2n dots lying on a horizontal line in the plane with n nonintersecting arcs, the solution is also the Catalan sequence. Each arc connecting the two dots is equivalent to a pair of parentheses, with the left dot equivalent to an open parenthesis and the right dot equivalent to a closed parenthesis.

Mountain Ranges

  • We want to form mountain ranges on a line with n upstrokes and n downstrokes. As in the parentheses grouping problem, each upstroke must have a matching downstroke. How many mountain ranges are there for each value of n?
n: The number of pairs of upstrokes and downstrokes.
n = 0:
Do Nothing!
1 solution
n = 1:
/\
1 solution
n = 2:
/\/\

 /\
/  \
2 solutions
n = 3:
/\/\/\

   /\
/\/  \

 /\
/  \/\

 /\/\
/    \

  /\
 /  \
/    \
5 solutions
  • Note that a pair of strokes and a pair of parentheses are equivalent: upstrokes are equivalent to open parentheses, and downstrokes are equivalent to closed parentheses. One pair of parentheses lying inside another pair corresponds to one pair of strokes lying on top of another pair, which forms the shape of mountain ranges.

Polygon Triangulation

  • We want to cut convex polygons into triangles by connecting the vertices with straight, non-intersecting lines. How many different ways are there for a polygon with n+2 sides? This is the application Euler was interested in.
n: The number of sides of the polygon minus 2.
Solution: C_n
Note that a 2-sided polygon is said to be triangulated in exactly one way, by doing nothing, so C_0 = 1.

Binary Trees

n: The number of internal nodes on full binary trees.
Note that when there is only one node, we have one solution, which is the node itself, so it matches with C_0 = 1.
In summary, a full binary tree with n internal nodes has 2n + 1 nodes, 2n branches and n+1 leaves.

  • Other variations of binary trees and plane trees are also counted by the Catalan sequence:
1. Binary trees with n vertices [1].
2. Plane trees with n+1 vertices [1].

Binary Paths

  • In an n × n grid, we are going to join the lower left point A and the upper right point B by a path. We are only allowed to go one unit to the right or upwards at each step, and cannot pass above the diagonal connecting A and B. How many such paths are there?
n: The side length of the grid.
Solution: C_n. (Thus, the answer to the main image is C_8.)
Figure-1 An example of Dyck Path.
1. Did you notice that these kinds of paths look a lot like mountain ranges if you rotate them counterclockwise about the origin until the diagonal is horizontal?
2. Did you notice that, whatever the value of n is, the first step is always to the east and the last step is always to the north? It is because we cannot pass above the diagonal.

  • This kind of lattice walk is also known as a Dyck path. In the Cartesian coordinate system, a Dyck path is a walk from (0, 0) to (n, n) in an n × n lattice that is composed of one-unit steps only in the positive x and positive y directions and never passes above the line y = x (see Figure-1). Other variations of Dyck paths turn out to follow the sequence of Catalan numbers as well. In the two examples below, the path is drawn in its rotated form, with steps (1, 1) and (1, -1), and never goes below the x-axis:
1. Dyck paths from (0, 0) to (2n + 2, 0) such that any maximal sequence of consecutive steps (1, -1) ending on the x-axis has odd length [1].
2. Dyck paths from (0, 0) to (2n + 2, 0) with no peaks at height two [1].
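
The grid paths from (0, 0) to (n, n) that stay on or below the line y = x can be counted point by point. Here is a small Python sketch under that definition; the function name count_paths_below_diagonal and the table name ways are hypothetical choices made for this illustration.

```python
def count_paths_below_diagonal(n):
    """Count east/north lattice paths from (0, 0) to (n, n) with y <= x throughout."""
    # ways[x][y] = number of allowed paths from (0, 0) to (x, y).
    # Points above the diagonal (y > x) are never filled in and stay 0.
    ways = [[0] * (n + 1) for _ in range(n + 1)]
    ways[0][0] = 1
    for x in range(n + 1):
        for y in range(x + 1):
            if x == 0 and y == 0:
                continue
            from_west = ways[x - 1][y] if x > 0 else 0   # last step went east
            from_south = ways[x][y - 1] if y > 0 else 0  # last step went north
            ways[x][y] = from_west + from_south
    return ways[n][n]


if __name__ == "__main__":
    print([count_paths_below_diagonal(n) for n in range(9)])
```

For the 8 by 8 grid of the main image this gives 1430, which agrees with C_8 in the table of Catalan numbers.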


  • A permutation of {1, 2, ... , n} is a rearrangement of the n numbers. For example, there are 6 permutations of {1, 2, 3}: (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1). A 123-avoiding permutation is one that contains no increasing subsequence of 3 terms (the 3 terms do not have to be consecutive). Therefore, we should avoid (1, 2, 3) for n=3. Taking n=4 as another example, (4, 3, 1, 2) is valid, but (4, 1, 2, 3) is not valid because of the subsequence 123, and neither is (2, 3, 1, 4) because of 234.
n: The number of elements being permuted.
Solution: C_n.
n = 0:
(the empty permutation)
1 solution
n = 1:
(1)
1 solution
n = 2:
(1, 2), (2, 1).
2 solutions
n = 3:
(1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1).
5 solutions
n = 4:
(1, 4, 3, 2), (2, 1, 4, 3), (2, 4, 1, 3), (2, 4, 3, 1),
(3, 1, 4, 2), (3, 2, 1, 4), (3, 2, 4, 1), (3, 4, 1, 2), (3, 4, 2, 1),
(4, 1, 3, 2), (4, 2, 1, 3), (4, 2, 3, 1), (4, 3, 1, 2), (4, 3, 2, 1).
14 solutions
Note that a 123-avoiding permutation only avoids increasing subsequences of three terms, regardless of the value of n. Therefore, for n = 2 both (1, 2) and (2, 1) are valid, even though (1, 2) is itself increasing.
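
For small n the 123-avoiding permutations can be found by brute force. The sketch below is only an illustration in Python; avoids_123 and avoiding_permutations are hypothetical helper names, and the approach (test every triple of positions) is practical only for small n.

```python
from itertools import combinations, permutations


def avoids_123(perm):
    """True if perm has no increasing subsequence of three terms."""
    # combinations() keeps the entries in their original left-to-right order.
    return not any(a < b < c for a, b, c in combinations(perm, 3))


def avoiding_permutations(n):
    """All 123-avoiding permutations of 1, 2, ..., n."""
    return [p for p in permutations(range(1, n + 1)) if avoids_123(p)]


if __name__ == "__main__":
    for n in range(5):
        found = avoiding_permutations(n)
        print(n, len(found), found)
```

The counts for n = 0, 1, 2, 3, 4 are 1, 1, 2, 5, 14, matching the listing.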

  • Similarly, a 321-avoiding permutation of [n] = {1, 2, ... , n} avoids a decreasing subsequence of 3 terms. Taking n = 3 as an example, we have (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2). And a permutation of [n] is called 132-avoiding "if it does not have three entries a < b < c so that a is the leftmost of them and b is the rightmost of them" [2].

  • Catalan numbers count shuffles of the permutation 1,2, \cdots, n with itself, i.e., permutations of the multiset \left \{ 1^2, 2^2, \cdots , n^2 \right \} which are a union of two disjoint subsequences 1,2, \cdots, n . On top of this, there should be no weakly decreasing subsequence of length three . [3]

Explanations. A shuffle of 1, 2, \cdots, n and itself 1, 2, \cdots, n is obtained by intermixing the letters in each string of numbers, while the letters in each string must stay in the original order. For example, a shuffle of 123 and 456 could be: 124536 . No weakly decreasing subsequence of length three means that every subsequence of three terms either is strictly increasing or has at most two equal entries not followed by a decrease. For example, 121233 is valid (because it has only two equal entries instead of three, with no decrease following them), but 112332 is not (because there is a decrease, 2, following 33).
Back to the application, let's see an example of n = 3, where we want to know the number of shuffles of permutations of 1, 2, 3 with 1, 2, 3. It turns out that there are 5 distinct shuffles:
112233 112323 121233 121323 123123.
Note that they are distinct shuffles. In fact, for each single shuffle, some of the entries may come from either string of 1, 2, 3, although the sequence appears the same. For instance, the first shuffle 112233 has multiplicity 8: in each of the three pairs 11, 22 and 33, either string may supply the first entry, which gives 2 × 2 × 2 = 8 ways.

  • The Catalan sequence also counts the sequences a_1 a_2 a_3 \cdots a_{2n} formed from the integers 1, 1, 2, 2, 3, 3, ... n, n such that
a) the integers 1, 2, 3, ... , n are in increasing order when they first occur, and
b) there is no subsequence of the form \alpha\beta\alpha\beta, where the four entries do not have to be consecutive.
For example, 1212 and 122313 are not valid. Here are the solutions where n = 3:
112233, 112332, 122331, 122133, 123321.
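
Conditions a) and b) can be checked by brute force for small n. The following Python sketch is an illustration only; the helper names first_occurrences_increasing, has_abab and valid_sequences are hypothetical, and the search over all arrangements is practical only for small n.

```python
from itertools import combinations, permutations


def first_occurrences_increasing(seq):
    """Condition a): the values 1, 2, ..., n first appear in increasing order."""
    seen = []
    for value in seq:
        if value not in seen:
            seen.append(value)
    return seen == sorted(seen)


def has_abab(seq):
    """Condition b) fails if some four positions read alpha, beta, alpha, beta."""
    for i, j, k, m in combinations(range(len(seq)), 4):
        if seq[i] == seq[k] and seq[j] == seq[m] and seq[i] != seq[j]:
            return True
    return False


def valid_sequences(n):
    """All arrangements of 1, 1, 2, 2, ..., n, n satisfying a) and b), as strings."""
    multiset = [value for value in range(1, n + 1) for _ in range(2)]
    distinct = sorted(set(permutations(multiset)))
    return ["".join(map(str, s)) for s in distinct
            if first_occurrences_increasing(s) and not has_abab(s)]


if __name__ == "__main__":
    print(valid_sequences(3))
```

For n = 3 the result contains the same five sequences as listed above, in sorted order.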

Young Diagrams

Figure-2 Partition of 4.

In combinatorics, a partition of a positive integer n is a way of writing n as a sum of positive integers, where the order of the summands does not matter. The sum is called a composition if order matters. Taking n=4 as an example, there are 5 ways to partition 4: 4, 3+1, 2+2, 2+1+1, 1+1+1+1. Remember, the ordering of the integers does not matter, i.e. 1+1+2, 1+2+1, 2+1+1 are the same partition.

Partitions could be visualized in explicit graphs, and the most commonly used one is called Young diagrams. Again, take n = 4 and n = 5 as two examples for better understanding. Young diagrams of partition 4 and 5 are shown in Figure-2 and Figure-3.

Figure-3 Partition of 5.
  • Young diagrams that fit in the shape (n - 1, n - 2, ... , 1) follow Catalan sequence [1].
Explanations. The shape (n - 1, n - 2, ... , 1) is a Young diagram that looks like an upside down staircase. Its top stair consists of n - 1 blocks, the next stair of n - 2 blocks, and so on until the last stair has only 1 block. By "fit," we mean that we try to find Young diagrams that could be a part of the shape (n - 1, n - 2, ... , 1) or the entire shape.
In this image above, the last figure is the shape (2, 1) where n = 3. Each of the other four figures, including the empty set, could be a part of the shape. Add the original shape, then we have five solutions for shape (2,1) in total.
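
The diagrams that fit in the staircase can also be listed by a short program. In the Python sketch below, a Young diagram is written as a tuple of row lengths; the function name diagrams_in_staircase is a hypothetical helper, and the fitting rule used is that row i has at most n - i blocks and no row is longer than the row above it.

```python
def diagrams_in_staircase(n):
    """List the Young diagrams (tuples of row lengths) inside (n-1, n-2, ..., 1)."""
    results = []

    def extend(rows, row_index):
        # Every prefix of rows built so far is itself a diagram that fits.
        results.append(tuple(rows))
        if row_index > n - 1:
            return  # the staircase has only n - 1 rows
        max_length = n - row_index  # row i of the staircase has n - i blocks
        if rows:
            max_length = min(max_length, rows[-1])  # rows may not get longer
        for length in range(1, max_length + 1):
            extend(rows + [length], row_index + 1)

    extend([], 1)
    return results


if __name__ == "__main__":
    print(diagrams_in_staircase(3))
```

For n = 3 this returns the empty diagram together with (1), (1, 1), (2) and (2, 1), five diagrams in total.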


A partially ordered set P, or poset P for short, is a set together with a binary relation denoted \le , satisfying the following three axioms, where x, y and z are arbitrary elements of P:

  1. For all x \in P, x \le x. (reflexivity)
  2. If x \le y, and  y \le x , then  x = y. (antisymmetry)
  3. If x \le y, and  y \le z , then  x \le z. (transitivity) [1]

A Hasse diagram is used to represent a finite poset. Each element of the poset is a vertex in the Hasse diagram. The order relation of the poset is represented by lines going up from one vertex to another. The lines may cross each other, but a line cannot touch any other vertex before it reaches its endpoint. It is the line segments and labeled vertices in the diagram that illustrate the partial order of a set. See Figure-4, -5, -6 for several examples of Hasse diagrams. As you can see, the elements of a poset can be any objects: sets, diagrams or numbers.

How do Hasse diagrams embody the three axioms of posets? In Figure 4,

  1. Each element in the diagram is related to itself, i.e., the set {a} is less than or equal to itself. This shows reflexivity.
  2. Since {a} is less than or equal to {a} in both directions, {a} = {a}. Two different sets, for example {a} and {b}, cannot each be less than or equal to the other. This is the idea of antisymmetry.
  3. Since the empty set is less than or equal to the set {a}, and the set {a} is less than or equal to the set {a, b}, the empty set is less than or equal to {a, b}. In the diagram, the three sets are connected by 2 line segments. This shows that if we can follow the lines going from a bottom element up to a top one, then all the elements on the way obey transitivity. Likewise, we can tell that {b} is less than or equal to {a, b, c} according to the diagram.

  • Linear extensions of the poset 2 × n follow Catalan sequence [1] (See Figure 7).
Figure-7 This is the Hasse diagram of the poset 2 × n .
Explanations. If n = 3, the poset has six elements, labeled 1 to 6 in its Hasse diagram. A linear extension is obtained by rewriting the Hasse diagram on a line, and people read it from the bottom up as they read Hasse diagrams. How do we know where to put the elements? Starting from the bottom element 1, we write each element of the Hasse diagram only after the elements connected to it from below are already written. Hence, a poset can have more than one linear extension. The gif pictures will help you comprehend the process.
Repeat the same steps as shown in Figure-8 and Figure-9, and we will get 5 linear extensions. The starting and ending points never change, whereas the points in between vary.
123456 , 123546, 132456, 132546, 135246
The number of linear extensions of the poset 2 × n turns out to be the nth Catalan number.
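
The five linear extensions above can be regenerated by following the rule "write an element only after everything below it is written". The Python sketch below assumes a particular labeling that is inferred from the list above and is not stated on the page: the odd labels 1, 3, ..., 2n - 1 form one chain, the even labels 2, 4, ..., 2n form the other, and each odd label 2k - 1 lies directly below 2k. The function name linear_extensions is hypothetical.

```python
def linear_extensions(n):
    """Linear extensions of the poset 2 x n under the ASSUMED labeling described above."""
    # below[label] lists the elements that must be written before label.
    below = {}
    for k in range(1, n + 1):
        odd, even = 2 * k - 1, 2 * k
        below[odd] = [odd - 2] if k > 1 else []
        below[even] = [odd] + ([even - 2] if k > 1 else [])

    results = []

    def place(order, placed):
        if len(order) == 2 * n:
            results.append(tuple(order))
            return
        for label in range(1, 2 * n + 1):
            # A label may be written once all elements below it are written.
            if label not in placed and all(p in placed for p in below[label]):
                place(order + [label], placed | {label})

    place([], frozenset())
    return results


if __name__ == "__main__":
    for extension in linear_extensions(3):
        print("".join(map(str, extension)))
```

Under this assumed labeling, n = 3 gives 123456, 123546, 132456, 132546, 135246, the same five extensions as above.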

Pascal's Triangle


If you take the difference between each number in the middle column, which appears on the odd rows, and the number in the adjacent column of the same row, you will find the Catalan sequence.

A More Mathematical Explanation

Note: understanding of this explanation requires: *combinatorics

Basic Description

The nth Catalan number is defined as

C_{n} = \frac{1}{n+1} \binom{2n}{n} = \frac{(2n)!}{n!(n+1)!}, \qquad n = 0, 1, 2, ...

The binomial coefficient, \tbinom{n}{r}, pronounced as n choose r, represents the number of possible combinations of r objects from a collection of  n objects:

\binom{n}{r} = \frac{n!}{r! (n-r)!} .


\binom{2n}{n} = \frac{(2n)!}{n! (2n -n)!} = \frac{2n \cdot (2n -1) \cdot (2n -2) \cdots (2n - n + 1)}{n!}

Example: \binom{11}{4} = \frac{11!}{4!~(11-4)!} = \frac{11 \times 10 \times 9 \times 8}{4 \times 3 \times 2 \times 1} = 330.
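
The closed formula is easy to evaluate by machine. The following Python sketch uses the standard library function math.comb for the binomial coefficient; the function name catalan is a hypothetical choice for this illustration.

```python
from math import comb


def catalan(n):
    """nth Catalan number from the closed formula C_n = binom(2n, n) / (n + 1)."""
    # The division is always exact, so integer division is safe here.
    return comb(2 * n, n) // (n + 1)


if __name__ == "__main__":
    print(comb(11, 4))                      # the binomial example above
    print([catalan(n) for n in range(17)])  # C_0 through C_16
```

The second line of output reproduces the values C_0 through C_16 from the table in the Basic Description.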

Catalan numbers could be described in various but equivalent ways. If you transform the first formula just a little bit, you will get another useful formula of Catalan numbers.

C_{n} = \binom{2n}{n} - \binom{2n}{n + 1}, \qquad n = 0, 1, 2, ...

See the Proof section below for a proof of this formula. Note that \tbinom{2n}{n} , \tbinom{2n}{n+1} \in \mathbb{N} and \tbinom{2n}{n} > \tbinom{2n}{n + 1} . Therefore, C_n is the difference between two natural numbers, which can be read off from Pascal's Triangle.
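
The connection with Pascal's Triangle can be tried out directly. The Python sketch below builds the triangle by repeated addition and subtracts the neighbor of the middle entry; rows are indexed from 0 here (so the row used is row 2n), and the names pascal_rows and catalan_from_pascal are hypothetical.

```python
def pascal_rows(count):
    """Return the first `count` rows of Pascal's Triangle (count >= 1)."""
    rows = [[1]]
    for _ in range(count - 1):
        prev = rows[-1]
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows


def catalan_from_pascal(n):
    """Middle entry of row 2n minus the entry to its right."""
    row = pascal_rows(2 * n + 1)[2 * n]
    middle = row[n]
    # Row 0 has no entry to the right of the middle; binom(0, 1) = 0.
    neighbor = row[n + 1] if n + 1 < len(row) else 0
    return middle - neighbor


if __name__ == "__main__":
    print([catalan_from_pascal(n) for n in range(8)])
```

For example, row 4 of the triangle is 1 4 6 4 1, and 6 - 4 = 2 = C_2.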


  • Prove that C_{n} = \binom{2n}{n} - \binom{2n}{n + 1}, n = 0, 1, 2, ... is the formula for the Catalan sequence.
1. Check that it is true when n = 0.
C_{0} = \binom{2(0)}{0} - \binom{2(0)}{0 + 1} = \binom{0}{0} - \binom{0}{1} = 1 - 0 = 1 .
2. Show that it is true when n \geqslant 1.
C_n = \frac{1}{n+1} {2n \choose n} = {2n \choose n} - \frac{n}{n+1} {2n \choose n} = \frac{(2n)!}{n!n!} - \frac{n}{n+1} \frac{(2n)!}{n!n!} = \frac{(2n)!}{n!n!} - \frac{(2n)!}{(n+1)!(n-1)!} = {2n \choose n} - {2n \choose n+1}. \blacksquare

  • Prove this recurrence relation: C_{n+1} = \frac{2(2n+1)}{n+2} C_n for n=0, 1, 2, ... .
Recall that
C_n = \frac{(2n)!}{n!(n+1)!}, \qquad n = 0, 1, 2, \cdots.
Therefore, we have
C_{n+1} = \frac{(2n+2)!}{(n+1)!(n+2)!}, \qquad n = 0, 1, 2, \cdots.
Take the ratio of C_{n+1} to C_n:
\frac{C_{n+1}}{C_n} = \frac{ \frac{(2n+2)!}{(n+1)!(n+2)!}}{ \frac{(2n)!}{n!(n+1)!}} = \frac{ (2n+2)! n! (n+1)!} {(n+1)! (n+2)! (2n)!} = \frac{(2n+2) (2n+1)}{(n+1) (n+2)} = \frac{ 2(2n+1)}{(n+2)}.
C_{n+1} = \frac{ 2( 2n+1)}{(n+2)} C_n
for all nonnegative values of n. \blacksquare
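
The recurrence just proved gives a quick way to compute Catalan numbers one after another without factorials. A minimal Python sketch, with the hypothetical function name catalan_by_ratio:

```python
def catalan_by_ratio(count):
    """Return [C_0, ..., C_{count-1}] using C_{n+1} = 2(2n+1)/(n+2) * C_n (count >= 1)."""
    values = [1]  # C_0 = 1
    for n in range(count - 1):
        # Multiply first, then divide: the product is always divisible by n + 2.
        values.append(values[-1] * 2 * (2 * n + 1) // (n + 2))
    return values


if __name__ == "__main__":
    print(catalan_by_ratio(10))
```

Multiplying before dividing keeps every intermediate value an integer, because the result C_{n+1} is itself an integer.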

Recursive Definition

We have seen various kinds of applications of Catalan numbers so far: "Stacking Coins," "Balanced Parentheses," "Mountain Ranges," "Polygon Triangulation," "Binary Trees," "Binary Paths," "Permutation," "Young Diagrams" and "Posets." In fact, all the sequences are equivalent, and we will show that there is a common formula that counts them all.

C_0 = 1, \quad C_n = \sum_{i=0}^{n-1} C_i C_{n-1-i} \text{ for  } n\ge 1 .

In the application Balanced Parentheses, we already know that for each open parenthesis, there is a closed parenthesis. Now, let's try to find a pattern in these paired parentheses with the example n = 3:

( (( )) ) - ( ( )( ) ) - (( )) ( ) - ( ) (( )) - ( )( )( ) .

The "pattern" is that we can always separate them in two collections. For example, we can separate the set "(( )) ( )" into: "(( ))" and "( )." We name them collection A and collection B, either of which is able to contain zero pairs of parentheses. Similarly, ( (( )) ) could be separated into " ( (( )) ) " and nothing. For ( ( )( ) ) , we treat it as a whole and put it in collection A, so B is, again, empty.

What about ( )( )( ) ? At first glance, we see three pairs of parentheses, but we have only two collections. We could put the first two pairs in collection A and the last pair in B, or put one pair in A and two in B. Both choices describe the same string, so we need a rule that avoids counting it twice. Since n is no less than 1 in the recurrence above, there is at least one pair of parentheses. We take the first open parenthesis together with its matching closed parenthesis and fix this pair in collection A. Thus, the simplest form where n = 1 is:

 ( \quad  )  \quad {\color{White} ( ) }
 {\color{Maroon}A}  \quad   {\color{Blue}B}   ,

and this is our base form. For values of n that are greater than 1, we add more pairs of parentheses inside the fixed black parentheses, which belong to collection A, and place the rest in collection B. In this way, collections A and B can each contain up to n - 1 pairs of parentheses (the black parentheses in the base form do not count as one of them). If collection A contains k pairs, then collection B contains n - ( k + 1) pairs.

What is the purpose of separating the parentheses into two collections? We use the two collections A and B to count the combinations of parentheses systematically: if A has 0 pairs, B has n - 1 pairs; if A has 1 pair, B has n - 2 pairs; if A has 2 pairs, B has n - 3 pairs, etc.

Number of Pairs Contained in A | Number of Pairs Contained in B | Arrangement (A in maroon, B in blue) | Number of Solutions for Each Situation
0 | n - 1 | \Big( \quad  \Big) {\color{Blue}( \cdots ) \cdots} | C_0 C_{n-1}
1 | n - 2 | \Big( \  {\color{Maroon} ( \ )} \ \Big)  {\color{Blue}(  \cdots  ) \cdots} | C_1 C_{n-2}
\vdots | \vdots | | \vdots
n - 1 | 0 | \Big( \ {\color{Maroon} ( ( \cdots ) )( \cdots ) \cdots} \  \Big) | C_{n - 1} C_0

Add up all of the situations, and we get the total number:

C_n = C_0 C_{n-1} + C_1 C_{n-2} + C_2 C_{n-3} + \cdots + C_{n-2} C_1 + C_{n-1} C_0.

This formula is the recursive relation that we are looking for. Plugging actual numbers may help you understand this great formula.
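
Plugging in actual numbers can be done by hand, for example C_3 = C_0 C_2 + C_1 C_1 + C_2 C_0 = 2 + 1 + 2 = 5, or with a few lines of code. In the Python sketch below, the function name catalan_by_convolution is hypothetical; the index k plays the role of the number of pairs in collection A, so that collection B holds the remaining n - 1 - k pairs.

```python
def catalan_by_convolution(count):
    """Return [C_0, ..., C_{count-1}] from C_n = sum of C_k * C_{n-1-k} (count >= 1)."""
    values = [1]  # C_0 = 1
    for n in range(1, count):
        # k pairs go in collection A, the other n - 1 - k pairs go in collection B.
        values.append(sum(values[k] * values[n - 1 - k] for k in range(n)))
    return values


if __name__ == "__main__":
    print(catalan_by_convolution(10))
```

The output agrees with the table of Catalan numbers in the Basic Description, which is a useful check that the recursive relation and the closed formula describe the same sequence.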




  1. Stanley, Richard P. Enumerative Combinatorics, vol.2. Cambridge University Press. New York/Cambridge. 1999.
  2. Dowling, Thomas A. Catalan Numbers. Department of Mathematics, Ohio State University. Retrieved from http://www.mhhe.com/math/advmath/rosen/r5/instructor/applications/ch07.pdf.
  3. Stanley, Richard P. Catalan Addendum. MIT Mathematics. Version of 22 October 2011. (No.t^6) Retrieved from Richard Stanley's home page http://www-math.mit.edu/~rstan/ec/.

[5] Stanley, Richard P. Enumerative Combinatorics, vol.1. Cambridge University Press. New York/Cambridge. 1999.

[6] Campbell, Douglas M.. The Computation of Catalan Numbers. Mathematics Magazine, Vol. 57, No.4 (Sep., 1984), pp. 195 - 208.

[7] Choo, Koo-Guan. Catalan Numbers. Retrieved from http://www.maths.usyd.edu.au/u/kooc/catalan.html.

[8] Britz, Thomas. Cameron, Peter. Partially ordered sets. 2001. Retrieved from http://www.maths.qmul.ac.uk/~pjc/csgnotes/posets.pdf.
