
Using the mean to find the mode of a Binomial Distribution

The original motivation behind this investigation was an attempt to save my Statistics students a few precious seconds in their upcoming S1 module paper.

The mean, or expectation, of a Binomial Distribution is always very close to the mode (the value of X that has the greatest probability). I want to know whether you can use the mean to reliably predict the mode.

Binomial Distributions come up all over the place. A classic example would be where you try to score, say, a 5 with an ordinary dice. You perform n trials, and the probability of success on a particular go is 1/6. On each trial you’ll either succeed, i.e. score a 5 (the probability of which is 1/6) or you’ll fail, i.e. not score a 5 (the probability of which is 5/6). X is simply the number of times you score a 5 out of those n trials. Binomial literally means ‘two names’, i.e. two terms: here, the probability of success and the probability of failure, which must add up to 1.

But we can be very general.  In the following example we are not even given the context of the experiment; we’re just given n which is the number of trials, and p which is the probability of success on each separate trial.

Example Problem
Let X be the number of successes and X~B(20,0.42)
i.e. X is binomially distributed over 20 trials, with probability of success 0.42 in each trial. Find:

(i) The expected number of successes (the mean).
(ii) The most likely number of successes (the mode).

Part (i) is straightforward. Just use the formula np.
20×0.42 = 8.4

Part (ii) is only a little fiddlier. Textbooks recommend using the expectation np as a guide, and calculating the probability that X takes each of the integer values either side of np. The answer is the value of X that yields the greater probability:

P(X=8) = 20C8 × (0.42)^8 × (0.58)^12 = 0.1768 (4 s.f.)
P(X=9) = 20C9 × (0.42)^9 × (0.58)^11 = 0.1707 (4 s.f.)
So the mode is 8 successes.
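
The two-value comparison above is easy to check by machine. The following is a minimal sketch using only the Python standard library; the helper name `binom_pmf` is made up for this illustration.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p), straight from nCk * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.42
mean = n * p                      # part (i): expectation np
p8 = binom_pmf(8, n, p)           # the integer just below np
p9 = binom_pmf(9, n, p)           # the integer just above np

# Part (ii) by brute force: the k in 0..n with the largest probability.
mode = max(range(n + 1), key=lambda k: binom_pmf(k, n, p))

print(f"mean   = {mean:.1f}")
print(f"P(X=8) = {p8:.4f}")
print(f"P(X=9) = {p9:.4f}")
print(f"mode   = {mode}")
```

The brute-force mode agrees with the hand calculation: 8 beats 9, and no other value of X does better.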

Is there a quicker way that doesn’t involve using the formula nCr × p^r × (1-p)^(n-r)?
In this example, the expectation, np, rounded to the nearest integer, is 8.  Is it a coincidence that this value is also the most likely number of successes?

Most textbook problems like this one are such that np rounded to the nearest integer gives the most likely number of successes.  If this were a reliable fact for any n and p, then we could take all calculations away from the solution to part (ii) and simply write down the answer, saving a minute or more in an exam.  So let’s see if we can prove or disprove it…


“If X~B(n,p) then the mode is equal to the expectation rounded to the nearest integer.”

Consider the possible shapes of a binomial distribution:

Symmetrical (p=0.5)
When n is even, the hypothesis clearly holds:

When n is odd, np will be exactly halfway between two integers, and will round to the greater of them. This is ok because the distribution is bimodal and we’re still ending up with one of the two modes:
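
The symmetrical case can be checked numerically. This is a small sketch under two assumptions: exact fractions are used so that ties between probabilities are detected reliably, and halves are rounded up (Python's built-in `round` rounds halves to the nearest even integer, so it is deliberately avoided). The function names are invented for the illustration.

```python
from fractions import Fraction
from math import comb, floor

def modes(n, p):
    """All k that maximise P(X = k) for X ~ B(n, p); p should be a Fraction."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    top = max(pmf)
    return [k for k, value in enumerate(pmf) if value == top]

def round_half_up(x):
    """Round to the nearest integer, with exact halves going up."""
    return floor(x + Fraction(1, 2))

half = Fraction(1, 2)
for n in (6, 7):
    mean = n * half
    print(f"n={n}: np={float(mean)}, rounded={round_half_up(mean)}, modes={modes(n, half)}")
```

For even n there is a single mode at np; for odd n there are two equal modes either side of np, and rounding up lands on the larger of them.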

Asymmetrical or ‘skew’ (p≠0.5)
Usually there is one value of X that has greater probability than any of the others. Consider these two similar distributions (notice the value of p is slightly different in each):


The value of p changes only slightly from 0.44 to 0.45, but the most likely value of X has now jumped from X=3 to X=4.  So there must be a value of p between 0.44 and 0.45 where P(X=3) and P(X=4) are exactly equal.
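
The jump can be reproduced numerically. The value of n is not stated in the text here; n = 8 is an assumption, inferred because it is the value for which the mode moves from 3 to 4 between p = 0.44 and p = 0.45. The sketch also tests p = 4/9 as a candidate for the crossover (the general formula is derived below).

```python
from fractions import Fraction
from math import comb

def pmf(k, n, p):
    """P(X = k) for X ~ B(n, p); works for floats and Fractions alike."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def mode(n, p):
    """Most likely value of X, found by brute force over 0..n."""
    return max(range(n + 1), key=lambda k: pmf(k, n, p))

n = 8  # assumption: inferred from the quoted figures, not given in the text

mode_at_044 = mode(n, 0.44)
mode_at_045 = mode(n, 0.45)
print(mode_at_044, mode_at_045)

# Candidate crossover value, checked with exact arithmetic.
crossover = Fraction(4, 9)
print(pmf(3, n, crossover) == pmf(4, n, crossover))
```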

We can find this in the general case, for any n and for any pair P(X=r) and P(X=r-1):
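
A sketch of that working, assuming the usual binomial probability formula and 0 < p < 1 (so that the common factors of p and 1-p can be cancelled):

```latex
\begin{aligned}
P(X=r) &= P(X=r-1) \\
\binom{n}{r}\,p^{r}(1-p)^{n-r} &= \binom{n}{r-1}\,p^{r-1}(1-p)^{n-r+1} \\
\frac{n!}{r!\,(n-r)!}\;p &= \frac{n!}{(r-1)!\,(n-r+1)!}\;(1-p) \\
(n-r+1)\,p &= r\,(1-p) \\
(n+1)\,p &= r \\
p &= \frac{r}{n+1}
\end{aligned}
```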

This can be derived in either direction, so: if there exists a positive integer r ≤ n such that p = r/(n+1), then P(X=r) = P(X=r-1).

But, also, there is a tipping point at which np stops being rounded up and starts being rounded down: this is halfway between the two integers, at np = r-½.  Divide both sides of this equation by n, to give p=(r-½)/n. Now we have two expressions for p.

If the value of p can fall between r/(n+1) and (r-½)/n then the hypothesis does not hold, i.e. we cannot confidently round the expectation to the nearest integer to calculate the mode.

Example: Given n=20 and r=8, then r/(n+1) = 0.381 and (r-½)/n = 0.375
If p takes a value between these figures then the hypothesis does not hold.  Given how close these figures are to each other, this explains why a counterexample seemed so elusive.
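
To see a counterexample in action, pick any p inside that window. The value p = 0.378 below is a hypothetical choice made for this sketch; it does not come from a textbook question.

```python
from math import comb, floor

def pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, r = 20, 8
lower = (r - 0.5) / n      # tipping point for rounding np: 0.375
upper = r / (n + 1)        # tipping point for the mode:    8/21

p = 0.378                  # hypothetical p chosen inside the window
rounded_mean = floor(n * p + 0.5)                          # np = 7.56 rounds to 8
true_mode = max(range(n + 1), key=lambda k: pmf(k, n, p))  # brute force

print(f"window: {lower:.4f} < p < {upper:.4f}")
print(f"np rounded = {rounded_mean}, actual mode = {true_mode}")
```

Here the expectation rounds to 8 but the most likely number of successes is 7, so rounding gives the wrong answer.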

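Formally, rounding the expectation to the nearest integer (halves rounding up) does not give the correct mode if and only if there exists some positive integer r ≤ n such that p lies between the two expressions:

(r-½)/n ≤ p < r/(n+1)   or   r/(n+1) < p < (r-½)/n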

To paraphrase (or ‘para-equate’?): if the expectation np has already been calculated, it may make more sense to check whether np itself falls within certain bounds. Multiplying through by n, rounding the expectation to the nearest integer does not give the correct mode if and only if there exists some positive integer r ≤ n such that

r-½ ≤ np < nr/(n+1)   or   nr/(n+1) < np < r-½

Final considerations
These compound inequalities provide only small windows into which p must fall in order to negate the hypothesis.  We will now add up the sizes of these small windows to determine the likelihood that the hypothesis does not hold. Effectively we’re working out: if you blindly round the expectation to the nearest integer, what’s the probability you get the correct (or incorrect) mode? Since the binomial problem could come from any source and in any context, it seems reasonable to assume a uniform distribution for p between 0 and 1.

Consider the size of these small windows: that size is the absolute difference between the two expressions we have for p, |r/(n+1) - (r-½)/n|. Summing this over all values of r from 1 to n gives the probability that, for a randomly chosen value of p, the hypothesis does not hold.

For large values of n, this is approximately equal to ¼. So, for a randomly chosen probability of success from a single trial, the probability that the hypothesis holds is at least ¾. That means you’re likely but far from certain to get it right. It’s probably not worth just guessing. Shame!
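
The sum of window widths can be evaluated exactly for any particular n. This is a minimal sketch using exact fractions; the function name `failure_probability` is invented here, and p is assumed to be uniformly distributed on (0, 1) as above.

```python
from fractions import Fraction

def failure_probability(n):
    """Total width of the windows: sum over r = 1..n of |r/(n+1) - (r - 1/2)/n|."""
    return sum(
        abs(Fraction(r, n + 1) - Fraction(2 * r - 1, 2 * n))
        for r in range(1, n + 1)
    )

for n in (5, 10, 20, 100, 1000):
    fail = failure_probability(n)
    print(f"n={n:5d}  P(rounding fails) = {fail} = {float(fail):.4f}"
          f"  P(rounding works) = {float(1 - fail):.4f}")
```

For the values tried, the total comes out as n/(4(n+1)) when n is even and (n-1)/(4n) when n is odd, both of which stay below ¼ and approach it as n grows.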

[Interactive diagram: the red regions mark the values of p that provide counterexamples to the hypothesis; a slider sets the value of n.]
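
The counterexample regions can also be listed directly from the two expressions for p. This is a rough sketch with an invented function name; it reports each window as an open interval and leaves the endpoints aside, since those need separate care.

```python
from fractions import Fraction

def counterexample_windows(n):
    """Intervals of p between r/(n+1) and (r - 1/2)/n, for r = 1..n."""
    windows = []
    for r in range(1, n + 1):
        mode_tip = Fraction(r, n + 1)            # where P(X=r) = P(X=r-1)
        round_tip = Fraction(2 * r - 1, 2 * n)   # where np = r - 1/2
        if mode_tip != round_tip:                # skip empty windows
            windows.append((min(mode_tip, round_tip), max(mode_tip, round_tip)))
    return windows

n = 6
windows = counterexample_windows(n)
for low, high in windows:
    print(f"{float(low):.4f} < p < {float(high):.4f}")

total_width = sum(high - low for low, high in windows)
print(f"total width = {total_width}")
```

The windows are widest near p = 0 and p = 1 and shrink to nothing near p = ½, which is why textbook questions with middling values of p so rarely expose the problem.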

The graph below shows how the probability that the hypothesis holds varies with increasing n. See how it tends to ¾ from above.

The textbooks’ recommendation, to calculate separately the probabilities that X takes each of the values either side of np and then choose the larger, is still the best approach to finding the mode.
But there is at least a ¾ chance that it will be the integer value closest to np, so if (and only if) you’re really short of time in an exam, guess!
