The binomial distribution is one of the most important discrete probability distributions in Cambridge A-Level Statistics 1 (9709). It models situations where you repeat a fixed number of independent trials, each with only two possible outcomes and a constant probability of success. Real-world applications range from quality control testing to genetics, medical trials, and opinion polls.
State the four conditions required for a binomial distribution to apply
Recognise when a binomial model is appropriate (and when it is not)
Use the formula P(X=r) = nCr p^r (1−p)^(n−r) to calculate exact probabilities
Calculate the mean E(X) = np and variance Var(X) = np(1−p) = npq
Use cumulative probabilities: P(X≤r), P(X≥r), P(a≤X≤b)
Use a calculator (binompdf / binomcdf) and binomial tables efficiently
Find n or p given information about probabilities or moments
Identify critical regions for hypothesis testing (introductory)
Four Conditions
Fixed n, independent trials, two outcomes, constant p
P(X=r) Formula
nCr p^r q^(n−r) — choose × success × failure
Mean E(X)
np — expected number of successes
Variance Var(X)
npq where q = 1−p
Cumulative P(X≤r)
Sum P(X=0)+…+P(X=r)
Complement
P(X≥r) = 1−P(X≤r−1)
Calculator
binompdf(n,p,r) and binomcdf(n,p,r)
Critical Region
Tail where P < significance level
Learn 1 — Binomial Conditions
The Four Conditions for B(n, p)
We say X follows a binomial distribution, written X ~ B(n, p), if and only if ALL four of the following conditions hold:
1. Fixed number of trials n
2. Trials are independent
3. Only two outcomes per trial (success / failure)
4. Constant probability p of success at each trial
Condition 1 — Fixed Number of Trials
The experiment is repeated exactly n times, and n is known in advance before the experiment starts.
Example: "A fair coin is tossed 10 times" → n = 10 is fixed. ✓ Counter-example: "Toss until you get a head" → n is not fixed. This is a geometric distribution, not binomial.
Condition 2 — Independent Trials
The outcome of each trial does not affect any other trial. Knowing one outcome gives no information about another.
Example: Rolling a die repeatedly — each roll is independent. ✓ Counter-example: Drawing cards one by one without replacement from a pack — removing a card changes the composition, so trials are not independent. ✗
Condition 3 — Two Outcomes Only
Each trial results in exactly one of two mutually exclusive outcomes, called success and failure. We choose which to label "success" based on what we are counting.
Example: Testing whether a component is defective (defective = success, or non-defective = success — we choose). ✓
The labels are arbitrary; what matters is that there are exactly two categories.
Condition 4 — Constant Probability of Success
The probability of success, denoted p, is the same for every trial. Consequently, the probability of failure q = 1 − p is also constant.
Example: Each seed from a batch has a 0.8 probability of germinating (assuming independence). p = 0.8 throughout. ✓ Counter-example: Testing items from a production line where quality changes over time — p is not constant. ✗
Identifying Binomial Situations
Checklist: For any context, ask:
• Is n fixed? → If not, NOT binomial.
• Are trials independent? → If not, NOT binomial.
• Are there exactly two outcomes? → If not, consider other distributions.
• Is p constant? → If not, NOT binomial.
Only if all four hold can you write X ~ B(n, p).
When Binomial Does NOT Apply — Sampling Without Replacement
Key Scenario: Selecting items from a small, finite population without replacement violates independence — once an item is removed, the probabilities change for remaining draws.
Example: A box contains 5 red and 3 blue balls. Draw 3 without replacement. X = number of red balls.
P(1st red) = 5/8, but P(2nd red | 1st red) = 4/7 ≠ 5/8. Not constant → NOT binomial.
Correct model: Hypergeometric distribution.
Exception: If the population is very large relative to sample size, the probabilities barely change and we treat it as approximately binomial. A common rule: if sample < 10% of population, binomial is a good approximation.
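The contrast between sampling with and without replacement can be made concrete with a short pure-Python sketch. The helper `hypergeom_pmf` is our own illustrative function (not a library API); the numbers match the box example above.

```python
from math import comb

def hypergeom_pmf(r, R, B, k):
    """Exact P(r red) when drawing k balls WITHOUT replacement
    from a box of R red and B blue balls (hypergeometric)."""
    return comb(R, r) * comb(B, k - r) / comb(R + B, k)

def binom_pmf(r, n, p):
    """Binomial probability, which wrongly assumes constant p here."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Box of 5 red and 3 blue, draw 3: the two models disagree noticeably,
# because the population is small relative to the sample.
exact = hypergeom_pmf(2, 5, 3, 3)   # = 30/56 ≈ 0.5357
approx = binom_pmf(2, 3, 5 / 8)     # ≈ 0.4395 — binomial is a poor fit
```

With a much larger population (say 5000 red and 3000 blue, still drawing 3), the two values would agree to several decimal places — the "sample < 10% of population" rule in action.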
Worked Identification Examples
Example 1: A biased coin with P(Head) = 0.6 is tossed 8 times. X = number of heads.
n = 8 (fixed) ✓ | independent ✓ | two outcomes: head/tail ✓ | p = 0.6 constant ✓
→ X ~ B(8, 0.6)
Example 2: A factory makes bolts; 5% are defective. 20 bolts chosen at random (with replacement). X = number defective.
n = 20 ✓ | independent (with replacement) ✓ | defective/not ✓ | p = 0.05 ✓
→ X ~ B(20, 0.05)
Example 3: A bag has 4 green and 6 white balls. 3 are drawn without replacement. X = green balls drawn.
n = 3 ✓ | NOT independent (without replacement, small population) ✗
→ Binomial does NOT apply. Use hypergeometric.
Learn 2 — The P(X = r) Formula
The Binomial Probability Formula
If X ~ B(n, p), then the probability that X takes the value r (exactly r successes in n trials) is:
P(X = r) = nCr × p^r × (1−p)^(n−r) for r = 0, 1, 2, …, n
Where nCr = n! / (r!(n−r)!) is the binomial coefficient — the number of ways to choose which r of the n trials are successes.
Understanding Each Part
• nCr: the number of arrangements of r successes among n trials (e.g. SSFFS… in different orders)
• p^r: probability that the r chosen trials are all successes (each success has probability p)
• (1−p)^(n−r): probability that the remaining n−r trials are all failures (each failure has probability q = 1−p)
We multiply them because each specific arrangement has probability p^r × q^(n−r), and there are nCr such arrangements.
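The formula is a one-liner with the standard-library `math.comb`; a minimal sketch, with an illustrative evaluation for X ~ B(8, 0.6) from the earlier identification example:

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): nCr * p^r * (1-p)^(n-r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# X ~ B(8, 0.6): probability of exactly 5 heads in 8 tosses
prob = binom_pmf(5, 8, 0.6)   # ≈ 0.2787

# Sanity check: the probabilities over r = 0..n must sum to 1
total = sum(binom_pmf(r, 8, 0.6) for r in range(9))
```

The sum-to-one check is a useful habit: it catches a forgotten nCr or a p/q mix-up immediately.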
Always write down X ~ B(n, p) first. State n, p and the value of r explicitly before applying the formula. This earns method marks in the exam even if arithmetic goes wrong.
Learn 3 — Mean & Variance of B(n, p)
The Formulas
E(X) = np
Var(X) = np(1−p) = npq
SD(X) = √(npq)
where q = 1 − p is the probability of failure. These formulas are given in the Cambridge formula booklet, but you must be able to apply them quickly and reverse them to find n or p.
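A quick sketch that applies the formulas and cross-checks E(X) against its definition Σ r·P(X = r), using the parameters n = 20, p = 0.4 from Example 1 below:

```python
from math import comb, sqrt

def binom_stats(n, p):
    """Return E(X), Var(X), SD(X) for X ~ B(n, p)."""
    q = 1 - p
    return n * p, n * p * q, sqrt(n * p * q)

def pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 20, 0.4
mean, var, sd = binom_stats(n, p)   # 8.0, 4.8, ≈ 2.19

# Cross-check E(X) = np against the definition sum(r * P(X = r))
mean_direct = sum(r * pmf(r, n, p) for r in range(n + 1))
```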
Worked Examples
Example 1: X ~ B(20, 0.4). Find E(X), Var(X) and SD(X).
E(X) = np = 20 × 0.4 = 8
Var(X) = npq = 20 × 0.4 × 0.6 = 4.8
SD(X) = √4.8 ≈ 2.19
Interpretation:
• E(X) = np is the expected (average) number of successes in n trials — if you toss a fair coin 100 times, you expect 100 × 0.5 = 50 heads.
• Var(X) = npq measures how variable the count of successes is. Larger p(1−p) means more spread.
The product p(1−p) is maximised when p = 0.5, giving maximum spread. For p near 0 or 1, the variance is small — the distribution is concentrated at one end.
Finding n and p from Given Information
Example 3: X ~ B(n, p). Given E(X) = 6 and Var(X) = 4.2, find n and p.
np = 6 and npq = 4.2, so q = 4.2/6 = 0.7 and p = 1 − 0.7 = 0.3; then n = 6/0.3 = 20, so X ~ B(20, 0.3).
Example: X ~ B(10, 0.4). Find P(3 ≤ X ≤ 6).
Using calculator or summing:
P(X ≤ 6) ≈ 0.9452
P(X ≤ 2) ≈ 0.1673
P(3 ≤ X ≤ 6) = P(X ≤ 6) − P(X ≤ 2) ≈ 0.9452 − 0.1673 = 0.7779
Learn 4 — Cumulative Probabilities & Critical Regions
Step-by-Step Cumulative Method
Strategy for any cumulative probability:
1. Write down X ~ B(n, p)
2. Convert the event to one involving P(X ≤ k) or its complement
3. Add individual probabilities or use calculator/tables
4. Show all working for marks
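The strategy above can be sketched in pure Python. The figures in the final comment assume X ~ B(10, 0.4), an illustrative choice; `cdf` plays the role of binomcdf or a table lookup.

```python
from math import comb

def pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p)**(n - r)

def cdf(r, n, p):
    """P(X <= r) = P(X=0) + ... + P(X=r) — what tables/binomcdf return."""
    return sum(pmf(k, n, p) for k in range(r + 1))

def prob_range(a, b, n, p):
    """P(a <= X <= b) = P(X <= b) - P(X <= a-1), both endpoints included."""
    return cdf(b, n, p) - cdf(a - 1, n, p)

# X ~ B(10, 0.4): P(3 <= X <= 6) = 0.9452 - 0.1673 = 0.7779 (4 d.p.)
result = prob_range(3, 6, 10, 0.4)
```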
A critical region is the set of values of X for which the observed result would be considered statistically significant at a given significance level α.
For a one-tailed lower test at 5%: find the largest r such that P(X ≤ r) < 0.05. For a one-tailed upper test at 5%: find the smallest r such that P(X ≥ r) < 0.05, i.e. 1 − P(X ≤ r−1) < 0.05.
Example: X ~ B(20, 0.5) under H0. Find the critical region for an upper-tail test at 5%.
Need smallest r with P(X ≥ r) < 0.05, i.e. P(X ≤ r−1) > 0.95.
P(X ≤ 13) ≈ 0.9423 < 0.95; P(X ≤ 14) ≈ 0.9793 > 0.95
Critical region: X ≥ 15 (since P(X ≥ 15) = 1 − P(X ≤ 14) ≈ 0.0207 < 0.05).
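The search for the critical region can be automated. A pure-Python sketch — the helper name `upper_critical_region` is ours, not a standard function — reproducing the B(20, 0.5) example above:

```python
from math import comb

def cdf(r, n, p):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(r + 1))

def upper_critical_region(n, p, alpha=0.05):
    """Smallest r with P(X >= r) < alpha, i.e. 1 - P(X <= r-1) < alpha."""
    for r in range(n + 1):
        if 1 - cdf(r - 1, n, p) < alpha:
            return r
    return None  # no value of X is significant at this level

r = upper_critical_region(20, 0.5)       # 15: critical region is X >= 15
actual_level = 1 - cdf(r - 1, 20, 0.5)   # ≈ 0.0207, i.e. 2.07%
```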
When summing individual binomial probabilities by hand, build a table row by row. Round intermediate values to at least 5 s.f. to avoid accumulation of rounding errors.
Learn 5 — Using Binomial Tables & Calculator
Using a Scientific Calculator
Most modern scientific calculators and graphical calculators have built-in binomial functions:
On a Casio fx-CG50 or similar: MENU → STAT → DIST → BINM → Bpd (pdf) or Bcd (cdf)
Enter n, p, x (= r), then Execute.
Using Binomial Cumulative Distribution Tables
Cambridge 9709 exams sometimes provide a booklet of cumulative binomial tables. These give P(X ≤ r) for selected values of n and p.
How to read the table:
1. Find the section for your n (e.g. n = 10)
2. Find the column for your p (e.g. p = 0.30)
3. Find the row for your r (e.g. r = 4)
4. Read off P(X ≤ 4) directly from the table
• Single exact probability, small n: use the formula directly; show all working.
• Single exact probability, large n: use binompdf on calculator or tables.
• Cumulative probability: use binomcdf on calculator or tables.
• Range P(a ≤ X ≤ b): binomcdf(n,p,b) − binomcdf(n,p,a−1)
• Complement P(X ≥ r): 1 − binomcdf(n,p,r−1)
Complementary Events to Simplify Calculations
When computing P(X ≥ r) for large r, complement reduces work significantly.
If p > 0.5 and your tables only go up to p = 0.5, use the symmetry of the binomial:
If X ~ B(n, p) then (n − X) ~ B(n, 1−p)
So P(X = r) with parameter p = P(Y = n−r) with parameter 1−p, where Y = n−X.
Example: X ~ B(8, 0.7). Find P(X ≤ 3) using tables for p = 0.3.
P(X ≤ 3) = P(Y ≥ 5) where Y ~ B(8, 0.3)
= 1 − P(Y ≤ 4) = 1 − 0.9420 = 0.0580
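The symmetry identity can be checked numerically — a minimal sketch computing P(X ≤ 3) for X ~ B(8, 0.7) both directly and via Y = 8 − X ~ B(8, 0.3):

```python
from math import comb

def cdf(r, n, p):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(r + 1))

# X ~ B(8, 0.7). Tables only list p <= 0.5, so use Y = 8 - X ~ B(8, 0.3):
# P(X <= 3) = P(Y >= 5) = 1 - P(Y <= 4)
direct = cdf(3, 8, 0.7)            # ≈ 0.0580
via_symmetry = 1 - cdf(4, 8, 0.3)  # identical by the symmetry argument
```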
In the exam, always state what function and parameters you use: "Using binomcdf(10, 0.3, 4)" earns the method mark. Never just write a decimal answer from a calculator without referencing the method.
Worked Examples
Eight fully worked examples covering all key binomial skills. Study the method, not just the answer.
Example 1 — Verifying Binomial Conditions
A survey asks 15 people whether they support a policy. Each person responds independently. Previous surveys show 60% support. Let X = number who support the policy. Verify that X ~ B(15, 0.6) and state any assumptions.
Step 1 — Fixed n: 15 people surveyed, so n = 15. ✓
Step 2 — Independence: We are told each person responds independently. ✓ (Assumption: this holds in practice.)
Step 3 — Two outcomes: Each person either supports (success) or does not (failure). ✓
Step 4 — Constant p: We assume each person has the same probability 0.6 of supporting, as given by historical data. ✓
Conclusion: All four conditions satisfied. X ~ B(15, 0.6). Assumption: the probability applies equally to all individuals in this sample.
Key point: NEVER write P(X ≥ 5) = 1 − P(X ≤ 5). Since P(X ≤ 5) includes P(X = 5), that expression gives P(X ≥ 6), not P(X ≥ 5). Always subtract P(X ≤ r−1).
Example 5 — Find n Given Probability Condition
X ~ B(n, 0.2). Find the smallest n such that P(X ≥ 1) > 0.95.
Step 1: P(X ≥ 1) = 1 − P(X = 0) = 1 − 0.8^n
Step 2: We need 1 − 0.8^n > 0.95 ⇒ 0.8^n < 0.05
Step 3: Take natural logs: n ln(0.8) < ln(0.05). Since ln(0.8) < 0, dividing by it flips the inequality: n > ln(0.05)/ln(0.8)
ln(0.05) = −2.9957…, ln(0.8) = −0.22314…
n > (−2.9957)/(−0.22314) = 13.43…
Step 4: Since n must be a whole number and n > 13.43, the smallest n is 14.
Check: n = 14: P(X ≥ 1) = 1 − 0.8^14 = 1 − 0.0440 = 0.9560 > 0.95 ✓
n = 13: P(X ≥ 1) = 1 − 0.8^13 = 0.9450 < 0.95 ✗
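Both the logarithm method and a brute-force search give the same answer; a short sketch of Example 5:

```python
from math import log, ceil

# Need smallest n with P(X >= 1) = 1 - 0.8^n > 0.95, i.e. 0.8^n < 0.05
n_exact = log(0.05) / log(0.8)   # ≈ 13.43 (both logs negative, so the
n = ceil(n_exact)                # inequality flipped to n > 13.43) -> 14

# Direct search confirms the smallest such n:
n_search = next(k for k in range(1, 100) if 1 - 0.8**k > 0.95)
```

The direct search is a reliable way to check the direction of your inequality after taking logs.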
Example 6 — Find p Given Probability
X ~ B(3, p). Given P(X = 0) = 0.125, find p.
Step 1: P(X=0) = (1−p)^3 = 0.125
Step 2: Take cube roots: 1−p = 0.125^(1/3) = 0.5
Step 3: p = 1 − 0.5 = 0.5
Check: P(X=0) = 0.5^3 = 0.125 ✓
Example 7 — Mean and Variance
X ~ B(n, p) with E(X) = 4 and Var(X) = 3.2. Find n, p, and P(X = 4).
Step 1: np = 4 and npq = 3.2, so q = 3.2/4 = 0.8 and p = 0.2.
Step 2: n = 4/0.2 = 20, so X ~ B(20, 0.2).
Step 3: P(X = 4) = 20C4 × 0.2^4 × 0.8^16 ≈ 0.218.
Example 8 — Critical Region (lower tail)
Under H0, X ~ B(20, 0.3). Find the critical region for a one-tailed lower test at 5%.
Step 1: Need the largest r such that P(X ≤ r) < 0.05.
Step 2: P(X ≤ 2) ≈ 0.0355 < 0.05, but P(X ≤ 3) ≈ 0.1071 > 0.05.
Step 3: Critical region: X ≤ 2. The actual significance level is P(X ≤ 2) ≈ 0.0355 = 3.55%.
Note Cambridge S1 does not require formal hypothesis test procedures — just finding the critical region is sufficient at this stage.
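The moment equations of Example 7 (E(X) = 4, Var(X) = 3.2) solve in two lines of Python; a minimal sketch:

```python
from math import comb

# Given E(X) = np = 4 and Var(X) = npq = 3.2:
E, V = 4, 3.2
q = V / E          # npq / np = q = 0.8
p = 1 - q          # 0.2
n = round(E / p)   # 20

# P(X = 4) for X ~ B(20, 0.2)
prob = comb(n, 4) * p**4 * q**(n - 4)   # ≈ 0.218
```

This q = Var/E shortcut is the same one summarised in the formula section below the Common Mistakes.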
Common Mistakes
These mistakes appear repeatedly in Cambridge S1 scripts. Learn them now and avoid losing marks.
Mistake 1 — Wrong Complement for P(X ≥ r)
WRONG: P(X ≥ 4) = 1 − P(X ≤ 4)
CORRECT: P(X ≥ 4) = 1 − P(X ≤ 3) [subtract P of "at most r−1"]
The complement of "at least 4" is "fewer than 4" = "at most 3". Writing 1 − P(X ≤ 4) subtracts P(X = 4) as well, so it actually gives P(X ≥ 5), not P(X ≥ 4).
Mistake 2 — Not Checking All Four Conditions
WRONG: "There are two outcomes, so it's binomial."
CORRECT: Check ALL four conditions — fixed n, independence, two outcomes, constant p.
Drawing without replacement from a small population typically fails the independence and/or constant p conditions. "Two outcomes" is necessary but not sufficient.
Mistake 3 — Wrong Variance Formula
WRONG: Var(X) = np or Var(X) = np^2
CORRECT: Var(X) = np(1−p) = npq where q = 1−p
A very common slip is copying E(X) = np and confusing it with the variance formula. Always write q = 1 − p explicitly first.
Mistake 4 — Treating Dependent Trials as Binomial
WRONG: "Draw 5 cards from a deck without replacement. X ~ B(5, 1/4) for hearts."
CORRECT: Binomial does not apply — trials are dependent (sampling without replacement). Use hypergeometric or direct counting.
Each card drawn changes the composition of the remaining deck, so p changes with every draw.
Mistake 5 — Forgetting nCr in the Formula
WRONG: P(X=3) = p^3 (1−p)^(n−3)
CORRECT: P(X=3) = nC3 × p^3 × (1−p)^(n−3)
Without nCr, you only account for one specific arrangement of successes and failures, not all possible arrangements. This gives the probability of a specific sequence, not the total probability of r successes.
Mistake 6 — Using p Instead of q in the Exponent
WRONG: P(X=2) = nC2 × p^2 × p^(n−2) = nC2 × p^n
CORRECT: P(X=2) = nC2 × p^2 × (1−p)^(n−2)
Failures have probability q = 1−p, not p. Confusing p and q is extremely common when p is close to 0.5.
Mistake 7 — Wrong Inequality in Finding n or p
WRONG: P(X ≥ 1) > 0.95 ⇒ 0.8^n > 0.05 ⇒ n < 13.4 ⇒ n = 13
CORRECT: P(X ≥ 1) > 0.95 ⇒ 0.8^n < 0.05 ⇒ n > 13.4 ⇒ n = 14
When you divide both sides by ln(0.8), which is negative (the log of any number less than 1 is negative), the inequality sign flips. Always check by substituting your answer back in.
Mistake 8 — Rounding Intermediate Values
WRONG: Rounding each P(X=k) to 2 d.p. before summing, then getting an incorrect total.
CORRECT: Keep at least 5 significant figures in intermediate steps, and round only the final answer.
Cumulative binomial probabilities require summing several small terms. Early rounding compounds errors and can cause a final answer that differs by 0.01 or more — losing accuracy marks.
Given E(X) and Var(X): q = Var(X)/E(X) ⇒ p = 1−q ⇒ n = E(X)/p
Full Formula Reference Table
Quantity
Formula
Notes
Notation
X ~ B(n, p)
n trials, prob p of success
P(X = r)
nCr p^r q^(n−r)
q = 1−p
E(X)
np
Mean / expected number of successes
Var(X)
npq
q = 1−p; always ≥ 0
SD(X)
√(npq)
Same units as X
P(X ≥ r)
1 − P(X ≤ r−1)
Complement with r−1 NOT r
P(X > r)
1 − P(X ≤ r)
Strict inequality
P(a≤X≤b)
P(X≤b) − P(X≤a−1)
Both endpoints included
Binomial coefficient
nCr = n!/(r!(n−r)!)
Also written C(n,r) or (n choose r)
Conditions Checklist (for exam questions)
When asked to justify a binomial model, state all four:
• Fixed number of trials: n = ___ (state the value)
• Trials are independent: (state the reason from context)
• Only two outcomes: success = ___, failure = ___
• Constant probability: p = ___ for each trial
Proof Bank
These proofs are not required for Cambridge S1 but deepen your understanding and support Further Mathematics. Each is written with full logical steps.
Proof 1 — P(X = r) from Combinatorial Argument
Claim: If X counts the number of successes in n independent Bernoulli trials each with probability p, then P(X = r) = nCr pr(1−p)n−r.
Proof:
Consider a specific sequence of n trial outcomes with exactly r successes and n−r failures, for example: SS…SFF…F (r S's followed by n−r F's).
By independence, the probability of this specific sequence is:
P(SS…SFF…F) = p × p × … × p × (1−p) × … × (1−p) = pr(1−p)n−r
Any sequence with exactly r successes in n trials has this same probability pr(1−p)n−r (by independence and commutativity of multiplication).
The number of distinct sequences with exactly r successes in n trials equals the number of ways to choose which r positions (out of n) are successes = nCr.
Since all these sequences are mutually exclusive (they are distinct outcomes):
P(X = r) = nCr × pr(1−p)n−r □
Proof 2 — Deriving E(X) = np Using Indicator Variables
Claim: If X ~ B(n, p), then E(X) = np.
Proof (Method 1 — Indicator Variables):
Define indicator variables I1, I2, …, In where Ik = 1 if trial k is a success, 0 if it is a failure.
Then X = I1 + I2 + … + In (total number of successes).
For each k: E(Ik) = 1×P(Ik=1) + 0×P(Ik=0) = p.
By linearity of expectation (which holds regardless of independence):
E(X) = E(I1) + E(I2) + … + E(In) = p + p + … + p = np □
Proof (Method 2 — Direct Summation):
E(X) = Σ_{r=0}^{n} r × nCr p^r q^(n−r)
The r = 0 term vanishes. For r ≥ 1: r × nCr = n × (n−1)C(r−1).
So E(X) = np Σ_{r=1}^{n} (n−1)C(r−1) p^(r−1) q^(n−r) = np Σ_{j=0}^{n−1} (n−1)Cj p^j q^(n−1−j) = np(p+q)^(n−1) = np × 1 = np □
Proof 3 — Deriving Var(X) = npq
Claim: If X ~ B(n, p), then Var(X) = np(1−p).
Proof (Using Indicator Variables):
Using the same indicator variables Ik as above, X = ΣIk.
Each Ik is a Bernoulli(p) variable: Var(Ik) = E(Ik^2) − [E(Ik)]^2 = p − p^2 = p(1−p) = pq.
Since I1, I2, …, In are independent (trials are independent), variances add:
Var(X) = Var(I1) + Var(I2) + … + Var(In) = pq + pq + … + pq = npq □
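Proofs 2 and 3 can be checked numerically by computing E(X) and Var(X) directly from the pmf and comparing with np and npq. A quick sketch with illustrative parameters n = 12, p = 0.35:

```python
from math import comb

def moments(n, p):
    """E(X) and Var(X) computed from the pmf, not from the formulas."""
    pmf = [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(pmf))
    var = sum(r * r * pr for r, pr in enumerate(pmf)) - mean**2
    return mean, var

mean, var = moments(12, 0.35)
# Compare with np = 12 * 0.35 = 4.2 and npq = 12 * 0.35 * 0.65 = 2.73
```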
Give answers to 4 decimal places. Remember: P(X≥r) = 1 − P(X≤r−1).
Exercise 5 — Find n or p from Given Information (10 Questions)
Give n as a whole number and p to 4 decimal places unless stated.
Practice — 30 Mixed Questions
Mixed practice covering all binomial topics.
Challenge — 15 Harder Questions
Harder questions involving multi-step reasoning, finding n and p, and critical regions.
Exam Style Questions (8 Questions)
These questions mirror Cambridge A-Level 9709 S1 style. Show all working before checking the mark scheme.
Question 1 [5 marks]
A biased coin has P(Head) = 0.35. The coin is tossed 12 times. X = number of heads.
(a) State the distribution of X. [1]
(b) Find P(X = 4). [2]
(c) Find P(X ≤ 3). [2]
In a multiple-choice test there are 20 questions, each with 5 options. A student guesses every answer. X = number of correct answers.
(a) State the distribution of X and justify each condition. [4]
(b) Find P(X ≥ 6). [2]
Each day the probability that it rains in a particular city is 0.45, independently of other days. Find the probability that, in a week of 7 days:
(a) it rains on exactly 3 days [2]
(b) it rains on more than 4 days [3]
A factory tests light bulbs. The probability a bulb is faulty is p. A sample of 8 bulbs is tested. The probability that none are faulty is 0.1678 (to 4 s.f.).
(a) Show that p = 0.2. [2]
(b) Find the probability that at most 2 are faulty. [3]
Under H0, X ~ B(20, 0.5). A one-tailed upper test is conducted at the 5% significance level.
(a) Find the critical region for X. [3]
(b) State the actual significance level of the test. [1]
(c) An observation of x = 14 is obtained. State the conclusion of the test. [3]
(a) Need P(X ≥ c) < 0.05, i.e. 1 − P(X ≤ c−1) < 0.05 ⇒ P(X ≤ c−1) > 0.95 [M1]
P(X ≤ 13) = binomcdf(20, 0.5, 13) = 0.9423 < 0.95 ✗
P(X ≤ 14) = binomcdf(20, 0.5, 14) = 0.9793 > 0.95 ✓ [M1]
Critical region: X ≥ 15 [A1]
(b) Actual significance level = P(X ≥ 15) = 1 − 0.9793 = 2.07% [B1]
(c) x = 14 does not lie in the critical region (X ≥ 15). [M1]
There is insufficient evidence at the 5% level to reject H0. [M1]
We conclude that the data is consistent with X ~ B(20, 0.5). [A1]
Past Paper Questions
Representative Cambridge 9709 S1 past paper questions on the Binomial Distribution. Attempt each question fully before checking the solution.
Past Paper Q1 — (Nov 2018 style) [6 marks]
The random variable X ~ B(12, p). Given that P(X = 0) = 0.05314 (to 4 s.f.), find:
(a) The value of p, to 2 decimal places. [2]
(b) P(X ≤ 3). [2]
(c) The mean and variance of X. [2]
A quality control inspector examines items from a production line where 8% are defective. Items are selected independently. The inspector examines 25 items. Find the probability that:
(a) No items are defective. [1]
(b) At least 3 are defective. [3]
(c) The expected number of defective items. [1]
Each trial in an experiment has probability p of success, independently. An experiment consists of 10 trials.
(a) Given Var(X) = 1.96, find the two possible values of p. [3]
(b) Given also that E(X) > 5, state which value of p applies and find P(X ≥ 8). [4]
A factory produces components; each is independently faulty with probability 0.05. Components are packed in boxes of 30.
(a) Find the probability that a box contains no faulty components. [1]
(b) Find the probability that a box contains at most 2 faulty components. [3]
(c) A company buys 10 boxes. Find the probability that exactly 3 boxes contain at most 2 faulty components. [4]
X ~ B(30, 0.05) for number of faulty components per box.
(a) P(X=0) = 0.95^30 = 0.2146 (4 s.f.)
(b) P(X ≤ 2) = P(X=0) + P(X=1) + P(X=2)
P(X=1) = 30 × 0.05 × 0.95^29 = 0.3389
P(X=2) = 435 × 0.05^2 × 0.95^28 = 0.2586
P(X ≤ 2) = 0.2146 + 0.3389 + 0.2586 = 0.8122 (4 s.f.)
(c) Let Y = number of boxes (out of 10) with at most 2 faulty. Y ~ B(10, 0.8122).
P(Y=3) = 10C3 × 0.8122^3 × 0.1878^7
0.8122^3 = 0.5358; 0.1878^7 = 0.000008245
P(Y=3) = 120 × 0.5358 × 0.000008245 = 0.000530 (3 s.f.)
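Part (c) chains one binomial result into another, so it is sensitive to rounding; a pure-Python check that keeps full precision throughout:

```python
from math import comb

def pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Per box: X ~ B(30, 0.05); probability a box has at most 2 faulty
p_box = sum(pmf(r, 30, 0.05) for r in range(3))   # ≈ 0.8122

# Across 10 boxes: Y ~ B(10, p_box)
p_three = pmf(3, 10, p_box)   # P(exactly 3 such boxes) ≈ 0.000530
```

Note the full-precision p_box is carried into the second stage rather than the rounded 0.8122 — exactly the "round only at the end" habit from the Common Mistakes section.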