SSS 3 SECOND TERM MATHEMATICS

Topic: USE OF CUMULATIVE FREQUENCY TO ESTIMATE PERCENTILES INCLUDING THE MEDIAN

The cumulative frequency curve (ogive) can be used to estimate the median, the quartiles and the percentiles of grouped data.

The median is the mark that corresponds to the middle item (i.e. the mark half-way up the distribution). So to estimate the median from a distribution, the following steps are taken:

Percentiles

The percentiles (centiles) divide the distribution into one hundred equal parts. There are 99 (ninety-nine) percentiles, and each is estimated in the same way as the quartiles. For instance, to estimate the 40th percentile (denoted P40) of a distribution, we compute (40/100) × N, where N is the total frequency, as in (2) above, and carry out all the other steps.
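As a rough illustration of that first step, the sketch below computes the position (p/100) × N for a few percentiles. The function name percentile_position and the total N = 60 are assumptions made for this example only; they do not come from the lesson.

```python
def percentile_position(p, n):
    """Return the position (p/100) * n on the cumulative frequency axis."""
    return p * n / 100


N = 60  # hypothetical total frequency, chosen only for illustration

for label, p in [("P40", 40), ("Q1 = P25", 25), ("Median = P50", 50), ("Q3 = P75", 75)]:
    position = percentile_position(p, N)
    print(label, "is read off the ogive at cumulative frequency", position)
```

Each position is then located on the cumulative frequency axis of the ogive and the corresponding mark is read from the horizontal axis.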

Note that P25 is the point below which 25% (a quarter) of the data fall, while P75 is the point below which 75% (three quarters) of the data fall. Therefore Q1 ≡ P25 and Q3 ≡ P75.

Percentiles are widely used in education circles in reporting the results of standardized tests such as WAEC, TEDRO, TOEFL and SAT. They give adequate information about a student’s position or rank within his/her group or class, since by definition a percentile point is a point in the distribution with a certain percentage of the cases below it.

Estimating Percentiles from the Cumulative Relative Frequency Table

The data above corresponds to the week 6 quiz score data in the course packet.

Complete the relative frequency and cumulative relative frequency columns.

  x     Frequency    Relative Frequency    Cumulative Relative Frequency
  2         1              0.05                       0.05
  5         1              0.05                       0.10
  7         1              0.05                       0.15
 10         1              0.05                       0.20
 12         2              0.10                       0.30
 13         1              0.05                       0.35
 14         2              0.10                       0.45
 15         3              0.15                       0.60
 16         1              0.05                       0.65
 17         2              0.10                       0.75
 18         1              0.05                       0.80
 20         4              0.20                       1.00
Total      20              1.00
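The two computed columns can be checked with a short script. This is only a sketch: the x values and frequencies are those of the quiz table, while the variable names (scores, freqs, rows) are chosen here for illustration.

```python
# x values and frequencies from the quiz score table
scores = [2, 5, 7, 10, 12, 13, 14, 15, 16, 17, 18, 20]
freqs = [1, 1, 1, 1, 2, 1, 2, 3, 1, 2, 1, 4]

total = sum(freqs)  # total number of quiz scores

rows = []
running = 0  # running (cumulative) frequency
for x, f in zip(scores, freqs):
    running += f
    # relative frequency = f / total
    # cumulative relative frequency = running total / total
    rows.append((x, f, f / total, running / total))

for x, f, rel, cum in rows:
    print(f"{x:>3} {f:>3} {rel:>6.2f} {cum:>6.2f}")
```

The cumulative column is built from the running count rather than by adding rounded decimals, which avoids accumulating rounding error.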

To estimate the pth percentile, move down the cumulative relative frequency column until the first line at which you find or pass the value of p for the percentile you are looking for.

If you pass beyond the value of p that you are looking for, the value for the percentile is the x value in the data column at the first line beyond the percentile you were looking for.

To find the 40th percentile:

Go down the right-hand column looking for 0.40.

You don’t find it exactly. You pass it between 0.35 and 0.45.

Then the 40th percentile is the x value for the line at which you first pass 0.40.

The 40th percentile is 14.

If you find the exact value of p that you are looking for, the value for the percentile is the average of the x value in that line and the x value in the next line.

To find the 30th percentile:

Go down the right-hand column looking for 0.30.

You find 0.30 on the line where x = 12.

As a first estimate, the 30th percentile is about 12.

To be technically correct, because 0.30 is matched exactly, the 30th percentile is the average of the x value on that line (12) and the x value on the next line (13).

The 30th percentile is (12+13)/2 = 12.5.
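The two rules (take the x value when you pass p, average with the next x value when you hit p exactly) can be sketched in code as follows. The function name table_percentile is an assumption for illustration. To avoid rounding problems with decimals such as 0.30, this sketch compares whole numbers (100 × cumulative count against p × N) instead of the cumulative relative frequencies themselves, which is equivalent.

```python
# x values and frequencies from the quiz score table
scores = [2, 5, 7, 10, 12, 13, 14, 15, 16, 17, 18, 20]
freqs = [1, 1, 1, 1, 2, 1, 2, 3, 1, 2, 1, 4]


def table_percentile(p, xs, fs):
    """Estimate the p-th percentile (p a whole number, 1 to 99) from a frequency table."""
    if not 0 < p < 100:
        raise ValueError("p must be between 1 and 99")
    n = sum(fs)
    target = p * n  # cum / n == p / 100 is the same as 100 * cum == p * n
    cum = 0
    for i, (x, f) in enumerate(zip(xs, fs)):
        cum += f
        if 100 * cum == target:
            # p found exactly: average this x value with the next one
            return (x + xs[i + 1]) / 2
        if 100 * cum > target:
            # p passed: take the x value on the first line beyond p
            return x


print(table_percentile(40, scores, freqs))  # passes 0.40 on the line x = 14
print(table_percentile(30, scores, freqs))  # finds 0.30 exactly on the line x = 12
```

The same function can be used to check your answers to the practice questions after you have worked them by hand.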

Find the 70th percentile and the 80th percentile.

More Percentile Practice:

Find the 20th percentile and the 25th percentile (worked answers appear later in this document).

Write the sentences that interpret these percentiles in the context of the situation in this problem.

Quartile Practice:

Find the 25th and 75th percentiles using this method.

Find the first and third quartiles using your calculator.

Do they agree with each other?

To estimate the pth percentile:

Construct the cumulative relative frequency column in the table.

To estimate the pth percentile, move down the cumulative relative frequency column until the first line at which you find or pass the value of p for the percentile you are looking for.

If you pass beyond the value of p that you are looking for, the value for the percentile is the x value in the data column at the first line beyond the percentile you were looking for.

If you find the exact value of p that you are looking for, the value for the percentile is the average of the x value in that line and the x value in the next line.

To find the 70th percentile:

Go down the right-hand column looking for 0.70.

You don’t find it exactly. You pass it between 0.65 and 0.75.

Then the 70th percentile is the x value for the line at which you first go beyond 0.70.

The 70th percentile is 17.

Interpretation: 70% of students had quiz scores of 17 or less.

To find the 80th percentile:

Go down the right-hand column looking for 0.80.

You find 0.80 on the line where x = 18.

As a first estimate, the 80th percentile is about 18.

To be technically correct, because 0.80 is matched exactly, the 80th percentile is the average of the x value on that line (18) and the x value on the next line (20).

The 80th percentile is (18+20)/2 = 19.

Interpretation: 80% of students had quiz scores of 19 or less.
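An interpretation sentence can be sanity-checked by counting directly what proportion of the scores lie at or below the percentile value. The sketch below does this for the quiz data; the helper name proportion_at_or_below is an assumption for illustration. With a small data set the proportion need not equal p exactly, so the interpretation should be read as approximate.

```python
# x values and frequencies from the quiz score table
scores = [2, 5, 7, 10, 12, 13, 14, 15, 16, 17, 18, 20]
freqs = [1, 1, 1, 1, 2, 1, 2, 3, 1, 2, 1, 4]


def proportion_at_or_below(value, xs, fs):
    """Return the fraction of all scores that are less than or equal to value."""
    n = sum(fs)
    count = sum(f for x, f in zip(xs, fs) if x <= value)
    return count / n


print(proportion_at_or_below(17, scores, freqs))  # proportion with 17 or less
print(proportion_at_or_below(19, scores, freqs))  # proportion with 19 or less
```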

To find the 20th percentile:

Go down the right-hand column looking for 0.20.

You find 0.20 on the line where x = 10.

As a first estimate, the 20th percentile is about 10.

To be technically correct, because 0.20 is matched exactly, the 20th percentile is the average of the x value on that line (10) and the x value on the next line (12).

The 20th percentile is (10+12)/2 = 11.

Interpretation: 20% of students had quiz scores of 11 or less.

To find the 25th percentile:

Go down the right-hand column looking for 0.25.

You don’t find it exactly. You pass it between 0.20 and 0.30.

Then the 25th percentile is the x value for the line at which you first go beyond 0.25.

The 25th percentile is 12.

Interpretation: 25% of students had quiz scores of 12 or less.

To find the quartiles Q1 and Q3, you are looking for the 25th and 75th percentiles.

To find the 25th percentile:

Go down the right-hand column looking for 0.25.

You don’t find it exactly. You pass it between 0.20 and 0.30.

Then the 25th percentile is the x value for the line at which you first go beyond 0.25.

The 25th percentile is 12.

To find the 75th percentile:

Go down the right-hand column looking for 0.75.

You find 0.75 on the line where x = 17.

As a first estimate, the 75th percentile is about 17.

To be technically correct, because 0.75 is matched exactly, the 75th percentile is the average of the x value on that line (17) and the x value on the next line (18).

The 75th percentile is (17+18)/2 = 17.5.

Use your calculator to find the quartiles using one-variable statistics.

Q1 = ½ (12+12) = 12

Median = 15

Q3 = ½ (17+18) = 17.5

The TI calculators find the lower quartile by finding the median of the lower half of the data and the upper quartile by finding the median of the upper half of the data.

In this case your calculator will find Q1 = 12 and Q3 = 17.5

The cumulative relative frequency method and your calculator’s method gave the same result.
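The median-of-halves rule described above can be sketched as follows. The function name quartiles_by_halves is an assumption for illustration, and the handling of an odd number of values (leaving the middle value out of both halves) is also an assumption about the calculator's behaviour; it does not affect this data set, which has 20 values.

```python
from statistics import median

# x values and frequencies from the quiz score table
scores = [2, 5, 7, 10, 12, 13, 14, 15, 16, 17, 18, 20]
freqs = [1, 1, 1, 1, 2, 1, 2, 3, 1, 2, 1, 4]

# Expand the table into the full sorted list of 20 individual scores
data = sorted(x for x, f in zip(scores, freqs) for _ in range(f))


def quartiles_by_halves(sorted_data):
    """Return (Q1, median, Q3) using the median of each half of the data."""
    n = len(sorted_data)
    half = n // 2
    lower = sorted_data[:half]
    # Assumption: for an odd count, the middle value belongs to neither half
    upper = sorted_data[half:] if n % 2 == 0 else sorted_data[half + 1:]
    return median(lower), median(sorted_data), median(upper)


q1, med, q3 = quartiles_by_halves(data)
print("Q1 =", q1, " Median =", med, " Q3 =", q3)
```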

Be aware that:

When you use different technology tools for statistical analysis, you may find that they sometimes give different answers. Excel, your calculator, the cumulative relative frequency method by hand, the locator method in the textbook, and statistical software packages may give some answers that are the same and some that are close but not exactly the same. This is because the different tools use different rules, and may round differently, when locating the position of the percentile or quartile.

Percentiles are most appropriately used with very large data sets. Then there are usually not large gaps between the data values, and the different methods give more consistent answers for quartiles and percentiles. Because we are working with small data sets with gaps between the data values, we find more inconsistencies between the various methods.
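As one illustration of such differences, the Python standard library function statistics.quantiles offers two interpolation methods, "exclusive" and "inclusive". The sketch below applies both to the 20 quiz scores. Python is used here only as an example of a software package; it is not one of the tools named in this lesson.

```python
from statistics import quantiles

# x values and frequencies from the quiz score table
scores = [2, 5, 7, 10, 12, 13, 14, 15, 16, 17, 18, 20]
freqs = [1, 1, 1, 1, 2, 1, 2, 3, 1, 2, 1, 4]

# Expand the table into the full sorted list of 20 individual scores
data = sorted(x for x, f in zip(scores, freqs) for _ in range(f))

# n=4 asks for the three cut points Q1, median, Q3
exclusive = quantiles(data, n=4, method="exclusive")
inclusive = quantiles(data, n=4, method="inclusive")

print("exclusive method:", exclusive)
print("inclusive method:", inclusive)
```

On this small data set the two methods agree on Q1 and the median but not on Q3, and neither Q3 value has to match the 17.5 found from the table, which is the kind of inconsistency described above.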