About Statistics – Class 9 Maths
These Class 9 Maths notes for the Statistics chapter introduce data collection, organisation, and analysis. They explain mean, median, and mode in simple terms, with solved examples for clarity. The NCERT solutions for Class 9 Maths offer detailed steps for each exercise, helping students solve data-related problems accurately.
With Class 9 Maths tuition, learners gain hands-on practice in interpreting charts, graphs, and tables. Tutors make data handling engaging and interactive, ensuring students grasp every concept with confidence and ease.
Introduction
The branch of science known as Statistics has been used in India since ancient times. Statistics deals with the collection of numerical facts (i.e., data), their classification and tabulation, and their interpretation. In this chapter we study in detail the collection, classification and tabulation of such data.
Importance of Data
Expressing facts with the help of data is of great importance in our day-to-day life. For example, instead of saying that India has a large population, it is more appropriate to say that the population of India, based on the census of 2001, is more than one billion.
Collection of Data
On the basis of methods of collection, data can be divided into two categories:
Primary Data
Data collected for the first time by the statistical investigator, or with the help of his workers, are called primary data. For example, if an investigator wants to study the condition of the workers in a factory, he collects data such as their monthly income, expenditure, number of brothers and sisters, etc.
Secondary Data
These are the data already collected by a person or a society and these may be in published or unpublished form. These data should be carefully used. These are generally obtained from the following two sources:
(A) Published sources
(B) Unpublished sources
Classification of Data
When the data is compiled in the same form and order in which it is collected, it is known as Raw Data. It is also called Crude Data. For example, the marks obtained by 20 students of class X in English out of 10 marks are as follows:
7, 4, 9, 5, 8, 9, 6, 7, 9, 2,
0, 3, 7, 6, 2, 1, 9, 8, 3, 8
Such data can be classified on the following bases:
Geographical Basis
Here, the data is classified on the basis of place or region. For example, the production of food grains in different states is shown in the following table:
| S.No. | State | Production (in Tons) |
|---|---|---|
| 1 | Andhra Pradesh | 9690 |
| 2 | Bihar | 8074 |
| 3 | Haryana | 10065 |
| 4 | Punjab | 17065 |
| 5 | Uttar Pradesh | 28095 |
Chronological Classification
If data is classified on the basis of time (hour, day, week, month or year), it is called chronological classification. For example, the population of India in different years is shown in the following table:
| S.No. | Year | Population (in Crores) |
|---|---|---|
| 1 | 1951 | 46.1 |
| 2 | 1961 | 53.9 |
| 3 | 1971 | 61.8 |
| 4 | 1981 | 68.5 |
| 5 | 1991 | 88.4 |
| 6 | 2001 | 100.01 |
Qualitative Basis
When the data are classified into different groups on the basis of their descriptive qualities and properties, such a classification is known as descriptive or qualitative classification. Since such attributes cannot be measured directly, they are counted on the basis of the presence or absence of the quality. Examples of such attributes are intelligence, literacy, unemployment and honesty. The following table shows a classification on the basis of gender and employment.
Population (in lacs)
| Position of Employment ↓ / Gender → | Male | Female |
|---|---|---|
| Employed | 16.2 | 13.7 |
| Unemployed | 26.4 | 24.8 |
| Total | 42.6 | 38.5 |
Quantitative Basis
Facts that can be measured physically, e.g. marks obtained, height, weight, age, income, expenditure, etc., are known as variable values. If such facts are grouped into classes, it is called classification on a quantitative basis, or classification according to class intervals.
| Marks obtained | 10-20 | 20-30 | 30-40 | 40-50 |
|---|---|---|---|---|
| No. of students | 7 | 9 | 15 | 6 |
Variate
A numerical quantity whose value varies from one observation to another is called a variate. A variate is generally represented by x. There are two types of variate:
- Discrete variate: It takes only fixed, separate values. For example, the numbers of teachers in different branches of an institute are 30, 35, 40, etc.
- Continuous variate: It can take any value within a range, so it is expressed in groups such as 10 - 20, 20 - 30, etc.
Range
The difference between the maximum and the minimum values of the variable x is called the range.
Class Frequency
The number of observations that fall in a class is known as its class frequency.
Class Interval
Class Interval = Range / Number of classes
It is generally denoted by h or i.
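As a quick illustration, the range and class interval can be computed for the marks of the 20 students listed under Classification of Data. This is only a minimal sketch in Python: the function names are made up for the example, and the choice of 3 classes is an assumption, not something fixed by the notes.

```python
# Marks of the 20 students listed under "Classification of Data".
marks = [7, 4, 9, 5, 8, 9, 6, 7, 9, 2,
         0, 3, 7, 6, 2, 1, 9, 8, 3, 8]

def data_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)

def class_interval(values, number_of_classes):
    """Class interval h = range / number of classes."""
    return data_range(values) / number_of_classes

r = data_range(marks)         # 9 - 0 = 9
h = class_interval(marks, 3)  # 9 / 3 = 3.0 (3 classes is an assumed choice)
print(r, h)
```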
Class Limits
The lowest and the highest values of a class are known as the lower and upper limits respectively of that class.
Class Mark
The average of the lower and the upper limits of a class is called the mid value or the class mark of that class. It is generally denoted by x.
If x be the mid value and h be the class interval, then the class limits are (x - h/2, x + h/2)
Example: The mid values of a distribution are 54, 64, 74, 84 and 94. Find the class interval and class limits.
Solution:
The class interval is the difference of two consecutive class marks, therefore class interval (h) = 64 – 54 = 10.
Here the mid values are given and the class interval is 10.
So class limits are:
- For 1st class: 54 - 10/2 to 54 + 10/2 or 49 to 59
- For 2nd class: 64 - 10/2 to 64 + 10/2 or 59 to 69
- For 3rd class: 74 - 10/2 to 74 + 10/2 or 69 to 79
- For 4th class: 84 - 10/2 to 84 + 10/2 or 79 to 89
- For 5th class: 94 - 10/2 to 94 + 10/2 or 89 to 99
Therefore class limits are 49 - 59, 59 - 69, 69 - 79, 79 - 89, and 89 – 99.
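The same steps can be sketched in Python. The helper name `class_limits` is hypothetical, and the sketch assumes the mid values are equally spaced, as they are in this example.

```python
def class_limits(mid_values):
    """Return (h, [(lower, upper), ...]) for equally spaced class marks."""
    # The class interval is the difference of two consecutive class marks.
    h = mid_values[1] - mid_values[0]
    # Each class runs from x - h/2 to x + h/2.
    limits = [(x - h / 2, x + h / 2) for x in mid_values]
    return h, limits

h, limits = class_limits([54, 64, 74, 84, 94])
print(h)       # 10
print(limits)  # [(49.0, 59.0), (59.0, 69.0), (69.0, 79.0), (79.0, 89.0), (89.0, 99.0)]
```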
Frequency Distribution
The marks scored by 30 students of IX class, of a school in the first test of Mathematics out of 50 marks are as follows:
6, 32, 10, 17, 22, 28, 0, 48, 6, 22,
32, 6, 36, 26, 48, 10, 32, 48, 28, 22,
22, 22, 28, 26, 17, 36, 10, 22, 28, 0
The number of times a mark is repeated is called its frequency. It is denoted by f.
| Marks obtained | Tally mark | Frequency | Marks obtained | Tally mark | Frequency |
|---|---|---|---|---|---|
| 0 | II | 2 | 26 | II | 2 |
| 6 | III | 3 | 28 | IIII | 4 |
| 10 | III | 3 | 32 | III | 3 |
| 17 | II | 2 | 36 | II | 2 |
| 22 | IIII I | 6 | 48 | III | 3 |
The above type of frequency distribution is called an ungrouped frequency distribution. Although this representation is shorter than the raw data, it is still quite big from the point of view of comparison and analysis. So, to condense the frequency distribution, the data can be classified into groups as follows; this is called a grouped frequency distribution.
| Class | Frequency |
|---|---|
| 0 – 10 | 8 |
| 11 – 20 | 2 |
| 21 – 30 | 12 |
| 31 – 40 | 5 |
| 41 – 50 | 3 |
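Both tables above can be reproduced from the raw marks with a few lines of Python. This is a sketch rather than a prescribed method; it uses `collections.Counter` from the standard library, and the variable names are chosen only for this example.

```python
from collections import Counter

# Marks of the 30 students, as listed above.
marks = [6, 32, 10, 17, 22, 28, 0, 48, 6, 22,
         32, 6, 36, 26, 48, 10, 32, 48, 28, 22,
         22, 22, 28, 26, 17, 36, 10, 22, 28, 0]

# Ungrouped frequency distribution: mark -> number of times it occurs.
ungrouped = dict(sorted(Counter(marks).items()))

# Grouped frequency distribution, using the classes of the table above
# (both limits of each class are counted in that class).
classes = [(0, 10), (11, 20), (21, 30), (31, 40), (41, 50)]
grouped = {
    (low, high): sum(1 for m in marks if low <= m <= high)
    for (low, high) in classes
}

for mark, f in ungrouped.items():
    print(mark, f)
for (low, high), f in grouped.items():
    print(f"{low}-{high}", f)
```

The frequencies in each table add up to 30, the number of students, which is a useful check when a table is built by hand.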
Types of Frequency Distribution
Statistical methods such as comparison and decision making depend on the frequency distribution. Frequency distributions are of three types.
Individual Frequency Distribution:
Here each item, i.e. the original value of each unit, is written separately. In this category, the frequency of each value is one.
e.g. Total marks obtained by 10 students in a class.
| S. No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Marks obtained | 46 | 18 | 79 | 12 | 97 | 80 | 5 | 27 | 67 | 54 |
Discrete Frequency Distribution:
When the number of terms is large and the variate is discrete, i.e., it can take only some particular values within finite limits, and these values are repeated, the distribution is called a discrete frequency distribution. For example, the wages of employees and their numbers are shown in the following table:
| Monthly wages | No. Of employees |
|---|---|
| 4000 | 10 |
| 6000 | 8 |
| 8000 | 5 |
| 11000 | 7 |
| 20000 | 2 |
| 25000 | 1 |
The above table shows an ungrouped frequency distribution. The same facts can be written as a grouped frequency distribution as follows:
| Monthly wages | No. of employees |
|---|---|
| 0-10,000 | 23 |
| 11,000-20,000 | 9 |
| 21,000-30,000 | 1 |
Continuous Frequency Distribution:
When the number of terms is large and the variate is continuous, i.e., it can take all values within finite limits, and these values are repeated, the distribution is called a continuous frequency distribution. For example, the ages of students in a school are shown in the following table:
| Age (in year) | Class | No. of students |
|---|---|---|
| Less than 5 year | 0-5 | 72 |
| Between 5 and 10 year | 5-10 | 103 |
| Between 10 and 15 year | 10-15 | 50 |
| Between 15 and 20 year | 15-20 | 25 |
Classes can be made mainly by two methods:
Exclusive Series
In this method the upper limit of one class and the lower limit of the next class are the same. A value equal to the upper limit of a class is not counted in that class; it is counted in the next class.
Inclusive Series
In this method the lower and upper limits are both contained in the same class, so the upper limit of one class and the lower limit of the next class are not the same. Sometimes a value is not a whole number but a fraction or decimal that lies between two intervals. In such a situation the class intervals can be constructed as in column B below:
| Class (A) | Frequency (A) | Class (B) | Frequency (B) |
|---|---|---|---|
| 0-9 | 4 | 0-9.99 | 4 |
| 10-19 | 7 | 10-19.99 | 7 |
| 20-29 | 6 | 20-29.99 | 6 |
| 30-39 | 3 | 30-39.99 | 3 |
| 40-49 | 3 | 40-49.99 | 3 |
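The difference between the two methods can be sketched as two small Python functions that decide which class a value belongs to. The function names and the exclusive classes 0-10, 10-20, 20-30 are hypothetical; the inclusive classes are taken from columns A and B of the table above.

```python
def find_class_exclusive(value, classes):
    """Exclusive series: lower <= value < upper (the upper limit goes to the next class)."""
    for low, high in classes:
        if low <= value < high:
            return (low, high)
    return None

def find_class_inclusive(value, classes):
    """Inclusive series: lower <= value <= upper (both limits belong to the class)."""
    for low, high in classes:
        if low <= value <= high:
            return (low, high)
    return None

exclusive = [(0, 10), (10, 20), (20, 30)]            # hypothetical classes
inclusive_a = [(0, 9), (10, 19), (20, 29)]           # first classes of column A
inclusive_b = [(0, 9.99), (10, 19.99), (20, 29.99)]  # first classes of column B

print(find_class_exclusive(10, exclusive))     # (10, 20)
print(find_class_inclusive(19, inclusive_a))   # (10, 19)
print(find_class_inclusive(9.5, inclusive_a))  # None, since 9.5 lies between 9 and 10
print(find_class_inclusive(9.5, inclusive_b))  # (0, 9.99)
```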
Cumulative Frequency
Discrete Frequency Distribution
From the table of the discrete frequency distribution, we can read off how many employees have a monthly income of 4000, or how many have a monthly income of 11000. But if we want to know how many employees have a monthly income of up to 11000, we must add 10 + 8 + 5 + 7, i.e., the number of employees whose monthly income is up to 11000 is 30. Here we add all the previous frequencies to get the cumulative frequency. This is clearer from the following table:
| Class | Frequency (f) | Cumulative frequency (cf) | Explanation |
|---|---|---|---|
| 4000 | 10 | 10 | 10 = 10 |
| 6000 | 8 | 18 | 10 + 8 |
| 8000 | 5 | 23 | 18 + 5 |
| 11000 | 7 | 30 | 23 + 7 |
| 20000 | 2 | 32 | 30 + 2 |
| 25000 | 1 | 33 | 32 + 1 |
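The running total in the cf column can be sketched in Python with `itertools.accumulate` from the standard library. The variable names are chosen for this example only.

```python
from itertools import accumulate

wages = [4000, 6000, 8000, 11000, 20000, 25000]
frequencies = [10, 8, 5, 7, 2, 1]

# Each cumulative frequency is the sum of all frequencies up to that row.
cumulative = list(accumulate(frequencies))

for wage, f, cf in zip(wages, frequencies, cumulative):
    print(wage, f, cf)
# cumulative is [10, 18, 23, 30, 32, 33]
```

The last cumulative frequency equals the total number of employees, 33.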
Continuous Frequency Distribution
Earlier we obtained cumulative frequencies for a discrete series. A cumulative frequency table can be made for a continuous frequency distribution in the same way. For example, for the table of students' ages given above:
| Age in years, variate (x) | No. of students, frequency (f) | Cumulative frequency (cf) | Explanation |
|---|---|---|---|
| 0 – 5 | 72 | 72 | 72 = 72 |
| 5 – 10 | 103 | 175 | 72 + 103 = 175 |
| 10 – 15 | 50 | 225 | 175 + 50 = 225 |
| 15 - 20 | 25 | 250 | 225 + 25 = 250 |
Above table can also be written as follows:
| Class | Cumulative Frequency |
|---|---|
| Less than 5 | 72 |
| Less than 10 | 175 |
| Less than 15 | 225 |
| Less than 20 | 250 |
From this table, the number of students whose age is less than the upper limit of a class, i.e., less than 5, 10, 15 or 20 years, can be determined merely by reading the table. But if we need the number of students whose age is more than zero, more than 5, more than 10 or more than 15, then the table should be constructed as follows:
| Class | Frequency | Age | Cumulative frequency | Explanation |
|---|---|---|---|---|
| 0 – 5 | 72 | 0 and more | 250 | 250 = 250 |
| 5 – 10 | 103 | 5 and more | 178 | 250 – 72 = 178 |
| 10 - 15 | 50 | 10 and more | 75 | 178 – 103 = 75 |
| 15 - 20 | 25 | 15 and more | 25 | 75 – 50 = 25 |
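Both the "less than" and the "more than" cumulative frequencies above can be sketched in Python from the same class frequencies. The variable names are illustrative only.

```python
from itertools import accumulate

classes = [(0, 5), (5, 10), (10, 15), (15, 20)]
frequencies = [72, 103, 50, 25]

# "Less than upper limit": running total of the frequencies.
less_than = list(accumulate(frequencies))

# "Lower limit and more": total minus the frequencies of all earlier classes.
total = sum(frequencies)
more_than = [total - before for before in [0] + less_than[:-1]]

for (low, high), lt, mt in zip(classes, less_than, more_than):
    print(f"less than {high}: {lt}   {low} and more: {mt}")
# less_than is [72, 175, 225, 250]
# more_than is [250, 178, 75, 25]
```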
Graphical Representation of Data
We have discussed the representation of data in tabular form. There is another representation known as graphical representation of data.
Graphical representations are often easier to understand than the tabular form. We study the following graphical representations:
- Bar graphs
- Histograms
- Frequency polygons
Bar Graphs
A bar graph is a pictorial representation of data in which bars of uniform width are usually drawn with equal spacing between them on one axis, and the values of the variable are shown on the other axis.
Each rectangle or bar represents only one value of the data. So, the number of rectangles will be exactly the same as the number of values in the numerical data. The height (or length in case the base is on a vertical line) of each bar is proportional to the numerical values of the data. The height/length of each bar represents the numerical values of the data on a scale, selected suitably.
Histograms
This is a form of representation like the bar graph, but it is used for continuous class intervals.
For a continuous frequency distribution, a series of rectangles is constructed with widths equal to the widths of the classes and heights (or lengths) chosen so that the areas of the rectangles are proportional to the frequencies of the respective classes. If all the classes have the same width, the heights of the rectangles are taken proportional to the corresponding frequencies.
By selecting suitable scales on x-axis and y-axis, the rectangles are drawn leaving no gap in between consecutive rectangles. The figure drawn appears like a single solid figure and it is called a histogram.
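When the class widths are not all equal, one way to make the areas proportional to the frequencies is to take each height as frequency divided by class width. The Python sketch below illustrates this; the function name and the data are hypothetical and do not come from these notes.

```python
import math

def bar_heights(classes, frequencies):
    """Height of each rectangle = frequency / class width, so that area = frequency."""
    return [f / (high - low) for (low, high), f in zip(classes, frequencies)]

# Hypothetical distribution with unequal class widths.
classes = [(0, 10), (10, 20), (20, 40), (40, 70)]
frequencies = [5, 8, 12, 9]

heights = bar_heights(classes, frequencies)
# Area of each rectangle = height x width; it should give back the frequency.
areas = [h * (high - low) for h, (low, high) in zip(heights, classes)]

print(heights)
print(all(math.isclose(a, f) for a, f in zip(areas, frequencies)))  # True
```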
Frequency Polygons
It is another representation, obtained by joining the mid-points of the upper sides of the adjacent rectangles of a histogram with line segments. The polygon so formed is called a frequency polygon.
Frequency polygon of a given continuous frequency distribution can be drawn in two ways:
- With the help of the histogram of the given frequency distribution.
- Without taking the help of the histogram.
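Without a histogram, the polygon is drawn through the points (class mark, frequency). The Python sketch below lists these points for the marks table given under Quantitative Basis. It assumes equal class widths and follows the common convention of adding one class of zero frequency at each end so that the polygon closes on the axis; the function name is hypothetical.

```python
def polygon_points(classes, frequencies):
    """Vertices (class mark, frequency) of a frequency polygon for equal-width classes."""
    width = classes[0][1] - classes[0][0]
    marks = [(low + high) / 2 for low, high in classes]
    # Assumed convention: close the polygon with a zero-frequency class on each side.
    xs = [marks[0] - width] + marks + [marks[-1] + width]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))

classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [7, 9, 15, 6]

points = polygon_points(classes, frequencies)
print(points)
# [(5.0, 0), (15.0, 7), (25.0, 9), (35.0, 15), (45.0, 6), (55.0, 0)]
```

Joining these points in order with line segments gives the frequency polygon.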
Example: In a particular section of Class IX, 40 students were asked about the months of their birth and the following graph was prepared for the data so obtained:
[Figure: bar graph showing the number of students born in each month]
Observe the bar graph given above and answer the following questions:
- How many students were born in the month of November?
- In which month were the maximum number of students born?
Solution:
- 4 students were born in the month of November.
- The maximum number of students were born in the month of August.
Measures of Central Tendency
The variable in frequency table can be either qualitative or quantitative. In the case of quantitative variables the information contained in the raw data, or in frequency table can be presented by means of few numerical values. Methods providing such values are called measures of location or measures of central tendency and the numerical value obtained is called an average or central value.
The commonly used measures of central tendency are:
- Arithmetic mean
- Geometric mean
- Harmonic mean
- Median
- Mode
In this chapter we study the arithmetic mean, the median and the mode.
Mean
For ungrouped data:
If x₁, x₂, ….. xₙ are n values of a variable x, then the arithmetic mean of these values is given by
x̄ = (x₁ + x₂ + x₃ + … + xₙ) / n
= Σxᵢ / n (where i = 1 to n)
For grouped data:
If a variate x has values x₁, x₂, …, xₙ with the corresponding frequencies f₁, f₂, …, fₙ respectively, then the arithmetic mean of these values is given by
x̄ = (x₁f₁ + x₂f₂ + … + xₙfₙ) / N
= Σfᵢxᵢ / Σfᵢ, where N = Σfᵢ is the total frequency.
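Both formulas can be sketched in Python. The function names are made up for this example; the data are the marks of the 10 students from the individual frequency distribution table and the monthly wages from the discrete frequency distribution table given earlier.

```python
def mean_ungrouped(values):
    """Mean = (x1 + x2 + ... + xn) / n."""
    return sum(values) / len(values)

def mean_with_frequencies(values, frequencies):
    """Mean = sum(f_i * x_i) / N, where N = sum(f_i)."""
    n = sum(frequencies)
    return sum(f * x for x, f in zip(values, frequencies)) / n

marks = [46, 18, 79, 12, 97, 80, 5, 27, 67, 54]
print(mean_ungrouped(marks))  # 485 / 10 = 48.5

wages = [4000, 6000, 8000, 11000, 20000, 25000]
employees = [10, 8, 5, 7, 2, 1]
print(round(mean_with_frequencies(wages, employees), 2))  # 270000 / 33, about 8181.82
```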
Important Properties:
- (i) If x̄ is the mean of n observations x₁, x₂, …, xₙ, then the algebraic sum of the deviations about x̄ is 0, i.e. Σ(xᵢ − x̄) = 0.
- (ii) The mean of the observations x₁ ± a, x₂ ± a, x₃ ± a, …, xₙ ± a is x̄ ± a.
- (iii) The mean of the observations ax₁, ax₂, ax₃, …, axₙ is ax̄.
- (iv) The mean of the observations x₁/a, x₂/a, …, xₙ/a is x̄/a (a ≠ 0).
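These properties can be checked numerically. The Python sketch below uses the five values 10, 7, 13, 20, 15 (the same values as in Example 1 of this section) and a hypothetical constant a = 2; it is a check on one data set, not a proof.

```python
def mean(values):
    return sum(values) / len(values)

x = [10, 7, 13, 20, 15]
a = 2          # hypothetical constant chosen for the check
m = mean(x)    # 13.0

sum_of_deviations = sum(v - m for v in x)  # property (i): expected 0
shifted = mean([v + a for v in x])         # property (ii): expected m + a
scaled = mean([a * v for v in x])          # property (iii): expected a * m
divided = mean([v / a for v in x])         # property (iv): expected m / a

print(sum_of_deviations, shifted, scaled, divided)
# 0.0 15.0 26.0 6.5
```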
Median
If x₁, x₂, ….. xₙ are n values of a variable arranged in descending or ascending order, then
Median = value of [(n+1)/2]th observation if n is odd
Median = [value of (n/2)th observation + value of (n/2 + 1)th observation] / 2 if n is even
- (i) The median can be calculated graphically, while mean cannot be.
- (ii) Median is not affected by extreme values.
Mode
It is the value which occurs most frequently in a set of observations and around which the other items of the set cluster densely.
Uses of Mode: Mode is the average to be used to find the ideal size, e.g., in business forecasting, in manufacture of ready-made garments, shoes etc.
Empirical Relation between Mode, Median & Mean:
Mode = 3 Median - 2 Mean
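A small Python sketch of this relation follows. The function name and the input values are hypothetical, chosen only for illustration; the relation is empirical, so the result is an estimate of the mode rather than an exact value.

```python
def empirical_mode(mean, median):
    """Estimate the mode from the empirical relation Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean

# Hypothetical values, not taken from these notes.
estimate = empirical_mode(mean=26, median=25)
print(estimate)  # 3 * 25 - 2 * 26 = 23
```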
Example 1: 5 people were asked about the time in a week they spend in doing social work in their community. They said 10, 7, 13, 20 and 15 hours, respectively. Find the mean (or average) time in a week devoted by them for social work.
Solution:
Mean x̄ = Sum of all the observations / Total number of observations
= (x₁ + x₂ + x₃ + x₄ + x₅) / 5
= (10 + 7 + 13 + 20 + 15) / 5
= 65/5 = 13
So, the mean time spent by these 5 people in doing social work is 13 hours in a week.
Example 2: The heights (in cm) of 9 students of a class are as follows: 155, 160, 145, 149, 150, 147, 152, 144, 148. Find the median of this data.
Solution:
First of all we arrange the data in ascending order, as follows:
144, 145, 147, 148, 149, 150, 152, 155, 160
Since the number of students is 9, an odd number, we find out the median by finding the height of the [(n+1)/2]th = [(9+1)/2]th = the 5th student, which is 149 cm.
So, the median height is 149 cm.
Example 3: The points scored by a Kabaddi team in a series of matches are as follows: 17, 2, 7, 27, 15, 5, 14, 8, 10, 24, 48, 10, 8, 7, 18, 28. Find the median of points scored by the team.
Solution:
Arranging the points scored by the team in ascending order, we get
2, 5, 7, 7, 8, 8, 10, 10, 14, 15, 17, 18, 24, 27, 28, 48
There are 16 terms. So there are two middle terms, i.e. the (16/2)th and (16/2 + 1)th, i.e. the 8th and 9th terms.
So, the median is the mean of the values of the 8th and 9th terms.
i.e., the median = (10 + 14)/2 = 12
So, the median point scored by the Kabaddi team is 12.
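The rule for odd and even n used in Examples 2 and 3 can be sketched as one Python function. The function name is chosen for this example; Python's standard library also provides `statistics.median`, which gives the same results here.

```python
def median(values):
    """Median by the rule above: middle term if n is odd, mean of the two middle terms if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # [(n + 1) / 2]th observation (list indices start at 0)
        return ordered[(n + 1) // 2 - 1]
    # Mean of the (n / 2)th and (n / 2 + 1)th observations
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

heights = [155, 160, 145, 149, 150, 147, 152, 144, 148]               # Example 2
points = [17, 2, 7, 27, 15, 5, 14, 8, 10, 24, 48, 10, 8, 7, 18, 28]  # Example 3

print(median(heights))  # 149
print(median(points))   # 12.0
```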
Example 4: Find the mode of the following marks (out of 10) obtained by 20 students: 4, 6, 5, 9, 3, 2, 7, 7, 6, 5, 4, 9, 10, 10, 3, 4, 7, 6, 9, 9
Solution:
We arrange this data in the following form:
2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6, 7, 7, 7, 9, 9, 9, 9, 10, 10
Here 9 occurs most frequently, i.e., four times. So, the mode is 9.
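Counting the occurrences can also be sketched in Python with `collections.Counter`. The function name `modes` is hypothetical; it returns a list because a data set may have more than one value with the highest frequency.

```python
from collections import Counter

def modes(values):
    """Return the value(s) that occur most frequently, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

marks = [4, 6, 5, 9, 3, 2, 7, 7, 6, 5, 4, 9, 10, 10, 3, 4, 7, 6, 9, 9]
print(modes(marks))        # [9]
print(Counter(marks)[9])   # 4
```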
Practice Questions
Ques. For the following frequency distribution, draw a histogram and construct a frequency polygon with it.
| Class | 20 – 30 | 30 – 40 | 40 – 50 | 50 – 60 | 60 - 70 |
|---|---|---|---|---|---|
| Frequency | 8 | 12 | 17 | 9 | 4 |
Ques. Draw a frequency polygon of the following frequency distribution table.
| Class | 0–10 | 10–20 | 20–30 | 30–40 | 40–50 | 50–60 | 60–70 | 70–80 | 80–90 | 90–100 |
|---|---|---|---|---|---|---|---|---|---|---|
| Frequency | 8 | 10 | 6 | 7 | 9 | 8 | 8 | 6 | 3 | 4 |
Ques. Draw a frequency polygon of the following frequency distribution.
| Age (in years) | 0 – 10 | 10 – 20 | 20 – 30 | 30 – 40 | 40 – 50 | 50 – 60 |
|---|---|---|---|---|---|---|
| Frequency | 15 | 12 | 10 | 4 | 11 | 14 |
Ques. Find the median of the following values: 37, 31, 42, 43, 46, 25, 39, 45, 32
Ques. The median of the observations 11, 12, 14, 18, x + 2, x + 4, 30, 32, 35, 41 arranged in ascending order is 24. Find the value of x.
Exercise 1
Ques. The mode of the data: 5, 7, 8, 8, 9, 8, 5, 8, 10, 8, 7, is
(a) 10
(b) 5
(c) 8
(d) none of these
Ques. The median of the data given below: 2, 7, 9, 13, 20, 22, 24, 25, 27, 28, 35, 40 is
(a) 24
(b) 23
(c) 25
(d) none of these
Ques. Following data have been arranged in the ascending order. 29, 32, 48, 50, x, x + 2, 72, 78, 84, 95. If the median of the data is 63, the value of x is
(a) 31
(b) 62
(c) 124
(d) none of these
Ques. Mean of 10, 12, 18, 13, 20 and 17 is
(a) 14
(b) 15
(c) 16
(d) none of these
Ques. Mean of first 8 prime numbers is
(a) 9.625
(b) 8.625
(c) 10.625
(d) none of these
Ques. Mean of 10, 12, 16, 20, p and 26 is 17, value of p is
(a) 16
(b) 18
(c) 20
(d) none of these
Ques. Mean of 10 observations is 20 and that of other 15 observations is 16. Mean of all 25 observations will be
(a) 16.6
(b) 18.6
(c) 19.6
(d) 17.6
Ques. Class marks of a distribution are: 47, 52, 57, 62, 67, 72, 77, 82. Class size of the given distribution is
(a) 5
(b) 10
(c) 15
(d) none of these
Ques. The mean of the data x₁, x₂,...., xₙ is 'a', then the mean of the data x₁ + a, x₂ + a,...., xₙ + a is
(a) a
(b) 2a
(c) a/2
(d) none of these
Ques. The mean of the data y₁, y₂,...., yₙ is 102, then mean of the data 5y₁, 5y₂,...., 5yₙ is
(a) 102
(b) 204
(c) 510
(d) 606
Answers to Exercise 1
1. (c) 2. (b) 3. (b) 4. (b) 5. (a)
6. (b) 7. (d) 8. (a) 9. (b) 10. (c)
Exercise 2
Ques. Find the sum of the deviations of the variates 6, 8, 10, 16, 20, 24 from their mean.
Ques. Write the class size and class limits in each of the following:
(i) 104, 114, 124, 134, 144, 154, and 164
(ii) 47, 52, 57, 62, 67, 72, 77, 82, 87, 92, 97, and 102
(iii) 12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5
Ques. Explain the difference between a frequency distribution and a cumulative frequency distribution.
Ques. Explain the meaning of the following terms:
(i) Variate
(ii) Class-interval
(iii) Class-size
(iv) Class-mark
(v) Frequency
(vi) Class limits
(vii) True class limits
Ques. Following are the ages of 360 patients getting medical treatment in a hospital on a day:
| Age (in years): | 10 – 20 | 20 – 30 | 30 – 40 | 40 – 50 | 50 – 60 | 60 – 70 |
|---|---|---|---|---|---|---|
| No. of Patients: | 90 | 50 | 60 | 80 | 50 | 30 |
Prepare a histogram for the above data
Ques. The mean of five observations x, x + 2, x + 4, x + 6, x + 8 is 11, find the mean of first three observations.
Ques. Find the missing frequencies in the following frequency distribution if it is known that the mean of the distribution is 50.
| x: | 10 | 30 | 50 | 70 | 90 | Total |
|---|---|---|---|---|---|---|
| f: | 17 | f₁ | 32 | f₂ | 19 | 120 |
Answers to Exercise 2
1. 0
2. (i) Class size = 10 (ii) Class size = 5 (iii) Class size = 5
6. 9
7. f₁ = 28, f₂ = 24
Exercise 3
Ques. Find the mean of following data: 13, 17, 16, 14, 11, 13, 10, 16, 11, 18, 12, 17.
Ques. Find the median of following data: 38, 70, 48, 34, 42, 55, 63, 46, 54, 44.
Ques. Find the mode of following data: 2, 2, 6, 5, 4, 3, 4, 5, 7, 9, 4, 5, 3, 1, 10, 4.
Ques. Find the value of p if the median of the following observations is 48: 14, 17, 33, 35, p - 5, p + 7, 57, 63, 69, 80. [The above observations are in ascending order.]
Ques. Draw a histogram to represent the following data:
| Class Interval | 40-60 | 60-80 | 80-100 | 100-120 | 120-140 | 140-160 | 160-180 | 180-200 |
|---|---|---|---|---|---|---|---|---|
| Frequency | 20 | 40 | 30 | 50 | 30 | 20 | 10 | 40 |
Answers to Exercise 3
1. 14
2. 47
3. 4
4. p = 47