Some Concepts Used in Statistics

(i) Variable

A variable is a characteristic that can be expressed numerically and whose value differs from one unit or observation to another. Examples of variables include age, height, weight, price, production, income, expenditure, sales, and profit. These variables are measured in different units, such as age in years, weight in kilograms, height in inches or centimeters, and income in rupees.

Variables can be divided into two types based on the dependence of their values: independent variables and dependent variables. If the value of one variable does not depend on the value of another variable, it is called an independent variable. If the value of a variable is determined by the value of another variable, it is called a dependent variable.

Similarly, variables can be divided into two types based on the nature of their values: discrete variables and continuous variables. These are discussed below.

(a) Discrete Variable

If a variable can take only whole-number (integer) values, not fractions or decimals, it is called a discrete variable.

There is a gap between the values that a discrete variable can take. Examples of discrete variables include the number of workers in an industry, the number of road accidents, and the number of family members.

(b) Continuous Variable

If a variable can take all possible values (integers and fractions) within a certain range, it is called a continuous variable. Examples of continuous variables include weight, age, height, distance, and temperature.

(ii) Frequency Distribution

The number of times a particular value of a variable occurs is called the frequency of that value. In a frequency distribution, the different values of a variable and their corresponding frequencies are presented in a table. 

Generally, the values of the variable are listed individually or grouped into classes on the left side of the table, and the total number of units (frequency) for each value or group is presented on the right side of the table.
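As a minimal sketch of this idea, the Python snippet below tallies a small list of family sizes and prints the values on the left with their frequencies on the right. The data in `family_sizes` is made up for illustration and does not come from the text; the counting uses `collections.Counter` from the standard library.

```python
from collections import Counter

# Hypothetical raw data: number of family members in 10 households.
family_sizes = [4, 5, 3, 4, 6, 4, 5, 3, 4, 5]

# Counter maps each distinct value to the number of times it occurs.
frequency = Counter(family_sizes)

# Values in ascending order on the left, frequencies on the right.
print("Value  Frequency")
for value in sorted(frequency):
    print(f"{value:<6} {frequency[value]}")
```

The frequencies always add up to the number of observations, which is a quick check that no item was missed while tallying.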

A frequency distribution that shows the frequencies of only one variable is called a univariate frequency distribution.

If data is classified based on two variables at the same time and their frequencies are presented, it is called a bivariate frequency distribution.

(iii) Statistical Series

If data is presented in a suitable order or a logical and systematic order, it is called a statistical series. Generally, there are three types of statistical series. The types of statistical series can also be called types of univariate frequency distribution, which are presented below.

(a) Individual Series

If the value of each item in the data is presented individually or separately, it is called an individual series. The items in an individual series are not placed in any class or group, and their frequencies are not given. That is, the frequency of each item in such a series is 1.

An individual series can be presented in ascending or descending order. 

Below is an example of an individual series showing the marks obtained by five students of grade XII in Economics.

Student Name    Marks
Mahesh          46
Pujan           60
Rama            80
Diana           50
Soyam           70
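The ordering of an individual series can be sketched in Python using the marks listed above. The variable names are chosen here for illustration only.

```python
# Marks from the individual series above, in the order given.
marks = {"Mahesh": 46, "Pujan": 60, "Rama": 80, "Diana": 50, "Soyam": 70}

# Ascending order: smallest mark first.
ascending = sorted(marks.values())

# Descending order: largest mark first.
descending = sorted(marks.values(), reverse=True)

print(ascending)   # [46, 50, 60, 70, 80]
print(descending)  # [80, 70, 60, 50, 46]
```

Each value appears once in its own right, so no frequency column is needed.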

(b) Discrete Series

Data presented with different values of a discrete variable and their corresponding frequencies is called a discrete series. In such a frequency distribution, the values of the variable are presented in order without repetition, along with their corresponding frequencies. 

Below is an example of a discrete series showing the marks obtained by 50 students of grade XI in Economics.

Marks    Number of Students
20       5
40       9
50       8
60       12
80       10
90       6

(c) Continuous Series

The frequency distribution of the values of a continuous variable is called a continuous series. In such a series, since a continuous variable can take values within a certain range, different classes are formed, and the frequencies corresponding to these classes are presented. 

Such a frequency distribution is also called class interval classification or class interval frequency distribution. 

Below is an example of a continuous series showing the daily wages of 40 workers in a city.

Daily Wages (in Rs.)    Number of Workers
100-200                 5
200-300                 9
300-400                 8
400-500                 12
500-600                 6
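To illustrate how such a series is built from raw data, the sketch below groups a short list of wages into the same classes as the table. The `wages` list is hypothetical and is not the data behind the table. The sketch also assumes the exclusive convention described later in this article, where a value equal to an upper limit is counted in the next class.

```python
# Hypothetical daily wages (in Rs.) of 10 workers; not the data behind the table.
wages = [120, 250, 310, 480, 199.5, 560, 430, 275, 150, 405]

# Class limits borrowed from the table: 100-200, 200-300, ..., 500-600.
lower_limits = [100, 200, 300, 400, 500]
width = 100

# Count the wages in each class (lower limit included, upper limit excluded).
series = {}
for lower in lower_limits:
    upper = lower + width
    series[f"{lower}-{upper}"] = sum(1 for w in wages if lower <= w < upper)

for label, count in series.items():
    print(f"{label:<10} {count}")
```

A fractional value such as 199.5 finds a place in the first class, which is why classes rather than individual values are used for a continuous variable.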

(iv) Class/Class Interval

When data is grouped, each group of values is called a class or class interval. Each class has a minimum and a maximum value, which are called class limits. The minimum value of each class is called the lower limit, and the maximum value is called the upper limit. In other words, the value on the left side of the class is the lower limit, and the value on the right side is the upper limit.

If data is classified by forming classes/class intervals, such classification is called class interval classification or class interval frequency distribution or class interval tabulation. 

The types of classes/class intervals are mentioned below.

(a) Inclusive Classes/Class Intervals

In a frequency distribution, if both the lower and upper limits of each class are included in that class, such classes are called inclusive classes. In this type of class, a gap appears between the upper limit of one class and the lower limit of the next class. 

Below are inclusive classes presented with their frequencies.

Classes    Frequency
6-9        7
10-13      4
14-17      2
18-21      6
22-25      2

The classes given above have a width of 4. There is a gap between one class and the next class. The first class 6-9 includes all values from 6 to 9. Similarly, the next value 10 falls in the class 10-13. However, fractional or decimal values between 9 and 10 (e.g., 9.1, 9.5, 9.8, etc.) do not fall in either the first or the second class. 

Therefore, inclusive classification is used for the frequency distribution of discrete variables. Examples of discrete variables include the marks obtained by students in an examination, the number of family members, and the number of road accidents. 

The values of these discrete variables are integers. Continuous variables such as age, height, and weight can take fractional values as well as integers, so inclusive classes cannot be used for their frequency distribution.
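The gap between inclusive classes can be checked with a short Python sketch that uses the classes from the table above. The helper name `find_inclusive_class` is hypothetical and introduced only for this example.

```python
# Inclusive classes from the table above: both limits belong to the class.
inclusive_classes = [(6, 9), (10, 13), (14, 17), (18, 21), (22, 25)]

def find_inclusive_class(value):
    """Return the (lower, upper) class containing value, or None if it falls in a gap."""
    for lower, upper in inclusive_classes:
        if lower <= value <= upper:
            return (lower, upper)
    return None

print(find_inclusive_class(9))    # (6, 9)
print(find_inclusive_class(10))   # (10, 13)
print(find_inclusive_class(9.5))  # None
```

The value 9.5 belongs to no class, which shows why this kind of classification suits only integer-valued data.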

(b) Exclusive Classes/Class Intervals

Inclusive classes do not include the values that fall between two classes. Therefore, exclusive classes are formed for the frequency distribution of continuous variables. 

In exclusive classes, there is no gap between one class and the next class. That is, the upper limit of one class and the lower limit of the next class are the same. 

Below are exclusive classes presented with their frequencies.

Classes    Frequency
20-40      5
40-60      2
60-80      2
80-100     6
100-120    8

In exclusive classes, the upper limit of a class is not included in that class but is included in the next class. Since the upper limit of each class is not included in that class, these types of classes are called exclusive classes. 
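As a rough sketch of this rule, the Python snippet below assigns values to the exclusive classes from the table above, treating the lower limit as included and the upper limit as excluded. The helper name `find_exclusive_class` is hypothetical.

```python
# Exclusive classes from the table above: the upper limit belongs to the next class.
exclusive_classes = [(20, 40), (40, 60), (60, 80), (80, 100), (100, 120)]

def find_exclusive_class(value):
    """Return the (lower, upper) class with lower <= value < upper, or None."""
    for lower, upper in exclusive_classes:
        if lower <= value < upper:
            return (lower, upper)
    return None

print(find_exclusive_class(39.9))  # (20, 40)
print(find_exclusive_class(40))    # (40, 60)
print(find_exclusive_class(120))   # None
```

The value 40 lands in 40-60 rather than 20-40, and 120 lies outside the table because the upper limit of the last class is also excluded.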

Therefore, inclusive classes should be used for discrete variables, and exclusive classes should be used for continuous variables. 

As far as possible, the lower limit of the first class should be zero (0) or a number divisible by 5.

For example, if the minimum value of a variable is 24 and the class interval is 10, it is better to take the lower limit of the first class as 20 instead of 24.
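One possible way to apply this guideline, shown here only as a sketch, is to round the minimum value down to the nearest multiple of 5. The function `first_lower_limit` and its `multiple` parameter are assumptions made for this example and are not a rule stated in the text.

```python
def first_lower_limit(minimum, multiple=5):
    """Round the smallest observed value down to the nearest multiple (hypothetical helper)."""
    return (minimum // multiple) * multiple

print(first_lower_limit(24))  # 20
print(first_lower_limit(3))   # 0
```

With a minimum of 24, the first class would start at 20, which matches the example above.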

Inclusive class intervals can be converted into exclusive class intervals. For this, the difference between the upper limit of one inclusive class interval and the lower limit of the next class interval is calculated. This difference is then halved to obtain the correction factor. 

This correction factor is subtracted from the lower limits of the inclusive class intervals and added to the upper limits. The new upper and lower limits of the exclusive class intervals thus formed are called class boundaries. 
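The conversion can be sketched in Python using the inclusive classes 6-9, 10-13, 14-17, 18-21, and 22-25 shown earlier. The variable names are chosen for illustration.

```python
# Inclusive classes from the earlier example.
inclusive_classes = [(6, 9), (10, 13), (14, 17), (18, 21), (22, 25)]

# Correction factor: half the gap between the upper limit of one class
# and the lower limit of the next class, here (10 - 9) / 2 = 0.5.
correction = (inclusive_classes[1][0] - inclusive_classes[0][1]) / 2

# Subtract the factor from each lower limit and add it to each upper limit.
boundaries = [(lower - correction, upper + correction)
              for lower, upper in inclusive_classes]

for (lower, upper), (lb, ub) in zip(inclusive_classes, boundaries):
    print(f"{lower}-{upper}  ->  {lb}-{ub}")
```

After the conversion, the upper boundary of each class (for example 9.5) equals the lower boundary of the next, so the gap disappears.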

The exact midpoint of a class is called the mid-value of that class. To find the mid-value, the lower and upper limits of the class are added together and divided by two.
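A minimal Python sketch of this calculation follows. The function name `mid_value` is chosen here for illustration.

```python
def mid_value(lower, upper):
    """Mid-value of a class: the average of its lower and upper limits."""
    return (lower + upper) / 2

print(mid_value(20, 40))  # 30.0
print(mid_value(6, 9))    # 7.5
```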

(c) Open-End Classes

If, in a classification, the lower limit of the first class or the upper limit of the last class is not fixed, it is called open-end class classification. A class interval/class with one limit unspecified is called an open-end class. 

Below are the marks obtained by 20 students, presented as an open-end class classification.

Marks           Number of Students
0-10            5
10-20           2
20-40           4
40-50           2
50-60           6
More than 60    1

(v) Cumulative Frequency Distribution

A simple frequency distribution only shows how many times a particular value of a variable or class is repeated. Cumulative frequency helps to know the total number of values that are more or less than a particular value of a variable or class. 

Cumulative frequency distribution is a modified form of a simple frequency distribution. Cumulative frequency distribution is formed by adding the frequencies of the values of the variable (class) given in the frequency distribution based on specific rules. 

The frequencies obtained in this way are called cumulative frequencies (C.F.).

A cumulative frequency distribution is prepared by considering only one limit of each class interval, either the upper or the lower, rather than both. When the cumulative frequency distribution is based on the upper limits, the words "less than" are used. Similarly, when it is based on the lower limits, the words "more than" are used.
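Both forms can be sketched in Python using the daily wages series given earlier in this article (classes 100-200 to 500-600 with frequencies 5, 9, 8, 12, and 6). The running totals use `itertools.accumulate` from the standard library, and the variable names are chosen for illustration.

```python
from itertools import accumulate

# Daily wages series from the continuous series example.
classes = [(100, 200), (200, 300), (300, 400), (400, 500), (500, 600)]
frequencies = [5, 9, 8, 12, 6]

# "Less than" C.F.: running total from the first class, labelled by upper limits.
less_than = list(accumulate(frequencies))

# "More than" C.F.: running total from the last class, labelled by lower limits.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for (lower, upper), lt, mt in zip(classes, less_than, more_than):
    print(f"Less than {upper}: {lt:<4} More than {lower}: {mt}")
```

The last "less than" figure and the first "more than" figure both equal 40, the total number of workers.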
