Introduction to Statistical Diagrams and Graphs
Data can be presented through diagrams and graphs. Data presented in this way is attractive, easy to understand, and easy to remember.
Diagrams and graphs present data using geometric figures such as points, lines, squares, rectangles, circles, and cubes, as well as pictures, maps, and charts. They are widely used to present data in books, newspapers, annual reports, and advertisements.
Diagrams and graphs are more useful than raw or grouped data for the following reasons:
(a) Data presented with the help of diagrams and graphs is simple and understandable.
(b) Raw or grouped data is dull, whereas data presented in diagrams and graphs is attractive.
(c) Data presented in diagrams and graphs stays in the mind longer than raw or grouped data.
(d) Diagrams and graphs make it easy to compare data and find interrelationships.
(e) Diagrams and graphs simplify complex facts.
Bar Diagrams
A diagram that presents data as vertical rectangles (bars) is generally called a bar diagram. The magnitude of each value is shown by the height of its bar. Bar diagrams can be broadly classified as follows:
(a) Simple Bar Diagram
A simple bar diagram is used to compare two or more values of a single variable. It can be used to show sales volume, profit, production, population, etc., over different time periods.
(b) Component Bar Diagram
A component bar diagram is used when each value of the variable under study needs to be shown divided into its components. It makes it possible to compare the total values of the variable, to compare the same component across bars, and to compare one type of component with another.
Therefore, a component bar diagram is used when the magnitude of a given variable needs to be divided into different parts, subcategories, or elements.
A component bar diagram is constructed as follows:
(i) First, construct a bar representing the total.
(ii) Then, divide this bar into the required components. Distinguish the components by shading or coloring them differently, drawing diagonal lines, or using dots, and show a key to these markings alongside the bar.
(iii) If there is more than one bar, the order of the components in the bars should be uniform. This makes comparison easier.
(iv) If there is only one bar, place the largest component at the bottom (base) and arrange the others in descending order, with the smallest component at the top.
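The arithmetic behind steps (i), (ii), and (iv) for a single bar can be sketched in Python. This is only an illustration under stated assumptions: the function name stack_segments and the sample figures are hypothetical and do not come from the text.

```python
def stack_segments(components):
    """Return (name, bottom, top) for each component of one bar.

    Components are ordered largest first, so the largest sits at the base
    and the smallest at the top, as in step (iv).
    """
    ordered = sorted(components.items(), key=lambda item: item[1], reverse=True)
    segments = []
    bottom = 0
    for name, value in ordered:
        top = bottom + value  # each segment starts where the previous one ended
        segments.append((name, bottom, top))
        bottom = top
    return segments


# Hypothetical monthly expenses, for illustration only.
expenses = {"Rent": 25, "Food": 40, "Other": 15, "Education": 20}
for name, bottom, top in stack_segments(expenses):
    print(f"{name}: from {bottom} to {top}")
```

The top of the last segment equals the total, which is the height of the bar drawn in step (i). For several bars, the same component order would be reused for every bar, as step (iii) requires, rather than sorting each bar separately.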
(c) Percentage Bar Diagram
A percentage bar diagram is formed when a component bar diagram is presented on a percentage basis. The total value of each set of data is taken as 100, its components are expressed as percentages of that total, and cumulative percentages are calculated.
The bar is then drawn from these figures: the height of each bar is taken as 100 units, and the height of each component within the bar is determined by its percentage value.
Percentage bar diagrams are considered more convenient and useful for comparing two or more groups of data.
Additionally, percentage bar diagrams are useful for showing the relative importance of components in the total value. In a percentage bar diagram, the height of all bars is equal, i.e., 100 units.
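The percentage and cumulative percentage calculation described above can be sketched in Python. The function name percentage_breakdown and the sample figures are hypothetical, chosen only for illustration.

```python
def percentage_breakdown(components):
    """Return (name, percentage, cumulative percentage) for each component."""
    total = sum(components.values())
    running = 0
    rows = []
    for name, value in components.items():
        running += value
        percentage = value * 100 / total      # component as a share of the total
        cumulative = running * 100 / total    # where this segment ends on the bar
        rows.append((name, round(percentage, 2), round(cumulative, 2)))
    return rows


# Hypothetical cost structure of one product, for illustration only.
costs = {"Materials": 50, "Labour": 30, "Overheads": 20}
for name, percentage, cumulative in percentage_breakdown(costs):
    print(f"{name}: {percentage}% (cumulative {cumulative}%)")
```

The cumulative percentage of each component marks the height at which its segment ends, so the last component always ends at 100.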
(d) Multiple Bar Diagram
A multiple bar diagram is used to show two or more groups of interrelated data. A simple bar is drawn for each value in each group, and the bars belonging to the same group are shown adjacent to each other.
However, an appropriate and equal space is left between bars of different groups. Different colors, shades, dots, or diagonal lines are used to distinguish the different bars within a group, and these are indicated by a legend or index.
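The spacing rule above (bars of one group adjacent, an equal gap between groups) can be sketched as a small position calculation in Python. The function name bar_positions and the default bar width and gap are assumptions made for this illustration.

```python
def bar_positions(n_groups, bars_per_group, bar_width=1.0, gap=1.0):
    """Return the left edge of every bar, one list per group.

    Bars within a group touch each other; consecutive groups are
    separated by the same gap.
    """
    positions = []
    group_span = bars_per_group * bar_width + gap  # width of one group plus its gap
    for g in range(n_groups):
        start = g * group_span
        positions.append([start + b * bar_width for b in range(bars_per_group)])
    return positions


# Two groups (for example, two years) with three bars each.
for group in bar_positions(2, 3):
    print(group)
```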
Pie Chart
A pie chart is used to show the division of the total value of data into different components. The circle represents the total value, and each sector shows the proportion or percentage of the total contributed by one component.
Thus, the diagram formed by showing data in a circle and its various components is called a pie chart.
Steps for Constructing a Pie Chart:
(a) Since the total angle at the centre of a circle is 360°, consider the total value of the given components as 360°.
(b) Convert the given value of each component into degrees using the following formula:
Degree value of component = (Value of component / Total value) × 360°
After the value of each component has been expressed in degrees, the degree values of all components will add up to 360°.
(c) Draw a circle with a radius of appropriate size according to the size of the paper.
(d) Draw a radius from the center of the circle to the circumference. Using this radius as a base, draw an angle at the center of the circle according to the degree value of one component. The line drawn to form the angle with the initial radius should also be a radius of the circle, i.e., it should start from the center and touch the circumference.
Then, construct an angle equal to the angular value of another component (using the last drawn radius as the base). Continue this process to construct angles at the center of the circle equal to the angular values of all components (use a protractor or compass for constructing the angles).
(e) Fill each sector of the circle with a different shade, color, dot pattern, or set of diagonal lines to distinguish it from the others. Write the name of each component inside its sector, or outside the circle using arrows or a legend.
(f) After constructing the pie chart in this way, give it an appropriate title.
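The conversion in step (b) can be sketched in Python. This is a minimal illustration: the function name pie_angles and the sample figures are hypothetical and are not taken from the text.

```python
import math


def pie_angles(components):
    """Convert each component value into its angle at the centre, in degrees."""
    total = sum(components.values())
    return {name: value * 360 / total for name, value in components.items()}


# Hypothetical household budget, for illustration only.
budget = {"Food": 50, "Rent": 30, "Other": 20}
angles = pie_angles(budget)
for name, angle in angles.items():
    print(f"{name}: {angle} degrees")

# The angles of all sectors should add up to the full circle.
assert math.isclose(sum(angles.values()), 360)
```

The final check mirrors the remark in step (b) that the degree values of all components add up to 360°.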
Graphs
Graphs are used to show the relationship between variables. They are particularly well suited to presenting time series and frequency distributions.
Graphs are usually drawn on a special type of paper called graph paper.
Types of Graphs
There are various types of graphs. Some of the main graphs are mentioned below.
(a) Graph of Time Series
A time series is a set of data arranged in order of time. Time periods can be measured in years, months, weeks, days, hours, etc. Series of economic and business data are often in the form of time series.
For example, data on a country's population, currency in circulation, bank deposits, production volume, commodity prices, sales, profit, imports, exports, etc., are presented based on time periods.
A time series has two variables: an independent variable and a dependent variable. Time period is taken as the independent variable, and the event or subject selected for study is the dependent variable. The independent variable is shown on the X-axis, and the dependent variable is shown on the Y-axis.
When a time series is presented geometrically, it becomes a time series graph. A line graph is formed by plotting a point for each pair of values of the independent variable (X) and the dependent variable (Y) and joining the points in sequence. In a line graph, the scale on the Y-axis should start from zero at the origin O, unless an artificial base line is used.
Time series graphs are increasingly used because they are simple to understand and easy to construct.
Note: When constructing a time series graph, if there is a large difference between zero (the origin) and the minimum value of the dependent variable (Y), an artificial base line (also called a false base line) is used.
For this, a pair of zigzag lines running from left to right is drawn between the origin and the minimum value of the dependent variable, with a small gap between the two lines, to show that part of the Y-axis scale has been omitted.
With an artificial base line, the space between the origin and the minimum value of the dependent variable is not wasted, and the graph is not forced into the upper part of the paper.
An artificial base line also removes the need for very large paper and makes better use of the space available. As a result, the relationship between the dependent and independent variables can be seen more easily.
(b) Histogram
A histogram is used to present a continuous series or frequency distribution graphically. When constructing a histogram, the variable is shown on the X-axis, and the frequency is shown on the Y-axis.
When constructing a histogram, the base of the rectangle is determined by making its width equal to the class interval. The height of the rectangle is determined such that the area of each rectangle is proportional to the frequency of the corresponding class interval.
In other words, if the class intervals in the frequency distribution are equal, the frequency of each class is used directly as the height of its rectangle. However, if the class intervals are unequal, the heights of the rectangles are adjusted proportionally.
If the width of a class interval is a certain multiple of the smallest class interval, the frequency of that class is divided by the same multiple to give the height of its rectangle.
The rectangles constructed in this way are placed vertically adjacent to each other with their bases on the X-axis to form a histogram.
When constructing a histogram, there should be no space left between one rectangle and another. If the lower limit of the first class (the smallest value of the variable) in a continuous series or frequency distribution starts from zero (0) or the origin, the first rectangle of the histogram is joined to the Y-axis.
If the lower limit of the first class is far from zero (0) or the origin, the X-axis is compressed between the origin and the lower limit of the first class so that the first rectangle can be drawn closer to the Y-axis. The remaining classes then follow in sequence.
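The height adjustment for unequal class intervals described above can be sketched in Python. The function name adjusted_heights and the sample distribution are hypothetical and serve only as an illustration.

```python
def adjusted_heights(classes):
    """Return the rectangle height for each class of a histogram.

    classes is a list of (lower limit, upper limit, frequency).
    A class that is k times as wide as the smallest class has its
    frequency divided by k, so that area stays proportional to frequency.
    """
    smallest_width = min(upper - lower for lower, upper, _ in classes)
    heights = []
    for lower, upper, frequency in classes:
        ratio = (upper - lower) / smallest_width
        heights.append(frequency / ratio)
    return heights


# Hypothetical distribution with class widths 10, 10, 20 and 30.
distribution = [(0, 10, 5), (10, 20, 8), (20, 40, 12), (40, 70, 9)]
print(adjusted_heights(distribution))
```

For this sample, the class 20 to 40 is twice as wide as the smallest class, so its frequency of 12 gives a height of 6; the class 40 to 70 is three times as wide, so its frequency of 9 gives a height of 3. When all class intervals are equal, every ratio is 1 and the heights are simply the frequencies.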
(c) Frequency Polygon
To construct a frequency polygon from the given data, a histogram is constructed first. Then, the midpoints of the upper sides of each rectangle are found and joined sequentially. The first midpoint and the last midpoint are joined to the base line to form a closed polygon.
Note: If the midpoints of the upper sides of the histogram rectangles are joined in sequence with a ruler, and the first and last midpoints are joined to the base line in the same way, the result is a frequency polygon.
If the points are joined freehand with a smooth line instead, the result is a frequency curve. A frequency polygon has vertices, while a frequency curve has no sharp corners.
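The points to be joined for a frequency polygon can be sketched in Python. Two assumptions are made here that the text does not state: the class intervals are equal, and the polygon is closed on the base line at the midpoints of an imaginary empty class on each side, which is one common convention. The function name polygon_vertices and the sample data are hypothetical.

```python
def polygon_vertices(classes):
    """Return the (x, y) points of a frequency polygon.

    classes is a list of (lower limit, upper limit, frequency) with
    equal class widths (assumed). Each class contributes the midpoint
    of the top of its rectangle.
    """
    width = classes[0][1] - classes[0][0]
    points = [((lower + upper) / 2, frequency) for lower, upper, frequency in classes]
    # Assumed convention: close the polygon at the midpoints of an empty
    # class before the first class and after the last class.
    start = (points[0][0] - width, 0)
    end = (points[-1][0] + width, 0)
    return [start] + points + [end]


# Hypothetical distribution, for illustration only.
marks = [(10, 20, 4), (20, 30, 7), (30, 40, 3)]
for x, y in polygon_vertices(marks):
    print(x, y)
```

Joining these points with straight segments gives the polygon; drawing a smooth freehand line through the same points gives the frequency curve.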
(d) Cumulative Frequency Curve/Ogive
A line graph constructed based on a cumulative frequency distribution is called a cumulative frequency curve or ogive. There are two methods for constructing a cumulative frequency curve: the "less than" method and the "more than" method.
According to the "less than" method, the upper limit of each class interval is taken on the X-axis, and the corresponding cumulative frequency, obtained by adding the frequencies from the first class onward, is taken on the Y-axis. The points are plotted on the graph paper and joined sequentially.
Similarly, according to the "more than" method, the lower limit of each class interval is taken on the X-axis, and the corresponding cumulative frequency, obtained by adding the frequencies from the last class backward, is taken on the Y-axis. The points are plotted on the graph paper and joined sequentially.
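The points plotted under the two methods can be sketched in Python. The function names less_than_points and more_than_points and the sample distribution are hypothetical, used only for illustration.

```python
def less_than_points(classes):
    """Return (upper limit, cumulative frequency) for the "less than" ogive."""
    points = []
    running = 0
    for lower, upper, frequency in classes:
        running += frequency  # number of observations below this upper limit
        points.append((upper, running))
    return points


def more_than_points(classes):
    """Return (lower limit, cumulative frequency) for the "more than" ogive."""
    remaining = sum(frequency for _, _, frequency in classes)
    points = []
    for lower, upper, frequency in classes:
        points.append((lower, remaining))  # observations at or above this lower limit
        remaining -= frequency
    return points


# Hypothetical distribution, for illustration only.
wages = [(0, 10, 5), (10, 20, 8), (20, 30, 7)]
print(less_than_points(wages))
print(more_than_points(wages))
```

The "less than" points rise from left to right and end at the total frequency, while the "more than" points start at the total frequency and fall.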