Thursday, September 5, 2019
Measures of Central Tendency
The single value that reflects the nature and characteristics of an entire data set is called its central value. Central tendency refers to the middle point of a given distribution, and measures of central tendency are otherwise called 'measures of location'. The nature of this value is such that it always lies between the highest and the lowest value of the series. In other words, it lies at the centre or middle of the series.

CHARACTERISTICS OF A GOOD AVERAGE:
Yule and Kendall have pointed out some basic characteristics which an average should satisfy to be called a good average. They are:
i. It should be easy to calculate.
ii. It should be rigidly defined. That is, the series whose average is calculated should have only one interpretation. A single interpretation avoids personal prejudice or bias.
iii. It should be representative of the entire series. In other words, the value should lie between the upper and lower limits of the data.
iv. It should be capable of further algebraic treatment. In other words, an ideal average is one which can be used for further statistical calculations.
v. It should not be affected by the extreme values of the series.

DEFINITIONS:
Different experts have defined the concept of average differently. Gupta (2008) quotes Lawrence J. Kaplan's definition: 'One of the most widely used set of summary figures is known as measures of location, which are often referred to as averages, measures of central tendency or central location. The purpose of computing an average value for a set of observations is to obtain a single value which is representative of all the items and which the mind can grasp simply and quickly. The single value is the point of location around which the individual items cluster.' This opinion clearly states the basic purpose of computing an average.
Similarly, Croxton and Cowden define the concept as: 'An average is a single value within the range of the data that is used to represent all of the values in the series. Since the average is somewhere within the range of data, it is sometimes called a measure of central value.'

TYPES OF AVERAGES:
The following five are the frequently used types of average, or measures of central tendency:
i. Arithmetic mean
ii. Weighted arithmetic mean
iii. Median
iv. Mode
v. Geometric mean and harmonic mean
All five types are discussed below in detail.

THE ARITHMETIC MEAN:
The arithmetic mean is the simplest and most frequently used technique of computing central tendency. The average is also called the mean. It is a single number representing a whole data set. It can be computed in several ways. Most commonly it is computed by dividing the total of the values by the number of observations.

Let 'n' be the number of items in a case. The individual items in a list can be written as $x_1, x_2, x_3, \dots, x_n$. Here 'x1' is one value, 'x2' is another value in the series, and the values extend up to a particular limit represented by 'xn'. The dots express that there are some values between the two extremes which are omitted from the list. Some people write the same list as $x_i$, which can be read as 'x-sub-i, as i runs from 1 up to n'.

When the number of values in the list is large, writing out the mean in full requires a lot of space. Thus the summation notation is used to describe the entire relationship. The list above can be summed with the help of the notation $\sum_{i=1}^{n} x_i$, representing the sum of the 'x' values, using the index 'i' to enumerate from the starting value i = 1 to the ending value i = n. Thus we have $\sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + \dots + x_n$, and the average can be represented as $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$. The symbol 'i' is nothing but a running index.
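The definition of the arithmetic mean as the total of the values divided by the number of observations can be sketched in a few lines of Python. This is only an illustration: the function name `arithmetic_mean` and the sample values are assumptions made for the example, not data from this article.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their count."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty series is undefined")
    return sum(values) / n


# Hypothetical observations x1..x4 (not taken from the article).
observations = [10, 20, 30, 40]
print(arithmetic_mean(observations))  # 25.0
```

The built-in `sum` plays the role of the summation sign, and `len` gives n.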
Readers should not be confused by the notation $\sum_{i=1}^{n} x_i$; they can equally use $\sum x_i$ or $\sum x$ or any other similar notation with the same meaning. The mean of a series can be calculated in a number of ways. Following are some basic ways that are commonly used in research related to management and social sciences, particularly by beginners.

However, readers should not confuse the sample mean with the population mean. A sample drawn from a population has 'n' observations, and the mean of the sample is denoted by '$\bar{x}$'. When one measures the population mean, i.e., the mean over all the units of a study, the mean is represented by the symbol 'µ', the Greek letter 'mu', which is pronounced 'mew'. Below we discuss the sample mean.

Type-1: In case of individual observations:
a. Direct method- The mean or average can be calculated directly in the following way.
Step-1: Add all the observations of the given series. The observations are $x_1, x_2, x_3, \dots, x_n$.
Step-2: Count how many observations there are in the series (n).
Step-3: Divide the total by the number of observations. Thus the average or mean, denoted as '$\bar{x}$' and read as 'x bar', is derived as: $\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{\sum x}{n}$.
Thus it can be said that the average mark of the final contestants in the quiz competition is 67.6 marks, which is roughly 70 marks when rounded to the nearest ten.

b. Short-cut method- The average or mean can also be calculated by using the short-cut method. This method is applicable when a series has a large number of observations. In other words, it is generally used to reduce calculations. The steps of calculating the mean by this method are as follows:
i. The researcher has to assume any one value from the series. This value is called the assumed value. Let this value be denoted here as 'P'.
ii. Subtract the assumed value from each observation. That is, find the deviation of each individual observation from the assumed value.
Let this difference be denoted as 'B'. Hence $B_i = x_i - P$, where $i = 1, 2, 3, \dots, n$.
iii. Add all the differences to get the sum of B ($\sum B$), and count the number of observations 'n'.
iv. Put the values in the following formula to get the mean: $\bar{x} = P + \frac{\sum B}{n}$.

Type-2: In case of discrete observations or series of data:
Discrete series are those whose values can be identified and isolated. In such a case the variate takes whole-number values, but is presented in the form of a frequency distribution. The data set in Type-1 above is called ungrouped data, and the computations for such data are not difficult. If the data set has frequencies, it is called grouped data.

a. Direct method: Following are the steps of calculating the mean by the direct method.
i. In the first step, the value in each row (X) is multiplied by its respective frequency (f).
ii. Calculate the sum of the frequencies (column-2 in our example) at the end of the column, denoted as $\sum f$.
iii. Calculate the sum of the X*f values at the end of the column (column-3 in our example below), denoted as $\sum fX$.
iv. The mean ($\bar{x}$) can be calculated by using the formula $\bar{x} = \frac{\sum fX}{\sum f}$.

b. Short-cut method: The arithmetic mean can also be calculated by using the short-cut method, or assumed mean method. This method is generally used by researchers to save time and avoid complex calculations. Following are the steps of calculating the mean by this method.
i. The first step is to assume a value from the 'X' values of the series (denoted as A = assumed value).
ii. In another column, calculate the deviation (denoted as D) of each 'X' from the assumed value (A), i.e., D = X - A.
iii. Multiply each D by f, i.e., find Df.
iv. Calculate the sums $\sum f$ and $\sum fD$ at the end of the respective columns.
v. The mean can be calculated by using the formula $\bar{x} = A + \frac{\sum fD}{\sum f}$.

Type-3: In case of continuous observations or series of data:
There is another type of frequency distribution, which consists of data that are grouped by classes.
In such a case each observation falls somewhere in one of the classes. Calculation of the arithmetic mean for data grouped in classes is somewhat different from that for ungrouped data. To find the arithmetic mean of a continuous series, one has to calculate the midpoint of each class interval. Where needed, the midpoints are rounded so that they come out in whole units (for example, whole cents). The mean of a continuous series can be calculated in two ways, as described below.

a. Direct method: In this method, the mean is calculated by the following steps.
i. The first step is to calculate the midpoint of each class interval. The midpoint is denoted by 'm' and can be calculated as $m = \frac{\text{lower limit} + \text{upper limit}}{2}$.
ii. Multiply the midpoint of each class interval (m) by its respective frequency (f), i.e., find mf.
iii. Calculate the sums $\sum f$ and $\sum fm$ at the end of the respective columns.
iv. The mean can be calculated by using the formula $\bar{x} = \frac{\sum fm}{\sum f}$.

b. Short-cut method: The mean can also be calculated by using the short-cut method. Following are the steps to calculate the mean by this method.
i. The first step is to calculate the midpoint of each class interval. The midpoint is denoted by 'm' and can be calculated as $m = \frac{\text{lower limit} + \text{upper limit}}{2}$.
ii. Assume a value from the 'm' values of the series (denoted as A = assumed value).
iii. In another column, calculate the deviation (denoted as D) of each 'm' from the assumed value (A), i.e., D = m - A.
iv. Multiply each D by f, i.e., find Df.
v. Calculate the sums $\sum f$ and $\sum fD$ at the end of the respective columns.
vi. Put the values in the following formula to get the mean of the series: $\bar{x} = A + \frac{\sum fD}{\sum f}$.

THE WEIGHTED ARITHMETIC MEAN:
In real-life situations in management studies and social sciences, some items need more importance than the other items of the series. The importance assigned to different items, expressed as numerical values according to their priority, is called weights. The simple arithmetic mean, on the other hand, gives equal weight or importance to each observation of the series.
Where items differ in importance, the weighted mean acts as the most important tool for studying the behaviour of the entire set under study. Here the weighted mean is the appropriate measure of central tendency for getting a correct and accurate result. An important problem that arises while using the weighted mean is the selection of weights. Weights may be either actual or arbitrary, i.e., estimated. The researcher will not face any difficulty if actual weights are available for the set of data. If actual weights are not available, it is advisable to assign arbitrary or imaginary weights.

Following are the steps of calculating the weighted mean:
i. In the first step, the value in each row (X) is multiplied by its respective weight (W).
ii. Calculate the sum of the weights (column-2 in our example) at the end of the column, denoted as $\sum W$.
iii. Calculate the sum of the X*W values at the end of the column (column-3 in our example below), denoted as $\sum WX$.
iv. The weighted mean ($\bar{x}_w$) can be calculated by using the formula $\bar{x}_w = \frac{\sum WX}{\sum W}$.

Advantages of Arithmetic mean: Following are some advantages of the arithmetic mean.
i. It is a familiar concept to most people. It is unique, because each data set has only one mean.
ii. It is very easy to compute and requires few calculations. Every data set has a mean, so it can always be calculated as a measure.
iii. The mean represents the entire data set by a single value. Thus one can easily interpret the characteristics of a data set.
iv. An average can be calculated for any type of series.

Disadvantages of Arithmetic mean: The disadvantages are as follows.
i. One of the greatest disadvantages of the average is that it is strongly affected by extreme values. For example, consider Sachin Tendulkar's scores in his last three matches. Let them be 100 in the first match, 2 in the second match and 10 in the third match. The average score of these three matches will be (100 + 2 + 10)/3 ≈ 37.3.
Thus it implies that Tendulkar's average score is about 37, which does not resemble any of his three actual scores and hence leads to a wrong conclusion.
ii. It is not possible to compute the mean for a data set that has open-ended classes at either the high or the low end of the scale.
iii. The arithmetic average sometimes gives a value which cannot be found in the data series from which it is calculated.
iv. It is unrealistic.
v. It cannot be located by inspection or by a graphic method of representing the data.

THE MEDIAN:
Another technique to measure the central tendency of a series of observations is the median. The median is the value which divides the entire series into two equal parts from the middle. In other words, it is the exact middle value of the series. Hence, fifty percent of the observations in the series lie above the median value and the other fifty percent lie below it. If the series has an odd number of observations, like 3, 5, 7, 9, 11, 13 etc., then the median will be equal to one of the actual values of the series. On the other hand, if the series has an even number of observations, then the median is calculated as the arithmetic mean of the two middle values of the series. The median as a technique of measuring central tendency is best used in cases where the problem is more qualitative or psychological in nature, such as health, intelligence, satisfaction etc.

Definitions: The concept of the median becomes clearer from the definitions below. Connor defined it as: 'The median is the value which divides the distribution into two equal parts, one part comprising all values greater, and the other values less than the median.' Croxton and Cowden defined it as: 'The median is that value which divides a series so that one half or more of the items are equal to or less than it and one half or more of the items are equal to or greater than it.'
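The odd and even cases described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `median` and the sample series are made up for the example and do not come from the article's own problems.

```python
def median(values):
    """Return the middle value of the ordered series.

    For an odd number of observations this is the single middle item;
    for an even number it is the arithmetic mean of the two middle items.
    """
    ordered = sorted(values)  # the series must be arranged in order first
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty series is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical series (not taken from the article).
print(median([7, 3, 11, 5, 9]))  # odd count: 7
print(median([4, 8, 6, 2]))      # even count: (4 + 6) / 2 = 5.0
```

Sorting a copy with `sorted` leaves the caller's list untouched, which is a design choice of this sketch rather than a requirement of the method.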
The median is computed differently for three kinds of series. All the cases are discussed separately below.
i. Computation of median in individual series
ii. Computation of median in discrete series
iii. Computation of median in continuous series

Computation of Median in Individual Series: Following are the steps to calculate the median in an individual series. First, and most importantly, the data should be arranged in ascending (increasing) or descending (decreasing) order. Then the median is the $\frac{N+1}{2}$-th value or item of the series, where N = number of observations in the series. When N is an odd number (like 5, 7, 9, 11, 13 etc.), the median is one of the items of the series. When N is an even number, the position given by the formula falls between two items, and the median is the arithmetic mean of these two middle values. The following problem can make the concept clear.

Computation of Median in Discrete Series: Discrete series are those where the values of the data set are assigned frequencies or repetitions. Following are the steps of computing the median when the series is discrete. First, and most importantly, the data should be arranged in ascending (increasing) or descending (decreasing) order. Next, calculate the cumulative frequencies in the third column of the table. Then locate the $\frac{N+1}{2}$-th item in the cumulative frequencies of the series, where N is the sum of the frequencies; the value against it is the median.

Computation of Median in Continuous Series: Continuous series are series where the data are given in class intervals. Each class has an upper limit and a lower limit. In such cases the computation of the median is a little different from that of the two cases discussed above. Following are the steps to get the median in a continuous series of data. First, and most importantly, the data should be arranged in ascending (increasing) or descending (decreasing) order.
Next, calculate the cumulative frequencies in the third column of the table. Then locate the $\frac{N}{2}$-th value or item in the cumulative frequencies column of the series. From the cumulative frequencies, one can find the class in which this value lies. This class is called the median class, and one can read off its lower and upper limits. The following formula can be used to calculate the median: $\text{Median} = L + \frac{\frac{N}{2} - c.f.}{f} \times i$, where L is the lower limit of the median class, c.f. is the cumulative frequency of the class preceding the median class, f is the frequency of the median class and i is the width of the class interval.

In the example, we have to get the median class first. The median class contains the N/2-th value, i.e., 70/2 = 35. The value 35 lies in the third row of the table, against the class 30-40. Thus 30-40 is the median class, and the median value lies in this class. After getting the median class, we apply the formula above to get the median value.

Advantages of Median: The median as a measure of central tendency has the following advantages of its own.
i. It is very simple and can be easily understood.
ii. It is very easy to calculate and interpret.
iii. It includes all the observations in its calculation.
iv. Unlike the arithmetic mean, the median is not affected by the extreme values of the observations.
v. It can be used in further analysis.
vi. It can even be calculated for open-ended distributions.

Disadvantages of Median: The median as a means of calculating central tendency is also not free from drawbacks. Following are some important drawbacks that are levelled against the median.
i. The median is not as widely used a measure of central tendency as the arithmetic mean and the mode.
ii. It is not capable of further algebraic treatment.

THE MODE:
The mode is defined as the value which occurs most often in the series, or in other words the value having the highest frequency. It is, hence, the value which has the maximum concentration around it. Like the median, the mode is more useful in qualitative data analysis.
It can be used in problems generally having discrete series of data and, in particular, problems involving the expression of psychological determinants.

Definitions: The concept of the mode becomes clearer from the definitions below. Croxton and Cowden defined it as: 'The mode of a distribution is the value at the point around which the items tend to be most heavily concentrated. It may be regarded as the most typical of a series of values.' Similarly, in the words of Prof. Kenny: 'The value of the variable which occurs most frequently in a distribution is called the mode.'

The mode is computed differently for three kinds of series. All the cases are discussed separately below.
i. Computation of mode in individual series
ii. Computation of mode in discrete series
iii. Computation of mode in continuous series

Computation of Mode in Individual Series: Calculation of the mode in an individual series is very easy. The data are arranged in sequential order, and the value which occurs the maximum number of times in the series is the mode. The following example will make the concept clear.

Computation of Mode in Discrete Series: Discrete series are those where the values of the data set are assigned frequencies or repetitions. Hence, directly, the mode is the value which has the maximum frequency. For accuracy in calculation, there is a method called the grouping method which is frequently used for calculating the mode. Following is an illustration of calculating the mode of a series by using the grouping method. Consider the following data set and calculate the mode by using the grouping method. The calculation is carried out in the following steps:
Step-1: Sum of two frequencies including the first one, i.e., 1+2=3, then 4+3=7, then 2+1=3 etc.
Step-2: Sum of two frequencies excluding the first one, i.e., 2+4=6, then 3+2=5, then 1+2=3 etc.
Step-3: Sum of three frequencies including the first one, i.e., 1+2+4=7, then 3+2+1=6 etc.
Step-4: Sum of three frequencies excluding the first one, i.e., 2+4+3=9, then 2+1+2=5 etc.
Step-5: Sum of three frequencies excluding the first and second, i.e., 4+3+2=9, then 1+2+1=4.

Computation of Mode in Continuous Series: As already discussed, continuous series are series where the data are given in class intervals. Each class has an upper limit and a lower limit. In such cases the computation of the mode is a little different from that of the two cases discussed above. Following are the steps to get the mode in a continuous series of data. First, select the modal class, which is the class having the highest frequency. Then the mode can be calculated by using the following formula: $\text{Mode} = L + \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \times i$, where L is the lower limit of the modal class, $f_1$ is the frequency of the modal class, $f_0$ and $f_2$ are the frequencies of the classes preceding and following the modal class, and i is the width of the class interval.

Advantages of Mode: Following are some important advantages of the mode as a measure of central tendency.
i. It is easy to calculate and easy to understand.
ii. It is not affected by extreme values.
iii. It is easy to locate, and in some cases we can estimate the mode by mere inspection.

Disadvantages of Mode: Following are some important disadvantages of the mode.
i. It is not suitable for further mathematical treatment.
ii. It may lead to a wrong conclusion.
iii. Some critics point out that the mode is influenced by the length of the class interval.

THE GEOMETRIC MEAN:
The geometric mean, as another measure of central tendency, is very useful in social science and business related problems. It is the average which is most suitable when large weights have to be assigned to small values of observations and small weights to large values of observations. The geometric mean best suits problems where a particular situation changes over time in percentage terms. Hence it is basically used to find the average percent increase or decrease in sales, production, population etc. It is also considered to be the best average in the construction of index numbers.
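The use of the geometric mean for an average percentage change can be sketched as below. The geometric mean of N positive values is the N-th root of their product; this sketch computes it through logarithms, which is an implementation choice made here for numerical convenience. The function name `geometric_mean` and the growth figures are hypothetical and not taken from the article.

```python
import math


def geometric_mean(values):
    """Return the N-th root of the product of N positive values."""
    if not values:
        raise ValueError("the geometric mean of an empty series is undefined")
    if any(v <= 0 for v in values):
        raise ValueError("the geometric mean requires strictly positive values")
    # The mean of the logs, exponentiated, equals the N-th root of the product.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical yearly sales changes of +10%, +20% and -10%,
# expressed as growth factors (assumed data, for illustration only).
growth_factors = [1.10, 1.20, 0.90]
average_factor = geometric_mean(growth_factors)
print(f"average change per year: {(average_factor - 1) * 100:.2f}%")
```

Averaging the growth factors, rather than the raw percentages, means that applying the average factor once per year reproduces the same overall change as the original sequence.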
The geometric mean is defined as the Nth root of the product of the N observations of a given series of data. For example, if a series has only two observations, then N is two and we take the square root of the product of the observations. Similarly, when a series has three observations we take the cube root, and so on. The geometric mean can be calculated separately for two kinds of data. Both are discussed below.

When the data is ungrouped: In the case of an ungrouped series of observations, the G.M. can be calculated by using the following formula: $GM = \sqrt[N]{X_1 \times X_2 \times X_3 \times \dots \times X_N}$, where $X_1, X_2, X_3, \dots, X_N$ are the various observations of the series and N is the number of observations. But it is very difficult to calculate the G.M. directly from the above formula, so the formula needs to be simplified. To simplify it, logarithms are taken on both sides: $\log GM = \frac{\sum \log X}{N}$, so that $GM = \text{antilog}\left(\frac{\sum \log X}{N}\right)$. To calculate the G.M. of ungrouped data, the following steps are adopted.
i. Take the log of the individual observations, i.e., calculate log X.
ii. Sum all the log X values, i.e., calculate $\sum \log X$.
iii. Then use the above formula to calculate the G.M. of the series.

When the data is grouped: Calculation of the geometric mean for grouped data is a little different from the calculation of the G.M. for an ungrouped series. To calculate the G.M. of grouped data, the following steps are adopted.
i. Take the midpoint of each class of the continuous series.
ii. Take the log of the midpoints, i.e., calculate log X, which can also be denoted as log m.
iii. Multiply each log value by its frequency and sum the results, i.e., calculate $\sum f \log X$ or $\sum f \log m$.
iv. Then use the following formula to calculate the G.M. of the series: $GM = \text{antilog}\left(\frac{\sum f \log m}{N}\right)$, where $N = \sum f$.

Advantages of G.M.: Following are some advantages of G.M.
i. One of the greatest advantages of G.M.
is that it allows further algebraic treatment, i.e., a combined G.M. can be calculated when the G.M. of two or more series is available along with their corresponding numbers of observations.
ii. It is a very useful method of averaging when the series of observations consists of rates of growth, i.e., increases or decreases over a period of time.
iii. Since it is useful in averaging ratios and percentages, it is more useful in social science and business related problems.

Disadvantages of G.M.: The G.M. as a technique of calculating the central value is also not free from defects. Following are some disadvantages of G.M.
i. It is difficult to calculate the values of log and antilog and hence, compared to other measures of central tendency, the G.M. is difficult to compute.
ii. The greatest disadvantage of the G.M. is that it cannot be used when the series contains negative observations or zero values.

THE HARMONIC MEAN:
The last technique of getting the central tendency of a series of data is the harmonic mean (H.M.). The harmonic mean is less easily understood than the other measures of central tendency. It is the reciprocal of the arithmetic mean of the reciprocals of the individual observations. The H.M. is very useful where the nature of the data is such that it expresses the average rate of some event, for example the average rate of increase of sales or profits, or the average speed at which a train or bus completes a journey. Following is the general formula to calculate the H.M.: $HM = \frac{1}{\frac{1}{N} \sum \frac{1}{X}}$.

When the data is ungrouped: When the observations of the series are ungrouped, the H.M. can be calculated as: $HM = \frac{N}{\sum \frac{1}{X}}$. The steps for calculating the H.M. of ungrouped data by using this formula are very simple. One has to find the values of 1/X and then the sum of 1/X.
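The ungrouped calculation, N divided by the sum of the reciprocals, can be sketched as follows. The function name `harmonic_mean` and the speeds are assumptions made for this illustration only.

```python
def harmonic_mean(values):
    """Return N divided by the sum of the reciprocals 1/X."""
    if not values:
        raise ValueError("the harmonic mean of an empty series is undefined")
    if any(v <= 0 for v in values):
        raise ValueError("the harmonic mean requires strictly positive values")
    return len(values) / sum(1 / v for v in values)


# Hypothetical journey: the same distance covered once at 40 km/h
# and once at 60 km/h (assumed figures, not from the article).
speeds = [40, 60]
print(round(harmonic_mean(speeds), 2))  # 48.0
```

For equal distances the average speed is 48 km/h rather than the arithmetic mean of 50 km/h, because more time is spent travelling at the slower speed.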
When the data is grouped: In the case of grouped data, the H.M. is calculated by the following steps.
i. Take the midpoint of each class of the continuous series.
ii. Calculate 1/X, which can also be denoted as 1/m.
iii. Multiply each 1/X value by its frequency and sum the results, i.e., calculate $\sum \frac{f}{m}$.
iv. Then use the following formula to calculate the H.M. of the series: $HM = \frac{N}{\sum \frac{f}{m}}$, where $N = \sum f$.

Advantages of H.M.: The harmonic mean as a measure of central tendency has the following advantages.
i. The harmonic mean considers each and every observation of the series.
ii. It is simple to compute when compared to the G.M.
iii. It is very useful for averaging rates.

Disadvantages of H.M.: Following are some disadvantages of H.M.
i. It is rarely used as a technique of measuring central tendency.
ii. It is not as easily understood as the other measures of central value, namely the mean, median and mode.
iii. Like the G.M., the H.M. cannot be used when the series contains negative observations or zero values.

CONCLUSION:
An average is a single value representing a group of values. Each type of average has its own advantages and disadvantages, and hence each has its own usefulness. Researchers are often confused about which average is the best among the five different techniques discussed above. The answer is simple: no single average can be considered the best for all types of data. However, experts point to two considerations that researchers must keep in mind while selecting a technique to determine the average. The first consideration is the nature of the data. If the data is heavily skewed, it is better to avoid the arithmetic mean; if the data has a gap around the middle value of the series, then the median should be avoided; and if the class intervals of the series are unequal, then the mode is to be avoided. The second consideration is the type of value required.
When a composite average of all absolute or relative values is needed, the arithmetic mean or geometric mean is to be selected; when the researcher needs a middle value of the series, the median may be the best choice; and when the most common value is needed, there is no alternative to the mode. Similarly, the harmonic mean is useful in averaging ratios and percentages.

SUMMARY:
1. Different experts have defined the concept of average differently.
2. The arithmetic mean is the simplest and most frequently used technique of computing central tendency. The average is also called the mean. It is a single number representing a whole data set.
3. A useful application of the arithmetic mean is in correcting wrongly entered data. For example, in a group of 10 students scoring an average of 60 marks, one paper was wrongly marked 70 instead of 65. The corrected total is 10 × 60 - 70 + 65 = 595, so the corrected mean is 595/10 = 59.5 marks.
4. When some items of a series are more important than others, the weighted mean acts as the most important tool for studying the behaviour of the entire set under study. Here the weighted mean is the appropriate measure of central tendency for getting a correct and accurate result.
5. The median is the value which divides the entire series into two equal parts from the middle.
6. The mode is defined as the value which occurs most often in the series, or in other words the value having the highest frequency. It is, hence, the value which has the maximum concentration around it.
7. The geometric mean is defined as the Nth root of the product of the N observations of a given series of data.
8. The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the individual observations.

QUESTIONS:
1. In a class containing 90 students the following heights (in inches) have been observed. Based on the data, calculate the mean, median and mode of the class.
2.
In a physical test camp meant for the selection of army soldiers, the following heights of the candidates have been observed. Find the mean, median and mode of the distribution.
3. From the distribution given below, calculate the mean and standard deviation of the series.
4. The following table gives the marks obtained in the Indian Economy paper by 90 students in a class. Calculate the mean, median and mode of the distribution.
5. The monthly profits of 180 shopkeepers selling different commodities on a city footpath are given below. Calculate the mean and median of the distribution.
6. The daily wages of 130 labourers working in a cotton mill in Ahmadabad city are given below. Calculate the mean, median and mode.
7. There is always controversy at the BCCI over the selection of batsmen between Rahul Dravid and V.V.S. Laxman. The runs scored in 10 test matches by both players are given below. Suggest who the better run getter is and who the more consistent player is.
8. Calculate the mean, median and mode of the following distribution.
9. What do you mean by a measure of central tendency? How far is it helpful to a decision-maker in the process of decision making?
10. Define measure of central tendency. What are the basic criteria of a good average?
11. What do you mean by a measure of central tendency? Compare and contrast the arithmetic mean, median and mode by pointing out their advantages and disadvantages.
12. The expenditure on purchase of snacks by a group of hosteller per week is
CHARACTERISTICS OF A GOOD AVERAGE: Yule and Kendall have pointed out some basic characteristics which an average should satisfy to call it as good average. They are: Average is the easiest method to calculate It should be rigidly defined. This says that, the series of whose average is calculated should have only one interpretation. One interpretation will avoid personal prejudice or bias. It should be representative of the entire series. In other wards, the value should lie between the upper and lower limit of the data. It should have capable of further algebraic treatment. In other wards, an ideal average is one which can be used for further statistical calculations. It should not be affected by the extreme values of the observation or series. DEFINITIONS: Different experts have defined differently to the concept of average. Gupta (2008) in his work has narrated Lawrence J. Kaplan definition as ââ¬Ëone of the most widely used set of summery figures is known as measures of location, which are often referred to as averages, measures of central tendency or central location. The purpose of computing an average value for a set of observation is to obtain a single value which is representative of all the items and which the mind can grasp simply and quickly. The single value is the point of location around which the individual items cluster. This opinion clearly narrates the basic purpose of computing an average. Similarly, Croxton and Cowden define the concept as ââ¬Ëan average is a single value within the range of the data that is used to represent all of the values in the series. Since the average is somewhere within the range of data, it is sometimes called a measure of central value. TYPES OF AVERAGES: Following five are frequently used types of an average or measure of central tendency. They are Arithmetic mean Weighted arithmetic mean Median Mode Geometric Mean and Harmonic Mean All the above five types are discussed below in detail. 
THE ARITHMETIC MEAN:
The arithmetic mean is the simplest and most frequently used measure of central tendency. The average is also called the mean. It is a single number representing a whole data set. It can be computed in several ways; most commonly, the total of the values is divided by the number of observations. Let 'n' be the number of items. The individual items in a list can be written as $x_1, x_2, x_3, \dots, x_n$. Here 'x1' is one value, 'x2' is another value in the series, and the list extends up to the last value 'xn'. The dots indicate that some values between the two extremes are omitted. Some people write the same list as $x_i,\ i = 1, 2, \dots, n$, which can be read as 'x-sub-i, as i runs from 1 up to n'. When the number of values in the list is large, writing out the whole sum takes a lot of space, so summation notation is used instead. The sum is written as $\sum_{i=1}^{n} x_i$, representing the sum of the 'x' values, using the index 'i' to enumerate from the starting value i = 1 to the ending value i = n. Thus we have $\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n$ and the average can be represented as $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. The symbol 'i' is nothing but a running index. Readers should not be confused by the notation $\sum_{i=1}^{n} x_i$; they can equally use $\sum_{i} x_i$ or $\sum x$ or any other similar notation with the same meaning.
The mean of a series can be calculated in a number of ways. The basic methods below are commonly used in research in management and the social sciences, particularly by beginners. However, readers should not confuse the sample mean with the population mean. A sample consists of 'n' observations drawn from a population, and the mean of the sample is denoted by '$\bar{x}$'.
When one measures the population mean, i.e., the mean over all the units of a study, the mean is instead represented by the symbol 'µ', the Greek letter 'mu'. Below we discuss the sample mean.

Type-1: In case of individual observations:
a. Direct method: The mean or average can be calculated directly in the following way.
Step-1: Add all the observations of the given series. The observations are $x_1, x_2, x_3, \dots, x_n$.
Step-2: Count how many observations there are in the series (n).
Step-3: Divide the sum by the count. The average or mean, denoted '$\bar{x}$' and read as 'x bar', is derived as $\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum x}{n}$.
Thus it can be said that the average mark of the final contestants in the quiz competition is 67.6 marks, or about 70 marks when rounded to the nearest ten.
b. Short-cut method: The average or mean can also be calculated by the short-cut method. This method is useful when a series has a large number of observations. In other words, it is generally used to reduce calculations. The steps are as follows:
i. The researcher assumes any one value from the series. This value is called the assumed value. Let it be denoted here as 'P'.
ii. Subtract the assumed value from each observation. Let this difference be denoted as 'B'. Hence $B_i = x_i - P$, where $i = 1, 2, 3, \dots, n$.
iii. Add all the differences to get $\sum B$ and count the number of observations 'n'.
iv. Put the values into the formula $\bar{x} = P + \frac{\sum B}{n}$ to get the mean.

Type-2: In case of discrete observations or series of data:
A discrete series is one whose values can be individually identified and isolated. In such a case the variate takes whole-number values, but the data are presented in the form of a frequency distribution. The data set in Type-1 above is called ungrouped data.
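For the ungrouped (Type-1) data just mentioned, the direct and short-cut methods can be sketched in a few lines of Python. This is only an illustration: the function names and the marks used are hypothetical and do not come from the original example, whose data table is not reproduced here.

```python
def mean_direct(values):
    """Direct method: sum of the observations divided by their count."""
    n = len(values)
    if n == 0:
        raise ValueError("mean of an empty series is undefined")
    return sum(values) / n


def mean_shortcut(values, assumed):
    """Short-cut method: assumed value P plus the average deviation from P."""
    n = len(values)
    if n == 0:
        raise ValueError("mean of an empty series is undefined")
    deviations = [x - assumed for x in values]  # B = x - P for each observation
    return assumed + sum(deviations) / n


# Hypothetical marks, chosen only for illustration.
marks = [60, 65, 70, 75, 80]
print(mean_direct(marks))        # 70.0
print(mean_shortcut(marks, 65))  # 70.0
```

Whatever value is assumed for P, the short-cut method gives the same mean as the direct method; a P close to the centre of the data simply keeps the deviations small.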
The computations for ungrouped data are not difficult. A data set that carries frequencies, by contrast, is called grouped data.
a. Direct method: The steps for calculating the mean by the direct method are as follows.
i. Multiply the value in each row (X) by its respective frequency (f).
ii. Calculate the sum of the frequencies (column-2 in our example) at the end of the column, denoted as $\sum f$.
iii. Calculate the sum of the X*f values at the end of the column (column-3 in our example), denoted as $\sum fX$.
iv. The mean ($\bar{x}$) can be calculated using the formula $\bar{x} = \frac{\sum fX}{\sum f}$.
b. Short-cut method: The arithmetic mean can also be calculated by the short-cut method or assumed mean method. Researchers generally use this method to save time and reduce the complexity of calculation. The steps are as follows.
i. Assume a value from the 'X' values of the series (denoted as A = assumed value).
ii. In another column, calculate the deviation (denoted as D) of each 'X' from the assumed value (A), i.e., D = X - A.
iii. Multiply each D by f, i.e., find Df.
iv. Calculate the sums $\sum f$ and $\sum fD$ at the end of the respective columns.
v. The mean can be calculated using the formula $\bar{x} = A + \frac{\sum fD}{\sum f}$.

Type-3: In case of continuous observations or series of data:
Another type of frequency distribution consists of data that are grouped into classes. In such a case each observation falls somewhere in one of the classes. Calculating the arithmetic mean of grouped data is somewhat different from calculating it for ungrouped data. To find the arithmetic mean of a continuous series, one has to calculate the midpoint of each class interval. So that the midpoints come out as convenient whole values, they may be rounded. The mean of a continuous series can be calculated in two ways, as given below:
a. Direct method: In this method, the mean can be calculated using the following steps:
i.
The first step is to calculate the midpoint of each class interval. The midpoint is denoted by 'm' and can be calculated as $m = \frac{\text{lower limit} + \text{upper limit}}{2}$.
ii. Multiply the midpoint of each class interval (m) by its respective frequency (f), i.e., find mf.
iii. Calculate the sums $\sum f$ and $\sum fm$ at the end of the respective columns.
iv. The mean can be calculated using the formula $\bar{x} = \frac{\sum fm}{\sum f}$.
b. Short-cut method: The mean can also be calculated by the short-cut method. The steps are as follows.
i. Calculate the midpoint of each class interval. The midpoint is denoted by 'm' and can be calculated as $m = \frac{\text{lower limit} + \text{upper limit}}{2}$.
ii. Assume a value from the 'm' values of the series (denoted as A = assumed value).
iii. In another column, calculate the deviation (denoted as D) of each 'm' from the assumed value (A), i.e., D = m - A.
iv. Multiply each D by f, i.e., find Df.
v. Calculate the sums $\sum f$ and $\sum fD$ at the end of the respective columns.
vi. Put the values into the formula $\bar{x} = A + \frac{\sum fD}{\sum f}$ to get the mean of the series.

THE WEIGHTED ARITHMETIC MEAN:
In real-life situations in management studies and the social sciences, some items of a series need more importance than others. The importance assigned to the different items, expressed as numerical values according to priority, is called the weights. The simple arithmetic mean, on the other hand, gives equal weight or importance to each observation of the series. Where the items differ in importance, the weighted mean is the most important tool for studying the behaviour of the entire data set, and it is the measure of central tendency that gives a correct and accurate result. The procedure for computing the mean of a weighted series is given below. An important problem that arises while using the weighted mean is the selection of weights. Weights may be either actual or arbitrary, i.e., estimated.
The researcher will not face any difficulty if actual weights are available for the data set. If actual weights are not available, it is advisable to assign arbitrary (estimated) weights. The steps for calculating the weighted mean are as follows:
i. Multiply the value in each row (X) by its respective weight (W).
ii. Calculate the sum of the weights (column-2 in our example) at the end of the column, denoted as $\sum W$.
iii. Calculate the sum of the X*W values at the end of the column (column-3 in our example), denoted as $\sum WX$.
iv. The weighted mean ($\bar{x}_w$) can be calculated using the formula $\bar{x}_w = \frac{\sum WX}{\sum W}$.

Advantages of Arithmetic mean: The advantages of the arithmetic mean are as follows.
i. The concept is familiar to most people. It is unique because each data set has only one mean.
ii. It is very easy to compute and requires few calculations. Since every numerical data set has a mean, it can always be calculated.
iii. The mean represents the entire data set by a single value. Thus the characteristics of a data set can be interpreted easily.
iv. An average can be calculated for any type of series.

Disadvantages of Arithmetic mean: The disadvantages are as follows.
i. One of the greatest disadvantages of the average is that it is strongly affected by extreme values. For example, consider Sachin Tendulkar's scores in his last three matches. Let them be 100 in the first match, 2 in the second match and 10 in the third match. The average score of these three matches will be (100 + 2 + 10)/3 = 112/3, or about 37. This implies that Tendulkar's average score is 37, a figure that is not close to any of the three actual scores and hence may lead to a wrong conclusion.
ii. It is not possible to compute the mean for a data set that has open-ended classes at either the high or the low end of the scale.
iii. The arithmetic average sometimes gives a value which cannot be found in the data series from which it is calculated.
iv. For the same reason, it can appear unrealistic.
v. It cannot be located by mere inspection or by a graphic method of representing and interpreting the data.
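The frequency-based formulas and the weighted mean above share one computation: a sum of products divided by a sum of weights. The following Python sketch illustrates this under stated assumptions; the function names, the class intervals, the frequencies and the scores are all hypothetical and are not taken from the original tables.

```python
def weighted_mean(values, weights):
    """Sum of W*X divided by sum of W; the weights may also be frequencies f."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("sum of weights must not be zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


def class_midpoints(classes):
    """Midpoint m = (lower limit + upper limit) / 2 for each class interval."""
    return [(lower + upper) / 2 for lower, upper in classes]


def shortcut_mean(values, freqs, assumed):
    """Assumed-mean method: A + sum(f*D) / sum(f), with D = X - A."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("sum of frequencies must not be zero")
    return assumed + sum(f * (x - assumed) for x, f in zip(values, freqs)) / total


# Hypothetical continuous series: class intervals with their frequencies.
classes = [(0, 10), (10, 20), (20, 30)]
freqs = [2, 5, 3]
mids = class_midpoints(classes)          # [5.0, 15.0, 25.0]

print(weighted_mean(mids, freqs))        # direct method: 16.0
print(shortcut_mean(mids, freqs, 15))    # short-cut method with A = 15: 16.0

# Hypothetical weighted series: scores with importance weights.
print(weighted_mean([60, 70, 80], [1, 2, 1]))  # 70.0
```

Treating frequencies as weights is a design choice of this sketch; it avoids writing one function for the discrete series, one for the continuous series and one for the weighted mean.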
THE MEDIAN:
Another technique for measuring the central tendency of a series of observations is the median. The median is the value that divides the ordered series into two equal parts. In other words, it is the exact middle value of the series. Hence, half of the observations in the series lie above the median and the other half lie below it. If the series has an odd number of observations (3, 5, 7, 9, 11, 13 etc.), the median will be equal to one of the values in the series. If the series has an even number of observations, the median is the arithmetic mean of the two middle values of the series. As a technique for measuring central tendency, the median is best used where the problem is more qualitative or psychological in nature, such as health, intelligence or satisfaction.
Definitions: The concept of the median becomes clearer from the definitions given below. Connor defined it as 'the median is the value which divides the distribution into two equal parts, one part comprising all values greater, and the other values less than the median.' Croxton and Cowden defined it as 'the median is that value which divides a series so that one half or more of the items are equal to or less than it and one half or more of the items are equal to or greater than it.'
The median can be computed for three kinds of series, each discussed separately below:
i. Computation of Median in Individual Series
ii. Computation of Median in Discrete Series
iii. Computation of Median in Continuous Series
Computation of Median in Individual Series: The steps for calculating the median in an individual series are as follows. The first and most important requirement is that the data be arranged in ascending (increasing) or descending (decreasing) order.
Then the median is the value of the $\left(\frac{N+1}{2}\right)$th item of the series, where N = number of observations in the series. When N is an odd number (like 5, 7, 9, 11, 13 etc.), the median is one of the items within the series. When N is an even number, the formula gives a position between two items, and the median is the arithmetic mean of the two middle values.

Computation of Median in Discrete Series: Discrete series are those where each value is assigned a frequency or number of repetitions. The steps for computing the median of a discrete series are as follows. The first and most important requirement is that the data be arranged in ascending (increasing) or descending (decreasing) order. In the third column of the table, calculate the cumulative frequencies. Then the median is the value against the $\left(\frac{N+1}{2}\right)$th item, located from the cumulative frequencies of the series, where $N = \sum f$.

Computation of Median in Continuous Series: Continuous series are series where the data are given in class intervals. Each class has an upper limit and a lower limit. In such cases the computation of the median is a little different from the two cases discussed above. The steps for getting the median of a continuous series are as follows. The first and most important requirement is that the data be arranged in ascending (increasing) or descending (decreasing) order. In the third column of the table, calculate the cumulative frequencies. Then the median class is located as the class containing the $\frac{N}{2}$th item in the cumulative frequency column. From the cumulative frequencies one can thus find in which class the median lies. This class is called the median class, and its lower and upper limits can be read off. The following formula can be used to calculate the median: $\text{Median} = L + \frac{\frac{N}{2} - c.f.}{f} \times i$, where L is the lower limit of the median class, c.f. is the cumulative frequency of the class preceding the median class, f is the frequency of the median class and i is the width of the class interval. We have to get the median class first.
For this, the median class is the class containing the N/2th item, i.e., the 70/2 = 35th item. The value 35 lies in the third row of the table, against the class 30-40. Thus 30-40 is the median class, and the median value lies in this class. After getting the median class, to get the median value we have to apply the formula $\text{Median} = L + \frac{\frac{N}{2} - c.f.}{f} \times i$, where L is the lower limit of the median class (here 30), c.f. is the cumulative frequency of the preceding class, f is the frequency of the median class and i is the class width.

Advantages of Median: The median as a measure of central tendency has the following advantages.
i. It is very simple and can be easily understood.
ii. It is very easy to calculate and interpret.
iii. All the observations are taken into account when the series is ordered to locate it.
iv. Unlike the arithmetic mean, the median is not affected by the extreme values of the observations.
v. It can be used in further analysis.
vi. It can even be calculated for open-ended distributions.

Disadvantages of Median: The median as a measure of central tendency is also not free from drawbacks. Some important drawbacks levelled against the median are as follows.
i. The median is not as widely used a measure of central tendency as the arithmetic mean or the mode.
ii. It is not capable of further algebraic treatment.

THE MODE:
The mode is defined as the value which occurs most often in the series, in other words the value having the highest frequency. It is, hence, the value which has the maximum concentration around it. Like the median, the mode is more useful in qualitative data analysis. It can be used in problems that have discrete series of data and, in particular, problems involving the expression of psychological determinants.
Definitions: The concept of the mode becomes clearer from the definitions given below. Croxton and Cowden defined it as 'the mode of a distribution is the value at the point around which the items tend to be most heavily concentrated. It may be regarded as the most typical of a series of values.' Similarly, in the words of Prof. Kenny, 'the value of the variable which occurs most frequently in a distribution is called the mode.'
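The positional averages described so far can be sketched in Python as follows. The original table for the median example is not reproduced here, so the frequencies below are hypothetical; they were chosen only so that N = 70 and the 35th item falls in the class 30-40, as in the text. The function names are likewise assumptions of this sketch.

```python
from collections import Counter


def median_individual(values):
    """Median of an individual series: the middle item of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty series is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the (N+1)/2-th item
    return (ordered[mid - 1] + ordered[mid]) / 2  # mean of the two middle items


def median_continuous(classes, freqs):
    """Median = L + (N/2 - c.f.) / f * i for classes given in ascending order."""
    total = sum(freqs)
    if total <= 0:
        raise ValueError("total frequency must be positive")
    half = total / 2
    cumulative = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f >= half:  # this is the median class
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("classes and freqs must have the same length")


def mode_individual(values):
    """Most frequent value; if several values tie, the first one seen is returned."""
    if not values:
        raise ValueError("mode of an empty series is undefined")
    return Counter(values).most_common(1)[0][0]


print(median_individual([7, 3, 9, 5, 11]))  # 7
print(median_individual([3, 5, 7, 9]))      # 6.0

# Hypothetical frequencies with N = 70; the 35th item lies in class 30-40.
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [8, 12, 20, 18, 12]
print(median_continuous(classes, freqs))    # 30 + (35 - 20) / 20 * 10 = 37.5

print(mode_individual([2, 3, 3, 5, 3, 7]))  # 3
```

The interpolation assumes that observations are spread evenly within the median class, which is the usual assumption behind the formula.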
The mode can be computed for three kinds of series, each discussed separately below:
i. Computation of Mode in Individual Series
ii. Computation of Mode in Discrete Series
iii. Computation of Mode in Continuous Series
Computation of Mode in Individual Series: Calculating the mode of an individual series is very easy. The data are arranged in sequential order, and the value which occurs the maximum number of times in the series is the mode. The following example will make the concept clear.
Computation of Mode in Discrete Series: Discrete series are those where each value is assigned a frequency or number of repetitions. Hence the mode is directly the value which has the maximum frequency. For accuracy in calculation, however, there is a method called the grouping method which is frequently used for calculating the mode. The following illustrates how to calculate the mode of a series by the grouping method. Consider the following data set and calculate the mode by the grouping method. The calculation is carried out in the following steps:
Step-1: Sum the frequencies in twos, including the first one, i.e., 1+2=3, then 4+3=7, then 2+1=3 etc.
Step-2: Sum the frequencies in twos, excluding the first one, i.e., 2+4=6, then 3+2=5, then 1+2=3 etc.
Step-3: Sum the frequencies in threes, including the first one, i.e., 1+2+4=7, then 3+2+1=6 etc.
Step-4: Sum the frequencies in threes, excluding the first one, i.e., 2+4+3=9, then 2+1+2=5 etc.
Step-5: Sum the frequencies in threes, excluding the first and the second, i.e., 4+3+2=9, then 1+2+1=4.
Computation of Mode in Continuous Series: As already discussed, continuous series are series where the data are given in class intervals. Each class has an upper limit and a lower limit. In such cases the computation of the mode is a little different from the two cases discussed above. The steps for getting the mode of a continuous series are as follows. Select the modal class.
The modal class is the class with the highest frequency. The mode can then be calculated using the formula $\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$, where L is the lower limit of the modal class, $f_1$ is the frequency of the modal class, $f_0$ and $f_2$ are the frequencies of the classes preceding and succeeding it, and i is the width of the class interval.

Advantages of Mode: Some important advantages of the mode as a measure of central tendency are as follows.
i. It is easy to calculate and easy to understand.
ii. It is not affected by extreme values.
iii. It is easy to locate, and in some cases the mode can be estimated by mere inspection.

Disadvantages of Mode: Some important disadvantages of the mode are as follows.
i. It is not suitable for further mathematical treatment.
ii. It may lead to a wrong conclusion.
iii. Some critics point out that the mode is influenced by the length of the class interval.

THE GEOMETRIC MEAN:
The geometric mean, as another measure of central tendency, is very useful in social science and business-related problems. It is the average most suitable when large weights have to be assigned to small values and small weights to large values. The geometric mean best suits problems where a particular quantity changes over time in percentage terms. Hence it is basically used to find the average percent increase or decrease in sales, production, population etc. It is also considered to be the best average for the construction of index numbers. The geometric mean is defined as the Nth root of the product of the N observations of a given series. For example, if a series has only two observations, then N is two and we take the square root of their product. Similarly, when the series has three observations we take the cube root, and so on. The geometric mean can be calculated separately for two kinds of data. Both are discussed below.
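The definition of the geometric mean as the Nth root of the product can be checked with a short Python sketch. The function name and the sample figures are hypothetical. The sketch works through logarithms, which is mathematically the same as taking the Nth root of the product and avoids very large intermediate products.

```python
import math


def geometric_mean(values):
    """Nth root of the product of N positive observations, computed via logs."""
    n = len(values)
    if n == 0:
        raise ValueError("geometric mean of an empty series is undefined")
    if any(x <= 0 for x in values):
        raise ValueError("all observations must be positive")
    log_sum = sum(math.log10(x) for x in values)  # sum of log X
    return 10 ** (log_sum / n)                    # antilog of (sum of log X) / N


# Two observations: the square root of the product 4 * 9 = 36, i.e. about 6.
print(geometric_mean([4, 9]))

# Hypothetical growth factors for two years (+10% and +20%).
# The result is roughly 1.149, i.e. an average growth of about 14.9% per year.
print(geometric_mean([1.10, 1.20]))
```

The printed values may differ from the exact roots in the last decimal places because of floating-point rounding.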
When the data is ungrouped: For an ungrouped series of observations, the G.M. can be calculated using the formula $GM = \sqrt[N]{X_1 \times X_2 \times X_3 \times \dots \times X_N}$, where $X_1, X_2, X_3, \dots, X_N$ are the observations of the series and N is the number of observations. It is very difficult to calculate the G.M. directly from this formula, so the formula needs to be simplified. To simplify it, logarithms are taken on both sides, which gives $\log GM = \frac{\sum \log X}{N}$, i.e., $GM = \text{antilog}\left(\frac{\sum \log X}{N}\right)$. To calculate the G.M. of ungrouped data, the following steps are adopted.
i. Take the log of the individual observations, i.e., calculate log X.
ii. Sum all the log X values, i.e., calculate $\sum \log X$.
iii. Then use the above formula to calculate the G.M. of the series.
When the data is grouped: Calculating the geometric mean of grouped data is a little different from calculating it for an ungrouped series. To calculate the G.M. of grouped data, the following steps are adopted.
i. Take the midpoint of each class of the continuous series.
ii. Take the log of the midpoints, i.e., calculate log X, which can also be denoted as log m.
iii. Multiply each log value by its frequency and sum, i.e., calculate $\sum f \log X$ or $\sum f \log m$.
iv. Then use the formula $GM = \text{antilog}\left(\frac{\sum f \log m}{\sum f}\right)$ to calculate the G.M. of the series.

Advantages of G.M.: Some advantages of the G.M. are as follows.
i. One of the greatest advantages of the G.M. is that it is capable of further algebraic treatment, i.e., a combined G.M. can be calculated when the G.M.s of two or more series are available along with their corresponding numbers of observations.
ii. It is a very useful average when the observations are rates of growth, i.e., increases or decreases over a period of time.
iii. Since it is useful in averaging ratios and percentages, it is particularly useful in social science and business-related problems.

Disadvantages of G.M.: The G.M. as a technique for calculating the central value is also not free from defects. Following are some disadvantages of G.M.
i.
It is difficult to calculate the values of logs and antilogs, and hence, compared to other measures of central tendency, the G.M. is very difficult to compute.
ii. The greatest disadvantage of the G.M. is that it cannot be used when the series contains negative observations or one or more zero values.

THE HARMONIC MEAN:
The last technique for getting the central tendency of a series of data is the harmonic mean (H.M.). Unlike the other measures of central tendency, the harmonic mean is not as readily understood. It is the reciprocal of the arithmetic mean of the reciprocals of the individual observations. The H.M. is very useful where the nature of the data is such that it expresses a rate, for example the average rate of increase of sales or profits, or the average speed at which a train or bus journey is completed. The general formula for the H.M. is $HM = \frac{N}{\frac{1}{X_1} + \frac{1}{X_2} + \dots + \frac{1}{X_N}}$.
When the data is ungrouped: When the observations of the series are ungrouped, the H.M. can be calculated as $HM = \frac{N}{\sum \frac{1}{X}}$. Calculating the H.M. of ungrouped data with this formula is very simple: one has to find the values of 1/X and then the sum of 1/X.
When the data is grouped: For grouped data, the H.M. is calculated as follows.
i. Take the midpoint of each class of the continuous series.
ii. Calculate 1/X, which can also be denoted as 1/m.
iii. Multiply each reciprocal by its frequency and sum, i.e., calculate $\sum \frac{f}{m}$.
iv. Then use the formula $HM = \frac{\sum f}{\sum \frac{f}{m}}$ to calculate the H.M. of the series.

Advantages of H.M.: The harmonic mean as a measure of central tendency has the following advantages.
i. The harmonic mean considers each and every observation of the series.
ii. It is simple to compute when compared to the G.M.
iii. It is very useful for averaging rates.

Disadvantages of H.M.: Following are some disadvantages of H.M.
i. It is rarely used as a technique for measuring central tendency.
ii.
Its definition is not as easily understood as that of the other measures of central value (mean, median and mode).
iii. Like the G.M., the H.M. cannot be used when the series contains both negative and positive observations or one or more zero values.

CONCLUSION:
An average is a single value representing a group of values. Each type of average has its own advantages and disadvantages, and hence its own usefulness. Researchers are often confused about which average is the best among the five techniques discussed above. The answer is simple: no single average can be considered the best for all types of data. However, experts suggest two considerations that researchers must keep in mind while selecting a technique to determine the average. The first consideration is the nature of the data. If the data are highly skewed, it is better to avoid the arithmetic mean; if the data have a gap around the middle value of the series, the median should be avoided; and if the series has unequal class intervals, the mode is to be avoided. The second consideration is the type of value required. When a composite average of all absolute or relative values is needed, the arithmetic mean or the geometric mean is to be selected; when the researcher needs the middle value of the series, the median may be the best choice; and when the most common value is needed, there is no alternative to the mode. Similarly, the harmonic mean is useful in averaging rates.

SUMMARY:
1. Different experts have defined the concept of average differently.
2. The arithmetic mean is the simplest and most frequently used measure of central tendency. The average is also called the mean. It is a single number representing a whole data set.
3.
A useful application of the arithmetic mean is the correction of wrongly entered data. For example, in a group of 10 students scoring an average of 60 marks, one paper was wrongly marked 70 instead of 65. The corrected total is 10 × 60 - 70 + 65 = 595, so the corrected mean is 595/10 = 59.5 marks.
4. Where the items of a series differ in importance, the weighted mean is the most important tool for studying the behaviour of the entire data set, and it is the measure of central tendency that gives a correct and accurate result.
5. The median is the value that divides the entire ordered series into two equal parts.
6. The mode is defined as the value which occurs most often in the series, in other words the value having the highest frequency. It is, hence, the value which has the maximum concentration around it.
7. The geometric mean is defined as the Nth root of the product of the N observations of a given series.
8. The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the individual observations.

QUESTIONS:
1. In a class containing 90 students the following heights (in inches) have been observed. Based on the data, calculate the mean, median and mode of the class.
2. In a physical test camp meant for the selection of army soldiers, the following heights of the candidates have been observed. Find the mean, median and mode of the distribution.
3. From the distribution given below, calculate the mean and standard deviation of the series.
4. The following table gives the marks obtained in the Indian Economy paper by 90 students in a class. Calculate the mean, median and mode of the distribution.
5. The monthly profits of 180 shopkeepers selling different commodities on a city footpath are given below. Calculate the mean and median of the distribution.
6. The daily wages of 130 labourers working in a cotton mill in Ahmadabad city are given below. Calculate the mean, median and mode.
7.
There is always a controversy in the BCCI over the selection of batsmen between Rahul Dravid and V.V.S. Laxman. The runs scored in 10 test matches by both players are given below. Suggest who is the better run getter and who is the more consistent player.
8. Calculate the mean, median and mode of the following distribution.
9. What do you mean by a measure of central tendency? How far is it helpful to a decision-maker in the process of decision making?
10. Define measure of central tendency. What are the basic criteria of a good average?
11. What do you mean by a measure of central tendency? Compare and contrast arithmetic mean, median and mode by pointing out their advantages and disadvantages.
12. The expenditure on purchase of snacks by a group of hostellers per week is