The mean and median are both measures of central tendency, but how do they compare? The mean, also known as the average, is sensitive to extreme values, while the median, the middle value, is more robust. This guide on compare.edu.vn explores their differences, their applications, and when to use each, with particular attention to outliers and skewness.
1. What is the Mean and Median?
The mean and median are both fundamental concepts in statistics, serving as measures of central tendency. Understanding how the mean and median compare is crucial for interpreting data accurately. Here’s a breakdown of each:
1.1. Mean: The Average Value
The mean, often referred to as the average, is calculated by summing all values in a dataset and dividing by the number of values. This measure is widely used due to its simplicity and intuitive interpretation.
Formula:
Mean (μ) = (Σxᵢ) / n
Where:
- Σxᵢ represents the sum of all values in the dataset.
- n is the number of values in the dataset.
Example:
Consider the dataset: 2, 4, 6, 8, 10
Mean = (2 + 4 + 6 + 8 + 10) / 5 = 30 / 5 = 6
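The same calculation can be sketched in Python. This is a minimal illustration, assuming the standard-library `statistics` module is available; the variable names are chosen here for illustration only.

```python
from statistics import mean

values = [2, 4, 6, 8, 10]

# Manual calculation: sum of the values divided by the number of values.
manual_mean = sum(values) / len(values)

# The standard library performs the same computation.
library_mean = mean(values)

print(manual_mean, library_mean)
```

Both approaches agree on this dataset; the manual version simply makes the formula explicit.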
1.2. Median: The Middle Value
The median is the middle value in a dataset that is sorted in ascending or descending order. It divides the dataset into two equal halves, with 50% of the values falling below and 50% above it.
Steps to find the median:
- Sort the dataset: Arrange the values in ascending or descending order.
- Identify the middle value:
- If the number of values (n) is odd, the median is the middle value.
- If the number of values (n) is even, the median is the average of the two middle values.
Examples:
- Odd number of values:
Dataset: 1, 3, 5, 7, 9
Sorted dataset: 1, 3, 5, 7, 9
Median = 5 (the middle value)
- Even number of values:
Dataset: 1, 3, 5, 7, 9, 11
Sorted dataset: 1, 3, 5, 7, 9, 11
Median = (5 + 7) / 2 = 6 (average of the two middle values)
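The two steps (sort, then pick the middle) can be sketched as a small Python function. The name `median_by_steps` is hypothetical and used only for this illustration; in practice the standard-library `statistics.median` does the same job.

```python
from statistics import median


def median_by_steps(values):
    """Return the median by sorting and picking the middle value(s)."""
    if not values:
        raise ValueError("median of an empty dataset is undefined")
    ordered = sorted(values)          # Step 1: sort the dataset
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # Odd n: the single middle value
        return ordered[mid]
    # Even n: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_result = median_by_steps([1, 3, 5, 7, 9])
even_result = median_by_steps([1, 3, 5, 7, 9, 11])
print(odd_result, even_result)
```

The results match the worked examples for the odd and even cases, and they agree with `statistics.median` on the same inputs.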
Understanding how the mean and median compare helps in choosing the appropriate measure for different datasets.
2. How Do the Mean and Median Compare: Key Differences
While both the mean and median are measures of central tendency, understanding how the mean and median compare reveals their distinct characteristics and sensitivities to data. Here are the key differences between the mean and median:
2.1. Sensitivity to Outliers
- Mean: Highly sensitive to outliers. Outliers are extreme values in a dataset that can significantly pull the mean towards them.
- Example: Consider the dataset: 1, 2, 3, 4, 100. The mean is (1 + 2 + 3 + 4 + 100) / 5 = 22. The outlier 100 drastically increases the mean, making it not representative of the typical values.
- Median: Robust to outliers. The median is not affected by extreme values because it only considers the middle value(s).
- Example: Using the same dataset: 1, 2, 3, 4, 100. The sorted dataset is 1, 2, 3, 4, 100. The median is 3, which remains unaffected by the outlier 100.
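To see this sensitivity in action, the sketch below keeps the values 1, 2, 3, 4 fixed and varies only the last value. The values 5 and 10,000 are extra illustrative choices, not taken from the example above.

```python
from statistics import mean, median

base = [1, 2, 3, 4]

# Append increasingly extreme values and record how each measure responds.
results = {}
for extreme in (5, 100, 10_000):
    data = base + [extreme]
    results[extreme] = (mean(data), median(data))
    print(f"last value {extreme:>6}: mean = {mean(data):>6}, median = {median(data)}")
```

The mean grows with the extreme value, while the median stays at 3 in every case.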
2.2. Calculation Method
- Mean: Calculated by summing all values and dividing by the number of values. It takes into account every value in the dataset.
- Median: Determined by finding the middle value in a sorted dataset. It only considers the position of the values, not their actual magnitudes.
2.3. Use Cases
- Mean: Best used when the data is normally distributed and does not contain significant outliers. It provides a good measure of the “average” value in such cases.
- Median: Preferred when the data is skewed or contains outliers. It provides a more stable measure of central tendency that is not distorted by extreme values.
2.4. Interpretation
- Mean: Represents the balancing point of the dataset. It is the value around which the data is centered.
- Median: Represents the midpoint of the dataset. It is the value that separates the lower 50% of the data from the upper 50%.
2.5. Mathematical Properties
- Mean: Has useful mathematical properties that make it suitable for further statistical analysis, such as calculating variance and standard deviation.
- Median: Lacks some of the mathematical properties of the mean, making it less suitable for certain types of statistical analysis.
2.6. Data Distribution
- Symmetric Distribution: In a perfectly symmetric distribution, the mean and median are equal. This indicates that the data is evenly distributed around the central value.
- Skewed Distribution: In a skewed distribution, the mean is pulled towards the tail of the distribution, while the median remains closer to the center. This difference is a key indicator of skewness.
2.7. Impact of Data Transformation
- Linear Transformations: Applying a linear transformation (e.g., multiplying by a constant or adding a constant) affects both the mean and median predictably.
- Non-linear Transformations: Non-linear transformations can affect the mean and median differently, potentially altering their relationship.
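A short sketch can make the contrast concrete. It assumes a hypothetical scale `a = 2` and shift `b = 10` for the linear case and uses the natural logarithm as the non-linear transformation. Note that the median commutes exactly with the logarithm here only because the dataset has an odd number of values, so the median is a single data point.

```python
import math
from statistics import mean, median

data = [1, 2, 3, 4, 100]
a, b = 2, 10  # hypothetical scale and shift, chosen for illustration

# Linear transformation: both measures transform the same way as the data.
linear = [a * x + b for x in data]
linear_mean_matches = math.isclose(mean(linear), a * mean(data) + b)
linear_median_matches = math.isclose(median(linear), a * median(data) + b)

# Non-linear (log) transformation: the median follows the transformation,
# but the mean of the logs is not the log of the mean.
logged = [math.log(x) for x in data]
log_median_matches = math.isclose(median(logged), math.log(median(data)))
log_mean_matches = math.isclose(mean(logged), math.log(mean(data)))

print(linear_mean_matches, linear_median_matches)
print(log_median_matches, log_mean_matches)
```

In this sketch the mean of the logged values is about 1.56, while the log of the original mean is about 3.09, which shows how a non-linear transformation can change the relationship between the two measures.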
Understanding how the mean and median compare in these aspects is essential for selecting the appropriate measure of central tendency for different types of data and analytical purposes.
3. Impact of Outliers on Mean and Median
Understanding how the mean and median compare under the influence of outliers is critical for data analysis. Outliers are extreme values in a dataset that can significantly distort the mean, while the median remains relatively unaffected.
3.1. How Outliers Affect the Mean
The mean is highly sensitive to outliers because it is calculated by summing all values in the dataset and dividing by the number of values. When outliers are present, they can disproportionately influence the sum, pulling the mean towards their extreme values.
Example:
Consider the dataset of incomes (in thousands of dollars): 50, 60, 70, 80, 1000
- Without the outlier: If we exclude the outlier 1000, the dataset is 50, 60, 70, 80.
- Mean = (50 + 60 + 70 + 80) / 4 = 65
- With the outlier: Including the outlier 1000, the dataset is 50, 60, 70, 80, 1000.
- Mean = (50 + 60 + 70 + 80 + 1000) / 5 = 252
In this example, the outlier 1000 significantly increases the mean from 65 to 252, making it a poor representation of the typical income in the dataset.
3.2. How Outliers Affect the Median
The median is robust to outliers because it depends only on the middle value(s) in a sorted dataset. However large an outlier is, its magnitude does not enter the calculation; adding or removing an outlier can only shift which value(s) sit in the middle.
Example:
Using the same dataset of incomes (in thousands of dollars): 50, 60, 70, 80, 1000
- Without the outlier: If we exclude the outlier 1000, the dataset is 50, 60, 70, 80.
- Sorted dataset: 50, 60, 70, 80
- Median = (60 + 70) / 2 = 65
- With the outlier: Including the outlier 1000, the dataset is 50, 60, 70, 80, 1000.
- Sorted dataset: 50, 60, 70, 80, 1000
- Median = 70
In this example, the median only changes from 65 to 70 when the outlier 1000 is included. This demonstrates that the median is much less sensitive to extreme values compared to the mean.
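The income comparison from Sections 3.1 and 3.2 can be reproduced with a few lines of Python. This is a minimal sketch; the variable names are illustrative.

```python
from statistics import mean, median

incomes = [50, 60, 70, 80, 1000]          # thousands of dollars
without_outlier = [x for x in incomes if x != 1000]

# How far does each measure move when the outlier is included?
mean_shift = mean(incomes) - mean(without_outlier)
median_shift = median(incomes) - median(without_outlier)

print(f"mean shift:   {mean_shift}")
print(f"median shift: {median_shift}")
```

Including the single outlier moves the mean by 187 (thousand dollars) but the median by only 5.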
3.3. Choosing Between Mean and Median in the Presence of Outliers
When outliers are present in a dataset, the median is generally preferred over the mean as a measure of central tendency. The median provides a more accurate representation of the “typical” value because it is not distorted by extreme values.
Guidelines:
- Use the median: When the dataset contains outliers or is skewed.
- Use the mean: When the dataset is approximately normally distributed and does not contain significant outliers.
3.4. Examples in Real-World Scenarios
- Income Data: As shown in the example above, income data often contains outliers (e.g., very high earners). The median income is a better measure of the typical income than the mean income.
- Housing Prices: Housing prices can also have outliers (e.g., luxury homes). The median house price is a more representative measure of the typical home price than the mean house price.
- Test Scores: If a few students score exceptionally low or high on a test, the median score will be a more stable measure of the typical performance than the mean score.
3.5. Techniques to Handle Outliers
While the median is robust to outliers, it may still be necessary to address outliers in some situations. Here are some techniques to handle outliers:
- Removal: Remove outliers from the dataset if they are due to errors or are not representative of the population.
- Transformation: Apply a mathematical transformation (e.g., logarithmic transformation) to reduce the impact of outliers.
- Winsorizing: Replace extreme values with less extreme values. For example, replace the top 5% of values with the value at the 95th percentile and the bottom 5% of values with the value at the 5th percentile.
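Winsorizing can be sketched as follows. Percentile definitions vary between tools, so this illustration makes a simplifying assumption: instead of percentiles, it clamps a fixed count `k` of values at each end to the nearest remaining value. The function name `winsorize` and the parameter `k` are hypothetical.

```python
from statistics import mean


def winsorize(values, k):
    """Clamp the k lowest and k highest values to the nearest kept value.

    Count-based simplification of percentile-based winsorizing.
    """
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must be non-negative and leave at least one value")
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    # Preserve the original order; only the extremes are replaced.
    return [min(max(v, low), high) for v in values]


incomes = [50, 60, 70, 80, 1000]
clamped = winsorize(incomes, k=1)
print(clamped, mean(clamped))
```

Unlike removal, winsorizing keeps the number of observations unchanged, which is one reason it is sometimes preferred.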
Understanding how the mean and median compare under the influence of outliers and employing appropriate techniques to handle them is crucial for accurate and meaningful data analysis.
4. Mean vs. Median: Which Measure to Use?
Deciding whether to use the mean or median depends on the nature of the data and the purpose of the analysis. Understanding how the mean and median compare is essential for making the right choice. Here’s a guide to help you determine which measure is more appropriate:
4.1. When to Use the Mean
The mean is most suitable when the data meets certain conditions:
- Normally Distributed Data: The data follows a normal distribution, also known as a Gaussian distribution, which is symmetric and bell-shaped.
- No Significant Outliers: The dataset does not contain extreme values that can disproportionately influence the mean.
- Equal Intervals: The data is measured on an interval or ratio scale, where the intervals between values are equal.
Advantages of Using the Mean:
- Simple to Calculate: The mean is easy to compute and understand.
- Utilizes All Data Points: It takes into account every value in the dataset.
- Mathematical Properties: The mean has useful mathematical properties that make it suitable for further statistical analysis, such as calculating variance, standard deviation, and confidence intervals.
- Efficiency: In a normal distribution, the mean is the most efficient estimator of the population center.
Examples of When to Use the Mean:
- Average Test Scores: If the test scores of a class are normally distributed and there are no significant outliers, the mean test score is a good measure of the class’s performance.
- Average Height: The heights of individuals in a population are typically approximately normally distributed, making the mean height a suitable measure.
- Average Temperature: The average daily temperature over a month can be a useful measure if there are no extreme temperature spikes.
4.2. When to Use the Median
The median is preferred when the data does not meet the conditions for using the mean:
- Skewed Data: The data is not symmetrically distributed and has a long tail on one side.
- Presence of Outliers: The dataset contains extreme values that can significantly distort the mean.
- Ordinal Data: The data is measured on an ordinal scale, where the values have a meaningful order but the intervals between values are not equal.
Advantages of Using the Median:
- Robust to Outliers: The median is not affected by extreme values, making it a more stable measure of central tendency.
- Suitable for Skewed Data: It provides a better representation of the “typical” value in skewed distributions.
- Easy to Understand: The median is simple to interpret as the middle value in a dataset.
- Applicable to Ordinal Data: It can be used with ordinal data, where the mean is not appropriate.
Examples of When to Use the Median:
- Income Data: Income distributions are typically skewed, with a few individuals earning significantly more than the majority. The median income is a better measure of the typical income than the mean income.
- Housing Prices: Housing prices often have outliers due to luxury homes. The median house price is a more representative measure of the typical home price than the mean house price.
- Customer Satisfaction Ratings: If customer satisfaction is rated on an ordinal scale (e.g., very dissatisfied, dissatisfied, neutral, satisfied, very satisfied), the median rating is a more appropriate measure than the mean rating.
- Reaction Times: In experiments, reaction times can be skewed due to occasional distractions or lapses in attention. The median reaction time is a more reliable measure of typical response time.
4.3. Guidelines for Choosing Between Mean and Median
Here are some general guidelines to help you decide whether to use the mean or median:
- Check for Skewness: Examine the data for skewness using histograms, box plots, or skewness coefficients. If the data is highly skewed, the median is generally preferred.
- Identify Outliers: Look for outliers in the dataset using scatter plots, box plots, or outlier detection methods. If there are significant outliers, the median is more robust.
- Consider the Scale of Measurement: If the data is measured on an ordinal scale, the median is the appropriate measure. If the data is measured on an interval or ratio scale and is normally distributed, the mean is suitable.
- Think About the Research Question: Consider what you are trying to measure and which measure of central tendency best answers your research question. If you want to know the “typical” value and the data is skewed or contains outliers, the median is a better choice. If you want to know the “average” value and the data is normally distributed, the mean is suitable.
4.4. Additional Considerations
- Trimmed Mean: A trimmed mean is calculated by removing a certain percentage of the highest and lowest values in a dataset before calculating the mean. This can be a compromise between the mean and median, providing a measure that is less sensitive to outliers than the mean but still utilizes more data points than the median.
- Weighted Mean: A weighted mean is calculated by assigning different weights to different values in a dataset. This can be useful when some values are more important or reliable than others.
Understanding how the mean and median compare and following these guidelines will help you choose the appropriate measure of central tendency for your data and analytical purposes.
5. Examples Illustrating the Difference
To further illustrate how the mean and median compare, let’s explore some practical examples across different scenarios.
5.1. Example 1: Real Estate Prices
Scenario:
Consider a dataset of housing prices in a neighborhood (in thousands of dollars): 200, 250, 300, 350, 400, 1000
Analysis:
- Mean:
Mean = (200 + 250 + 300 + 350 + 400 + 1000) / 6 = 2500 / 6 ≈ 416.67
- Median:
Sorted dataset: 200, 250, 300, 350, 400, 1000
Median = (300 + 350) / 2 = 650 / 2 = 325
Interpretation:
The mean housing price is approximately $416,670, while the median housing price is $325,000. The high outlier of $1,000,000 significantly pulls the mean upwards, making it less representative of the typical housing price in the neighborhood. The median, being robust to outliers, provides a more accurate representation of the central tendency.
Conclusion:
In this case, the median is a better measure to describe the typical housing price because it is not affected by the outlier.
5.2. Example 2: Salaries in a Company
Scenario:
Consider a dataset of annual salaries (in thousands of dollars) of employees in a small company: 40, 45, 50, 55, 60, 200
Analysis:
- Mean:
Mean = (40 + 45 + 50 + 55 + 60 + 200) / 6 = 450 / 6 = 75
- Median:
Sorted dataset: 40, 45, 50, 55, 60, 200
Median = (50 + 55) / 2 = 105 / 2 = 52.5
Interpretation:
The mean salary is $75,000, while the median salary is $52,500. The outlier salary of $200,000 (likely the CEO’s salary) significantly increases the mean, making it not representative of the typical employee’s salary. The median provides a more accurate representation of the central tendency.
Conclusion:
The median is a better measure to describe the typical salary in the company because it is not influenced by the CEO’s high salary.
5.3. Example 3: Exam Scores
Scenario:
Consider a dataset of exam scores of students in a class: 60, 70, 75, 80, 85, 90, 95
Analysis:
- Mean:
Mean = (60 + 70 + 75 + 80 + 85 + 90 + 95) / 7 = 555 / 7 ≈ 79.29
- Median:
Sorted dataset: 60, 70, 75, 80, 85, 90, 95
Median = 80
Interpretation:
The mean exam score is approximately 79.29, while the median exam score is 80. In this case, the dataset is relatively symmetric with no significant outliers, so the mean and median are close to each other.
Conclusion:
Either the mean or the median could be used to describe the central tendency of the exam scores, as they provide similar values. The mean might be preferred because it utilizes all data points, and the distribution is reasonably symmetric.
5.4. Example 4: Waiting Times at a Customer Service Center
Scenario:
Consider a dataset of waiting times (in minutes) for customers at a service center: 2, 3, 4, 5, 6, 7, 30
Analysis:
- Mean:
Mean = (2 + 3 + 4 + 5 + 6 + 7 + 30) / 7 = 57 / 7 ≈ 8.14
- Median:
Sorted dataset: 2, 3, 4, 5, 6, 7, 30
Median = 5
Interpretation:
The mean waiting time is approximately 8.14 minutes, while the median waiting time is 5 minutes. The outlier waiting time of 30 minutes significantly increases the mean, making it less representative of the typical waiting time. The median provides a more accurate representation of the central tendency.
Conclusion:
The median is a better measure to describe the typical waiting time because it is not affected by the outlier.
5.5. Example 5: Retail Sales
Scenario:
Consider a dataset of daily sales (in dollars) for a small retail store over a week: 100, 120, 130, 150, 160, 1000, 180
Analysis:
- Mean:
Mean = (100 + 120 + 130 + 150 + 160 + 1000 + 180) / 7 = 1840 / 7 ≈ 262.86
- Median:
Sorted dataset: 100, 120, 130, 150, 160, 180, 1000
Median = 150
Interpretation:
The mean daily sales is approximately $262.86, while the median daily sales is $150. The outlier sales day of $1000 (perhaps due to a special promotion) significantly increases the mean, making it less representative of the typical daily sales. The median provides a more accurate representation of the central tendency.
Conclusion:
The median is a better measure to describe the typical daily sales because it is not influenced by the outlier.
These examples illustrate how the mean and median compare in different real-world scenarios and highlight the importance of choosing the appropriate measure of central tendency based on the characteristics of the data.
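The five examples can be checked side by side with a short script. This is a sketch only; the dictionary keys are labels chosen for this illustration.

```python
from statistics import mean, median

examples = {
    "housing prices (k$)": [200, 250, 300, 350, 400, 1000],
    "salaries (k$)": [40, 45, 50, 55, 60, 200],
    "exam scores": [60, 70, 75, 80, 85, 90, 95],
    "waiting times (min)": [2, 3, 4, 5, 6, 7, 30],
    "daily sales ($)": [100, 120, 130, 150, 160, 1000, 180],
}

# Mean (rounded to two decimals) and median for each dataset.
summary = {
    name: (round(mean(values), 2), median(values))
    for name, values in examples.items()
}

for name, (avg, mid) in summary.items():
    print(f"{name:<22} mean = {avg:>7}  median = {mid}")
```

In four of the five datasets a single large value pulls the mean well above the median; only the exam scores, which have no outlier, show the two measures close together.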
6. Skewness and the Relationship Between Mean and Median
Skewness is a measure of the asymmetry of a probability distribution. Understanding how the mean and median compare in the context of skewness is crucial for interpreting data accurately.
6.1. Understanding Skewness
Skewness refers to the extent to which a distribution is not symmetric. A symmetric distribution has equal tails on both sides of the center, while a skewed distribution has a longer tail on one side.
Types of Skewness:
- Positive Skew (Right Skew): The distribution has a long tail extending to the right (higher values). The mean is typically greater than the median.
- Negative Skew (Left Skew): The distribution has a long tail extending to the left (lower values). The mean is typically less than the median.
- Zero Skew: The distribution is symmetric. The mean and median are approximately equal.
6.2. How Skewness Affects the Mean
The mean is sensitive to skewness because it is calculated by summing all values in the dataset. In a skewed distribution, the long tail pulls the mean towards that tail.
- Positive Skew: The long tail of high values pulls the mean to the right, making it greater than the median.
- Negative Skew: The long tail of low values pulls the mean to the left, making it less than the median.
6.3. How Skewness Affects the Median
The median is robust to skewness because it only considers the middle value(s) in a sorted dataset. The position of the middle value(s) is not significantly affected by the values in the tails.
- Positive Skew: The median remains closer to the peak of the distribution and is less affected by the long tail of high values.
- Negative Skew: The median remains closer to the peak of the distribution and is less affected by the long tail of low values.
6.4. Relationship Between Mean, Median, and Skewness
The relationship between the mean and median provides valuable information about the skewness of a distribution:
- Mean > Median: Typically indicates positive skew. The distribution has a long tail extending to the right. This is a rule of thumb rather than a guarantee.
- Mean < Median: Typically indicates negative skew. The distribution has a long tail extending to the left. The same caveat applies.
- Mean ≈ Median: Indicates zero skew or approximate symmetry. The distribution is roughly symmetric.
6.5. Examples Illustrating Skewness
- Income Distribution: Income distributions are typically positively skewed, with a few individuals earning significantly more than the majority. The mean income is greater than the median income.
- Age at Retirement: The distribution of age at retirement is often negatively skewed, with a long tail of individuals retiring earlier than the majority. The mean age at retirement is less than the median age at retirement.
- Symmetric Test Scores: If the test scores of a class are normally distributed, the mean and median scores will be approximately equal, indicating zero skew.
6.6. Identifying Skewness
Several methods can be used to identify skewness in a dataset:
- Histograms: A histogram provides a visual representation of the distribution. Look for a long tail on one side of the distribution.
- Box Plots: A box plot displays the median, quartiles, and outliers. In a skewed distribution, the median will not be in the center of the box, and the whiskers will be of different lengths.
- Skewness Coefficient: The skewness coefficient is a numerical measure of skewness. A positive value indicates positive skew, a negative value indicates negative skew, and a value close to zero indicates approximate symmetry.
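Several definitions of the skewness coefficient are in use. The sketch below assumes one common choice, the moment coefficient (third central moment divided by the 1.5 power of the second central moment, both computed with a divisor of n). The function name `moment_skewness` is hypothetical, and statistical packages may apply a sample-size correction that gives slightly different numbers.

```python
from statistics import mean


def moment_skewness(values):
    """Moment coefficient of skewness, using population (divide-by-n) moments."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    center = mean(values)
    m2 = sum((x - center) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - center) ** 3 for x in values) / n   # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are identical")
    return m3 / m2 ** 1.5


right_skewed = [2, 3, 4, 5, 6, 7, 30]        # long tail of high values
symmetric = [60, 70, 80, 90, 100]
left_skewed = [-x for x in right_skewed]     # mirror image of the first dataset

print(moment_skewness(right_skewed))
print(moment_skewness(symmetric))
print(moment_skewness(left_skewed))
```

The sign of the coefficient matches the direction of the long tail: positive for the right-skewed data, zero for the symmetric data, and negative for the mirrored data.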
6.7. Using Mean and Median to Interpret Data
Understanding the relationship between the mean, median, and skewness can help you interpret data more accurately:
- Positive Skew: When the mean is greater than the median, be aware that the “average” value (mean) is being pulled upwards by a few high values. The median provides a better representation of the “typical” value.
- Negative Skew: When the mean is less than the median, be aware that the “average” value (mean) is being pulled downwards by a few low values. The median provides a better representation of the “typical” value.
- Symmetry: When the mean and median are approximately equal, the distribution is roughly symmetric, and either measure can be used to describe the central tendency.
Understanding how the mean and median compare in the context of skewness is essential for accurate and meaningful data analysis.
7. Applications in Different Fields
The concepts of mean and median are widely used across various fields. Understanding how the mean and median compare in these contexts is crucial for effective data interpretation and decision-making.
7.1. Finance
- Income Analysis: In finance, income distributions are often skewed, with a few high earners. The median income is used to represent the typical income, while the mean income can be inflated by outliers.
- Real Estate: Real estate prices can be skewed due to luxury homes. The median home price is a more stable measure of central tendency than the mean home price.
- Investment Returns: When analyzing investment returns, the median return can provide a more accurate picture of typical performance, especially if there are extreme gains or losses.
7.2. Healthcare
- Patient Wait Times: Patient wait times at hospitals or clinics can be skewed due to emergencies. The median wait time is a better measure of the typical waiting experience.
- Length of Stay: The length of stay in a hospital can be skewed due to a few patients with long stays. The median length of stay provides a more representative measure.
- Medical Test Results: When analyzing medical test results, such as cholesterol levels, the median can be more informative if the distribution is skewed or contains outliers.
7.3. Education
- Exam Scores: Exam scores can be skewed if some students perform exceptionally well or poorly. The median score can provide a more stable measure of typical performance.
- Teacher Salaries: Teacher salaries can be skewed due to a few highly paid administrators. The median salary can provide a more accurate representation of typical teacher compensation.
- Student Loan Debt: The distribution of student loan debt can be skewed, with some students having very high debt. The median debt level can provide a more representative measure of typical debt.
7.4. Economics
- Wage Analysis: Wage distributions are often skewed, with a few high earners. The median wage is used to represent the typical wage, while the mean wage can be inflated by outliers.
- Household Income: Household income can be skewed due to a few wealthy households. The median household income provides a more accurate representation of typical household income.
- Poverty Rates: Relative poverty measures are commonly defined as a fraction of the median income (for example, 50% or 60% of the median). Households with incomes well below the median are considered to be at a higher risk of poverty.
7.5. Marketing
- Customer Spending: Customer spending can be skewed due to a few high-spending customers. The median spending amount is a more stable measure of typical customer spending.
- Website Traffic: Website traffic can be skewed due to viral content. The median number of visitors per day is a better measure of typical website traffic.
- Advertising Costs: Advertising costs can be skewed due to a few expensive campaigns. The median cost per click provides a more representative measure of typical advertising costs.
7.6. Sports Analytics
- Player Salaries: In professional sports, player salaries are often skewed, with a few star players earning significantly more than the majority. The median salary is a more representative measure of the typical player’s compensation.
- Game Attendance: Game attendance can be skewed due to a few popular games. The median attendance number provides a more accurate representation of typical game attendance.
- Player Performance Metrics: Performance metrics, such as points scored or goals scored, can be skewed if a few players consistently outperform others. The median performance metric provides a more stable measure of typical player performance.
7.7. Environmental Science
- Pollution Levels: Pollution levels can be skewed due to a few extreme events. The median pollution level is a more representative measure of typical pollution levels.
- Rainfall Amounts: Rainfall amounts can be skewed due to a few heavy storms. The median rainfall amount provides a more accurate representation of typical rainfall patterns.
- Species Population Sizes: Population sizes of certain species can be skewed due to a few large populations. The median population size provides a more stable measure of typical population sizes.
7.8. Social Sciences
- Survey Responses: Survey responses can be skewed if some respondents provide extreme answers. The median response provides a more stable measure of typical opinions or attitudes.
- Crime Rates: Crime rates can be skewed due to a few high-crime areas. The median crime rate provides a more accurate representation of typical crime levels.
- Education Levels: Education levels can be skewed due to a few highly educated individuals. The median education level provides a more representative measure of typical education attainment.
Understanding how the mean and median compare across these different fields enables professionals to make informed decisions based on accurate data interpretation.
8. Advanced Concepts and Considerations
Beyond the basic understanding of mean and median, there are more advanced concepts and considerations that can enhance your analytical capabilities. Understanding how the mean and median compare in these contexts is essential for sophisticated data analysis.
8.1. Trimmed Mean
The trimmed mean is a compromise between the mean and median. It is calculated by removing a certain percentage of the highest and lowest values in a dataset before computing the mean.
Advantages:
- Less Sensitive to Outliers: The trimmed mean is less sensitive to outliers than the standard mean but still utilizes more data points than the median.
- Robustness: It provides a more robust measure of central tendency in the presence of outliers.
Example:
Consider the dataset: 1, 2, 3, 4, 5, 6, 7, 8, 9, 100
- Standard Mean: (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 100) / 10 = 145 / 10 = 14.5
- 10% Trimmed Mean: Remove the lowest 10% and the highest 10% of values (with 10 values, that is one value from each end: 1 and 100). The trimmed dataset is 2, 3, 4, 5, 6, 7, 8, 9. The trimmed mean is (2 + 3 + 4 + 5 + 6 + 7 + 8 + 9) / 8 = 44 / 8 = 5.5
The trimmed mean provides a more accurate representation of the central tendency than the standard mean, which is heavily influenced by the outlier 100.
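A trimmed mean can be sketched in a few lines of Python. The function name `trimmed_mean` and its `proportion` parameter are hypothetical, and the sketch assumes the number of values to drop at each end is rounded down when it is not a whole number.

```python
from statistics import mean


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)   # values to drop at each end (rounded down)
    if 2 * k >= len(ordered):
        raise ValueError("proportion is too large; no values would remain")
    return mean(ordered[k:len(ordered) - k])


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
standard = mean(data)
trimmed = trimmed_mean(data, 0.10)
print(standard, trimmed)
```

With `proportion=0.10` and ten values, one value is dropped from each end, reproducing the worked example.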
8.2. Weighted Mean
The weighted mean is calculated by assigning different weights to different values in a dataset. This can be useful when some values are more important or reliable than others.
Formula:
Weighted Mean = (Σ(wᵢ * xᵢ)) / Σwᵢ
Where:
- wᵢ is the weight assigned to the i-th value.
- xᵢ is the i-th value in the dataset.
Example:
Consider a student’s grades in a course:
- Homework: 80 (weight = 20%)
- Midterm Exam: 70 (weight = 30%)
- Final Exam: 90 (weight = 50%)
Weighted Mean = (0.20 * 80) + (0.30 * 70) + (0.50 * 90) = 16 + 21 + 45 = 82 (the weights sum to 1, so dividing by Σwᵢ leaves the result unchanged)
The weighted mean provides a more accurate representation of the student’s overall performance than the standard mean, which would give equal weight to each grade component.
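The weighted mean formula translates directly into code. In this sketch the function name `weighted_mean` is hypothetical; it divides by the sum of the weights, so the weights do not need to add up to 1.

```python
from statistics import mean


def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


grades = [80, 70, 90]            # homework, midterm exam, final exam
weights = [0.20, 0.30, 0.50]

weighted = weighted_mean(grades, weights)
unweighted = mean(grades)
print(weighted, unweighted)
```

The unweighted mean of the three grades is 80, while the weighted mean is 82, because the final exam (the highest grade) carries the most weight.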
8.3. Geometric Mean
The geometric mean is used to calculate the average rate of change over time. It is particularly useful for financial analysis, such as calculating average investment returns.
Formula:
Geometric Mean = (x₁ * x₂ * … * xₙ)^(1/n)
Where:
- x₁, x₂, …, xₙ are the values in the dataset.
- n is the number of values in the dataset.
Example:
Consider an investment that returns 10% in the first year, 20% in the second year, and 30% in the third year.
Geometric Mean = (1.10 * 1.20 * 1.30)^(1/3) = (1.716)^(1/3) ≈ 1.197
The average annual return is approximately 19.7%.
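The investment example can be checked with the standard library, assuming Python 3.8 or later, where `statistics.geometric_mean` is available. Each return is first expressed as a growth factor (1 + return).

```python
from statistics import geometric_mean, mean

growth_factors = [1.10, 1.20, 1.30]   # 10%, 20%, and 30% annual returns

g = geometric_mean(growth_factors)
arithmetic = mean(growth_factors)

# Compounding the geometric mean over three years recovers the total growth.
compounded = g ** len(growth_factors)

print(f"geometric mean:  {g:.4f}")
print(f"arithmetic mean: {arithmetic:.4f}")
print(f"compounded:      {compounded:.4f}")
```

The geometric mean is slightly below the arithmetic mean of 1.20, and only the geometric mean compounds back to the actual three-year growth of 1.716.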
8.4. Harmonic Mean
The harmonic mean is used to calculate the average rate when the rates are expressed as ratios. It is particularly useful for problems involving rates and ratios, such as average speed.
Formula:
Harmonic Mean = n / (Σ(1 / xᵢ))
Where:
- x₁, x₂, …, xₙ are the values in the dataset.
- n is the number of values in the dataset.
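A minimal sketch of the formula above is shown here. The rates of 40 and 60 are hypothetical values picked for illustration and are not taken from this guide; the function name `harmonic_by_formula` is likewise an assumption.

```python
from statistics import harmonic_mean


def harmonic_by_formula(values):
    """n divided by the sum of reciprocals; requires positive values."""
    if not values or any(x <= 0 for x in values):
        raise ValueError("harmonic mean needs a non-empty list of positive values")
    return len(values) / sum(1 / x for x in values)


rates = [40, 60]   # hypothetical rates, e.g. speeds over two equal distances

by_formula = harmonic_by_formula(rates)
by_library = harmonic_mean(rates)
print(by_formula, by_library)
```

For these two rates the harmonic mean is 48, which is lower than the arithmetic mean of 50 because the slower rate applies for a longer time.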
Example:
Consider a car that travels 100 miles at 50 mph and then returns 100 miles at 25 mph.
Harmonic Mean = 2 / ((1/50) + (1