SUMMARY - Miscellaneous metrics
Miscellaneous metrics for evaluating the forecast of a time series.
We assume an example set of data points (observations) together with the forecast.
Durbin-Watson
The Durbin-Watson statistic is a test statistic used to detect autocorrelation in the residuals from a regression analysis.
By checking for autocorrelation in the residuals, it helps ensure that the model's assumptions are met and that the results are reliable.
Key Points
- Autocorrelation: This refers to the correlation between successive observations of a variable. In regression, it means that the errors (or residuals) in one observation are related to the errors in other observations.
- Durbin-Watson Test: This test specifically checks for first-order autocorrelation, meaning the correlation between the error in one observation and the error in the immediately preceding observation.
- Value Range: The Durbin-Watson statistic ranges from 0 to 4.
. Value of 2: Indicates no autocorrelation.
. Values below 2: Suggest positive autocorrelation (errors tend to be similar to the previous errors).
. Values above 2: Indicate negative autocorrelation (errors tend to be opposite to the previous errors).
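The statistic behind the key points above can be sketched in a few lines. This is a minimal illustration, not a full regression workflow; the residuals used below are hypothetical example values.

```python
# Minimal sketch of the Durbin-Watson statistic:
# DW = sum((e_t - e_{t-1})^2) / sum(e_t^2), over the residuals e_t.
def durbin_watson(residuals):
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Alternating residuals (negative autocorrelation) push the value toward 4;
# runs of same-sign residuals (positive autocorrelation) push it toward 0.
print(durbin_watson([1, -1, 1, -1, 1, -1]))
print(durbin_watson([1, 1, 1, -1, -1, -1]))
```

In practice a library implementation (for example, one provided by a statistics package) would normally be used; the sketch only shows where the 0-to-4 range comes from.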
Importance
Autocorrelation in the residuals can:
- Violate the assumptions of regression analysis: This can lead to unreliable estimates of the regression coefficients and their standard errors.
- Produce inefficient and potentially biased forecasts: If the model doesn't account for the autocorrelation, predictions may be inaccurate.
Skew
Skew in statistics is a measure of the asymmetry of a probability distribution. It tells us whether the distribution is symmetrical or not.
Types of Skew
- Positive Skew (Right Skew): The tail of the distribution is longer on the right side.
- Negative Skew (Left Skew): The tail of the distribution is longer on the left side.
- Zero Skew: The distribution is symmetrical.
Causes of Skew
Skew can be caused by various factors, including:
- Outliers: Extreme values that pull the tail of the distribution in one direction.
- Non-normality: The data may not follow a normal distribution.
- Measurement error: Errors in the measurement process can introduce skew.
Importance of Skew
Skew is important because it can affect the interpretation of statistical measures like the mean and standard deviation. For example, in a positively skewed distribution, the mean is greater than the median, and in a negatively skewed distribution, the mean is less than the median.
How to Measure Skew
- Pearson's Coefficient of Skewness: This is a common measure of skew that is calculated using the mean, median, and standard deviation of the data.
- Bowley's Coefficient of Skewness: This is another measure of skew that is calculated using the quartiles of the data.
- Moment Coefficient of Skewness: This is a more general measure of skew that can be calculated using the moments of the data.
Notes
Skew is a descriptive statistic, meaning it describes the shape of the data. It is important to consider skew when interpreting statistical results. Skew can be corrected using transformations like the log transformation or the square root transformation.
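The log transformation mentioned above can be illustrated directly: compressing large values shrinks the right tail, which reduces positive skew. The sample data is hypothetical, and the skewness helper is redefined here so the example stands alone.

```python
import math

# Moment coefficient of skewness (redefined for a self-contained example).
def skewness(data):
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Right-skewed hypothetical sample; the log transform compresses the
# large values, so the transformed data is less skewed than the original.
sample = [1, 2, 2, 3, 3, 3, 10, 50]
logged = [math.log(x) for x in sample]
print(skewness(sample), skewness(logged))
```

Note the log transform only applies to strictly positive data; the square root transformation is a milder alternative that also tolerates zeros.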
Theil's U
Theil's U, also known as the uncertainty coefficient or entropy coefficient, is a measure of nominal association. It essentially measures how much knowing the value of one variable reduces the uncertainty about the other.
Key Concepts
- Measures Association: It quantifies the strength of the relationship between two nominal (categorical) variables.
- Based on Information Entropy: It's derived from the concept of information entropy, which measures the uncertainty or randomness of a variable.
- Range: Theil's U ranges from 0 to 1.
. 0: Indicates no association between the variables.
. 1: Indicates a perfect association, meaning knowing the value of one variable completely determines the value of the other.
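A minimal sketch of Theil's U, built from the entropy definitions above: U(x|y) = (H(x) - H(x|y)) / H(x), where H(x) is the entropy of x and H(x|y) the conditional entropy of x given y. The category labels below are hypothetical example data.

```python
from collections import Counter
from math import log

def theils_u(x, y):
    """Uncertainty coefficient U(x|y): how much y reduces uncertainty about x."""
    n = len(x)
    # Entropy of x: H(x) = -sum p(a) log p(a)
    px = Counter(x)
    hx = -sum((c / n) * log(c / n) for c in px.values())
    if hx == 0:
        return 1.0  # x is constant: nothing left to explain
    # Conditional entropy H(x|y) = -sum p(a,b) log( p(a,b) / p(b) )
    py = Counter(y)
    pxy = Counter(zip(x, y))
    hxy = -sum((c / n) * log(c / py[b]) for (a, b), c in pxy.items())
    return (hx - hxy) / hx

# Perfect association: y fully determines x -> 1.
print(theils_u(['a', 'a', 'b', 'b'], ['a', 'a', 'b', 'b']))
# Independence: y tells us nothing about x -> 0.
print(theils_u(['a', 'a', 'b', 'b'], ['c', 'd', 'c', 'd']))
```

Unlike symmetric association measures, Theil's U is asymmetric: U(x|y) and U(y|x) can differ, which is often useful when one variable is the predictor and the other the target.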