One of the most commonly used data transformation methods is taking the natural log of the original values. The log transformation works for data where the errors/residuals get larger for larger values of the variable(s). This pattern occurs in many data sets because the error, or change in the value of a variable, is often a percentage of the value rather than an absolute amount. For the same percent error, a larger value of the variable means a larger absolute error.
For example, a 5% error means the absolute error is 5% of the value of the variable. If the original value is 100, the error is 5% × 100, or 5. If the original value is 500, the error becomes 5% × 500, or 25.
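A quick sketch of this arithmetic in Python (the 5% error rate and the list of values are just illustrative, following the example above):

```python
# A fixed relative error produces a larger absolute error
# for larger values of the variable.
relative_error = 0.05  # 5%, as in the example above
for value in [100, 500, 2000]:
    absolute_error = relative_error * value
    print(f"value = {value:5d}  ->  absolute error = {absolute_error:.1f}")
```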
When we take logs, this multiplicative error becomes an additive one, because of the basic property of logarithms:

log(X × error) = log(X) + log(error)
The percent error therefore becomes the same additive error on the log scale, regardless of the original value of the variable. In other words, the non-uniform errors become uniform. That is why taking logs of the variable(s) often helps meet the assumptions of our statistical analysis.
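The sketch below illustrates this with simulated data (the sample size, slope, and 5% noise level are assumptions chosen for the illustration, not taken from the post). On the original scale the residuals spread out as the values grow, while on the log scale the spread is roughly the same everywhere:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate a variable whose error is a percentage of its value
# (multiplicative noise), so absolute errors grow with x.
x = np.linspace(10, 1000, 500)
multiplicative_error = rng.normal(loc=1.0, scale=0.05, size=x.size)  # ~5% noise
y = 3.0 * x * multiplicative_error

# On the original scale, the absolute residuals grow with x ...
residuals_raw = y - 3.0 * x
print("spread of raw residuals, small x:", residuals_raw[:100].std())
print("spread of raw residuals, large x:", residuals_raw[-100:].std())

# ... but on the log scale the multiplicative error becomes additive:
# log(y) = log(3) + log(x) + log(error), so the residual spread is uniform.
residuals_log = np.log(y) - (np.log(3.0) + np.log(x))
print("spread of log residuals, small x:", residuals_log[:100].std())
print("spread of log residuals, large x:", residuals_log[-100:].std())
```

Running this, the first pair of numbers should differ by several times (tracking the size of x), while the second pair should be nearly identical.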