Log–log plot: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
mNo edit summary
m Typed math quantities
 
(32 intermediate revisions by 19 users not shown)
Line 1: Line 1:
{{Short description|2D graphic with logarithmic scales on both axes}}
{{More citations needed|date=December 2009}}
[[Image:LogLog exponentials.svg|thumb|A log–log plot of ''y''&nbsp;=&nbsp;''x''&nbsp;(blue), ''y''&nbsp;=&nbsp;''x''<sup>2</sup>&nbsp;(green), and ''y''&nbsp;=&nbsp;''x''<sup>3</sup>&nbsp;(red).<br>Note the logarithmic scale markings on each of the axes, and that the log&nbsp;''x'' and log&nbsp;''y'' axes (where the logarithms are 0) are where ''x'' and ''y'' themselves are 1.]]

In [[science]] and [[engineering]], a '''log–log graph''' or '''log–log plot''' is a two-dimensional graph of numerical data that uses [[logarithmic scale]]s on both the horizontal and vertical axes. [[Monomial]]s – relationships of the form <math>y=ax^k</math> – appear as straight lines in a log–log graph, with the power term corresponding to the slope, and the constant term corresponding to the intercept of the line. Thus these graphs are very useful for recognizing these relationships and [[estimating parameters]]. Any base can be used for the logarithm, though most commonly base 10 (common logs) are used.
[[File:Comparison of simple power law curves in original and log-log scale.png|thumb|Comparison of linear, concave, and convex functions when plotted using a linear scale (left) or a log scale (right).]]

In [[science]] and [[engineering]], a '''log–log graph''' or '''log–log plot''' is a two-dimensional graph of numerical data that uses [[logarithmic scale]]s on both the horizontal and vertical axes. [[Exponentiation#Power_functions|Power functions]] – relationships of the form <math>y=ax^k</math> – appear as straight lines in a log–log graph, with the exponent corresponding to the slope, and the logarithm of the coefficient corresponding to the intercept. Thus these graphs are very useful for recognizing these relationships and [[estimating parameters]]. Any base can be used for the logarithm, though base 10 (common logarithms) is most commonly used.


== Relation with monomials ==
Given a monomial equation <math>y=ax^k,</math> taking the logarithm of the equation (with any base) yields:
:<math>\log y = k \log x + \log a.</math>
<math display="block">\log y = k \log x + \log a.</math>
Setting <math>X = \log x</math> and <math>Y = \log y,</math> which corresponds to using a log–log graph, yields the equation:
:<math>Y = mX + b</math>
where ''m''&nbsp;=&nbsp;''k'' is the slope of the line ([[Grade (slope)|gradient]]) and ''b''&nbsp;=&nbsp;log&nbsp;''a'' is the intercept on the (log&nbsp;''y'')-axis, meaning where log&nbsp;''x''&nbsp;=&nbsp;0, so, reversing the logs, ''a'' is the ''y'' value corresponding to ''x''&nbsp;=&nbsp;1.<ref>[http://www.intmath.com/Exponential-logarithmic-functions/7_Graphs-log-semilog.php M. Bourne ''Graphs on Logarithmic and Semi-Logarithmic Paper'' (www.intmath.com)]</ref>


Setting <math>X = \log x</math> and <math>Y = \log y,</math> which corresponds to using a log–log graph, yields the equation
== Equations ==
<math display="block">Y = mX + b</math>
The equation for a line on a log–log scale would be:


where ''m''&nbsp;=&nbsp;''k'' is the slope of the line ([[Grade (slope)|gradient]]) and ''b''&nbsp;=&nbsp;log&nbsp;''a'' is the intercept on the (log&nbsp;''y'')-axis, meaning where log&nbsp;''x''&nbsp;=&nbsp;0, so, reversing the logs, ''a'' is the ''y'' value corresponding to ''x''&nbsp;=&nbsp;1.<ref>{{Cite web |last=Bourne |first=Murray |title=7. Log-Log and Semi-log Graphs |url=https://www.intmath.com/exponential-logarithmic-functions/7-graphs-log-semilog.php |access-date=2024-10-15 |website=www.intmath.com |language=en-us}}</ref>
:<math> \log_{10}F(x) = m \log_{10}x + b, </math>
:<math> F(x) = x^m\cdot10^b, </math>


== Equations ==
The equation for a line on a log–log scale would be:
<math display="block"> \log_{10}F(x) = m \log_{10}x + b, </math>
<math display="block"> F(x) = x^m\cdot10^b, </math>
where ''m'' is the slope and ''b'' is the intercept point on the log plot.


=== Slope of a log–log plot ===
[[Image:Slope of log-log plot.PNG|thumbnail|250px|Finding the slope of a log–log plot using ratios]]
To find the slope of the plot, two points are selected on the ''x''-axis, say ''x''<sub>1</sub> and ''x''<sub>2</sub>. Using the above equation:
To find the slope of the plot, two points are selected on the ''x''-axis, say ''x''<sub>1</sub> and ''x''<sub>2</sub>. Using the below equation:
<math display="block"> \log[F (x_1)] = m \log (x_1) + b, </math>

:<math> \log[F (x_1)] = m \log (x_1) + b, \, </math>

and
<math display="block"> \log[F (x_2)] = m \log(x_2) + b. </math>

:<math> \mathrm {log}[F (x_2)] = m \log (x_2) + b. \, </math>

The slope ''m'' is found taking the difference:
<math display="block"> m = \frac { \log (F_2) - \log (F_1)} { \log(x_2) - \log(x_1) } = \frac {\log (F_2/F_1)}{\log(x_2/x_1)}, </math>

where ''F''<sub>1</sub> is shorthand for ''F''(''x''<sub>1</sub>) and ''F''<sub>2</sub> is shorthand for ''F''(''x''<sub>2</sub>). The figure at right illustrates the formula. Notice that the slope in the example of the figure is ''negative''. The formula also provides a negative slope, as can be seen from the following property of the logarithm:
:<math> m = \frac { \mathrm {log} (F_2) - \mathrm {log} (F_1)} { \log(x_2) - \log(x_1) } = \frac {\log (F_2/F_1)}{\log(x_2/x_1)}, \,</math>
<math display="block"> \log(x_1/x_2) = -\log(x_2/x_1). </math>

where ''F''<sub>1</sub> is shorthand for ''F'' ( ''x''<sub>1</sub> ) and ''F''<sub>2</sub> is shorthand for ''F'' ( ''x''<sub>2</sub> ). The figure at right illustrates the formula. Notice that the slope in the example of the figure is ''negative''. The formula also provides a negative slope, as can be seen from the following property of the logarithm:

:<math> \log(x_1/x_2) = -\log(x_2/x_1). \, </math>


=== Finding the function from the log–log plot ===
The above procedure now is reversed to find the form of the function ''F''(''x'') using its (assumed) known log–log plot. To find the function ''F'', pick some ''fixed point'' (''x''<sub>0</sub>, ''F''<sub>0</sub>), where ''F''<sub>0</sub> is shorthand for ''F''(''x''<sub>0</sub>), somewhere on the straight line in the above graph, and further some other ''arbitrary point'' (''x''<sub>1</sub>, ''F''<sub>1</sub>) on the same graph. Then from the slope formula above:
<math display="block"> m = \frac {\log (F_1 / F_0)}{\log(x_1 / x_0)} </math>

::<math> m = \frac {\log (F_1 / F_0)}{\log(x_1 / x_0)} </math>

which leads to
<math display="block"> \log(F_1 / F_0) = m \log(x_1 / x_0) = \log[(x_1 / x_0)^m ]. </math>

::<math> \log(F_1 / F_0) = m \log(x_1 / x_0) = \log[(x_1 / x_0)^m ]. \, </math>

Notice that 10<sup>log<sub>10</sub>(''F''<sub>1</sub>)</sup> = ''F''<sub>1</sub>. Therefore, the logs can be inverted to find:
<math display="block"> \frac{F_1}{F_0} = \left(\frac{x_1}{x_0}\right)^m </math>

: <math> \frac{F_1}{F_0} = \left(\frac{x_1}{x_0}\right)^m </math>

or
<math display="block">F_1 = \frac{F_0}{x_0^m} \, x_1^m, </math>
which means that
<math display="block"> F(x) = \mathrm{constant}\cdot x^m. </math>
In other words, ''F'' is proportional to ''x'' to the power of the slope of the straight line of its log–log graph. Specifically, a straight line on a log–log plot containing points (''x''<sub>0</sub>,&nbsp;''F''<sub>0</sub>) and (''x''<sub>1</sub>,&nbsp;''F''<sub>1</sub>) will have the function:
<math display="block"> F(x) = {F_0}\left(\frac{x}{x_0} \right)^\frac {\log (F_1/F_0)}{\log(x_1/x_0)}, </math>
Of course, the inverse is true too: any function of the form
<math display="block"> F(x) = \mathrm{constant} \cdot x^m</math>
will have a straight line as its log–log graph representation, where the slope of the line is&nbsp;''m''.


=== Finding the area under a straight-line segment of log–log plot ===
: <math>F_1 = \frac{F_0}{x_0^m} \,\, x^m, \, </math>
To calculate the area under a continuous, straight-line segment of a log–log plot (or estimating an area of an almost-straight line), take the function defined previously
<math display="block"> F(x) = \mathrm{constant}\cdot x^m. </math>
and integrate it. Since it is only operating on a definite integral (two defined endpoints), the area A under the plot takes the form
<math display="block"> A(x) = \int_{x_0}^{x_1} F(x) \, dx = \left.\frac{\mathrm{constant}}{m+1} \cdot x^{m+1}\right|_{x_0}^{x_1} </math>


Rearranging the original equation and plugging in the fixed point values, it is found that
which means that
<math display="block"> \mathrm{constant} = \frac{F_0}{x_0^m} </math>


Substituting back into the integral, you find that for ''A'' over ''x''<sub>0</sub> to ''x''<sub>1</sub>
: <math> F(x) = \mathrm{constant}\cdot x^m. </math>


<math display="block">\begin{align}
In other words, ''F'' is proportional to ''x'' to the power of the slope of the straight line of its log–log graph. Specifically, a straight line on a log–log plot containing points (''F''<sub>0</sub>,&nbsp;''x''<sub>0</sub>) and (''F''<sub>1</sub>,&nbsp;''x''<sub>1</sub>) will have the function:
A &= \frac{F_0/x_0^m}{m+1}\cdot (x_1^{m+1}-x_0^{m+1}) \\[1.2ex]
:<math> F(x) = {F_0}\left(\frac{x}{x_0}\right)^\frac {\log (F_1/F_0)}{\log(x_1/x_0)}, </math>
\log A &= \log \left[\frac{F_0 / x_0^m}{m+1} \cdot (x_1^{m+1}-x_0^{m+1})\right] \\
&= \log \frac{F_0}{m+1} + \log \frac{1}{x_0^m} + \log (x_1^{m+1}-x_0^{m+1}) \\
&= \log \frac{F_0}{m+1} + \log \left(\frac{x_1^{m+1} - x_0^{m+1}}{x_0^m}\right) \\
&= \log \frac{F_0}{m+1} + \log \left(\frac{x_1^m}{x_0^m}\cdot x_1 - \frac{x_0^{m+1}}{x_0^m}\right)
\end{align}</math>


Therefore, <math> A = \frac{F_0}{m+1} \cdot \left[x_1 \cdot \left(\frac {x_1}{x_0}\right)^m - x_0\right] </math>
Of course, the inverse is true too: any function of the form


For ''m''&nbsp;=&nbsp;−1, the integral becomes
:<math> F(x) = \mathrm{constant} \cdot x^m</math>
<math display="block">\begin{align}
A_{(m=-1)} &= \int_{x_0}^{x_1} F(x) \, dx
= \int_{x_0}^{x_1} \frac {\mathrm{constant}}{x} \, dx
= \frac{F_0}{x_0^{-1}} \int_{x_0}^{x_1} \frac {dx}{x}
= F_0 \cdot x_0 \cdot {\ln x }\Big|_{x_0}^{x_1} \\
A_{(m=-1)} &= F_0 \cdot x_0 \cdot \ln \frac{x_1}{x_0}
\end{align}</math>


== Log-log linear regression models ==
will have a straight line as its log–log graph representation, where the slope of the line is&nbsp;''m''.


Log–log plots are often used for visualizing log-log linear regression models with (roughly) [[log-normal]], or [[Log-logistic distribution|log-logistic]], errors. In such models, after log-transforming the dependent and independent variables, a [[simple linear regression]] model can be fitted, with the errors becoming [[Homoscedasticity|homoscedastic]]. This model is useful when dealing with data that exhibits power-law growth or decay, while the errors continue to grow as the independent value grows (i.e., [[heteroscedasticity|heteroscedastic]] error).
=== Finding the area under a straight-line segment of log–log plot ===
To calculate the area under a continuous, straight-line segment of a log–log plot (or estimating an area of an almost-straight line), take the function defined previously


As above, in a log-log linear model the relationship between the variables is expressed as a power law. Every one percent change in the independent variable results in an approximately constant percentage change (about ''b'' percent) in the dependent variable. The model is expressed as:
: <math> F(x) = \mathrm{constant}\cdot x^m. </math>


:<math>y = a \cdot x^b \cdot e^\epsilon</math>
and integrate it. Since it is only operating on a definite integral (two defined endpoints), the area A under the plot takes the form


Taking the logarithm of both sides, we get:
: <math> A(x) = \int_{x_0}^{x_1} F(x) \, dx = \frac{\mathrm{constant}}{m+1}\cdot x^{m+1}: [x_0,x_1] </math>


:<math>\log(y) = \log(a) + b \cdot \log(x) + \epsilon</math>
Rearranging the original equation and plugging in the fixed point values, it is found that


This is a [[linear equation]] in the logarithms of <math>x</math> and <math>y</math>, with <math>\log(a)</math> as the intercept and <math>b</math> as the slope. In which <math>\epsilon \sim \textrm{Normal}(\mu, \sigma^2)</math>, and <math>e^\epsilon \sim \textrm{Log-Normal}(\mu, \sigma^2)</math>.
: <math> \mathrm{constant} = \frac{F_0}{x_0^m} </math>


[[File:Visualizing Loglog Normal Data.png|thumb|Figure 1: Visualizing Loglog Normal Data]]
Substituting back into the integral, you find that for A over x<sub>0</sub> to x<sub>1</sub>


Figure 1 illustrates how this looks. It presents two plots generated using 10,000 simulated points. The left plot, titled 'Concave Line with Log-Normal Noise', displays a [[scatter plot]] of the observed data (y) against the independent variable (x). The red line represents the 'Median line', while the blue line is the 'Mean line'. This plot illustrates a dataset with a power-law relationship between the variables, represented by a concave line.
: <math> A = \frac{F_0/x_0^m}{m+1}\cdot (x_1^{m+1}-x_0^{m+1}) </math>


When both variables are log-transformed, as shown in the right plot of Figure 1, titled 'Log-Log Linear Line with Normal Noise', the relationship becomes linear. This plot also displays a scatter plot of the observed data against the independent variable, but after both axes are on a logarithmic scale. Here, both the mean and median lines are the same (red) line. This transformation allows us to fit a [[Simple linear regression]] model (which can then be transformed back to the original scale - as the median line).
: <math> \log A = \log \left[\frac{F_0 / x_0^m}{m+1}\cdot (x_1^{m+1}-x_0^{m+1})\right] = \log \frac{F_0}{m+1} - \log \frac{1}{x_0^m} + \log (x_1^{m+1}-x_0^{m+1}) </math>


[[File:Sliding Window Error Metrics Loglog Normal Data.png|thumb|Figure 2: Sliding Window Error Metrics Loglog Normal Data]]
: <math> \log A = \log \frac{F_0}{m+1} + \log \left(\frac{x_1^{m+1}-x_0^{m+1}}{x_0^m}\right) = \log \frac{F_0}{m+1} + \log \left(\frac{x_1^m}{x_0^m}\cdot x_1 - \frac{x_0^{m+1}}{x_0^m}\right) </math>


The transformation from the left plot to the right plot in Figure 1 also demonstrates the effect of the log transformation on the distribution of noise in the data. In the left plot, the noise appears to follow a [[log-normal distribution]], which is right-skewed and can be difficult to work with. In the right plot, after the log transformation, the noise appears to follow a [[normal distribution]], which is easier to reason about and model.
Therefore: <math> A = \frac{F_0}{m+1} \cdot \left[x_1 \cdot \left(\frac {x_1}{x_0}\right)^m - x_0\right] </math>


This normalization of noise is further analyzed in Figure 2, which presents a line plot of three error metrics ([[Mean Absolute Error]] - MAE, [[Root Mean Square Error]] - RMSE, and [[Mean Absolute Logarithmic Error]] - MALE) calculated over a sliding window of size 28 on the x-axis. The y-axis gives the error, plotted against the independent variable (x). Each error metric is represented by a different color, with the corresponding smoothed line overlaying the original line (since this is just simulated data, the error estimation is a bit jumpy). These error metrics provide a measure of the noise as it varies across different x values.
For ''m''&nbsp;=&nbsp;−1, the integral becomes <math> A_{(m=-1)} = \int_{x_0}^{x_1} F(x) = \int_{x_0}^{x_1} \frac {\mathrm{constant}}{x} = \frac{F_0}{x_0^{-1}} \int_{x_0}^{x_1} \frac {1}{x} = F_0 \cdot x_0 \cdot \ln x: [x_0,x_1] </math>


Log-log linear models are widely used in various fields, including economics, biology, and physics, where many phenomena exhibit power-law behavior. They are also useful in [[regression analysis]] when dealing with heteroscedastic data, as the log transformation can help to stabilize the variance.
: <math> A_{(m=-1)} = F_0 \cdot x_0 \cdot \ln \frac{x_1}{x_0}</math>


== Applications ==
[[File:2010- Decreasing renewable energy costs versus deployment.svg|thumb|upright=1.3|A log-log plot condensing information that spans more than one order of magnitude along both axes]]
These graphs are useful when the parameters ''a'' and ''b'' need to be estimated from numerical data. Specifications such as this are used frequently in [[economics]].


One example is the estimation of [[money demand]] functions based on [[Money demand#Inventory theory|inventory theory]], in which it can be assumed that money demand at time ''t'' is given by
<math display="block">M_t = AR_t^bY_t^cU_t,</math>

:<math>M_t = AR_t^bY_t^cU_t,</math>

where ''M'' is the real quantity of [[money]] held by the public, ''R'' is the [[rate of return]] on an alternative, higher yielding asset in excess of that on money, ''Y'' is the public's [[real income]], ''U'' is an error term assumed to be [[log-normal distribution|lognormally distributed]], ''A'' is a scale parameter to be estimated, and ''b'' and ''c'' are [[Elasticity (economics)|elasticity]] parameters to be estimated. Taking logs yields
<math display="block">m_t = a + br_t + cy_t + u_t,</math>

:<math>m_t = a + br_t + cy_t + u_t,</math>

where ''m'' = log ''M'', ''a'' = log ''A'', ''r'' = log ''R'', ''y'' = log ''Y'', and ''u'' = log ''U'' with ''u'' being [[normal distribution|normally distributed]]. This equation can be estimated using [[ordinary least squares]].


Another economic example is the estimation of a firm's [[Cobb–Douglas production function]], which is the right side of the equation
<math display="block">Q_t=AN_t^{\alpha}K_t^{\beta}U_t,</math>

:<math>Q_t=AN_t^{\alpha}K_t^{\beta}U_t,</math>

in which ''Q'' is the quantity of output that can be produced per month, ''N'' is the number of hours of labor employed in production per month, ''K'' is the number of hours of physical capital utilized per month, ''U'' is an error term assumed to be lognormally distributed, and ''A'', <math>\alpha</math>, and <math>\beta</math> are parameters to be estimated. Taking logs gives the linear regression equation
<math display="block">q_t = a + \alpha n_t + \beta k_t + u_t</math>

:<math>q_t = a + \alpha n_t + \beta k_t + u_t</math>

where ''q'' = log ''Q'', ''a'' = log ''A'', ''n'' = log ''N'', ''k'' = log ''K'', and ''u'' = log ''U''.


Log–log regression can also be used to estimate the [[fractal dimension]] of a naturally occurring [[fractal]].


However, going in the other direction – observing that data appears as an approximate line on a log–log scale and concluding that the data follows a power law – is invalid.<ref name=clauset>{{cite journal|author1=Clauset, A. |author2=Shalizi, C. R. |author3=Newman, M. E. J. |year=2009| title=Power-Law Distributions in Empirical Data| journal=SIAM Review |volume=51 |issue=4 |pages=661–703 |arxiv=0706.1062 |bibcode=2009SIAMR..51..661C| doi=10.1137/070710111}}</ref>
However, going in the other direction – observing that data appears as an approximate line on a log–log scale and concluding that the data follows a power law – is not always valid.<ref name=clauset>{{cite journal|author1=Clauset, A. |author2=Shalizi, C. R. |author3=Newman, M. E. J. |year=2009| title=Power-Law Distributions in Empirical Data| journal=SIAM Review |volume=51 |issue=4 |pages=661–703 |arxiv=0706.1062 |bibcode=2009SIAMR..51..661C| doi=10.1137/070710111|s2cid=9155618 }}</ref>


In fact, many other functional forms appear approximately linear on the log–log scale, and simply evaluating the [[goodness of fit]] of a [[linear regression]] on logged data using the [[coefficient of determination]] (''R''<sup>2</sup>) may be invalid, as the assumptions of the linear regression model, such as Gaussian error, may not be satisfied; in addition, tests of fit of the log–log form may exhibit low [[statistical power]], as these tests may have low likelihood of rejecting power laws in the presence of other true functional forms. While simple log–log plots may be instructive in detecting possible power laws, and have been used dating back to [[Vilfredo Pareto|Pareto]] in the 1890s, validation as a power law requires more sophisticated statistics.<ref name=clauset/>
Line 125: Line 133:


A [[Bode plot]] (a [[plot (graphics)|graph]] of the [[frequency response]] of a system) is also a log–log plot.

In [[chemical kinetics]], the general form of the dependence of the [[reaction rate]] on concentration takes the form of a power law ([[law of mass action]]), so a log-log plot is useful for estimating the reaction parameters from experiment.


== See also ==
* [[Semi-log plot]] (lin–log or log–lin)
* [[Power law]]

* [[Zipf law]]
== External links ==
* [[Log-linear model]]
* [https://sites.google.com/site/nonnewtoniancalculus/ Non-Newtonian calculus website]
* [[Log-normal distribution]]
* [[Log-logistic distribution]]
* [[Data transformation (statistics)]]
* [[Variance-stabilizing transformation]]


== References ==
{{reflist}}

== External links ==
* [https://sites.google.com/site/nonnewtoniancalculus/ Non-Newtonian calculus website]


{{DEFAULTSORT:Log-Log Graph}}

Latest revision as of 22:36, 25 November 2024

A log–log plot of y = x (blue), y = x² (green), and y = x³ (red).
Note the logarithmic scale markings on each of the axes, and that the log x and log y axes (where the logarithms are 0) are where x and y themselves are 1.
Comparison of linear, concave, and convex functions when plotted using a linear scale (left) or a log scale (right).

In science and engineering, a log–log graph or log–log plot is a two-dimensional graph of numerical data that uses logarithmic scales on both the horizontal and vertical axes. Power functions – relationships of the form <math>y=ax^k</math> – appear as straight lines in a log–log graph, with the exponent corresponding to the slope, and the logarithm of the coefficient corresponding to the intercept. Thus these graphs are very useful for recognizing these relationships and estimating parameters. Any base can be used for the logarithm, though base 10 (common logarithms) is most commonly used.

Relation with monomials


Given a monomial equation <math>y=ax^k,</math> taking the logarithm of the equation (with any base) yields:
<math display="block">\log y = k \log x + \log a.</math>

Setting <math>X = \log x</math> and <math>Y = \log y,</math> which corresponds to using a log–log graph, yields the equation
<math display="block">Y = mX + b</math>

where m = k is the slope of the line (gradient) and b = log a is the intercept on the (log y)-axis, meaning where log x = 0, so, reversing the logs, a is the y value corresponding to x = 1.[1]
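
As a minimal sketch of the relation above, the following Python snippet (standard library only) maps points of a hypothetical power function y = 3x² to log–log coordinates and checks that each transformed point lies on the line Y = kX + log a. The function name `to_log_log` and the values of `a` and `k` are illustrative assumptions, not part of the source.

```python
import math

def to_log_log(xs, ys, base=10.0):
    """Map positive (x, y) pairs to (log x, log y) in the given base."""
    return [(math.log(x, base), math.log(y, base)) for x, y in zip(xs, ys)]

# Hypothetical power function y = a * x**k (values chosen for illustration).
a, k = 3.0, 2.0
xs = [1.0, 10.0, 100.0, 1000.0]
ys = [a * x**k for x in xs]

points = to_log_log(xs, ys)

# On the log-log plot each point satisfies Y = k*X + log10(a),
# so Y - k*X should equal the intercept log10(a) for every point.
intercepts = [Y - k * X for X, Y in points]
```

The first point has x = 1, so its X coordinate is 0 and its Y coordinate is the intercept itself, matching the statement that a is the y value at x = 1.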

Equations


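The equation for a line on a log–log scale would be:
<math display="block"> \log_{10}F(x) = m \log_{10}x + b, </math>
<math display="block"> F(x) = x^m\cdot10^b, </math>
where m is the slope and b is the intercept point on the log plot.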

Slope of a log–log plot

Finding the slope of a log–log plot using ratios

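To find the slope of the plot, two points are selected on the x-axis, say x1 and x2. Applying the line equation to each point gives
<math display="block"> \log[F (x_1)] = m \log (x_1) + b, </math>
and
<math display="block"> \log[F (x_2)] = m \log(x_2) + b. </math>
The slope m is found by taking the difference:
<math display="block"> m = \frac { \log (F_2) - \log (F_1)} { \log(x_2) - \log(x_1) } = \frac {\log (F_2/F_1)}{\log(x_2/x_1)}, </math>
where F1 is shorthand for F(x1) and F2 is shorthand for F(x2). The figure at right illustrates the formula. Notice that the slope in the example of the figure is negative. The formula also provides a negative slope, as can be seen from the following property of the logarithm:
<math display="block"> \log(x_1/x_2) = -\log(x_2/x_1). </math>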
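
The ratio form of the slope can be sketched in a few lines of Python. The function name `log_log_slope` and the sample readings are hypothetical. Any logarithm base gives the same result because the base cancels in the ratio.

```python
import math

def log_log_slope(x1, f1, x2, f2):
    """Slope m = log(F2/F1) / log(x2/x1); any log base gives the same ratio."""
    if min(x1, f1, x2, f2) <= 0:
        raise ValueError("log-log coordinates require positive values")
    if x1 == x2:
        raise ValueError("the two x values must differ")
    return math.log(f2 / f1) / math.log(x2 / x1)

# Hypothetical readings from a falling straight line on a log-log plot.
m = log_log_slope(1.0, 100.0, 10.0, 1.0)

# Swapping the two points leaves the slope unchanged, because
# log(x1/x2) = -log(x2/x1) flips the sign of numerator and denominator alike.
m_swapped = log_log_slope(10.0, 1.0, 1.0, 100.0)
```

Here F falls by two decades while x rises by one, so the slope comes out close to −2 regardless of the order in which the points are passed.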

Finding the function from the log–log plot


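The above procedure now is reversed to find the form of the function F(x) using its (assumed) known log–log plot. To find the function F, pick some fixed point (x0, F0), where F0 is shorthand for F(x0), somewhere on the straight line in the above graph, and further some other arbitrary point (x1, F1) on the same graph. Then from the slope formula above:
<math display="block"> m = \frac {\log (F_1 / F_0)}{\log(x_1 / x_0)} </math>
which leads to
<math display="block"> \log(F_1 / F_0) = m \log(x_1 / x_0) = \log[(x_1 / x_0)^m ]. </math>
Notice that <math>10^{\log_{10}(F_1)} = F_1</math>. Therefore, the logs can be inverted to find:
<math display="block"> \frac{F_1}{F_0} = \left(\frac{x_1}{x_0}\right)^m </math>
or
<math display="block">F_1 = \frac{F_0}{x_0^m} \, x_1^m, </math>
which, because (x1, F1) is an arbitrary point on the line, means that
<math display="block"> F(x) = \mathrm{constant}\cdot x^m. </math>
In other words, F is proportional to x to the power of the slope of the straight line of its log–log graph. Specifically, a straight line on a log–log plot containing points (x0, F0) and (x1, F1) will have the function:
<math display="block"> F(x) = {F_0}\left(\frac{x}{x_0} \right)^\frac {\log (F_1/F_0)}{\log(x_1/x_0)}, </math>
Of course, the inverse is true too: any function of the form
<math display="block"> F(x) = \mathrm{constant} \cdot x^m</math>
will have a straight line as its log–log graph representation, where the slope of the line is m.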
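
The reconstruction of F from two points on the line can be sketched as follows. The helper `power_law_through` and the sample points are assumptions made for illustration; the body applies F(x) = F0 · (x/x0)^m with m taken from the slope formula.

```python
import math

def power_law_through(x0, f0, x1, f1):
    """Return (m, F) where F(x) = f0 * (x / x0)**m passes through both points."""
    m = math.log(f1 / f0) / math.log(x1 / x0)

    def F(x):
        return f0 * (x / x0) ** m

    return m, F

# Hypothetical points read off a straight line on a log-log plot.
m, F = power_law_through(1.0, 2.0, 2.0, 8.0)
```

With these two points the slope is 2 and the recovered function is F(x) = 2x², so F reproduces both input points and can be evaluated anywhere else on the line.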

Finding the area under a straight-line segment of log–log plot


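To calculate the area under a continuous, straight-line segment of a log–log plot (or to estimate the area under an almost-straight line), take the function defined previously
<math display="block"> F(x) = \mathrm{constant}\cdot x^m </math>
and integrate it. Since this is a definite integral (two defined endpoints), the area A under the plot takes the form, for m ≠ −1,
<math display="block"> A = \int_{x_0}^{x_1} F(x) \, dx = \left.\frac{\mathrm{constant}}{m+1} \cdot x^{m+1}\right|_{x_0}^{x_1} </math>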

Rearranging the original equation and plugging in the fixed point values, it is found that
<math display="block"> \mathrm{constant} = \frac{F_0}{x_0^m} </math>

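Substituting back into the integral, you find that for A over x0 to x1
<math display="block">\begin{align}
A &= \frac{F_0/x_0^m}{m+1}\cdot (x_1^{m+1}-x_0^{m+1}) \\[1.2ex]
\log A &= \log \left[\frac{F_0 / x_0^m}{m+1} \cdot (x_1^{m+1}-x_0^{m+1})\right] \\
&= \log \frac{F_0}{m+1} + \log \frac{1}{x_0^m} + \log (x_1^{m+1}-x_0^{m+1}) \\
&= \log \frac{F_0}{m+1} + \log \left(\frac{x_1^{m+1} - x_0^{m+1}}{x_0^m}\right) \\
&= \log \frac{F_0}{m+1} + \log \left(\frac{x_1^m}{x_0^m}\cdot x_1 - \frac{x_0^{m+1}}{x_0^m}\right)
\end{align}</math>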

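Therefore, <math> A = \frac{F_0}{m+1} \cdot \left[x_1 \cdot \left(\frac {x_1}{x_0}\right)^m - x_0\right] </math>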

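For m = −1, the integral becomes
<math display="block">\begin{align}
A_{(m=-1)} &= \int_{x_0}^{x_1} F(x) \, dx
= \int_{x_0}^{x_1} \frac {\mathrm{constant}}{x} \, dx
= \frac{F_0}{x_0^{-1}} \int_{x_0}^{x_1} \frac {dx}{x}
= F_0 \cdot x_0 \cdot {\ln x }\Big|_{x_0}^{x_1} \\
A_{(m=-1)} &= F_0 \cdot x_0 \cdot \ln \frac{x_1}{x_0}
\end{align}</math>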
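
Both area formulas, the general one and the m = −1 special case, can be combined in one small function. This is a sketch under stated assumptions: the name `area_under_segment`, the tolerance used to detect m = −1, and the sample segments are all illustrative choices rather than anything given in the source.

```python
import math

def area_under_segment(x0, f0, x1, f1):
    """Area under F(x) = f0 * (x / x0)**m between x0 and x1."""
    m = math.log(f1 / f0) / math.log(x1 / x0)
    if math.isclose(m, -1.0, abs_tol=1e-12):
        # Special case m = -1: A = F0 * x0 * ln(x1 / x0)
        return f0 * x0 * math.log(x1 / x0)
    # General case: A = F0 / (m + 1) * [x1 * (x1 / x0)**m - x0]
    return f0 / (m + 1.0) * (x1 * (x1 / x0) ** m - x0)

# Segment of F(x) = 2 x**2 between x = 1 and x = 2.
area_quadratic = area_under_segment(1.0, 2.0, 2.0, 8.0)

# Segment of F(x) = 1/x between x = 1 and x = e (slope -1).
area_reciprocal = area_under_segment(1.0, 1.0, math.e, 1.0 / math.e)
```

The first call should agree with the exact integral of 2x² from 1 to 2, which is 14/3, and the second with the integral of 1/x from 1 to e, which is 1.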

Log-log linear regression models


Log–log plots are often used for visualizing log-log linear regression models with (roughly) log-normal, or log-logistic, errors. In such models, after log-transforming the dependent and independent variables, a simple linear regression model can be fitted, with the errors becoming homoscedastic. This model is useful when dealing with data that exhibits power-law growth or decay, while the errors continue to grow as the independent value grows (i.e., heteroscedastic error).

As above, in a log-log linear model the relationship between the variables is expressed as a power law. Every one percent change in the independent variable results in an approximately constant percentage change (about b percent) in the dependent variable. The model is expressed as:
<math display="block">y = a \cdot x^b \cdot e^\epsilon</math>

Taking the natural logarithm of both sides, we get:
<math display="block">\log(y) = \log(a) + b \cdot \log(x) + \epsilon</math>

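This is a linear equation in the logarithms of <math>x</math> and <math>y</math>, with <math>\log(a)</math> as the intercept and <math>b</math> as the slope. Here <math>\epsilon \sim \textrm{Normal}(\mu, \sigma^2)</math>, and <math>e^\epsilon \sim \textrm{Log-Normal}(\mu, \sigma^2)</math>.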
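
A minimal sketch of fitting this model, using only the Python standard library: simulate data from y = a · x^b · e^ε and apply ordinary least squares to the natural logs. The parameter values, seed, sample size and the function name `fit_log_log` are assumptions for illustration. The noise mean μ is taken to be 0 here; with a nonzero μ the fitted intercept would estimate log(a) + μ instead of log(a).

```python
import math
import random

def fit_log_log(xs, ys):
    """Ordinary least squares on (ln x, ln y); returns (a_hat, b_hat)."""
    X = [math.log(x) for x in xs]
    Y = [math.log(y) for y in ys]
    n = len(X)
    mean_x = sum(X) / n
    mean_y = sum(Y) / n
    sxx = sum((x - mean_x) ** 2 for x in X)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
    b_hat = sxy / sxx                      # slope estimates the exponent b
    log_a_hat = mean_y - b_hat * mean_x    # intercept estimates ln(a)
    return math.exp(log_a_hat), b_hat

# Simulated data; a_true, b_true, sigma and the seed are illustrative assumptions.
rng = random.Random(0)
a_true, b_true, sigma = 2.0, 0.5, 0.1
n = 2000
xs = [10 ** (2 * i / (n - 1)) for i in range(n)]   # log-spaced values in [1, 100]
ys = [a_true * x**b_true * math.exp(rng.gauss(0.0, sigma)) for x in xs]

a_hat, b_hat = fit_log_log(xs, ys)
```

Transforming the fitted line back with `math.exp` gives a curve on the original scale that tracks the median of y rather than its mean, since the mean of a log-normal factor exceeds its median.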

Figure 1: Visualizing Loglog Normal Data

Figure 1 illustrates how this looks. It presents two plots generated using 10,000 simulated points. The left plot, titled 'Concave Line with Log-Normal Noise', displays a scatter plot of the observed data (y) against the independent variable (x). The red line represents the 'Median line', while the blue line is the 'Mean line'. This plot illustrates a dataset with a power-law relationship between the variables, represented by a concave line.

When both variables are log-transformed, as shown in the right plot of Figure 1, titled 'Log-Log Linear Line with Normal Noise', the relationship becomes linear. This plot also displays a scatter plot of the observed data against the independent variable, but after both axes are on a logarithmic scale. Here, both the mean and median lines are the same (red) line. This transformation allows us to fit a Simple linear regression model (which can then be transformed back to the original scale - as the median line).

Figure 2: Sliding Window Error Metrics Loglog Normal Data

The transformation from the left plot to the right plot in Figure 1 also demonstrates the effect of the log transformation on the distribution of noise in the data. In the left plot, the noise appears to follow a log-normal distribution, which is right-skewed and can be difficult to work with. In the right plot, after the log transformation, the noise appears to follow a normal distribution, which is easier to reason about and model.

This normalization of noise is further analyzed in Figure 2, which presents a line plot of three error metrics (Mean Absolute Error - MAE, Root Mean Square Error - RMSE, and Mean Absolute Logarithmic Error - MALE) calculated over a sliding window of size 28 on the x-axis. The y-axis gives the error, plotted against the independent variable (x). Each error metric is represented by a different color, with the corresponding smoothed line overlaying the original line (since this is just simulated data, the error estimation is a bit jumpy). These error metrics provide a measure of the noise as it varies across different x values.

Log-log linear models are widely used in various fields, including economics, biology, and physics, where many phenomena exhibit power-law behavior. They are also useful in regression analysis when dealing with heteroscedastic data, as the log transformation can help to stabilize the variance.

Applications

A log-log plot condensing information that spans more than one order of magnitude along both axes

These graphs are useful when the parameters a and b need to be estimated from numerical data. Specifications such as this are used frequently in economics.

One example is the estimation of money demand functions based on inventory theory, in which it can be assumed that money demand at time t is given by
<math display="block">M_t = AR_t^bY_t^cU_t,</math>
where M is the real quantity of money held by the public, R is the rate of return on an alternative, higher yielding asset in excess of that on money, Y is the public's real income, U is an error term assumed to be lognormally distributed, A is a scale parameter to be estimated, and b and c are elasticity parameters to be estimated. Taking logs yields
<math display="block">m_t = a + br_t + cy_t + u_t,</math>
where m = log M, a = log A, r = log R, y = log Y, and u = log U with u being normally distributed. This equation can be estimated using ordinary least squares.

Another economic example is the estimation of a firm's Cobb–Douglas production function, which is the right side of the equation
<math display="block">Q_t=AN_t^{\alpha}K_t^{\beta}U_t,</math>
in which Q is the quantity of output that can be produced per month, N is the number of hours of labor employed in production per month, K is the number of hours of physical capital utilized per month, U is an error term assumed to be lognormally distributed, and A, <math>\alpha</math>, and <math>\beta</math> are parameters to be estimated. Taking logs gives the linear regression equation
<math display="block">q_t = a + \alpha n_t + \beta k_t + u_t</math>
where q = log Q, a = log A, n = log N, k = log K, and u = log U.
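
The log-linearized Cobb–Douglas equation can be estimated by solving the least-squares normal equations. The sketch below uses only the Python standard library with a small hand-written Gaussian elimination. The data are hypothetical and noise-free (U = 1) so that the parameters are recovered almost exactly, and the names `solve_linear` and `fit_cobb_douglas` are illustrative assumptions.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit_cobb_douglas(N, K, Q):
    """Least squares for q = a + alpha*n + beta*k in natural logs."""
    rows = [[1.0, math.log(n), math.log(k)] for n, k in zip(N, K)]
    q = [math.log(v) for v in Q]
    # Normal equations: (X^T X) beta = X^T q
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Xtq = [sum(r[i] * qi for r, qi in zip(rows, q)) for i in range(3)]
    a, alpha, beta = solve_linear(XtX, Xtq)
    return math.exp(a), alpha, beta

# Hypothetical noise-free observations (U = 1), for illustration only.
A_true, alpha_true, beta_true = 2.0, 0.7, 0.3
N = [10.0, 20.0, 40.0, 80.0, 15.0, 60.0]
K = [5.0, 50.0, 8.0, 30.0, 100.0, 12.0]
Q = [A_true * n**alpha_true * k**beta_true for n, k in zip(N, K)]

A_hat, alpha_hat, beta_hat = fit_cobb_douglas(N, K, Q)
```

With real, noisy data the same normal equations apply, but the estimates would only approximate the true parameters, and a dedicated regression library would normally be preferred over a hand-written solver.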

Log–log regression can also be used to estimate the fractal dimension of a naturally occurring fractal.

However, going in the other direction – observing that data appears as an approximate line on a log–log scale and concluding that the data follows a power law – is not always valid.[2]

In fact, many other functional forms appear approximately linear on the log–log scale, and simply evaluating the goodness of fit of a linear regression on logged data using the coefficient of determination (R²) may be invalid, as the assumptions of the linear regression model, such as Gaussian error, may not be satisfied; in addition, tests of fit of the log–log form may exhibit low statistical power, as these tests may have low likelihood of rejecting power laws in the presence of other true functional forms. While simple log–log plots may be instructive in detecting possible power laws, and have been used dating back to Pareto in the 1890s, validation as a power law requires more sophisticated statistics.[2]

These graphs are also extremely useful when data are gathered by varying the control variable along an exponential function, in which case the control variable x is more naturally represented on a log scale, so that the data points are evenly spaced, rather than compressed at the low end. The output variable y can either be represented linearly, yielding a lin–log graph (log x, y), or its logarithm can also be taken, yielding the log–log graph (log x, log y).

A Bode plot (a graph of the frequency response of a system) is also a log–log plot.

In chemical kinetics, the general form of the dependence of the reaction rate on concentration takes the form of a power law (law of mass action), so a log-log plot is useful for estimating the reaction parameters from experiment.

See also

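* Semi-log plot (lin–log or log–lin)
* Power law
* Zipf law
* Log-linear model
* Log-normal distribution
* Log-logistic distribution
* Data transformation (statistics)
* Variance-stabilizing transformation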

References

  1. Bourne, Murray. "7. Log-Log and Semi-log Graphs". www.intmath.com. Retrieved 2024-10-15.
  2. Clauset, A.; Shalizi, C. R.; Newman, M. E. J. (2009). "Power-Law Distributions in Empirical Data". SIAM Review. 51 (4): 661–703. arXiv:0706.1062. Bibcode:2009SIAMR..51..661C. doi:10.1137/070710111. S2CID 9155618.