Moving average: Difference between revisions

From Wikipedia, the free encyclopedia
(688 intermediate revisions by more than 100 users not shown)
{{Short description|Type of statistical measure over subsets of a dataset}}
{{Otheruses}}
{{Other uses|Moving-average model|Moving average (disambiguation)}}
[[File:Lissage sinus bruite moyenne glissante.svg|thumb|Smoothing of a noisy sine (blue curve) with a moving average (red curve).]]


In [[statistics]], a '''moving average''' ('''rolling average''' or '''running average''' or '''moving mean'''<ref>[http://www.waterboards.ca.gov/waterrights/water_issues/programs/bay_delta/docs/cmnt091412/sldmwa/booth_et_al_2006.pdf Hydrologic Variability of the Cosumnes River Floodplain] (Booth et al., San Francisco Estuary and Watershed Science, Volume 4, Issue 2, 2006)</ref> or '''rolling mean''') is a calculation to analyze data points by creating a series of [[average]]s of different selections of the full data set. Variations include: [[#Simple moving average|simple]], [[#Cumulative moving average|cumulative]], or [[#Weighted moving average|weighted]] forms.
<!-- Deleted image removed: [[Image:MA Example.jpg|thumb|right|13-point moving average. {{deletable image-caption}}]] -->
In [[statistics]], a '''moving average''', also called '''rolling average''', '''rolling mean''' or '''running average''', is a type of [[finite impulse response filter]] used to analyze a set of data points by creating a series of [[average]]s of different subsets of the full data set.


Mathematically, a moving average is a type of [[convolution]]. Thus in [[signal processing]] it is viewed as a [[low-pass filter|low-pass]] [[finite impulse response]] filter. Because the [[boxcar function]] outlines its filter coefficients, it is called a '''boxcar filter'''. It is sometimes followed by [[Downsampling (signal processing)|downsampling]].
Given a series of numbers and a fixed subset size, the moving average can be obtained by first taking the average of the first subset. The fixed subset size is then shifted forward, creating a new subset of numbers, which is averaged. This process is repeated over the entire data series. The plot line connecting all the (fixed) averages is the moving average. Thus, a moving average is not a single number, but it is a set of numbers, each of which is the [[average]] of the corresponding subset of a larger set of data points. A moving average may also use unequal weights for each data value in the subset to emphasize particular values in the subset.


Given a series of numbers and a fixed subset size, the first element of the moving average is obtained by taking the average of the initial fixed subset of the number series. Then the subset is modified by "shifting forward"; that is, excluding the first number of the series and including the next value in the subset.
A moving average is commonly used with [[time series]] data to smooth out short-term fluctuations and highlight longer-term trends or cycles. The threshold between short-term and long-term depends on the application, and the parameters of the moving average will be set accordingly. For example, it is often used in [[technical analysis]] of financial data, like stock [[price]]s, [[return (finance)|returns]] or trading volumes. It is also used in [[economics]] to examine gross domestic product, employment or other macroeconomic time series. Mathematically, a moving average is a type of [[convolution]] and so it is also similar to the [[low-pass filter]] used in [[signal processing]]. When used with non-time series data, a moving average simply acts as a generic smoothing operation without any specific connection to time, although typically some kind of ordering is implied.

A moving average is commonly used with [[time series]] data to smooth out short-term fluctuations and highlight longer-term trends or cycles. The threshold between short-term and long-term depends on the application, and the parameters of the moving average will be set accordingly. It is also used in [[economics]] to examine gross domestic product, employment or other macroeconomic time series. When used with non-time series data, a moving average filters higher frequency components without any specific connection to time, although typically some kind of ordering is implied. Viewed simplistically it can be regarded as smoothing the data.


==Simple moving average==
[[File: Moving Average Types comparison - Simple and Exponential.png|thumb]]
A '''simple moving average''' (SMA) is the unweighted [[arithmetic mean|mean]] of the previous ''n'' data points.{{Citation needed|date=February 2010}} For example, a 10-day simple moving average of closing price is the mean of the previous 10 days' closing prices. If those prices are <math>p_M, p_{M-1},\dots,p_{M-9}</math> then the formula is
<!-- this used to be a hairy sigma style with i's and j's, but prefer dots since it's a lot easier for financial people with limited math -->
:<math>\textit{SMA} = { p_M + p_{M-1} + \cdots + p_{M-9} \over 10 }</math>

In financial applications a '''simple moving average''' ('''SMA''') is the unweighted [[arithmetic mean|mean]] of the previous <math>k</math> data-points. However, in science and engineering, the mean is normally taken from an equal number of data on either side of a central value. This ensures that variations in the mean are aligned with the variations in the data rather than being shifted in time. {{Anchor|algorithm}}An example of a simple equally weighted running mean is the mean over the last <math>k</math> entries of a data-set containing <math>n</math> entries. Let those data-points be <math>p_1, p_2, \dots, p_n</math>. This could be closing prices of a stock. The mean over the last <math>k</math> data-points (days in this example) is denoted as <math>\textit{SMA}_{k}</math> and calculated as:
<math display="block">\begin{align}
\textit{SMA}_{k} &= \frac{p_{n-k+1} + p_{n-k+2} + \cdots + p_{n}}{k} \\
&= \frac{1}{k} \sum_{i=n-k+1}^{n} p_{i}
\end{align}</math>

When calculating the next mean <math>\textit{SMA}_{k,\text{next}}</math> with the same sampling width <math>k</math> the range from <math> n - k + 2 </math> to <math> n+1 </math> is considered. A new value <math>p_{n+1}</math> comes into the sum and the oldest value <math>p_{n-k+1}</math> drops out. This simplifies the calculations by reusing the previous mean <math>\textit{SMA}_{k,\text{prev}}</math>.
<math display="block">
\begin{align}
\textit{SMA}_{k, \text{next}} &= \frac{1}{k} \sum_{i=n-k+2}^{n+1} p_{i} \\
&= \frac{1}{k} \Big( \underbrace{ p_{n-k+2} + p_{n-k+3} + \dots + p_{n} + p_{n+1} }_{ \sum_{i=n-k+2}^{n+1} p_{i} } + \underbrace{ p_{n-k+1} - p_{n-k+1} }_{= 0} \Big) \\
&= \underbrace{ \frac{1}{k} \Big( p_{n-k+1} + p_{n-k+2} + \dots + p_{n} \Big) }_{= \textit{SMA}_{k, \text{prev}}} - \frac{p_{n-k+1}}{k} + \frac{p_{n+1}}{k} \\
&= \textit{SMA}_{k, \text{prev}} + \frac{1}{k} \Big( p_{n+1} - p_{n-k+1} \Big)
\end{align}
</math>
This means that the moving average filter can be computed quite cheaply on real time data with a FIFO / [[circular buffer]] and only 3 arithmetic steps.
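
The update rule above can be sketched in Python as follows. This is only an illustrative sketch: the function name <code>streaming_sma</code> is hypothetical, and treating the filling phase as a cumulative average is an assumption made here for the first <math>k-1</math> outputs.

```python
from collections import deque


def streaming_sma(values, k):
    """Yield the simple moving average after each incoming value.

    While fewer than k values have arrived, the window is still filling
    and the result is the mean of everything seen so far.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    window = deque()  # FIFO holding at most the k most recent values
    mean = 0.0
    for x in values:
        if len(window) < k:
            # Filling phase: cumulative-average update.
            window.append(x)
            mean += (x - mean) / len(window)
        else:
            # Steady state: SMA_next = SMA_prev + (p_new - p_oldest) / k
            oldest = window.popleft()
            window.append(x)
            mean += (x - oldest) / k
        yield mean


sma = list(streaming_sma([1, 2, 3, 4, 5, 6], k=3))
```

In floating-point arithmetic the running mean can accumulate rounding error over very long streams, so an occasional full recomputation from the buffer may be advisable.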


During the initial filling of the FIFO / circular buffer the sampling window is equal to the data-set size thus <math> k = n </math> and the average calculation is performed as a [[#Cumulative average|cumulative moving average]].

The period selected (<math>k</math>) depends on the type of movement of interest, such as short, intermediate, or long-term.

When calculating successive values, a new value comes into the sum and an old value drops out, meaning a full summation each time is unnecessary:
:<math>\textit{SMA}_\mathrm{today} = \textit{SMA}_\mathrm{yesterday} - {p_{M-n} \over n} + {p_{M} \over n}</math>


If the data used are not centered around the mean, a simple moving average lags behind the latest datum by half the sample width. An SMA can also be disproportionately influenced by old data dropping out or new data coming in. One characteristic of the SMA is that if the data has a periodic fluctuation, then applying an SMA of that period will eliminate that variation (the average always containing one complete cycle). But a perfectly regular cycle is rarely encountered.<ref>''Statistical Analysis'', Ya-lun Chou, Holt International, 1975, {{ISBN|0-03-089422-0}}, section 17.9.</ref>
In technical analysis there are various popular values for ''n'', like 10 days, 40 days, or 200 days. The period selected depends on the kind of movement one is concentrating on, such as short, intermediate, or long term. In any case moving average levels are interpreted as [[support (technical analysis)|support]] in a rising market, or [[resistance (technical analysis)|resistance]] in a falling market.


For a number of applications, it is advantageous to avoid the shifting induced by using only "past" data. Hence a '''central moving average''' can be computed, using data equally spaced on either side of the point in the series where the mean is calculated.<ref>The derivation and properties of the simple central moving average are given in full at [[Savitzky–Golay filter]].</ref> This requires using an odd number of points in the sample window.
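
A minimal Python sketch of a central moving average is given here for illustration. The function name <code>central_moving_average</code> is hypothetical, and simply omitting the end positions where the window does not fit is one of several possible edge treatments, assumed here for brevity.

```python
def central_moving_average(values, k):
    """Return centred means over windows of odd width k.

    Output element j is the mean of the window centred on
    values[j + k // 2]; positions too close to either end are omitted.
    """
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    half = k // 2
    result = []
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]
        result.append(sum(window) / k)
    return result


# On a linear ramp the centred mean coincides with the datum at the
# centre of each window, so no lag is introduced.
centred = central_moving_average([1, 2, 3, 4, 5], k=3)
```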
In all cases a moving average lags behind the latest data point, simply from the nature of its smoothing. An SMA can lag to an undesirable extent, and can be disproportionately influenced by old data points dropping out of the average. This is addressed by giving extra weight to more recent data points, as in the [[#Weighted moving average|weighted]] and [[#Exponential moving average|exponential]] moving averages.


A major drawback of the SMA is that it lets through a significant amount of the signal shorter than the window length. Worse, it ''actually inverts it.''{{Citation needed|date=October 2023}} This can lead to unexpected artifacts, such as peaks in the smoothed result appearing where there were troughs in the data. It also leads to the result being less smooth than expected since some of the higher frequencies are not properly removed.
One characteristic of the SMA is that if the data have a periodic fluctuation, then applying an SMA of that period will eliminate that variation (the average always containing one complete cycle). But a perfectly regular cycle is rarely encountered in economics or finance.<ref>''Statistical Analysis'', Ya-lun Chou, Holt International, 1975, ISBN 0030894220, section 17.9.</ref>


Its frequency response is a type of low-pass filter called [[Sinc filter#Frequency-domain sinc|sinc-in-frequency]].
For a number of applications it is advantageous to avoid the shifting induced by using only 'past' data. Hence a '''central moving average''' can be computed, using both 'past' and 'future' data. The 'future' data in this case are ''not'' predictions, but merely data obtained after the time at which the average is to be computed.

==Cumulative moving average==

The '''cumulative moving average'''{{Citation needed|date=February 2010}} is also frequently called a ''running average'' or a ''long running average''{{Citation needed|date=February 2010}} although the term ''running average'' is also used as synonym for a ''moving average''.{{Citation needed|date=February 2010}} This article uses the term '''cumulative moving average''' or simply '''cumulative average''' since this term is more descriptive and unambiguous.

In some data acquisition systems, the data arrives in an ordered data stream and the statistician would like to get the average of all of the data up until the current data point. For example, an investor may want the average price of all of the stock transactions for a particular stock up until the current time. As each new transaction occurs, the average price at the time of the transaction can be calculated for all of the transactions up to that point using the cumulative average. This is the cumulative average, which is typically an unweighted [[average]] of the sequence of ''i'' values ''x''<sub>1</sub>, ..., ''x<sub>i</sub>'' up to the current time:

:<math>CA_i = {{x_1 + \cdots + x_i} \over i}\,.</math>

The brute force method to calculate this would be to store all of the data and calculate the sum and divide by the number of data points every time a new data point arrived. However, it is possible to simply update cumulative average as a new value ''x''<sub>''i''+1</sub> becomes available, using the formula:

:<math>CA_{i+1} = {{x_{i+1} + i CA_i} \over {i+1}}\,,</math>

where ''CA''<sub>0</sub> can be taken to be equal to 0.

Thus the current cumulative average for a new data point is equal to the previous cumulative average plus the difference between the latest data point and the previous average divided by the number of points received so far. When all of the data points arrive ({{nowrap|1=''i'' = ''N''}}), the cumulative average will equal the final average.

The derivation of the cumulative average formula is straightforward. Using

:<math>x_1 + \cdots + x_i = iCA_i\,,</math>

and similarly for {{nowrap|''i'' + 1}}, it is seen that

:<math>x_{i+1} = (x_1 + \cdots + x_{i+1}) - (x_1 + \cdots + x_i) = (i+1)CA_{i+1} - iCA_i\,.</math>

Solving this equation for ''CA''<sub>''i''+1</sub> results in:

:<math>CA_{i+1} = {(x_{i+1} + iCA_i) \over {i+1}} = {CA_i} + {{x_{i+1} - CA_i} \over {i+1}}\,.</math>
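
The incremental form of the last equation can be sketched in Python as follows. The function name <code>cumulative_averages</code> is hypothetical; the sketch keeps only the previous average and the count, not the data.

```python
def cumulative_averages(values):
    """Return the cumulative average after each value in the stream."""
    ca = 0.0  # CA_0 taken as 0
    result = []
    for i, x in enumerate(values, start=1):
        # CA_i = CA_{i-1} + (x_i - CA_{i-1}) / i
        ca += (x - ca) / i
        result.append(ca)
    return result


averages = cumulative_averages([2, 4, 6, 8])
```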

==Weighted moving average==

A weighted average is any average that has multiplying factors to give different weights to different data points. Mathematically, the moving average is the [[convolution]] of the data points with a moving average function;{{Citation needed|date=February 2010}} in technical analysis, a '''weighted moving average''' (WMA) has the specific meaning of weights that decrease arithmetically.{{Citation needed|date=February 2010}} In an ''n''-day WMA the latest day has weight ''n'', the second latest ''n''&nbsp;&minus;&nbsp;1, etc., down to one.

:<math>\text{WMA}_{M} = { n p_{M} + (n-1) p_{M-1} + \cdots + 2 p_{(M-n+2)} + p_{(M-n+1)} \over n + (n-1) + \cdots + 2 + 1}</math>

[[Image:Weighted moving average weights N=15.png|thumb|right|WMA weights ''n''&nbsp;=&nbsp;15]]

The denominator is a [[triangle number]], and can be easily computed as <math>\frac{n(n+1)}{2}.</math>

When calculating the WMA across successive values, it can be noted the difference between the numerators of WMA<sub>''M''+1</sub> and WMA<sub>M</sub> is ''np''<sub>''M''+1</sub>&nbsp;&minus;&nbsp;''p''<sub>M</sub>&nbsp;&minus;&nbsp;...&nbsp;&minus;&nbsp;''p''<sub>''M''&minus;n+1</sub>. If we denote the sum ''p''<sub>M</sub>&nbsp;+&nbsp;...&nbsp;+&nbsp;''p''<sub>''M''&minus;''n''+1</sub> by Total<sub>M</sub>, then

:<math>\text{Total}_{M+1} = \text{Total}_{M} + p_{M+1} - p_{M-n+1} \,</math>

:<math>\text{Numerator}_{M+1} = \text{Numerator}_M + n p_{M+1} - \text{Total}_M \,</math>

:<math>\text{WMA}_{M+1} = { \text{Numerator}_{M+1} \over n + (n-1) + \cdots + 2 + 1} \,</math>
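
The three update equations above can be sketched in Python as follows. The function name <code>streaming_wma</code> is hypothetical, and the sketch assumes the whole price list is available so that the value leaving the window can be read by index.

```python
def streaming_wma(prices, n):
    """Return the n-period WMA for every full window of `prices`.

    Weights run from 1 (oldest value in the window) to n (newest).
    """
    if n < 1 or len(prices) < n:
        raise ValueError("need at least n prices and n >= 1")
    denominator = n * (n + 1) // 2  # triangle number n + (n-1) + ... + 1
    first = prices[:n]
    total = sum(first)
    numerator = sum(w * p for w, p in enumerate(first, start=1))
    result = [numerator / denominator]
    for m in range(n, len(prices)):
        # Numerator_{M+1} = Numerator_M + n * p_{M+1} - Total_M
        numerator += n * prices[m] - total
        # Total_{M+1} = Total_M + p_{M+1} - p_{M-n+1}
        total += prices[m] - prices[m - n]
        result.append(numerator / denominator)
    return result


wma = streaming_wma([1, 2, 3, 4], n=3)
```

Each step costs a constant number of operations regardless of ''n'', at the price of carrying two running sums.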

The graph at the right shows how the weights decrease, from highest weight for the most recent data points, down to zero. It can be compared to the weights in the exponential moving average which follows.

==Exponential moving average==

[[Image:Exponential moving average weights N=15.png|thumb|right|EMA weights ''N''=15]]

An '''exponential moving average''' (EMA), sometimes also called an '''exponentially weighted moving average''' (EWMA),{{Citation needed|date=February 2010}} is a type of [[infinite impulse response]] filter that applies weighting factors which decrease [[exponentiation|exponentially]]. The weighting for each older data point decreases exponentially, never reaching zero. The graph at right shows an example of the weight decrease.

The formula for calculating the EMA at time periods ''t''&nbsp;&gt;&nbsp;2 is
:<math>S_{t} = \alpha \times Y_{t-1} + (1-\alpha) \times S_{t-1}</math>

Where:
* The coefficient ''α'' represents the degree of weighting decrease, a constant smoothing factor between 0 and 1. A higher ''α'' discounts older observations faster. Alternatively, ''α'' may be expressed in terms of ''N'' time periods, where ''α''&nbsp;=&nbsp;2/(''N''+1). For example, ''N''&nbsp;=&nbsp;19 is equivalent to ''α''&nbsp;=&nbsp;0.1. The half-life of the weights (the interval over which the weights decrease by a factor of two) is approximately ''N''/2.8854 (within 1% if ''N''&nbsp;>&nbsp;5).
* ''Y<sub>t</sub>'' is the observation at a time period ''t''.
* ''S<sub>t</sub>'' is the value of the EMA at any time period ''t''.

''S''<sub>1</sub> is undefined. ''S''<sub>2</sub> may be initialized in a number of different ways, most commonly by setting ''S''<sub>2</sub> to ''Y''<sub>1</sub>, though other techniques exist, such as setting ''S''<sub>2</sub> to an average of the first 4 or 5 observations. The prominence of the ''S''<sub>2</sub> initialization's effect on the resultant moving average depends on ''α''; smaller ''α'' values make the choice of ''S''<sub>2</sub> relatively more important than larger ''α'' values, since a higher ''α'' discounts older observations faster.

This formulation is according to Hunter (1986)<ref>[http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc431.htm NIST/SEMATECH e-Handbook of Statistical Methods: Single Exponential Smoothing] at the [[National Institute of Standards and Technology]]</ref>. By repeated application of this formula for different times, we can eventually write ''S<sub>t</sub>'' as a weighted sum of the data points ''Y<sub>t</sub>'', as:

:<math>S_{t} = \alpha \times (Y_{t-1} + (1-\alpha) \times Y_{t-2} + (1-\alpha)^2 \times Y_{t-3} + ... + (1-\alpha)^k \times Y_{t-(k+1)}) + (1-\alpha)^{k+1} \times S_{t-(k+1)}</math>

for any suitable k = 0, 1, 2, ... The weight of the general data point <math>Y_{t-i}</math> is <math>\alpha(1-\alpha)^{i-1} </math>.

An alternate approach by Roberts (1959) uses ''Y<sub>t</sub>'' in lieu of ''Y''<sub>''t''&minus;1</sub><ref>[http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc324.htm NIST/SEMATECH e-Handbook of Statistical Methods: EWMA Control Charts] at the [[National Institute of Standards and Technology]]</ref>:

:<math>S_{t,\text{ alternate}} = \alpha \times Y_t + (1-\alpha) \times S_{t-1}</math>

This formula can also be expressed in technical analysis terms as follows, showing how the EMA steps towards the latest data point, but only by a proportion of the difference (each time):<ref name="stockcharts">[http://www.stockcharts.com/education/IndicatorAnalysis/indic_movingAvg.html Moving Averages page] at StockCharts.com</ref>

:<math>\text{EMA}_{\text{today}} = \text{EMA}_{\text{yesterday}} + \alpha \times (\text{price}_{\text{today}} - \text{EMA}_\text{yesterday})</math>

Expanding out <math>\text{EMA}_{\text{yesterday}}</math> each time results in the following power series, showing how the weighting factor on each data point ''p''<sub>1</sub>, ''p''<sub>2</sub>, etc, decreases exponentially:
:<math>\text{EMA} = { \alpha \times (p_1 + (1-\alpha) p_2 + (1-\alpha)^2 p_3 + (1-\alpha)^3 p_4 + \cdots ) }</math><ref><math>\text{EMA} = { p_1 + (1-\alpha) p_2 + (1-\alpha)^2 p_3 + (1-\alpha)^3 p_4 + \cdots \over 1 + (1-\alpha) + (1-\alpha)^2 + (1-\alpha)^3 + \cdots }</math>, since <math>1/\alpha = 1+(1-\alpha)+(1-\alpha)^2+\cdots</math>.</ref>

This is an [[series (mathematics)|infinite sum]] with decreasing terms.
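
The recursive form with the current observation (Roberts' variant in this section) can be sketched in Python as follows. The function names <code>ema</code> and <code>alpha_from_periods</code> are hypothetical, and seeding the average with the first observation is only one of the initializations mentioned above, assumed here for simplicity.

```python
def alpha_from_periods(n_periods):
    """Smoothing factor for an N-period EMA: alpha = 2 / (N + 1)."""
    return 2 / (n_periods + 1)


def ema(values, alpha):
    """Return the exponential moving average after each observation."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    result = []
    s = None
    for y in values:
        if s is None:
            s = y  # assumed seed: first observation
        else:
            # S_t = alpha * Y_t + (1 - alpha) * S_{t-1}
            s = alpha * y + (1 - alpha) * s
        result.append(s)
    return result


smoothed = ema([10, 20, 30], alpha=0.5)
```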

The ''N'' periods in an ''N''-day EMA only specify the ''α'' factor. ''N'' is not a stopping point for the calculation in the way it is in an [[#Simple moving average|SMA]] or [[#Weighted moving average|WMA]]. For sufficiently large ''N'', the first ''N'' data points in an EMA represent about 86% of the total weight in the calculation:<ref>The denominator on the left-hand side should be unity, and the numerator will become the right-hand side ([[geometric progression#Geometric series|geometric series]]), <math>\alpha \left({1-(1-\alpha)^{N+1} \over 1-(1-\alpha)}\right)</math>.</ref>

:<math> {{\alpha \times \left(1+(1-\alpha)+(1-\alpha)^2+\cdots +(1-\alpha)^N \right)} \over {\alpha \times \left(1+(1-\alpha)+(1-\alpha)^2+\cdots +(1-\alpha)^\infty \right)}}= 1-{\left(1-{2 \over N+1}\right)}^{N+1}</math>

:i.e. <math> \lim_{N \to \infty} \left[1-{\left(1-{2 \over N+1}\right)}^{N+1} \right] </math> simplified<ref>Because (1+''x''/''n'')<sup>''n''</sup> becomes e<sup>''x''</sup> for large ''n''.</ref>, tends to <math>1-\text{e}^{-2} \approx 0.8647</math>.
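
The limit can be checked numerically with a short Python sketch. The function name <code>weight_in_leading_terms</code> is hypothetical; it evaluates the closed form of the finite geometric series for the terms up to <math>(1-\alpha)^N</math>.

```python
import math


def weight_in_leading_terms(n_periods):
    """Fraction of total EMA weight carried by the terms up to (1 - alpha)^N."""
    alpha = 2 / (n_periods + 1)
    # alpha * (1 + (1-alpha) + ... + (1-alpha)^N) = 1 - (1-alpha)^(N+1)
    return 1 - (1 - alpha) ** (n_periods + 1)


limit = 1 - math.exp(-2)  # about 0.8647
for n in (19, 199, 1999):
    print(n, weight_in_leading_terms(n), limit)
```

For small ''N'' the fraction is somewhat above the limit (about 0.878 for ''N'' = 19) and decreases towards it as ''N'' grows.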

The [[power series|power formula]] above gives a starting value for a particular day, after which the successive days formula shown first can be applied. The question of how far back to go for an initial value depends, in the worst case, on the data. If there are huge ''p'' price values in old data then they'll have an effect on the total even if their weighting is very small. If one assumes prices don't vary too wildly then just the weighting can be considered. The weight omitted by stopping after ''k'' terms is

:<math>\alpha \times \left( (1-\alpha)^k + (1-\alpha)^{k+1} + (1-\alpha)^{k+2} + \cdots \right),</math>

which is

:<math>\alpha \times (1-\alpha)^k \times \left(1 + (1-\alpha) + (1-\alpha)^2 + \cdots \right),</math>

i.e. a fraction


== Continuous moving average ==
The continuous moving average is defined with the following integral. The <math>\varepsilon</math> environment <math> [ x_o-\varepsilon, x_o+\varepsilon] </math> around <math>x_o</math> defines the intensity of smoothing of the graph of the function.
:<math>
:<math>
\begin{array}{rrcl}
{{\text{weight omitted by stopping after k terms}} \over {\text{total weight}}} = { { \alpha \times \left[ (1-\alpha)^k +(1-\alpha)^{k+1} +(1-\alpha)^{k+2} + \cdots \right] } \over { { \alpha \times \left[ 1 + (1-\alpha) +(1-\alpha)^{2} + \cdots \right] } } }
f: & \mathbb{R} & \rightarrow & \mathbb{R} \\
& x & \mapsto & f\left( x \right)
\end{array}
</math>
</math>
The continuous moving average of the function <math>f</math> is defined as:

:<math>
:<math>
\begin{array}{rrcl}
= { {\alpha (1-\alpha)^k \times {{1} \over {1-(1-\alpha)}}} \over { { {\alpha} \over {1-(1-\alpha) } } } }
MA_f: & \mathbb{R} & \rightarrow & \mathbb{R} \\
</math>
& x & \mapsto & \displaystyle \frac{1}{2\cdot \varepsilon} \cdot \int_{x_o -\varepsilon}^{x_o +\varepsilon} f\left( t \right) \, dt
\end{array}
</math>
A larger <math> \varepsilon > 0 </math> smoothes the source graph of the function (blue) <math>f</math> more. The animations below show the moving average as animation in dependency of different values for <math> \varepsilon > 0 </math>. The fraction <math>\frac{1}{2\cdot \varepsilon} </math> is used, because <math> 2\cdot \varepsilon </math> is the interval width for the integral.


<gallery widths=240>
:<math>
File:Moving average sin polynom mov av.gif|Continuous moving average sine and polynom - visualization of the smoothing with a small interval for integration
= (1 - \alpha)^k
File:Moving average sine and polynom with a larger interval.gif|Continuous moving average sine and polynom - visualization of the smoothing with a larger interval for integration
</math>
File:Moving average sine and polynom - visualization of interval width.gif|Animation showing the impact of interval width and smoothing by moving average.
</gallery>


==Cumulative average==
out of the total weight.
In a '''cumulative average''' ('''CA'''), the data arrive in an ordered datum stream, and the user would like to get the average of all of the data up until the current datum. For example, an investor may want the average price of all of the stock transactions for a particular stock up until the current time. As each new transaction occurs, the average price at the time of the transaction can be calculated for all of the transactions up to that point using the cumulative average, typically an equally weighted [[average]] of the sequence of ''n'' values <math>x_1. \ldots, x_n</math> up to the current time:
<math display="block">\textit{CA}_n = {{x_1 + \cdots + x_n} \over n}\,.</math>


The brute-force method to calculate this would be to store all of the data and calculate the sum and divide by the number of points every time a new datum arrived. However, it is possible to simply update cumulative average as a new value, <math>x_{n+1}</math> becomes available, using the formula
For example, to have 99.9% of the weight, set above ratio equal to 0.1% and solve for ''k'':
<math display="block">\textit{CA}_{n+1} = {{x_{n+1} + n \cdot \textit{CA}_n} \over {n+1}}.</math>


Thus the current cumulative average for a new datum is equal to the previous cumulative average, times ''n'', plus the latest datum, all divided by the number of points received so far, ''n''+1. When all of the data arrive ({{math|1=''n'' = ''N''}}), then the cumulative average will equal the final average. It is also possible to store a running total of the data as well as the number of points and dividing the total by the number of points to get the CA each time a new datum arrives.
:<math>k={ \log (0.001) \over \log (1-\alpha)}</math>


The derivation of the cumulative average formula is straightforward. Using
terms should be used. Since <math>\log\,(1-\alpha)</math> approaches <math>-2 \over N+1</math> as N increases<ref>It means <math>\alpha</math> -> 0, and the [[Taylor series]] of <math>\log(1-\alpha) = -\alpha -\alpha^2/2 - \cdots</math> tends to <math>-\alpha</math>.</ref>, this simplifies to approximately<ref>log<sub>e</sub>(0.001) / 2 = -3.45</ref>
<math display="block">x_1 + \cdots + x_n = n \cdot \textit{CA}_n</math>
and similarly for {{math|''n'' + 1}}, it is seen that
<math display="block">x_{n+1} = (x_1 + \cdots + x_{n+1}) - (x_1 + \cdots + x_n)</math>
<math display="block">x_{n+1} = (n + 1) \cdot \textit{CA}_{n + 1} - n \cdot \textit{CA}_n </math>


:<math>k = 3.45(N+1) \,</math>
Solving this equation for <math>\textit{CA}_{n+1}</math> results in
<math display="block">\begin{align}
\textit{CA}_{n+1} & = {x_{n+1} + n \cdot \textit{CA}_n \over {n+1}} \\[6pt]
& = {x_{n+1} + (n + 1 - 1) \cdot \textit{CA}_n \over {n+1}} \\[6pt]
& = {(n + 1) \cdot \textit{CA}_n + x_{n+1} - \textit{CA}_n \over {n+1}} \\[6pt]
& = {\textit{CA}_n} + {{x_{n+1} - \textit{CA}_n} \over {n+1}}
\end{align}</math>


==Weighted moving average==
for this example (99.9% weight).
A weighted average is an average that has multiplying factors to give different weights to data at different positions in the sample window. Mathematically, the weighted moving average is the [[convolution]] of the data with a fixed weighting function. One application is removing [[pixelization]] from a digital graphical image.{{Citation needed|date=February 2018}}


In the financial field, and more specifically in the analyses of financial data, a '''weighted moving average''' (WMA) has the specific meaning of weights that decrease in arithmetical progression.<ref>{{cite web|title=Weighted Moving Averages: The Basics |url=http://www.investopedia.com/articles/technical/060401.asp |publisher=Investopedia}}</ref> In an ''n''-day WMA the latest day has weight ''n'', the second latest <math>n-1</math>, etc., down to one.
===Modified moving average===
A '''modified moving average''' (MMA), '''running moving average''' (RMA), or '''smoothed moving average''' is defined as:


:<math>\text{MMA}_{\text{today}} = {(N - 1) \times \text{MMA}_{\text{yesterday}} + \text{price} \over{N}}</math>
<math display="block">\text{WMA}_{M} = { n p_{M} + (n-1) p_{M-1} + \cdots + 2 p_{((M-n)+2)} + p_{((M-n)+1)} \over n + (n-1) + \cdots + 2 + 1}</math>


[[Image:Weighted moving average weights N=15.svg|thumb|right|WMA weights ''n'' = 15]]
In short, this is exponential moving average, with <math>\alpha=1/N</math>.


The denominator is a [[triangle number]] equal to <math display="inline">\frac{n(n + 1)}{2}.</math> In the more general case the denominator will always be the sum of the individual weights.
=== Application to measuring computer performance ===
Some computer performance metrics, e.g. the average process queue length, or the average CPU utilization, use a form of exponential moving average.


When calculating the WMA across successive values, the difference between the numerators of <math>\text{WMA}_{M+1}</math> and <math>\text{WMA}_{M}</math> is <math>np_{M+1} - p_{M} - \dots - p_{M-n+1}</math>. If we denote the sum <math>p_{M} + \dots + p_{M-n+1}</math> by <math>\text{Total}_{M}</math>, then
:<math>S_n = \alpha(t_{n}-t_{n-1}) \times Y_n + (1-\alpha(t_n-t_{n-1})) \times
S_{n-1}.</math>


<math display="block">\begin{align}
Here <math>\alpha</math> is defined as a function of time between two readings. An example of a coefficient giving bigger weight to the current reading, and smaller weight to the older readings is
\text{Total}_{M+1} &= \text{Total}_M + p_{M+1} - p_{M-n+1} \\[3pt]
\text{Numerator}_{M+1} &= \text{Numerator}_M + n p_{M+1} - \text{Total}_M \\[3pt]
\text{WMA}_{M+1} &= { \text{Numerator}_{M+1} \over n + (n-1) + \cdots + 2 + 1}
\end{align}</math>


The graph at the right shows how the weights decrease, from highest weight for the most recent data, down to zero. It can be compared to the weights in the exponential moving average which follows.
:<math>\alpha(t_{n}-t_{n-1}) = 1-e^{-{ {t_{n}-t_{n-1}} \over {W \times 60} }}</math>


=={{anchor|Exponential}}Exponential moving average==
where time for readings ''t''<sub>''n''</sub> is expressed in seconds, and <math>W</math> is the period of time in minutes over which the reading is said to be averaged (the mean lifetime of each reading in the average). Given the above definition of <math>\alpha</math>, the moving average can be expressed as
{{main|Exponential smoothing}}
{{further|EWMA chart}}
An '''exponential moving average (EMA)''', also known as an '''exponentially weighted moving average (EWMA)''',<ref>{{cite web |url=http://lorien.ncl.ac.uk/ming/filter/filewma.htm |title=DEALING WITH MEASUREMENT NOISE - Averaging Filter |access-date=2010-10-26 |url-status=dead |archive-url=https://web.archive.org/web/20100329135531/http://lorien.ncl.ac.uk/ming/filter/filewma.htm |archive-date=2010-03-29 }}</ref> is a first-order [[infinite impulse response]] filter that applies weighting factors which decrease [[Exponential decay|exponentially]]. The weighting for each older [[data|datum]] decreases exponentially, never reaching zero.
This formulation is according to Hunter (1986).<ref>[http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc431.htm NIST/SEMATECH e-Handbook of Statistical Methods: Single Exponential Smoothing] at the [[National Institute of Standards and Technology]]</ref>


For samples ''Y''<sub>''n''</sub> taken at times ''t''<sub>''n''</sub> (in seconds), an exponential average ''S''<sub>''n''</sub> over a window of ''W'' minutes is updated as
:<math>S_n = (1-e^{-{ {t_n - t_{n-1}} \over {W \times 60}}}) \times Y_n + e^{-{{t_{n}-t_{n-1}} \over {W \times 60}}} \times S_{n-1}</math>

For example, a 15-minute average ''L'' of a process queue length ''Q'', measured every 5 seconds (time difference is 5 seconds), is computed as
:<math>L_n = (1-e^{-{5 \over {15 \times 60}}}) \times Q_n + e^{-{5 \over {15 \times 60}}} \times L_{n-1} = (1-e^{-{1 \over {180}}}) \times Q_n + e^{-1/180} \times L_{n-1} = Q_n + e^{-1/180} \times ( L_{n-1} - Q_n )</math>

==Other weightings==
Other weighting systems are used occasionally – for example, in share trading a '''volume weighting''' will weight each time period in proportion to its trading volume.

A further weighting, used by actuaries, is Spencer's 15-Point Moving Average<ref>[http://mathworld.wolfram.com/Spencers15-PointMovingAverage.html Spencer's 15-Point Moving Average — from Wolfram MathWorld<!-- Bot generated title -->]</ref> (a central moving average). Its symmetric weight coefficients are [−3, −6, −5, 3, 21, 46, 67, 74, 67, 46, 21, 3, −5, −6, −3], which factors as {{sfrac|[1, 1, 1, 1]×[1, 1, 1, 1]×[1, 1, 1, 1, 1]×[−3, 3, 4, 3, −3]|320}} and leaves samples of any quadratic or cubic polynomial unchanged.<ref>Rob J Hyndman. "[https://robjhyndman.com/papers/movingaverage.pdf Moving averages]". 2009-11-08. Accessed 2020-08-20.</ref><ref>Aditya Guntuboyina. "[https://www.stat.berkeley.edu/~aditya/Site/Statistics_153;_Spring_2012_files/Spring2012Statistics153LectureThree.pdf Statistics 153 (Time Series) : Lecture Three]". 2012-01-24. Accessed 2024-01-07.</ref>

Outside the world of finance, weighted running means have many forms and applications. Each weighting function or "kernel" has its own characteristics. In engineering and science the frequency and phase response of the filter is often of primary importance in understanding the desired and undesired distortions that a particular filter will apply to the data.

A mean does not just "smooth" the data. A mean is a form of low-pass filter. The effects of the particular filter used should be understood in order to make an appropriate choice. On this point, the French version of this article discusses the spectral effects of 3 kinds of means (cumulative, exponential, Gaussian).


==Moving median==
From a statistical point of view, the moving average, when used to estimate the underlying trend in a time series, is susceptible to rare events such as rapid shocks or other anomalies. A more robust estimate of the trend is the '''simple moving median''' over ''n'' time points:
<math display="block">\widetilde{p}_\text{SM} = \text{Median}( p_M, p_{M-1}, \ldots, p_{M-n+1} )</math>
where the [[median]] is found by, for example, sorting the values inside the brackets and finding the value in the middle. For larger values of ''n'', the median can be efficiently computed by updating an [[Skip list#Indexable skiplist|indexable skiplist]].<ref>{{Cite web | url=http://code.activestate.com/recipes/576930/ |title = Efficient Running Median using an Indexable Skiplist « Python recipes « ActiveState Code}}</ref>


Statistically, the moving average is optimal for recovering the underlying trend of the time series when the fluctuations about the trend are [[normal distribution|normally distributed]]. However, the normal distribution does not place high probability on very large deviations from the trend which explains why such deviations will have a disproportionately large effect on the trend estimate. It can be shown that if the fluctuations are instead assumed to be [[Laplace distribution|Laplace distributed]], then the moving median is statistically optimal.<ref>G.R. Arce, "Nonlinear Signal Processing: A Statistical Approach", Wiley:New Jersey, US, 2005.</ref> For a given variance, the Laplace distribution places higher probability on rare events than does the normal, which explains why the moving median tolerates shocks better than the moving mean.


When the simple moving median above is central, the smoothing is identical to the [[median filter]], which has applications in, for example, image signal processing.


==Moving average regression model==
{{Main|Moving-average model}}
In a [[Moving average model|moving average regression model]], a variable of interest is assumed to be a weighted moving average of unobserved independent error terms; the weights in the moving average are parameters to be estimated.

The moving average and the moving-average model are often confused because of their similar names; although they share many similarities, they are distinct methods used in very different contexts.


==See also==
{{commons category|Moving averages}}

*[[Exponential smoothing]]
*[[Local regression]] (LOESS and LOWESS)
*[[Kernel smoothing]]
*[[MACD|Moving average convergence/divergence indicator]]
*[[Martingale (probability theory)]]
*[[Moving average crossover]]
*[[Moving least squares]]
*[[Rising moving average]]
*[[Rolling hash]]
*[[Running total]]
*[[Savitzky–Golay filter]]
*[[Window function]]
*[[Zero lag exponential moving average]]
*[[Steklov function]]


==References==
{{More footnotes|date=February 2010}}
{{reflist|30em}}

==External links==

*[http://www.think-lamp.com/2009/03/the-hidden-power-of-ping/ EWMA in determining network traffic and ethernet]
*[http://www.eng.ox.ac.uk/samp/members/max/software/ Fast software for computing the simple moving median of a time series.]


{{statistics}}
{{technical analysis}}
{{Quantitative forecasting methods}}


{{DEFAULTSORT:Moving Average}}
[[Category:Statistical charts and diagrams]]
[[Category:Time series]]

[[Category:Time series analysis]]
[[Category:Chart overlays]]

[[Category:Mathematical finance]]

[[Category:Technical analysis]]

[[cs:Klouzavý průměr]]
[[de:Gleitender Mittelwert]]
[[es:Media móvil]]
[[eu:Batezbesteko higikor]]
[[fr:Moyenne glissante]]
[[it:Media mobile]]
[[nl:Voortschrijdend gemiddelde]]
[[ja:移動平均]]
[[pl:Średnia krocząca]]
[[pt:EWMA]]
[[ru:Скользящая средняя]]
[[sv:Glidande medelvärde]]
[[uk:Ковзаюче середнє]]
[[vi:Trung bình trượt]]
[[zh:移動平均]]

Latest revision as of 20:55, 23 November 2024

Smoothing of a noisy sine (blue curve) with a moving average (red curve).

In statistics, a moving average (rolling average or running average or moving mean[1] or rolling mean) is a calculation to analyze data points by creating a series of averages of different selections of the full data set. Variations include: simple, cumulative, or weighted forms.

Mathematically, a moving average is a type of convolution. Thus in signal processing it is viewed as a low-pass finite impulse response filter. Because the boxcar function outlines its filter coefficients, it is called a boxcar filter. It is sometimes followed by downsampling.

Given a series of numbers and a fixed subset size, the first element of the moving average is obtained by taking the average of the initial fixed subset of the number series. Then the subset is modified by "shifting forward"; that is, excluding the first number of the series and including the next value in the subset.

A moving average is commonly used with time series data to smooth out short-term fluctuations and highlight longer-term trends or cycles. The threshold between short-term and long-term depends on the application, and the parameters of the moving average will be set accordingly. It is also used in economics to examine gross domestic product, employment or other macroeconomic time series. When used with non-time series data, a moving average filters higher frequency components without any specific connection to time, although typically some kind of ordering is implied. Viewed simplistically it can be regarded as smoothing the data.

Simple moving average


In financial applications a simple moving average (SMA) is the unweighted mean of the previous k data-points. However, in science and engineering, the mean is normally taken from an equal number of data on either side of a central value. This ensures that variations in the mean are aligned with the variations in the data rather than being shifted in time. An example of a simple equally weighted running mean is the mean over the last k entries of a data-set containing n entries. Let those data-points be <math>p_1, p_2, \dots, p_n</math>. This could be closing prices of a stock. The mean over the last k data-points (days in this example) is denoted as <math>\textit{SMA}_k</math> and calculated as:
<math display="block">\textit{SMA}_k = \frac{p_{n-k+1} + p_{n-k+2} + \cdots + p_n}{k} = \frac{1}{k} \sum_{i=n-k+1}^{n} p_i</math>

When calculating the next mean <math>\textit{SMA}_{k,\text{next}}</math> with the same sampling width k, the range from n − k + 2 to n + 1 is considered. A new value <math>p_{n+1}</math> comes into the sum and the oldest value <math>p_{n-k+1}</math> drops out. This simplifies the calculations by reusing the previous mean <math>\textit{SMA}_{k,\text{prev}}</math>:
<math display="block">\textit{SMA}_{k,\text{next}} = \frac{1}{k} \sum_{i=n-k+2}^{n+1} p_i = \textit{SMA}_{k,\text{prev}} + \frac{1}{k} \left( p_{n+1} - p_{n-k+1} \right)</math>
This means that the moving average filter can be computed quite cheaply on real time data with a FIFO / circular buffer and only 3 arithmetic steps.

During the initial filling of the FIFO / circular buffer the sampling window is equal to the data-set size, thus k = n, and the average calculation is performed as a cumulative moving average.

The period selected (k) depends on the type of movement of interest, such as short, intermediate, or long-term.
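
The FIFO-based update described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the class name StreamingSMA, the window parameter k and the sample values are assumptions made for the example.

```python
from collections import deque


class StreamingSMA:
    """Simple moving average over the last k values, updated in O(1) per datum.

    Hypothetical helper for illustration; the name and interface are assumptions.
    """

    def __init__(self, k):
        if k <= 0:
            raise ValueError("window size k must be positive")
        self.k = k
        self.window = deque()  # FIFO holding at most k recent values
        self.total = 0.0       # running sum of the values in the window

    def update(self, value):
        """Add a new datum and return the current average."""
        self.window.append(value)
        self.total += value
        if len(self.window) > self.k:
            # Window is full: the oldest value drops out of the sum.
            self.total -= self.window.popleft()
        # While the buffer is still filling, fewer than k values are held,
        # so this is the cumulative average of everything seen so far.
        return self.total / len(self.window)


sma = StreamingSMA(k=3)
averages = [sma.update(p) for p in [1, 2, 3, 4, 5]]
print(averages)  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

Keeping a running sum avoids re-adding the whole window for each datum. With floating-point data the running sum can slowly accumulate rounding error, so a long-running filter may want to recompute the sum from the buffer occasionally.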

If the data used are not centered around the mean, a simple moving average lags behind the latest datum by half the sample width. An SMA can also be disproportionately influenced by old data dropping out or new data coming in. One characteristic of the SMA is that if the data has a periodic fluctuation, then applying an SMA of that period will eliminate that variation (the average always containing one complete cycle). But a perfectly regular cycle is rarely encountered.[2]

For a number of applications, it is advantageous to avoid the shifting induced by using only "past" data. Hence a central moving average can be computed, using data equally spaced on either side of the point in the series where the mean is calculated.[3] This requires using an odd number of points in the sample window.

A major drawback of the SMA is that it lets through a significant amount of the signal shorter than the window length. Worse, it actually inverts it.[citation needed] This can lead to unexpected artifacts, such as peaks in the smoothed result appearing where there were troughs in the data. It also leads to the result being less smooth than expected since some of the higher frequencies are not properly removed.

Its frequency response is a type of low-pass filter called sinc-in-frequency.

Continuous moving average


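The continuous moving average is defined with the following integral. The <math>\varepsilon</math> environment <math>[x_o - \varepsilon, x_o + \varepsilon]</math> around <math>x_o</math> defines the intensity of smoothing of the graph of the function.

The continuous moving average of the function <math>f</math> is defined as:
<math display="block">\textit{MA}_f(x_o) = \frac{1}{2 \varepsilon} \int_{x_o - \varepsilon}^{x_o + \varepsilon} f(t) \, dt</math>

A larger <math>\varepsilon > 0</math> smooths the source graph of the function <math>f</math> (blue) more. The animations below show the moving average for different values of <math>\varepsilon > 0</math>. The fraction <math>\frac{1}{2 \varepsilon}</math> is used, because <math>2 \varepsilon</math> is the interval width for the integral.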
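
As a rough numerical illustration, the sketch below approximates a continuous moving average with the midpoint rule. It assumes the average is the integral of f over an interval of half-width eps around x0, divided by the interval width 2·eps; the function name, the parameter names and the test function f(t) = t² are all choices made for this example.

```python
def continuous_moving_average(f, x0, eps, steps=1000):
    """Approximate (1 / (2*eps)) * integral of f over [x0 - eps, x0 + eps].

    Uses the midpoint rule with `steps` equal sub-intervals (an assumption
    of this sketch; any quadrature rule would do).
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    h = 2 * eps / steps  # width of one sub-interval
    total = sum(f(x0 - eps + (i + 0.5) * h) for i in range(steps))
    return total * h / (2 * eps)


# For f(t) = t**2 the exact average over [x0 - eps, x0 + eps] is x0**2 + eps**2 / 3.
approx = continuous_moving_average(lambda t: t * t, x0=1.0, eps=0.5)
exact = 1.0 ** 2 + 0.5 ** 2 / 3
print(abs(approx - exact) < 1e-6)  # True
```

The closed form for t² shows the smoothing effect directly: the average exceeds f(x0) by eps²/3, so a wider interval pulls the result further from the original curve.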

Cumulative average


In a cumulative average (CA), the data arrive in an ordered datum stream, and the user would like to get the average of all of the data up until the current datum. For example, an investor may want the average price of all of the stock transactions for a particular stock up until the current time. As each new transaction occurs, the average price at the time of the transaction can be calculated for all of the transactions up to that point using the cumulative average, typically an equally weighted average of the sequence of n values <math>x_1, \ldots, x_n</math> up to the current time:
<math display="block">\textit{CA}_n = \frac{x_1 + \cdots + x_n}{n}</math>

The brute-force method to calculate this would be to store all of the data and calculate the sum and divide by the number of points every time a new datum arrived. However, it is possible to simply update the cumulative average as a new value, <math>x_{n+1}</math>, becomes available, using the formula
<math display="block">\textit{CA}_{n+1} = \frac{x_{n+1} + n \cdot \textit{CA}_n}{n+1}</math>

Thus the current cumulative average for a new datum is equal to the previous cumulative average, times n, plus the latest datum, all divided by the number of points received so far, n+1. When all of the data arrive (n = N), then the cumulative average will equal the final average. It is also possible to store a running total of the data as well as the number of points and dividing the total by the number of points to get the CA each time a new datum arrives.
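
The update rule described in words above can be sketched as a Python generator that never stores the data. The function name and the sample prices are illustrative assumptions; the update is written in the algebraically equivalent form "previous average plus a correction".

```python
def cumulative_averages(stream):
    """Yield the cumulative average after each datum, without storing the data."""
    ca = 0.0  # cumulative average of the data seen so far
    n = 0     # number of data points seen so far
    for x in stream:
        n += 1
        # Same as (x + (n - 1) * ca) / n, rearranged to avoid a large product.
        ca += (x - ca) / n
        yield ca


prices = [10.0, 12.0, 14.0, 16.0]
result = list(cumulative_averages(prices))
print(result)  # [10.0, 11.0, 12.0, 13.0]
```

The last value equals the ordinary mean of all four prices, as expected once all of the data have arrived.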

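The derivation of the cumulative average formula is straightforward. Using
<math display="block">x_1 + \cdots + x_n = n \cdot \textit{CA}_n</math>
and similarly for n + 1, it is seen that
<math display="block">x_{n+1} = (x_1 + \cdots + x_{n+1}) - (x_1 + \cdots + x_n) = (n+1) \cdot \textit{CA}_{n+1} - n \cdot \textit{CA}_n</math>

Solving this equation for <math>\textit{CA}_{n+1}</math> results in
<math display="block">\textit{CA}_{n+1} = \frac{x_{n+1} + n \cdot \textit{CA}_n}{n+1} = \textit{CA}_n + \frac{x_{n+1} - \textit{CA}_n}{n+1}</math>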

Weighted moving average


A weighted average is an average that has multiplying factors to give different weights to data at different positions in the sample window. Mathematically, the weighted moving average is the convolution of the data with a fixed weighting function. One application is removing pixelization from a digital graphical image.[citation needed]

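In the financial field, and more specifically in the analyses of financial data, a weighted moving average (WMA) has the specific meaning of weights that decrease in arithmetical progression.[4] In an n-day WMA the latest day has weight n, the second latest n − 1, etc., down to one.
<math display="block">\text{WMA}_M = \frac{n p_M + (n-1) p_{M-1} + \cdots + 2 p_{M-n+2} + p_{M-n+1}}{n + (n-1) + \cdots + 2 + 1}</math>

WMA weights n = 15

The denominator is a triangle number equal to <math>\frac{n(n+1)}{2}</math>. In the more general case the denominator will always be the sum of the individual weights.

When calculating the WMA across successive values, the difference between the numerators of <math>\text{WMA}_{M+1}</math> and <math>\text{WMA}_M</math> is <math>n p_{M+1} - p_M - \dots - p_{M-n+1}</math>. If we denote the sum <math>p_M + \dots + p_{M-n+1}</math> by <math>\text{Total}_M</math>, then
<math display="block">\begin{align}
\text{Total}_{M+1} &= \text{Total}_M + p_{M+1} - p_{M-n+1} \\
\text{Numerator}_{M+1} &= \text{Numerator}_M + n p_{M+1} - \text{Total}_M \\
\text{WMA}_{M+1} &= \frac{\text{Numerator}_{M+1}}{n + (n-1) + \cdots + 2 + 1}
\end{align}</math>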

The graph at the right shows how the weights decrease, from highest weight for the most recent data, down to zero. It can be compared to the weights in the exponential moving average which follows.
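
A linearly weighted moving average of this kind, together with the incremental update of its numerator, can be sketched in Python as below. The function name wma, its parameters and the sample data are assumptions for illustration; the weights run from 1 for the oldest value in the window up to n for the latest.

```python
def wma(prices, n):
    """Linearly weighted moving averages of `prices` with window length n.

    The latest value in each window has weight n, the oldest has weight 1.
    """
    if n <= 0 or n > len(prices):
        raise ValueError("n must satisfy 0 < n <= len(prices)")
    denom = n * (n + 1) // 2  # triangle number: 1 + 2 + ... + n
    total = sum(prices[:n])   # unweighted sum of the current window
    numerator = sum((i + 1) * p for i, p in enumerate(prices[:n]))
    out = [numerator / denom]
    for m in range(n, len(prices)):
        # The new value enters with weight n; every old weight drops by one,
        # which subtracts the old unweighted window sum from the numerator.
        numerator += n * prices[m] - total
        total += prices[m] - prices[m - n]
        out.append(numerator / denom)
    return out


result = wma([1, 2, 3, 4, 5], 3)
print(result)  # numerators 14, 20 and 26, each divided by 6
```

Each step costs a constant number of operations regardless of n, because both the weighted numerator and the unweighted window sum are carried forward.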

Exponential moving average


An exponential moving average (EMA), also known as an exponentially weighted moving average (EWMA),[5] is a first-order infinite impulse response filter that applies weighting factors which decrease exponentially. The weighting for each older datum decreases exponentially, never reaching zero. This formulation is according to Hunter (1986).[6]
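
The text above does not spell out the recursion, so the following Python sketch assumes the commonly used first-order form, in which each output is alpha times the new datum plus (1 − alpha) times the previous output, with the first output set to the first datum. The smoothing factor alpha, the initialisation and the function name are assumptions of this example.

```python
def ema(values, alpha):
    """Exponential moving average with smoothing factor alpha in (0, 1].

    Assumed recursion: s = alpha * y + (1 - alpha) * s_previous,
    initialised with the first datum.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    s = None
    for y in values:
        s = y if s is None else alpha * y + (1 - alpha) * s
        out.append(s)
    return out


smoothed = ema([1.0, 2.0, 3.0], alpha=0.5)
print(smoothed)  # [1.0, 1.5, 2.25]
```

Unrolling the recursion shows why the weights never reach zero: a datum that is j steps old contributes with weight alpha·(1 − alpha)^j, which shrinks geometrically but stays positive for alpha below 1.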

Other weightings


Other weighting systems are used occasionally – for example, in share trading a volume weighting will weight each time period in proportion to its trading volume.

A further weighting, used by actuaries, is Spencer's 15-Point Moving Average[7] (a central moving average). Its symmetric weight coefficients are [−3, −6, −5, 3, 21, 46, 67, 74, 67, 46, 21, 3, −5, −6, −3], which factors as [1, 1, 1, 1]×[1, 1, 1, 1]×[1, 1, 1, 1, 1]×[−3, 3, 4, 3, −3]/320 and leaves samples of any quadratic or cubic polynomial unchanged.[8][9]
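
The factorization and the polynomial-preserving property stated above can be checked with a few lines of integer arithmetic. The sketch below uses only plain Python lists; the helper names and the particular cubic are arbitrary choices for the example.

```python
def convolve(a, b):
    """Full discrete convolution of two integer sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


# Spencer's weights before division by 320.
SPENCER = [-3, -6, -5, 3, 21, 46, 67, 74, 67, 46, 21, 3, -5, -6, -3]

# Multiply out the four factors given in the text.
product = [1]
for factor in ([1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1, 1], [-3, 3, 4, 3, -3]):
    product = convolve(product, factor)


def cubic(t):
    """An arbitrary cubic polynomial with integer coefficients."""
    return 2 * t**3 - t**2 + 5 * t - 7


def spencer_times_320(f, t):
    """320 times the Spencer-smoothed value of f at integer t (kept exact)."""
    return sum(w * f(t + j) for j, w in zip(range(-7, 8), SPENCER))


print(product == SPENCER)                        # True
print(sum(SPENCER))                              # 320
print(spencer_times_320(cubic, 4) == 320 * cubic(4))  # True
```

Working with the unnormalised integer weights keeps every comparison exact; dividing by 320 at the end gives the smoothed value itself.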

Outside the world of finance, weighted running means have many forms and applications. Each weighting function or "kernel" has its own characteristics. In engineering and science the frequency and phase response of the filter is often of primary importance in understanding the desired and undesired distortions that a particular filter will apply to the data.

A mean does not just "smooth" the data. A mean is a form of low-pass filter. The effects of the particular filter used should be understood in order to make an appropriate choice. On this point, the French version of this article discusses the spectral effects of 3 kinds of means (cumulative, exponential, Gaussian).

Moving median


From a statistical point of view, the moving average, when used to estimate the underlying trend in a time series, is susceptible to rare events such as rapid shocks or other anomalies. A more robust estimate of the trend is the simple moving median over n time points:
<math display="block">\widetilde{p}_\text{SM} = \text{Median}( p_M, p_{M-1}, \ldots, p_{M-n+1} )</math>
where the median is found by, for example, sorting the values inside the brackets and finding the value in the middle. For larger values of n, the median can be efficiently computed by updating an indexable skiplist.[10]
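
A simple moving median can be sketched in Python by keeping the current window in sorted order. Note that this example uses a sorted list maintained with the standard bisect module rather than the indexable skiplist mentioned above, so removals cost time proportional to the window length; the function names and the sample data with a single spike are assumptions for illustration.

```python
from bisect import bisect_left, insort
from collections import deque


def moving_median(values, n):
    """Median of each window of n consecutive values (n must be odd)."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    window = deque()  # values in arrival order
    ordered = []      # the same values, kept sorted
    out = []
    for v in values:
        window.append(v)
        insort(ordered, v)
        if len(window) > n:
            oldest = window.popleft()
            del ordered[bisect_left(ordered, oldest)]
        if len(window) == n:
            out.append(ordered[n // 2])  # middle element of the sorted window
    return out


def moving_mean(values, n):
    """Plain moving mean over the same windows, for comparison."""
    return [sum(values[i:i + n]) / n for i in range(len(values) - n + 1)]


data = [1, 2, 3, 100, 5, 6, 7]  # 100 is a one-off shock
medians = moving_median(data, 3)
means = moving_mean(data, 3)
print(medians)  # [2, 3, 5, 6, 6]
print(means)    # [2.0, 35.0, 36.0, 37.0, 6.0]
```

The single outlier never appears in the median output, while it dominates three consecutive values of the moving mean.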

Statistically, the moving average is optimal for recovering the underlying trend of the time series when the fluctuations about the trend are normally distributed. However, the normal distribution does not place high probability on very large deviations from the trend which explains why such deviations will have a disproportionately large effect on the trend estimate. It can be shown that if the fluctuations are instead assumed to be Laplace distributed, then the moving median is statistically optimal.[11] For a given variance, the Laplace distribution places higher probability on rare events than does the normal, which explains why the moving median tolerates shocks better than the moving mean.

When the simple moving median above is central, the smoothing is identical to the median filter, which has applications in, for example, image signal processing.

Moving average regression model


In a moving average regression model, a variable of interest is assumed to be a weighted moving average of unobserved independent error terms; the weights in the moving average are parameters to be estimated.

The moving average and the moving-average model are often confused because of their similar names; although they share many similarities, they are distinct methods used in very different contexts.

See also


References

  1. Hydrologic Variability of the Cosumnes River Floodplain (Booth et al., San Francisco Estuary and Watershed Science, Volume 4, Issue 2, 2006)
  2. Statistical Analysis, Ya-lun Chou, Holt International, 1975, ISBN 0-03-089422-0, section 17.9.
  3. The derivation and properties of the simple central moving average are given in full at Savitzky–Golay filter.
  4. "Weighted Moving Averages: The Basics". Investopedia.
  5. "DEALING WITH MEASUREMENT NOISE - Averaging Filter". Archived from the original on 2010-03-29. Retrieved 2010-10-26.
  6. NIST/SEMATECH e-Handbook of Statistical Methods: Single Exponential Smoothing at the National Institute of Standards and Technology
  7. Spencer's 15-Point Moving Average, from Wolfram MathWorld
  8. Rob J Hyndman. "Moving averages". 2009-11-08. Accessed 2020-08-20.
  9. Aditya Guntuboyina. "Statistics 153 (Time Series) : Lecture Three". 2012-01-24. Accessed 2024-01-07.
  10. "Efficient Running Median using an Indexable Skiplist « Python recipes « ActiveState Code".
  11. G.R. Arce, "Nonlinear Signal Processing: A Statistical Approach", Wiley:New Jersey, US, 2005.