runmad {caTools} | R Documentation |
Moving (aka running, rolling) Window MAD (Median Absolute Deviation) calculated over a vector
runmad(x, k, center = runmed(x,k), constant = 1.4826, endrule=c("mad", "NA", "trim", "keep", "constant", "func"), align = c("center", "left", "right"))
x |
numeric vector of length n or matrix with n rows. If |
k |
width of moving window; must be an integer between one and n. In case
of even k's one will have to provide different |
endrule |
character string indicating how the values at the beginning
and the end, of the data, should be treated. Only first and last
Similar to |
center |
moving window center. Defaults
to running median ( |
constant |
scale factor such that for Gaussian
distribution X, |
align |
specifies whether result should be centered (default),
left-aligned or right-aligned. If |
Apart from the end values, the result of y = runmad(x, k) is the same as
“for(j=(1+k2):(n-k2)) y[j]=mad(x[(j-k2):(j+k2)], na.rm = TRUE)
”.
It can handle non-finite numbers like NaN's and Inf's
(like “mad(x, na.rm = TRUE)
”).
The main incentive to write this set of functions was relative slowness of
majority of moving window functions available in R and its packages. With the
exception of runmed
, a running window median function, all
functions listed in "see also" section are slower than very inefficient
“apply(embed(x,k),1,FUN)
” approach.
Functions runquantile
and runmad
are using insertion sort to
sort the moving window, but gain speed by remembering results of the previous
sort. Since each time the window is moved, only one point changes, all but one
points in the window are already sorted. Insertion sort can fix that in O(k)
time.
Returns a numeric vector or matrix of the same size as x
. Only in case of
endrule="trim"
the output vectors will be shorter and output matrices
will have fewer rows.
Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com
About insertion sort used in runmad
function see:
R. Sedgewick (1988): Algorithms. Addison-Wesley (page 99)
Links related to:
runmad
- mad
Other moving window functions from this package: runmin
,
runmax
, runquantile
, runmean
and
runsd
generic running window functions: apply
(embed(x,k), 1, FUN)
(fastest), running
from gtools
package (extremely slow for this purpose), subsums
from
magic library can perform running window operations on data with any
dimensions.
# show runmed function k=25; n=200; x = rnorm(n,sd=30) + abs(seq(n)-n/4) col = c("black", "red", "green") m=runmed(x, k) y=runmad(x, k, center=m) plot(x, col=col[1], main = "Moving Window Analysis Functions") lines(m , col=col[2]) lines(m-y/2, col=col[3]) lines(m+y/2, col=col[3]) lab = c("data", "runmed", "runmed-runmad/2", "runmed+runmad/2") legend(0,0.9*n, lab, col=col, lty=1 ) # basic tests against apply/embed eps = .Machine$double.eps ^ 0.5 k=25 # odd size window a = runmad(x,k, center=runmed(x,k), endrule="trim") b = apply(embed(x,k), 1, mad) stopifnot(all(abs(a-b)<eps)); k=24 # even size window a = runmad(x,k, center=runquantile(x,k,0.5,type=2), endrule="trim") b = apply(embed(x,k), 1, mad) stopifnot(all(abs(a-b)<eps)); # test against loop approach # this test works fine at the R prompt but fails during package check - need to investigate k=24; n=200; x = rnorm(n,sd=30) + abs(seq(n)-n/4) # create random data x = rep(1:5,40) #x[seq(1,n,11)] = NaN; # commented out for time beeing - on to do list #x[5] = NaN; # commented out for time beeing - on to do list k2 = k k1 = k-k2-1 ac = array(runquantile(x,k,0.5)) a = runmad(x, k, center=ac) bc = array(0,n) b = array(0,n) for(j in 1:n) { lo = max(1, j-k1) hi = min(n, j+k2) bc[j] = median(x[lo:hi], na.rm = TRUE) b [j] = mad (x[lo:hi], na.rm = TRUE, center=bc[j]) } eps = .Machine$double.eps ^ 0.5 #stopifnot(all(abs(ac-bc)<eps)); # commented out for time beeing - on to do list #stopifnot(all(abs(a-b)<eps)); # commented out for time beeing - on to do list # compare calculation at array ends k=25; n=200; x = rnorm(n,sd=30) + abs(seq(n)-n/4) c = runquantile(x,k,0.5,type=2) # find the center a = runmad(x, k, center=c, endrule="mad" ) # fast C code b = runmad(x, k, center=c, endrule="func") # slow R code stopifnot(all(abs(a-b)<eps)); # test if moving windows forward and backward gives the same results k=51; a = runmad(x , k) b = runmad(x[n:1], k) stopifnot(all(a[n:1]==b, na.rm=TRUE)); # test vector vs. matrix inputs, especially for the edge handling nRow=200; k=25; nCol=10 x = rnorm(nRow,sd=30) + abs(seq(nRow)-n/4) X = matrix(rep(x, nCol ), nRow, nCol) # replicate x in columns of X a = runmad(x, k, center = runquantile(x,k,0.5,type=2)) b = runmad(X, k, center = runquantile(X,k,0.5,type=2)) stopifnot(all(abs(a-b[,1])<eps)); # vector vs. 2D array stopifnot(all(abs(b[,1]-b[,nCol])<eps)); # compare rows within 2D array # speed comparison ## Not run: x=runif(1e5); k=51; # reduce vector and window sizes system.time(runmad( x,k,endrule="trim")) system.time(apply(embed(x,k), 1, mad)) ## End(Not run)