Title: | Level-Dependent Cross-Validation Thresholding |
---|---|
Description: | The level-dependent cross-validation method is implemented for the selection of thresholding value in wavelet shrinkage. This procedure is implemented by coupling a conventional cross validation with an imputation method due to a limitation of data length, a power of 2. It can be easily applied to classical leave-one-out and k-fold cross validation. Since the procedure is computationally fast, a level-dependent cross validation can be performed for wavelet shrinkage of various data such as a data with correlated errors. |
Authors: | Donghoh Kim <[email protected]>, Hee-Seok Oh <[email protected]> |
Maintainer: | Donghoh Kim <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1.2 |
Built: | 2024-11-12 02:54:13 UTC |
Source: | https://github.com/cran/CVThresh |
This function performs imputation for test dataset of cross-validation given test dataset index and initial values.
cvimpute.by.wavelet(y, impute.index, impute.tol=0.1^3, impute.maxiter=100, impute.vscale="independent", filter.number=10, family="DaubLeAsymm", ll=3)
cvimpute.by.wavelet(y, impute.index, impute.tol=0.1^3, impute.maxiter=100, impute.vscale="independent", filter.number=10, family="DaubLeAsymm", ll=3)
y |
observation |
impute.index |
test dataset index for cross-validation |
impute.tol |
tolerance for imputation |
impute.maxiter |
maximum iteration for imputation |
impute.vscale |
specifies whether variance is adjusted level-by-level or not. “level" or “independent" |
filter.number |
specifies the smoothness of wavelet in the decomposition (argument of WaveThresh) |
family |
specifies the family of wavelets “DaubExPhase" or “DaubLeAsymm" (argument of WaveThresh) |
ll |
specifies the lowest level to be thresholded |
In wavelet context, test dataset of cross-validation is missing values. Based on h-likelihood concept and penalized least squares, this function performs imputation by wavelet for missing dataset, given the missing dataset. Lee and Nelder (1996, 2001) introduced the hierarchical likelihood as an extended likelihood for general models that include unobserved random variables such as missing. Following Lee and Nelder (1996, 2001), treat the missing values as random parameters and it has been known that a wavelet shrinkage estimator can be formulated by penalized least squares problem (Antoniadis and Fan, 2001). This arguments lead to the iterative algorithm for imputing the missing values based on wavelet shrinkage.
Imputed values according to cross-validation scheme.
Antoniadis, A. and Fan, J. (2001) Regularization of wavelet approximations. Journal of the American Statistical Association, 96, 939–962.
Lee, Y. and Nelder, J.A. (1996) Hierarchical generalised linear models (with discussion). Journal of the Royal Statistical Society Ser. B, 58, 619–678.
Lee, Y. and Nelder, J.A. (2001) Hierarchical generalised linear models: A synthesis of generalised linear models, random-effect models and structured dispersions. Biometrika, 88, 987–1006.
cvwavelet
, cvtype
, cvwavelet.after.impute
.
# 8-fold cross-validation scheme with block size 2 set.seed(1) cv.index <- cvtype(n=1024, cv.bsize=2, cv.kfold=8, cv.random=TRUE)$cv.index # Generate 1024 observation from Heavisine function snr <- 5 testdata <- heav(1024) x <- testdata$x y <- testdata$meanf + rnorm(1024, 0, testdata$sdf / snr) # Impute by wavelet yimpute <- cvimpute.by.wavelet(y=y, impute.index=cv.index)$yimpute # Compare imputed values and observations par(mar=0.1+c(4,4,2,1)) plot(y, yimpute, xlab="Observations", ylab="Imputed Values", main="Piecewise Polynomial", cex=0.5);abline(0,1)
# 8-fold cross-validation scheme with block size 2 set.seed(1) cv.index <- cvtype(n=1024, cv.bsize=2, cv.kfold=8, cv.random=TRUE)$cv.index # Generate 1024 observation from Heavisine function snr <- 5 testdata <- heav(1024) x <- testdata$x y <- testdata$meanf + rnorm(1024, 0, testdata$sdf / snr) # Impute by wavelet yimpute <- cvimpute.by.wavelet(y=y, impute.index=cv.index)$yimpute # Compare imputed values and observations par(mar=0.1+c(4,4,2,1)) plot(y, yimpute, xlab="Observations", ylab="Imputed Values", main="Piecewise Polynomial", cex=0.5);abline(0,1)
This function performs imputation for two-dimensional test dataset of cross-validation given test dataset index and initial values.
cvimpute.image.by.wavelet(images, impute.index1, impute.index2, impute.tol=0.1^3, impute.maxiter=100, filter.number=2, ll=3)
cvimpute.image.by.wavelet(images, impute.index1, impute.index2, impute.tol=0.1^3, impute.maxiter=100, filter.number=2, ll=3)
images |
noisy image |
impute.index1 |
test dataset row index according to cross-validation scheme |
impute.index2 |
test dataset column index according to cross-validation scheme |
impute.tol |
tolerance for imputation |
impute.maxiter |
maximum iteration for imputation |
filter.number |
specifies the smoothness of wavelet in the decomposition (argument of WaveThresh) |
ll |
specifies the lowest level to be thresholded |
In wavelet context, test dataset of cross-validation is missing values. Based on h-likelihood concept and penalized least squares, this function performs imputation by wavelet for missing dataset, given the missing dataset. Lee and Nelder (1996, 2001) introduced the hierarchical likelihood as an extended likelihood for general models that include unobserved random variables such as missing. Following Lee and Nelder (1996, 2001), treat the missing values as random parameters and it has been known that a wavelet shrinkage estimator can be formulated by penalized least squares problem (Antoniadis and Fan, 2001). This arguments lead to the iterative algorithm for imputing the missing values based on wavelet shrinkage.
Imputed values according to cross-validation scheme.
Antoniadis, A. and Fan, J. (2001) Regularization of wavelet approximations. Journal of the American Statistical Association, 96, 939–962.
Lee, Y. and Nelder, J.A. (1996) Hierarchical generalised linear models (with discussion). Journal of the Royal Statistical Society Ser. B, 58, 619–678.
Lee, Y. and Nelder, J.A. (2001) Hierarchical generalised linear models: A synthesis of generalised linear models, random-effect models and structured dispersions. Biometrika, 88, 987–1006.
cvtype.image
, cvwavelet
, cvimpute.by.wavelet
,
cvwavelet.after.impute
, cvwavelet.image
,
This package carries out level-dependent cross-validation method for the selection of thresholding value in wavelet shrinkage. This procedure is implemented by coupling a conventional cross validation with an imputation method due to a limitation of data length, a power of 2. It can be easily applied to classical leave-one-out and k-fold cross validation. Since the procedure is computationally fast, a level-dependent cross validation can be performed for wavelet shrinkage of various data such as a data with correlated errors.
This function generates test dataset index for cross-validation.
cvtype(n, cv.bsize=1, cv.kfold, cv.random=TRUE)
cvtype(n, cv.bsize=1, cv.kfold, cv.random=TRUE)
n |
the number of observation |
cv.bsize |
block size of cross-validation |
cv.kfold |
the number of fold of cross-validation |
cv.random |
whether or not random cross-validation scheme should be used. Set cv.random=TRUE for random cross-validation scheme |
This function provides index of test dataset according to various cross-validation scheme.
One may construct K test datasets in a way that each testset consists of blocks of b
consecutive data. Set cv.bsize = b
for this.
To select each fold at random, set cv.random = TRUE
.
matrix of which row is test dataset index for cross-validation.
cvwavelet
,
cvimpute.by.wavelet
,
cvwavelet.after.impute
.
# Traditional 4-fold cross-validation for 100 observations cvtype(n=100, cv.bsize=1, cv.kfold=4, cv.random=FALSE) # Random 4-fold cross-validation with block size 2 for 100 observations cvtype(n=100, cv.bsize=2, cv.kfold=4, cv.random=TRUE)
# Traditional 4-fold cross-validation for 100 observations cvtype(n=100, cv.bsize=1, cv.kfold=4, cv.random=FALSE) # Random 4-fold cross-validation with block size 2 for 100 observations cvtype(n=100, cv.bsize=2, cv.kfold=4, cv.random=TRUE)
This function generates test dataset index of two-dimensional data for cross-validation
cvtype.image(n, cv.bsize=c(1,1), cv.kfold)
cvtype.image(n, cv.bsize=c(1,1), cv.kfold)
n |
the size of image |
cv.bsize |
two-dimensional block size of cross-validation |
cv.kfold |
the number of fold of cross-validation |
This function provides indexes of two-dimensional cross-validation scheme. Only random cross-validation scheme is provided.
Two matrix representing test dataset index of each dimension for cross-validation.
cv.index1 |
each row is test dataset index of one dimension |
cv.index2 |
each row is test dataset index of the other dimension |
cvtype
, cvwavelet
, cvimpute.by.wavelet
,
cvwavelet.after.impute
, cvwavelet.image
,
cvimpute.image.by.wavelet
, cvwavelet.image.after.impute
# Two-dimensional 4-fold cross-validation with block size 2*2 out <- cvtype.image(n=c(256,256), cv.bsize=c(2,2), cv.kfold=4)
# Two-dimensional 4-fold cross-validation with block size 2*2 out <- cvtype.image(n=c(256,256), cv.bsize=c(2,2), cv.kfold=4)
This function reconstructs the noise data by level-dependent cross-validation wavelet shrinkage.
cvwavelet(y=y, ywd=ywd, cv.optlevel, cv.bsize=1, cv.kfold, cv.random=TRUE, cv.tol=0.1^3, cv.maxiter=100, impute.vscale="independent", impute.tol=0.1^3, impute.maxiter=100, filter.number=10, family="DaubLeAsymm", thresh.type ="soft", ll=3)
cvwavelet(y=y, ywd=ywd, cv.optlevel, cv.bsize=1, cv.kfold, cv.random=TRUE, cv.tol=0.1^3, cv.maxiter=100, impute.vscale="independent", impute.tol=0.1^3, impute.maxiter=100, filter.number=10, family="DaubLeAsymm", thresh.type ="soft", ll=3)
y |
observation |
ywd |
DWT object |
cv.optlevel |
thresholding levels |
cv.bsize |
block size of cross-validation |
cv.kfold |
the number of fold of cross-validation |
cv.random |
whether or not random cross-validation scheme should be used. Set cv.random=TRUE for random cross-validation scheme |
cv.tol |
tolerance for cross-validation |
cv.maxiter |
maximum iteration for cross-validation |
impute.vscale |
specifies whether variance is adjusted level-by-level or not. “level" or “independent" |
impute.tol |
tolerance for imputation |
impute.maxiter |
maximum iteration for imputation |
filter.number |
specifies the smoothness of wavelet in the decomposition (argument of WaveThresh) |
family |
specifies the family of wavelets “DaubExPhase" or “DaubLeAsymm" (argument of WaveThresh) |
thresh.type |
specifies the type of thresholding “hard" or “soft" (argument of WaveThresh) |
ll |
specifies the lowest level to be thresholded |
This function performs level-dependent cross-validation wavelet shrinkage.
y |
observations |
yimpute |
imputed values by provided cross-validation scheme |
yc |
reconstruction by level-dependent cross-validation wavelet shrinkage |
cvthresh |
threshold values by level-dependent cross-validation |
cvtype
, cvimpute.by.wavelet
, cvwavelet.after.impute
.
data(ipd) y <- as.numeric(ipd); n <- length(y); nlevel <- log2(n) ywd <- wd(y) #out <- cvwavelet(y=y, ywd=ywd, cv.optlevel=c(3:(nlevel-1)), # cv.bsize=2, cv.kfold=4) #ts.plot(ts(out$yc, start=1229.98, deltat=0.02, frequency=50), # main="Level-dependent Cross Validation", xlab = "Seconds", ylab="")
data(ipd) y <- as.numeric(ipd); n <- length(y); nlevel <- log2(n) ywd <- wd(y) #out <- cvwavelet(y=y, ywd=ywd, cv.optlevel=c(3:(nlevel-1)), # cv.bsize=2, cv.kfold=4) #ts.plot(ts(out$yc, start=1229.98, deltat=0.02, frequency=50), # main="Level-dependent Cross Validation", xlab = "Seconds", ylab="")
This function performs level-dependent cross-validation wavelet shrinkage given the cross-validation scheme and imputation values.
cvwavelet.after.impute(y, ywd, yimpute, cv.index, cv.optlevel, cv.tol=0.1^3, cv.maxiter=100, filter.number=10, family="DaubLeAsymm", thresh.type="soft", ll=3)
cvwavelet.after.impute(y, ywd, yimpute, cv.index, cv.optlevel, cv.tol=0.1^3, cv.maxiter=100, filter.number=10, family="DaubLeAsymm", thresh.type="soft", ll=3)
y |
observation |
ywd |
DWT object |
yimpute |
imputed values according to cross-validation scheme |
cv.index |
test dataset index according to cross-validation scheme |
cv.optlevel |
thresholding levels |
cv.tol |
tolerance for cross-validation |
cv.maxiter |
maximum iteration for cross-validation |
filter.number |
specifies the smoothness of wavelet in the decomposition (argument of WaveThresh) |
family |
specifies the family of wavelets “DaubExPhase" or “DaubLeAsymm" (argument of WaveThresh) |
thresh.type |
specifies the type of thresholding “hard" or “soft" (argument of WaveThresh) |
ll |
specifies the lowest level to be thresholded |
Calculating the threshold values and reconstructing noisy data , given the index of each testdata,
imputed values according to cross-validation scheme and discrete wavelet transform of
.
Reconstruction and thresholding values by level-dependent cross-validation
yc |
reconstruction |
cvthresh |
thresholding values by level-dependent cross-validation |
cvwavelet
, cvtype
, cvimpute.by.wavelet
.
data(ipd) y <- as.numeric(ipd); n <- length(y); nlevel <- log2(n) set.seed(1) cv.index <- cvtype(n=n, cv.bsize=2, cv.kfold=4, cv.random=TRUE)$cv.index yimpute <- cvimpute.by.wavelet(y=y, impute.index=cv.index)$yimpute ywd <- wd(y) #out <- cvwavelet.after.impute(y=y, ywd=ywd, yimpute=yimpute, #cv.index=cv.index, cv.optlevel=c(3:(nlevel-1))) #ts.plot(ts(out$yc, start=1229.98, deltat=0.02, frequency=50), # main="Level-dependent Cross Validation", xlab = "Seconds", ylab="") ##### Specifying thresholding structure # cv.optlevel <- c(3) # Threshold (level 3 to finest level) at the same time. # cv.optlevel <- c(3, 5) # Threshold two groups of resolution levels, # (level 3, 4) and (level 5 to finest level). # cv.optlevel <- c(3,4,5,6,7,8) # Threshold each resolution level 3 to 8.
data(ipd) y <- as.numeric(ipd); n <- length(y); nlevel <- log2(n) set.seed(1) cv.index <- cvtype(n=n, cv.bsize=2, cv.kfold=4, cv.random=TRUE)$cv.index yimpute <- cvimpute.by.wavelet(y=y, impute.index=cv.index)$yimpute ywd <- wd(y) #out <- cvwavelet.after.impute(y=y, ywd=ywd, yimpute=yimpute, #cv.index=cv.index, cv.optlevel=c(3:(nlevel-1))) #ts.plot(ts(out$yc, start=1229.98, deltat=0.02, frequency=50), # main="Level-dependent Cross Validation", xlab = "Seconds", ylab="") ##### Specifying thresholding structure # cv.optlevel <- c(3) # Threshold (level 3 to finest level) at the same time. # cv.optlevel <- c(3, 5) # Threshold two groups of resolution levels, # (level 3, 4) and (level 5 to finest level). # cv.optlevel <- c(3,4,5,6,7,8) # Threshold each resolution level 3 to 8.
This function reconstructs image by level-dependent cross-validation wavelet shrinkage.
cvwavelet.image(images, imagewd, cv.optlevel, cv.bsize=c(1,1), cv.kfold, cv.tol=0.1^3, cv.maxiter=100, impute.tol=0.1^3, impute.maxiter=100, filter.number=2, ll=3)
cvwavelet.image(images, imagewd, cv.optlevel, cv.bsize=c(1,1), cv.kfold, cv.tol=0.1^3, cv.maxiter=100, impute.tol=0.1^3, impute.maxiter=100, filter.number=2, ll=3)
images |
noisy image |
imagewd |
two-dimensional wavelet transform |
cv.optlevel |
thresholding level |
cv.bsize |
block size of cross-validation |
cv.kfold |
the number of fold of cross-validation |
cv.tol |
tolerance for cross-validation |
cv.maxiter |
maximum iteration for cross-validation |
impute.tol |
tolerance for imputation |
impute.maxiter |
maximum iteration for imputation |
filter.number |
specifies the smoothness of wavelet in the decomposition (argument of WaveThresh) |
ll |
specifies the lowest level to be thresholded |
This function performs level-dependent cross-validation wavelet shrinkage for two-dimensional data.
imagecv |
reconstruction of image by level-dependent cross-validation wavelet shrinkage |
cvthresh |
threshold values by level-dependent cross-validation |
cvtype.image
, cvimpute.image.by.wavelet
, cvwavelet.image.after.impute
.
# Generate Noisy Lennon Image data(lennon) sdimage <- sd(as.numeric(lennon)) nlennon <- ncol(lennon); nlevel <- log2(ncol(lennon)) optlevel <- c(3:(nlevel-1)) set.seed(55) lennonnoise <- lennon+matrix(rnorm(nlennon^2, 0, sdimage), nlennon, nlennon) # Level-dependent Cross-validation Thresholding lennonwd <- imwd(lennonnoise) #lennoncv <- cvwavelet.image(images=lennonnoise, imagewd=lennonwd, # cv.optlevel=optlevel, cv.bsize=c(1,1), cv.kfold=10)$imagecv #image(lennoncv, axes=FALSE, col=gray(100:0/100), # main="Level-dependent CV")
# Generate Noisy Lennon Image data(lennon) sdimage <- sd(as.numeric(lennon)) nlennon <- ncol(lennon); nlevel <- log2(ncol(lennon)) optlevel <- c(3:(nlevel-1)) set.seed(55) lennonnoise <- lennon+matrix(rnorm(nlennon^2, 0, sdimage), nlennon, nlennon) # Level-dependent Cross-validation Thresholding lennonwd <- imwd(lennonnoise) #lennoncv <- cvwavelet.image(images=lennonnoise, imagewd=lennonwd, # cv.optlevel=optlevel, cv.bsize=c(1,1), cv.kfold=10)$imagecv #image(lennoncv, axes=FALSE, col=gray(100:0/100), # main="Level-dependent CV")
This function performs level-dependent cross-validation wavelet shrinkage for two-dimensional data given the cross-validation scheme and imputation values.
cvwavelet.image.after.impute(images, imagewd, imageimpute, cv.index1=cv.index1, cv.index2=cv.index2, cv.optlevel=cv.optlevel, cv.tol=cv.tol, cv.maxiter=cv.maxiter, filter.number=2, ll=3)
cvwavelet.image.after.impute(images, imagewd, imageimpute, cv.index1=cv.index1, cv.index2=cv.index2, cv.optlevel=cv.optlevel, cv.tol=cv.tol, cv.maxiter=cv.maxiter, filter.number=2, ll=3)
images |
noisy image |
imagewd |
two-dimensional wavelet transform |
imageimpute |
two-dimensional imputed values according to cross-validation scheme |
cv.index1 |
test dataset row index according to cross-validation scheme |
cv.index2 |
test dataset column index according to cross-validation scheme |
cv.optlevel |
thresholding levels |
cv.tol |
tolerance for cross-validation |
cv.maxiter |
maximum iteration for cross-validation |
filter.number |
specifies the smoothness of wavelet in the decomposition (argument of WaveThresh) |
ll |
specifies the lowest level to be thresholded |
Calculating thresholding values and reconstructing noisy image given cross-validation scheme and imputation.
Reconstruction of images and thresholding values by level-dependent cross-validation
imagecv |
reconstruction of images |
cvthresh |
thresholding values by level-dependent cross-validation |
cvwavelet.image
, cvtype.image
, cvimpute.image.by.wavelet
.
This function generates Doppler function values for equally spaced points in
.
dopp(norx=1024)
dopp(norx=1024)
norx |
the number of data or x values in [0, 1] |
Doppler function is introduced by Donoho and Johnstone (1994) and is useful test function evaluating a wavelet shrinkage method.
Doppler function values and its variability
where
.
Donoho, D.L. and Johnstone, I.M. (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81, 425–455.
testdopp <- dopp(1024) plot(testdopp$x, testdopp$meanf, xlab="", ylab="", main="Plot of Doppler function", type="l")
testdopp <- dopp(1024) plot(testdopp$x, testdopp$meanf, xlab="", ylab="", main="Plot of Doppler function", type="l")
This function generates fg1 function values for equally spaced points in
.
fg1(norx=1024)
fg1(norx=1024)
norx |
the number of data or x values in [0, 1] |
A smooth function fg1 is introduced by Fan and Gijbels (1995) and is useful test function evaluating a wavelet shrinkage method.
fg1 function values and its variability
where
.
Fan, J. and Gijbels, I. (1995) Data-driven bandwidth selection in local polynomial fitting: Variable bandwidth and spatial adaptation. Journal of the Royal Statistical Society Ser. B 57, 371–394.
testfg1 <- fg1(1024) plot(testfg1$x, testfg1$meanf, xlab="", ylab="", main="Plot of fg1 function", type="l")
testfg1 <- fg1(1024) plot(testfg1$x, testfg1$meanf, xlab="", ylab="", main="Plot of fg1 function", type="l")
This function generates Heavisine function values for equally spaced points in
.
heav(norx=1024)
heav(norx=1024)
norx |
the number of data or x values in [0, 1] |
Heavisine function is introduced by Donoho and Johnstone (1994) and is useful test function evaluating a wavelet shrinkage method.
Heavisine function values and its variability
where
.
Donoho, D.L. and Johnstone, I.M. (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81, 425–455.
testheav <- heav(1024) plot(testheav$x, testheav$meanf, xlab="", ylab="", main="Plot of Heavisine function", type="l")
testheav <- heav(1024) plot(testheav$x, testheav$meanf, xlab="", ylab="", main="Plot of Heavisine function", type="l")
4,096 observations of inductance plethysmography data regularly sampled at 50Hz starting at 1229.98 seconds.
data(ipd)
data(ipd)
time series.
This data set contains 4,096 observations of inductance plethysmography data regularly sampled at 50Hz starting at 1229.98 seconds. The data were collected in an investigation of the recovery of patients after general anesthesia.
The data set was used in Nason (1996) to illustrate cross-validation method for threshold selection. See the reference; Nason, G.P. (1996) Wavelet shrinkage by cross-validation. Journal of the Royal Statistical Society Ser. B 58, 463–479.
This function generates Piecewise polynomial function values for equally spaced points in
.
ppoly(norx=1024)
ppoly(norx=1024)
norx |
the number of data or x values in [0, 1] |
Piecewise polynomial function with the discontinuity is introduced by Nason and Silverman (1994) and is useful test function evaluating a wavelet shrinkage method.
Piecewise polynomial function values and its variability
where
.
Nason, G.P. and Silverman, B.W. (1994) The discrete wavelet transform in S. Journal of Computational and Graphical Statistics, 3, 163–191.
testpoly <- ppoly(1024) plot(testpoly$x, testpoly$meanf, xlab="", ylab="", main="Plot of Piecewise polynomial function", type="l")
testpoly <- ppoly(1024) plot(testpoly$x, testpoly$meanf, xlab="", ylab="", main="Plot of Piecewise polynomial function", type="l")