Package 'coroICA' reference manual

Title:	Confounding Robust Independent Component Analysis for Noisy and Grouped Data
Description:	Contains an implementation of a confounding robust independent component analysis (ICA) for noisy and grouped data. The main function coroICA() performs a blind source separation, by maximizing an independence across sources and allows to adjust for varying confounding based on user-specified groups. Additionally, the package contains the function uwedge() which can be used to approximately jointly diagonalize a list of matrices. For more details see the project website <https://sweichwald.de/coroICA/>.
Authors:	Niklas Pfister and Sebastian Weichwald
Maintainer:	Niklas Pfister <[email protected]>
License:	AGPL-3
Version:	1.0.2
Built:	2025-03-21 03:01:52 UTC
Source:	https://github.com/sweichwald/coroica-r

coroICA

Description

Estimates the unmixing matrix V=A^-1 of a confounded ICA model of the form X=AS+H, where H is confounding noise which is group-wise stationary and S are non-stationary signal sources. The function can also be used without a group-structure (i.e., using a single group) in which it corresponds to a noisy ICA that allows for arbitrary stationary noise H.

Usage

coroICA(X, group_index = NA, partition_index = NA, n_components = NA,
  n_components_uwedge = NA, rank_components = FALSE,
  pairing = "complement", max_matrices = 1, groupsize = 1,
  partitionsize = NA, timelags = NA, instantcov = TRUE,
  max_iter = 1000, tol = 1e-12, minimize_loss = FALSE,
  condition_threshold = NA, silent = TRUE)
coroICA(X, group_index = NA, partition_index = NA, n_components = NA,
  n_components_uwedge = NA, rank_components = FALSE,
  pairing = "complement", max_matrices = 1, groupsize = 1,
  partitionsize = NA, timelags = NA, instantcov = TRUE,
  max_iter = 1000, tol = 1e-12, minimize_loss = FALSE,
  condition_threshold = NA, silent = TRUE)

Arguments

`X`	data matrix. Each column corresponds to one predictor variable.
`group_index`	vector coding to which group each sample belongs, with length(`group_index`)=nrow(`X`). If no group index is provided a rigid grid with `groupsize` samples per group is used (which defaults to all samples if `groupsize` was not set).
`partition_index`	vector coding to which partition each sample belongs, with length(`partition_index`)=nrow(`X`). If no partition index is provided a rigid grid with `partitionsize` samples per partition is used.
`n_components`	number of components to extract. If NA is passed, the same number of components as the input has dimensions is used.
`n_components_uwedge`	number of components to extract during uwedge approximate joint diagonalization of the matrices. If NA is passed, the same number of components as the input has dimensions is used.
`rank_components`	boolean, optional. When TRUE, the components will be ordered in decreasing stability.
`pairing`	either 'complement', 'neighbouring' or 'allpairs'. If 'allpairs' the difference matrices are computed for all pairs of partition covariance matrices, if 'complement' a one-vs-complement scheme is used and if 'neighbouring' differences with the right neighbour parition are used.
`max_matrices`	float or 'no_partitions', optional (default=1). The fraction of (lagged) covariance matrices to use during training or, if 'no_partitions', at most as many covariance matrices are used as there are partitions.
`groupsize`	int, optional. Approximate number of samples in each group when using a rigid grid as groups. If NA is passed, all samples will be in one group unless group_index is passed during fitting in which case the provided group index is used (the latter is the advised and preferred way).
`partitionsize`	int or vector of ints, optional. Approximate number of samples in each partition when using a rigid grid as partition. If NA is passed, a (hopefully sane) default is used, again, unless partition_index is passed during fitting in which case the provided partition index is used. If a vector is passed, each element is used to construct a grid and all resulting partitions are used.
`timelags`	vector of ints, optional. Specifies which timelags should be included. 0 correpsonds to covariance matrix.
`instantcov`	boolean, default TRUE. Specifies whether to include covariance matrix when timelags are used.
`max_iter`	int, optional. Maximum number of iterations for the uwedge approximate joint diagonalisation during fitting.
`tol`	float, optional. Tolerance for terminating the uwedge approximate joint diagonalisation during fitting.
`minimize_loss`	boolean, optional. Parameter is passed to uwedge and specifies whether to compute loss function in each iteration step of uwedge.
`condition_threshold`	float, optional. Parameter is passed to uwedge and specifies whether and at which threshold to terminate uwedge iteration depending on the condition number of the unmixing matrix.
`silent`	boolean whether to supress status outputs.

Details

For further details see the references.

Value

object of class 'CoroICA' consisting of the following elements

`V`	the unmixing matrix.
`coverged`	boolean indicating whether the approximate joint diagonalisation converged due to tol.
`n_iter`	number of iterations of the approximate joint diagonalisation.
`meanoffdiag`	mean absolute value of the off-diagonal values of the to be jointly diagonalised matrices, i.e., a proxy of the approximate joint diagonalisation objective function.

Author(s)

Niklas Pfister and Sebastian Weichwald

References

Pfister, N., S. Weichwald, P. Bühlmann and B. Schölkopf (2018). Robustifying Independent Component Analysis by Adjusting for Group-Wise Stationary Noise ArXiv e-prints (arXiv:1806.01094).

Project website (https://sweichwald.de/coroICA/)

Examples

## Example
set.seed(1)

# Generate data from a block-wise variance model
d <- 2
m <- 10
n <- 5000
group_index <- rep(c(1,2), each=n)
partition_index <- rep(rep(1:m, each=n/m), 2)
S <- matrix(NA, 2*n, d)
H <- matrix(NA, 2*n, d)
for(i in unique(group_index)){
  varH <- abs(rnorm(d))/4
  H[group_index==i, ] <- matrix(rnorm(d*n)*rep(varH, each=n), n, d)
  for(j in unique(partition_index[group_index==i])){
    varS <- abs(rnorm(d))
    index <- partition_index==j & group_index==i
    S[index,] <- matrix(rnorm(d*n/m)*rep(varS, each=n/m),
                                                     n/m, d)
  }
}
A <- matrix(rnorm(d^2), d, d)
A <- A%*%t(A)
X <- t(A%*%t(S+H))

# Apply coroICA
res <- coroICA(X, group_index, partition_index, pairing="allpairs", rank_components=TRUE)

# Compare results
par(mfrow=c(2,2))
plot((S+H)[,1], type="l", main="true source 1", ylab="S+H")
plot(res$Shat[,1], type="l", main="estimated source 1", ylab="Shat")
plot((S+H)[,2], type="l", main="true source 2", ylab="S+H")
plot(res$Shat[,2], type="l", main="estimated source 2", ylab="Shat")
cor(res$Shat, S+H)
## Example
set.seed(1)

# Generate data from a block-wise variance model
d <- 2
m <- 10
n <- 5000
group_index <- rep(c(1,2), each=n)
partition_index <- rep(rep(1:m, each=n/m), 2)
S <- matrix(NA, 2*n, d)
H <- matrix(NA, 2*n, d)
for(i in unique(group_index)){
  varH <- abs(rnorm(d))/4
  H[group_index==i, ] <- matrix(rnorm(d*n)*rep(varH, each=n), n, d)
  for(j in unique(partition_index[group_index==i])){
    varS <- abs(rnorm(d))
    index <- partition_index==j & group_index==i
    S[index,] <- matrix(rnorm(d*n/m)*rep(varS, each=n/m),
                                                     n/m, d)
  }
}
A <- matrix(rnorm(d^2), d, d)
A <- A%*%t(A)
X <- t(A%*%t(S+H))

# Apply coroICA
res <- coroICA(X, group_index, partition_index, pairing="allpairs", rank_components=TRUE)

# Compare results
par(mfrow=c(2,2))
plot((S+H)[,1], type="l", main="true source 1", ylab="S+H")
plot(res$Shat[,1], type="l", main="estimated source 1", ylab="Shat")
plot((S+H)[,2], type="l", main="true source 2", ylab="S+H")
plot(res$Shat[,2], type="l", main="estimated source 2", ylab="Shat")
cor(res$Shat, S+H)

uwedge

Description

Performs an approximate joint matrix diagonalization on a list of matrices. More precisely, for a list of matrices Rx the algorithm finds a matrix V such that for all i V Rx[i] t(V) is approximately diagonal.

Usage

uwedge(Rx, init = NA, Rx0 = NA, return_diag = FALSE, tol = 1e-10,
  max_iter = 1000, n_components = NA, minimize_loss = FALSE,
  condition_threshold = NA, silent = TRUE)
uwedge(Rx, init = NA, Rx0 = NA, return_diag = FALSE, tol = 1e-10,
  max_iter = 1000, n_components = NA, minimize_loss = FALSE,
  condition_threshold = NA, silent = TRUE)

Arguments

`Rx`	list of matrices to be diagaonlized.
`init`	matrix used in first step of initialization. If NA a default based on PCA is used
`Rx0`	matrix used for initial scaling.
`return_diag`	boolean. Specifies whether to return the list of diagonalized matrices.
`tol`	float, optional. Tolerance for terminating the iteration.
`max_iter`	int, optional. Maximum number of iterations.
`n_components`	number of components to extract. If NA is passed, all components are used.
`minimize_loss`	boolean whether to compute loss function in each iteration step and output V with smallest loss over all iterations. Defaults to FALSE since it is computationally more expensive.
`condition_threshold`	float, optional. Stops iteration if condition number of V passes this threshold. Default NA, means no threshold is used.
`silent`	boolean whether to supress status outputs.

Details

For further details see the references.

Value

object of class 'uwedge' consisting of the following elements

`V`	joint diagonalizing matrix.
`Rxdiag`	list of diagonalized matrices.
`converged`	boolean specifying whether the algorithm converged for the given `tol`.
`iterations`	number of iterations of the approximate joint diagonalisation.
`meanoffdiag`	mean absolute value of the off-diagonal values of the to be jointly diagonalised matrices, i.e., a proxy of the approximate joint diagonalisation objective function.

Author(s)

Niklas Pfister and Sebastian Weichwald

References

Pfister, N., S. Weichwald, P. Bühlmann and B. Schölkopf (2018). Robustifying Independent Component Analysis by Adjusting for Group-Wise Stationary Noise ArXiv e-prints (arXiv:1806.01094).

Tichavsky, P. and Yeredor, A. (2009). Fast Approximate Joint Diagonalization Incorporating Weight Matrices. IEEE Transactions on Signal Processing.

Examples

## Example
set.seed(1)

# Generate data 20 matrix that can be jointly diagonalized
d <- 10
A <- matrix(rnorm(d*d), d, d)
A <- A%*%t(A)
Rx <- lapply(1:20, function(x) A %*% diag(rnorm(d)) %*% t(A))

# Perform approximate joint diagonalization
ptm <- proc.time()
res <- uwedge(Rx,
              return_diag=TRUE,
              max_iter=1000)
print(proc.time()-ptm)

# Average value of offdiagonal elements:
print(res$meanoffdiag)
## Example
set.seed(1)

# Generate data 20 matrix that can be jointly diagonalized
d <- 10
A <- matrix(rnorm(d*d), d, d)
A <- A%*%t(A)
Rx <- lapply(1:20, function(x) A %*% diag(rnorm(d)) %*% t(A))

# Perform approximate joint diagonalization
ptm <- proc.time()
res <- uwedge(Rx,
              return_diag=TRUE,
              max_iter=1000)
print(proc.time()-ptm)

# Average value of offdiagonal elements:
print(res$meanoffdiag)

Package 'coroICA'

Help Index

coroICA

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

uwedge

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples