Optimal training set determination

This function determines the optimal training set from a set of candidate individuals.

Usage

optTrain(
  geno,
  cand,
  n.train,
  subpop = NULL,
  test = NULL,
  method = "rScore",
  min.iter = NULL,
  console = TRUE
)

Arguments

geno: A numeric matrix of principal components (rows: individuals; columns: PCs).
cand: An integer vector giving the row indices in the geno matrix of the individuals that are candidates for the training set.
n.train: The size of the target training set. This can be determined with the help of the ssdfgp function provided in this package.
subpop: A character vector of sub-population group names. If left NULL, the algorithm ignores population structure.
test: An integer vector giving the row indices in the geno matrix of the individuals in the test set. If left NULL, the algorithm uses an untargeted method.
method: One of "rScore", "PEV" or "CD". The default is "rScore".
min.iter: The minimum number of iterations, which applies to all methods. Always check whether the algorithm has converged. If left NULL, a minimum is set based on the sizes of the candidate and test sets.
console: Default: TRUE. Set to FALSE to stop the function from printing the iteration count at each iteration.
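
As a rough illustration of how these arguments fit together, the sketch below builds a principal-component matrix with base R's prcomp and splits its rows into candidate and test indices. The simulated marker matrix, the sizes, and the names markers, pcs, cand and test are assumptions made for illustration. Only the optTrain signature comes from the Usage section, and the call is left commented out because it needs the package to be installed.

```r
set.seed(1)

# Hypothetical marker matrix: 60 individuals x 200 markers coded 0/1/2.
markers <- matrix(sample(0:2, 60 * 200, replace = TRUE), nrow = 60)

# Principal components of the centered markers
# (rows: individuals; columns: PCs), the layout expected for geno.
pcs <- prcomp(markers, center = TRUE, scale. = FALSE)$x

# Row indices into pcs: the first 50 individuals are candidates,
# the last 10 form the test set (assumed here not to overlap).
cand <- 1:50
test <- 51:60

# With the package loaded, a targeted search could then look like:
# optTrain(geno = pcs, cand = cand, n.train = 20, test = test, method = "rScore")
```

Leaving test as NULL in the commented call would instead request the untargeted search described above.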

Value

This function returns three pieces of information: OPTtrain (a vector of the chosen optimal training set), TOPscore (the highest scores before each iteration), and ITERscore (the criterion scores of each iteration).
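
Because convergence should always be checked, one possible way to inspect the per-iteration scores is sketched below. It assumes the result is a list whose ITERscore element is a numeric vector, which this page does not state explicitly. The helper has_plateaued, its window and tol arguments, and the mock scores are all hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): TRUE when the last `window`
# scores differ by less than `tol`, i.e. the score trace has flattened out.
has_plateaued <- function(scores, window = 10, tol = 1e-8) {
  if (length(scores) < window) return(FALSE)
  tail_scores <- utils::tail(scores, window)
  diff(range(tail_scores)) < tol
}

# Mock object standing in for optTrain() output. Only the element name
# ITERscore follows the Value section; the numbers are made up.
res <- list(ITERscore = c(0.40, 0.55, 0.61, rep(0.63, 12)))

has_plateaued(res$ITERscore)        # TRUE: the last 10 scores are identical
has_plateaued(res$ITERscore[1:12])  # FALSE: the last 10 scores still change
```

If the trace has not flattened, rerunning with a larger min.iter is one option to consider.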

Author

Jen-Hsiang Ou

Examples

data(geno)
if (FALSE) optTrain(geno, cand = 1:404, n.train = 100)