Package 'rvHPDT' reference manual

Title:	Calling Haplotype-Based and Variant-Based Pedigree Disequilibrium Test for Rare Variants in Pedigrees
Description:	To detect rare variants for binary traits in general pedigrees, pedigree disequilibrium tests are proposed that collapse rare haplotypes/variants, with or without weights. Running the test requires MERLIN on Linux for haplotyping.
Authors:	Wei Guo <[email protected]>
Maintainer:	Wei Guo <[email protected]>
License:	GPL (>= 2)
Version:	4.0
Built:	2025-03-04 02:45:45 UTC
Source:	https://github.com/weiguonimh/rvhpdt

Convert data.frame columns from factors to characters

Description

Convert data.frame columns from factors to characters

Usage

  
convert.factors.to.strings.in.dataframe(dataframe)

Arguments

dataframe

dataframe with columns of factors

Value

dataframe

dataframe with columns of characters
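
As an illustration of what such a conversion involves, here is a minimal base-R sketch. It is not the package's source, and the helper name `factors_to_strings` is hypothetical.

```r
# Hypothetical helper (not from rvHPDT): convert every factor column of a
# data.frame to character, leaving all other columns untouched.
factors_to_strings <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.factor(col)) as.character(col) else col
  })
  df
}

df <- data.frame(id = factor(c("a", "b")), x = 1:2)
out <- factors_to_strings(df)
str(out)
```

Assigning into `df[]` keeps the data.frame attributes (row names, column names) while replacing only the column contents.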

Internal function.

Description

Complete the set of family members as required by the MERLIN software.

Internal function.

Description

Calculate PDT statistics by permuting the transmission and non-transmission status for each child based on parents' genotype.

Internal function.

Description

Generate child's genotype by permuting the transmission and non-transmission status based on parents' genotype.
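
The idea of permuting transmission status can be sketched as follows. This is only an illustration under stated assumptions (biallelic variants coded 0/1/2, each parental allele transmitted with probability 1/2); the function name `permute_child_genotype` is hypothetical and the package's internal routine may differ.

```r
# Hypothetical sketch: draw a child's genotype (0/1/2) from parental genotypes
# by randomly choosing which allele each parent transmits.
permute_child_genotype <- function(father, mother) {
  # A parent with genotype g carries g copies of the coding allele out of 2,
  # so a randomly transmitted allele is the coding allele with probability g/2.
  from_father <- rbinom(length(father), size = 1, prob = father / 2)
  from_mother <- rbinom(length(mother), size = 1, prob = mother / 2)
  from_father + from_mother
}

set.seed(1)
child <- permute_child_genotype(father = c(0, 1, 2, 1), mother = c(0, 1, 2, 0))
child
```

Homozygous parents always transmit the same allele, so only heterozygous parents contribute randomness to the permuted genotype.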

Calling haplotype-based and variant-based pedigree disequilibrium test for rare variants in pedigrees.

Description

To detect rare variants for binary traits in general pedigrees, pedigree disequilibrium tests are proposed that collapse rare haplotypes/variants, with or without weights.

Usage

 rhapPDT(ped, map, aff=2, unaff=1, mu=1.04,
merlinFN.prefix="merlin", nperm=1000, trace=TRUE) 

Arguments

`ped`	Input data in the same format as a PLINK PED file, but with column names. The PED file is white-space (space or tab) delimited, and the first six columns are mandatory: FID: Family ID; IID: Individual ID; FA: Paternal ID; MO: Maternal ID; SEX: Sex (1=male; 2=female; other=unknown); PHENO: Phenotype. Genotypes (column 7 onwards) should also be white-space delimited; they are coded as 0, 1 and 2, indicating the number of copies of the coding allele, and NA denotes a missing genotype.
`map`	Input data in the same format as the MAP file required by MERLIN. The MAP file is white-space (space or tab) delimited with 3 columns: CHROMOSOME: chromosome (1-22, X, Y, or 0 if unplaced); MARKER: marker name in the PED file, usually an rs# or SNP identifier; POSITION: genetic distance (morgans). The data file and map file can include different sets of markers, but markers that are absent from the map file will be ignored by MERLIN.
`aff`	Value that represents affected status in the "PHENO" column of the PED data; default is 2.
`unaff`	Value that represents unaffected status in the "PHENO" column of the PED data; default is 1.
`mu`	The mu value that defines causal variants in the training data; default is 1.04.
`merlinFN.prefix`	Prefix from which the names of MERLIN output files are derived. For example, with the default "merlin", estimated haplotypes are stored in a file called merlin.chr.
`nperm`	Number of permutations; default is 1000.
`trace`	Indicates whether intermediate results should be printed; default is TRUE.
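
To make the expected input layout concrete, the following sketch builds a toy `ped` and `map` in R. All values are made up for illustration, and coding founders' parents as "0" is an assumption borrowed from the usual PLINK convention rather than something stated in this manual.

```r
# Toy trio in the described PED layout: six mandatory columns, then genotypes.
ped <- data.frame(
  FID   = c("F1", "F1", "F1"),
  IID   = c("1", "2", "3"),
  FA    = c("0", "0", "1"),   # "0" for founders is an assumed convention
  MO    = c("0", "0", "2"),
  SEX   = c(1, 2, 1),
  PHENO = c(1, 1, 2),         # 2 = affected, 1 = unaffected (the defaults)
  snp1  = c(0, 1, 1),
  snp2  = c(1, 0, NA),        # NA marks a missing genotype
  stringsAsFactors = FALSE
)

# MAP layout: chromosome, marker name, genetic position.
map <- data.frame(
  CHROMOSOME = c(1, 1),
  MARKER     = c("snp1", "snp2"),
  POSITION   = c(0.010, 0.011),
  stringsAsFactors = FALSE
)

# Basic sanity checks before passing the data on.
geno <- as.matrix(ped[, 7:ncol(ped)])
stopifnot(all(geno %in% c(0, 1, 2, NA)))
stopifnot(all(map$MARKER %in% names(ped)))
```

Real data would normally be read with `read.table(..., header=TRUE, stringsAsFactors=FALSE)` so that the column names are preserved.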

Value

`hPDT_v0`	P value of unweighted haplotype PDT test statistic.
`hPDT_v1`	P value of weighted haplotype PDT test statistic.
`rvPDT_v0`	P value of unweighted rvPDT test statistic.
`rvPDT_v1`	P value of weighted rvPDT test statistic.

References

Guo W, Shugart YY. Does Haplotype-based Collapsing Tests Gain More Power than Variant-based Collapsing Tests for Detecting Rare Variants in Pedigrees (manuscript).

Examples

# Not run: requires MERLIN on Linux for haplotyping.
#ped <- read.table("MLIP.ped", header=TRUE, stringsAsFactors=FALSE)
#map <- read.table("MLIP.map", header=TRUE, stringsAsFactors=FALSE)
#test <- rhapPDT(ped, map, trace=TRUE)
#test
#$hPDT_v0
#[1] 0.4231359

#$hPDT_v1
#[1] 0.1481145

#$rvPDT_v0
#[1] 0.03237073

#$rvPDT_v1
#[1] 0.162997

Variants-based pedigree disequilibrium test for rare variants in pedigrees.

Description

To detect rare variants for binary traits in general pedigrees, the pedigree disequilibrium tests are extended by collapsing rare variants, with or without weights.

Usage

 rvPDT.test(seed=NULL,ped, aff=2,unaff=1, snpCol, hfreq=NULL,
 training=0.3, mu=1.28,useFamWeight=TRUE,trace=FALSE)   

Arguments

`seed`	Seed for randomly selecting the training data.
`ped`	Input data in the same format as a PLINK PED file, but with column names. The PED file is white-space (space or tab) delimited, and the first six columns are mandatory: FID: Family ID; IID: Individual ID; FA: Paternal ID; MO: Maternal ID; SEX: Sex (1=male; 2=female; other=unknown); PHENO: Phenotype. Genotypes (column 7 onwards) should also be white-space delimited; they are coded as 0, 1 and 2, indicating the number of copies of the coding allele, and NA denotes a missing genotype.
`aff`	Value that represents affected status in the ped data; default is 2.
`unaff`	Value that represents unaffected status in the ped data; default is 1.
`snpCol`	Columns of the ped data that contain the variants.
`hfreq`	Frequencies of the variants, used in calculating weights; when NULL, the frequencies are estimated from the ped data.
`training`	Proportion of the data used for training; default is 0.3.
`mu`	The mu value that defines causal variants in the training data; default is 1.28.
`useFamWeight`	Indicates whether family weights should be used in the test.
`trace`	Indicates whether intermediate results should be printed; default is FALSE.
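
As a rough illustration of `snpCol` and of estimating variant frequencies from the ped data when `hfreq` is NULL, consider the sketch below. The estimator shown (coding-allele count over twice the number of non-missing genotypes, using all individuals) is an assumption; the package may estimate frequencies differently, for example from founders only.

```r
# Toy ped data (made-up values) with two variant columns after the six
# mandatory columns.
ped <- data.frame(
  FID = "F1", IID = as.character(1:4),
  FA = c("0", "0", "1", "1"), MO = c("0", "0", "2", "2"),
  SEX = c(1, 2, 1, 2), PHENO = c(1, 1, 2, 1),
  snp1 = c(0, 1, 1, 0),
  snp2 = c(1, 0, NA, 0),
  stringsAsFactors = FALSE
)

snpCol <- 7:ncol(ped)                       # variant columns
geno <- as.matrix(ped[, snpCol, drop = FALSE])

# Coding-allele frequency per variant, ignoring missing genotypes.
freq <- colSums(geno, na.rm = TRUE) / (2 * colSums(!is.na(geno)))
freq
```

A vector like `freq` could then be supplied as `hfreq` if externally estimated frequencies are preferred over the in-sample ones.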

Value

`TDT`	Transmission/disequilibrium matrix for each pedigree.
`Sib`	Discordant sib pairs matrix for each pedigree.
`PDT`	Pedigree disequilibrium matrix for each pedigree, which is the sum of TDT and Sib.
`W`	Weights used in Weighted rvPDT test.
`test.v1`	Weighted rvPDT test statistic with weights W.
`test.v0`	Unweighted rvPDT test statistic with weights=1.
`pvalue.v1`	P value of weighted rvPDT test statistic (test.v1).
`pvalue.v0`	P value of unweighted rvPDT test statistic (test.v0).
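
How these pieces could fit together is sketched below. This assumes the standard PDT form (sum of per-pedigree scores divided by the square root of the sum of their squares) applied after collapsing variants with weights; the package's exact statistic is not given in this manual, so `pdt_stat` and the toy matrices are hypothetical.

```r
# Toy per-pedigree matrices: 3 pedigrees (rows) x 2 variants (columns).
TDT <- matrix(c(1, 0, 2,  1,  0, 1), nrow = 3)
Sib <- matrix(c(0, 1, 0,  0, -1, 1), nrow = 3)
PDT <- TDT + Sib                      # PDT is the sum of TDT and Sib

# Hypothetical collapsed statistic: weight variants, sum within pedigree,
# then standardize across pedigrees.
pdt_stat <- function(PDT, W = rep(1, ncol(PDT))) {
  D <- as.vector(PDT %*% W)           # one collapsed score per pedigree
  sum(D) / sqrt(sum(D^2))
}

stat_unweighted <- pdt_stat(PDT)                     # weights = 1
stat_weighted   <- pdt_stat(PDT, W = c(2, 0.5))      # made-up weights
p_unweighted    <- 2 * pnorm(-abs(stat_unweighted))  # two-sided normal p-value
```

Setting all weights to 1 gives the unweighted version, which is why the two statistics share one function in this sketch.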

References

Guo W, Shugart YY. Does Haplotype-based Collapsing Tests Gain More Power than Variant-based Collapsing Tests for Detecting Rare Variants in Pedigrees (manuscript).

Variants-based pedigree disequilibrium test for rare variants in pedigrees.

Description

To detect rare variants for binary traits in general pedigrees, the pedigree disequilibrium tests are extended by collapsing rare variants, with or without weights.

Usage

rvPDT.test.permu(ped, aff=2,unaff=1, snpCol, hfreq=NULL,
useFamWeight=TRUE, nperm=1000,trace=FALSE)    

Arguments

`ped`	Input data in the same format as a PLINK PED file, but with column names. The PED file is white-space (space or tab) delimited, and the first six columns are mandatory: FID: Family ID; IID: Individual ID; FA: Paternal ID; MO: Maternal ID; SEX: Sex (1=male; 2=female; other=unknown); PHENO: Phenotype. Genotypes (column 7 onwards) should also be white-space delimited; they are coded as 0, 1 and 2, indicating the number of copies of the coding allele, and NA denotes a missing genotype.
`aff`	Value that represents affected status in the ped data; default is 2.
`unaff`	Value that represents unaffected status in the ped data; default is 1.
`snpCol`	Columns of the ped data that contain the variants.
`hfreq`	Frequencies of the variants, used in calculating weights; when NULL, the frequencies are estimated from the ped data.
`useFamWeight`	Indicates whether family weights should be used in the test.
`nperm`	Number of permutations; default is 1000.
`trace`	Indicates whether intermediate results should be printed; default is FALSE.

Value

`TDT`	Transmission/disequilibrium matrix for each pedigree.
`Sib`	Discordant sib pairs matrix for each pedigree.
`PDT`	Pedigree disequilibrium matrix for each pedigree, which is the sum of TDT and Sib.
`W`	Weights used in Weighted rvPDT test.
`test.v1`	Weighted rvPDT test statistic with weights W.
`test.v0`	Unweighted rvPDT test statistic with weights=1.
`pvalue.v1`	P value of weighted rvPDT test statistic (test.v1).
`pvalue.v0`	P value of unweighted rvPDT test statistic (test.v0).
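
A permutation P value of this kind is commonly computed by comparing the observed statistic with the statistics from `nperm` permuted data sets. The sketch below uses a two-sided comparison with an add-one correction; both choices are assumptions, and `perm_pvalue` is a hypothetical helper rather than a package function.

```r
# Hypothetical helper: two-sided permutation p-value with add-one correction,
# so the result is never exactly zero.
perm_pvalue <- function(obs, perm) {
  (1 + sum(abs(perm) >= abs(obs))) / (1 + length(perm))
}

set.seed(42)
nperm <- 1000
perm_stats <- rnorm(nperm)   # stand-in for statistics from permuted data
p <- perm_pvalue(obs = 2.5, perm = perm_stats)
p
```

With `nperm=1000` the smallest attainable value under this convention is 1/1001, which bounds the resolution of the reported P values.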

References

Guo W, Shugart YY. Does Haplotype-based Collapsing Tests Gain More Power than Variant-based Collapsing Tests for Detecting Rare Variants in Pedigrees (manuscript).

Internal function.

Description

Internal function of testing rare variants for binary traits using general pedigrees.

Prepare haplotype pairs for hPDT tests in pedigree data.

Description

Before running the hPDT test, haplotype pairs are inferred for all pedigree members by calling MERLIN on Linux, and some internal statistics are then prepared. Requires the R package "gregmisc" and the MERLIN software.

Usage

 
whap.prehap(ped,map, merlinDir="", outFN.prefix="merlin",aff=2,trace=FALSE)  

Arguments

`ped`	Input data in the same format as a PLINK PED file, but with column names. The PED file is white-space (space or tab) delimited, and the first six columns are mandatory: FID: Family ID; IID: Individual ID; FA: Paternal ID; MO: Maternal ID; SEX: Sex (1=male; 2=female; other=unknown); PHENO: Phenotype. Genotypes (column 7 onwards) should also be white-space delimited; they are coded as 0, 1 and 2, indicating the number of copies of the coding allele, and NA denotes a missing genotype.
`map`	Input data in the same format as the MAP file required by MERLIN. The MAP file is white-space (space or tab) delimited with 3 columns: CHROMOSOME: chromosome (1-22, X, Y, or 0 if unplaced); MARKER: marker name in the PED file, usually an rs# or SNP identifier; POSITION: genetic distance (morgans). The data file and map file can include different sets of markers, but markers that are absent from the map file will be ignored by MERLIN.
`merlinDir`	Directory containing MERLIN, for example merlinDir="./Merlin/"; use the default "" when MERLIN is in the current directory or in your bin directory.
`outFN.prefix`	Prefix from which the names of MERLIN output files are derived. For example, with the default "merlin", estimated haplotypes are stored in a file called merlin.chr.
`aff`	Value that represents affected status in the ped data; default is 2.
`trace`	Indicates whether intermediate results should be printed; default is FALSE.
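
To show how `merlinDir` and `outFN.prefix` might combine into a MERLIN call, the sketch below only assembles a command string and does not run it. The options `-d`, `-p`, `-m`, `--best` and `--prefix` are MERLIN command-line options; the input file names and the exact way the package invokes MERLIN are assumptions.

```r
merlinDir    <- ""        # "" when merlin is found in the current or bin directory
outFN.prefix <- "merlin"

# Assumed input file names; the package may write its temporary files elsewhere.
cmd <- paste0(
  merlinDir, "merlin",
  " -d ", outFN.prefix, ".dat",
  " -p ", outFN.prefix, ".ped",
  " -m ", outFN.prefix, ".map",
  " --best --prefix ", outFN.prefix
)
cat(cmd, "\n")
# system(cmd)  # not run: requires MERLIN to be installed
```

Because `merlinDir` is pasted directly in front of the executable name, a non-empty value needs its trailing slash, as in "./Merlin/".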

Value

`SNPname`	Names of the SNPs tested.
`hapData`	Haplotype data for each individual.
`freq`	Estimated frequencies of haplotypes.
`trans`	Transmission matrix of haplotypes.
`hapScore`	Score matrix of haplotypes.

References

Guo W, Shugart YY. Does Haplotype-based Collapsing Tests Gain More Power than Variant-based Collapsing Tests for Detecting Rare Variants in Pedigrees (manuscript).
