Title: | QTL Analysis in Autopolyploid Bi-Parental F1 Populations |
---|---|
Description: | Quantitative trait loci (QTL) analysis and exploration of meiotic patterns in autopolyploid bi-parental F1 populations. For all ploidy levels, identity-by-descent (IBD) probabilities can be estimated. Significance thresholds, exploring QTL allele effects and visualising results are provided. For more background and to reference the package see <doi:10.1093/bioinformatics/btab574>. |
Authors: | Peter Bourke [aut, cre], Christine Hackett [ctb], Chris Maliepaard [ctb], Geert van Geest [ctb], Roeland Voorrips [ctb], Johan Willemsen [ctb] |
Maintainer: | Peter Bourke <[email protected]> |
License: | GPL-3 |
Version: | 0.1.1 |
Built: | 2024-11-02 04:59:28 UTC |
Source: | https://github.com/cran/polyqtlR |
Calculation of BLUEs from a data frame of genotype names and phenotypes (assuming repeated measurements), using the nlme package
BLUE(data, model, random, genotype.ID)
data |
Data frame of genotype codes and corresponding phenotypes |
model |
The model specification of fixed terms, e.g. Yield ~ Clones |
random |
The random component of the model (repeat structure, can be nested), e.g. ~1 | Blocks if only Blocks are used |
genotype.ID |
The colname used to describe genotypes, e.g. "Clones" |
A data-frame with columns "geno" for the genotype names, and "blue" for the BLUEs.
data("Phenotypes_4x")
blue <- BLUE(data = Phenotypes_4x, model = pheno ~ geno, random = ~1 | year, genotype.ID = "geno")
Best Linear Unbiased Estimates of phenotype
BLUEs.pheno
An object of class data.frame with 50 rows and 2 columns.
The function check_cofactors initially fits all significant QTL positions as co-factors, both individually and in combination. Significance thresholds are re-estimated each time, yielding threshold-corrected LOD scores. If this leads to a change in the estimated position of QTL, or detection of subsequent peaks, a second round of co-factor inclusion is performed for all new QTL or novel QTL combinations. Finally, the multi-QTL model that maximises the individual significance of each QTL is returned as a data.frame. This can be directly passed to the function PVE to estimate the percentage variance explained by the full multi-QTL model and all possible sub-models.
Note: this function estimates the most likely QTL positions by maximising the threshold-corrected LOD at QTL peaks. Non-additive interactions between QTL may be missed as a result. It is recommended to run a manual co-factor analysis as well, as described in the package vignette.
check_cofactors( IBD_list, Phenotype.df, genotype.ID, trait.ID, LOD_data = NULL, min_res = 20, test_full_model = FALSE, verbose = TRUE, ... )
IBD_list |
List of IBD_probabilities as estimated using one of the various methods available (e.g. |
Phenotype.df |
A data.frame containing phenotypic values |
genotype.ID |
The colname of |
trait.ID |
The colname of |
LOD_data |
Output of |
min_res |
The minimum genetic distance (resolution) assumed possible to consider 2 linked QTL (on the same linkage group) as independent. By default a value of 20 cM is used. This is not to suggest that 20 cM is a realistic resolution in a practical mapping study, but it provides the function with a criterion to consider 2 significant QTL within this distance as one and the same. For this purpose, 20 cM seems a reasonable value to use. In practice, closely linked QTL will generally "explain" all the variation at nearby positions, making it unlikely to be able to disentangle their effects. QTL positions will vary slightly when co-factors are introduced, but again this variation is presumed not to exceed 20 cM either side. |
test_full_model |
By default |
verbose |
Logical, by default |
... |
Option to pass extra arguments to |
Data frame with the following columns:
Linkage group identifier
CentiMorgan position
The difference between the LOD score at the peak and the significance threshold (always positive, otherwise the QTL would not be significant)
An identifier giving the co-factor model used in detecting the QTL (NA if no co-factors were included). The co-factor model is described by concatenating all co-factor positions with a '+', so for example 1_10+4_20 would mean a co-factor model with 2 positions included as co-factors, namely 10 cM on linkage group 1 and 20 cM on linkage group 4.
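The co-factor model identifier described above can be split back into its component positions. The following is a minimal base R sketch; the helper name parse_cofactor_id and the output column names LG and cM are hypothetical and are not part of polyqtlR.

```r
# Hypothetical helper (not part of polyqtlR): split a co-factor model
# identifier such as "1_10+4_20" into linkage group and cM position.
parse_cofactor_id <- function(id) {
  if (is.na(id)) return(NULL)  # NA means no co-factors were included
  terms <- strsplit(id, "+", fixed = TRUE)[[1]]   # one term per co-factor
  parts <- strsplit(terms, "_", fixed = TRUE)     # "LG_cM" -> c("LG", "cM")
  data.frame(
    LG = as.numeric(vapply(parts, `[`, character(1), 1)),
    cM = as.numeric(vapply(parts, `[`, character(1), 2))
  )
}

cofactors <- parse_cofactor_id("1_10+4_20")
print(cofactors)
```

Here the two rows of the result correspond to 10 cM on linkage group 1 and 20 cM on linkage group 4.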
data("IBD_4x", "BLUEs.pheno", "qtl_LODs.4x")
check_cofactors(IBD_list = IBD_4x, Phenotype.df = BLUEs.pheno, genotype.ID = "Geno", trait.ID = "BLUE", LOD_data = qtl_LODs.4x)
Convert MAPpoly.map object into a phased maplist, needed for IBD estimation
convert_mappoly_to_phased.maplist(mappoly_object)
mappoly_object |
An object of class 'mappoly.map', for example output of the function |
A phased.maplist, with linkage group names LG1 etc. Each list item is a data.frame with columns marker, position followed by the phased map, coded in 1 and 0 for presence/absence of SNP (alternative) allele on parental homologues (h) numbered 1:ploidy for parent 1 and ploidy + 1 : 2*ploidy for parent 2.
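To make the layout described above concrete, here is a toy tetraploid example built by hand in base R. The marker names, positions and phasing are invented, and the homologue column names h1 to h8 are an assumption based on the description; the converted object may differ in detail.

```r
ploidy <- 4

# Toy phased map for one linkage group (all values invented for illustration).
# Columns h1..h4 are the homologues of parent 1, h5..h8 those of parent 2;
# 1 = SNP (alternative) allele present on that homologue, 0 = absent.
LG1 <- data.frame(
  marker   = c("m1", "m2", "m3"),
  position = c(0, 12.5, 30.1),
  h1 = c(1, 0, 0), h2 = c(0, 0, 1), h3 = c(0, 1, 0), h4 = c(0, 0, 0),
  h5 = c(0, 0, 1), h6 = c(0, 1, 0), h7 = c(0, 0, 0), h8 = c(1, 0, 0)
)
toy_maplist <- list(LG1 = LG1)

# Parental dosages follow from summing over each parent's homologues
p1_dosage <- rowSums(LG1[, paste0("h", 1:ploidy)])
p2_dosage <- rowSums(LG1[, paste0("h", (ploidy + 1):(2 * ploidy))])
```

Summing the phase columns per parent is a quick consistency check against the parental dosage scores.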
## Not run:
library("mappoly")
phased.maplist <- convert_mappoly_to_phased.maplist(maps.hexafake)
## End(Not run)
The function count_recombinations returns a list of all predicted recombination breakpoints. The output can be passed using the argument recombination_data to the function visualiseHaplo, where the predicted breakpoints overlay the haplotypes. Alternatively, a genome-wide visualisation of the recombination landscape both per linkage group and per individual can be generated using the function plotRecLS, which can be useful in identifying problematic areas of the linkage maps, or problematic individuals in the population. Currently, recombination break-points are only estimated from bivalents in meiosis; any offspring resulting from a predicted multivalent is excluded from the analysis and will be returned with an NA value.
count_recombinations(IBD_list, plausible_pairing_prob = 0.3)
IBD_list |
List of IBD_probabilities as estimated using one of the various methods available (e.g. |
plausible_pairing_prob |
The minimum probability of a pairing configuration needed to analyse an individual's IBD data.
The default setting of 0.3 accommodates scenarios where e.g. two competing plausible pairing scenarios are possible.
In such situations, both pairing configurations (also termed "valencies") would be expected to have a probability close to 0.5. Both are then considered,
and the output contains the probability of both situations. These can then be used to generate a probabilistic recombination landscape. In some cases,
it may not be possible to discern the pairing in one of the parents due to a lack of recombination (i.e. full parental haplotypes were transmitted). In such cases,
having a lower threshold here will allow more offspring to be analysed without affecting the quality of the predictions. If a more definite
set of predictions is required, simply increase |
A nested list corresponding to each linkage group. Within each LG, a list with 3 items is returned, specifying the plausible_pairing_prob, the map and the predicted recombinations in each individual in the mapping population. Per individual, all valencies with a probability greater than plausible_pairing_prob are returned, specifying both the Valent_probability and the best estimate of the cM position of the recombination_breakpoints involving pairs of homologues A, B, C etc. (in the order parent 1, parent 2). If no recombinations are predicted, an NA value is given instead.
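The effect of plausible_pairing_prob can be illustrated with a small base R sketch. The pairing labels and probabilities below are invented for one parent of one tetraploid individual, and the filtering shown is only an illustration of the thresholding idea, not the package's internal code.

```r
# Hypothetical pairing-configuration probabilities (invented values)
valency_probs <- c("AB-CD" = 0.52, "AC-BD" = 0.45, "AD-BC" = 0.03)

# Default threshold: two competing configurations are both retained
plausible_pairing_prob <- 0.3
plausible <- valency_probs[valency_probs > plausible_pairing_prob]

# A stricter threshold keeps only the single most probable configuration
strict <- valency_probs[valency_probs > 0.5]
```

With the default of 0.3, both near-equal configurations are carried forward with their probabilities; raising the threshold gives a more definite but smaller set of predictions.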
data("IBD_4x")
recom.ls <- count_recombinations(IBD_4x)
Function to estimate the GIC per homologue using IBD probabilities
estimate_GIC(IBD_list)
IBD_list |
List of IBD probabilities |
A nested list; each list element (per linkage group) contains the following items:
Matrix of GIC values estimated from the IBD probabilities
Integrated linkage map positions of markers used in IBD calculation
The parental marker phasing, coded in 1 and 0's
data("IBD_4x")
GIC_4x <- estimate_GIC(IBD_list = IBD_4x)
estimate_IBD is a function for creating identity-by-descent (IBD) probabilities. Two computational methods are offered: by default IBD probabilities are estimated using hidden Markov models, but a heuristic method based on Bourke et al. (2014) is also included. Basic input data for this function are marker genotypes (either discrete marker dosages, i.e. scores 0, 1, ..., ploidy representing the number of copies of the marker allele, or the probabilities of these dosages) and a phased linkage map. Details on each of the methods are included under the method argument.
estimate_IBD( input_type = "discrete", genotypes, phased_maplist, method = "hmm", remove_markers = NULL, ploidy, ploidy2 = NULL, parent1 = "P1", parent2 = "P2", individuals = "all", log = NULL, map_function = "haldane", bivalent_decoding = TRUE, error = 0.01, full_multivalent_hexa = FALSE, verbose = FALSE, ncores = 1, fix_threshold = 0.1, factor_dist = 1 )
input_type |
Can be either one of 'discrete' or 'probabilistic'. For the former (default), |
genotypes |
Marker genotypes, either a 2d matrix of integer marker scores or a data.frame of dosage probabilities. Details are as follows:
|
phased_maplist |
A list of phased linkage maps, the output of |
method |
The method used to estimate IBD probabilities, either |
remove_markers |
Optional vector of marker names to remove from the maps. Default is |
ploidy |
Integer. Ploidy of the organism. |
ploidy2 |
Optional integer, by default |
parent1 |
Identifier of parent 1, by default assumed to be |
parent2 |
Identifier of parent 2, by default assumed to be |
individuals |
By default "all" offspring are included, but otherwise a subset can be selected, using a vector of offspring indexing numbers (1,2, etc.)
according to their order in |
log |
Character string specifying the log filename to which standard output should be written. If |
map_function |
Mapping function to use when converting map distances to recombination frequencies.
Currently only |
bivalent_decoding |
Option to consider only bivalent pairing during formation of gametes (ignored for diploid populations, as only bivalents possible there), by default |
error |
The (prior) probability of errors in the offspring dosages, usually assumed to be small but non-zero |
full_multivalent_hexa |
Option to allow multivalent pairing in both parents at the hexaploid level, by default |
verbose |
Logical, by default |
ncores |
How many CPU cores should be used in the evaluation? By default 1 core is used. |
fix_threshold |
If |
factor_dist |
If |
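The map_function argument above concerns the conversion of map distances into recombination frequencies. As an illustration of what such a conversion involves, the standard Haldane mapping function can be written in a few lines of base R. The function names haldane_r and haldane_cM are hypothetical, and this is the textbook formula rather than the package's own code.

```r
# Haldane mapping function: map distance (cM) -> recombination frequency.
# r = 0.5 * (1 - exp(-2 * d)), with d in Morgans (cM / 100).
haldane_r <- function(cM) {
  d <- cM / 100
  0.5 * (1 - exp(-2 * d))
}

# Inverse: recombination frequency -> map distance (cM), valid for r < 0.5
haldane_cM <- function(r) {
  -50 * log(1 - 2 * r)
}

r_10cM <- haldane_r(10)            # recombination frequency at 10 cM
back   <- haldane_cM(haldane_r(25))  # round trip returns the input distance
```

Recombination frequency approaches, but never exceeds, 0.5 as the map distance grows.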
A list of IBD probabilities, organised by linkage group (as given in the input phased_maplist). Each list item is itself a list containing the following:
The type of IBD; for this function only "genotypeIBD" are calculated.
A 3d array of IBD probabilities, with dimensions marker, genotype-class and F1 individual.
A 3-column data-frame specifying chromosome, marker and position (in cM)
Phasing of the markers in the parents, as given in the input phased_maplist
A list of marginal likelihoods of different valencies if method "hmm" was used, otherwise NULL
The predicted valency that maximised the marginal likelihood, per offspring. For method "heur", NULL
Offspring names
Logical, whether bivalent decoding was used in the estimation of the F1 IBD probabilities.
The size of the gap (in cM) used when interpolating the IBD probabilities. See function spline_IBD for details.
Ordered list of genotype codes used to represent different genotype classes.
log likelihoods of each of the different pairing scenarios considered (can be used e.g. for post-mapping check of preferential pairing)
ploidy of parent 1
ploidy of parent 2
The method used, either "hmm" (default) or "heur". See argument method
The error prior used, if method "hmm" was used, otherwise NULL
Durbin R, Eddy S, Krogh A, Mitchison G (1998) Biological sequence analysis: Probabilistic models of proteins and nucleic acids. Cambridge: Cambridge University Press.
Hackett et al. (2013) Linkage analysis and QTL mapping using SNP dosage data in a tetraploid potato mapping population. PLoS One 8(5): e63939
Zheng et al. (2016) Probabilistic multilocus haplotype reconstruction in outcrossing tetraploids. Genetics 203: 119-131
Bourke P.M. (2014) QTL analysis in polyploids: Model testing and power calculations. Wageningen University (MSc thesis)
data("phased_maplist.4x", "SNP_dosages.4x")
estimate_IBD(phased_maplist = phased_maplist.4x, genotypes = SNP_dosages.4x, ploidy = 4)
Function to explore the possible segregation type at a QTL position using the Schwarz Information Criterion
exploreQTL( IBD_list, Phenotype.df, genotype.ID, trait.ID, linkage_group, LOD_data, cM = NULL, QTLconfig = NULL, plotBIC = TRUE, deltaBIC = 6, testAllele_Effects = TRUE, log = NULL )
IBD_list |
List of IBD probabilities |
Phenotype.df |
A data.frame containing phenotypic values |
genotype.ID |
The colname of |
trait.ID |
The colname of |
linkage_group |
Numeric identifier of the linkage group being tested, based on the order of |
LOD_data |
Output of |
cM |
By default |
QTLconfig |
Nested list of homologue configurations and modes of action of QTL to be explored and compared, the output of
|
plotBIC |
Logical, with default |
deltaBIC |
Numeric, by default 6. Configurations within this distance of the minimum BIC are considered plausible. |
testAllele_Effects |
Logical, with default |
log |
Character string specifying the log filename to which standard output should be written. If |
List with the following items:
Linkage group of the QTL peak being explored
CentiMorgan position of the locus being explored
Vector of BIC values corresponding to elements of QTLconfig provided for testing
Summary of the means and standard errors of groups with (+) and without (-) the specified allele combinations for the most likely QTLconfig if testAllele_Effects = TRUE (NULL otherwise).
A one-column matrix of mean phenotype values of offspring classes, with rownames corresponding to the genotype class. If the probability of certain genotype classes is 0 (e.g. double reduction classes where no double reduction occurred), then the genotype mean for that class will be NA.
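The role of deltaBIC can be sketched in base R as follows. The BIC values and configuration names are invented, and treating "within this distance" as less than or equal to deltaBIC is an assumption made for this illustration.

```r
# Hypothetical BIC values for four candidate QTL configurations
bic <- c(config1 = 212.4, config2 = 215.0, config3 = 221.7, config4 = 218.3)
deltaBIC <- 6

# Most likely configuration: the one with the minimum BIC
best <- names(bic)[which.min(bic)]

# Plausible configurations: those within deltaBIC of the minimum
plausible <- names(bic)[bic - min(bic) <= deltaBIC]
```

In this toy case config3 lies more than 6 BIC units above the minimum and would be discounted, while the other three remain candidates.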
data("IBD_4x", "BLUEs.pheno", "qtl_LODs.4x")
exploreQTL(IBD_list = IBD_4x, Phenotype.df = BLUEs.pheno, genotype.ID = "Geno", trait.ID = "BLUE", linkage_group = 1, LOD_data = qtl_LODs.4x)
Given QTL output, this function returns the position of maximum LOD for a specified linkage group.
findPeak(LOD_data, linkage_group, verbose = TRUE)
LOD_data |
Output of |
linkage_group |
Numeric identifier of the linkage group being tested, based on the order of |
verbose |
Should messages be written to standard output? By default |
data("qtl_LODs.4x")
findPeak(LOD_data = qtl_LODs.4x, linkage_group = 1)
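The idea of locating the maximum LOD position on one linkage group can be shown on a toy LOD profile in base R. The data frame and its column names (chromosome, position, LOD) and the helper peak_on are assumptions for illustration; they are not the QTLscan output format or the findPeak implementation.

```r
# Toy LOD profile over two linkage groups (invented values)
lod_profile <- data.frame(
  chromosome = rep(c(1, 2), each = 5),
  position   = rep(c(0, 10, 20, 30, 40), times = 2),
  LOD        = c(0.8, 2.1, 5.6, 4.9, 1.2, 0.3, 0.9, 1.4, 1.1, 0.5)
)

# Hypothetical helper: row with the maximum LOD on a given linkage group
peak_on <- function(profile, linkage_group) {
  lg <- profile[profile$chromosome == linkage_group, ]
  lg[which.max(lg$LOD), ]
}

peak <- peak_on(lod_profile, 1)
```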
Given QTL output, this function returns the LOD - x support interval for a specified linkage group, taking the maximum LOD position as the desired QTL peak.
findSupport(LOD_data, linkage_group, LOD_support = 2)
LOD_data |
Output of |
linkage_group |
Numeric identifier of the linkage group being tested, based on the order of |
LOD_support |
The level of support around a QTL peak, by default 2 (giving a LOD - 2 support interval, the range of positions with a LOD score within 2 LOD units of the maximum LOD on that linkage group). |
data("qtl_LODs.4x")
findSupport(LOD_data = qtl_LODs.4x, linkage_group = 1)
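A LOD - 2 support interval can be worked through by hand on a toy profile. The sketch below uses invented LOD values for a single linkage group and assumes the positions above the cut-off form one contiguous region around the peak; it is not the findSupport implementation.

```r
# Toy LOD profile for one linkage group (invented values)
lod_lg1 <- data.frame(
  position = seq(0, 60, by = 10),
  LOD      = c(0.8, 2.1, 3.9, 5.6, 4.9, 3.2, 1.2)
)

LOD_support <- 2
cutoff <- max(lod_lg1$LOD) - LOD_support   # peak LOD minus the support level

# Positions whose LOD lies within LOD_support units of the maximum
within_support <- lod_lg1$position[lod_lg1$LOD >= cutoff]
support_interval <- range(within_support)
```

With a peak LOD of 5.6 the cut-off is 3.6, so the interval spans the positions from 20 to 40 cM in this example.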
Genotypic Information Coefficient for example tetraploid
GIC_4x
An object of class list of length 2.
Identical by descent probabilities for example tetraploid
IBD_4x
An object of class list of length 2.
Imports the IBD probability output of TetraOrigin (Mathematica software) or PolyOrigin (julia software) into the same format as natively-estimated IBD probabilities from the polyqtlR package.
import_IBD( method, folder = NULL, filename, bivalent_decoding = TRUE, error = 0.01, log = NULL )
method |
The method used for IBD estimation, either "TO" for TetraOrigin or "PO" for PolyOrigin |
folder |
The path to the folder in which the Tetra/PolyOrigin (i.e. TetraOrigin or PolyOrigin) output is contained,
default is |
filename |
If method = "TO", the (vector of) character filename stem(s) of the |
bivalent_decoding |
Logical, if method = "TO" you must specify |
error |
If method = "TO", the offspring error prior used in the offspring decoding step of TetraOrigin, by default assumed to be 0.01. For method = "PO", this is automatically read in. |
log |
Character string specifying the log filename to which standard output should be written. If |
Returns a list with the following items:
IBDtype : |
Always "genotypeIBD" for the output of TetraOrigin |
IBDarray : |
An array of IBD probabilities. The dimensions of the array are: markers, genotype classes and individuals. |
map : |
Integrated linkage map positions of markers used in IBD calculation |
parental_phase : |
The parental marker phasing as used by TetraOrigin, recoded in 1 and 0's |
marginal.likelihoods : |
A list of marginal likelihoods of different valencies, currently |
valency : |
The predicted valency that maximised the marginal likelihood, per offspring. Currently |
offspring : |
Offspring names |
biv_dec : |
Logical, the bivalent_decoding parameter specified. |
gap : |
The gap size used in IBD interpolation if performed by |
genocodes : |
Ordered list of genotype codes used to represent different genotype classes. |
pairing : |
log likelihoods of each of the different pairing scenarios considered (can be used e.g. for post-mapping check of preferential pairing) |
ploidy : |
The ploidy of parent 1, by default assumed to be 4 |
ploidy2 : |
The ploidy of parent 2, by default assumed to be 4 |
method : |
The method used, either "hmm_TO" (TetraOrigin) or "hmm_PO" (PolyOrigin) |
error : |
The error prior used in the calculation in TetraOrigin, assumed to be 0.01 |
## Not run:
## These examples demonstrate the function call for both methods, but won't run without input files
## from either package, hence this call will normally result in an Error:
IBD_TO <- import_IBD(method = "TO", filename = paste0("test_LinkageGroup", 1:5, "_Summary"), bivalent_decoding = FALSE, error = 0.05)
## Equivalent call for PolyOrigin output:
IBD_PO <- import_IBD(method = "PO", filename = "test")
## End(Not run)
Function to correct marker dosage scores given a list of previously estimated IBD probabilities. This may prove useful to correct genotyping errors. Running the estimate_IBD function with a high error prior will result in suppressed predictions of double recombination events, associated with genotyping errors. By forcing the HMM to penalise double recombinations heavily, a smoothed haplotype landscape is achieved in which individual genotype observations are down-weighted. This smoothed output is then used to re-estimate marker dosages, dependent on (correct) parental scores.
An alternative strategy is to use the function maxL_IBD over a range of error priors first, and use the resulting $maxL_IBD output as input here (as the IBD_list). In this case, set the argument min_error_prior to a low value (0.005 say) to avoid issues.
impute_dosages( IBD_list, dosage_matrix, parent1 = "P1", parent2 = "P2", rounding_error = 0.05, min_error_prior = 0.1, verbose = TRUE )
IBD_list |
List of IBD probabilities |
dosage_matrix |
An integer matrix with markers in rows and individuals in columns. Note that probabilistic genotypes are not currently catered for here. |
parent1 |
The identifier of parent 1, by default "P1" |
parent2 |
The identifier of parent 2, by default "P2" |
rounding_error |
The maximum deviation from an integer value that an imputed value can have, by default 0.05. For example, an imputed
score of 2.97 or 3.01 would both be rounded to a dosage of 3, while 2.87 would be deemed too far from an integer score, and would be made missing.
If you find the output contains too many missing values, a possibility would be to increase the |
min_error_prior |
Suggestion for a suitably high error prior to be used in IBD calculations to ensure IBD smoothing is achieved. If IBD probabilities were estimated with a smaller error prior, the function aborts. |
verbose |
Should messages be written to standard output? |
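The rounding rule described under rounding_error can be reproduced with a short base R sketch. The helper round_imputed is hypothetical and only mirrors the worked example in the argument description (2.97 and 3.01 become 3, while 2.87 becomes missing); it is not the package's internal code.

```r
# Hypothetical helper: round an imputed dosage to the nearest integer when it
# lies within rounding_error of that integer, otherwise set it to missing.
round_imputed <- function(x, rounding_error = 0.05) {
  nearest <- round(x)
  ifelse(abs(x - nearest) <= rounding_error, nearest, NA)
}

imputed <- c(2.97, 3.01, 2.87)
default_rounding <- round_imputed(imputed)                         # 3, 3, NA
relaxed_rounding <- round_imputed(imputed, rounding_error = 0.15)  # 3, 3, 3
```

Relaxing the tolerance reduces the number of missing values at the cost of accepting less certain imputed scores.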
## Not run:
# Toy example only, as this will result in an Error: the original error prior was too low
data("IBD_4x", "SNP_dosages.4x")
impute_dosages(IBD_list = IBD_4x, dosage_matrix = SNP_dosages.4x)
## End(Not run)
Function to run the estimate_IBD function over a range of possible error priors. The function returns a merged set of results that maximise the marginal likelihood per individual, i.e. allowing a per-individual error rate within the options provided in the errors argument.
maxL_IBD(errors = c(0.01, 0.05, 0.1, 0.2), ...)
errors |
Vector of offspring error priors to test (each between 0 and 1) |
... |
Arguments passed to |
A list containing the following components:
A nested list as would have been returned by the estimate_IBD function, but composite across error priors to maximise the marginal likelihoods. Note that the $error values per linkage group are now the average error prior across the population per linkage group
A 3d array of the maximal marginal likelihoods, per error prior. Dimensions are individuals, linkage groups, error priors.
A matrix of the most likely genotyping error rates per individual (in rows) for each linkage group (in columns)
The error priors used (i.e. the input vector is returned for later reference.)
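The per-individual selection of an error prior can be sketched in base R for a single linkage group. The likelihood values below are invented (shown on a log scale) and the offspring names are hypothetical; the code only illustrates picking, for each individual, the prior with the highest marginal likelihood.

```r
errors <- c(0.01, 0.05, 0.1, 0.2)

# Hypothetical marginal (log) likelihoods for one linkage group:
# individuals in rows, error priors in columns
marg_lik <- matrix(
  c(-120.4, -118.9, -119.5, -123.0,
     -98.2,  -99.1, -101.7, -105.3,
    -140.8, -137.2, -135.9, -136.4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("F1_001", "F1_002", "F1_003"), as.character(errors))
)

# For each individual, the error prior that maximises the marginal likelihood
best_idx   <- apply(marg_lik, 1, which.max)
best_error <- errors[best_idx]
names(best_error) <- rownames(marg_lik)

# Average error prior across the population for this linkage group
mean_error <- mean(best_error)
```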
## Not run:
data("phased_maplist.4x", "SNP_dosages.4x")
maxL_IBD(phased_maplist = phased_maplist.4x, genotypes = SNP_dosages.4x, ploidy = 4, errors = c(0.01, 0.02, 0.05, 0.1))
## End(Not run)
Function to extract the chromosome pairing predictions as estimated by estimate_IBD. Apart from producing an overview of the pairing during parental meiosis (including counts of multivalents, per linkage group per parent), the function also applies a simple chi-squared test to look for evidence of non-random pairing behaviour from the bivalent counts (deviations from a polysomic model).
meiosis_report(IBD_list, visualise = FALSE, precision = 2)
IBD_list |
List of IBD probabilities as estimated by |
visualise |
Logical, by default |
precision |
To how many decimal places should summed probabilities per bivalent pairing be rounded? By default 2. |
The function returns a nested list, with one element per linkage group in the same order as the input IBD list. Per linkage group, a list is returned containing the following components:
The count of multivalents in parent 1 (only relevant if bivalent_decoding = FALSE during IBD calculation)
Similarly, the count of multivalents in parent 2
The counts of each bivalent pairing predicted in parent 1, with an extra column Pr(X2) which gives the p-value of the X2 test of the off-diagonal terms in the matrix. In the case of a tetraploid, pairing A with B automatically implies C with D pairing, so the count table contains a lot of redundancy. The table should be read using both row and column names, so row A and column B corresponds to the count of individuals with A and B pairing (and hence C and D pairing). In a hexaploid, A-B pairing does not imply a particular pairing configuration in the remaining homologues. In this case, row A and column B is the count of individuals where A and B were predicted to have paired, summed over all three bivalent configurations with A and B paired (AB-CD-EF, AB-CE-DF, AB-CF-DE).
Same as P1_pairing, except using parent 2
The ploidy of parent 1
The ploidy of parent 2
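As an illustration of the kind of chi-squared test mentioned above, the sketch below tests invented bivalent pairing counts for one tetraploid parent against equal expected frequencies, which is what random (polysomic) pairing would predict for the three possible configurations. It uses stats::chisq.test from base R and is not necessarily how meiosis_report computes its Pr(X2) column.

```r
# Hypothetical counts of the three bivalent configurations in one parent
random_counts <- c("AB-CD" = 18, "AC-BD" = 15, "AD-BC" = 17)
pref_counts   <- c("AB-CD" = 40, "AC-BD" = 6,  "AD-BC" = 4)

# Under random pairing each configuration is expected one third of the time
test_random <- chisq.test(random_counts, p = rep(1 / 3, 3))
test_pref   <- chisq.test(pref_counts,   p = rep(1 / 3, 3))

p_random <- test_random$p.value  # large: no evidence against random pairing
p_pref   <- test_pref$p.value    # small: suggests preferential pairing
```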
data("IBD_4x")
mr.ls <- meiosis_report(IBD_list = IBD_4x)
Example output of meiosis report function
mr.ls
An object of class list of length 2.
Phased maplist for example tetraploid
phased_maplist.4x
An object of class list of length 2.
Phenotypes for example tetraploid
Phenotypes_4x
An object of class data.frame with 150 rows and 3 columns.
Up to package v.0.0.9, there were three plotting functions for the output of QTLscan, namely plotQTL, plotLinearQTL and plotLinearQTL_list. Since release 0.1.0, the functionality of all three functions has been combined into a single general plotting function, named plotQTL. The plot layout is now specified by a new argument layout, allowing the user to plot results for single chromosomes separately, or together either adjacently or in a grid layout.
Results from multiple analyses can be overlaid. Previously, it was possible to call the function plotQTL multiple times and overlay subsequent plots using the argument overlay = TRUE. This approach is no longer supported. Instead, if multiple results are to be overlaid, they can be provided as a list of QTLscan or singleMarkerRegression outputs. Note however that this is only possible using the default layout. If significance thresholds are present, the default behaviour is to rescale LOD values so that multiple plots can be combined with overlapping significance thresholds. This rescaling behaviour can also be disabled (by setting rescale = FALSE). Note that not all arguments may be appropriate for all layouts.
plotQTL( LOD_data, layout = "l", inter_chm_gap = 5, ylimits = NULL, sig.unit = "LOD", plot_type = "lines", colour = c("black", "red", "dodgerblue", "sienna4"), add_xaxis = TRUE, add_rug = TRUE, add_thresh = TRUE, override_thresh = NULL, thresh.lty = 3, thresh.lwd = 2, thresh.col = "darkred", return_plotData = FALSE, show_thresh_CI = FALSE, use_LG_names = TRUE, axis_label.cex = 1, custom_LG_names = NULL, LGdiv.col = "gray42", ylab.at = 2.5, highlight_positions = NULL, mainTitle = FALSE, rescale = TRUE, ... )
plotLinearQTL( LOD_data, layout = "l", inter_chm_gap = 5, ylimits = NULL, sig.unit = "LOD", plot_type = "lines", colour = c("black", "red", "dodgerblue", "sienna4"), add_xaxis = TRUE, add_rug = TRUE, add_thresh = TRUE, override_thresh = NULL, thresh.lty = 3, thresh.lwd = 2, thresh.col = "darkred", return_plotData = FALSE, show_thresh_CI = FALSE, use_LG_names = TRUE, axis_label.cex = 1, custom_LG_names = NULL, LGdiv.col = "gray42", ylab.at = 2.5, highlight_positions = NULL, mainTitle = FALSE, rescale = TRUE, ... )
plotLinearQTL_list( LOD_data, layout = "l", inter_chm_gap = 5, ylimits = NULL, sig.unit = "LOD", plot_type = "lines", colour = c("black", "red", "dodgerblue", "sienna4"), add_xaxis = TRUE, add_rug = TRUE, add_thresh = TRUE, override_thresh = NULL, thresh.lty = 3, thresh.lwd = 2, thresh.col = "darkred", return_plotData = FALSE, show_thresh_CI = FALSE, use_LG_names = TRUE, axis_label.cex = 1, custom_LG_names = NULL, LGdiv.col = "gray42", ylab.at = 2.5, highlight_positions = NULL, mainTitle = FALSE, rescale = TRUE, ... )
LOD_data |
Output of |
layout |
There are three possible plot layouts - single chromosome plots ("s"), genome-wide plots arranged adjacently in a linear fashion ("l") which is
also the default, and genome-wide plots arranged in a grid ("g"), i.e. a grid of single chromosome plots. In the latter case, a suitable grid dimension will be determined
based on the number of linkage groups detected in |
inter_chm_gap |
The gap size (in units of cM) between successive chromosomes when |
ylimits |
Use to specify ylimits of plot region, though by default |
sig.unit |
Label to use on the y-axis for significance units, by default assumed to be LOD score. |
plot_type |
Plots can be either in line drawings ("lines", default) or scatter plot format ("points"). |
colour |
Vector of colours to be used in the plotting. A default set of 4 colours is provided, the first of which is used when results from a single QTL scan are to be plotted. |
add_xaxis |
Should an x-axis be drawn? If multiple QTL analyses are performed on different traits, specifying this to be |
add_rug |
Logical, by default |
add_thresh |
Logical, by default |
override_thresh |
By default |
thresh.lty |
Gives user control over the line type of the significance threshold to be drawn. Default threshold lty is 3. |
thresh.lwd |
Gives user control over the line width of the significance threshold to be drawn. Default threshold lwd is 2. |
thresh.col |
Gives user control over the line colour of the significance threshold to be drawn. Default threshold colour is dark red. If plotting multiple analyses with |
return_plotData |
Logical, by default |
show_thresh_CI |
Logical, by default |
use_LG_names |
Logical, by default |
axis_label.cex |
Argument to adjust the size of the axis labels. Can be useful if there are many linkage groups to plot |
custom_LG_names |
Option to specify a vector that contains custom linkage group names. By default |
LGdiv.col |
Colour of dividing lines between linkage groups when |
ylab.at |
Distance from the y-axis to place label (by default at 2.5 points) |
highlight_positions |
Option to include a (list of) positions to highlight (e.g. peak QTL positions). Each list element should be a 2-column data.frame with columns giving
the linkage group numbers (numeric) and the corresponding cM positions (numeric) to highlight. If |
mainTitle |
Option to supply vector of plot titles if |
rescale |
If results from multiple analyses are to be overlaid and different significance thresholds are detected, then by default plots will be rescaled so that threshold lines overlap.
This behaviour can be disabled by setting |
... |
Arguments passed to |
The plot data, if return_plotData = TRUE, otherwise NULL. Output is returned invisibly.
## Not run: data("qtl_LODs.4x") plotQTL(LOD_data = qtl_LODs.4x,layout = "l") ## End(Not run)
Function which visualises the recombination landscape in two ways: per linkage group, and per individual.
For the first analysis, a rudimentary spline is also fitted to estimate the recombination rate along a grid of positions defined by gap, which is also returned by the function.
plotRecLS( recombination_data, plot_per_LG = TRUE, plot_per_ind = TRUE, gap = 1, ... )
recombination_data |
Data on predicted recombination events, as returned by the function |
plot_per_LG |
Logical argument, plot recombination events per linkage group? By default |
plot_per_ind |
Logical argument, plot recombination events per individual? By default |
gap |
The size (in cM) of the gap used to define the grid of positions to define the window in which to estimate recombination rate. By default 1 cM. Interpolated positions are taken to be the centre of an interval, so a 1 cM gap would result in predictions for positions 0.5 cM, 1.5 cM etc. |
... |
Option to pass extra arguments to the |
A list with two elements, per_LG and per_individual. The first of these is itself a list with the same length as recombination_data, giving the estimated recombination rates along the linkage group.
This rate is simply estimated as the (weighted) count of recombination breakpoints divided by the population size.
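The windowed rate estimate described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's code: the breakpoint positions, the unit weights, the linkage group length and the population size are all made up, and the spline smoothing step is left out.

```r
# Hypothetical breakpoint positions (cM) pooled over a population of 50 individuals
set.seed(1)
pop_size  <- 50
lg_length <- 10    # assumed length of the linkage group in cM
gap       <- 1     # window size in cM
breakpoints <- runif(40, min = 0, max = lg_length)
weights     <- rep(1, length(breakpoints))  # assumed weight of each predicted event

# Window edges every `gap` cM; reported positions are the window centres
edges   <- seq(0, lg_length, by = gap)
centres <- edges[-length(edges)] + gap / 2   # 0.5, 1.5, ...

# Weighted count of breakpoints per window, divided by the population size
window <- cut(breakpoints, breaks = edges, include.lowest = TRUE)
weighted_count <- tapply(weights, window, sum)
weighted_count[is.na(weighted_count)] <- 0   # windows without any breakpoint
rec_rate <- as.numeric(weighted_count) / pop_size

rate_table <- data.frame(position = centres, rate = rec_rate)
print(rate_table)
```

With a 1 cM gap the first reported position is 0.5 cM, matching the description of the gap argument.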
data("Rec_Data_4x") plotRecLS(Rec_Data_4x)
This function builds a (maximal) QTL model from previously detected QTL peaks and outputs the percentage variance explained (PVE)
of the full QTL model and all sub-models. It uses a similar approach to the fitting of genetic co-factors in the function QTLscan.
The PVE is very similar to, but not exactly equal to, the adjusted R2 returned in QTLscan at each position (and note: in the former case, these
R2 values are per-locus, while this function can estimate the PVE combined over multiple loci). The discrepancy has to do with how PVE is calculated
using the formula 100(1 - RSS1/RSS0), where RSS0 and RSS1 are the residual sums of squares of the NULL and QTL models, respectively.
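As a rough illustration of a PVE computed from residual sums of squares, the sketch below fits a null model and a one-predictor model with lm() on simulated data. The predictor x is a hypothetical stand-in for a homologue IBD probability and the variable names are assumptions; the package's own model is more elaborate.

```r
set.seed(42)
n <- 100
x <- runif(n)            # hypothetical predictor (e.g. an IBD probability)
y <- 2 * x + rnorm(n)    # simulated phenotype

null_model <- lm(y ~ 1)  # NULL model: intercept only
qtl_model  <- lm(y ~ x)  # QTL model: intercept plus predictor

rss_null <- sum(residuals(null_model)^2)
rss_qtl  <- sum(residuals(qtl_model)^2)

# Percentage of the null-model residual variation removed by the QTL model
pve    <- 100 * (1 - rss_qtl / rss_null)
r2     <- 100 * summary(qtl_model)$r.squared
adj_r2 <- 100 * summary(qtl_model)$adj.r.squared

print(c(PVE = pve, R2 = r2, adjR2 = adj_r2))
```

In this toy setting the RSS-based quantity coincides with the unadjusted R2 and is therefore slightly larger than the adjusted R2.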
PVE( IBD_list, Phenotype.df, genotype.ID, trait.ID, block = NULL, QTL_df = NULL, prop_Pheno_rep = 0.5, log = NULL, verbose = FALSE )
IBD_list |
List of IBD probabilities |
Phenotype.df |
A data.frame containing phenotypic values |
genotype.ID |
The colname of |
trait.ID |
The colname of |
block |
The blocking factor to be used, if any (must be colname of |
QTL_df |
A 2-column data frame of previously-detected QTL; column 1 gives linkage group identifiers,
column 2 specifies the cM position of the QTL. If not specified, an error results. It can be convenient to generate a compatible
data.frame by first running the function |
prop_Pheno_rep |
The minimum proportion of phenotypes represented across blocks. If less than this, the individual is removed from the analysis. If there is incomplete data, the missing phenotypes are imputed using the mean values across the recorded observations. |
log |
Character string specifying the log filename to which standard output should be written. If |
verbose |
Should messages be written to standard output? |
A list giving the percentage variance explained by the maximal QTL model and by all sub-models.
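The filtering and imputation rule described for prop_Pheno_rep can be sketched as follows. This is a guess at the behaviour under explicit assumptions: phenotypes are laid out as a genotype-by-block matrix, individuals are kept when their observed proportion is at least the threshold, and missing values are replaced by that individual's own mean. The data are invented.

```r
# Hypothetical phenotypes: 4 genotypes scored in 3 blocks, with missing values
pheno <- matrix(c(10, 11, NA,
                  NA, NA,  8,
                  12, 13, 14,
                   9, NA, NA),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("G", 1:4), paste0("year", 1:3)))
prop_Pheno_rep <- 0.5

# Proportion of blocks in which each individual was phenotyped
prop_observed <- rowMeans(!is.na(pheno))

# Drop individuals observed in fewer than the required proportion of blocks
kept <- pheno[prop_observed >= prop_Pheno_rep, , drop = FALSE]

# Assumed imputation: fill gaps with the individual's mean over recorded values
imputed <- t(apply(kept, 1, function(v) {
  v[is.na(v)] <- mean(v, na.rm = TRUE)
  v
}))
print(imputed)
```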
data("IBD_4x","Phenotypes_4x") PVE(IBD_list = IBD_4x, Phenotype.df = Phenotypes_4x, genotype.ID = "geno",trait.ID = "pheno", block = "year", QTL_df = data.frame(LG=1,cM=12.3))
QTL output for example tetraploid
qtl_LODs.4x
An object of class list of length 6.
Function to run QTL analysis using IBD probabilities given (possibly replicated) phenotypes, assuming a randomised experimental design
QTLscan( IBD_list, Phenotype.df, genotype.ID, trait.ID, block = NULL, cofactor_df = NULL, allelic_interaction = FALSE, folder = NULL, filename.short, prop_Pheno_rep = 0.5, perm_test = FALSE, N_perm.max = 1000, alpha = 0.05, gamma = 0.05, ncores = 1, log = NULL, verbose = TRUE, ... )
IBD_list |
List of IBD probabilities |
Phenotype.df |
A data.frame containing phenotypic values |
genotype.ID |
The colname of |
trait.ID |
The colname of |
block |
The blocking factor to be used, if any (must be colname of |
cofactor_df |
A 3-column data frame of co-factor(s); column 1 gives the numeric linkage group identifier(s),
column 2 specifies the cM position of the co-factor(s), column 3 specifies whether the QTL was fitted using "a" = additive effects or
"f" = full allelic interactions (note that any other symbol for the full model will also be accepted, as long as it is not "a").
For backward compatibility with package versions <= 0.0.9, it is possible to just supply the first two columns,
in which case an additive-effects model is assumed for each cofactor (so, a third column will be automatically filled with "a").
By default |
allelic_interaction |
The QTL detection model can be for additive main effects only (by default |
folder |
If markers are to be used as co-factors, the path to the folder in which the imported IBD probabilities is contained can be provided here.
By default this is |
filename.short |
If TetraOrigin was used and co-factors are being included, the shortened stem of the filename of the |
prop_Pheno_rep |
The minimum proportion of phenotypes represented across blocks. If less than this, the individual is removed from the analysis. If there is incomplete data, the missing phenotypes are imputed using the mean values across the recorded observations. |
perm_test |
Logical, by default |
N_perm.max |
The maximum number of permutations to run if |
alpha |
The P-value to be used in the selection of a threshold if |
gamma |
The width of the confidence intervals used around the permutation test threshold using the approach of Nettleton & Doerge (2000), by default 0.05. |
ncores |
Number of cores to use if parallel computing is required. Works both for Windows and UNIX (using |
log |
Character string specifying the log filename to which standard output should be written. If |
verbose |
Logical, by default |
... |
Arguments passed to |
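The layout of cofactor_df described in the arguments above could be built as in the sketch below. The column names (LG, cM, type) and the positions are illustrative assumptions, since the description only fixes the column order. The last step mimics the documented fallback for two-column input.

```r
# Hypothetical co-factors: linkage group, cM position, and model type
# ("a" = additive effects, "f" = full allelic interactions)
cofactor_df <- data.frame(LG   = c(1, 3),
                          cM   = c(12.3, 45),
                          type = c("a", "f"))

# Two-column form, as accepted for backward compatibility
legacy_df <- data.frame(LG = c(1, 3), cM = c(12.3, 45))

# Sketch of the documented fallback: assume additive effects for every co-factor
if (ncol(legacy_df) == 2) {
  legacy_df$type <- "a"
}
print(legacy_df)
```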
A nested list; each list element (per linkage group) contains the following items:
Single matrix of QTL results with columns chromosome, position, LOD, adj.r.squared and PVE (percentage variance explained).
If perm_test = FALSE, this will be NULL.
Otherwise, Perm.res contains a list of the results of the permutation test, with list items
"quantile","threshold" and "scores". Quantile refers to which quantile of scores was used to determine the threshold.
Note that scores are each of the maximal LOD scores across the entire genome scan per permutation, thus returning a
genome-wide threshold rather than a chromosome-specific threshold. If the latter is preferred, restricting the IBD_list to a single chromosome and re-running the permutation test will provide the desired threshold.
If a blocking factor or co-factors are used, this is the (named) vector of residuals used as input for the QTL scan. Otherwise, this is the set of (raw) phenotypes used in the QTL scan.
Original map of genetic marker positions upon which the IBDs were based, most often used for adding rug of marker positions to QTL plots.
Names of the linkage groups
Whether argument allelic_interaction was TRUE or FALSE in the QTL scan
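The genome-wide permutation threshold described for Perm.res can be illustrated with a small base-R sketch. Everything here is an assumption made for illustration: the genotype predictors are random stand-ins for IBD probabilities, the LOD score uses the common regression form (n/2) * log10(RSS0/RSS1), and lod_scan is a hypothetical helper rather than a package function.

```r
set.seed(123)
n_ind <- 80; n_pos <- 30; n_perm <- 100; alpha <- 0.05

# Hypothetical predictors at n_pos positions across the whole genome
geno  <- matrix(runif(n_ind * n_pos), nrow = n_ind)
pheno <- rnorm(n_ind)

# LOD score of a single-predictor regression at every position
lod_scan <- function(y, X) {
  rss0 <- sum((y - mean(y))^2)
  apply(X, 2, function(x) {
    rss1 <- sum(residuals(lm(y ~ x))^2)
    (length(y) / 2) * log10(rss0 / rss1)
  })
}

# For each permutation, shuffle phenotypes and keep the maximum LOD genome-wide
max_lods <- replicate(n_perm, max(lod_scan(sample(pheno), geno)))

# The threshold is the (1 - alpha) quantile of those maxima
threshold <- quantile(max_lods, probs = 1 - alpha)
print(threshold)
```

Because each permutation contributes its maximum over all positions, the resulting threshold controls the genome-wide error rate; restricting geno to the columns of one chromosome would give a chromosome-specific threshold instead.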
data("IBD_4x","Phenotypes_4x") qtl_LODs.4x <- QTLscan(IBD_list = IBD_4x, Phenotype.df = Phenotypes_4x, genotype.ID = "geno", trait.ID = "pheno", block = "year")
Recombination data for example tetraploid
Rec_Data_4x
An object of class list of length 2.
Expected segregation for all marker types of a diploid cross
segList_2x
An object of class list of length 8.
Expected segregation for all marker types of a triploid cross (4 x 2)
segList_3x
An object of class list of length 27.
Expected segregation for all marker types of a triploid cross (2 x 4)
segList_3x_24
An object of class list of length 27.
Expected segregation for all marker types of a tetraploid cross
segList_4x
An object of class list of length 224.
Expected segregation for all marker types of a hexaploid cross
segList_6x
An object of class list of length 3735.
Function to generate a list of segregation types for the exploreQTL function
segMaker(ploidy, segtypes, modes = c("a", "d"))
ploidy |
The ploidy of the population. Currently assumed to be an even number for this function. |
segtypes |
List of QTL segregation types to consider,
so e.g. c(1,0) would mean all possible simplex x nulliplex QTL (i.e. 4 QTL, one on each of homologues 1 - 4 of parent 1).
Note that symmetrical QTL types that cannot be distinguished are not automatically removed and need to be manually identified.
If this is an issue, use the inbuilt list for tetraploids provided with the package to search the full model space.
Such an inbuilt list is currently only available for tetraploids, and is available from the |
modes |
Character vector of modes of QTL action to consider, with options "a" for "additive" and "d" for dominant QTL action. |
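To make the counting in the segtypes description concrete, the sketch below enumerates homologue configurations for a given segregation type. The helpers homologue_sets and enumerate_configs are hypothetical and are not the output format of segMaker; numbering the homologues of parent 2 as ploidy + 1 onwards is also an assumption.

```r
# Hypothetical helper: all ways to place `dosage` QTL alleles on one parent's homologues
homologue_sets <- function(ploidy, dosage, offset = 0) {
  if (dosage == 0) return(list(integer(0)))
  lapply(combn(seq_len(ploidy), dosage, simplify = FALSE),
         function(h) h + offset)
}

# Hypothetical helper: combine the parent 1 and parent 2 placements
enumerate_configs <- function(ploidy, segtype) {
  p1 <- homologue_sets(ploidy, segtype[1])
  p2 <- homologue_sets(ploidy, segtype[2], offset = ploidy)
  configs <- list()
  for (a in p1) {
    for (b in p2) {
      configs[[length(configs) + 1]] <- c(a, b)
    }
  }
  configs
}

# Simplex x nulliplex in a tetraploid: one configuration per homologue of parent 1
sx_n <- enumerate_configs(ploidy = 4, segtype = c(1, 0))
print(sx_n)
```

The same helper gives choose(4, 2) configurations for duplex x nulliplex. It does not remove symmetrical types that cannot be distinguished, in line with the caveat in the argument description.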
Function to run a single marker regression using marker dosages
singleMarkerRegression( dosage_matrix, Phenotype.df, genotype.ID, trait.ID, maplist = NULL, perm_test = FALSE, N_perm = 1000, alpha = 0.05, ncores = 1, return_R2 = FALSE, log = NULL )
dosage_matrix |
An integer matrix with markers in rows and individuals in columns. All markers in this matrix will be tested for association with the trait. |
Phenotype.df |
A data.frame containing phenotypic values |
genotype.ID |
The colname of |
trait.ID |
The colname of |
maplist |
Option to include linkage map in the format returned by |
perm_test |
Logical, by default |
N_perm |
Integer. The number of permutations to run if |
alpha |
Numeric. The P-value to be used in the selection of a threshold if |
ncores |
Number of cores to use if parallel processing required. Works both for Windows and UNIX (using |
return_R2 |
Should the (adjusted) R2 of the model fit also be determined? |
log |
Character string specifying the log filename to which standard output should be written. If |
A list containing the following components:
The -log(p) of the model fit per marker are returned as "LOD" scores, although "LOP" would have been a better description. If requested, R2 values are also returned in column "R2adj"
The results of the permutation test if performed, otherwise NULL
The linkage map if provided, otherwise NULL
Names of the linkage groups, if a map was provided, otherwise NULL
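A single marker regression of the kind described here can be sketched with lm() on simulated dosages. This is an illustration under assumptions, not the package's implementation: the data are invented, the logarithm is taken to base 10, and the p-value is taken from the overall F-test of each marker's model.

```r
set.seed(7)
n_ind <- 60

# Hypothetical tetraploid dosages (0-4): markers in rows, individuals in columns
dosage_matrix <- matrix(sample(0:4, 5 * n_ind, replace = TRUE), nrow = 5,
                        dimnames = list(paste0("mrk", 1:5),
                                        paste0("ind", 1:n_ind)))

# Simulated trait with an effect of marker 2 only
pheno <- 1.5 * dosage_matrix["mrk2", ] + rnorm(n_ind)

# Regress the trait on each marker's dosage in turn
smr <- t(apply(dosage_matrix, 1, function(d) {
  fit <- summary(lm(pheno ~ d))
  f   <- fit$fstatistic
  p   <- unname(pf(f[1], f[2], f[3], lower.tail = FALSE))
  c(LOD = -log10(p), R2adj = fit$adj.r.squared)
}))
print(smr)
```

The column named LOD holds -log10(p) values, so it is a significance score rather than a likelihood-ratio LOD, which is the point made in the value description.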
data("SNP_dosages.4x","BLUEs.pheno") Trait_1.smr <- singleMarkerRegression(dosage_matrix = SNP_dosages.4x, Phenotype.df = BLUEs.pheno,genotype.ID = "Geno",trait.ID = "BLUE")
SNP marker dosage data for example tetraploid
SNP_dosages.4x
An object of class matrix (inherits from array) with 186 rows and 52 columns.
Fits splines to IBD probabilities at a grid of positions at user-defined spacing.
spline_IBD(IBD_list, gap, method = "cubic", ncores = 1, log = NULL)
IBD_list |
List of IBD probabilities |
gap |
The size (in centiMorgans) of the gap between splined positions |
method |
One of two options, either "linear" or "cubic". The default method (cubic) fits cubic splines, and although more accurate, becomes computationally expensive in higher-density data-sets, where the linear option may be preferable. |
ncores |
Number of cores to use, by default 1 only. Works both for Windows and UNIX (using |
log |
Character string specifying the log filename to which standard output should be written. If |
Returns a list of similar format as IBD_list, with a splined IBD_array in place of the original IBD_array
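The difference between the "linear" and "cubic" options can be illustrated with the base-R functions approx() and spline(). The probabilities and marker positions below are invented, the grid is assumed to start at the first marker, and clamping the cubic fit to [0, 1] is an assumption of this sketch rather than documented behaviour.

```r
# Hypothetical IBD probabilities of one homologue in one individual at marker positions
marker_cM <- c(0, 2.5, 7, 11, 18, 20)
ibd_prob  <- c(0.05, 0.10, 0.90, 0.95, 0.20, 0.10)

gap     <- 1
grid_cM <- seq(min(marker_cM), max(marker_cM), by = gap)

# Linear interpolation between neighbouring markers
linear_fit <- approx(marker_cM, ibd_prob, xout = grid_cM)$y

# Cubic spline through the same points
cubic_fit <- spline(marker_cM, ibd_prob, xout = grid_cM, method = "fmm")$y

# Cubic splines can overshoot, so keep values inside the probability range
cubic_fit <- pmin(pmax(cubic_fit, 0), 1)

fits <- data.frame(cM = grid_cM, linear = linear_fit, cubic = cubic_fit)
print(fits)
```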
data("IBD_4x") IBD_4x.spl <- spline_IBD(IBD_list = IBD_4x, gap = 1)
thinmap is a function for thinning out an integrated map so that IBD estimation runs more quickly. It is especially useful for maps with very high marker densities for which the estimate_IBD function is to be used.
thinmap( maplist, dosage_matrix, bin_size = 1, bounds = NULL, remove_markers = NULL, plot_maps = TRUE, use_SN_phase = FALSE, parent1 = "P1", parent2 = "P2", log = NULL )
maplist |
A list of maps, each with marker names in the first column and their positions in the second. |
dosage_matrix |
An integer matrix with markers in rows and individuals in columns. |
bin_size |
Numeric. Size (in cM) of the bins to include. By default, a bin size of 1 cM is used. Larger |
bounds |
Numeric vector. If |
remove_markers |
Optional vector of marker names to remove from the maps. Default is |
plot_maps |
Logical. Plot the marker positions of the selected markers using |
use_SN_phase |
Logical, by default |
parent1 |
Identifier of parent 1, by default assumed to be |
parent2 |
Identifier of parent 2, by default assumed to be |
log |
Character string specifying the log filename to which standard output should be written. If NULL, the log is sent to stdout. |
A maplist of the same structure as the input maplist, but with fewer markers based on the bin_size.
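Bin-based thinning can be sketched in a few lines of base R. The map below is invented, and keeping the first marker in each bin is an assumption of this sketch: the description does not say how the package chooses among markers that share a bin.

```r
# Hypothetical map of one linkage group: marker names and cM positions
map <- data.frame(marker   = paste0("m", 1:8),
                  position = c(0, 0.2, 0.7, 1.1, 1.15, 2.6, 2.9, 4.3),
                  stringsAsFactors = FALSE)
bin_size <- 1

# Assign every marker to a bin of width bin_size
bin <- floor(map$position / bin_size)

# Keep one marker per bin (here simply the first one encountered)
thinned <- map[!duplicated(bin), ]
print(thinned)
```

A larger bin_size leaves fewer markers, which is the trade-off between IBD estimation speed and map resolution.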
data("phased_maplist.4x","SNP_dosages.4x") maplist_thin<-thinmap(maplist=phased_maplist.4x,dosage_matrix=SNP_dosages.4x)
Function to visualise the GIC of a certain region
visualiseGIC( GIC_list, add_rug = TRUE, add_leg = FALSE, ylimits = NULL, gic.cex = 1, show_markers = TRUE, add.mainTitle = TRUE, plot.cols = NULL )
GIC_list |
List of GIC data, the output of |
add_rug |
Should original marker positions be added to the plot? |
add_leg |
Should a legend be added to the plot? |
ylimits |
Optional argument to control the plotting area, by default |
gic.cex |
Option to increase the size of the GIC |
show_markers |
Should markers be shown? |
add.mainTitle |
Should a main title be added to the plot? |
plot.cols |
Optional argument to specify plot colours, otherwise suitable contrasting colours are chosen |
The phased map data for the specified region, recoded into 1's and 0's.
data("GIC_4x") visualiseGIC(GIC_list = GIC_4x)
Function to visualise the haplotypes of a certain region in certain individuals
visualiseHaplo( IBD_list, display_by = c("phenotype", "name"), linkage_group = NULL, Phenotype.df = NULL, genotype.ID = NULL, trait.ID = NULL, pheno_range = NULL, cM_range = "all", highlight_region = NULL, select_offspring = NULL, recombinant_scan = NULL, allele_fish = NULL, presence_threshold = 0.95, xlabl = TRUE, ylabl = TRUE, mainTitle = NULL, multiplot = NULL, append = FALSE, colPal = c("white", "navyblue", "darkred"), hap.wd = 0.4, recombination_data = NULL, reset_par = TRUE, log = NULL )
IBD_list |
List of IBD probabilities |
display_by |
Option to display a subset of the population's haplotypes either by |
linkage_group |
Numeric identifier of the linkage group being examined, based on the order of |
Phenotype.df |
A data.frame containing phenotypic values, which can be used to select a subset of the population
to visualise (with extreme phenotypes for example). By default |
genotype.ID |
The colname of |
trait.ID |
The colname of |
pheno_range |
Vector of numeric bounds of the phenotypic scores to include (offspring selection). |
cM_range |
Vector of numeric bounds of the genetic region to be explored. If none are specified, the default of |
highlight_region |
Option to highlight a particular genetic region on the plot; can be a single position or a vector of 2 positions. By default |
select_offspring |
Vector of offspring identifiers to visualise, must be supplied if |
recombinant_scan |
Vector of homologue numbers between which to search for recombinant offspring in the visualised region and selected individuals.
By default |
allele_fish |
Vector of homologue numbers of interest, for which to search for offspring that carry these homologues (in the visualised
region). By default |
presence_threshold |
Numeric. The minimum probability used to declare presence of a homologue in an individual. This is only needed if a |
xlabl |
Logical, by default |
ylabl |
Logical, by default |
mainTitle |
Option to override default plot titles with a (vector of) captions. By default |
multiplot |
Vector of integers. By default |
append |
Option to allow user to append new plots to spaces generated by |
colPal |
Colour palette to use in the visualisation (best to provide 3 colours). |
hap.wd |
The width of the haplotype tracks to be plotted, generally recommended to be about 0.4 (default value) |
recombination_data |
List object as returned by the function |
reset_par |
By default |
log |
Character string specifying the log filename to which standard output should be written. If |
If a recombinant_scan vector is supplied, a vector of recombinant offspring IDs in the region of interest (otherwise NULL).
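The role of presence_threshold when searching for carriers of particular homologues (allele_fish) can be sketched as below. The probability matrix and offspring names are invented, and requiring every listed homologue to pass the threshold is an assumption of this sketch, as the description does not state whether all or any of them must be present.

```r
# Hypothetical probabilities that each offspring carries homologues h1-h4
# in the region of interest
hom_prob <- matrix(c(0.99, 0.02, 0.97, 0.01,
                     0.50, 0.50, 0.96, 0.03,
                     0.01, 0.98, 0.10, 0.99),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(c("F1_01", "F1_02", "F1_03"),
                                   paste0("h", 1:4)))

presence_threshold <- 0.95
allele_fish <- c(1, 3)   # homologues of interest

# An offspring counts as a carrier if every homologue of interest passes the threshold
passes   <- hom_prob[, allele_fish, drop = FALSE] >= presence_threshold
carriers <- rownames(hom_prob)[apply(passes, 1, all)]
print(carriers)
```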
data("IBD_4x") visualiseHaplo(IBD_list = IBD_4x, display_by = "name", linkage_group = 1, select_offspring = "all", multiplot = c(3,3))
Function to visualise the pairing of parental homologues across the population using a graph, with nodes to denote parental homologues and edges to denote deviations from expected proportions under a polysomic model of inheritance
visualisePairing( meiosis_report.ls, pos.col = "red", neg.col = "blue", parent, max.lwd = 20, datawidemax, add.label = TRUE, return.data = FALSE, ... )
meiosis_report.ls |
List output of function |
pos.col |
Colour corresponding to excess of pairing associations predicted (positive deviations), by default red |
neg.col |
Colour corresponding to lack of pairing associations predicted (negative deviations), by default blue |
parent |
The parent, either "P1" (mother) or "P2" (father) |
max.lwd |
Maximum line width, by default 20 |
datawidemax |
This argument is currently a work-around to allow multiple plots to have the same scale (line thicknesses consistent).
No default is provided. To estimate this value, simply set argument |
add.label |
Should a label be applied, giving the maximum deviation in the plot? By default |
return.data |
Should plot data be returned? By default |
... |
Optional arguments passed to |
If return.data = TRUE, the values for pairwise deviations from the expected numbers are returned, useful for determining the value datawidemax to provide consistent scaling across multiple plots
data("mr.ls") visualisePairing(meiosis_report.ls = mr.ls, parent = "P1", datawidemax = 3)
Function to visualise the effect of parental homologues around a QTL peak across the population.
visualiseQTLeffects( IBD_list, Phenotype.df, genotype.ID, trait.ID, linkage_group, LOD_data, cM_range = NULL, col.pal = c("purple4", "white", "seagreen"), point.density = 50, zero.sum = FALSE, allelic_interaction = FALSE, exploreQTL_output = NULL, return_plotData = FALSE )
IBD_list |
List of IBD probabilities |
Phenotype.df |
A data.frame containing phenotypic values |
genotype.ID |
The colname of |
trait.ID |
The colname of |
linkage_group |
Numeric identifier of the linkage group being tested, based on the order of |
LOD_data |
Output of |
cM_range |
If required, the plotting region can be restricted to a specified range of centiMorgan positions (provided as a vector of start and end positions). |
col.pal |
Vector of colours to use in the visualisations (it is best to provide two or three colours for simplicity). By default, effects will be coloured from purple to green through white. |
point.density |
Parameter to increase the smoothing of homologue effect tracks |
zero.sum |
How allele substitution effect should be defined. If |
allelic_interaction |
By default |
exploreQTL_output |
If |
return_plotData |
Logical, by default |
The estimated effects of the homologues, used in the visualisation
data("IBD_4x","BLUEs.pheno","qtl_LODs.4x") visualiseQTLeffects(IBD_list = IBD_4x, Phenotype.df = BLUEs.pheno, genotype.ID = "Geno", trait.ID = "BLUE", linkage_group = 2, LOD_data = qtl_LODs.4x)