Biostatistical Analysis

Differentially expressed genes and miRNA

Differential Expression on Chromosomes

High-throughput technologies enable us to test tens of thousands of features in one experiment. Analyzing all the results and extracting the most meaningful ones requires special skills.
Statistics allow us to rank the features that are most likely to differentiate biological conditions. A typical statistical analysis requires properly normalized expression values and the assignment of each sample to a group. Statistical tests compute a p-value: the probability of observing an expression pattern at least as strongly related to the underlying design if the feature were in fact unrelated to it.
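As a minimal sketch of how features might be ranked, the snippet below scores each feature with Welch's t-statistic between two groups and sorts by its absolute value. The choice of Welch's test, the function names and the toy expression values are illustrative assumptions, not the method described on this page. Converting the statistic into a p-value would additionally require the t-distribution, which is left out here.

```python
from math import sqrt
from statistics import fmean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two groups of normalized expression values.

    Assumes each group has at least two samples and non-zero variance.
    """
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (fmean(group_a) - fmean(group_b)) / standard_error


def rank_features(expression):
    """Sort features by decreasing |t|.

    `expression` maps a feature name to a pair (values_in_group_a,
    values_in_group_b); this layout is a hypothetical convenience.
    """
    scored = [(name, welch_t(a, b)) for name, (a, b) in expression.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Toy, made-up values: geneA differs between groups, geneB does not.
toy_expression = {
    "geneA": ([1.0, 1.2, 0.9], [3.0, 3.1, 2.8]),
    "geneB": ([2.0, 2.1, 1.9], [2.0, 2.2, 1.8]),
}
ranked = rank_features(toy_expression)
```

In this toy input the feature with the clear group difference comes first in `ranked`.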
The lower the p-value, the more confidence one can have in the relevance of the gene. However, there is always a possibility that the expression pattern is observed by chance (false positive). This effect matters most when tests are repeated a large number of times, as in microarray experiments: even with a low chance of a false positive in each test, the actual number of false positives can be high. Again, statistical techniques can adjust the p-values to take this risk into account.
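To make the multiple-testing problem concrete: if 20,000 unrelated features are each tested at a threshold of 0.05, about 20,000 × 0.05 = 1,000 of them are expected to pass by chance alone. The sketch below shows two widely used adjustments, Bonferroni and Benjamini-Hochberg. The text does not say which adjustment is applied, so both are shown only as common examples, and the function names are illustrative.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four features.
raw = [0.01, 0.04, 0.03, 0.005]
bh = benjamini_hochberg(raw)
bonf = bonferroni(raw)
```

Bonferroni controls the chance of any false positive and is the more conservative of the two. Benjamini-Hochberg controls the expected proportion of false positives among the features called significant.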
Other strategies based on resampling procedures (bootstrap, jackknife) can also be used to estimate empirical p-values. Depending on the nature of the data, different parametric or non-parametric tests can be applied.
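An empirical p-value can be sketched with a permutation test, a resampling procedure closely related to the bootstrap and jackknife named above, though not identical to either. The group labels are shuffled many times and the observed difference in means is compared with the differences obtained under shuffling. The function name, the number of permutations and the toy values are assumptions made for illustration.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Empirical two-sided p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign samples to groups at random
        shuffled_diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if shuffled_diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Made-up expression values for one feature in two conditions.
condition_a = [5.1, 5.3, 5.0, 5.2, 5.4]
condition_b = [7.9, 8.1, 8.0, 8.2, 7.8]
p_separated = permutation_p_value(condition_a, condition_b)
p_identical = permutation_p_value([1.0, 1.0, 1.0], [1.0, 1.0, 1.0])
```

Because no distribution is assumed for the expression values, this kind of procedure is one option when the assumptions of a parametric test are doubtful.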

Copy number variation

CNV on chromosome 8

Even though SNPs are an important source of variability, it is known that structural variations also play a crucial role in many biological scenarios. Hundreds of thousands of non-polymorphic nucleotides in the genome or short DNA regions (CGH) can be tested so as to infer their number of copies in the genome. In turn, stretches of nucleotides or regions that show such variations in copy number are used to identify portions of chromosomes, or whole chromosomes, that are deleted or duplicated.
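The idea of turning per-probe measurements into deleted or duplicated stretches can be sketched as follows. Each probe is assumed to carry a log2 ratio of sample signal to reference signal. In a diploid genome a single-copy gain corresponds to log2(3/2) ≈ 0.58 and a single-copy loss to log2(1/2) = -1. The thresholds, the minimum run length and the function name are hypothetical choices for this illustration. Production tools use dedicated segmentation algorithms rather than this simple run detection.

```python
def call_segments(log2_ratios, gain=0.3, loss=-0.3, min_probes=3):
    """Return (first_index, last_index, state) for runs of consecutive probes
    that share the same non-neutral state.

    The thresholds and min_probes are illustrative assumptions.
    """
    def state(ratio):
        if ratio >= gain:
            return "gain"
        if ratio <= loss:
            return "loss"
        return "neutral"

    states = [state(r) for r in log2_ratios]
    segments = []
    start = 0
    for i in range(1, len(states) + 1):
        # A run ends at the end of the data or when the state changes.
        if i == len(states) or states[i] != states[start]:
            if states[start] != "neutral" and i - start >= min_probes:
                segments.append((start, i - 1, states[start]))
            start = i
    return segments


# Made-up log2 ratios for probes ordered along one chromosome.
ratios = [0.0, 0.05, 0.6, 0.55, 0.58, 0.62, -0.02, -1.0, -0.9, -1.1, 0.01]
segments = call_segments(ratios)
```

Requiring several consecutive probes in the same state reduces the chance that a single noisy probe is reported as a copy number change.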

Genetic association studies

Genetic association studies aim to detect associations between genetic markers (mostly single nucleotide polymorphisms, SNPs) and a given phenotypic trait (disease, reaction to a given drug, ...). Several analyses are applied, including characterization of the frequency distribution, testing of Hardy-Weinberg equilibrium, and analysis of association with different binary and quantitative traits. As with gene expression, statistical tests allow us to estimate to what extent the genetic pattern observed in the groups of interest (case-control) is related to the trait under investigation. As up to hundreds of thousands of SNPs are tested at once, additional adjustments or empirical evaluations need to be carried out to address false positives. Genetic or allelic models can be applied to the genetic data, and SNPs can also be grouped into haplotypes that are then tested for association.
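Two of the analyses mentioned above can be sketched for a single biallelic SNP: a chi-square test of Hardy-Weinberg equilibrium on genotype counts, and an allelic case-control test on a 2×2 table of allele counts. Both statistics have one degree of freedom, for which the chi-square tail probability equals erfc(sqrt(x / 2)). The function names and the counts are illustrative assumptions. The sketch also assumes that both alleles are observed and that no table margin is zero.

```python
from math import erfc, sqrt


def chi2_p_value_1df(statistic):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(statistic / 2.0))


def hardy_weinberg_test(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_p_value_1df(chi2)


def allelic_association_test(case_a, case_b, control_a, control_b):
    """Chi-square test on the 2x2 table of allele counts (cases vs controls)."""
    n = case_a + case_b + control_a + control_b
    cross_difference = case_a * control_b - case_b * control_a
    margins = (
        (case_a + case_b) * (control_a + control_b)
        * (case_a + control_a) * (case_b + control_b)
    )
    chi2 = n * cross_difference ** 2 / margins
    return chi2, chi2_p_value_1df(chi2)


# Made-up counts for one SNP.
hwe_chi2, hwe_p = hardy_weinberg_test(25, 50, 25)
assoc_chi2, assoc_p = allelic_association_test(60, 40, 40, 60)
```

When many SNPs are tested, the resulting p-values still need the adjustments for false positives described in the text.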
