Statistical analysis of DNA polymorphism

Jpn J Genet. 1993 Dec;68(6):567-95. doi: 10.1266/jjg.68.567.

Abstract

A large amount of genetic variation can be maintained in natural populations. To understand the mechanism that maintains this variation, we must first estimate its amount. Two measures are used to estimate the amount of DNA polymorphism: the average number of pairwise nucleotide differences and the number of segregating sites in a sample of DNA sequences. Using these two measures, we can test the neutral mutation-random drift hypothesis (the neutral theory). The expected amount of DNA polymorphism has been studied under several models, including population subdivision, change in population size, and natural selection. When a population is subdivided, a large amount of DNA polymorphism can be maintained in the population if the migration rates among subpopulations are small. In this case, the amount of DNA polymorphism in a subpopulation with a lower migration rate is expected to be smaller than that in a subpopulation with a higher migration rate. When the population size changes, the number of segregating sites changes more rapidly than does the average number of nucleotide differences. When purifying selection is operating, the number of segregating sites is more strongly affected by deleterious mutants than is the average number of nucleotide differences. In contrast, when balancing selection is operating, selection has a larger effect on the average number of nucleotide differences than on the number of segregating sites. A mutant under natural selection also affects the amount of DNA polymorphism at linked sites (the hitchhiking effect). DNA sequences are not random sequences, and they may contain conserved and variable regions. A statistical method for determining the window size and for finding nonrandom regions in a sequence is also presented.
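The two measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the sample is a list of aligned, equal-length sequences with no gaps or ambiguous bases; the function names and the toy sequences are hypothetical and do not come from the paper. The abstract does not say which test statistic is used, so the last function shows one well-known way to combine the two measures, Tajima's D, purely as an assumed example of such a test.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns in which at least two sequences differ."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of nucleotide differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


def tajimas_d(seqs):
    """Tajima's D (assumed here as the example test; not named in the abstract).

    Returns None when the statistic is undefined: fewer than four
    sequences (the variance term is zero) or no segregating sites.
    """
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None
    pi = mean_pairwise_differences(seqs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between the two estimates, scaled by its standard deviation.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy sample (hypothetical data): four aligned sequences, two singleton variants.
sample = [
    "ATGCATGCAT",
    "ATGCATGCAT",
    "ATGCTTGCAT",
    "ATGCATGCAA",
]

S = segregating_sites(sample)           # 2 segregating sites
k = mean_pairwise_differences(sample)   # 1.0 differences per pair
D = tajimas_d(sample)                   # negative: both variants are singletons
```

In this toy sample both variants are carried by a single sequence, so the number of segregating sites is large relative to the average pairwise difference and D is negative. That is the same direction of imbalance the abstract describes when the number of segregating sites responds more strongly than the average number of nucleotide differences.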

MeSH terms

  • Animals
  • Base Sequence
  • DNA / genetics*
  • Genetics, Population
  • Models, Genetic*
  • Models, Statistical*
  • Molecular Sequence Data
  • Polymorphism, Genetic*
  • Selection, Genetic

Substances

  • DNA