The Pharmacogenetics and Pharmacogenomics Knowledge Base (PharmGKB, http://www.pharmgkb.org) is a public repository of genotype and phenotype information relevant to pharmacogenetics. The PharmGKB is web-based and supports the representation, storage, analysis, and dissemination of pharmacogenetic data. It is currently being developed at Stanford University with funding from the National Institutes of Health (NIH, with the National Institute of General Medical Sciences as the lead institute), and is part of the NIH Pharmacogenetics Research Network (PGRN), a collaborative research consortium. The aim of the PharmGKB is to catalyze research in the field and to promote the sharing of key pharmacogenetic data sets.

There are many types of data relevant to pharmacogenetics and pharmacogenomics. The PharmGKB organizes its data into five high-level categories: data pertinent to (1) variation in clinical outcome, (2) variation in pharmacodynamics and drug responses, (3) variation in pharmacokinetics, (4) variation in molecular and cellular functional assays, and (5) variation in genetic sequence. Every data set is classified into one or more of these five categories and is also annotated with its associated genes, drugs, and diseases. The PharmGKB has a large collection of genotypes for genes of pharmacogenetic interest, showing the patterns of polymorphisms identified in different populations. For example, the primary data associated with the report published in this journal on the pharmacogenetics of the human sulfotransferase (SULT2A1) gene [1] are available on the PharmGKB, along with PCR primer information and the populations studied [2]. The PharmGKB also contains phenotype data sets that are linked to particular genotypes. For example, P-glycoprotein 1, commonly known as MDR1 or ABCB1, is critical for drug transport across the blood–brain barrier. PharmGKB contains multiple genotypic variants of MDR1 for different populations [3]. At one position in the gene (Golden Path position chr7:86736872), three different research groups have submitted a synonymous SNP (A → T, Ile/Ile) with population frequencies of 56.9%/43.1%, 60.82%/39.18%, and 53.34%/46.66%, respectively. It is satisfying to see concordant, independent measurements in the database. One of the studied populations is associated with a phenotype data set showing individual responses to tamoxifen. Other genotypic variants are linked to other drug phenotypes, including the pharmacokinetics of midazolam [4], docetaxel [5], and irinotecan [6]. In addition to core genotype and phenotype data sets, PharmGKB also collects information, culled from the literature, about key gene–drug interactions of relevance to pharmacogenetics.
PharmGKB integrates all these data with relevant information from other databases and provides browsing and analytical functions to help scientists discover connections between genetic variations and alterations in drug response and related phenotypes.
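
As a rough illustration of what "concordant" means for the three MDR1 submissions quoted above, the following minimal sketch summarizes the reported frequency pairs. It is not part of the PharmGKB and does not use any PharmGKB interface. The group labels and function names are hypothetical, and because the text does not say which allele each number refers to, the code simply works with the first-listed value of each pair.

```python
from statistics import mean

# Frequency pairs (in percent) quoted in the text for the synonymous SNP
# at chr7:86736872. The keys are placeholder labels, not real group names.
reported = {
    "group_1": (56.9, 43.1),
    "group_2": (60.82, 39.18),
    "group_3": (53.34, 46.66),
}


def sums_to_100(freqs, tol=0.01):
    """Check that each pair of frequencies adds up to 100 percent."""
    return all(abs(a + b - 100.0) <= tol for a, b in freqs.values())


def summarize(freqs):
    """Return the mean and the range of the first-listed frequency."""
    first = [pair[0] for pair in freqs.values()]
    return mean(first), max(first) - min(first)


avg, spread = summarize(reported)
print(f"pairs sum to 100%: {sums_to_100(reported)}")
print(f"mean of first-listed frequency: {avg:.2f}%")
print(f"range across submissions: {spread:.2f} percentage points")
```

A summary of this kind only describes the spread between submissions. Deciding whether the populations truly agree would also require sample sizes, which are not given here.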

We welcome the opportunity to interact with the general pharmacogenetics community, and encourage all pharmacogenetic scientists to deposit their data in the PharmGKB. PharmGKB has mechanisms to protect the confidentiality of research subjects and the security of their data, including the removal of all direct identifiers and measures for implementing access control. The PharmGKB is designed to serve as a long-term archive for pharmacogenetic data sets, so the effort invested in a submission is repaid by the assurance that the data will remain available for interpretation and correlation as new hypotheses emerge. We welcome suggestions and feedback about new functionality that would make the PharmGKB more useful (feedback@pharmgkb.org).