Introduction

The antiviral activity of the TRIM5α protein was first reported by Stremlau and colleagues, who discovered that a rhesus monkey ortholog (rhTRIM5α) could protect human cell lines from infection by HIV-1 (Stremlau et al. 2004). Subsequent work by a number of groups demonstrated that TRIM5α from a variety of primates, including apes, Old World monkeys, and New World monkeys (Diehl et al. 2008; Hatziioannou et al. 2004; Keckesova et al. 2004; Kono et al. 2008; Saenz et al. 2005; Sawyer et al. 2005; Song et al. 2005b; Stremlau et al. 2004, 2005; Yap et al. 2005, 2008; Ylinen et al. 2005), as well as related proteins from cows (Si et al. 2006; Ylinen et al. 2006) and rabbits (Schaller et al. 2007), have similar antiretroviral activity when tested against a variety of retroviruses. Analyses of the TRIM5α protein sequences revealed considerable interspecies divergence, with much of the variation localized to discrete subdomains in the C-terminal B30.2/SPRY domain (sometimes referred to as the SPRY or PRYSPRY domain, and referred to herein as the B30.2 domain). The differential retroviral specificities of the various primate TRIM5α orthologs can be attributed in many cases to divergence in these variable regions.

There are over 70 TRIM family genes in the human genome. Genes in the TRIM family characteristically encode an ordered array of three protein domains (Fig. 1), including (a) an amino-terminal RING domain, (b) one or two B-box domains, and (c) a long coiled-coil, which together constitute the canonical TRIM (tripartite) motif (Reymond et al. 2001; Sardiello et al. 2008). Most TRIM genes encode an additional C-terminal domain, which in many cases is a B30.2 domain (Nisole et al. 2005; Reymond et al. 2001; Rhodes et al. 2005; Sardiello et al. 2008). The TRIM5 gene encodes multiple protein isoforms. TRIM5α (product of the longest alternately spliced transcript) is a cytoplasmic protein that is broadly expressed in many tissue types (Sawyer et al. 2007). TRIM5α recognizes the capsid coat of incoming retroviral cores as they enter newly infected cells, blocking the retroviral replication cycle during the earliest, post-entry stage(s) of infection (Fig. 2; Anderson et al. 2006; Hatziioannou et al. 2004; Perron et al. 2004, 2007; Sebastian and Luban 2005; Shi and Aiken 2006; Stremlau et al. 2004, 2006; Wu et al. 2006; Yap et al. 2004, 2006). The precise mechanism of TRIM5α restriction is still the subject of intense scrutiny, but probably involves premature disassembly of the capsid core (Anderson et al. 2006; Perron et al. 2007; Sebastian and Luban 2005; Shi and Aiken 2006; Stremlau et al. 2006; Wu et al. 2006). The majority of work published thus far has focused on primate orthologs of TRIM5α, and collectively, these have been reported to restrict a variety of divergent retroviral targets, including both primate and nonprimate lentiviruses, gammaretroviruses, betaretroviruses, and spumaretroviruses (Diehl et al. 2008; Hatziioannou et al. 2004; Keckesova et al. 2004; Kono et al. 2008; Saenz et al. 2005; Sawyer et al. 2005; Song et al. 2005b; Stremlau et al. 2004, 2005; Yap et al. 2005, 2008; Ylinen et al. 2005). 
Primate studies indicate that TRIM5α proteins have the potential to affect immunity against HIV, but like all other layers of potential human immunity to this viral pathogen, the human TRIM5 ortholog is ineffective against HIV (Stremlau et al. 2004).

Fig. 1

The TRIM5 gene. The human TRIM5 gene is approximately 21 kb in length. Several alternately spliced RNA transcripts are produced from this gene, one of which encodes the TRIM5α protein. This protein contains a RING domain, a B-box2 domain, and a coiled-coil domain, the three signature domains of the TRIM gene family. The α isoform possesses an additional C-terminal B30.2 domain. The B30.2 domain is sometimes referred to as the SPRY, PRYSPRY, or B30.2/SPRY domain in the TRIM5 literature (Rhodes et al. 2005)

Fig. 2

TRIM5α blocks an early step in retroviral infection. An HIV virion is shown infecting a target cell. The cytoplasmic TRIM5α protein interacts with the capsid core of incoming retroviral particles, subsequently blocking the retroviral life cycle at a step before reverse transcription

Evolutionary analysis has provided many insights into how the TRIM5α antiretroviral protein works. It is clear that this protein has been co-evolving with retroviruses for tens of millions of years, a relationship whose origins may predate the radiation of eutherian mammals. This long-term, evolutionary “arms race” has demanded many functional innovations in TRIM5α, including short insertions/deletions and tandem duplications in discrete subdomains of the protein, rapid sequence change and elevated levels of nonsynonymous substitution, balanced polymorphisms, and structural variation including expansions of gene copy number and exon shuffling/trapping. The B30.2 domain of TRIM5α has been found to be the major determinant of viral specificity, and extreme rates of positive selection of protein sequence are found within short stretches of this domain (Sawyer et al. 2005). At least twice in the course of primate evolution TRIM5–cyclophilin A chimeric proteins arose via the capture of new exons (Brennan et al. 2008; Liao et al. 2007; Newman et al. 2008; Nisole et al. 2004; Sayah et al. 2004; Virgen et al. 2008; Wilson et al. 2008b). The TRIM family is vigorously engaged in gene duplication (Nisole et al. 2005; Reymond et al. 2001; Sardiello et al. 2008), and in some mammalian genomes, expansions, inversions, and deletions of the TRIM5 locus itself are evident (Sawyer et al. 2007; Tareen et al. 2009). Given abundant evidence that retroviral pathogens have been ubiquitously present throughout primate evolution (Gifford and Tristem 2003; Johnson 2008; Lower et al. 1996; Mager and Freeman 1995; Sverdlov 2000; Weiss 2006; Blikstad et al. 2008), and the variety of retroviruses that are known to colonize extant primates (Vogt 1997), it is reasonable to hypothesize that primate TRIM5α has long played, and continues to play, a role in governing patterns of susceptibility to cross-species transmission and spread (within species) of retroviral pathogens (Song et al. 2005a).

The evolutionary analysis of TRIM5 holds lessons that will translate well to the study of other intrinsic antiviral factors. Unlike many genes, whose histories are dominated by long periods of purifying selection, molecular evolution of TRIM5 is characterized by frequent episodes of positive selection (Liu et al. 2005; Ortiz et al. 2006; Sawyer et al. 2005, 2007), and this must be taken into account when interpreting the results of phylogenetic reconstruction and other molecular evolutionary analyses. In this regard, TRIM5 most closely resembles genes of the innate and adaptive immune system. As with any locus displaying high levels of divergence and polymorphism, one cannot assume that a TRIM5 sequence retrieved from a single genome project or from a single biological sample represents fixed interspecies differences or the most common allele. At a minimum, additional sampling of multiple, unrelated individuals is required to ensure that species-specific generalizations are based on major allele(s). A similar caution applies to the viral targets frequently employed in tissue culture studies of restriction: for decades, isolation of many laboratory retroviral strains involved rescue or propagation on convenient, heterologous cell lines, often from unrelated species, such that the isolation process itself could have selected for viral sequences that were rare or absent in the natural host. Finally, reports of TRIM5α antiviral activity are almost exclusively based on restriction of retroviral clones commonly studied in the laboratory, whose patterns of sensitivity/resistance to any given TRIM5α ortholog may simply be due to chance cross-reactivity and not the result of a prior evolutionary encounter. Thus, establishing restriction activity for any given TRIM5α variant depends on those viruses against which it is tested, and the absence of demonstrable antiviral activity may only mean that restriction of a complementary target was not attempted. 
In fact, it is possible that a given TRIM5α sequence may be highly specific for viral targets that no longer exist.

Retroviruses as a long-term selective pressure

Human genomes each contain around a half-million remnants of integrated retroviruses, comprising approximately 8% of the total genomic sequence. In this regard, the amount of sequence derived from retroviral integration exceeds the amount devoted to encoding cellular proteins (Lander et al. 2001). Such loci are referred to as endogenous retroviruses (or ERVs), despite the fact that a majority are sufficiently degraded that they are no longer capable of expressing infectious virus (for a recent review see Jern and Coffin 2008). A similar endogenous retroviral load is also found in the genomes of the chimpanzee and rhesus macaque (Mikkelsen et al. 2005; Gibbs et al. 2007; Han et al. 2007), and there are large numbers of these elements in the genomes of other mammals (Gibbs et al. 2004; Gifford and Tristem 2003; Lindblad-Toh et al. 2005; Pontius et al. 2007). Collectively, millions of such elements distributed among the genomes of different species represent an immense “fossil” record of the Retroviridae, extending back over hundreds of millions of years (Coffin 2004). It is likely that the majority of exogenous retroviral infections leave no such record (Johnson 2008), so the millions of ERVs represent only a tiny and potentially biased fraction of the true extent and variety of retroviral infections encountered by a given host lineage. These observations cumulatively suggest that modern primates (including humans) and their ancestors evolved in a landscape rich with retroviral pathogens, and that HIV is only the most recent chapter in a voluminous history of retrovirus–host interactions.

Most primate endogenous retroviruses identified to date resemble exogenous gamma- and betaretroviruses. While it has been widely assumed that the lentiviruses are of relatively recent origin, dating based on sequence divergence between pairs of duplicated endogenous lentivirus elements in the genome of European rabbits (Katzourakis et al. 2007) established that these viruses have been around for at least seven million years, if not longer. This raises the intriguing possibility that ancient HIV-like viruses may have contributed to patterns of evolution still discernible in the TRIM5 loci of living mammalian taxa. It has long been suggested that lentiviruses may not be prone to germline integration due to tissue specificity or other properties of their life cycle. However, sequences unambiguously related to the primate lentiviruses (HIV and SIV) have recently been discovered in the genome of a Malagasy primate, the gray mouse lemur (Microcebus murinus; Gifford et al. 2008), indicating that the dearth of endogenous lentiviruses may simply be a result of the fact that lentiviruses are a relatively new family of retroviruses.

Structural diversity in TRIM5

TRIM5 gene gain and loss

The TRIM5 gene is estimated to be between 90 and 180 million years old (Sawyer et al. 2007). TRIM5 homologs have been described in the genomes of primates, mouse, rat, dog, cow, and pig, but are not found in the genome sequence of chicken, suggesting a mammalian origin (Sawyer et al. 2007; Tareen et al. 2009). Because no TRIM5 ortholog exists in the current release of the single marsupial genome project (opossum), it is possible that TRIM5 may be specific to eutherian mammals. It is clear that this locus has not been conservatively inherited through eutherian evolution (Fig. 3). The genomes of cow, rat, and mouse contain expanded arrays of TRIM5 genes, while the human genome contains a single TRIM5 gene, and the dog genome has lost TRIM5 altogether as the result of a relatively recent gene disruption (Sawyer et al. 2007; Tareen et al. 2009). In addition to primate orthologs of TRIM5, antiviral activity has been reported for rabbit and cow TRIM5 genes (Schaller et al. 2007; Si et al. 2006; Ylinen et al. 2006). So far, no antiviral activity has been reported for the TRIM5 genes of mice and rats (Fig. 3; Tareen et al. 2009). It is easy to imagine that TRIM5 gene duplications in certain species (cow, rat, mouse) might have been selectively retained, with new paralogs providing an opportunity for simultaneous selection for multiple CA specificities. Among the cow TRIM5 genes, only TRIM5-3 has been shown to be antiviral; it is not yet known whether the other cow TRIM5 genes have antiviral activity. It is possible that some are specific for viruses that are not yet known to science (or which may no longer exist), and without the proper target, it will be difficult to prove that they are antiviral.

Fig. 3

Structural variation in the TRIM5 locus. The TRIM5 loci of five eutherian mammals are depicted. Humans encode one active TRIM5 gene. The human TRIM5 pseudogene (TRIMP1) is supported by mRNA AF230412 and combines a region of the neighboring gene, TRIM34, with a region within TRIM5. Mouse, rat, dog, and cow all encode variable numbers of TRIM5 genes. In dog, there is only one TRIM5 gene, but it has been interrupted by a gene insertion and has apparently decayed into a pseudogene. In contrast, mouse, rat, and cow all encode multiple TRIM5 genes. The mouse TRIM5 locus contains 8 TRIM5 genes predicted to encode open reading frames. Two of the mouse TRIM5-like genes were previously known as TRIM12 and TRIM30 in the literature. Phylogenetic analysis revealed that the remaining mouse TRIM5 genes form clades with either TRIM12 or TRIM30; hence, these TRIM5 genes were named according to their order on the chromosome and the clade to which they belong (TRIM12-1, 12-2, and TRIM30-1, 30-2, and so on). In the case of the rat TRIM30s, it was not possible to determine 1:1 orthology to the different mouse TRIM30 genes. For this reason, the rat genes were named rat TRIM30-like 1 (L1) or rat TRIM30-like 2 (L2), so as not to mistakenly imply orthology to mouse TRIM30-1 and 30-2. This pattern of gene duplication and/or loss suggests that the TRIM5 gene locus has been highly dynamic in gene content, in addition to the dynamic sequence evolution that has been observed in these genes

It is unclear why the dog genome has tolerated the loss of TRIM5. Perhaps there was no significant fitness cost to losing the TRIM5 gene, although it seems unlikely that retroviruses would not have been a pathogenic threat for dogs or their ancestors. Surprisingly, despite their long association with humans as companion animals and frequent use as research subjects, there are no confirmed reports of exogenous retroviruses isolated from dogs. However, the dog lineage has experienced historical retroviral infections, as witnessed by the many endogenous retroviruses present in the canine genome (Lindblad-Toh et al. 2005). Other genes may have provided sufficient defense against retroviral infections of the canine lineage. By analogy to the murine Fv1 and Fv4 loci (Best et al. 1996; Ikeda et al. 1985), perhaps a subset of canine ERV-related loci could partially fulfill this function. Alternately, other TRIM genes may be fulfilling this role in dogs (Sawyer et al. 2007). In primate genomes (human, chimpanzee, and rhesus), TRIM5 is found in a cluster of four genes (TRIM6, TRIM34, TRIM5, and TRIM22), and phylogenetic analyses indicate that these four expanded from a common ancestral TRIM locus (Sawyer et al. 2007; Si et al. 2006; Ylinen et al. 2006). At least two reports suggest that human TRIM22 may also have anti-HIV activity, although the two papers propose unrelated mechanisms targeting different aspects of the viral replication cycle (Barr et al. 2008; Tissot and Mechti 1995). Patterns of TRIM22 sequence evolution among primates are also suggestive of strong positive selection, consistent with the possibility that TRIM22 may have played a role in host defense (Sawyer et al. 2007). Weak restriction of SIV has also been reported for human TRIM34 (Li et al. 2007; Zhang et al. 2006), making these TRIM genes candidate restriction factors in dogs. 
However, selective breeding of dogs over the last two centuries could simply have resulted in chance fixation of the disrupted TRIM5 allele (regardless of any increased susceptibility to retroviral pathogenesis), just as it has resulted in fixation of deleterious alleles of many other genes (Bjornerfeldt et al. 2006; Wayne and Ostrander 2007). Surveys of additional canid species should reveal the timing of the TRIM5 disruption and shed light on this issue. The lack of a functional TRIM5 may explain the sensitivity of several dog cell lines to laboratory retroviral infection and their widespread use as target cell lines in viral infectivity assays, although this remains to be explored. Interestingly, experimentally introduced TRIM5α proteins appear to be less potent when expressed in canine cell lines than in other cell types (Berube et al. 2007; Saenz et al. 2005), although it is still not clear why this is the case.

Independent evolution of TRIM5–CypA chimeras in two primate lineages

In 1999, Hofmann et al. reported the results of a survey of primate cell lines, tested for susceptibility to single-round infection with HIV-1 (Hofmann et al. 1999). Because many of the cell types available were not from tissues expressing the natural viral surface receptors, the viruses were pseudotyped with the pantropic envelope glycoprotein of the vesicular stomatitis virus (VSV-G). Several interesting observations from the Hofmann paper were to remain unexplained until the discovery of TRIM5α-mediated restriction several years later. Among these was the finding that cells from South American owl monkeys (Aotus sp.) were resistant to HIV-1 infection, a point of contrast to most other New World species tested. Even more puzzling was the observation that treatment of owl monkey cells with a cyclophilin A inhibitor relieved this block (Towers et al. 2003). The explanation arrived when it was discovered that owl monkey cells express a messenger RNA (mRNA) encoding a TRIM5–cyclophilin A fusion protein, the consequence of a cyclophilin A (CypA) pseudogene situated in the short intron between exons 7 and 8 of the owl monkey TRIM5 locus (Fig. 4; Nisole et al. 2004; Sayah et al. 2004). It has been known for many years that cellular CypA binds to the capsid protein of HIV-1 (Franke et al. 1994; Luban et al. 1993; Thali et al. 1994); thus, the TRIM5–CypA fusion expressed in owl monkey cells essentially couples the HIV-1 recognition property of CypA to the antiviral mechanism(s) encoded by TRIM5. Unlike human cell lines, where treatment with cyclosporine A reduces HIV-1 infectivity by interfering with CypA binding (Braaten et al. 1996; Dorfman and Gottlinger 1996; Franke and Luban 1996), in Aotus cells cyclosporine A enhances infectivity indirectly, by preventing inhibition by the TRIM5–CypA fusion protein (Nisole et al. 2004; Sayah et al. 2004; Towers et al. 2003).

Fig. 4

Independent evolution of genes encoding TRIM5–cyclophilin A fusion proteins in two different primate lineages. The figure depicts the TRIM5 locus found in the New World owl monkeys (top) and the TRIM5–CypA allele found in Asian macaques (bottom). Both loci direct expression of TRIM5–CypA fusion proteins (known as TRIM5CypA1 and TRIM5CypA2, respectively (Stoye and Yap 2008)). In both cases, the CypA moiety is encoded by a CypA ORF generated by retrotranspositional insertion (orange boxes and arrowheads). The insertion events are clearly distinct, with the owl monkey CypA insertion located in an intron and the macaque insertion in the 3′UTR. The location of a G/T polymorphism in a 3′ splice acceptor is indicated; the nonfunctional T variant is linked to the CypA insertion and is thought to favor expression of the TRIM5–CypA isoform. Note that the macaque variant lacks sequences corresponding to exon 7, whereas this exon is retained in the owl monkey version

The CypA ORF in the owl monkey TRIM5 locus is flanked by direct repeats of genomic DNA sequence, a hallmark of LINE-1-mediated retrotranspositional insertion (Nisole et al. 2004; Sayah et al. 2004). Ribeiro et al. reported that all species of owl monkeys examined contained the CypA insertion (Ribeiro et al. 2005). However, inspection of the syntenic position in the genome sequence of another New World monkey, Callithrix jacchus (marmoset), reveals a preserved pre-insertion target site. Thus, the TRIM5–CypA variant may have arisen in a common ancestor of all modern owl monkeys. Because of this, the TRIM5–CypA variant is probably fixed in extant species of the Aotus genus.

Amazingly, expression of a second TRIM5–CypA chimera was discovered in Old World monkeys of the Macaca genus (Liao et al. 2007). Similar to the owl monkey TRIM5–CypA variant, macaque TRIM5–CypA expression results in part from the presence of a retrotransposed CypA coding sequence in the TRIM5 locus. The CypA insertion in macaques, however, is clearly distinct from the insertion in owl monkeys (Fig. 4; Brennan et al. 2008; Liao et al. 2007; Newman et al. 2008; Virgen et al. 2008; Wilson et al. 2008b). In the macaque version, the CypA insertion lies within the 3′UTR of TRIM5, downstream of all known coding exons. In this case, expression of an in-frame TRIM5–CypA fusion protein results from alternative splicing from exon 6 of TRIM5 to a 3′ acceptor site immediately upstream of the CypA ORF. 3′ splice acceptors are highly conserved features of exons and are typically composed of a polypyrimidine tract followed closely by an AG dinucleotide, which in turn defines the precise border of the intron and the downstream exon. The insertion of the CypA element immediately 3′ to a stretch of pyrimidines, and the presence within the retrotransposed element of a suitable AG dinucleotide, thus reconstituted the essential elements of a splice acceptor, and permits in-frame readthrough and expression of a TRIM5–CypA chimeric protein when exon 6 of TRIM5 serves as the splice donor.
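The way an insertion can reconstitute a splice acceptor can be illustrated with a minimal Python sketch. It scans a DNA string for an AG dinucleotide preceded closely by a run of pyrimidines. The function name, the thresholds (min_tract, max_gap), and all sequences are hypothetical toy values chosen for illustration; they are not the macaque TRIM5 sequence, and real splice-site prediction relies on far richer models.

```python
def candidate_acceptors(seq, min_tract=8, max_gap=3):
    """Return 0-based indices of the first base after each candidate AG.

    A candidate is an AG preceded, within max_gap bases, by at least
    min_tract consecutive pyrimidines (C or T). Both thresholds are
    illustrative assumptions, not measured values.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 1):
        if seq[i:i + 2] != "AG":
            continue
        for gap in range(max_gap + 1):
            end = i - gap              # pyrimidine tract ends just before `end`
            start = end - min_tract
            if start < 0:
                continue
            if all(base in "CT" for base in seq[start:end]):
                hits.append(i + 2)     # first base of the downstream exon
                break
    return hits


# Toy sequences (hypothetical): a host region ending in a pyrimidine
# stretch, and an inserted element that carries an AG near its 5' end.
upstream = "GGTGACTTTCTCTTTCC"
downstream = "GTGGCATTC"
inserted_element = "AGATGGTCAACC"

pre_insertion = upstream + downstream
post_insertion = upstream + inserted_element + downstream

print(candidate_acceptors(pre_insertion))   # no candidate before insertion
print(candidate_acceptors(post_insertion))  # one candidate after insertion
```

In this toy case neither the host sequence nor the inserted element alone satisfies the motif; only their juxtaposition does, which is the essence of the scenario described above.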

Remarkably, the CypA insertion in macaque TRIM5 is always found linked to a second mutation within the TRIM5 gene, this being a single G-T substitution in the 3′ splice acceptor at the 5′ boundary of exon 7 (Brennan et al. 2008). As a consequence of this mutation, splicing to exon 7 is obliterated and the TRIM5α isoform is not expressed. Thus, the mutation could have been selected separately as a result of preferential expression of the TRIM5–CypA mRNA at the cost of expressing the α isoform mRNA. A two-nucleotide deletion within the B30.2 domain coding sequence of the rhesus ortholog of TRIM5–CypA results in a frameshift and premature stop codon, further ensuring that a functional TRIM5α isoform probably cannot be expressed from the TRIM5–CypA allele, at least in rhesus macaques (Wilson et al. 2008a, b). Thus, the macaque TRIM5–CypA gene appears to be the culmination of multiple rounds of evolutionary refinement, although the exact order of the steps involved is not known.
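The effect of a two-nucleotide deletion on a reading frame can be shown with a short Python sketch. The coding sequence below is a hypothetical toy example, not the rhesus B30.2 sequence; the only grounded element is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy open reading frame (seven codons, no stop).
original = "ATGGCTGACCTTGATCCGTTG"
# Delete two nucleotides (positions 7 and 8, 1-based) to shift the frame.
mutant = original[:6] + original[8:]

print(translate(original))  # full-length toy peptide
print(translate(mutant))    # truncated: the shifted frame reads TGA early
```

Because the deletion length is not a multiple of three, every downstream codon is read out of register, and a stop codon that was previously out of frame terminates translation.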

Thus far, the TRIM5–CypA variant in macaques has been found in three species of Asian macaque, Macaca mulatta (rhesus macaques), Macaca nemestrina (pig-tailed macaques), and Macaca fascicularis (cynomolgus or crab-eating macaques) (Brennan et al. 2008; Liao et al. 2007; Newman et al. 2008; Virgen et al. 2008; Wilson et al. 2008b). Two surveys were conducted using samples from captive rhesus macaques, and in both studies, less than half of the animals typed carried a copy of TRIM5 containing the inserted CypA element (allele frequencies of 8% and 25%, respectively; Newman et al. 2008; Wilson et al. 2008b). Rhesus macaques are widely dispersed across Asia, and thus far the insertion has only been reported among Indian origin rhesus macaques (Wilson et al. 2008b). Although the numbers may not reveal the true frequency or distribution among wild populations of rhesus macaques, it is clear the CypA insertion is not fixed in that species. It is noteworthy that macaque TRIM5–CypA proteins do not specifically block HIV-1 infection (but they do block other lentiviruses), suggesting that the TRIM5-mediated barrier to HIV-1 infection of rhesus macaques should be absent in individual macaques homozygous for the TRIM5–CypA allele (i.e., TRIM5–CypA/TRIM5–CypA homozygotes). This knowledge may be useful in the further development of macaques as an animal model for HIV (Ambrose et al. 2007). It is especially noteworthy that the TRIM5–CypA allele appears at very high frequency (and may be fixed) in pig-tailed macaques (M. nemestrina), in which case the entire species may not express the alpha isoform (TRIM5α).
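The relationship between the reported allele frequencies and the fraction of animals carrying the insertion can be approximated with a small Python sketch. It assumes Hardy–Weinberg proportions, which is a strong assumption for captive colonies (non-random mating, small founder populations), so the output is only a rough guide. The function names are illustrative.

```python
def genotype_fractions(p):
    """Expected genotype fractions for insertion-allele frequency p,
    assuming Hardy-Weinberg proportions (an assumption, see lead-in)."""
    q = 1.0 - p
    return {
        "insertion/insertion": p * p,
        "insertion/no insertion": 2 * p * q,
        "no insertion/no insertion": q * q,
    }


def carrier_fraction(p):
    """Fraction of animals expected to carry at least one insertion allele."""
    return 1.0 - (1.0 - p) ** 2


# Allele frequencies reported in the two surveys cited in the text.
for p in (0.08, 0.25):
    fractions = genotype_fractions(p)
    print(f"p = {p:.2f}: carriers = {carrier_fraction(p):.4f}, "
          f"homozygotes = {fractions['insertion/insertion']:.4f}")
```

Under this assumption, about 15% and 44% of animals would carry at least one copy at allele frequencies of 8% and 25%, consistent with fewer than half of the animals being carriers, while homozygotes would be expected in fewer than 1% and about 6% of animals, respectively.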

The lineage leading to the Asian macaques split from the other African primates about nine to ten million years ago, and fossil evidence indicates that the ancestors of modern macaques migrated out of Northern Africa into what is now Southern Europe and across what is now the Middle East and Asia (Delson 1975; Raaum et al. 2005; Stewart and Disotell 1998). A small survey of African sooty mangabeys (Cercocebus atys), a sister taxon to the Asian macaques, did not turn up the CypA insertion or the G-T mutation in those animals (Newman et al. 2008); likewise, TRIM5α mRNA has been cloned from a variety of African Old World primates but there are as yet no reports of this TRIM5–CypA fusion in African species. From this, it is possible to infer that the formation of the TRIM5–CypA allele, including the CypA insertion itself and the linked G-T splice acceptor mutation, occurred sometime between five and ten million years ago in a common ancestor of modern macaques (Tosi et al. 2003). This interpretation is complicated, however, by the possibility of hybridization/introgression among macaque lineages (Bonhomme et al. 2009; Tosi et al. 2003), such that the TRIM5–CypA allele could have arisen more recently in one species and subsequently spread to allied populations/species in the Macaca genus. Further elucidation of the origins of macaque TRIM5–CypA, and its timing relative to phylogeographic events in macaque history, could come from examination of a wider distribution of extant macaque species.

The owl monkey and macaque TRIM5–CypA genes are thus far each confined to a single genus (Aotus in one case and Macaca in the other), and the unique genomic structure of the owl monkey and macaque variants unambiguously proves that the two TRIM5–CypA fusions evolved independently (Fig. 4). Given that CypA is known to bind Gag of HIV-1 and several other lentiviruses (Franke et al. 1994; Lin and Emerman 2006; Luban et al. 1993; Thali et al. 1994), and given the well-established antiretroviral activity of TRIM5α, a scenario can be envisioned in which the random formation of genetic TRIM5–CypA fusions provided a selective advantage to their respective hosts in the face of lentiviral infections.

The nature of the C-terminal protein domain of the large TRIM family varies and includes a number of completely distinct, unrelated protein domains (Nisole et al. 2005; Reymond et al. 2001; Sardiello et al. 2008). To this list can now be added cyclophilin A, having arisen at least twice during the evolution of modern primates. From this, one can infer modularity of the two elements (the TRIM motif and the various C-terminal domains), and that the TRIM motif has the property of being readily combined with otherwise unrelated recognition domains. The relatively recent acquisition of the CypA domains (in evolutionary time) means that the primary steps involved in the formation of the new genes could still be elucidated (as described above); perhaps similar processes, such as retrotranspositional introduction of new sequences and “capture” of novel C-terminal exons by alternative splicing, have played a wider role in the expansion of the TRIM family.

TRIM5 sequence evolution

Extreme TRIM5 sequence divergence between primate species

In addition to structural variations that distinguish the TRIM5 locus in different lineages, interspecies comparisons also reveal considerable variation at the nucleotide level. TRIM5 displays some of the highest relative rates of nonsynonymous change (dN/dS) in primate genomes (Liu et al. 2005; Ortiz et al. 2006; Sawyer et al. 2005, 2007), suggesting that it has been adapting through repeated rounds of positive selection. There are currently over 30 primate TRIM5 sequences available in GenBank. These sequences represent geographically distributed primate species from around the world, which shared their last common ancestor about 35 million years ago. When dN/dS values were analyzed along each branch of the primate phylogeny, nearly universally high signals were observed (Sawyer et al. 2005). This is also true for deeper, ancestral branches where these values can be calculated after reconstruction of ancestral node sequences. These results suggest that the adaptive evolution of TRIM5, and its engagement in defense against retroviral infections, has been occurring for tens of millions of years.
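To make the dN/dS idea concrete, the following Python sketch estimates the uncorrected proportions of nonsynonymous (pN) and synonymous (pS) differences between two aligned, in-frame coding sequences, using a simplified counting approach in the spirit of Nei and Gojobori. It is a minimal illustration under stated simplifications (codons differing at more than one position are skipped, changes to stop codons are ignored, no multiple-hit correction), not the likelihood-based codon models typically used in published analyses. All function names and test sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in a codon (between 0 and 3)."""
    sites = 0.0
    for pos in range(3):
        syn = total = 0
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == "*":
                continue  # simplification: ignore changes to stop codons
            total += 1
            syn += CODE[mutant] == CODE[codon]
        if total:
            sites += syn / total
    return sites


def pn_ps(seq_a, seq_b):
    """Return (pN, pS) for two aligned in-frame sequences without gaps."""
    S = N = syn_diffs = nonsyn_diffs = 0.0
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        a, b = seq_a[i:i + 3], seq_b[i:i + 3]
        if a not in CODE or b not in CODE or "*" in (CODE[a], CODE[b]):
            continue
        n_diff = sum(x != y for x, y in zip(a, b))
        if n_diff > 1:
            continue  # simplification: skip codons with multiple changes
        s = (syn_sites(a) + syn_sites(b)) / 2
        S += s
        N += 3 - s
        if n_diff == 1:
            if CODE[a] == CODE[b]:
                syn_diffs += 1
            else:
                nonsyn_diffs += 1
    pN = nonsyn_diffs / N if N else float("nan")
    pS = syn_diffs / S if S else float("nan")
    return pN, pS


# Hypothetical toy alignment: one nonsynonymous change (GCT -> GAT).
pN, pS = pn_ps("CTGGCTAAA", "CTGGATAAA")
print(pN, pS)
```

A ratio of nonsynonymous to synonymous change greater than one is the usual signature of positive selection; when pS is zero, as in the toy alignment, the ratio is undefined, which is one reason real analyses use longer alignments and model-based estimates.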

The distribution of positively selected amino acid sites is not random, but instead these sites fall predominantly in the B30.2 and coiled-coil domains (Ortiz et al. 2006; Sawyer et al. 2005; Song et al. 2005a). Multiple, positively selected codons fall in a very tight cluster at the beginning of the B30.2 domain, in a “patch” of positive selection that is 11 residues long in the human protein (Sawyer et al. 2005). Based on the assumption that adaptation will be occurring at precisely the spot that must evolve to continually keep step with variation in the capsid proteins (CA) of retroviruses, it was predicted that this patch represents the interface with CA. This hypothesis was tested by making chimeric proteins in the location of this patch, and showing that the specificity of one TRIM5α protein could be transferred to another through this method (Sawyer et al. 2005). In fact, substitution of just a single amino acid in this region of the human protein can convey levels of restriction of HIV-1 similar to the rhesus protein (Stremlau et al. 2005; Yap et al. 2005). It turns out that the amino acid at position 332 in the context of human TRIM5α strongly influences HIV recognition. Replacement of the arginine at position 332 with proline, as is found in the rhesus protein, results in improved restriction of HIV-1 (Li et al. 2006; Yap et al. 2005). In fact, any nonpositively charged amino acid at position 332 in the context of human TRIM5α makes the recognition of HIV-1 CA more efficient (Li et al. 2006).

B30.2 domains are thought to form a “beta-sandwich” structure, with a series of strands that fold into two parallel beta sheets (Grutter et al. 2006; Masters et al. 2006; Ohkura et al. 2006; Seto et al. 1999; Woo et al. 2006a, b). Interspecies comparisons indicate that the protein sequence of the beta strand segments tends to be stable in length, in contrast to the many insertions and deletions in the putative loops connecting these strands (Sawyer et al. 2005; Song et al. 2005a). It is assumed that these regions are likely to lie on the surface of the B30.2 structure, where they may be available to interact with viral target structures. Several lineages have experienced expansions and tandem duplications of short peptide sequences in these regions of the TRIM5α B30.2 domain. In a particularly dramatic example, the third variable region (V3) of the spider monkey TRIM5α B30.2 domain is 96 residues long (compared to typical V3 lengths of 32–41 residues for most other primate orthologs) and appears to be the result of multiple, tandem sequence duplications. Similarly, the V1 region ranges from 17 to 46 amino acids in length among primates. Whether such short insertions/deletions served as the substrate for positive selection remains to be established. Such events are concentrated in the same regions of the protein that display high levels of nonsynonymous substitution (Sawyer et al. 2005; Song et al. 2005a), and both types of variation overlap suspected points of contact between the B30.2 domain and viral CA targets. One could imagine that some length variants comprised structural adaptations selected to accommodate complementary features on the exposed surfaces of viral CA targets, or alternatively, that the extra sequence generated by insertion/duplication provided additional genetic material for amino acid variation and subsequent positive selection. 
For example, a six-nucleotide insertion/deletion polymorphism in the V1 region of rhesus macaque TRIM5α (TFP339Q in Fig. 5) is sufficient to explain differential restriction of HIV-2 (Wilson et al. 2008a) and SIVagmTAN (Newman and Johnson, unpublished results) by the most common rhTRIM5α alleles, but it is not clear whether this is due to the small difference in length or amino acid composition (or both). Similarly, a 20-amino-acid duplication found in an African green monkey TRIM5α ortholog transfers the capacity to restrict SIVmac239 when reconstituted in the context of human TRIM5α (Nakayama et al. 2005). High ratios of nonsynonymous to synonymous mutation have been documented within the duplicated regions of several of these expansions, supporting the hypothesis that adaptive change can occur within these inserted sequences (Sawyer et al. 2005).
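The distinction between nonsynonymous and synonymous change that underlies these ratios can be illustrated with a minimal Python sketch. It assumes two aligned, in-frame coding sequences and the standard genetic code, and simply classifies each differing codon by whether the encoded amino acid changes. The function name and the toy sequences are hypothetical and are not taken from any TRIM5 ortholog. The sketch returns raw counts only; it is not a normalized dN/dS estimate, which would also require counting the synonymous and nonsynonymous sites available in each sequence, and it is not intended to represent the methods used in the studies cited above.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_differences(seq_a, seq_b):
    """Count differing codons as (synonymous, nonsynonymous).

    Both inputs must be aligned, in-frame coding sequences of equal length.
    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")

    synonymous = 0
    nonsynonymous = 0
    for start in range(0, len(seq_a), 3):
        codon_a = seq_a[start:start + 3].upper()
        codon_b = seq_b[start:start + 3].upper()
        if codon_a == codon_b:
            continue  # identical codons carry no information here
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguity code: skip rather than guess
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy example (invented sequences): CGG (Arg) -> CCG (Pro) changes the
# amino acid, whereas CGA (Arg) -> CGG (Arg) does not.
syn, nonsyn = classify_codon_differences("ATGCGGCGA", "ATGCCGCGG")
print(f"synonymous={syn}, nonsynonymous={nonsyn}")
```

An excess of nonsynonymous over synonymous change, once each count is scaled by the number of corresponding sites, is the signature that analyses of positive selection look for.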

Fig. 5

Comparison of nonsynonymous polymorphism in TRIM5α of humans, rhesus macaques, and sooty mangabeys. a Despite the large number of humans surveyed (>1,000 as of this writing), only a small handful of polymorphic sites have been identified. In contrast, much smaller surveys of Asian rhesus macaques and the African sooty mangabeys have revealed extensive polymorphism, particularly in the C-terminal portion of the coiled-coil domain and the variable portions of the B30.2 domain. In addition, the TRIM5–CypA variant (not shown) is also found at significant frequency in rhesus macaques (see text). Note that humans lack polymorphism in these same regions. Sites in blue indicate putative shared polymorphisms. The TFP339Q indel described in the text is indicated in red. H. sapiens human, M. mulatta rhesus macaque, C. atys sooty mangabey. b A cladogram depicting the evolutionary relationship among several primate species is shown, in order to illustrate the pattern of divergence and diversity at position 332 in the TRIM5α protein (equivalent to position 334 in the macaque and sooty mangabey orthologs). Arginine (R), proline (P), and glutamine (Q) appear to have replaced each other repeatedly during primate evolution. For this reason, position 332 was identified as a codon position evolving under positive selection. Amazingly, alleles encoding all three of these variant amino acids were reported in African sooty mangabeys, and at least two of these (P and Q) were found in Asian rhesus macaques

TRIM5 diversity in populations

Diversity between species and variation within species arise from the same evolutionary mechanisms, albeit on different timescales (with divergence between species representing deeper separations and variation within species representing more recent events). The discovery of extensive diversity in TRIM5 between primate species raised the question of whether the locus would be polymorphic within species as well, and whether naturally occurring variants differentially affect susceptibility to retroviral infection or disease progression. Thus far, reports of extensive within-species sampling for polymorphism in TRIM5 have been limited to humans (Sawyer et al. 2006), most commonly in the context of HIV/AIDS cohorts (Goldschmidt et al. 2006; Javanbakht et al. 2006; Nakayama et al. 2007; Speelmon et al. 2006; van Manen et al. 2008), and two species of Old World monkey (Fig. 5; Newman et al. 2005; Wilson et al. 2008a).

Human polymorphism

The observation that one or a few amino acid changes dramatically alter the ability of TRIM5α to interact with HIV-1 CA immediately raised the question of what TRIM5 looks like in different human populations: Is it possible that humans encode TRIM5α proteins with a spectrum of potencies against HIV? Several diversity and epidemiological studies in humans ensued. In the first study, world diversity in the human TRIM5 gene was assessed (Sawyer et al. 2006). Of the 37 geographically/ethnically distinct human samples surveyed, all encoded an arginine at position 332. Only a handful of nonsynonymous single nucleotide polymorphisms (SNPs) were identified, and these were spread throughout the coding sequence; strikingly, the N-terminal portion of the B30.2 domain, encompassing the variable regions, was found to be devoid of polymorphism (Fig. 5). Of the common protein-altering SNPs, only one (H43Y) was found to modulate activity against retroviruses in vitro; in this case, the 43Y allele weakens TRIM5α restriction (Sawyer et al. 2006). This study brought H43Y forward as a candidate SNP for differential susceptibility of certain human populations, and at least one study has reported an association between the 43Y/Y genotype and accelerated disease progression (van Manen et al. 2008). It will be especially interesting to explore the effects of TRIM5 genotype in cohorts from Central and South America, where the 43Y allele is found at high prevalence compared to other parts of the world, where it is rare (Sawyer et al. 2006).

The most extensive surveys of human TRIM5 polymorphism come from analysis of HIV/AIDS cohorts and related control populations. Studies using these cohorts have sought associations between variation in TRIM5 and a variety of outcomes, such as (a) susceptibility/resistance to HIV-1 infection (by measuring allele frequencies in high-risk yet uninfected individuals), (b) levels of viral replication in HIV-1-infected individuals, and (c) various indicators of disease progression among HIV-1-infected individuals. Most studies have reported weak associations with at least one of these outcomes, although these independent studies did not arrive at the same conclusions (Goldschmidt et al. 2006; Javanbakht et al. 2006; Nakayama et al. 2007; Speelmon et al. 2006; van Manen et al. 2008). Further work will be required to determine the source of the discrepancies, and whether these reflect true biological/genotypic differences between cohorts. While a large number of sequence variants in human TRIM5 were identified in the course of these studies, very few nonsynonymous SNPs with significant minor allele frequencies were uncovered. Notably, none of the common human polymorphisms fall in the B30.2 variable regions or in previously identified sites of strong positive selection (Fig. 5). Associations between human TRIM5 variation and other retroviral infections (HIV-2, HTLV-1, HTLV-2) have not yet been reported, nor are we aware of TRIM5 association studies in animal models of retroviral pathogenesis.

TRIM5 polymorphism in Old World monkeys

Patterns of TRIM5 polymorphism in Old World monkeys seem to be quite different from what is observed in humans. This is the case in at least the two Old World monkey species that have been surveyed, the Asian rhesus macaque (M. mulatta) and the African sooty mangabey (C. atys; Newman et al. 2006, 2008; Wilson et al. 2008a, b). These studies revealed extensive nonsynonymous polymorphism, with high-frequency variants present at multiple sites within the B30.2 domain and, surprisingly, in the C-terminal portion of the coiled-coil (Fig. 5a; Newman et al. 2006). Surveys of other Asian and African species will be required to determine whether this pattern is the rule or the exception for primate TRIM5 loci.

While surveys of human populations have revealed a lack of coding polymorphism in the variable regions of the B30.2 domain, the TRIM5 genes of both rhesus macaques and sooty mangabeys have numerous nonsynonymous polymorphisms in this domain. One of the most striking examples occurs at position 334 (homologous to position 332 in human TRIM5α). As described above, this position is an important determinant of retroviral specificity, and all primate genomes sampled encode either an arginine (R), proline (P), or glutamine (Q) at this position (representative species shown in Fig. 5b). While the R found at position 332 in human TRIM5α clearly has a negative impact on restriction of HIV-1 (Li et al. 2006; Yap et al. 2005), two different sooty mangabey TRIM5 alleles encoding an R at this position readily restrict HIV-1 (Newman et al. 2006). This suggests that the effect of an arginine at this position may be context-dependent (there are more than a dozen fixed differences between the human and sooty mangabey TRIM5α protein sequences), or alternatively, that these sooty mangabey TRIM5α alleles may engage the HIV-1 CA via an entirely different set of protein–protein contacts. Based on an extensive analysis of interspecies TRIM5α B30.2 subdomain recombinants, Ohkura et al. also found evidence for context dependence of 332R and restriction of HIV-1 (Ohkura et al. 2006). Most surprisingly, alleles encoding different amino acids at this site are present in both species for which polymorphism has been surveyed (Fig. 5b; P/Q in rhesus macaques and R/P/Q in sooty mangabeys; Newman et al. 2006; Wilson et al. 2008a, b). These alleles may be preserved in these populations through the forces of balancing selection, as they differ to some degree in their retroviral specificities.

Some positions in the TRIM5 coiled-coil (CC) region of rhesus macaques and sooty mangabeys also exhibited trans-species polymorphism, the phenomenon wherein multiple alleles are inherited and maintained in sister species (Klein et al. 2007). Outside of the MHC (Hughes and Yeager 1998), examples of trans-species polymorphism in humans and other primates are rare (Asthana et al. 2005), and the pattern is a predicted outcome of long-term balancing selection (Charlesworth 2006). Consistent with this interpretation, synonymous polymorphisms in the OWM TRIM5 locus clustered tightly around regions of nonsynonymous polymorphism. Such clustering of synonymous SNPs is also an expectation of balancing selection maintained over long evolutionary periods, as recombination between alleles at regions distal to the sites targeted by selection homogenizes sequences, and silent SNPs closely linked to sites that are the actual targets of balancing selection persist over longer periods (Charlesworth 2006). Among the sites found to be polymorphic within M. mulatta and C. atys were sites independently reported to have high dN/dS values in interspecies comparisons of the primate TRIM5 locus (Sawyer et al. 2005). Taken together, these observations suggest that the CC domain of TRIM5 was also subjected to bouts of positive selection during the course of primate evolution.

There are several hypothetical scenarios to explain positive selection operating on the CC domain, although they are all speculative. The only well-established function of the CC thus far is to promote TRIM5α multimerization (Kar et al. 2008; Langelier et al. 2008; Mische et al. 2005). However, at least one report indicated that the CC may sometimes play a role in target specificity (Yap et al. 2005), and variation in this region, as with the B30.2 domain, could therefore reflect adaptations for recognition of different retroviral targets. An indirect influence of the CC on specificity can also be imagined, wherein amino acid changes have been selected to modulate the spatial arrangement of the neighboring B30.2 domains, thus augmenting interactions with multimeric retroviral CA structures. For gene systems with multiple, intermediate-frequency alleles (and thus high heterozygosity), the potential for negative interference between alleles of a multimerizing protein also exists; thus, another possibility is that variation in the CC represents secondary adaptations to reduce or alter interactions between proteins encoded by alleles of differing specificities. Finally, an interesting possibility is that the CC domain is the target of viral factors that actively antagonize TRIM5α function, and changes in this domain represent “escape” from such factors. There is ample precedent for such interactions in retroviral systems, most notably interactions between the lentiviral Nef, Vif, and Vpu proteins and their respective host cell targets, MHC class I, APOBEC3G, and Tetherin (Collins et al. 1998; Neil et al. 2008; Sheehy et al. 2002). Thus, it is hypothetically possible that in some species the TRIM5 CC has evolved to evade virally encoded antagonist(s) of TRIM5α restriction.
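The link drawn above between intermediate-frequency alleles and high heterozygosity can be made concrete with a short Python sketch. It assumes random mating (Hardy-Weinberg proportions), under which the expected fraction of heterozygous individuals is one minus the sum of squared allele frequencies. The function name and the allele frequencies below are purely illustrative and are not measured values for any TRIM5 locus.

```python
def expected_heterozygosity(allele_frequencies):
    """Expected heterozygosity under random mating: H = 1 - sum(p_i ** 2)."""
    total = sum(allele_frequencies)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_frequencies)


# Illustrative frequencies only (not measured values for any population).
scenarios = {
    "one common allele, one rare": [0.99, 0.01],
    "two intermediate alleles": [0.5, 0.5],
    "three intermediate alleles": [1 / 3, 1 / 3, 1 / 3],
}
for label, freqs in scenarios.items():
    print(f"{label}: H = {expected_heterozygosity(freqs):.3f}")
```

With one allele near fixation, almost no individuals are expected to carry two different alleles, whereas with two or three alleles at comparable frequencies half or more of the population would. In the latter situation, most individuals would co-express two distinct variants of a multimerizing protein, which is the condition under which interference between alleles becomes relevant.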

Paleovirology

Why is the human version of TRIM5α so poor at restricting HIV? Sawyer and colleagues have hypothesized that the paucity of nonsynonymous polymorphism in the variable portion of the human TRIM5 B30.2 domain was the result of a selective event in the human lineage, possibly due to a retroviral pathogen (Sawyer et al. 2006). This previous viral dodge could have left us in a state of uniform susceptibility to the most serious retroviral pathogen modern humans have faced. It is possible that sequences of the virus responsible for this ancient selective sweep could still be present among the thousands of ERVs in our genomes, or in the genomes of species closely related to humans. In fact, several groups have resurrected ancient, inactivated endogenous retroviruses (or their CA proteins), with the potential to test their sensitivity to human TRIM5α (Dewannieux et al. 2006; Kaiser et al. 2007; Lee and Bieniasz 2007). Parts of the Gag protein of two endogenous retroviruses from the chimpanzee genome, PtERV/CERV1 and CERV2, two endogenous retroviruses from the rhesus macaque genome, RhERV1 and RhERV2, and an entire genomic reconstruction of a “young” human endogenous retrovirus, HERV-K(HML-2), have now been tested (Kaiser et al. 2007; Lee and Bieniasz 2007; Perez-Caballero et al. 2008). One group found that human TRIM5α, which is poor at restricting HIV, effectively restricts a recombinant retrovirus that contains the CA and p12 proteins of PtERV (Kaiser et al. 2007), although interestingly, another group did not observe this restriction when using a recombinant virus containing only the CA derived from PtERV (Perez-Caballero et al. 2008). In a similar vein, at least one group has attempted to use phylogenetic methods to reconstruct ancestral TRIM5α proteins (inferred from ancestral nodes) and tested these for activity against several modern retroviruses (Goldschmidt et al. 2008). 
Among other things, these authors reported that a predicted ancestral TRIM5α of Old World primates (~25 million years old) was more effective against HIV-1 than the variant found in modern humans. This is perhaps not surprising if one considers that, upon zoonotic transmission, HIV was counter-selected for resistance to human TRIM5α. It might prove interesting to test ancestral TRIM5α reconstructions such as these for activity against capsids derived from ancient ERV loci, such as those described above.

TRIM5: lessons in retroviral immunity, lessons in molecular evolution

Although we are currently unaware of any direct evidence demonstrating that TRIM5α expression modulates or prevents retroviral infections in nature, such a role can be inferred from laboratory experiments, and additional evidence may still come from human cohort studies or animal models. Importantly, given (a) the demonstrable antiretroviral activity displayed by a variety of TRIM5α orthologs in tissue culture, (b) lineage-specific differences in TRIM5α specificity, and (c) strong evidence of positive selection operating on the TRIM5 gene, it is reasonable to hypothesize that TRIM5 played a role in governing patterns of spread of retroviruses within and among ancestral species, and that naturally occurring variation in the TRIM5 locus of extant species continues to determine patterns of cross-species transmission (including zoonotic infections) and to drive adaptation of retroviruses during the process of colonizing a new host.

To date, the bulk of published work on TRIM5 is related to the structure and function of the TRIM5α protein, but TRIM5 also constitutes a highly attractive target for molecular evolutionary analyses. The apparent specificity of TRIM5α for a particular class of pathogen (the Retroviridae) provides, as a starting point for such studies, a source of well-defined and plausible evolutionary hypotheses. Such studies are further complemented by the existence of a vast “fossil” record composed of the millions of ancient endogenous retroviruses embedded in the genomes of modern species, and the large catalogue of modern, infectious retroviruses found in nature. From a practical perspective, molecular evolutionary and population genetic analyses of TRIM5 can be wed to functional/structural insights drawn from almost a century of basic research on animal retroviruses, lending strength to evolutionary hypotheses derived from sequence analysis. The antiretroviral activity of TRIM5 is a fairly recent discovery, yet existing studies of this gene have exposed evidence of a complex and fascinating evolutionary history. The ancient origins of TRIM5 (and its presence in widely divergent mammalian lineages) and its membership in a large and mostly uncharacterized gene family further ensure a wealth of material for future studies.