Introduction

Type 2 diabetes is a growing medical and economic problem worldwide. The International Diabetes Federation estimates that in 2010 approximately 285 million individuals had type 2 diabetes across the world;1 this number is expected to rise to 439 million individuals by 2030. Diabetes imposes a significant health and economic burden: factoring in the additional costs of undiagnosed diabetes, prediabetes, and gestational diabetes, the total cost of diabetes in the US in 2007 amounted to $218 billion.2 Despite the availability of several oral and injectable therapies for type 2 diabetes, there remains a significant unmet medical need in this disease, justifying the search for more efficacious and safe treatments that can prevent disease progression and protect patients from microvascular and macrovascular complications. Among the types of therapies under development, inhibitors of SGLT2 (for “Sodium GLucose coTransporter” protein 2) represent a promising new class.3,4

One consideration for choosing a molecular target for the identification of a new treatment of a chronic disease such as type 2 diabetes is the spectrum of tissues in which the target of interest is expressed. A molecular target with a ubiquitous pattern of expression could pose concerns related to the activities of agonists or antagonists to this target in a wide variety of tissues, whereas a molecular target expressed in a restricted number of tissues might suggest a more selective pharmacologic profile. We have evaluated the expression pattern of SGLT2 and related family members by quantitative reverse transcription real-time polymerase chain reaction (RT-PCR) methodology in order to better understand the potential impact of a selective SGLT2 inhibitor in vivo.

There are more than 200 SGLT family members, including 12 human orthologs.5 Based on sequence homology, these 12 SGLT family members can be divided into two subfamilies, as shown in Table 1. SGLT1, SGLT2, sodium-dependent amino acid transporter (SAAT1; also known as SGLT3), sodium myo-inositol cotransporter (SMIT), SGLT4, SGLT5, and SGLT6 belong to one subfamily, sharing between 45% and 70% protein sequence identity amongst themselves. Most of the members of this subfamily transport or bind sugar molecules. The five other solute carrier family 5A (SLC5A) family members Na+/I- symporter (NIS), sodium-dependent multivitamin transporter (SMVT), choline transporter (CHT), apical iodide transporter/sodium monocarboxylate cotransporter 1 (AIT/SMCT1), and SMCT2 form another subfamily. They share between 40% and 50% protein sequence identity amongst themselves; members of this latter subfamily are involved in the cotransport of sodium with other physiologically important molecules such as iodide, ascorbate, biotin, pantothenate, lipoate, choline, and monocarboxylates such as lactate.5 Since only 18% to 20% protein sequence identity exists between the two subfamilies, the focus of our studies was the sugar-binding class of SGLT cotransporters most closely related to SGLT2 (Table 1).
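The percent identity figures quoted above come from pairwise comparison of aligned protein sequences. The short sketch below shows one common way such a figure can be computed. It is only an illustration: the exact definition used by the alignment software is not stated here, so the choice of denominator (all aligned columns except gap-gap columns) is an assumption, and the two toy sequences are hypothetical rather than real SGLT sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumed definition: identical residues divided by the number of aligned
    columns, ignoring columns where both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A gap aligned to a residue never counts as a match.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: 10 columns, 8 identical residues.
TOY_A = "ACDEFGHIKL"
TOY_B = "ACDEYGHI-L"
toy_identity = percent_identity(TOY_A, TOY_B)
```

Other denominators (for example, the length of the shorter sequence) give somewhat different values, which is one reason identity figures from different tools are not always directly comparable.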

Table 1 SGLT (sodium glucose cotransporter protein) family members. 1A and 1B: The common names, system names, and human reference sequence numbers of the 12 SGLT family members are listed. The putative substrates of the transporters and the chromosomal locations of the transporter genes are given in the tables. 1C: Percentage of protein sequence identity among SGLT family members as determined with VectorNTI AlignX software.

The first sugar-binding SGLT sequence to be cloned, by Wright and colleagues, was the high-affinity sodium-glucose cotransporter SGLT1, which was found to be expressed in the small intestinal mucosa6 and associated with glucose and galactose transport at that site. SGLT1 was later found to be expressed in many tissues across the body,7 and mutations in SGLT1 were associated with the human genetic syndrome glucose-galactose malabsorption.8 SGLT2 was cloned subsequently, and was characterized as a low-affinity sodium-glucose cotransporter expressed in the renal early proximal tubule.9,10 SAAT1 was first cloned as a sodium-amino acid cotransporter11 but was later found to have glucose cotransporter activity.12 It was found in kidney, small intestine, and other tissues and is now suggested to be a sodium-dependent glucose sensor rather than a sodium-glucose cotransporter.13 SMIT is an osmoregulatory sodium-inositol cotransporter found in many tissues including brain and cardiac myocytes.14,15 SGLT4 is a low-affinity sodium-dependent transporter for mannose and fructose found in kidney and small intestine tissue.16 SGLT5 was identified in the Mammalian Gene Collection (MGC) human cDNA sequencing project, by similarity to other SGLT family members.17 SGLT6 (also known as KST1 or SMIT2) was identified as a novel sodium-glucose cotransporter18 located within a genomic region associated with infantile convulsion and choreoathetosis as well as benign familial infantile convulsion diseases. It was found to be able to transport myo-inositol in a sodium-dependent manner.19 Even though the functional activities of these SGLT family members have been described, there has not been a systematic study of the expression profiles of this family across the same broad set of normal human tissues.

There have been conflicting reports about the mRNA expression profile of SGLT2 in human tissues. It was initially reported to be expressed predominantly in the kidney using northern blot techniques;9,20 however, Wright and colleagues subsequently reported a broader pattern of tissue expression of SGLT2 beyond the kidney using RNase protection methods, although no detailed methods or data were presented.21 In 2003, Zhou et al.22 employed quantitative RT-PCR techniques to show that SGLT2 was ubiquitously expressed in most human tissues. In 2005, however, Tazawa et al.16 used the same methodological approach but reported contradictory findings: SGLT2 was primarily expressed in the kidney and to a smaller degree in the small intestine. It has also been reported that in mice, SGLT2 is specifically expressed in the kidney proximal tubule.23 Because SGLT2 inhibitor compounds are being developed by several pharmaceutical companies as antidiabetic agents that inhibit renal glucose reabsorption, it has become increasingly important to verify the tissue expression profile of SGLT2. To our knowledge, no antibodies shown to be specific for individual human SGLT family members have been developed successfully to enable expression profiling at the protein level, although antibodies to one or more members of the protein family have been reported.6,24–29 Therefore, we chose to verify the expression profile of seven SGLT family members by quantitative PCR methods across a broad panel of human tissues.

Materials and Methods

Reverse Transcription-Coupled Quantitative Real-Time PCR

AmpliTaq Gold™ DNA Polymerase and AmpErase® Uracil-N-Glycosylase were obtained from Applied Biosystems (Warrington, UK) and MMLV reverse transcriptase was purchased from Promega (Southampton, UK). TRIzol® and DNase I were obtained from Invitrogen (Paisley, UK). TaqMan™ probes and oligonucleotide primers were purchased from Sigma-Genosys (Haverhill, UK) and deoxynucleotide triphosphates were obtained from BioGene Limited (Kimbolton, UK). Purification of total RNA and quantitative RT-PCR were performed as previously described.30 Briefly, total RNA was extracted from snap-frozen human tissues using TRIzol® according to the manufacturer’s instructions. Purified RNA samples were subjected to a number of quality control criteria before being passed as suitable for quantitative RT-PCR.30 RNA samples were treated with DNase I to remove any residual genomic DNA and were reverse transcribed using gene-specific primers.

Quantitative RT-PCR was performed with TaqMan primer/probe sets designed using PrimerExpress software (Applied Biosystems). To analyze the expression of the SGLT2 locus we used five primer/probe combinations: four new designs and one previously published.22 These primer/probe sets were designed to survey the entire locus and to be specific with respect to other known expressed sequences, as well as specific for spliced transcripts versus nonspliced transcripts and genomic DNA. The primer/probe combinations were not matched for amplification efficiency; hence, the absolute transcript number detected with each primer/probe set can be expected to vary. The sequences of the primer/probe sets utilized in these studies are available in the Appendix, Additional Table S1.

The amplification reaction was performed using AmpliTaq Gold™ polymerase in a standard PCR buffer containing cDNA prepared from 100 ng of total RNA. Uracil N-glycosylase was included in all reactions to prevent cross-contamination and amplification of previous PCR products. PCR reactions were run on an ABI 7900 Sequence Detection System and the thermocycling conditions were 50°C for 2 minutes followed by 95°C for 2 minutes, and then 40 cycles of 95°C for 15 seconds and 60°C for 30 seconds. Amplification of the target transcripts was performed as part of a multiplex reaction in which glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as an internal control. Asterand’s Global Standard curve was used to interpolate transcript copy numbers from the quantitative PCR CT values.30
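The interpolation of copy numbers from CT values can be illustrated with a minimal sketch. The details of the Global Standard curve are not given in this section, so the sketch assumes a generic log-linear standard curve (CT against log10 of copy number) fitted to a hypothetical dilution series. All function names and numeric values are illustrative assumptions, as is the scaling from a 100 ng reaction to copies per μg of RNA.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of CT = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies in the reaction."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_ug(copies_in_reaction, input_ng=100.0):
    """Scale copies per reaction to copies per microgram of total RNA.

    Assumes the whole reaction corresponds to input_ng of total RNA.
    """
    return copies_in_reaction * (1000.0 / input_ng)


# Hypothetical dilution series (not data from this study).
STANDARD_COPIES = [1e2, 1e3, 1e4, 1e5, 1e6]
STANDARD_CT = [33.2, 29.9, 26.6, 23.3, 20.0]

slope, intercept = fit_standard_curve(STANDARD_COPIES, STANDARD_CT)
estimated = copies_from_ct(26.6, slope, intercept)
```

Because each primer/probe set can have its own amplification efficiency, a single shared curve of this kind gives copy numbers that are comparable across tissues for one set but only approximately comparable between sets.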

Results

All tissues used in this study were obtained from ethically approved intermediaries. Tissue supply was governed by legal agreements and by stringent ethical review from local research ethics committees. In addition, and in all cases, the informed consent of the donor or the donor’s next of kin was obtained for the use of the donated tissue for research. In all, 46% of the tissues were from female donors while 54% were from male donors. The age distributions of the tissue donors were: <20 years old: 5%; 20–39 years old: 17%; 40–60 years old: 32%; 60–80 years old: 31%; >80 years old: 14%. No ethnicity information relating to tissue donors was available.

The genes that were profiled by quantitative RT-PCR for their expression levels across a panel of 72 different human tissues are listed by both their common name and their systematic name in Table 1A. Also listed are the putative substrates for each of the transporters, as well as the chromosomal location of each transporter gene. The primer sequences used for PCR were designed to be general for all known isoforms and splice variants of each gene. The sequences of the amplicons, the exon location of the amplicons, and their precise genomic coordinates are listed in Table 2.31 Included in each PCR reaction was a primer/probe set for the GAPDH gene. This was done to control for the success of the first strand cDNA synthesis reaction and the eventual PCR. These data were not used for normalization purposes since GAPDH levels themselves vary considerably across tissue types and individual donors.30 The study included total RNA isolated from three different individuals for each tissue in a panel of 72 tissues. The 72 tissues in the study extend across all major human biological systems: cardiovascular, digestive, endocrine, male and female reproductive, hematopoietic and lymphatic, integumentary, musculoskeletal, nervous, respiratory, and urinary systems.30 The individual RNA samples used in the study were prepared from normal tissues, although their donors may have had abnormal or diseased symptoms in other tissues or organs. The donors represent different genders and age groups, and every attempt was made to use the same samples for the analysis of all seven SGLT family members. A few exceptions did occur, which have no impact on our study conclusions.

Table 2 Genomic information of the quantitative polymerase chain reaction (PCR) amplicons of sodium glucose cotransporter protein (SGLT) family members.

Tissue expression profiling of the seven SGLT family members reveals distinct patterns of tissue expression, summarized below.

The Kidney-Specific and Kidney-Abundant SGLT Family Members

Figure 1A presents the expression of SGLT2 in a subset of the 72 tissues profiled (data from all the tissues are available in the Appendix, Additional Figure S1), measured with a primer/probe set (SGLT2-e6,7) designed to span the boundary of exons 6 and 7 of SGLT2, as noted in Table 2. In all figures presented, the Y-axis represents the number of transcripts per μg of RNA. The tissue with the highest level of expression was the kidney cortex, where expression was approximately 300-fold higher than in the tissue with the next highest level of expression, the kidney medulla. Although small numbers of putative transcripts could be observed in some of the other tissues, no evidence of SGLT2 expression was observed in the majority of other tissues, including the 20 brain subregion RNA samples tested (eight of these are shown in Figure 1A). These data are consistent with several other reports evaluating the pattern of expression of SGLT2 in human tissues.9,10,16 The GAPDH data, plotted in log2 format on the right-hand Y-axis of Figure 1A, indicate that all tissue RNA samples had successful first strand synthesis and PCR reactions. The error bar associated with the tissue data indicates the variation observed in expression measurements using individual RNA samples from three individual donors, not that obtained with technical replicates. Our study methodology is thus different from other studies reported in the literature where pooled samples from commercial vendors were utilized.22
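The per-tissue summary described above (a mean and an error bar from three donors, plus a fold difference between the top two tissues) can be sketched as follows. The copy numbers are hypothetical values chosen only to reproduce a 300-fold difference, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def summarize(donor_copies):
    """Return (mean, sample standard deviation) across donors for one tissue."""
    return mean(donor_copies), stdev(donor_copies)


def fold_over_next_highest(tissue_means):
    """Return (top tissue, runner-up tissue, ratio of their mean copy numbers)."""
    ranked = sorted(tissue_means.items(), key=lambda kv: kv[1], reverse=True)
    (top, top_val), (runner_up, runner_up_val) = ranked[0], ranked[1]
    return top, runner_up, top_val / runner_up_val


# Hypothetical copies per microgram of RNA from three donors per tissue.
HYPOTHETICAL_COPIES = {
    "kidney cortex": [280000.0, 300000.0, 320000.0],
    "kidney medulla": [900.0, 1000.0, 1100.0],
    "ileum": [40.0, 50.0, 60.0],
}

tissue_means = {t: summarize(v)[0] for t, v in HYPOTHETICAL_COPIES.items()}
top, runner_up, fold = fold_over_next_highest(tissue_means)
```

Summarizing individual donors in this way reflects biological variation between people; technical replicates of a pooled sample would instead reflect only assay noise.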

Figure 1 Expression of sodium glucose cotransport protein SGLT2 in human tissues. Copies of SGLT2 transcript per 1 μg of total RNA are shown on the left Y-axis of the graph. Log2 value of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) transcript per 1 μg of total RNA is shown on the right Y-axis. A subset of the 72 human tissues is shown on the X-axis. The average number of transcripts from tissues of three human donors is shown in each tissue category. The error bar represents the standard deviation of transcripts of the three donors. (A) SGLT2 tissue expression from TaqMan quantitative polymerase chain reactions (PCR) performed using a primer/probe set designed in exons 6–7 of SGLT2 (SGLT2-e6,7). (B) SGLT2 tissue expression from TaqMan quantitative PCR reactions performed using a primer/probe set designed in exon 13 of SGLT2 (SGLT2-e13).

Similar data were obtained using three additional primer/probe combinations spanning different exon regions (exons 1 and 2, exons 4 and 5, exons 9 and 10) of the SGLT2 gene (see the Appendix, Additional Figures S2–S4), further supporting that the expression of the SGLT2 gene is highly specific for the kidney cortex. Low transcript levels were detected across a range of tissues outside the kidney for the primer/probe set SGLT2-e4,5 (Table S2), possibly due to greater primer/probe set efficiency; however, even in this case, kidney expression was 100-fold higher than in the next highest tissue observed (the ileum). In addition, the primer/probe combination (SGLT2-e13), located in exon 13 close to the 3’ end of the SGLT2 gene and identical to the one used by Zhou et al.,22 shows that the SGLT2 gene is highly specifically expressed in the kidney (Figure 1B, Figure S5, and Table S2), unlike the result previously reported by these investigators. We observed that this primer/probe set essentially mirrors the data obtained with the primer/probe set spanning the junction of exons 6 and 7, as well as the primer/probe sets spanning exons 1–2, 4–5, and 9–10. We note that the primer/probe set SGLT2-e13 lies in a region of overlap with another, SGLT2-unrelated transcript, and thus is less specific to the SGLT2 sequence than the other SGLT2 primer/probe sets we tested. The potential consequences of using this probe are further discussed in “The SGLT2 Locus” section below. In summary, these data suggest that the steady state levels of SGLT2 transcripts are highly specific for the cortex of the kidney in humans.

SGLT5, a relatively uncharacterized SGLT family member, is also found to have a highly kidney-abundant tissue expression pattern. The expression of SGLT5 in a subset of the 72 human tissues is shown in Figure 2A (data from all 72 human tissues can be found in the Appendix, Additional Figure S6). The highest expression level of SGLT5 is found in the kidney cortex, while in the kidney medulla SGLT5 is expressed at about half of the level found in the kidney cortex. Because the extremely high level of kidney expression may obscure the magnitude of expression observed in other tissues, we replotted the data without the kidney cortex and medulla data in Figure 2B. Unlike the profile observed for SGLT2, which has little or no detectable level in tissues other than the kidney cortex and medulla, SGLT5 exhibits a low level of expression in some tissues like the kidney pelvis, vas deferens, left atrium of the heart, skin, and testes. However, the expression of SGLT5 in the kidney cortex is still 35 times higher than that observed in the vas deferens. These data suggest that the expression of SGLT5 in human tissues is highly abundant in kidney, compared with other tissues.

Figure 2 Expression of sodium glucose cotransport protein SGLT5 in human tissues. Copies of SGLT5 transcript per 1 μg of total RNA are shown on the left Y-axis of the graph. Log2 value of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) transcript per 1 μg of total RNA is shown on the right Y-axis. A subset of the 72 human tissues is shown on the X-axis. The average number of transcripts from tissues of three human donors is shown in each tissue category. The error bar represents the standard deviation of transcripts of the three donors. (A) SGLT5 tissue expression from TaqMan quantitative polymerase chain reactions (PCR) performed using a primer/probe set designed in exon 11 of SGLT5. (B) Tissue expression of SGLT2 (primer/probe set in exons 6–7) and SGLT5 (primer/probe set in exon 11) plotted side-by-side, with the kidney cortex and medulla omitted.

The Small Intestine and Muscle-Abundant SGLT Family Members

As shown in Figure 3A, the expression of SGLT1 (SLC5A1), the closest homolog to SGLT2, is essentially restricted to the small intestine, the skeletal muscle, and the heart (the data for all 72 human tissues are in the Appendix, Additional Figure S7 and Additional Table S2). Minor numbers of transcripts are observed in the trachea, prostate, cervix, and mesenteric adipose tissue but, as with SGLT2, we see no evidence for any expression in the brain subregions tested. The GAPDH data indicate successful first strand synthesis and PCR in all samples. The expression pattern of SGLT1 across human tissues observed here is generally consistent with reports from other studies using mRNA-based methods.6,7,9 It is worth noting that brain expression of SGLT1 has been reported in other species (rats and pigs) using immunological, in situ hybridization, and RT-PCR techniques.29,32,33

Figure 3 Expression of sodium glucose cotransport proteins SGLT1, sodium-dependent amino acid transporter (SAAT1), and SGLT4 in human tissues. Copies of SGLT1 transcript per 1 μg of total RNA are shown on the left Y-axis of the graph. Log2 value of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) transcript per 1 μg of total RNA is shown on the right Y-axis. A subset of the 72 human tissues is shown on the X-axis. The average number of transcripts from tissues of three human donors is shown in each tissue category. The error bar represents the standard deviation of transcripts of the three donors. (A) SGLT1 tissue expression from TaqMan quantitative polymerase chain reactions (PCR) performed using a primer/probe set designed in exons 4–5 of SGLT1. (B) Tissue expression of SAAT1 (primer/probe set in exons 2–3) and SGLT4 (primer/probe set in exon 11) plotted side-by-side.

Interestingly, the glucose sensor SAAT1 (SLC5A4) and the low-affinity glucose/mannose cotransporter SGLT4 (SLC5A9) display a similar high level of expression in the small intestine (duodenum, jejunum, and ileum) as well as in skeletal muscle (Figure 3B; data for all 72 human tissues are in the Appendix, Additional Figures S8 and S9). In the case of SAAT1, the highest expression could be found in the jejunum, and the steady state SAAT1 RNA level there is about 3.5-fold higher than in skeletal muscle. These data are consistent with other reports of SAAT1 expression in the intestine and skeletal muscle.11,13 The steady state SGLT4 RNA level in the ileum is on the order of 5-fold higher than the next highest-expressing tissue outside of the gastrointestinal (GI) tract, the skeletal muscle. The expression of SGLT1, SAAT1, and SGLT4 in the GI tract seems to be enriched in the small intestinal region: duodenum, jejunum, and ileum; whereas these genes are expressed at a much lower level in the large intestine (cecum, colon, and rectum) as well as in other parts of the GI tract such as stomach and oesophagus (Figure 3). Unlike SGLT1, SAAT1 and SGLT4 display a much lower level of expression in the heart. On the other hand, SGLT4 has a uniquely moderate expression level in pancreas compared with SGLT1 and SAAT1. In all other tissues tested, SGLT1, SAAT1, and SGLT4 have a generally low level of expression (see the Appendix, Additional Figures S7, S8, and S9). Overall, the profile observed here for SGLT4 expression across human tissues is similar to that previously reported.16

The Brain Expresses SGLT6

The solute carrier family 5A11 or SGLT6, a cotransporter with substrate specificity for myo-inositol and glucose, seems to be the only SGLT family member, aside from the ubiquitously-expressed SMIT (see next section), that has extensive expression in all the brain subregions tested. We observed high steady state RNA levels of SGLT6 collectively in the brain, with the highest subregion being the substantia nigra where it was found to be expressed at levels 2–5-fold higher than the other regions tested (Figure 4; data for all 72 human tissues are in the Appendix, Additional Figure S10). SGLT6 is also highly expressed in the spinal cord, at a level similar to that in substantia nigra. However, it is not detected in the dorsal root ganglion (DRG). Other tissue RNA samples with notable expression levels of SGLT6 are the small intestine (ileum and jejunum), kidney (cortex and medulla), as well as skeletal muscle. The observed pattern of SGLT6 in these studies appears to be unique among the SGLT family members tested, and if this pattern is similar in rodents, SGLT6 activity could account, at least in part, for the observation of functional SGLT expression in rat brain.29 It is interesting to note that the human genomic location of SGLT6 coincides with a locus associated with the nervous system disorders of infantile convulsions and choreoathetosis, though no disease-associated mutations were found in the exon or intron/exon boundary sequences of the SGLT6 gene.18 The pattern of expression in human tissues described here differs from that initially described by northern blot,18 although brain expression was detected in both studies.

Figure 4 Expression of sodium glucose cotransport protein SGLT6 in human tissues. Copies of SGLT6 transcript per 1 μg of total RNA are shown on the left Y-axis of the graph. Log2 value of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) transcript per 1 μg of total RNA is shown on the right Y-axis. A subset of the 72 human tissues is shown on the X-axis. The average number of transcripts from tissues of three human donors is shown in each tissue category. The error bar represents the standard deviation of transcripts of the three donors.

The Ubiquitously Expressed SMIT

The sodium myo-inositol cotransporter SMIT (SLC5A3) shows a ubiquitous pattern of expression with the highest expression level in the medulla of the kidney and the blood vessel of the choroid plexus (Figure 5; data for all 72 human tissues are in the Appendix, Additional Figure S11). The thyroid gland, pineal gland, dorsal root ganglion, and the testes also have high levels of SMIT expression. Overall, the pattern of expression of SMIT is consistent with what has been previously reported,14 and its expression in all tissues examined highlights its potentially important role in the maintenance of osmotic balance within cells.

Figure 5 Expression of sodium myo-inositol cotransporter (SMIT) in human tissues. Copies of SMIT transcript per 1 μg of total RNA are shown on the left Y-axis of the graph. Log2 value of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) transcript per 1 μg of total RNA is shown on the right Y-axis. A subset of the 72 human tissues is shown on the X-axis. The average number of transcripts from tissues of three human donors is shown in each tissue category. The error bar represents the standard deviation of transcripts of the three donors.

The SGLT2 Locus

The discrepancy between our results and those reported by Zhou et al.22 prompted us to examine the SGLT2 locus in more detail. The SGLT2 gene resides on chromosome 16, where 14 exons span approximately 7 kilobases of genomic DNA. The last two exons, exons 13 and 14, overlap with exon 13 of a gene that is encoded on the opposite strand, called C16orf58 (Figure 6A). This gene, conserved in plants, invertebrates, and vertebrates, contains 13 exons spanning 20 kb of genomic DNA, and encodes a protein homologous to the Arabidopsis RUS1 gene.34,35 Numerous sequence submissions to GenBank suggest C16orf58 is indeed expressed. Our internal cDNA cloning effort has obtained a full-length cDNA clone of C16orf58. The C16orf58 cDNA clone has a long 3’ untranslated region that contains the reverse-complement sequence of exons 13 and 14 of the SGLT2 gene (data not shown). Electronic northern blots using the region of overlap as a ‘probe’ indicate that expressed sequence tags (ESTs) for C16orf58 versus those of SGLT2 can be found in the NCBI database at a relative abundance of 10 transcripts to one (data not shown). We designed a primer/probe set located in exon 8 of C16orf58 that is specific for C16orf58 (Figure 6B). Expression profiling analysis with this primer/probe set indicates that this gene is expressed ubiquitously, with the highest steady state RNA levels located in the cortex of the kidney, the cerebellum (Figure 6C), and the thyroid (see the Appendix, Additional Figure S12). The primer/probe set (SGLT2-e13) used by Zhou et al.22 resides in exon 13 of SGLT2, in the area of overlap between SGLT2 and C16orf58 (Figure 6B), suggesting that the two transcripts would be confounded in experiments where poly d(T) was used to prime first strand synthesis in a one-step PCR process. We have used each of the primers employed by these investigators in separate first strand reactions and have shown that the reverse primer is capable of synthesizing kidney cortex-abundant SGLT2 cDNA (Figure 1B); however, we have been unable to recapitulate the expression pattern seen with the C16orf58-specific primer (Figure 6C) using the Zhou et al.22 forward primer (data not shown).
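The reasoning about the region of overlap can be made concrete with a small coordinate check: an amplicon is potentially ambiguous when it falls inside exons of genes on both strands. The sketch below is illustrative only. The coordinates are hypothetical placeholders, not the real genomic positions of SGLT2 or C16orf58, and the helper names are assumptions introduced for this example.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    name: str    # "GENE:exonN" for exons, or an amplicon label
    strand: str  # "+" or "-"
    start: int   # inclusive
    end: int     # inclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base (strand ignored)."""
    return a.start <= b.end and b.start <= a.end


def genes_hit_by_amplicon(amplicon: Interval, exons) -> list:
    """Names of genes with an exon overlapping the amplicon, on either strand."""
    return sorted({e.name.split(":")[0] for e in exons if overlaps(amplicon, e)})


# Hypothetical layout: two 3' exons of one gene lie inside the last exon
# of a second gene encoded on the opposite strand.
EXONS = [
    Interval("SGLT2:exon6", "+", 3000, 3100),
    Interval("SGLT2:exon13", "+", 6500, 6700),
    Interval("SGLT2:exon14", "+", 6800, 7000),
    Interval("C16orf58:exon13", "-", 6400, 7200),
]

amplicon_e13 = Interval("SGLT2-e13", "+", 6550, 6650)
amplicon_e6 = Interval("SGLT2-e6", "+", 3050, 3090)

hits_e13 = genes_hit_by_amplicon(amplicon_e13, EXONS)
hits_e6 = genes_hit_by_amplicon(amplicon_e6, EXONS)
```

An amplicon that hits genes on both strands can only be assigned to one transcript if first strand synthesis is orientation-specific, for example when it is primed with a gene-specific primer rather than poly d(T).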

Figure 6 Analysis of the sodium glucose cotransport protein SGLT2 locus in the human genome. (A) A genome browser view of the 22 kb region of the human SGLT2 locus on chromosome 16 obtained from the UCSC Genome Browser of the human genome GRCh37 Assembly (hg19). The red rectangular box highlights the overlap of exons 13 and 14 of SGLT2 with exon 13 of C16orf58. (B) Schematic drawing of the exon structure of the SGLT2 gene and the C16orf58 gene on chromosome 16. Green block arrows represent the open reading frames of the genes. Pink block arrows represent the exons of the SGLT2 gene while yellow block arrows represent the exons of the C16orf58 gene. Locations of amplicons of TaqMan quantitative polymerase chain reactions (PCR) are represented by red arrow heads. (C) Copies of C16orf58 transcript per 1 μg of total RNA are shown on the left Y-axis of the graph. Log2 value of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) transcript per 1 μg of total RNA is shown on the right Y-axis. A subset of the 72 human tissues is shown on the X-axis. The average number of transcripts from tissues of three human donors is shown in each tissue category. The error bar represents the standard deviation of transcripts of the three donors.

Discussion

We have used quantitative RT-PCR to examine the expression profile of SGLT2 and related family members across a panel of 72 tissues from three healthy individuals. This study is the most comprehensive analysis done to date on this important family of sodium-monosaccharide cotransporters, and was carried out to increase our understanding of where these transporters are expressed in human tissues. The data presented clearly demonstrate that SGLT2 is expressed primarily in the cortex of the kidney at steady state levels that are several hundred-fold greater than in the tissue with the next highest abundance, the kidney medulla. We have confirmed these data with multiple primer/probe combinations representing essentially the entire SGLT2 locus.

These data are in agreement with some previous studies9,16,20 but in conflict with others,21,22 which reported that SGLT2 showed a more widespread pattern of expression. We have attempted to reconcile our results with those of Zhou et al.22 by using their primer/probe combination in conjunction with our PCR methodology, which uses orientation-specific first strand cDNA synthesis; however, we were unable to reproduce their results. Instead, as with all the other SGLT2 primer/probe combinations tested, this primer/probe set showed that SGLT2 expression is restricted to the cortex of the kidney. We analyzed the SGLT2 locus and discovered that the primer/probe amplicon SGLT2-e13 overlaps with the extreme 3’ end of another gene encoded on the opposite strand, C16orf58. Our results indicate that this conserved gene with homology to the Arabidopsis RUS1 gene34,35 is indeed ubiquitously expressed at moderately abundant levels in all 72 tissues tested and is also expressed in the cortex of the kidney. We attempted to use the forward primer employed by Zhou et al.22 to prime first strand synthesis of the 3’ end of C16orf58 followed by PCR, to determine if that method could give rise to the expression profile they reported, but we were unable to recapitulate the published profile, nor could we replicate the profile obtained with the C16orf58-specific primer/probe set located in exon 8. Either the C16orf58 forward primer is inefficient in this role as a primer of first strand synthesis, or the 3’ end of the C16orf58 transcript is particularly unstable and easily degraded. However, the large number of expressed sequence tags in GenBank generated from the C16orf58 strand that overlaps the 3’ end of SGLT2, as well as our own internal C16orf58 cloning efforts, suggest that the latter is not the case.

To better understand the physiological consequences of agonizing and antagonizing protein function, it is extremely important to know in which tissues a gene is expressed, particularly from the viewpoint of assessing potential liabilities. Included in the tissue panel was RNA isolated from 20 different brain subregions. We see no evidence that SGLT2 is expressed in any of the brain subregions tested, even though the same RNA samples were used for other profiles that did return detectable transcript numbers, e.g., SGLT6, where 52 out of 60 brain subregion RNA samples used in both experiments were identical, and the GAPDH control indicated successful enzymatic reactions. However, this conclusion is limited by the detection level afforded by quantitative PCR methodologies, and we cannot rule out the possibility that SGLT2 is expressed within small numbers of discrete brain cells, as well as other tissues, at levels that are undetectable by the TaqMan protocol.

Another method to evaluate RNA expression is Affymetrix RNA chip hybridization. The human SGLT2 gene is represented by probe set 207771_at on Affymetrix RNA chips. The mouse SGLT2 gene is represented by probe sets 1419166_at and 1455005_a_at, while rat SGLT2 is represented by probe set U29881_at. The public domain gene expression database BioGPS36 contains tissue expression profiling data for these probe sets (see the Appendix, Additional Figure S13). In the Affymetrix chip database, SGLT2 transcripts are found to be highly specific to the kidney tissue of human, mouse, and rat, consistent with our human tissue quantitative RT-PCR study results. These data are further supported by the recent publication of immunohistochemical data in mice confirming the original localization of the rodent SGLT2 protein to the early proximal tubule;19 this protein was clearly absent in the SGLT2 knock-out mouse.27

The sodium/glucose cotransporter family member with the closest expression profile to that of SGLT2 is SGLT5, which is expressed almost exclusively in the cortex and medulla of the kidney. The only other tissues showing expression, at a much lower level, are the heart, skin, and vas deferens. This highly restricted gene expression suggests a kidney-specific function; however, little is known about this gene and its gene product, including what its substrate specificity may be. Hence the putative role of this protein in regulating solute homeostasis in the kidney remains to be determined.

The SGLT1 expression profile agrees with the well-established role of this gene as encoding the intestinal high-affinity, low-capacity glucose reabsorption cotransporter,6 with the highest steady state levels being observed in the ileum and other regions of the small intestine.7 The observed expression of SGLT1 in the ventricle of the heart agrees with that reported by Zhou et al.22 as well as with our own unpublished data (Hagan D., data not shown). Curiously, SGLT1 expression in the heart is apparently restricted to humans, since no expression of SGLT1 is detectable in the heart tissue of rats,7 dogs, pigs, or cynomolgus monkeys (Hagan D., unpublished results). Our data also suggest that transcripts for SGLT1 can be found in skeletal muscle at a level similar to that in the heart. The relatively high level of SGLT1 expression in heart and skeletal muscle suggests that it might play a significant but as yet undefined role in glucose absorption in those tissues.

The glucose transporters SAAT1 and SGLT4, the latter also having affinity for mannose as a substrate, have a tissue expression pattern in the small intestine and skeletal muscle that is surprisingly similar to that of SGLT1. Interestingly, the expression of these three SGLT member genes seems to be limited to the small intestinal portion of the GI tract, with the duodenum, jejunum, and ileum all having the most prominent levels of detectable RNA; other parts of the GI tract, such as the cecum, colon, rectum, and stomach, have much lower levels of gene expression. However, unlike SGLT1, SAAT1 and SGLT4 do not have a high level of expression in the heart. SGLT4 also demonstrates a high level of expression in the pancreas, although at this time we do not know which specific cell types in the pancreas may contribute to the overall steady state level of RNA. The specific overlap in the tissue expression patterns of SGLT1, SAAT1, and SGLT4 suggests that these three SGLT members could have similar and/or complementary functions in the body. Based on the localization of the SAAT1 protein, and the observation of ion transport decoupled from glucose transport activity, Diez-Sampedro and colleagues have proposed a role for SAAT1 as a glucose sensor in cholinergic neurons of the GI tract and at neuromuscular junctions, as a potential modulator of gastric motility and muscular activity.13

We show here that SGLT6, a gene recently implicated in a genetic association study as a modifying locus of systemic lupus erythematosus (SLE), displays a neuronal expression pattern, suggesting an additional putative role for this gene outside that of the immune system in regulating myo-inositol homeostasis in the brain and spinal cord.37 Interestingly, our data do not support an immune cell expression pattern but instead indicate that SGLT6 is not significantly expressed (at the level of quantitative RT-PCR resolution) outside of the nervous system, the small intestine, and the kidney.

Given the interest in SGLT2 inhibitors as a potential therapeutic approach for diabetes, it is worth noting that the therapeutic profile depends not only on the tissue localization of SGLT2 expression but also on the selectivity profile of the inhibitors currently in clinical development, since a lack of specificity for one or more other SGLT proteins could have physiological consequences beyond the kidney. Various groups have reported the in vitro, assay-specific 50% inhibitory concentration (IC50)-based selectivity of SGLT2 inhibitors versus SGLT1, with dapagliflozin and BI 10773 showing the highest degree of selectivity: 1241-fold and >2500-fold selectivity, respectively;38,39 whereas the compounds canagliflozin, ASP1941, and LX4211 have reported selectivities of 414-, 255- and 20-fold, respectively.40,41,42 The activity of these compounds against other members of the SGLT family has not been consistently reported to date. Once the inhibitory constant (Ki) values, generated from data obtained across a range of substrate concentrations, are reported, the values should be more directly comparable across compounds. The impact of the overall selectivity profile of these agents should become clear as the therapeutic profile emerges during clinical development.
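
To illustrate why IC50-based selectivity is assay-specific while Ki values are more comparable, the following is a minimal sketch rather than a method used in this study. It assumes purely competitive inhibition, in which case the Cheng-Prusoff relation Ki = IC50 / (1 + [S]/Km) applies; the function names and all numeric inputs are hypothetical placeholders, not measured values for any compound named above.

```python
def fold_selectivity(ic50_off_target, ic50_target):
    """Fold selectivity: off-target IC50 divided by target IC50 (same units)."""
    if ic50_off_target <= 0 or ic50_target <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_off_target / ic50_target


def ki_from_ic50(ic50, substrate_conc, km):
    """Cheng-Prusoff conversion; ASSUMES purely competitive inhibition.

    substrate_conc and km must share units; the result has the units of ic50.
    """
    if ic50 <= 0 or km <= 0 or substrate_conc < 0:
        raise ValueError("ic50 and km must be positive; substrate_conc non-negative")
    return ic50 / (1.0 + substrate_conc / km)


if __name__ == "__main__":
    # Hypothetical placeholder numbers (nM for IC50, mM for substrate and Km).
    ic50_sglt2, ic50_sglt1 = 1.0, 1200.0
    print("IC50-based fold selectivity:", fold_selectivity(ic50_sglt1, ic50_sglt2))

    # The same inhibitor assayed at two substrate concentrations gives different
    # IC50 values, but the derived Ki does not change (assumed Km = 2.0 mM).
    print("Ki from assay A:", ki_from_ic50(ic50=2.0, substrate_conc=2.0, km=2.0))
    print("Ki from assay B:", ki_from_ic50(ic50=6.0, substrate_conc=10.0, km=2.0))
```

Under these assumptions, the two hypothetical assays report IC50 values that differ threefold yet yield the same Ki, which is the sense in which Ki-based comparisons across compounds and laboratories would be less dependent on assay conditions.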

Conclusion

In summary, to better understand the spectrum of human tissues in which the sodium-glucose cotransporter family of proteins is expressed, we have examined the mRNA expression profile of human SGLT2 and six of its homologs (SGLT1, SAAT1, SMIT, SGLT4, SGLT5, SGLT6) in a panel of 72 normal human tissues from three separate donors using quantitative RT-PCR methods. For SGLT2, using five different primer/probe sets spanning the entire locus, we confirm the kidney-specific expression of this gene. Thus we conclude that SGLT2 expression is highly restricted to the kidney in humans, and that the expression of related proteins of this family, with the exception of SGLT5, diverges from this kidney specificity. It is highly desirable that the expression of a protein drug target be restricted to the tissue where its therapeutic effects can be most efficacious. The work presented here, demonstrating the kidney-specific expression of SGLT2, helps to satisfy this criterion and further supports the importance and value of this protein as a drug target for the treatment of diabetes.