Introduction

Members of the transforming growth factor β (TGF-β) family, which include three TGF-β isoforms, as well as activins, nodal and bone morphogenetic proteins (BMPs), regulate a variety of cellular processes including differentiation, proliferation, migration and cell death in cell-type-specific and context-dependent manners.1, 2, 3 The biological effects of TGF-β family members are highly contextual, for example, their responses may differ in different tissues, local environments and stage of disease. Since TGF-β activates cytostatic and cell death processes that maintain homeostasis in mature tissues, it functions as a suppressor of epithelial cell tumorigenesis at early stages. Inactivation of the TGF-β signaling pathway through mutation and/or loss of heterozygosity of TGF-β receptors or Smad proteins has been found in certain types of cancer and is related to poor prognosis for the patients (reviewed in Levy and Hill4). However, TGF-β promotes tumor progression by enhancing migration, invasion and survival of tumor cells during the later stages of tumorigenesis, through stimulating extracellular matrix deposition and tissue fibrosis, perturbing immune and inflammatory function, stimulating angiogenesis and promoting epithelial–mesenchymal transition (reviewed in Yoshimura et al.5, Roberts and Wakefield6, Moustakas and Heldin7 and Miyazono et al.8). 
Accumulating evidence also indicates critical roles of TGF-β/activin signaling in the maintenance of stem cell-like properties of certain cancer-initiating cells, such as glioma-initiating cells,9, 10 breast cancer-initiating cells,11 pancreatic cancer-initiating cells,12 and leukemia-initiating cells in chronic myeloid leukemia.13 Intriguingly, small-molecule inhibitors of type I receptors have therapeutic effects, at least in animal models.9, 10, 12, 13 These observations suggest that targeting the TGF-β/activin signaling pathways could be an attractive therapy in certain advanced cancers, although it is possible that shutdown of these pathways in normal tissues will increase the risk for the development of other tumors. Thus, one of the major questions that remain to be addressed in this field is what defines the dual role of TGF-β in cancer biology.

Identification of the signaling components of TGF-β family members, including membrane receptor serine/threonine kinases and Smad transcription factors, has led to an understanding of the molecular mechanisms underlying this highly contextual process.14, 15 Genome-wide transcriptome analyses in various cell types have identified many target genes that are required for ligand-mediated cellular responses. Direct binding of Smad complexes was confirmed by in vitro binding assays, promoter assays and chromatin immunoprecipitation (ChIP) followed by polymerase chain reaction. Until recently, however, regulatory elements were mainly identified in the promoter regions of the target genes, especially 1–2 kb upstream of their transcription start sites.

ChIP with promoter array analysis (ChIP-chip) and ChIP followed by sequencing (ChIP-seq) have become powerful tools for genome-wide mapping of protein-binding sites and epigenetic marks.16, 17 In these approaches, the DNA sample obtained after the ChIP procedure is analyzed using promoter-tiling arrays or massively parallel sequencing (Supplementary Figure 1), which provides a comprehensive chromatin-binding landscape of target transcription factors. Information obtained by these analyses has shed light on previously unrecognized mechanisms and sometimes challenged notions previously characterized in a specific situation. Recently, several groups have reported that Smad proteins tend to co-occupy target sites with cell-type-specific master transcription factors.18, 19, 20 The results also indicate that co-occupied regions mainly overlap with enhancer elements, although previous studies have identified numerous Smad-responsive elements in the promoter regions of their target genes. In addition, recent ChIP-chip/ChIP-seq studies have identified a group of direct target genes, or target gene signatures, in specific cell types and cellular contexts. Intriguingly, Kennedy et al.21 reported that the TGF-β/Smad4 target gene signature identified in ovarian cancer cell lines predicts patient survival.

In this review, we discuss current knowledge of cell-type-specific binding patterns of Smad proteins and mechanisms of transcriptional regulation obtained from recent ChIP-chip/ChIP-seq studies (Supplementary Table 1). We also highlight applications of the genome-wide analyses for cancer research. These insights contribute to the unraveling of the complex mechanisms of TGF-β signaling in cancer biology.

Overview of signaling pathways of TGF-β family members

The TGF-β family consists of 33 members in mammals. Two types of serine/threonine kinase transmembrane receptors, that is, type II and type I receptors, are required for intracellular signal transduction by the TGF-β family members.14 Five type II receptors and seven type I receptors are present in mammals.22 Ligand binding assembles specific type II and type I receptors into heterotetramers. Then the type II receptor transphosphorylates and activates the type I receptor, which subsequently transduces the signal by phosphorylating the carboxyl terminus of receptor-regulated (R)-Smad. In most cell types, TGF-β and activin induce phosphorylation of Smad2 and Smad3 (activin/TGF-β-specific R-Smads, or AR-Smads) and BMPs induce phosphorylation of Smad1, Smad5 and Smad8 (BMP-specific R-Smads, or BR-Smads). Activated R-Smads form heterooligomeric complexes with common-partner (co)-Smad (Smad4). The complexes translocate into the nucleus where they regulate the expression of target genes, such as the genes for Serpine1 (plasminogen activator inhibitor-1), inhibitory (I)-Smads (Smad6 and Smad7) and Id1 (inhibitor of differentiation-1 or inhibitor of DNA binding-1) (Figure 1). Because of their relatively low DNA-binding affinity, Smad complexes interact with a wide variety of DNA-binding proteins and cooperatively regulate a synexpression group of target genes (Figure 2a).2 So far, several transcription factors, such as AP-1,23 ETS,24, 25 basic helix-loop-helix proteins,26, 27 C/EBPβ,28 FoxH129, 30 and FoxO31 have been identified and validated as important cofactors of TGF-β/BMP signaling pathways. 
In addition, Smad complexes recruit coactivators, such as p300 and CREB-binding protein,32, 33 or corepressors, such as ATF-3.34 For example, TGF-β represses transcription of the Id1 gene in epithelial cells through formation of a complex with ATF-3, while TGF-β induces Id1 in cells which do not express ATF-3, such as glioma-initiating cell-like cells.35 Since ATF-3 is induced by tumor necrosis factor-α, signaling crosstalk between TGF-β and tumor necrosis factor-α pathways determines the transcriptional regulation of Id1. Thus, crosstalk with other signaling pathways and interaction with other DNA-binding cofactors define the specific binding patterns of Smads; in addition, interaction with coactivators/corepressors modulates their transcriptional activity (Figure 1).

Figure 1

Signaling of TGF-β family members through Smad complexes. Smad proteins are central mediators of the signal transduction of TGF-β family members. Ligand binding assembles specific type II and type I receptors into heterotetramers. The type II receptor transphosphorylates (P) and activates the type I receptor, which subsequently activates receptor-regulated (R)-Smads. Activated R-Smads form heterooligomeric complexes with common-partner (co)-Smad. In the nucleus, Smad complexes interact with DNA-binding cofactors and cooperatively regulate a group of target genes. Crosstalk with other growth regulatory factors affects the specific binding patterns and transcriptional activity of Smads.

Figure 2

Factors that determine the binding patterns of Smads. (a) A group of genes that are simultaneously regulated by a specific Smad–cofactor complex is known as a synexpression group. Distinct combinations of DNA-binding cofactors in different contexts determine the set of genes regulated by Smad complexes. (b) Cell-type- or lineage-specific master transcription factors (purple) open up local chromatin structure to make Smad-binding regions (red) accessible. The master transcription factors also physically interact with Smads and, in some cases, recruit them to their binding sites. DNA-binding cofactors, induced and activated in context-dependent manner, strengthen the interaction between Smad and DNA. Interaction with coactivators/corepressors also affects the regulation of their target genes. A full colour version of this figure is available at the Oncogene journal online.

Smad proteins are targets of protein modifications, such as phosphorylation, ubiquitination and ADP-ribosylation. The cyclin-dependent kinases (CDKs) CDK8 and CDK9, which are downstream effectors of extracellular-signal-regulated kinase (ERK) MAP kinase, phosphorylate the linker region of Smads in the nucleus.36, 37, 38, 39 Glycogen synthase kinase-3β (GSK3β) also phosphorylates the linker region of Smads, which requires priming phosphorylation by ERK MAP kinase.40 These phosphorylations mark the proteins for polyubiquitination and promote proteasome-mediated degradation of Smad complexes. Several WW domain proteins have been reported to recognize the phosphorylated linker regions and interact with R-Smads.41 Smurf1 is a member of the E3 ubiquitin ligase family, which can target BR-Smads for degradation,42 while NEDD4L (also known as NEDD4-2) is an E3 ubiquitin ligase for AR-Smads.43, 44 Consequently, endogenous ERK MAPK and GSK3β signaling pathways are able to antagonize Smad activity through proteasome-mediated degradation. Recently, deubiquitinating enzymes (DUBs) for Smad proteins have been identified.45, 46 Monoubiquitination of the lysine-519 (K519) residue of Smad4 prevents its association with R-Smads and negatively regulates the TGF-β/BMP signaling pathways. USP9x (also known as FAM) has been identified as a DUB that reverts this modification.45 R-Smads are monoubiquitinated in their DNA-binding domains, which attenuates their affinity for DNA. This monoubiquitination is opposed by another DUB, USP15.46 Recently, Lonn et al.47 found that Smad proteins are targets of ADP-ribosylation. Poly(ADP-ribose) polymerase-1 (PARP-1) interacts with and ADP-ribosylates Smad3 and Smad4 in the nucleus, and affects the binding affinity of Smad complexes in a context-dependent manner.47, 48 Thus, posttranslational modifications of Smad proteins affect their signal transduction capacities; some of these modifications are regulated by other signaling pathways (Figure 1).

Smad-binding motifs

The R-Smads and Smad4 are composed of two evolutionarily conserved domains named Mad Homology 1 and 2 (MH1 and MH2). The MH2 domain plays an important role in the formation of heterooligomeric Smad complexes and in transcriptional activation, whereas the MH1 domain is responsible for sequence-specific DNA-binding activity. Using a polymerase chain reaction-based random-oligonucleotide selection process, an 8-bp palindromic DNA sequence, GTCTAGAC, was identified as a Smad3 and Smad4 binding motif.49 In contrast to Smad3 and Smad4, Smad2 does not directly bind to DNA due to steric hindrance by an inserted sequence in the DNA-binding region.50 The crystal structures of the MH1 domain of Smad1 and Smad3 have revealed that R-Smads recognize and directly bind to half of the palindrome, that is, GTCT or AGAC sequences, through an 11-amino-acid residue β-hairpin loop in the MH1 domain.51, 52, 53 The amino-acid sequences of the loop are completely conserved among R-Smads and show a high level of similarity between R-Smads and Smad4. The half-site sequences are usually referred to as the CAGA box or Smad binding element (SBE). Recent ChIP-chip/ChIP-seq studies have confirmed that the SBE is enriched in the Smad2/3-binding regions.18, 24, 26, 54, 55

Although the MH1 domain of Smad1 has high affinity for SBE,52, 53 BR-Smads seem to prefer a GC-rich sequence, such as GCCGnCGC, which was originally identified in Drosophila.56 In mammals, GC-rich sequences, such as GCCG and (T)GGCGCC, have been identified in the promoter regions of several BMP target genes. Using a de novo motif-finding method, we identified a Smad1/5-binding motif, which is consistent with the previously reported GC-rich sequences and was thus named the GC-rich SBE (GC-SBE).57 Importantly, both GC-SBE and SBE are enriched in the Smad1/5-binding sites identified in both endothelial cells (ECs) and pulmonary arterial smooth muscle cells (PASMCs).57 Since binding motifs for R-Smads have been identified in vitro and in vivo, candidate Smad-binding sites can be predicted in the promoter regions of the target genes. However, these motifs are common throughout the genome, and the majority of them are not occupied by R-Smads when examined using ChIP-chip/ChIP-seq. Thus, additional mechanisms operate to determine the binding patterns of Smads.
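
To illustrate why such short motifs are common throughout the genome, the following minimal Python sketch scans a DNA sequence on both strands for the SBE half-site (GTCT, whose reverse complement is AGAC) and for the GC-rich element GCCGnCGC. The function names and the example sequence are hypothetical and serve only as an illustration; this is not the motif-finding method used in the cited studies.

```python
import re

# IUPAC codes used here; "N" (written "n" in the text) matches any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA/IUPAC string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(sequence, motif):
    """Return sorted 0-based start positions of the motif on either strand."""
    sequence = sequence.upper()
    hits = set()
    for pattern in {motif, reverse_complement(motif)}:
        regex = "".join(IUPAC[base] for base in pattern)
        # A lookahead is used so that overlapping matches are also reported.
        for match in re.finditer("(?=" + regex + ")", sequence):
            hits.add(match.start())
    return sorted(hits)


# Hypothetical sequence: the palindrome GTCTAGAC followed by a GC-rich element.
example = "TTGTCTAGACTTGCCGACGCTT"
sbe_hits = find_motif(example, "GTCT")         # SBE half-sites (GTCT or AGAC)
gc_sbe_hits = find_motif(example, "GCCGNCGC")  # GC-rich element
print(sbe_hits, gc_sbe_hits)
```

Because a 4-bp half-site is expected by chance roughly once every few hundred base pairs in random sequence, a scan of this kind returns far more candidate sites than are occupied in ChIP experiments, which is the point made in the paragraph above.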

Factors that determine the binding patterns of Smads

Recent studies have suggested that Smad complexes colocalize with master transcription factors that specify and maintain cell identities.18, 19, 20 Chen et al.20 pointed out that Smad1 colocalizes in the multiple transcription factor-binding loci with embryonic stem (ES) cell-specific transcription factors, such as Oct4 and Sox2 in mouse ES cells (mESCs). Mullen et al.18 reported that binding regions of Smad3 also overlap with those of Oct4 in both human and mouse ES cells. Intriguingly, at least some of these co-occupied regions are still enriched after tandem ChIP-re-ChIP experiments, indicating that Oct4 and Smad3 bind to similar regions in mESCs simultaneously.18 Moreover, Smad3 colocalizes with MyoD (encoded by Myod1) or PU.1, master transcription factors controlling muscle or hematopoietic differentiation, respectively, in specific cell types which express these genes; forced expression of MyoD in mESCs is sufficient to redirect Smad3 to muscle specific binding sites, where they colocalize.18 In addition, Trompouki et al.19 reported that induction of the myeloid lineage regulator C/EBPα shifted Smad1 to sites newly occupied by C/EBPα in the human erythroleukemia cell line K562. Overexpression of the erythroid regulator GATA1 restricts Smad1 binding to erythroid genes, while binding to genes expressed in other lineages is diminished.19 These findings suggest that Smad complexes are passively recruited to cell-type-specific binding sites through the interaction with master transcription factors.

On the other hand, we recently found that HNF4α, one of the master regulators of hepatocyte differentiation and liver function, contributes to the hepatocyte-specific binding pattern of Smad2/3.58 Interestingly, 32.5% of the Smad2/3-binding regions overlapped with those of HNF4α. This partial overlap argues against the simple model in which cell-type-specific master regulators recruit R-Smads to their binding sites and determine their function. In addition, through the analysis of the distances between the Oct4 peak and the peaks of Sox2 and Smad3 in mESCs, Mullen et al.18 found that Oct4 sites are more closely associated with Sox2 sites than with Smad3 sites, suggesting that Oct4 and Smad3 do not interact in a direct manner. They revealed that nucleosomes were relatively depleted at the sites co-occupied by cell-type-specific master transcription factors and Smad3, and hypothesized that master transcription factors increase the accessibility of SBEs and contribute to Smad3 binding. Intriguingly, MyoD binding has been reported to be associated with local histone acetylation.59 PU.1 and C/EBPα binding has been reported to induce nucleosome remodeling, followed by monomethylation of H3K4.60 John et al.61 reported that cell-type-specific glucocorticoid receptor binding patterns are comprehensively predetermined by cell-specific differences in baseline chromatin accessibility patterns, with secondary contributions from local sequence features.
Similarly, comparison of Smad1/5-binding patterns of ECs and PASMCs suggested that the endothelial-specific binding pattern of Smad1/5 is predetermined by baseline chromatin accessibility patterns.57 Thus, these findings support the notion that Smad complexes determine their target sites together with other DNA-binding cofactors in two different ways: (1) cell-type- or lineage-specific transcription factors, or pioneer factors,62 open up local chromatin structure to make SBE and GC-SBE accessible and (2) DNA-binding cofactors, induced and activated in a context-dependent manner, strengthen the interaction between Smad and DNA (Figure 2b).
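
The kind of overlap statistic quoted above (the percentage of Smad-binding regions that overlap those of a master transcription factor) can be sketched as follows. This is a minimal illustration that assumes peaks are given as half-open (start, end) intervals grouped by chromosome; the peak coordinates are invented, and the cited studies may have defined overlap differently (for example, by peak summits or a distance window).

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]


def fraction_overlapping(query, reference):
    """Fraction of query peaks that overlap at least one reference peak.

    Both arguments map a chromosome name to a list of (start, end) intervals.
    """
    total = 0
    shared = 0
    for chrom, peaks in query.items():
        ref_peaks = reference.get(chrom, [])
        for peak in peaks:
            total += 1
            if any(overlaps(peak, ref) for ref in ref_peaks):
                shared += 1
    return shared / total if total else 0.0


# Hypothetical peak sets (coordinates are made up for illustration).
smad_peaks = {
    "chr1": [(100, 300), (1000, 1200), (5000, 5200)],
    "chr2": [(50, 250)],
}
master_tf_peaks = {"chr1": [(250, 400), (9000, 9100)]}

fraction = fraction_overlapping(smad_peaks, master_tf_peaks)
print(f"{fraction:.1%} of Smad peaks overlap a master-factor peak")
```

The quadratic comparison is adequate for a sketch; for genome-scale peak lists, sorted intervals or an interval tree would be the usual choice.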

Intriguingly, it has been observed that different levels of activation of Smad signaling pathways cause different binding patterns of Smad complexes, possibly correlating to the amount of activated Smad complexes in the nucleus.63 It has been well described that different concentrations of activin regulate the expression of distinct subsets of target genes.64 Lee et al.54 confirmed that phospho-Smad2 is dose-dependently able to bind to different subsets of target genes and regulate their transcription in mESCs. Comparing the ChIP-seq data of different BMP isoforms in ECs, we found that each binding site has different binding affinity for Smad complexes and that the strength of Smad1/5 signaling affects the number and distribution of Smad-binding sites over the genome.57 Thus, these findings suggest that a distinct dose-dependency occurs in the regulation of different subsets of target genes, which may cause phenotypic change.

Smad binding and histone modification markers

As discussed above, local chromatin structure or accessibility affects the binding patterns of Smads. Recent studies have emphasized the importance of enhancers for the precise regulation of expression of target genes.18, 19, 20, 54, 57 On the other hand, several groups have found that most of the Smad-binding sites are located at promoters of known genes.30, 65, 66 Kim et al.30 reported that 50–60% of Smad2/3 binding occurs in exons and promoters in human ES cells (hESCs), while only 10–15% of Smad binding occurs in exons and promoters in derived endoderm. This finding suggests that the preference of Smads for binding to promoters or enhancers is modulated by the differentiation stage.

Smad proteins have also been shown to induce local chromatin remodeling and modification at their binding sites. Both Smad1/5 and Smad2/3 have been reported to physically interact with a histone demethylase, KDM6B (also known as JMJD3), to recruit it to the NOG (encoding noggin) and NODAL promoter regions, respectively, and to cause the loss of the repressive mark histone H3 lysine-27 trimethylation (H3K27me3) in mESCs.67, 68 Recently, Kim et al.30 reported that Smad2/3 and KDM6B are simultaneously enriched in the GSC (encoding goosecoid) and EOMES (encoding eomesodermin) promoters in hESCs after activin treatment, followed by the loss of the H3K27me3 repressive mark (Figure 3a). Interestingly, Fei et al.65 identified KDM6B as one of the BMP4-modulated early neural differentiation regulators, suggesting that loss of repressive histone marks through the Smad–KDM6B pathway explains the transcriptional regulation especially at later time points.

Figure 3

Smad proteins and histone modification marks. Smad proteins have been reported to induce local chromatin remodeling and modification at their binding sites. Several models are described in ES cells, where early developmental genes are poised and ready to be activated in response to extracellular signals, such as nodal. (a) R-Smads physically interact with a histone demethylase, KDM6B (also known as JMJD3), and recruit it to their target sites, followed by the loss of the H3K27me3 repressive mark (light green). (b) Xi et al.70 reported that nodal signaling triggered TRIM33–Smad2/3 complex formation. The TRIM33–Smad2/3 complex recognizes and binds to H3K9me3-K18ac dual histone marks (light blue) and displaces the chromatin-compacting factor HP1γ (heterochromatin protein 1γ) in the GSC and MIXL1 promoters, resulting in the remodeling of the local chromatin structure to make Smad-binding region(s) (red) accessible. A full colour version of this figure is available at the Oncogene journal online.

In addition to sequence-specific DNA-binding transcription factors, histone code reader proteins, which are recruited and bound to specific histone modifications, are reported to help determine the binding sites of Smad proteins. Massagué and colleagues have reported that tripartite motif 33 (TRIM33, also known as TIF1γ or Ectodermin) physically interacts with Smad2 and Smad3 to form a TRIM33–Smad2/3 complex without Smad4.69 TRIM33 contains an N-terminal RING finger/B-box/coiled coil (RBCC) or TRIM domain, and a plant homeodomain (PHD) zinc finger and a Bromo domain in the C-terminus. They reported that the PHD-Bromo cassette recognized histone H3 lysine-9 trimethylation (H3K9me3) and H3 acetylation, especially at lysine residues 18 and 23 (H3K18ac and H3K23ac). During mESC differentiation, nodal signaling triggered TRIM33–Smad2/3 complex formation. The TRIM33–Smad2/3 complex recognizes and binds to H3K9me3-K18ac dual histone marks and displaces the chromatin-compacting factor heterochromatin protein 1γ (HP1γ) in the GSC and MIXL1 promoters, resulting in the remodeling of the local chromatin structure (Figure 3b).70 Agricola et al.71 also found that TRIM33 recognizes and binds to H3K18ac/K23ac. On the other hand, TRIM33 has been reported to bind Smad4 and function as a RING-type ubiquitin ligase for Smad4.72 Consistent with this model, Agricola et al.71 reported that TRIM33 inhibits Smad4 function through ubiquitin-mediated degradation of Smad4, and that its E3 ubiquitin ligase activity is induced after binding to histones. The detailed mechanisms have not been settled, but TRIM33 recognizes a specific histone code and modulates TGF-β/BMP signaling. Since the relationship between Smad proteins and histone modification marks has not been fully elucidated on a genome-wide scale, future analyses will address a possible mechanistic link between Smad proteins and epigenetic marks using ChIP-chip/ChIP-seq approaches.

Smad binding and gene regulation

Previous studies have indicated that binding of transcription factors detected by ChIP-chip/ChIP-seq experiments is not necessarily associated with transcriptional regulation of nearby genes (reviewed in Farnham73). It has frequently been observed that changing the level of a DNA-binding transcription factor alters the expression level of only 1–10% of its potential target genes. Most of the recent studies have confirmed that 1–20% of Smad-binding sites are associated with the regulation of expression of nearby genes. This discrepancy is in part due to the fact that mRNA levels do not reflect transcriptional activity alone, since mRNA levels are also regulated by other biological processes, for example, degradation. Another explanation for the discrepancy is related to the definition of target genes. Although most studies assign each binding site to the nearest gene within 50 kb, the gene actually regulated is not always the nearest one or located within this window. For example, Trompouki et al. revealed that several transcription factors, including Smad1, cooperatively regulate the expression of the hematopoietic gene LMO2 through binding to the known enhancer region at 72 kb upstream of the transcription start site in K562 cells.19, 74 We also observed that Smad1/5 bound to a region 57 kb upstream of the transcription start site of Smad6 in ECs, as well as the LMO2 −72 kb enhancer.57 This region has been reported to be associated with Smad6 expression in the heart, vasculature and hematopoietic organs,75 suggesting that the binding to this region, as well as the promoter region, plays an important role in these cell types. Recently, methods that characterize the chromatin architecture have been developed. Chromosome conformation capture (3C) assays make it possible to study long-distance regulation of genes by enhancers through formation of chromatin loops (reviewed in Simonis et al.76).
Application of these technologies will help to identify the functional relationship between Smad-binding sites and genes implicated in cancer progression.
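
The nearest-gene rule described above, and the way it misses distal enhancers, can be sketched in a few lines of Python. The gene names and coordinates below are hypothetical, all positions are assumed to lie on one chromosome, and strand is ignored, so the sign of the distance only indicates the position relative to the transcription start site (TSS) on the reference coordinate.

```python
def assign_nearest_gene(peak_center, tss_by_gene, max_distance=50_000):
    """Return (gene, signed_distance) for the closest TSS.

    Returns None if no TSS lies within max_distance base pairs of the peak.
    signed_distance is peak_center minus TSS position (strand is ignored).
    """
    best = None
    for gene, tss in tss_by_gene.items():
        distance = peak_center - tss
        if best is None or abs(distance) < abs(best[1]):
            best = (gene, distance)
    if best is None or abs(best[1]) > max_distance:
        return None
    return best


# Hypothetical TSS positions on a single chromosome.
tss_positions = {"GENE_A": 1_000_000, "GENE_B": 1_400_000}

proximal_peak = 1_010_000  # 10 kb from the GENE_A TSS
distal_peak = 928_000      # 72 kb from the GENE_A TSS

print(assign_nearest_gene(proximal_peak, tss_positions))
print(assign_nearest_gene(distal_peak, tss_positions))
print(assign_nearest_gene(distal_peak, tss_positions, max_distance=100_000))
```

With the 50 kb window, the peak placed 72 kb from the TSS is left unassigned, which mirrors how an enhancer at that distance would be missed; widening the window recovers it but at the cost of more spurious assignments, which is why chromatin conformation data are informative here.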

It is also possible that for many sites, binding of Smads is not sufficient for transcriptional regulation, but additional stimuli are required to drive the expression of the target genes. For example, costimulation with tumor necrosis factor-α, which induces the transcriptional repressor ATF-3, affects the expression regulation of the Id1 gene and cellular response.34, 35 Sometimes, ligand stimulation itself induces these cofactors and makes a feed-forward circuit, like in myotube differentiation. The myogenic transcription factor MyoD directly regulates genes expressed during skeletal muscle differentiation together with other transcription factors such as MEF277 and Zfp238 (also known as RP58).78 These transcription factors are also induced by MyoD, and MEF2 functions with MyoD in a positive feed-forward circuit,77 while Zfp238 participates in a negative feed-forward circuit.78 Comparison of MyoD-binding patterns of mouse C2C12 myoblasts and differentiated myotubes has revealed that most binding events in myoblasts are not directly associated with gene regulation. However, MyoD binding increases during myogenic differentiation at many of the regulatory regions associated with genes expressed in skeletal muscle. Intriguingly, the myotube-increased binding sites are enriched for MEF2-like motifs, while the myotube-decreased peaks are enriched for Zfp238-like motifs,59 consistent with the fact that MEF2 positively and Zfp238 negatively cooperate with MyoD. It is possible that TGF-β stimulation induces certain transcription factors, which take part in feed-forward regulatory loops and cooperatively regulate gene expression especially at late time points.

Identification of a TGF-β gene signature

The notion of a ‘gene signature’ comes from the early work on cancer classification and prognosis prediction using genome-wide gene expression profiles obtained from microarray analyses of cancer patients.79 Identification of a group of genes that reflects the activity of a common function, pathway or other property in a specific context is sometimes more revealing than the analysis of single genes. Gene expression signatures obtained under experimental conditions have proved able to subcategorize patients and predict their prognosis. Concerning TGF-β, Coulouarn et al.80 reported that TGF-β-responsive genes at late time points, or a late TGF-β signature, which were identified in mouse primary hepatocytes, successfully discriminate distinct subgroups of hepatocellular carcinoma and possess a predictive value for hepatocellular carcinoma patients.

Combination of ChIP-chip/ChIP-seq and genome-wide transcriptome analyses provides an accurate prediction of target genes of Smad proteins. TGF-β family members regulate a variety of target genes both directly and indirectly, and modulate many biological processes. The chromatin-binding landscape of Smad proteins, obtained by ChIP-chip/ChIP-seq, will help to identify specific genes that are directly regulated by Smad proteins. It will also help to dissect a specific cellular program regulated by TGF-β family members, for example, the growth inhibitory and apoptosis programs of TGF-β. So far, many groups have identified groups of direct TGF-β target genes by using this strategy. Importantly, the TGF-β/Smad4 target gene signature identified in an ovarian cancer cell line predicts patient survival, based on in silico mining of publicly available patient databases.21 Since TGF-β functions as a tumor suppressor in low-grade carcinoma cells, while it promotes metastasis in advanced carcinoma cells, a direct comparison of the Smad-binding sites of these two stages of tumorigenesis, obtained from experimental models or from cancer patients, may reveal specific gene signatures of TGF-β correlating with its tumor-suppressive and tumor-promoting roles, respectively. This may provide novel predictive indicators and biomarkers for TGF-β-targeting treatments.
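
As a minimal sketch of the combined strategy described above, the following Python function intersects genes assigned to Smad-binding sites with genes whose expression changes on ligand stimulation. The gene identifiers, the log2 fold-change values and the threshold are all invented for illustration; the cited studies used their own thresholds and statistics.

```python
def split_targets(bound_genes, log2_fold_changes, min_abs_log2fc=1.0):
    """Split ligand-responsive genes into candidate direct and indirect targets.

    bound_genes: genes with an assigned Smad-binding site (from ChIP data).
    log2_fold_changes: gene -> log2 fold change after ligand stimulation.
    A gene is 'responsive' if |log2 fold change| >= min_abs_log2fc.
    """
    responsive = {
        gene for gene, lfc in log2_fold_changes.items()
        if abs(lfc) >= min_abs_log2fc
    }
    direct = responsive & set(bound_genes)
    indirect = responsive - direct
    return sorted(direct), sorted(indirect)


# Hypothetical inputs (not real measurements).
bound = {"gene_a", "gene_b", "gene_c"}
expression = {"gene_a": 2.3, "gene_b": 0.2, "gene_d": -1.5, "gene_e": 0.1}

direct, indirect = split_targets(bound, expression)
print("candidate direct targets:", direct)
print("candidate indirect targets:", indirect)
```

In this toy example only one of the three bound genes responds, in line with the observation that a minority of binding sites are associated with regulation of nearby genes; the responsive but unbound gene would be a candidate indirect target.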

Conclusions and perspectives

The signaling pathways of TGF-β family members are key players in tumorigenesis and cancer progression. TGF-β can function both as a tumor-suppressing and a tumor-promoting factor during cancer progression. BMP signaling has been reported to play critical roles in oncogene-induced senescence, which is part of the tumorigenesis barrier and blocks cellular proliferation by inducing irreversible growth arrest.66 Interestingly, BMP signaling induces differentiation of certain cancer-initiating cells, such as glioma-initiating cells,81 while TGF-β/activin signaling maintains their stem cell-like properties.9, 10 Since Smad proteins are central mediators of the signal transduction, studies on global and genome-wide binding sites of Smad proteins may reveal important insights into their complex biological functions.

Identification of an appropriate antibody is the first and most important step for ChIP-chip and ChIP-seq analyses, because the quality of ChIP data depends crucially on the quality of the antibody used.16 Since MH1 and MH2 domains are conserved among R-Smads, several specific antibodies for Smad proteins recognize their linker region. However, linker regions are targets of posttranslational modification and protein interactions, as discussed above. It is possible that such changes may attenuate the affinities of antibodies under specific conditions. Although ChIP-grade antibodies for Smad proteins have been established (Supplementary Table 2), careful interpretation of the results will be required.

In summary, genome-wide analyses of the binding sites of Smad proteins have led to important discoveries of their cell-type-specific and context-dependent functions. Application of genome-wide techniques to experimental models and human samples derived from cancer patients will help to clarify their complex mechanisms during cancer progression, and may also provide potential prognostic biomarkers for future cancer therapy.