Introduction

Drug-induced liver injury is one of the most common causes of attrition of candidate drugs during the later stages of drug development. Consequently, early detection of drug-induced hepatotoxicity, before compounds are tested in animals and enter clinical trials, is essential to save time and resources (O’Brien et al. 2006). McDonald and Robertson (2009) calculated that, out of 51 drugs taken off the market, 29% and 33% were withdrawn due to hepatotoxicity and cardiotoxicity, respectively. Primary human hepatocytes (HH) are considered the gold standard model for xenobiotic metabolism and cytotoxicity studies (Guillouzo et al. 2007). However, the scarce availability of fresh human liver samples, complicated isolation procedures, limited life span, inter-individual variability, and cost constitute serious limitations for the use of such in vitro systems in screening (Madan et al. 2003). To overcome these limitations, immortalized liver-derived cell lines were proposed as an ideal alternative for studying drug metabolism because of their unlimited availability and phenotypic stability. A first alternative is the widely used human hepatocellular carcinoma cell line HepG2. These cells are highly differentiated and display many of the genotypic features of normal liver cells (Sassa et al. 1987). Consequently, HepG2 cells can be used to screen the cytotoxic potential of new chemical entities at the lead generation phase (Gerets et al. 2009). Nevertheless, their main limitation is their low metabolic capacity compared with primary hepatocytes (Xu et al. 2004), which makes them appropriate for testing the toxicity of the parent molecule but less suited for metabolite toxicity testing. Westerink and Schoonen (2007) showed that HepG2 cells have low levels of cytochromes P450 (CYPs) but normal levels of phase II enzymes, with the exception of UDP-glucuronosyl transferases.

Promising new cellular models such as HepaRG cells have also been developed to address the low metabolic capacity observed in HepG2 cells. Indeed, HepaRG cells, a human hepatocellular carcinoma cell line (Gripon et al. 2002) composed of a mixture of both hepatocyte-like and biliary-like cells, retain a drug metabolism capacity comparable to that of primary HH, without the inter-donor variability and functional instability with time in culture that have been observed in primary cells (Lambert et al. 2009). HepaRG cells were shown to maintain hepatic functions and expression of liver-specific genes at levels comparable to HH (Anthérieu et al. 2010).

The aim of our investigations was to better characterize three cellular models, namely HepG2, HepaRG cells and primary HH with a focus on gene expression, CYP activity and cytotoxicity evaluation. To the best of our knowledge, this combined approach has been very rarely attempted in the same study. In a first instance, the different cellular models were exposed to beta-naphthoflavone (BNF), phenobarbital (PB), and rifampicin (RIF) to study their gene expression modulation using an absorption–distribution–metabolism–excretion (ADME) array and their cytochrome P450 enzyme activities (CYP1A2, 2B6, and 3A4). Subsequently, they were exposed to 21 reference compounds with known hepatotoxic potentials in humans. The objectives of the present study were to (1) compare mRNA levels of genes involved in metabolism of control cells and of cells exposed to inducers, (2) determine CYP450 activities after exposure to the three inducers, and (3) evaluate the potential of the different cellular models to detect hepatotoxic compounds.

Materials and methods

Materials

All compounds and reagents (i.e., BNF, RIF, aflatoxin B1, amiodarone, astemizole, cerivastatin, danazol, flutamide, ketoconazole, labetalol, aspirin, chlorpromazine, clofibrate, cyclosporine A, furazolidone, imipramine, tacrine, tamoxifen, betaine HCl, flufenamic acid, isoproterenol, praziquantel, and primidone) were of analytical grade and were purchased from Sigma-Aldrich (Saint Louis, USA), except PB, which was obtained from Certa (Braine-L’Alleud, Belgium). RNeasy Mini kits were purchased from Qiagen (Valencia, California, USA) and the human ADME arrays were obtained from Codelink (now Applied Microarrays, Tempe, USA).

Cellular models

The human hepatocellular carcinoma (HepG2) cell line, purchased from the European Collection of Cell Cultures (ECACC, Salisbury, UK), was maintained as an adherent cell line in Dulbecco’s modified Eagle’s medium supplemented with 10% fetal bovine serum, 50 U/ml penicillin, 50 μg/ml streptomycin, 2 mmol/l L-glutamine, and 1× nonessential amino acids at 37°C in a humidified 5% CO2:95% air atmosphere. Cells were passaged as needed using 0.5% trypsin-EDTA. All the cell culture solutions were purchased from BioWhittaker Inc (Walkersville, MD, USA).

The HepaRG cell line is derived from a liver tumor of a female patient suffering from hepatocarcinoma (Gripon et al. 2002). HepaRG cells were purchased from Biopredic International (Rennes, France) as confluent monolayers (ca. 1 × 10⁶ cells per well) in 6-well plates with the associated medium. Six-well plates were used for gene expression and CYP450 evaluation. HepaRG cells plated in 96-well plates were used for the cytotoxicity experiments. For both formats, the medium used was composed of Williams E with Glutamax-I supplemented with 100 IU/ml of penicillin, 100 μg/ml of streptomycin, 4 μg/ml of bovine insulin and 5 × 10⁻⁵ M of hydrocortisone hemisuccinate. Before shipment, the cells were allowed to proliferate and, when confluence was reached, were cultured for 2 weeks in the same medium supplemented with 2% dimethylsulfoxide (DMSO) in order to obtain maximum differentiation. After shipment, the medium was changed to high-DMSO-containing medium. Media were renewed every 2–3 days. The last medium renewal before starting the induction period was done with “Enriched” medium. On the day of compound exposure, “induction” medium was used. Details of both media are proprietary (Biopredic).

Primary cultures of HH, freshly isolated from three different donors (for characteristics, see Table 1), were purchased from Biopredic International (Rennes, France) as confluent monolayers (ca. 1.5 × 10⁶ cells per well) in 6-well plates pre-coated with a single film of collagen. This format was used to generate mRNA and CYP activity data. Fresh HH were seeded in Williams E medium supplemented with 10% fetal calf serum, 100 U/μl penicillin, 100 μg/ml streptomycin, 1 μg/ml insulin, 2 mM L-glutamine, and 1 μg/ml bovine serum albumin. Upon arrival, the Biopredic proprietary shipping medium was replaced with Williams E medium containing Glutamax-I, penicillin (100 IU/mL), streptomycin (100 μg/mL), bovine insulin (4 μg/mL), and hydrocortisone hemisuccinate (50 μM). The hepatocytes were incubated in a humidified 5% CO2:95% air atmosphere at 37°C for ca. 2 h before starting the treatment period in the presence of the reference inducers.

Table 1 Human hepatocytes donor demographics and characterization

Cryopreserved primary HH from three different donors were purchased from CellzDirect/Invitrogen (Cheshire, UK; for characteristics, see Table 1). This format was used to perform the cytotoxicity experiments. Cryopreserved HH were thawed according to CellzDirect’s standard method. In brief, hepatocytes were thawed at 37°C and poured into pre-warmed (37°C) CHRM™ thawing medium at a ratio of one vial/50 ml. The cells were centrifuged at 100 g for 10 min and resuspended in 2–3 ml cold (4°C) CHPM™ plating medium, and cell viability was determined. The cells were seeded in a collagen-coated E-plate at a density of 20,000 cells/well and allowed to attach in a humidified 5% CO2:95% air atmosphere at 37°C for ca. 4–6 h, after which the medium was changed to Williams E medium containing Glutamax-I, penicillin (100 IU/mL), streptomycin (100 μg/mL), bovine insulin (4 μg/mL), and hydrocortisone hemisuccinate (50 μM). Subsequently, 10 μl of compound was added to the wells to start the incubation.

Toxicogenomics and CYP activities determined after exposure of the different cellular models to inducers (6-well format)

Treatment with inducers

The cells of the 3 cellular models were exposed to BNF (25 μM), PB (500 μM), and RIF (25 μM) for 24 h for gene expression evaluation and for 72 h for CYP activity measurements, with medium renewal every 24 h. BNF was used as reference inducer for CYP1A2, PB for CYP2B6, and RIF for CYP3A4. Stock solutions were prepared in DMSO and further 100× diluted in the adequate culture medium (0.1% (v/v) final DMSO concentration). Control cells were exposed to 0.1% (v/v) DMSO. The cells were then incubated in a 5% CO2:95% air humidified atmosphere at 37°C.

Gene expression evaluation

Total RNA was isolated from the cells using the RNeasy Mini kit according to the manufacturer’s instructions (Qiagen Benelux, Venlo, The Netherlands). RNA quantity was assessed with the NanoDrop spectrophotometer (Isogen, IJsselstein, The Netherlands) and RNA quality was checked with the Agilent™ 2100 Bioanalyzer (Agilent Technologies, Massy, France). The human ADME microarrays were purchased from Applied Microarrays. Each array was composed of 1,250 different probes. The genes present on the microarray are all part of the human whole genome microarray, which has been extensively used in the literature (Guo et al. 2006; Hiyama et al. 2009; Kamei et al. 2009). To perform cDNA amplification and hybridization, the Codelink™ iExpress iAmplify cRNA Prep & Hyb Kit of Applied Microarrays (Tempe, AZ, USA) was used according to the manufacturer’s instructions. The arrays were scanned using the GenePix Array Scanner from Axon (Axon Instruments, Foster City, CA, USA) with laser excitation at 635 nm for the Cy5 label. The gene expression data were analyzed using a two-sample t test with unequal variances. Significantly differentially expressed probes were defined as having an adjusted p value of <0.05 (Benjamini and Hochberg 1995) and a fold change of >2 for upregulated genes or <−2 for downregulated genes. Principal component analysis (PCA) was used to compare the different cellular models. PCA provides a means to reduce high-dimensional gene expression data to a few principal components. Gene expression profiles were considered to be similar when data were close in PCA space.
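The probe-calling rule described above (Benjamini-Hochberg adjusted p < 0.05 combined with a fold change beyond ±2) can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in the study: the function names, the probe names, and the example numbers are hypothetical, and the signed fold-change convention (negative reciprocal for downregulation) is an assumption inferred from the "<−2" cut-off. Raw p values are taken as input; in practice they would come from the unequal-variance t test.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def signed_fold_change(treated, control):
    """Assumed convention: ratio if >= 1, otherwise the negative reciprocal."""
    ratio = treated / control
    return ratio if ratio >= 1 else -1.0 / ratio


def call_probes(probes, alpha=0.05, cutoff=2.0):
    """probes: list of (name, control_mean, treated_mean, raw_p_value)."""
    adjusted = benjamini_hochberg([p[3] for p in probes])
    calls = {}
    for (name, control, treated, _), q in zip(probes, adjusted):
        fc = signed_fold_change(treated, control)
        if q < alpha and fc > cutoff:
            calls[name] = "up"
        elif q < alpha and fc < -cutoff:
            calls[name] = "down"
        else:
            calls[name] = "unchanged"
    return calls


# Hypothetical intensities and p values, for illustration only.
example_probes = [
    ("probe_A", 100.0, 450.0, 0.001),
    ("probe_B", 200.0, 50.0, 0.004),
    ("probe_C", 100.0, 130.0, 0.03),
    ("probe_D", 100.0, 300.0, 0.20),
]
example_calls = call_probes(example_probes)
```

In this toy input, probe_D has a threefold change but is not called because its adjusted p value exceeds 0.05, and probe_C is significant but changes less than twofold; both criteria must hold for a probe to be counted.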

CYP450 enzymatic activities determination after induction

The cells were exposed to the different inducers for 3 days, with daily medium renewal. At the end of the treatment, the cells were washed and harvested in 50 mM Tris-HCl buffer, and the microsomal fractions were prepared by differential centrifugation (Pearce et al. 1996). CYP marker activities were measured according to methods adapted from the literature (Faucette et al. 2000; Kanazawa et al. 2004). Briefly, microsomes prepared from monolayers of the different cell types were used for the analysis of 7-ethoxyresorufin O-dealkylation, bupropion hydroxylation, and midazolam 1′-hydroxylation as marker activities of human CYP1A2, CYP2B6, and CYP3A4, respectively. Incubations were performed in 50 mM phosphate buffer at 37°C using an NADPH regenerating system (BD BioSciences, San Jose, CA, USA) and the CYP450 substrates 7-ethoxyresorufin (10 μM), bupropion (200 μM), or midazolam (3 μM) under the following conditions: 100, 500, and 50 μg/ml microsomal protein and incubation times of 60, 20, and 5 min for CYP1A2, CYP2B6, and CYP3A4, respectively. Reactions were stopped by the addition of ice-cold acetone and clear supernatants were analyzed for the presence of metabolites. The amount of resorufin formed was measured by fluorimetry (λ ex/em, 530/590 nm). The amounts of hydroxybupropion and 1′-hydroxymidazolam were determined by LC-MS/MS assays (Shah et al. 2000). Supernatants were mixed with propranolol hydrochloride as internal standard and injected into an HPLC system (Agilent Technologies, Palo Alto, USA) equipped with an XDB-C18 column (50 × 2.1 mm, 5 μm). Quantification of hydroxybupropion and 1′-hydroxymidazolam was performed with a Quattro Micro mass spectrometer (Waters-Micromass, Manchester, UK). Elution was performed using a mixture of solvent A, consisting of 0.1% TFA in water adjusted to pH 2.4 with NH4OH, and solvent B, consisting of ACN.
For the hydroxybupropion determination, the proportion of solvent B in the mobile phase was increased from 20% to 90% in 7.5 min, remained at 90% for 2 min and decreased from 90% to 20% thereafter (total run of 16 min). For the determination of 1′-hydroxymidazolam, the proportion of solvent B in the mobile phase was increased from 25% to 90% in 7 min, remained at 90% for 3 min and decreased from 90% to 25% thereafter (total run of 15 min). The peak areas of the m/z 256 → 139 (hydroxybupropion) and m/z 342→324 (1′-hydroxymidazolam) product ion were measured against the peak areas of the m/z 260→116 of the internal standard. Microsomal proteins were determined using the BCA Protein Assay adapted to microtiter plate (Smith et al. 1985). Activities were compared with those in corresponding vehicle-treated cells with data expressed as fold induction over controls.

Effects of human hepatotoxicants and nonhepatotoxicants on the different cellular models (96-well plate format)

Treatment with hepatotoxic and nonhepatotoxic compounds

The three cellular models were treated with 21 reference compounds either known or not known to cause hepatotoxicity (eight severely hepatotoxic compounds, i.e., aflatoxin B1, amiodarone, astemizole, cerivastatin, danazol, flutamide, ketoconazole, and labetalol; eight moderately hepatotoxic compounds, i.e., aspirin, chlorpromazine, clofibrate, cyclosporine A, furazolidone, imipramine, tacrine, and tamoxifen; and five nonhepatotoxic compounds, i.e., betaine HCl, flufenamic acid, isoproterenol, praziquantel, and primidone). The compounds were classified according to O’Brien et al. (2006) and Wen and Zhou (2009).

Time-dependent cell response profiling using real-time cell analyzer

The xCELLigence System from Roche Diagnostics (Vilvoorde, Belgium) measures electrical impedance across microelectrodes integrated into the bottom of tissue culture E-Plates. The impedance measurement provides quantitative information on the biological status of the cells, including cell number, viability, and morphology. The 96-well E-plates were coated with collagen (Collagen R Solution 0.2% (Serva, Heidelberg, Germany) for HepG2 and HepaRG cells and Rat Tail Collagen, type 1 (BD Biosciences, Erembodegem, Belgium) for cryopreserved HH) according to the manufacturer’s recommendations. Preliminary experiments determining the best coating and the cell density were performed as previously described in Atienzar et al. (2011). After background reading the appropriate number of cells was added to the plate (10,000 cells/well for HepG2 cells, 20,000 cells/well for cryopreserved primary hepatocytes, and HepaRG plates were seeded at Biopredic until confluence). The cells were allowed to attach at room temperature for 30 min, after which they were placed on the reader in the incubator for continuous recording of impedance overnight for HepG2 and HepaRG cells and for 4–6 h for primary hepatocytes. The cells were then exposed to the compounds for 72 h. Stock solutions of 100 mM were used to prepare three other stock solutions at 0.1, 1, and 10 mM in 100% DMSO. These solutions were further diluted in water and added to the wells. Final concentrations in the plates were 0.1, 1, 10, and 100 μM, and the final DMSO percent was 0.1%. Each treatment condition was measured in triplicate except for the control cells exposed to 0.1% DMSO (n = 6) and for the positive controls (n = 2, 0.00125, 0.0025, and 0.005% Triton X-100). For three compounds (amiodarone, ketoconazole, and furazolidone) solubility problems occurred at 100 mM and a stock solution of 50 mM was prepared, so that the final concentrations in the plate were 0.05, 0.5, 5, and 50 μM. 
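The dilution arithmetic above can be checked with a short sketch. The 1,000-fold overall dilution is an assumption inferred from the stated numbers (a 0.1 mM stock giving 0.1 μM in the well, and 100% DMSO giving 0.1%); the intermediate stocks for the three poorly soluble compounds (0.05, 0.5, and 5 mM) are likewise inferred, and the function names are hypothetical.

```python
UM_PER_MM = 1000.0  # 1 mM equals 1000 uM


def final_concentration_um(stock_mm, dilution_factor=1000.0):
    """Final well concentration (uM) for a stock (mM) diluted by dilution_factor."""
    return stock_mm * UM_PER_MM / dilution_factor


def final_dmso_percent(dilution_factor=1000.0, stock_dmso_percent=100.0):
    """Final DMSO percentage when a 100% DMSO stock is diluted by dilution_factor."""
    return stock_dmso_percent / dilution_factor


# Standard series: stocks of 0.1, 1, 10, and 100 mM in 100% DMSO.
standard_series = [final_concentration_um(s) for s in (0.1, 1.0, 10.0, 100.0)]

# Reduced series (assumed stocks) for compounds limited to a 50 mM top stock.
reduced_series = [final_concentration_um(s) for s in (0.05, 0.5, 5.0, 50.0)]

dmso = final_dmso_percent()
```

Under this assumption the standard series reproduces the stated 0.1, 1, 10, and 100 μM, the reduced series gives 0.05, 0.5, 5, and 50 μM, and the vehicle stays at 0.1% DMSO in every well.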
The cells were monitored in real time, at 37°C in a humidified 5% CO2:95% air atmosphere, using the multiplate xCELLigence platform. Intervals for data collection were every 10 min after compound addition for 2 h followed by every 30 min for 72 h. To quantify the cell status based on the measured cell-electrode impedance, a parameter termed cell index (CI) is derived, according to the following equation (Abassi et al. 2009):

$$ \text{CI} = \max_{i = 1,\dots,N} \left( \frac{R_{\text{cell}}(f_i)}{R_{\text{b}}(f_i)} - 1 \right) $$

where $R_{\text{b}}(f)$ and $R_{\text{cell}}(f)$ are the frequency-dependent electrode resistances (a component of impedance) without cells or with cells present, respectively, and N is the number of frequency points at which the impedance is measured. Thus, CI is a quantitative measure of the status of the cells in an electrode-containing well. Under the same physiological conditions, more cells attaching to the electrodes lead to a larger $R_{\text{cell}}(f)$ value and hence to a larger CI. A decrease in CI generally correlates well with cell death, despite potential confounding factors. Furthermore, for the same number of cells present in the well, a change in cell status, such as morphology, will lead to a change in the CI. A “normalized CI” at a given time point is calculated by dividing the CI at that time point by the CI at a reference time point. Thus, the normalized CI is 1 at the reference time point. The normalization was done using the last time point before compound addition. This allows a more precise comparison of the effect of the different concentrations tested versus the control. The CI values presented here were calculated from triplicate values (technical replicates) and are presented as average ± standard deviation. Each experiment was repeated three times for each cellular model (biological replicates). LC50 values were determined after 48 and 72 h using an internally developed Excel macro. In order to evaluate the performance of the different cellular models in detecting hepatotoxic and nonhepatotoxic compounds, the sensitivity and specificity percentages were calculated. The sensitivity is defined as the ability of a test system to predict the positive outcome under evaluation (i.e., hepatotoxicity). The specificity represents the ability of a test system to predict the negative outcome under evaluation (i.e., nonhepatotoxicity).
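As a minimal sketch of the cell index definition and the normalization step described above, the following code computes CI from resistances measured at N frequencies, normalizes a CI time series to a reference time point, and summarizes replicate wells as mean ± standard deviation. The function names and all resistance values are hypothetical, and the use of the sample standard deviation is an assumption; the instrument software may differ in detail.

```python
import statistics


def cell_index(r_cell, r_background):
    """CI = max over frequencies f_i of (R_cell(f_i) / R_b(f_i) - 1)."""
    if len(r_cell) != len(r_background):
        raise ValueError("resistance lists must cover the same frequencies")
    return max(rc / rb - 1.0 for rc, rb in zip(r_cell, r_background))


def normalized_ci(ci_series, reference_index):
    """Divide every CI by the CI at the reference time point."""
    reference = ci_series[reference_index]
    return [ci / reference for ci in ci_series]


def mean_and_sd(replicates):
    """Average and sample standard deviation across replicate wells."""
    return statistics.mean(replicates), statistics.stdev(replicates)


# Hypothetical resistances (ohms) at three frequencies.
background = [100.0, 80.0, 60.0]
with_cells = [150.0, 200.0, 90.0]
ci = cell_index(with_cells, background)

# Hypothetical CI time course; index 1 is the last reading before dosing.
time_course = [1.5, 3.0, 2.25]
normalized = normalized_ci(time_course, reference_index=1)

average, spread = mean_and_sd([0.9, 1.0, 1.1])
```

Here the CI is 1.5 (set by the second frequency), and the normalized series is 0.5, 1.0, 0.75, so the normalized CI equals 1 at the reference time point as stated in the text.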

Results

Toxicogenomics and CYP450 activities after exposure of the different cellular models to CYP450 inducers

Gene expression analysis

The PCA representing the gene expression intensities observed in the three cellular models exposed to 0.1% DMSO and to the three inducers (BNF, PB, and RIF) is shown in Fig. 1a, b, respectively. The PCAs have been performed on the normalized intensities of the samples to see how the samples group together. The data were normalized against all housekeeping genes present on the microarray. Each model has been done in triplicate and each point in the PCA plot represents the gene expression results of an individual array. One of the replicates of donor 2 (HH2) was a clear outlier (data not shown) due to a technical issue (low hybridization to the array) and therefore omitted from further analysis. The PCAs show that the technical replicates per cellular model are relatively well clustered (Fig. 1a). HepG2 and HepaRG cells were equally separated from the HH donors HH2 and HH3 in the control condition (Fig. 1a). HepaRG cells are closer to HH1 compared with HepG2 cells. After induction, HepaRG cells are more closely related in space to the human donors than HepG2 cells (Fig. 1b). Only limited changes in gene expression pattern, after exposure to any of the three inducers, could be observed for HepG2 cells in comparison to the HepaRG and three human donor cells (Fig. 1b).

Fig. 1

Principal component analysis of gene expression data from HepG2, HepaRG, and human hepatocytes exposed to solvent and three inducers. PCA was performed on the normalized intensities of the control samples (cells exposed to 0.1% DMSO) (a) and of cells exposed to BNF, PB, and RIF (b). Each model has been done in triplicate and each point in the PCA plot represents the gene expression result of a separate array. One of the triplicates of donor 2 (HH2) was an outlier due to technical issue and for that reason not taken into account to make any further calculations

Gene expression analysis within the control condition is shown in Table 2 (upper part). Here, each cellular model within the control condition is compared with each of the others and the number of differentially expressed probes is calculated. Differentially expressed probes were arbitrarily defined as having a fold change of >2 for upregulated genes or <−2 for downregulated genes. In addition, significant results (p < 0.05) are marked with an asterisk. The largest difference in the number of differentially expressed probes within the control condition was observed between HH and HepG2 cells, with an average of 599 differentially expressed probes (out of 1,250 probes). Between HepG2 and HepaRG cells, an average of 490 probes were differentially expressed. The difference between HH and HepaRG cells and the difference among the HH donors were comparable, with an average of 315 and 326 differentially expressed probes, respectively. After induction, the number of differentially expressed probes in the control versus the induced condition was calculated for each cellular model (Table 2, lower part). An average of 73 probes were differentially expressed (cut-off set at 2) in HepG2 cells compared with an average of 107 probes in HH donors (100, 131, and 91 differentially expressed probes in donors 1, 2, and 3, respectively).

Table 2 Comparison of gene expression changes in control and treated cells from different cellular models

Table 3 presents a comparison of the gene expression fold changes within the control condition for a total of 39 metabolizing genes (25 phase I and 14 phase II metabolizing genes) and six phase III transporter genes (three efflux and three uptake drug transporters), that were present on the human ADME array. The values in the table represent the fold changes between two cellular models that were compared within the vehicle-control condition. All possible combinations are represented.

Table 3 Summary of the gene expression fold changes among the different cellular models

HepaRG vs. HepG2

Twenty-one of 45 genes were significantly higher expressed (cut-off = 2, p < 0.05) in HepaRG cells compared with HepG2 cells. Only glutathione transferase M3 gene was significantly less expressed (2.95-fold) in HepaRG compared with HepG2 control cells. For the remaining 23 genes, similar expression levels were obtained between the two models. The CYP3A4 gene, one of the most important CYP for drug metabolism, is significantly higher expressed in HepaRG cells (19.1 times, p < 0.05) compared with HepG2. Other genes such as CYP2C18 and the uptake transporter, solute carrier family 22 drug transporter, were also significantly higher expressed in HepaRG cells compared with HepG2 cells (56 and 100 times, respectively).

Primary HH vs. HepG2

Thirty-three of 45 genes were higher expressed (cut-off = 2) in the HH compared with HepG2 in at least one of the three donors. Of these 33 genes, 20 were systematically higher expressed in all three donors. For instance, CYP2C18 and CYP3A4 were on average 499 and 221 times higher expressed in primary cultures of hepatocytes compared with HepG2 cells, respectively. Only two genes (glutathione transferase M3 and UDP-glucose ceramide glucosyltransferase-like 1) were higher expressed in HepG2 cells compared with HH2.

Primary HH vs. HepaRG

Six genes, including the CYP3A4 gene, were systematically higher expressed (cut-off of 2) in all three preparations of HH compared with HepaRG cells. For instance, CYP2D6 was 35–50 times higher expressed at the mRNA level in fresh HH as compared with HepaRG cells. This result was expected as the HepaRG cells were derived from an individual that is a poor CYP2D6 metabolizer (Guillouzo et al. 2007). Only CYP4B1 gene was clearly higher expressed in HepaRG cells compared with the three donors (with a factor of minimum ca. 10). For all the other genes, similar expression values were obtained in both models.

Among the HH donors

Small differences in gene expression among the hepatocytes from the three donors were observed. The greatest difference among the three donors was observed for the genes CYP26A1 and flavin containing monooxygenase 1. The largest number of differences was observed between donors 1 and 3, with 12 differentially expressed genes.

Gene expression changes after exposure to BNF, PB, and RIF are presented in Table 4. To improve data visualization, a rectangle is drawn when the gene is regulated in the same direction in all different cellular models (five of five) and a dashed rectangle is drawn when four out of five cellular models are regulated in the same direction irrespective of the values of the factors.

Table 4 Summary of the gene expression fold changes in the different cellular models after exposure

BNF

Five genes (CYP2D6, flavin containing monooxygenase 5, sulfotransferase family 1E1, UDP-glucose ceramide glucosyltransferase-like 1, and solute carrier family 10) were down-regulated and 2 genes (CYP1A1 and heme oxygenase 1) were up-regulated in all models following BNF treatment. Six genes (CYP1A2 (up), CYP2E1 (down), CYP3A7 (down), CYP4B1 (down), epoxide hydrolase 1 (s1) (up), and ATP-binding cassette2 (ABCC2) (up)) were regulated in the same direction in all models except for one human hepatocyte donor. CYP7A1, flavin containing monooxygenase 1 and solute carrier family 22 (SLC22A1) were up-regulated in HepG2 cells and downregulated in the other models. When comparing the three HH donors, a similar trend in gene expression was found between donors 1 and 2. Donor 3 showed a slightly different expression pattern.

PB

Four genes (CYP3A7, epoxide hydrolase 1 (s1), heme oxygenase 1, and ATP binding cassette (ABCB1)) were upregulated by PB in all cellular models and two genes (sulfotransferase family 1E1 and 1B1) were downregulated. The genes CYP2A13 and CYP3A4 were downregulated in HepG2 cells and upregulated in the other cellular models and vice versa for CYP7A1. All three donors responded in a similar way for most genes.

RIF

Five genes (CYP3A7, epoxide hydrolase 1 (s1 and s2), heme oxygenase 1, and ATP-binding cassette (ABCB1)) were upregulated by RIF in all cellular models. CYP3A4 was upregulated in all models, except for HepG2 cells. All three HH donors responded in a similar way for most genes.

CYP activities measurement

Table 5b represents the fold induction of CYP1A2, 2B6 and 3A4 activities in the three cellular models exposed to the different inducers. Experiments were performed in triplicate and the average fold induction is presented. The individual basal and induced activities are presented in Table 5a.

Table 5 Summary of CYP activity in the HH (3 donors), HepaRG, and HepG2 cells after exposure to vehicle, BNF, PB, and RIF

HepG2

Treatment with BNF caused a 9.1-fold increase in CYP1A2 activity, whereas PB and RIF did not have any effect on the CYP2B6 and 3A4 activity, respectively.

HepaRG

BNF treatment increased the CYP1A2 activity 35.7-fold compared with the vehicle-treated HepaRG cells. PB treatment also increased CYP2B6 activity 8.7-fold. CYP3A4 activity was increased 42.4 and 36.6 times after RIF and PB treatment, respectively.

HH

Treatment with BNF caused on average a 10-fold increase in CYP1A2 activity. No data for CYP2B6 activity were reported for donor 3 due to a technical problem. Treatment with PB caused a 12.8- and 8.3-fold increase in CYP2B6 activity for donors 1 and 2, respectively. Donor 3 showed a 10.6-fold increase in CYP3A4 activity after treatment with RIF, whereas donors 1 and 2 showed only a 3.7- and 2.1-fold increase, respectively. PB also induced CYP3A4 activity in all three donors, in the range of 2- to 6-fold.

Comparing all models

HepaRG cells were the most inducible model; for instance, CYP1A2 was induced 35.7-fold (BNF), CYP2B6 8.7-fold (PB), and CYP3A4 42.4-fold (RIF). The HH donors responded moderately to the different inducers. CYP1A2 induction showed little variability in the response, whereas CYP2B6 and 3A4 showed more variability in the response but always in the same direction. Almost no induction could be observed in HepG2 cells, except for CYP1A2 after BNF treatment.

Cytotoxicity experiment

Table 6 summarizes the LC50 values of the different cellular models exposed to 16 hepatotoxic and 5 nonhepatotoxic compounds using the xCELLigence platform. Figure 2 shows the curves obtained with the xCELLigence system in all cellular models exposed to 10 or 100 μM aflatoxin B1. To better visualize the data, no more than three cellular models are presented on the same graph. LC50 values were calculated from these types of graphs by means of an internally designed Excel macro. A compound is classified as highly cytotoxic when the LC50 value is below 10 μM, moderately cytotoxic when the LC50 value is between 10 and 50 μM, weakly cytotoxic when the LC50 value is between 50 and 100 μM, and not cytotoxic when the LC50 value is above 100 μM. These cut-off values were chosen arbitrarily. All negative compounds were correctly identified in all models. Of the eight severely hepatotoxic compounds, cerivastatin was found to be highly cytotoxic in all cellular models. Danazol had an LC50 value higher than 100 μM in all models, whereas its Cmax (the maximum concentration of a drug in the blood after dosing) was only 0.16 μM in in vivo studies. Aflatoxin B1 was found to be highly cytotoxic in HepaRG cells and HH and only moderately cytotoxic in HepG2 cells. Amiodarone and astemizole were classified as highly cytotoxic in HH, moderately cytotoxic in HepG2 and weakly cytotoxic in HepaRG cells. Flutamide, ketoconazole and labetalol induced cytotoxicity in HH only. For the eight moderately hepatotoxic compounds, chlorpromazine and tamoxifen were moderately cytotoxic in HepG2 and HepaRG cells and highly cytotoxic in the three HH donors. Aspirin and clofibrate were classified as noncytotoxic in all the models tested. Cyclosporine A, furazolidone and imipramine were found to be moderately cytotoxic and tacrine weakly cytotoxic in HH, with LC50 values ranging between 24 and 81 μM. Imipramine was found to be weakly cytotoxic in HepG2 cells (average LC50 of 90 μM).
Predictivity of the different models (i.e., sensitivity and specificity) is presented in Table 6. A sensitivity of 30–50% was found for HH (depending on the donor). Lower sensitivities were observed for HepaRG cells (12.5%) and HepG2 cells (6.3%) (Table 6). A specificity of 100% was obtained in all cellular models.
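The classification bands and the predictivity calculation can be illustrated with a short sketch. It is not the internal Excel macro: the function names are hypothetical, the handling of values falling exactly on a band boundary is an assumption, a positive call is taken to be an LC50 below 10 μM, and the LC50 list is illustrative rather than the Table 6 data (an LC50 above the highest tested concentration is represented as infinity).

```python
def classify_cytotoxicity(lc50_um):
    """Map an LC50 (uM) onto the four cytotoxicity bands used in the text."""
    if lc50_um < 10.0:
        return "highly cytotoxic"
    if lc50_um <= 50.0:   # assumed: boundary values fall in the lower band
        return "moderately cytotoxic"
    if lc50_um <= 100.0:
        return "weakly cytotoxic"
    return "not cytotoxic"


def sensitivity_specificity(results, threshold_um=10.0):
    """results: list of (lc50_um, is_human_hepatotoxicant) pairs.

    Returns (sensitivity %, specificity %). Assumes at least one
    hepatotoxic and one nonhepatotoxic compound in the set.
    """
    tp = fn = tn = fp = 0
    for lc50_um, hepatotoxic in results:
        positive_call = lc50_um < threshold_um
        if hepatotoxic and positive_call:
            tp += 1
        elif hepatotoxic:
            fn += 1
        elif positive_call:
            fp += 1
        else:
            tn += 1
    return 100.0 * tp / (tp + fn), 100.0 * tn / (tn + fp)


# Illustrative set: 16 hepatotoxicants with one positive call, 5 negatives.
not_reached = float("inf")
illustrative = [(5.0, True)] + [(not_reached, True)] * 15 + [(not_reached, False)] * 5
sensitivity, specificity = sensitivity_specificity(illustrative)
```

With one of 16 hepatotoxicants called positive and none of the five negatives flagged, the sketch returns a sensitivity of 6.25% and a specificity of 100%, which matches the 6.3% and 100% reported for HepG2 cells after rounding.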

Table 6 LC50 and predictivity data at 48 and 72 h exposure to 21 (non)hepatotoxic compounds for the different cellular models
Fig. 2

Cytotoxicity curves of HepG2, HepaRG and human hepatocytes exposed to solvent and two concentrations of aflatoxin B1. To better visualize the data, no more than three cellular models are presented on the same graph. Cryopreserved human hepatocytes, HepaRG, and HepG2 cells were incubated with 10 μM (a, b) and 100 μM (c, d) aflatoxin B1 for approximately 72 h. Cell index curves were generated in real time using the xCELLigence instrument. Each data point was normalized against the time just before compound addition and was calculated from triplicate values. Curves represent averages. For more details of the experiment, please refer to “Materials and methods”

Discussion

Drug-induced hepatotoxicity is one of the major reasons cited for drug withdrawal (O’Brien et al. 2006; McDonald and Robertson 2009). Therefore, it is of extreme importance to detect human hepatotoxic candidates as early as possible during the drug development process and before clinical phases. Simple cytotoxicity assays using HepG2 cells are relatively insensitive to detect human hepatotoxic drugs (Xu et al. 2004). Nevertheless, this is in contrast to the work of O’Brien et al. (2006) who calculated a sensitivity of 90% for the detection of hepatotoxic drugs in HepG2 cells using a cell imaging approach (LC50 < 30 μM based on a set of 243 compounds). This is very surprising, knowing that HepG2 cells have low metabolic capacities according to the literature and the present data. This discrepancy in sensitivity could be due to (1) the different endpoints measured compared with ours, (2) the different cut-off values used to classify a compound as toxic (30 vs. 10 μM), and (3) the different sources of HepG2 cells which could contain different basal activities of phases I and II drug-metabolizing enzymes (Hewitt and Hewitt 2004). In the present study, the HepG2 sensitivity was only 6.3% with a set of 16 human hepatotoxic drugs. We consider arbitrarily that a drug is classified as hepatotoxic when LC50 < 10 μM. In many cases, depending on the stringency required, LC50 < 10–50 μM are generally used to classify pharmaceutical compounds (Evans et al. 2001; Dambach et al. 2005; O’Brien et al. 2006).

The present publication compares HepG2 cells, HepaRG cells, and primary human hepatocytes with regard to their metabolism and their potential to detect hepatotoxicity. To the best of our knowledge, this combined approach has rarely been attempted within a single study. These cellular models have already been characterized separately at the metabolic level (e.g., Guillouzo et al. 2007; Westerink and Schoonen 2007; Lambert et al. 2009), and there are also other publications on the predictivity of HepG2 cells (O’Brien et al. 2006) and primary HH (Xu et al. 2008) for the detection of hepatotoxicity. Finally, the results generated on HepaRG cells for the detection of hepatotoxic compounds in the present study are novel, with the exception of aflatoxin B1 and ketoconazole (Pernelle et al. 2011).

In the present study, the three cellular models were characterized with respect to their gene expression profiles and CYP activities (CYP1A2, 2B6, and 3A4) in control cells and after exposure to BNF, PB, and RIF. The PCA demonstrated high reproducibility of the replicates and a separation of the three cellular models both in vehicle controls and in cells exposed to the inducers. The largest difference in the number of differentially expressed probes (2-fold change) was found between HH and HepG2 cells, after exposure to solvent as well as to the different inducers. Almost every phase I, II, and III gene present on the microarray was expressed at a lower level in control HepG2 cells than in control HepaRG cells and HH. These low CYP mRNA levels in HepG2 cells have also been reported in other studies (Westerink and Schoonen 2007). Jennen et al. (2010) demonstrated that, at the basal gene expression level, HepaRG cells are closer to primary HH than HepG2 cells are. In the present study, according to PCA, this was the case for only one of the three donors. However, after exposure to inducers, HepaRG cells were closer in PCA space to the HH donors than HepG2 cells were. Only limited changes in gene expression pattern could be observed in HepG2 cells after exposure to any of the three inducers, compared with the other models. At the mRNA level, BNF turned out to be the strongest inducer in HepG2, HepaRG, and HH2 cells, with the highest number of differentially expressed probes. Among the three HH donors, differences were observed in their responsiveness to the inducers. This variability can most likely be ascribed to inter-individual differences in basal expression, but also in sensitivity to the inducers (Olinga et al. 2008). Gene expression of phase I enzymes showed that BNF clearly induced the CYP1A2 gene in all three models and down-regulated the CYP3A4 gene in HepaRG cells and HH.
The poor inducibility of HepG2 cells, except for the CYP1A1 enzyme, has been reported before (Donato et al. 2004). This could be related to the fact that the aryl hydrocarbon receptor (AhR) is expressed in most immortalized cell lines and, consequently, the CYP1A gene, which is regulated by AhR, is inducible in those cells (Donato et al. 2008). At the gene expression level, BNF induced the CYP1A2 gene about fourfold in all three cellular models (except in one donor). The other inducers, PB and RIF, showed specific induction of the CYP2B6 and CYP3A4 genes, respectively, in HH and HepaRG cells but not in HepG2 cells. These findings are in agreement with the literature (Xu et al. 2005; Faucette et al. 2004; Jackson 2004). In HH, Faucette et al. (2004) showed that CYP2B6 is highly inducible by known CYP3A4 inducers. Rhodes et al. (2011) also reported the same level of induction with PB and RIF in HH as in our experiments. Our data also show an induction of CYP1A2 in HepaRG cells following BNF treatment, as in Lübberstedt et al. (2010). However, in contrast to the latter study, a marked induction of CYP3A4 after RIF exposure was observed in HepaRG cells. The correlation between gene expression and activity for CYP1A2 was very good for two human donors but weak for HepG2, HepaRG, and the third human donor. On the other hand, the correlation between gene expression and CYP3A4 activity was satisfactory in all models except HepG2. It is well known that mRNA levels do not necessarily correlate well with enzyme activities because of post-translational modifications such as acetylation, disulfide bond formation, glycosylation, and phosphorylation (Glanemann et al. 2003). In summary, the gene expression data show that HepG2 cells are poorly inducible, whereas HepaRG cells are inducible and are closer to HH after exposure to BNF, PB, and RIF. These inducers specifically increased CYP activity in HH and HepaRG cells but rarely in HepG2 cells, except for BNF.
For instance, in HepaRG cells CYP activity was increased 35.7-fold (BNF), 8.7-fold (PB), and 42.4-fold (RIF), compared with 9.7-fold (BNF), 10.6-fold (PB), and 5.5-fold (RIF) in HH, for CYP1A2, 2B6, and 3A4, respectively. Consequently, HepaRG cells seem to be a very interesting model for metabolism studies, as a substitute for and/or a complement to primary hepatocytes. Recently, Turpeinen et al. (2009), Anthérieu et al. (2010), and Lübberstedt et al. (2010) also concluded that HepaRG cells are a very promising cell line for various applications in xenobiotic metabolism. The main advantages of HepaRG cells are their generally higher response to the different inducers and their much lower variability compared with primary HH.
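The fold-induction values quoted above can be compared directly by taking the ratio of the HepaRG response to the HH response for each enzyme. The Python sketch below does only this arithmetic; the dictionary layout and the function name are illustrative choices, and the values are those given in the text.

```python
# Fold induction of CYP activity quoted in the text (induced vs. control).
fold_induction = {
    "CYP1A2 (BNF)": {"HepaRG": 35.7, "HH": 9.7},
    "CYP2B6 (PB)": {"HepaRG": 8.7, "HH": 10.6},
    "CYP3A4 (RIF)": {"HepaRG": 42.4, "HH": 5.5},
}

def relative_response(values):
    """Ratio of the HepaRG fold induction to the HH fold induction."""
    return values["HepaRG"] / values["HH"]

ratios = {enzyme: relative_response(v) for enzyme, v in fold_induction.items()}
for enzyme, ratio in ratios.items():
    print(f"{enzyme}: HepaRG/HH = {ratio:.2f}")
```

A ratio above 1 means a stronger induction in HepaRG cells than in HH. On these figures this holds for CYP1A2 and CYP3A4, whereas the CYP2B6 response to PB is slightly lower in HepaRG cells than in HH.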

Subsequently, the three models were exposed to 16 known hepatotoxic and five nonhepatotoxic compounds, and cytotoxicity was determined using the xCELLigence platform (real-time cell analyzer, RTCA). The main advantages of the xCELLigence platform are the absence of cellular labeling, noninvasive measurements, kinetic evaluation in real time, and reasonable throughput, with potential applications in toxicology and pharmacology (Atienzar et al. 2011). Nevertheless, to convince more researchers to use RTCA in early toxicity screening, it will be necessary to show, with a large number of compounds, that cellular impedance measurements correlate well with classical toxicity endpoints (Atienzar et al. 2011). A limited number of studies have revealed a high correlation between impedance-based determination of viable adherent cells and MTT evaluation in different cell types (Atienzar et al. 2011). In addition, our data show a good correlation between cell index (CI, measured by RTCA) and cell number (cell imaging) when HepG2 and HepaRG cells were exposed to the same set of 21 compounds, with correlation coefficients of 76% and 88%, respectively (data not shown). In another study, the correlation coefficient between ATP and CI (RTCA) was 88.5% in HepG2 cells exposed to 50 compounds (data not presented). Based on these preliminary data, CI measured by RTCA seems to correlate well with classical toxicity endpoints. Since the use of freshly isolated primary hepatocytes in drug screening is rather difficult (due to their limited availability and high cost), cryopreserved hepatocytes were used instead. CYP1A and 3A studies performed by Roymans et al. (2004) showed similar results between freshly isolated and cryopreserved HH. In the present study, all cellular models correctly classified all nonhepatotoxic compounds (n = 5).
Despite the inducible properties of HepaRG cells, the detection of hepatotoxic compounds in these cells was not markedly better than in HepG2 cells, with a sensitivity of 12.5%, which is much lower than in HH (42% on average). A given compound was arbitrarily classified as cytotoxic when its LC50 value was < 10 μM. The present results do not agree with the high sensitivity (90%) reported in HepG2 cells by O’Brien et al. (2006), although those authors used LC50 values < 30 μM as a cut-off. However, HepG2 cells were capable of detecting the cytotoxic effects of aflatoxin B1, despite their low metabolic profile. In comparison to HepaRG cells, higher concentrations and longer incubations were needed to reach the LC50 values. Our study reveals that a high metabolic capacity in cell lines does not necessarily guarantee the detection of more hepatotoxic drugs. There are several possible explanations for why HepaRG cells were not as predictive as expected for the detection of human hepatotoxins. Firstly, to differentiate the HepaRG cells, our supplier Biopredic recommended using 2% DMSO. Such a concentration may be optimal for induction studies but not for cytotoxicity experiments. In HH, a concentration-dependent increase in CYP3A4 activity was observed at concentrations between 0.1% and 1% DMSO, followed by a plateau in the range 1–2% DMSO (LeCluyse et al. 2000). Our data suggest that in HepaRG cells, 2% DMSO is adequate for CYP induction. Nevertheless, Gilot et al. (2002) have described DMSO as an inhibitor of apoptosis in hepatocytes. This protective effect of DMSO was related to its free radical scavenging property and to a dual blockage of cleaved caspase and ASK1/JNK activities. Consequently, using lower percentages of DMSO could improve the detection of some hepatotoxins. Secondly, higher levels of basal CYP activities may be required in HepaRG cells to optimally metabolize drugs.
A third explanation could be linked to the high phase II and III activities, which could efficiently detoxify the metabolites, thus preventing their toxic action (Kanebratt and Andersson 2008).

The expression of nuclear receptors (i.e., CAR, PXR, and AhR) plays a key role in metabolism. The expression level of CAR and PXR is low in HepG2 cells (Naspinski et al. 2008; Kanno and Inouye 2010) and high in HepaRG cells and HH (Aninat et al. 2006). Nevertheless, the presence or absence of nuclear receptors does not fully explain why some cellular models are less predictive than others for the detection of hepatotoxicity, as other factors such as phase II enzymes and transporters could also play key roles. Given the low sensitivity of the different models for detecting hepatotoxic compounds, one could wonder whether measuring cytotoxicity endpoints is of any relevance for predicting hepatotoxicity. Indeed, because of its crudeness, cell death may not be the most appropriate endpoint to detect all mechanisms of hepatotoxicity in typical cellular models (primary hepatocytes, HepG2, etc.). This is particularly true when a given mechanism of hepatotoxicity impairs internal cellular processes without necessarily leading to cell death. For instance, the relationship between toxicity to hepatocytes in vitro and drug-induced liver injury in vivo (in pre-clinical species or in man) remains poorly defined (Greer et al. 2010). Better prediction of hepatotoxicity could be obtained by evaluating endpoints more specific than cell death, such as mitochondrial impairment, biliary transport, CYP450 inhibition, and metabolite-mediated toxicity (Greer et al. 2010). Nevertheless, Andrews et al. (2003) developed a high-throughput assay measuring cell death in CYP450-transfected, immortalized human hepatocyte cell lines (THLE) with defined CYP450 activity (CYP3A4, 2C9, 2C19, and 2D6). The assay correctly predicted 585/587 nonhepatotoxic drugs, 15/21 severely hepatotoxic drugs, and 51/71 variably hepatotoxic drugs (Dambach et al. 2005). In the validation set, ca. 72% of drugs with known liver toxicity had IC50 values below 50 μM in one or more cell lines (Andrews et al. 2003).
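As a worked check of the counts quoted above for the THLE assay, the Python sketch below converts them into specificity and sensitivity. Pooling the two hepatotoxic categories is an assumption made here for illustration. The pooled value (66/92) is close to the ca. 72% quoted for the validation set, but the text does not state that the figure was derived in this way.

```python
# Counts quoted in the text for the THLE cell-death assay: (correct, total).
counts = {
    "nonhepatotoxic": (585, 587),
    "severely hepatotoxic": (15, 21),
    "variably hepatotoxic": (51, 71),
}

def fraction(correct, total):
    """Share of compounds in a category that the assay classified correctly."""
    return correct / total

specificity = fraction(*counts["nonhepatotoxic"])
sens_severe = fraction(*counts["severely hepatotoxic"])
sens_variable = fraction(*counts["variably hepatotoxic"])

# Assumption: pool both hepatotoxic categories into one sensitivity estimate.
pooled_correct = counts["severely hepatotoxic"][0] + counts["variably hepatotoxic"][0]
pooled_total = counts["severely hepatotoxic"][1] + counts["variably hepatotoxic"][1]
sens_pooled = fraction(pooled_correct, pooled_total)

print(f"specificity:            {specificity:.1%}")
print(f"sensitivity (severe):   {sens_severe:.1%}")
print(f"sensitivity (variable): {sens_variable:.1%}")
print(f"sensitivity (pooled):   {sens_pooled:.1%}")
```

On these counts the specificity is close to 100%, while the sensitivity stays near 71–72% for both hepatotoxic categories.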

In summary, HepG2 cells responded very weakly to the different inducers compared with the other cell models, at both the gene expression and CYP activity levels. However, although HepaRG cells seem to be a suitable model for induction studies, they were not as predictive as primary HH and were comparable to HepG2 cells for the prediction of human hepatotoxic drugs. Consequently, more studies are needed to determine which of the models would be best suited to detect hepatotoxic compounds, with the aim of achieving a sensitivity and specificity of at least 80–90%.