A comprehensive transcriptional portrait of human cancer cell lines


Tumor-derived cell lines have served as vital models to advance our understanding of oncogene function and therapeutic responses. Although substantial effort has been made to define the genomic constitution of cancer cell line panels, the transcriptome remains understudied. Here we describe RNA sequencing and single-nucleotide polymorphism (SNP) array analysis of 675 human cancer cell lines. We report comprehensive analyses of transcriptome features including gene expression, mutations, gene fusions and expression of non-human sequences. Of the 2,200 gene fusions catalogued, 1,435 consist of genes not previously found in fusions, providing many leads for further investigation. We combine multiple genome and transcriptome features in a pathway-based approach to enhance prediction of response to targeted therapeutics. Our results provide a valuable resource for studies that use cancer cell lines.
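The pathway-based aggregation mentioned above can be illustrated with a minimal sketch. It is not the authors' method: it assumes the simplest possible rule (a pathway counts as altered in a cell line if any member gene carries an aberration), and the function name, pathway gene sets and cell line labels are all hypothetical toy values rather than data from the study.

```python
from typing import Dict, Set


def aggregate_by_pathway(
    aberrant_genes: Dict[str, Set[str]],
    pathways: Dict[str, Set[str]],
) -> Dict[str, Dict[str, bool]]:
    """Flag a pathway as altered in a cell line if any member gene is aberrant.

    This "any hit" rule is an assumption made for illustration only.
    """
    return {
        line: {name: bool(genes & members) for name, members in pathways.items()}
        for line, genes in aberrant_genes.items()
    }


# Hypothetical toy inputs (not taken from the paper's data).
pathways = {
    "MAPK": {"KRAS", "BRAF", "NRAS"},
    "PI3K": {"PIK3CA", "PTEN"},
}
aberrant_genes = {
    "line_A": {"KRAS", "TP53"},  # a mutation call and an unrelated gene
    "line_B": {"PTEN"},          # e.g. a copy-number loss
    "line_C": set(),             # no aberrations detected
}

pathway_status = aggregate_by_pathway(aberrant_genes, pathways)
for line, status in pathway_status.items():
    print(line, status)
```

Collapsing gene-level events to pathway-level flags gives each feature more positive examples across a panel, which is one plausible reason such aggregation could help when relating alterations to drug response.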


Figure 1: Data set overview.
Figure 2: Detection of gene fusion events in human cell lines.
Figure 3: Pathway-based mutation aggregation shows tissue-specific pathway deregulation.
Figure 4: Pathway aggregation of cell line aberrations.




Acknowledgements

We thank members of the Genentech cell line bank (gCell) and the compound screening group (gCSI) for contributing cell lines and results to this paper. We thank A. Bruce for graphical assistance.

Author information

C.K., F.J.d.S., J.S., S.S. and Z.Z. conceived the project. C.K., J.S., S.S. and Z.Z. wrote the manuscript. C.K., S.D., E.W.S., P.M.H., Z.J., H.L., J.D., O.M., F.G., J.L., G.P., J.R., K.M., G.J.Z., M.J.B., T.D.W., R.C.G., G.M. and R.B. performed bioinformatics data analysis or provided computational infrastructure. Y.C., S.K.S., M.Y., R.L.Y., D.S., Z.M. and R.M.N. prepared cell lines and performed biochemical experiments including drug treatments and sequencing.

Correspondence to Jeffrey Settleman, Somasekar Seshagiri or Zemin Zhang.

Ethics declarations

Competing interests

The majority of authors are employees of Genentech Inc. and/or hold stock in Roche.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–18, Supplementary Tables 3, 5, 9 and 12 and Supplementary Note (PDF 23603 kb)

Supplementary Table 1

Overview of cell lines included in this study (XLS 203 kb)

Supplementary Table 2

Sequencing statistics for RNA sequencing of cancer cell lines (XLS 126 kb)

Supplementary Table 4

Results for GISTIC analysis run on 610 cell lines (XLS 71 kb)

Supplementary Table 6

Viral integration sites detected by human-viral chimeric RNA (XLS 53 kb)

Supplementary Table 7

Viral integration sites detected by human-viral chimeric RNA - murine viruses (XLS 77 kb)

Supplementary Table 8

Gene-gene fusions identified in cancer cell lines (XLS 1397 kb)

Supplementary Table 10

Fusions found in TCGA for which at least one gene was also found in a fusion in cell lines (XLS 2035 kb)

Supplementary Table 11

Crizotinib response in cancer cell lines (XLS 149 kb)

Supplementary Table 13

IC50 values for five drugs determined in 351 cell lines (XLS 80 kb)

Supplementary Data 1

Gene expression read counts for all coding genes (ZIP 104720 kb)

Supplementary Data 2

Gene expression read counts for all non-coding genes (ZIP 49333 kb)

Supplementary Data 3

All single nucleotide mutations found in cell lines in this study (ZIP 20500 kb)

Supplementary Data 4

Per-gene ploidy-corrected copy number values (ZIP 7901 kb)

Cite this article

Klijn, C., Durinck, S., Stawiski, E. et al. A comprehensive transcriptional portrait of human cancer cell lines. Nat Biotechnol 33, 306–312 (2015). https://doi.org/10.1038/nbt.3080