Pre-print Functionally Coherent Transcription Factor Target Networks Illuminate Control of Epithelial Remodelling

Cell identity is governed by gene expression, regulated by Transcription Factor (TF) binding at cis-regulatory modules. We developed the NetNC software to decode the relationship between TF binding and the regulation of cognate target genes in cell decision-making, and demonstrated it on nine datasets for the Snail and Twist TFs and on modENCODE 'HOT' regions. Results illuminated conserved molecular networks controlling development and disease, with implications for precision medicine. Predicted 'neutral' TF binding accounted for the majority (50% to ≥80%) of candidate target genes from statistically significant peaks, and HOT regions had high functional coherence. Expression of orthologous functional TF targets discriminated breast cancer molecular subtypes and predicted novel tumour biology. We identified new gene functions and network modules including crosstalk with Notch signalling and regulation of chromatin organisation, evidencing networks that reshape Waddington's landscape during epithelial remodelling. Predicted invasion roles were validated using a tractable cell model, supporting our computational approach.

Pre-print Visualization and analysis of high-throughput in vitro dose-response datasets with Thunor

Quantifying the effects of drugs and other environmental factors on cell proliferation in vitro continues to be one of the most prevalent assays in biomedical research. Assessment of the dose-dependent nature of drug effects is typically performed with a variety of commercial software applications or using freely available, but more technically demanding, statistical programming environments such as Python or R. However, with the advent of large, publicly-available drug response databases and continued advancements in high-throughput experimentation, there is a growing need for user-friendly software platforms that can efficiently and reliably facilitate analysis within and across large datasets. Here we introduce Thunor, an open-source software platform for the management, analysis, and visualization of large-scale dose-dependent cell proliferation datasets. Thunor provides a simple, user-friendly interface to upload cell count data and a graphical plate map tool to annotate plate wells with cell lines and drugs. Best-fit dose-response curves are generated based on either cell viability or proliferation rate drug effect metrics. Derived dose-response parameters, such as IC50, Emax, and activity area, are automatically calculated by the software back-end. An arrayed plot interface supports multiple plot types, including time course, dose-response curve, box/bar/scatter plots of derived parameters, and quality control analyses, among others. We demonstrate the features of Thunor on large-scale, publicly-available viability data and an in-house, high-throughput proliferation rate dataset. Software, documentation, and an online demo are all available at
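As an illustration of the derived parameters mentioned above, the sketch below evaluates a four-parameter log-logistic (Hill) curve and solves it for an IC50. It is a minimal example under stated assumptions, not Thunor's implementation: the function names, the parameter names (`e0`, `emax`, `ec50`, `h`) and the convention that IC50 is the dose at which the response falls to half of the no-drug response are all choices made here for illustration.

```python
def hill_response(dose, e0, emax, ec50, h):
    """Four-parameter log-logistic (Hill) curve.

    e0   : response with no drug (assumed > 0)
    emax : response at saturating dose
    ec50 : dose giving a response halfway between e0 and emax
    h    : Hill slope
    """
    if dose <= 0:
        return e0
    return emax + (e0 - emax) / (1.0 + (dose / ec50) ** h)


def ic50(e0, emax, ec50, h):
    """Dose at which the response equals e0 / 2 (one common convention).

    Returns None when the fitted curve never drops to half of e0.
    """
    half = e0 / 2.0
    if emax >= half:
        return None
    # Solve emax + (e0 - emax) / (1 + (d / ec50) ** h) = half for d.
    ratio = (e0 - emax) / (half - emax) - 1.0
    return ec50 * ratio ** (1.0 / h)


# Hypothetical parameter values, not taken from any dataset.
d_half = ic50(e0=1.0, emax=0.2, ec50=2.0, h=1.0)
print(d_half, hill_response(d_half, 1.0, 0.2, 2.0, 1.0))
```

Note that IC50 and EC50 coincide only when the curve falls all the way to zero; for a partial responder (`emax` above half of `e0`) no IC50 exists under this convention, which is why the function returns `None` in that case.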

Journal Article Overcoming intratumoural heterogeneity for reproducible molecular risk stratification: a case study in advanced kidney cancer

Metastatic clear cell renal cell cancer (mccRCC) portends a poor prognosis and urgently requires better clinical tools for prognostication as well as for prediction of response to treatment. Considerable investment in molecular risk stratification has sought to overcome the performance ceiling encountered by methods restricted to traditional clinical parameters. However, replication of results has proven challenging, and intratumoural heterogeneity (ITH) may confound attempts at tissue-based stratification.

We investigated the influence of confounding ITH on the performance of a novel molecular prognostic model, enabled by pathologist-guided multiregion sampling (n = 183) of geographically separated mccRCC cohorts from the SuMR trial (development, n = 22) and the SCOTRRCC study (validation, n = 22). Tumour protein levels quantified by reverse phase protein array (RPPA) were investigated alongside clinical variables. Regularised wrapper selection identified features for Cox multivariate analysis with overall survival as the primary endpoint.

The optimal subset of variables in the final stratification model consisted of N-cadherin, EPCAM, Age, mTOR (NEAT). Risk groups from NEAT had a markedly different prognosis in the validation cohort (log-rank p = 7.62 × 10⁻⁷; hazard ratio (HR) 37.9, 95% confidence interval 4.1–353.8) and 2-year survival rates (accuracy = 82%, Matthews correlation coefficient = 0.62). Comparisons with established clinico-pathological scores suggest favourable performance for NEAT (Net reclassification improvement 7.1% vs International Metastatic Database Consortium score, 25.4% vs Memorial Sloan Kettering Cancer Center score). Limitations include the relatively small cohorts and associated wide confidence intervals on predictive performance. Our multiregion sampling approach enabled investigation of NEAT validation when limiting the number of samples analysed per tumour, which significantly degraded performance. Indeed, sample selection could change risk group assignment for 64% of patients, and prognostication with one sample per patient performed only slightly better than random expectation (median logHR = 0.109). Low grade tissue was associated with 3.5-fold greater variation in predicted risk than high grade (p = 0.044).
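For readers unfamiliar with the two classification metrics quoted above, the following sketch computes accuracy and the Matthews correlation coefficient from a 2×2 confusion matrix. The counts are hypothetical and are not the study's data; the function names are chosen here for illustration.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (no better than
    chance) to +1 (perfect prediction). Returns 0.0 when any marginal
    total is zero, a common convention for the undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only.
print(accuracy(tp=6, tn=9, fp=1, fn=4))
print(matthews_corrcoef(tp=6, tn=9, fp=1, fn=4))
```

Unlike accuracy, the Matthews coefficient uses all four cells of the confusion matrix, so it remains informative when the two outcome classes are unbalanced, as is often the case in small survival cohorts.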

This case study in mccRCC quantitatively demonstrates the critical importance of tumour sampling for the success of molecular biomarker studies where ITH is a factor. The NEAT model shows promise for mccRCC prognostication and warrants follow-up in larger cohorts. Our work evidences actionable parameters to guide sample collection (tumour coverage, size, grade) to inform the development of reproducible molecular risk stratification methods.

Journal Article GPU-powered model analysis with PySB/cupSODA

A major barrier to the practical utilization of large, complex models of biochemical systems is the lack of open-source computational tools to evaluate model behaviors over high-dimensional parameter spaces. This is due to the high computational expense of performing thousands to millions of model simulations required for statistical analysis. To address this need, we have implemented a user-friendly interface between cupSODA, a GPU-powered kinetic simulator, and PySB, a Python-based modeling and simulation framework. For three example models of varying size, we show that for large numbers of simulations PySB/cupSODA achieves order-of-magnitude speedups relative to a CPU-based ordinary differential equation integrator.

Availability and implementation
The PySB/cupSODA interface has been integrated into the PySB modeling framework (version 1.4.0), which can be installed from the Python Package Index (PyPI) using a Python package manager such as pip. cupSODA source code and precompiled binaries (Linux, Mac OS/X, Windows) are available at (requires an Nvidia GPU; Additional information about PySB is available at

Supplementary information
Supplementary data are available at Bioinformatics online.

Journal Article Integrated, High-Throughput, Multiomics Platform Enables Data-Driven Construction of Cellular Responses and Reveals Global Drug Mechanisms of Action

An understanding of how cells respond to perturbation is essential for biological applications; however, most approaches for profiling cellular response are limited in scope to pre-established targets. Global analysis of molecular mechanism will advance our understanding of the complex networks constituting cellular perturbation and lead to advancements in areas such as infectious disease pathogenesis, developmental biology, pathophysiology, pharmacology, and toxicology. We have developed a high-throughput multiomics platform for comprehensive, de novo characterization of cellular mechanisms of action. Platform validation using cisplatin as a test compound demonstrates quantification of over 10 000 unique, significant molecular changes in less than 30 days. These data provide excellent coverage of known cisplatin-induced molecular changes and previously unrecognized insights into cisplatin resistance. This proof-of-principle study demonstrates the value of this platform as a resource to understand complex cellular responses in a high-throughput manner.

Journal Article Sunitinib Treatment Exacerbates Intratumoral Heterogeneity in Metastatic Renal Cancer

Purpose: The aim of this study was to investigate the effect of VEGF-targeted therapy (sunitinib) on molecular intratumoral heterogeneity (ITH) in metastatic clear cell renal cancer (mccRCC).

Experimental Design: Multiple tumor samples (n = 187 samples) were taken from the primary renal tumors of patients with mccRCC who were sunitinib treated (n = 23, SuMR clinical trial) or untreated (n = 23, SCOTRRCC study). ITH of pathologic grade, DNA (aCGH), mRNA (Illumina Beadarray) and candidate proteins (reverse phase protein array) were evaluated using unsupervised and supervised analyses (driver mutations, hypoxia, and stromal-related genes). ITH was analyzed using intratumoral protein variance distributions and distribution of individual patient aCGH and gene-expression clustering.

Results: Tumor grade heterogeneity was greater in treated compared with untreated tumors (P = 0.002). In unsupervised analysis, sunitinib therapy was not associated with increased ITH in DNA or mRNA. However, there was an increase in ITH for the driver mutation gene signature (DNA and mRNA) as well as increasing variability of protein expression with treatment (P < 0.05). Despite this variability, significant chromosomal and transcript changes to key targets of sunitinib, such as VHL, PBRM1, and CAIX, occurred in the treated samples.

Conclusions: These findings suggest that sunitinib treatment has significant effects on the expression and ITH of key tumor and treatment specific genes/proteins in mccRCC. The results, based on primary tumor analysis, do not support the hypothesis that resistant clones are selected and predominate following targeted therapy. Clin Cancer Res; 21(18); 4212–23. ©2015 AACR.

Journal Article Carbonic Anhydrase 9 Expression Increases with Vascular Endothelial Growth Factor–Targeted Therapy and Is Predictive of Outcome in Metastatic Clear Cell Renal Cancer

There is a lack of biomarkers to predict outcome with targeted therapy in metastatic clear cell renal cancer (mccRCC). This may be because dynamic molecular changes occur with therapy.

To explore if dynamic, targeted-therapy-driven molecular changes correlate with mccRCC outcome.

Design, setting, and participants
Multiple frozen samples from primary tumours were taken from sunitinib-naïve (n = 22) and sunitinib-treated mccRCC patients (n = 23) for protein analysis. A cohort (n = 86) of paired, untreated and sunitinib/pazopanib-treated mccRCC samples was used for validation. Array comparative genomic hybridisation (CGH) analysis and RNA interference (RNAi) were used to support the findings.

Three cycles of sunitinib 50 mg (4 wk on, 2 wk off).

Outcome measurements and statistical analysis
Reverse phase protein arrays (training set) and immunofluorescence automated quantitative analysis (validation set) assessed protein expression.

Results and limitations
Differential expression between sunitinib-naïve and treated samples was seen in 30 of 55 proteins (p < 0.05 for each). The proteins B-cell CLL/lymphoma 2 (BCL2), mutL homolog 1 (MLH1), carbonic anhydrase 9 (CA9), and mechanistic target of rapamycin (mTOR) (serine/threonine kinase) had both increased intratumoural variance and significant differential expression with therapy. The validation cohort confirmed increased CA9 expression with therapy. Multivariate analysis showed high CA9 expression after treatment was associated with longer survival (hazard ratio: 0.48; 95% confidence interval, 0.26–0.87; p = 0.02). Array CGH profiles revealed sunitinib was associated with significant CA9 region loss. RNAi CA9 silencing in two cell lines inhibited the antiproliferative effects of sunitinib. Shortcomings of the study include selection of a specific protein for analysis, and the specific time points at which the treated tissue was analysed.

CA9 levels increase with targeted therapy in mccRCC. Lower CA9 levels are associated with a poor prognosis and possible resistance, as indicated by the validation cohort.

Patient summary
Drug treatment of advanced kidney cancer alters molecular markers of treatment resistance. Measuring carbonic anhydrase 9 levels may be helpful in determining which patients benefit from therapy.

Journal Article TMA Navigator: network inference, patient stratification and survival analysis with tissue microarray data

Tissue microarrays (TMAs) allow multiplexed analysis of tissue samples and are frequently used to estimate biomarker protein expression in tumour biopsies. TMA Navigator is an open access web application for analysis of TMA data and related information, accommodating categorical, semi-continuous and continuous expression scores. Non-biological variation, or batch effects, can hinder data analysis and may be mitigated using the ComBat algorithm, which is incorporated with enhancements for automated application to TMA data. Unsupervised grouping of samples (patients) is provided according to Gaussian mixture modelling of marker scores, with cardinality selected by Bayesian information criterion regularization. Kaplan–Meier survival analysis is available, including comparison of groups identified by mixture modelling using the Mantel-Cox log-rank test. TMA Navigator also supports network inference approaches useful for TMA datasets, which often constitute comparatively few markers. Tissue and cell-type specific networks derived from TMA expression data offer insights into the molecular logic underlying pathophenotypes, towards more effective and personalized medicine. Output is interactive, and results may be exported for use with external programs. Private anonymous access is available, and user accounts may be generated for easier data management.
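The Kaplan–Meier analysis mentioned above rests on a simple product-limit calculation, sketched below in plain Python. This is an illustrative implementation of the standard estimator, not TMA Navigator's code; the function name and the toy follow-up data are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times  : follow-up duration for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set.
        n_at_risk -= n_leaving
    return curve


# Toy data for illustration only: five patients, two censored.
for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
    print(t, round(s, 3))
```

Censored patients contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is what distinguishes this estimator from a simple proportion of survivors.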