The study builds on the well-established fact that breast cancer is not one biologically homogeneous disease but comprises several molecular sub-types, each characterized by a distinct gene expression profile.
“It is these differences that explain, at least in part, why patients who have tumors that appear to be similar may experience completely different clinical outcomes such as prognosis and response to anticancer therapies,” said Dr Benjamin Haibe-Kains from Dana-Farber Cancer Institute in Boston, USA. “Thus, there is an urgent need for developing a robust tool to provide clinicians with guidance for classifying breast cancer molecular sub-types, which could then aid in making therapeutic decisions.”
Several research groups have already developed a range of genetic ‘fingerprints’ that they use to assign breast cancers to different sub-types, but questions have been raised about the reliability of these groupings.
To shed new light on this issue, Dr Haibe-Kains and colleagues performed the largest comparative study to date of breast cancer sub-types, analyzing 32 publicly available gene expression datasets covering more than 4600 breast cancer patients and comparing six different classification models.
“We studied these models in terms of concordance and prognostic value and, for the first time, we estimated their robustness: that is, their capacity to assign the same tumors to the same molecular sub-types whatever the gene expression data used to fit this model.”
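As a rough illustration of how agreement between two sets of sub-type calls can be quantified, the sketch below computes Cohen's kappa, a chance-corrected agreement score. This is an assumption chosen for illustration: the article does not say which statistic the study used, and the function name, variable names and toy labels are all hypothetical. The same calculation could serve for concordance (two different models applied to the same tumors) or for robustness (one model fitted on two different datasets and applied to the same tumors).

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label assignments for the same tumors."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)

    # Fraction of tumors that receive the same label in both assignments.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, from each assignment's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)

    if expected == 1.0:
        # Both assignments use a single identical label for every tumor.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels for six tumors (not data from the study).
model_1 = ["basal", "basal", "her2", "luminal", "luminal", "luminal"]
model_2 = ["basal", "her2", "her2", "luminal", "luminal", "basal"]

kappa = cohen_kappa(model_1, model_2)  # about 0.5 for these toy labels
print(f"kappa = {kappa:.2f}")
```

A value of 1 means the two assignments agree on every tumor, while a value near 0 means they agree no more often than chance would predict.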
Two main classes of classification models have been published during the last decade: the Single Sample Predictor (SSP) and the Sub-type Classification Model (SCM). Over the years the list of genes used by these models has been refined, leading to the publication of six distinct classification models.
“Generally speaking, we found that SCMs yielded stronger concordance than SSPs,” Dr Haibe-Kains said. “We also observed that SCMs, including a simple model that uses only three genes –ESR1, ERBB2 and AURKA– were significantly more robust than SSPs.”
By demonstrating the robustness of the SCM models, the new study is a significant step towards bringing these classification models into the clinic, Dr Haibe-Kains said. “The robustness of SCMs makes them promising candidates for an implementation into the clinic especially in the simplest form –that is, a model using only three genes.”
The work was a collaboration between the Computational Biology and Functional Genomics Laboratory, Dana-Farber Cancer Institute, Boston, USA; the Functional Genomics and Translational Research Unit, Institut Jules Bordet, Bruxelles, Belgium; and the Machine Learning Group, Université Libre de Bruxelles, Bruxelles, Belgium.