Predicting Breast Cancer Through Machine Learning Techniques
Breast cancer remains one of the most common types of cancer in women. In Wisconsin in 2016, more than 4,700 women were diagnosed with breast cancer and 710 died of it. While current national guidelines address breast cancer screening and prevention, more work is needed to determine the right combination of risk factors that will allow clinicians to better target interventions.
With funding from a UW2020 grant and Garding Against Cancer, UW Carbone Cancer Center radiologist Elizabeth Burnside, MD, is leading a collaborative effort with cancer epidemiologist Amy Trentham-Dietz, PhD, oncologist James Shull, PhD, and computer scientists C. David Page, PhD, and Irene Ong, PhD, to identify novel genetic markers that could make breast cancer prevention and early detection more precisely targeted. Using machine learning methods, this interdisciplinary research team hopes to better predict risk by leveraging information from multiple levels of “big data.”
“Solving important problems in precision medicine, like tailoring screening to individuals based on predictive data, will require transdisciplinary teams with expertise in imaging, informatics, engineering, population health, and other diverse disciplines working collaboratively,” Burnside says.
Burnside and her colleagues recognize the importance of team science, a collaboration that brings together experts from different areas of breast cancer research who generally work independently.
“Current breast cancer screening recommendations use a woman’s age to determine when a woman should receive her screening mammogram, but there are other factors to consider, like genetics or other demographic factors, that may play an important role,” says Jennifer Cox, MS, project manager and data scientist in Burnside’s group. “It’s possible we could provide better advice, but we just don’t have the knowledge yet.”
The researchers are drawing on the Wisconsin Women’s Health Study (WWHS), which collected survey data and genetic data from about 7,000 women across the state from 1988 to 2007. With this recent funding, Burnside and her colleagues were able to process the genetic data through a collaboration with the Cancer Risk Estimates Related to Susceptibility (CARRIERS) consortium at the Mayo Clinic. The team also obtained mammography text reports from the UW Health electronic health records for about 500 of the WWHS participants. The team then needed to convert the reports from an “unstructured” text format into a “structured” spreadsheet format. The researchers used a machine learning algorithm to comb through the text reports, pull out key words describing important breast cancer risk factors found on mammograms, such as breast density, and put the information into a spreadsheet that could be easily analyzed.
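To make the “unstructured” to “structured” step concrete, here is a minimal sketch in Python. It is not the team’s algorithm: the article describes a machine learning approach, while this example swaps in simple keyword matching with regular expressions. The density phrases, the function names, the column names, and the example reports are all assumptions made up for illustration.

```python
import csv
import io
import re

# Hypothetical phrases for breast density categories (an assumption for
# this sketch; real reports use more varied wording).
DENSITY_PATTERNS = {
    "almost entirely fatty": re.compile(r"almost entirely fat", re.IGNORECASE),
    "scattered fibroglandular": re.compile(
        r"scattered (areas of )?fibroglandular", re.IGNORECASE
    ),
    "heterogeneously dense": re.compile(r"heterogeneously dense", re.IGNORECASE),
    "extremely dense": re.compile(r"extremely dense", re.IGNORECASE),
}


def extract_density(report_text):
    """Return the first density category found in the report, or '' if none."""
    for label, pattern in DENSITY_PATTERNS.items():
        if pattern.search(report_text):
            return label
    return ""


def reports_to_csv(reports):
    """Turn (participant_id, report_text) pairs into CSV text, one row each."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["participant_id", "breast_density"])
    for participant_id, text in reports:
        writer.writerow([participant_id, extract_density(text)])
    return buffer.getvalue()


# Made-up reports, not taken from any real health record.
EXAMPLE_REPORTS = [
    ("P001", "The breasts are heterogeneously dense, which may obscure small masses."),
    ("P002", "There are scattered areas of fibroglandular density."),
    ("P003", "No prior studies available for comparison."),
]

if __name__ == "__main__":
    print(reports_to_csv(EXAMPLE_REPORTS))
```

A rule-based version like this misses anything phrased in an unexpected way, which is one reason a learned model may be preferred for real reports.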
The risk prediction models for this project are in the early stages of development. However, the researchers have found promising preliminary results in another dataset that show the advantages of their approach of combining genetic information, electronic health records, and mammogram data.
For example, in a recently published conference proceeding, Burnside and her colleagues used machine learning methods to predict breast cancer risk in a patient cohort derived from the Marshfield Clinic Personalized Medicine Research Project. Their results show that combining information about genetic variants associated with breast cancer risk with mammogram data improved the performance of their predictive models for women between the ages of 29 and 59 but not for women over 60. Current guidelines for breast cancer screening vary by age, and this study suggests that genetic information could be used to improve breast cancer risk assessments for younger women. Additionally, in a separate published conference proceeding using the Marshfield Clinic data, the researchers found that the combination of demographic risk factors, genetic data, and mammogram data improves breast cancer risk prediction.
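One way to picture a comparison like this is to score each model separately within each age group. The sketch below is an illustration only: the article does not say which performance measure the researchers used, so the area under the ROC curve (AUC) is an assumption here, as are the function names, the field names, the age cutoff parameter, and the toy records, which are invented to exercise the code and are not study data.

```python
def auc(scores, labels):
    """Chance that a random case scores above a random non-case (ties count half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))


def auc_by_age_group(records, score_key, cutoff=60):
    """Compute AUC for one score separately below and at/above an age cutoff."""
    groups = {"under_cutoff": [], "cutoff_and_over": []}
    for record in records:
        name = "under_cutoff" if record["age"] < cutoff else "cutoff_and_over"
        groups[name].append(record)
    return {
        name: auc([r[score_key] for r in rows], [r["cancer"] for r in rows])
        for name, rows in groups.items()
    }


# Invented toy records (hypothetical field names and values, not real results).
TOY_RECORDS = [
    {"age": 45, "cancer": 1, "imaging_score": 0.6, "combined_score": 0.8},
    {"age": 52, "cancer": 0, "imaging_score": 0.7, "combined_score": 0.4},
    {"age": 38, "cancer": 0, "imaging_score": 0.3, "combined_score": 0.2},
    {"age": 66, "cancer": 1, "imaging_score": 0.9, "combined_score": 0.9},
    {"age": 71, "cancer": 0, "imaging_score": 0.2, "combined_score": 0.2},
]

if __name__ == "__main__":
    for key in ("imaging_score", "combined_score"):
        print(key, auc_by_age_group(TOY_RECORDS, key))
```

Reporting the measure per age group, rather than once for the whole cohort, is what lets a difference between younger and older women show up at all.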
“We are doing the best we can to improve breast cancer screening and prevention using common risk factors,” Jennifer Cox says, “but we may do better if we can study the problem in more complex ways, like combining known risk factors with genetics, to personalize our approach for each individual woman.”