Omics approaches and data platform

The third theme encompasses subprojects focused on developing the iCAN data platform and implementing innovative approaches for analyzing molecular profiling data. Leveraging the power of artificial intelligence, our researchers aim to improve patient treatment selection and prediction of treatment outcomes.

Furthermore, this theme seeks to uncover novel drug combinations and investigate the impact of the tumor microenvironment on cancer progression.

The central objective of the theme is to enhance the utilization of molecular profiling data through advanced computational methods. By employing AI and innovative analytical techniques, researchers strive to uncover hidden patterns and relationships within vast amounts of molecular data. This knowledge can then be applied to refine treatment decisions, discover new therapeutic targets, and optimize patient care.

Coordinating PI: Esa Pitkänen

iCAN-Mu-Male (part of DPM)

We aim to create artificial intelligence (AI) methods to assist clinicians in cancer diagnostics and treatment choice. To do this, we will utilize integrated molecular profiling and health registry data available in iCAN to identify tumor characteristics informative of the patient’s clinical trajectory. We will adapt MuAt, a machine learning model that we have previously introduced, to process integrated iCAN patient data. This model will enable us to identify tumor types even when the primary tumor is not known, for example in liquid biopsy applications for early cancer detection and in metastatic cancers with an unknown primary tumor. The model also distinguishes clinically and biologically relevant tumor subtypes, enabling more accurate diagnosis, prognosis and treatment choice. Finally, we will integrate MuAt with the iCAN molecular tumor board (iMTB) application. iMTB integration will allow a clinician to quickly retrieve and view tumor types and subtypes, as well as the previously encountered tumor cases that most closely match a new case at hand, potentially improving treatment decisions.

Coordinating PI: Andrew Erickson

Our bodies are made of organs (lungs, heart, liver), which are made of tissues, which in turn consist of individual cells. Cancers arise from cells, which carry their genetic code in DNA. In many cancers, DNA is different between individual cells, allowing for the study of tumor genetics. Tumor genetics has been extensively studied using “bulk” genetic sequencing. This approach takes a piece of tissue, with many millions of cells, and blows them all apart. The DNA is grouped together, giving an average readout from these cells. Because these methods destroy the structure of the tissues, they fail to capture the differences between individual cancer cells. Single-cell techniques have been developed to physically separate and sequence cells. However, they are limited in studying tumor genetics because it is unknown where in the tissue each separated cell came from. New spatial biotechnologies have been developed to profile cells within tissues. We have developed a method to spatially track tumor cells in tissues, generating 50,000 spatial readouts from a single patient’s tumor (Erickson et al., Nature, 2022). This application is to establish a team to further develop spatial biotechnologies on iCAN patient samples.

Coordinating PI: Johan Lundin

Characterization of biological samples for diagnostic purposes is undergoing a transition in which an increasing number of steps in the process are supported by machine learning and artificial intelligence. For example, within pathology, cancer research and microbiology, an expert’s decisions will soon be supported by an array of readouts performed by AI algorithms. The paradigm shift from human expert-based interpretation to computerized readouts has vast implications for both clinical medicine and biomedical research and poses a grand challenge for the research community and health care in general.

Many tasks currently performed visually as part of the diagnostic process can be automated by training deep learning-based classifiers. One of the major advantages of the novel AI-based algorithms is the ability to train classifiers for diagnoses that exhibit a high level of complexity. This means that during the next few years, it will become possible not only to replicate what highly trained experts do through visual assessment, but also to surpass human performance with regard to diagnostic precision, accuracy and consistency.

Coordinating PI: Lassi Paavolainen

Understanding cancer and its surrounding tissue is essential for deciding the optimal cancer treatment for a patient. Microscopy imaging of cancer tissue provides vast amounts of information on tissue organization that can be visualized by clinicians and even by patients. However, current image analysis methods only scratch the surface of the complex, large-scale data present in microscopy images. Further analytical development is required to support cancer treatment. In this project we aim to solve this analytical problem by developing novel AI-driven image analysis methods to uncover new treatment- and survival-specific information from microscopy images. Our main goal is to improve knowledge of cancer tissue structure and to discover new biomarkers using these AI methods. The resulting biomarkers can lead to improved targeted cancer therapies and provide information on cancer aggressiveness to the treating clinician. When these methods are applied to cancer biopsies, clinicians can use the outcome to optimize patient-specific treatment before, or without, surgery. The results help clinicians decide whether surgery is needed and which treatment option is optimal for patient survival and recovery.

Coordinating PI: Tero Aittokallio

Drug resistance is the major reason why cancers progress in patients with disseminated disease. Interactions of tumor cells with their surrounding tissue and cells, the so-called tumor microenvironment (TME), drive this resistance, and the identification of multi-drug combination therapies that target such tumor-microenvironment interactions offers great potential for durable outcomes. We will develop an experimental-computational platform to identify tumor-selective drug combinations applicable to a variety of solid tumors; it will first be tested in ovarian cancers (2023-2024) and later extended to lung and other tumor types (2025-2026). The combinations most promising as clinical treatments will be discussed with iCAN clinicians and assessed via the molecular tumor board (iMTB), which is intended to centralize the sharing of clinical information. Since drug combinations are required to treat most advanced cancers, this project will implement a critical platform to find effective and safe treatments for individual cancer patients, each of whom carries a unique TME. The platform will also help to identify markers indicative of response, so-called biomarkers, providing a means to select patients for next-generation clinical trials based on their genomic identity and drug responses tested in the lab.

Coordinating PI: Mikko Myllymäki

Cancer remains among the leading causes of mortality globally. While traditional chemotherapeutic agents are still widely used to treat various cancer subtypes, their effectiveness is hindered by toxicities affecting organs such as the liver and kidneys. Novel therapies that activate the immune system have shown promising activity in some cancer subtypes; however, their mechanism of action relies on the patient’s own immune system to attack tumor cells. Clonal hematopoiesis is a premalignant condition in which a hematopoietic stem cell clone expands in the bone marrow and can be detected in peripheral blood. Clonal hematopoiesis is associated with a risk of inflammatory diseases, including cardiovascular diseases, due to increased immune cell activation, and predisposes to other health conditions such as chronic liver and kidney diseases, possibly rendering some cancer patients more susceptible to drug-induced toxicities. Our project aims to comprehensively evaluate the spectrum of clonal hematopoiesis in iCAN participants and how it is associated with disease outcomes, including responses to immunotherapies and treatment-related adverse events. Collectively, these efforts will help design individualized therapies for cancer patients in the future.

Coordinating PI: Kimmo Porkka

Near real-time sharing of comprehensive cancer data between different sites (national and international) is increasingly important for generating reliable and transparent evidence for regulatory and research purposes. With the recent European focus on privacy preservation (e.g. GDPR, local adaptations), the exchange of primary patient-level data is increasingly challenging. Thus, novel methods and structures for the timely sharing and analysis of medical information and expertise are needed.

The iCAN-SHARE subproject will provide tools and methods for privacy-preserving, secure, and timely exchange of iCAN research data and models, and enable global collaboration (academic, industry). We will also establish extensive network studies for next-generation predictive modeling with the ultimate aim of matching the right cancer drugs with the right cancer patients.

Coordinating PI: Hanna Ollila

Recent research has demonstrated that cancer has a time-dependent pathogenicity. Cancer risk can be higher during winter than during summer, and cancer cells metastasize primarily during the night. Consequently, inherent homeostatic factors, including genetic factors, together with time-dependent and seasonal environmental risk factors, likely shape metastatic potential and the timing of cancer onset and metastasis. In addition, pharmaceutical interventions may be most effective, and may provide added survival benefit, when timed optimally. This project aims to explore the temporal variation in cancer risk, metastasis and mortality. In addition, we will examine the effect of timed drug administration and its therapeutic potential in cancer treatment.

Coordinating PI: Vilja Pietiäinen

We have created the iCAN integrative Molecular Tumor Board (iMTB) reporting system to improve cancer diagnosis and treatment by using detailed molecular data from patient samples. Traditional diagnostic methods often cannot provide personalized treatment options. Our goal with the iMTB is to analyze this complex data, turn it into easy-to-understand information, and help doctors with diagnosis and, potentially, with choosing the best treatments based on each patient’s unique molecular profile. The iCAN-iMTB tool currently helps identify genetic alterations in cancer for individual patients and provides reports on these to researchers and doctors. Our next steps are to include more types of molecular data, to improve the clinical reporting process by working together with diagnostic laboratories and clinics, and to further explore the clinical relevance of the molecular profiling findings by using iMTB in a pan-cancer patient study.

The iMTB is a key tool that aids cancer research and discovery and, most importantly, aims to improve patient outcomes and quality of life using precision medicine data. To keep patients informed and involved, we will work with patient representatives (POTKU) and other patient advocacy groups. Our aim is to ensure that patients can benefit from the latest advancements in cancer care.

Coordinating PI: Sampsa Hautaniemi

Generating massive amounts of genomics data has become common due to decreasing sequencing costs. Furthermore, the bioinformatics community has developed reliable data analysis pipelines that allow tens of thousands of genomes to be processed efficiently. Thus, we are experiencing a paradigm shift in which the main bottleneck has moved from generating and processing data from cancer patients to translating these vast quantities of data into knowledge and medical benefits.

One of the most powerful approaches to translating data into knowledge is visualization. Modern visualization methodologies are based on advanced mathematics and state-of-the-art software engineering, and with the help of powerful computing clusters they have evolved to a level that enables interactive exploration of large data sets. We have pioneered a cutting-edge interactive visualization tool called GenomeSpy for thorough interpretation of genomics data and shown its utility in the exploration of >700 whole-genome sequencing samples from cancer patients. In this project, the tool will be installed to visualize iCAN data, allowing convenient exploration of all iCAN genomics data. We will start with the ovarian cancer subproject and then expand to other subprojects. Interactive and rapid visualization will greatly accelerate discoveries from the iCAN genomics data.

Coordinating PI: Ping Chen

Cancer is a highly complex disease that varies significantly from person to person, making it difficult to develop effective treatments. Our project investigates a process called “alternative splicing,” which allows cells to produce different proteins from the same gene. When alternative splicing goes wrong, it can drive cancer development and growth and influence how well treatments work. Using comprehensive molecular and clinical data from the iCAN Flagship Project, we will develop advanced computational tools to study how alternative splicing impacts cancer development. Our goal is to identify dysregulated splicing patterns in human cancers and understand the mechanisms driving these changes. By linking these disrupted splicing events to patient outcomes, we aim to discover new biomarkers and targets for better diagnostics and personalized therapies. In this pilot study, we will focus on ovarian cancer to validate our approach. This research will significantly deepen our understanding of how alternative splicing contributes to cancer, and the approach will then be extended to other cancer types within the iCAN Flagship Project, paving the way for more effective and personalized cancer treatments.