Cancer is a generic term for a group of diseases that can affect any part of the body. According to the World Health Organization, cancer causes 7 million deaths each year, or 12.5% of deaths worldwide. More than 11 million people are diagnosed with cancer every year, and it is estimated that there will be 16 million new cases every year by 2020.
Cancer develops when cells in one part of the body begin to grow out of control, often invading other tissues. This invasion occurs either directly or through metastasis, the process in which cancer cells travel to other parts of the body, where they begin to grow and replace normal tissue.
Cancer cells develop as a result of damage to DNA. Most of the time when DNA becomes damaged the body is able to repair it, but in cancer cells, damaged DNA is not fixed. Damaged DNA can be hereditary or can be caused by carcinogens, by exposure to radioactive materials, or by certain viruses that insert their DNA into the human genome.
Current therapies and treatment regimens are based upon classification strategies that are limited in their capacity to identify specific tumor groups exhibiting different clinical and biological profiles. Given the recent advances in high-throughput genetics, it is now possible to use gene expression patterns as a basis for classification. Tumors can be analyzed at the genomic, RNA, and protein expression levels using tissue microarrays (TMA) to confirm clinico-pathologic correlations that have been established with whole tissue sections. Drs. Foran and Bhanot have undertaken an ambitious project to design, develop, and evaluate a multi-modality decision support approach for assessing and managing cancer, one that evaluates clinical, genomic, and imaging data systematically using automated, evidence-based methods. We will take an interdisciplinary approach that leverages advances in molecular biology and early detection through state-of-the-art imaging and pattern recognition techniques. We will translate research results from our studies into a prototype clinical decision support system.

In the PathMiner, TMA-Miner, and Multimodal Decision Support (MDS) projects, the bottleneck is no longer the computational tools for performing the analysis but the processing speed currently available to us. If the rich informational content extracted from the thousands of cells, tissues, and microarray datasets that constitute a single specimen could be processed in parallel, and if a reliable meta-classifier could be devised to optimally combine the salient genomic, proteomic, and image-based signatures from several modalities simultaneously, it would be possible to develop high-throughput screening and analysis systems that would accelerate progress in investigative cancer research, drug discovery, and diagnostics. Our team has the capacity and requisite knowledge to build such a system, but the undertaking will require much greater computational resources than we currently have at our disposal.
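As a minimal sketch of the idea described above, and not the project's actual implementation, the following Python example scores the units of each modality in parallel and then combines the per-modality scores with a weighted average, which is one of the simplest possible meta-classifiers. The function names (score_unit, score_modality, fuse), the toy feature values, and the weights are all hypothetical. A real system would replace score_unit with actual image or expression analysis and would likely use a process pool or a compute cluster rather than threads for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def score_unit(unit):
    """Hypothetical per-unit analysis: reduce a list of features to one score.

    The mean is only a stand-in for real image or expression analysis.
    """
    return mean(unit)


def score_modality(units, max_workers=4):
    """Score every unit of one modality concurrently, then average the scores."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and runs score_unit on each unit.
        scores = list(pool.map(score_unit, units))
    return mean(scores)


def fuse(modality_scores, weights):
    """Weighted-average late fusion of per-modality scores (assumed scheme)."""
    total_weight = sum(weights[name] for name in modality_scores)
    weighted_sum = sum(weights[name] * score
                       for name, score in modality_scores.items())
    return weighted_sum / total_weight


# Toy data: each modality is a list of units, each unit a list of features.
specimen = {
    "imaging": [[0.2, 0.4], [0.6, 0.8]],
    "genomic": [[0.9, 0.7]],
}
weights = {"imaging": 1.0, "genomic": 3.0}  # illustrative values only

modality_scores = {name: score_modality(units)
                   for name, units in specimen.items()}
combined = fuse(modality_scores, weights)
```

Because each unit is scored independently, the work divides cleanly across workers; only the final fusion step needs results from every modality.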
As part of our image acquisition protocol, cancer specimens, including standard histologic sections and immunostained tissue microarrays, are being scanned on a high-resolution automated whole-slide scanner using a 40x volume scan. Datasets are stored on a RAID system in multi-tiled TIFF format. For each microscopic specimen, the resulting image file is approximately 1 GB. As of this writing, we have already digitized a mixed set of nearly 100,000 tissue discs originating from immunostained microarrays from several NIH-designated Cancer Centers across the country. The other specimens used in our experiments will be accessed from repositories at The Cancer Institute of New Jersey and the University of Pennsylvania School of Medicine. All samples are de-identified so that they cannot be traced back to patients; however, each specimen will be accompanied by the surgical diagnosis of record, the histological description, and the tumor grade.
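To illustrate why a tiled storage format suits this kind of workload, the short Python sketch below enumerates the tile boundaries of a large image so that each tile can be read and analyzed as an independent unit. It is an assumption-laden illustration only: the function name tile_grid, the image dimensions, and the tile size are hypothetical and are not taken from the scanner or from the project's file layout.

```python
def tile_grid(width, height, tile_size):
    """Yield (x, y, w, h) boxes that cover a width x height image row by row.

    Tiles on the right and bottom edges are clipped to the image bounds.
    """
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            w = min(tile_size, width - x)
            h = min(tile_size, height - y)
            yield (x, y, w, h)


# Hypothetical dimensions, chosen small so the result is easy to inspect.
boxes = list(tile_grid(1000, 600, 512))
```

Each box describes a region that can be handed to a separate worker, so an image file of roughly 1 GB never has to be loaded into memory in one piece.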