This service facilitates the analysis of correlations between the activity of many chemical substances tested in the NCI60 anti-cancer trials and the expression of genes deduced from cDNA microarray results obtained for 60 cancer cell lines. Because both types of experiments were conducted on the same set of cell lines, the results can be compared directly. All experimental values were taken from the US National Cancer Institute. The data collection and initial clustering analyses were published in "A gene expression database for the molecular pharmacology of cancer" and "Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks".
For this service the data were processed to calculate relations between drugs and genes observed in the NCI60 cell lines. A relation can be discovered by assessing the correlation between biological activity and expression over an array of 60 values measured in the cells. On one hand, it may seem unlikely that a gene significantly affecting the action of a drug could be found in a list of almost 10 000 investigated human genes just by collecting 60 measurements. On the other hand, even if each measurement could take only 2 possible values, the total number of possible results would be 2^60 = 1024^6, a number with 19 digits, much larger than the number of nucleotides in gene sequences stored in current databases. Thus, 60 experiments are sufficient to create a substance-specific or gene-specific activity profile. The results can point to interesting genes or groups of genes that interact in some way with the compound of interest.
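The two ideas in this paragraph can be sketched in a few lines of Python. The snippet below is only an illustration: it checks the 2^60 arithmetic and computes a plain Pearson correlation coefficient between two profiles. The function name `pearson` and the toy numbers are assumptions made for this example and are not taken from the service or from NCI60 data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)

# Number of distinct profiles when each of 60 measurements is binary.
n_profiles = 2 ** 60

# Toy profiles over 6 hypothetical cell lines (made-up numbers).
activity = [5.1, 6.3, 4.8, 7.0, 5.5, 6.1]
expression = [0.2, 1.1, -0.3, 1.6, 0.4, 0.9]
r = pearson(activity, expression)
```

With real data each profile would hold one value per cell line, and cell lines lacking a measurement in either profile would have to be dropped before the call.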
The correlation values have been calculated for all gene-drug pairs stored in the database and also for all gene-gene and drug-drug pairs. Each correlation is expressed as a correlation coefficient and through an error function (the correlation-error-function), which gives the probability of obtaining a correlation coefficient higher than the observed one by chance. The observed gene-drug correlation results have also been used to score the relation between a drug and a gene function as defined by the gene ontology (GO), and between a drug and a pathway (as defined by KEGG or BioCarta). The score (matching-score) is defined as the sum of the negative logarithms of the correlation-error-function values for all related genes belonging to the group (GO-term or pathway), divided by the total number of genes belonging to this group. A gene is considered related to a drug if the value of the correlation-error-function is below 0.0001. The matching-score equals 0 if the group contains no related genes.
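The matching-score definition above can be illustrated with a minimal Python sketch. Two points are assumptions, because the text does not specify them: the correlation-error-function is approximated here by the common large-sample expression erfc(|r| * sqrt(N/2)), and the logarithm is taken to base 10. The function names are hypothetical.

```python
import math

def correlation_error_function(r, n):
    """Assumed approximation of the chance probability of a correlation
    at least as strong as r over n measurements: erfc(|r| * sqrt(n / 2))."""
    return math.erfc(abs(r) * math.sqrt(n / 2.0))

def matching_score(p_values, threshold=1e-4):
    """Score a group (GO-term or pathway) against one drug.

    p_values holds the correlation-error-function value of every gene
    in the group. Genes with a value below the threshold count as
    related; their negative log10 values (base 10 is an assumption) are
    summed and divided by the total number of genes in the group.
    """
    if not p_values:
        return 0.0
    # Clamp at a tiny floor so that an underflowed value of 0.0 stays finite.
    related = [max(p, 1e-300) for p in p_values if p < threshold]
    return sum(-math.log10(p) for p in related) / len(p_values)

# Hypothetical group of 4 genes, 2 of them related to the drug.
example_score = matching_score([1e-5, 1e-6, 0.5, 0.2])  # (5 + 6) / 4
```

Dividing by the size of the whole group, not by the number of related genes, means that a large group with only a few related genes receives a low score.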
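Three different biological actions of a drug are assessed in this study: GI50 (the concentration causing 50% growth inhibition), TGI (the concentration causing total growth inhibition) and LC50 (the concentration lethal to 50% of the cells).
All concentrations are expressed as negative logarithms on the service pages. Correlations with all 3 types of drug action are calculated.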
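As a small illustration of the negative-logarithm convention, the sketch below converts a concentration into the displayed form. It assumes base-10 logarithms and molar units, which the text does not state explicitly; the function name is hypothetical.

```python
import math

def neg_log_concentration(molar):
    """Return -log10 of a concentration given in mol/l (assumed units)."""
    if molar <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(molar)

# A hypothetical GI50 of 1 micromolar is displayed as a value near 6.
micromolar_value = neg_log_concentration(1e-6)
```

On this scale a larger number means that a lower concentration is needed, that is, a more active compound.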
The following pages are available:
Drugs
The drug page shows information on 45343 compounds available in the database. The following columns are shown:
The drugs can be sorted by the highest activity observed in any of the 60 cell lines using any of the 3 activity types (GI50, TGI, LC50) [links are available in the ME-BR column headings and on the Cells page]. Drugs can also be sorted by relation to another drug or to a gene, or by the matching-score to a pathway [links (named drugs) are available from the corresponding objects]. If drugs are sorted by relation to other objects, additional columns showing the similarity appear. For Pathways and GO-terms the additional score column shows the matching-score described above. For Drugs and Genes the correlation coefficients for the 3 activities are reported together with the lowest observed correlation-error-function (the correlation-error-function is used as the sorting criterion). For drugs the Tanimoto 2D-index similarity is also reported if it is above 0.4.
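The Tanimoto similarity mentioned above can be sketched as follows. This is a generic illustration on sets of fingerprint bit positions; how the service derives its 2D fingerprints is not described here, so the input format and the function names are assumptions.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen for this sketch: two empty fingerprints
    return len(a & b) / union

def reported_similarity(bits_a, bits_b, cutoff=0.4):
    """Return the similarity only if it is above the cutoff, else None."""
    score = tanimoto(bits_a, bits_b)
    return score if score > cutoff else None

# Hypothetical fingerprints: 2 shared bits out of 4 distinct bits in total.
similar = reported_similarity({1, 2, 3}, {2, 3, 4})
# 2 shared bits out of 6 distinct bits falls below the 0.4 cutoff.
dissimilar = reported_similarity({1, 2, 3, 4}, {3, 4, 5, 6})
```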
Ligands
The ligand page shows information on 1159274 compounds downloaded from Ligand.Info. The following columns are shown:
The ligands can be sorted by similarity to another ligand or drug.
Genes
The gene page shows information on 9703 human clones used in cDNA expression experiments. These clones were mapped to human genes by comparing the nucleotide sequences of the 5' and 3' ends of the clones with the database of human mRNA sequences using BLAST. 1293 clones could not be mapped this way and retain only their original annotation. The following columns are shown:
The genes can be sorted by the expression change observed in any of the 60 cell lines [links are shown in the ME-BR column headings]. Genes can also be sorted by relation to another gene or to a drug [links (named genes) are available from the corresponding objects]. If genes are sorted by relation to other objects, additional columns report the highest observed correlation coefficients and the lowest observed correlation-error-function (the correlation-error-function is used as the sorting criterion). There is only one correlation coefficient between two genes, but there are 3 for a gene-drug pair, one for each activity (GI50, TGI, LC50).
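The sorting rule for gene-drug pairs can be sketched as below: each gene has three (correlation coefficient, correlation-error-function) pairs, and the list is ordered by the lowest error-function value. The data layout, the gene names and the choice of reporting the coefficient with the largest magnitude as the "highest" one are assumptions made for this example.

```python
def rank_genes(relations):
    """Rank genes against one drug.

    relations maps gene -> {activity: (r, p)}, where r is the correlation
    coefficient and p the correlation-error-function for that activity.
    Returns (gene, strongest r, lowest p) tuples sorted by lowest p.
    """
    rows = []
    for gene, per_activity in relations.items():
        strongest_r = max((r for r, _ in per_activity.values()), key=abs)
        lowest_p = min(p for _, p in per_activity.values())
        rows.append((gene, strongest_r, lowest_p))
    rows.sort(key=lambda row: row[2])
    return rows

# Hypothetical values for three made-up genes.
example = {
    "GENE_A": {"GI50": (0.62, 3e-6), "TGI": (0.41, 2e-3), "LC50": (0.10, 0.45)},
    "GENE_B": {"GI50": (-0.71, 4e-8), "TGI": (-0.55, 2e-5), "LC50": (-0.30, 0.02)},
    "GENE_C": {"GI50": (0.20, 0.12), "TGI": (0.15, 0.25), "LC50": (0.05, 0.70)},
}
ranking = rank_genes(example)
```

For a gene-gene pair there is a single coefficient, so the inner dictionary would hold just one entry and the same function applies.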
Pathways
The assignment of genes to pathways was downloaded from the Cancer Genome Anatomy Project (CGAP) page. The page includes 314 BioCarta pathway assignments and 170 KEGG pathway assignments. BioCarta and KEGG pathways can be viewed separately by following the corresponding links in the menu panel. The following columns are shown:
Pathways assigned to a single gene or related via genes to a drug can be selected through links from the corresponding objects. If pathways related to drugs are shown, an additional column with the matching-score is displayed and used for sorting. Pathways assigned to a gene are sorted by the number of genes in them.
GO-terms
The assignment of GO-terms to human genes was also downloaded from the Cancer Genome Anatomy Project (CGAP) page. The page includes 2327 component GO-terms, 15762 process GO-terms and 9241 function GO-terms. These 3 categories can be viewed separately by following the corresponding links in the menu panel. The following columns are shown:
GO-terms assigned to a single gene or related via genes to a drug can be selected through links from the corresponding objects. If GO-terms related to drugs are shown, an additional column with the matching-score is displayed and used for sorting. GO-terms assigned to a gene are sorted by the number of genes with this assignment.
Cells
This page shows the list of 60 cancer cell lines used in the gene expression and drug activity studies. The following columns are shown:
Some cell lines have no activity measurements recorded. These clearly cannot be used in expression-activity correlation analyses, but they are listed here for completeness.
Search
The search input field can be used to restrict the list of displayed objects to those matching the provided string. Regular expression matching on the text in the Names column is used.
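The effect of the search field can be illustrated with a short Python sketch that filters rows by a regular expression applied to the Names column. The row format, the example names and the case-insensitive matching are assumptions; the regular expression dialect used by the service may differ from Python's.

```python
import re

def filter_by_name(rows, pattern):
    """Keep the rows whose 'Names' text matches the regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)  # ignoring case is an assumption
    return [row for row in rows if regex.search(row["Names"])]

# Hypothetical rows as they might appear on the drug page.
rows = [
    {"Names": "Paclitaxel; Taxol"},
    {"Names": "Cisplatin"},
    {"Names": "Carboplatin"},
]
platins = filter_by_name(rows, r"platin$")
taxanes = filter_by_name(rows, r"tax(ol|el)")
```

A plain string without special characters behaves as a simple substring search, while anchors such as `^` and `$` restrict the match to the start or end of the Names text.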