ColoGuidePro signature reporting file

Generated by ReProMSig version 1.0

Date created: 2022-03-23

  Last update: 2023-07-18



Lead contact: Jianmin Wu

Email address: wujm@bjmu.edu.cn

Organization: Peking University Cancer Hospital & Institute

Reporting file URL: https://omics.bjcancer.org/prognosis/open/ColoGuidePro_4ec64bfa_report.html


I. Signature summary

Signature name: ColoGuidePro

Signature type: Prognostic molecular signature

Signature generation: The signature score is the sum, over all signature genes, of each gene's expression value multiplied by its published regression coefficient.

Patient risk stratification: 2 Groups

Signature description: A 7-gene expression signature for stage III colorectal cancer prognosis, reproduced using the published signature genes and coefficients (PMID: 22991413), and evaluated by ReProMSig.


Training & Validation datasets


Class | Dataset name | Primary site | Disease type | Molecular dataset | Endpoint | PubMed | Clinical trial registry
Training dataset | ProL | Colorectal | Adenomas and Adenocarcinomas (95) | GSE30378 (95) | DFS (48 events, 95 patients) | 22991413 |
Validation dataset | ProT | Colorectal | Adenomas and Adenocarcinomas (77) | GSE24550 (77) | DFS (19 events, 77 patients) | 22991413 |
Validation dataset | ProV | Colorectal | Adenomas and Adenocarcinomas (218) | GSE14333, GSE17538 (218 in total) | DFS (51 events, 218 patients) | 19996206, 19914252 |


II. Items of methods

Item 4. Source of data

Item 4a. Study design

Training dataset

  • Study design or source of data

A retrospective cohort study including 95 samples taken from patients treated surgically at different hospitals in the Oslo region, before adjuvant chemotherapy became standard treatment for stage III patients.

Validation dataset(s)

  • Study design or source of data

ProT: a consecutively collected series (95% inclusion rate) of 77 patients treated by curative resection at a Norwegian hospital (Aker University Hospital, Oslo, Norway).

ProV: sourced from 2 independent series totaling 218 patients with stage II and III CRC.
- Consecutive patients from GSE14333 (n = 185) were retrieved from the tissue banks of the Royal Melbourne Hospital and the H. Lee Moffitt Cancer Center in the United States.
- Patients from GSE17538 (n = 33) were treated at the Vanderbilt Medical Centre (Nashville, TN, USA).

Item 4b. Key study dates

Training dataset

Patients were recruited from 1987 to 1989.

Validation dataset(s)

ProT: Patients were recruited from 2005 to 2008.

ProV: Not mentioned in this dataset.


Item 5. Participants

Item 5a. Key study setting

Training dataset

  • Study setting, number and location of centers

Patients were collected from hospitals in the Oslo region.

Validation dataset(s)

  • Study setting, number and location of centers

ProT: Patients were consecutively collected at Oslo University Hospital, Aker, Norway.

ProV: Patients were collected at H Lee Moffitt Cancer Center, Royal Melbourne Hospital, and Vanderbilt University Medical Center.

Item 5b. Eligibility criteria for participants

Training dataset

  • Exclusion criteria

Clinical analysis

Patients with a missing outcome or with stage I or IV tumors were excluded.

Statistical analysis

No patients were excluded by ReProMSig.

  • Inclusion criteria

Clinical analysis

These patients were selected to include approximately equal numbers of stage II and III tumors, as well as equal numbers of survival events between the stages, again to achieve independent information within each stage. Selection was also based on long-term follow-up among survivors (>10 years).

Statistical analysis

All patients were included in ReProMSig.

Validation dataset(s)

  • Exclusion criteria

Clinical analysis

ProT: Patients with a missing outcome or with stage I or IV tumors were excluded.

ProV: Patients with a missing outcome or with stage I or IV tumors were excluded. Individuals who had received preoperative chemotherapy and/or radiotherapy, or for whom tumor-derived total RNA was inadequate for microarray analysis (RNA integrity number [RIN] < 6), were excluded from the GSE14333 series.

Statistical analysis

No patients were excluded by ReProMSig.

  • Inclusion criteria

Clinical analysis

ProT: Patients with stage II and III tumors were selected.

ProV: Because there was extensive overlap between samples from H. Lee Moffitt Cancer Center in the two series (GSE14333 and GSE17538), only nonoverlapping samples from stage II and III patients were included.

Statistical analysis

All patients were included in ReProMSig.

  • Similarity to the eligibility criteria used in the training dataset
  • Similar to the criteria described above.

    Item 5c. Received treatments

    Training dataset

    • Treatment details

    None of the patients had received adjuvant chemotherapy, which was introduced as standard treatment for patients with stage III CRC aged <75 years in Norway in 1997.

    Validation dataset(s)

    • Treatment details

    ProT: None of the patients had received preoperative radiotherapy. Adjuvant chemotherapy was given in accordance with Norwegian guidelines to patients with stage III colon cancer aged <75 years, to patients with stage II disease in whom <8 lymph nodes were examined, or to patients with preoperative or intraoperative tumour perforation. Furthermore, adjuvant chemotherapy is generally given only to physically fit patients, and not to patients with rectal cancer. All patients underwent curative resection and no bowel perforation was reported.

    ProV:
    - None of the patients from GSE14333 had received preoperative chemotherapy and/or radiotherapy. 21 patients had received postoperative concurrent chemoradiotherapy (50.4 Gy in 28 fractions with concurrent 5-fluorouracil), one patient had received only adjuvant radiotherapy, and 64 patients had received only adjuvant chemotherapy (either single-agent 5-fluorouracil/capecitabine or 5-fluorouracil and oxaliplatin).
    - 33 patients from GSE17538 had no available information about received treatments.


    Item 6. Outcome

    Item 6a. Clinical endpoints being analyzed
    • Outcome

    Relapse or death from colorectal cancer was regarded as an event, and patients with no events were censored. Specifically, follow-up was 10 years for the ProL dataset and 5 years for the ProT dataset.

    • How and when assessed

    Not mentioned in this study.

    Item 6b. Actions for blind assessment of the outcome

    Not mentioned in this study.


    Item 7. Predictors

    Item 7a. Predictors used in developing the multivariable prediction model

    Genome-wide expression at the exon-level for colorectal cancer tissue biopsies was analyzed using the Affymetrix Human Exon 1.0 ST platform.

    Item 7b. Actions for blind assessment of predictors

    Not mentioned in this study.


    Item 8. Sample Size

    • How the study size arrived at

    Not mentioned in this study.


    Item 9. Missing Data

    Training dataset

    • Missing value handling

    Assuming that missing values in the gene expression profiles occurred at random, ReProMSig imputed them with the KNN method (impute.knn function in R package impute, version 1.60.0).

    Validation dataset(s)

    • Missing value handling

    Assuming that missing values in the gene expression profiles occurred at random, ReProMSig imputed them with the KNN method (impute.knn function in R package impute, version 1.60.0).
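The idea behind KNN imputation can be illustrated with a minimal Python sketch. This is a simplified stand-in and not a reimplementation of impute.knn: the function name knn_impute, the choice of k, the distance definition, and the toy matrix are all assumptions made for illustration.

```python
def knn_impute(matrix, k=2):
    """Fill missing values (None) in a genes-by-samples matrix.

    Simplified illustration only: each missing entry is replaced by the mean
    of that sample's values in the k most similar genes.
    """
    def distance(row_a, row_b):
        # Mean squared difference over samples observed in both genes.
        shared = [(a, b) for a, b in zip(row_a, row_b)
                  if a is not None and b is not None]
        if not shared:
            return float("inf")
        return sum((a - b) ** 2 for a, b in shared) / len(shared)

    imputed = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other genes with an observed value in sample j.
            candidates = [(distance(row, other), other[j])
                          for m, other in enumerate(matrix)
                          if m != i and other[j] is not None]
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Made-up log2 expression values; rows are genes, columns are samples.
toy = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.2, 2.2, 3.4],
    [9.0, 9.0, 9.0],
]
filled = knn_impute(toy, k=2)
```

In this toy matrix the missing entry in the first gene is filled from the two genes whose observed profiles are closest to it, so the distant fourth gene does not contribute.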


    Item 10. Statistical Analysis Methods

    Item 10a. Predictors handling method
    • Molecular profiles

    Expression values were log2-transformed. Using the training dataset's gene expression values as the reference, the ComBat function in the R package "sva" (version 3.34.0) was applied to each validation dataset's profiles to reduce batch effects arising from nonbiological technical biases.

    • Implausible observations

    Extreme values (e.g., outliers) in expression profiles were regarded as NA and imputed by the KNN method, using the impute.knn function in R package "impute" (version 1.60.0).

    Item 10b. Model building procedures
    • Predictor selection before modeling

    Candidate predictors include only genes with expression variances higher than 0.2, and P-values from univariate Cox proportional hazards analyses below 0.5 (n = 3,098 genes). P-values (Wald test of predictive potential) were calculated in R 2.11.1 using the Bioconductor package Weighted Gene Co-expression Network Analysis (WGCNA).

    • Multivariable prediction model building

    Lasso-penalized multivariate Cox proportional hazards modeling was conducted on the filtered candidate genes. Seven different gene expression signatures were each found to provide optimal survival prediction more than 50 times (size range, 1–12 genes). For all these signatures, except the 1-gene signature, there were significant associations between patient survival and increasing numbers of genes expressed at levels associated with poor survival. For each of the gene expression signatures, patients were dichotomized into good and poor prognosis groups according to all the possible stepwise increases in the number of genes expressed at levels associated with poor prognosis. Of the 28 possible poor prognosis groups, 22 (79%) had significant associations with poor patient survival. To assess which stratification rule had the best predictive potential on independent samples, the same patient stratification according to the different gene signatures was repeated in the validation dataset. The best performing stratification rule across both datasets (by rank of P-values from univariate Cox proportional hazards analyses) assigned patients to a poor prognosis group when they expressed three or more genes in the 7-gene signature at levels associated with poor prognosis.

    • Collinearity assessment

    The correlations between each pair of selected molecular predictors were estimated using Pearson correlation analysis to evaluate the presence of collinearity. However, we did not exclude any predictors even if potential collinearity was found.
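The pairwise check described above can be sketched in Python as follows. The function names and the expression values are made up for illustration; only the gene symbols come from the signature.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(profiles):
    """Correlation for every unordered pair of genes in a {gene: values} dict."""
    return {
        (g1, g2): pearson(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


# Made-up expression values for three signature genes across four samples.
toy_profiles = {
    "CXCL9": [1.0, 2.0, 3.0, 4.0],
    "OLFM4": [2.0, 4.0, 6.0, 8.0],
    "WNT11": [4.0, 3.0, 2.0, 1.0],
}
correlations = pairwise_correlations(toy_profiles)
```

Pairs with an absolute coefficient close to 1 would be the ones to flag as potentially collinear.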

    Item 10c. Prediction method for the validation dataset(s)

    • Signature model

    Each individual's signature score was calculated as a weighted sum of the predictors in the generated LASSO-penalized Cox regression model, with the corresponding regression coefficients as weights.

    • Nomogram

    A multivariate Cox regression model integrating the molecular signature prediction (signature group) and clinicopathological factors (Stage) (user provided) was developed for nomograms. The nomogram was then created using patients in the training dataset, and can be used to estimate the DFS probabilities at 12, 36, and 60 months for a single patient.

    • Online research tool

    An online research tool for single patient prediction of risk score (i.e., the exponential of signature score) using the molecular signature is available from https://omics.bjcancer.org/prognosis/.

    Item 10d. Model performance assessment
    • Independence test

    Univariate and multivariate Cox regression analyses were performed to test whether each prognostic factor is an independent predictor of patients' DFS. Both the molecular signature prediction (risk groups) and clinicopathological variables (MSI_status, Stage) were included in the independence test.

    • Interaction test

    We did not examine interaction terms but relied on the main effects of the selected predictors.

    • Kaplan-Meier analysis

    Discrimination was visually inspected from the spread of the Kaplan-Meier curves for each predicted risk group (high-risk and low-risk) in the training and validation datasets, respectively. Differences in the probability of DFS between risk groups (high-risk vs. low-risk) were tested by the two-sided log-rank test. The Kaplan-Meier plots also present the total number of patients and the number of events (outcome) for each risk group.
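A minimal Python sketch of the two calculations named above, the Kaplan-Meier estimate and the two-group log-rank test, is given here for orientation. It is not the code used by ReProMSig; the function names and the follow-up data are hypothetical, and the P value uses the chi-square distribution with one degree of freedom.

```python
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value)."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    event_times = sorted({t for t, e in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Hypothetical follow-up in months; event 1 = relapse or death, 0 = censored.
km_curve = kaplan_meier([6, 12, 12, 24, 36, 60], [1, 1, 0, 1, 0, 0])
chi2, p_value = logrank_test([2, 3, 4], [1, 1, 1], [10, 11, 12], [1, 1, 1])
```

The curve only drops at times where an event occurs; censored patients reduce the number at risk without changing the survival estimate.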

    • Time dependent receiver operating characteristic (ROC) analysis

    Time-dependent ROC analysis at 12, 36, and 60 months was performed to examine the prognostic accuracy of the developed model in the training dataset. ROC curves for the molecular signature, the clinicopathological variable (Stage), and the combined model were plotted. An area under the ROC curve (AUC) of 0.5 indicates no discrimination, whereas an AUC of 1.0 indicates perfect discrimination.

    • Prediction error (PE) curve analysis

    PE curve analysis was applied in the training dataset to examine the prognosis prediction error rate, and the ten-fold cross-validation cumulative prediction error was computed using the Kaplan-Meier estimate as reference. A model with a smaller area under the curve has a relatively lower error rate. PE curves for the molecular signature, the clinicopathological variable (Stage), and the combined model were plotted.

    • Calibration plots

    Calibration plots at 12, 36, and 60 months were used to assess the consistency between the actual and predicted DFS probabilities from the molecular signature and the combined model in the training dataset.

    • DFS

    DFS with 95% confidence intervals at 12, 36, and 60 months was calculated by the Kaplan-Meier method for patients in each risk group, in the training and validation datasets respectively.

    • Association analysis

    To assess whether the developed prediction model is correlated with clinicopathological variables, two-sided Fisher's exact test (for categorical variables) and two-sided t test (for continuous variables) were applied to measure association or difference between predicted risk groups.

    Item 10e. Model updating arising from the validation


    Item 11. Risk Groups

    Patients were stratified into two risk groups on the basis of signature score distribution in each dataset: low-risk (signature score < 80th percentile) and high-risk (signature score >= 80th percentile).
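The stratification rule above can be sketched in Python as follows. The report does not state how the percentile is interpolated, so the "inclusive" linear-interpolation method used here is an assumption, as are the function name and the example scores.

```python
from statistics import quantiles


def stratify_by_percentile(scores):
    """Split signature scores at the 80th percentile of their own distribution."""
    # quantiles(..., n=5) returns the 20th, 40th, 60th and 80th percentiles;
    # the interpolation method is an assumption, not taken from the report.
    cutoff = quantiles(scores, n=5, method="inclusive")[3]
    groups = ["high-risk" if s >= cutoff else "low-risk" for s in scores]
    return cutoff, groups


# Made-up signature scores for ten patients in one dataset.
example_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cutoff, groups = stratify_by_percentile(example_scores)
```

Because the cutoff is taken from each dataset's own score distribution, the same score can fall into different groups in different datasets.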


    Item 12. Development vs. Validation

    Differences in development and validation

  • Setting
  • The details are described at Item 5a.

  • Eligibility criteria
  • The details are described at Item 5b.

  • Outcome
  • The details are described at Item 6.

  • Predictors
  • The mRNA expression profiles of predictors in training dataset and ProT dataset were generated using Affymetrix Human Exon 1.0 ST platform, while those in ProV dataset were determined using Affymetrix Human Genome U133Plus 2.0 arrays.



    III. Items of results

    Item 13. Participants

    Item 13a. Study participants

    Training dataset

    In the training dataset, 95 out of 95 (100%) had follow-up of DFS with a median follow-up (by reverse Kaplan-Meier method) of 120 months (range: 3.96-120).

    Validation dataset(s)

    ProT: 77 out of 77 (100%) had follow-up of DFS with a median follow-up (by reverse Kaplan-Meier method) of 46.08 months (range: 2.04-59.64).

    ProV: 218 out of 218 (100%) had follow-up of DFS with a median follow-up (by reverse Kaplan-Meier method) of 47.86 months (range: 0.43-118.58).
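The reverse Kaplan-Meier median follow-up quoted above treats censoring as the event of interest and reads off the time at which the resulting curve first reaches 0.5. A minimal Python sketch, with a hypothetical function name and made-up follow-up data, could look like this:

```python
from fractions import Fraction


def reverse_km_median(times, events):
    """Median follow-up by the reverse Kaplan-Meier method.

    The event indicator is flipped, so censored patients count as events and
    patients with an outcome event count as censored. Returns None if the
    curve never reaches 0.5.
    """
    flipped = [0 if e else 1 for e in events]
    data = sorted(zip(times, flipped))
    at_risk = len(data)
    survival = Fraction(1)  # exact arithmetic avoids rounding at the 0.5 boundary
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= Fraction(at_risk - n_events, at_risk)
            if survival <= Fraction(1, 2):
                return t
        at_risk -= n_removed
    return None


# Hypothetical follow-up in months; event 1 = relapse or death, 0 = censored.
median_follow_up = reverse_km_median([10, 20, 30, 40, 50], [1, 0, 0, 1, 0])
```

Patients who had an early event do not shorten the estimate, which is why this method is preferred over the plain median of observed times.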

    Item 13b/c. Comparison of participant characteristics between the training and validation datasets

    Clinical and pathological characteristics


    Key clinical and pathological characteristics of participants were included. P denotes the P value of the characteristics distribution comparison between the training dataset and each validation dataset. Categorical variables were compared using Fisher's exact test, and continuous variables using the t-test.


    Expression profile distribution


    The density plots present the distributions of the log2-transformed expression level for the training dataset ProL, and the validation datasets ProT, ProV.


    Item 14. Model Development

    Unadjusted association between each candidate predictor and outcome in the training dataset


    Unadjusted association between each candidate predictor and DFS.


    TRIPOD relevance: This table reports the information required by Item 14a "Number of participants and outcome events involved in model development" and Item 14b "Unadjusted association between each candidate predictor and outcome".


    Item 15. Model Specification

    Item 15a. Details of the prediction model

    Signature genes and the corresponding coefficients


    The 7 variables comprising the developed signature are shown with their corresponding coefficients, which were calculated by the Cox regression model.
    The signature score of a patient can be calculated using the following formula:
    Signature score = -0.1*CXCL9 + 0.04*DMBT1 + 0.05*NT5E - 0.08*OLFM4 + 0.1*SEMA3A - 0.02*UGT2B17 + 0.1*WNT11.
    A higher signature score indicates a relatively poorer prognosis.
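As a worked illustration of the formula, the following Python sketch computes the signature score and the risk score (the exponential of the signature score, as described for the online tool in Item 10c). The coefficients are copied from the formula; the function names and the patient's expression values are hypothetical, and the values are assumed to be log2-transformed and batch-adjusted as described in Item 10a.

```python
import math

# Coefficients copied from the signature score formula.
COEFFICIENTS = {
    "CXCL9": -0.1,
    "DMBT1": 0.04,
    "NT5E": 0.05,
    "OLFM4": -0.08,
    "SEMA3A": 0.1,
    "UGT2B17": -0.02,
    "WNT11": 0.1,
}


def signature_score(expression):
    """Weighted sum of expression values over the seven signature genes."""
    missing = [gene for gene in COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError(f"missing signature genes: {missing}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def risk_score(expression):
    """Exponential of the signature score."""
    return math.exp(signature_score(expression))


# Hypothetical patient with made-up log2 expression values.
patient = {
    "CXCL9": 8.0, "DMBT1": 6.0, "NT5E": 7.0, "OLFM4": 9.0,
    "SEMA3A": 5.0, "UGT2B17": 4.0, "WNT11": 6.0,
}
score = signature_score(patient)
```

For this made-up patient the weighted sum works out to 0.09, so the corresponding risk score is exp(0.09).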


    Pearson correlation plot for pairwise expression comparison among signature genes


    The correlation plot shows the Pearson correlation coefficients of expression profiles between each pair of signature predictors in the training dataset. The plot allows a visual check of whether the expression levels of the signature predictors are independent. The correlation coefficient ranges from -1 to 1. A smaller absolute value implies a lower linear dependency, indicating that the pair of signature predictors may be independent.


    Item 15b. Application of the prediction model

    Nomogram for prediction of 12, 36, 60-months DFS probabilities


    A multivariate Cox regression model integrating the molecular signature prediction (signature group) and clinicopathological factors (Stage) (user provided) was developed for nomograms. The nomogram was then created using patients in the training dataset, and can be used to estimate the DFS probabilities at 12, 36, and 60 months for a single patient. In the nomogram, each factor was assigned a weighted score. To use this nomogram, first draw a line straight upward from each variable's value to the Points axis to determine how many points that variable contributes. Then sum the points obtained for all predictors. Finally, locate the sum on the Total Points axis and draw a line straight down to find the patient's probability of DFS.


    Item 16. Model Performance

    Cox regression analysis of signature and clinicopathological variables (training dataset)


    Univariate and multivariate hazard ratios for each variable, with 95% confidence intervals in parentheses, indicate multiplicative effects on the hazard. P denotes the statistical significance of the hazard ratio for a test group relative to a reference group, as tested by the Wald test. For multivariate Cox analysis, the relationship between the variable of interest and the outcome was evaluated after adjusting for potential confounding variables that may be related to the outcome.


    Receiver operating characteristic (ROC) analysis (training dataset)


    Time-dependent receiver operating characteristic (ROC) curves show the sensitivity and specificity of different variables in prognosis prediction. They help to quantify and compare the discrimination ability of the signature and clinicopathological prognostic factors at 12, 36, and 60 months in the training dataset. The area under the curve (AUC) ranges from 0.5 (no discrimination) to a theoretical maximum of 1 and is shown in the legend for each prognosis model. A prognosis model with a larger AUC performs better.


    Calibration of the molecular signature (training dataset)


    Calibration plots at 12, 36, and 60 months were generated to explore the performance of the signature. Signature-predicted probabilities and observed outcomes were plotted on the X-axis and Y-axis, respectively. A calibration plot along the 45-degree line indicates perfect consistency between the actual and signature-predicted prognosis. The vertical bars represent 95% CIs.


    Calibration of the combined model integrating the signature and key clinicopathological variables (training dataset)


    Calibration plots at 12, 36, and 60 months were generated to explore the performance of the nomogram. Nomogram-predicted probabilities and observed outcomes were plotted on the X-axis and Y-axis, respectively. A calibration plot along the 45-degree line indicates perfect consistency between the actual and nomogram-predicted prognosis. The vertical bars represent 95% CIs.


    Prediction error (PE) curve analysis (training dataset)


    Prediction error (PE) curves were used to evaluate the performance of prediction models in the training dataset, and the ten-fold cross-validation cumulative prediction error was computed using the Kaplan-Meier estimate as reference. The integrated Brier score (IBScore) was defined as the area under a prediction error curve and is shown in the legend for each prognosis model. A prognosis model with a smaller IBScore performs better.


    Association analysis of signature-predicted risk groups with clinicopathological variables


    Association analysis was employed to investigate whether the signature is correlated with clinicopathological variables and other molecular features in both training and validation datasets. Samples with valid values (such as non-missing, non-unknown) were compared in the analysis. P denotes the P values comparing the associations between the clinicopathological/molecular variables and the risk groups in each dataset. Fisher's exact test was used for association analysis.


    Kaplan-Meier analysis of signature-predicted risk groups


    Training dataset


    Validation dataset 1


    Validation dataset 2



    Kaplan-Meier survival curves for DFS in the training and validation datasets, stratified by the prediction model (high-risk and low-risk). The performance of the prognostic signature was evaluated by the two-sided log-rank test in both the training and validation datasets. P < 0.05 was considered statistically significant, and 95% confidence intervals are presented in brackets. HR, hazard ratio.


    12/36/60 months DFS for different signature-predicted risk groups


    12/36/60 months DFS of patients in different risk groups for each dataset were predicted by the prediction model.