
JPSurv

Joinpoint Model for Survival (Relative and Cause-Specific)

The JPSurv software was developed to analyze trends in survival with respect to year of diagnosis[1]. Survival data include two temporal dimensions that are important to consider: the calendar year of diagnosis and the time since diagnosis. The JPSurv model extends the Cox proportional hazards model and, in the case of relative survival, the model of Hakulinen and Tenkanen[2]; it fits a proportional hazards joinpoint model to survival data on the log hazard scale. Joinpoint models consist of linear segments connected at joinpoints. The probability (hazard) of cancer death is specified as the product of a baseline hazard (a function of time since diagnosis) and a multiplicative factor describing the effect of year of diagnosis and possibly other covariates. The effect of year of diagnosis is modeled as joined linear segments on the log scale. The number and location of joinpoints are estimated from the data and represent the times at which trends changed. This model implies that the probability of cancer death as a function of time since diagnosis is proportional for individuals diagnosed in different calendar years. The software uses discrete-time survival data, i.e. survival data grouped by years since diagnosis in life table format. The software accommodates both relative survival and cause-specific survival.
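
As a rough illustration of the model structure described above, the following Python sketch evaluates a piecewise-linear year-of-diagnosis effect on the log hazard scale and converts it into interval and cumulative survival. The complementary log-log form, the function names, and all parameter values are assumptions chosen for illustration; they are not taken from the JPSurv source code.

```python
import math

def log_hazard_trend(year, slope, joinpoints, slope_changes, ref_year):
    """Piecewise-linear effect of diagnosis year on the log hazard scale.

    Each joinpoint tau adds a change in slope for years after tau, so the
    segments stay connected (continuous) at the joinpoints.
    """
    value = slope * (year - ref_year)
    for tau, delta in zip(joinpoints, slope_changes):
        value += delta * max(year - tau, 0.0)
    return value

def interval_survival(baseline_log_hazard, trend):
    """Probability of surviving one interval, given alive at its start.

    Assumed discrete-time proportional hazards (complementary log-log) form.
    For relative survival this value would be multiplied by the expected
    interval survival.
    """
    return math.exp(-math.exp(baseline_log_hazard + trend))

def cumulative_survival(baseline_log_hazards, trend):
    """Product of interval survival probabilities over successive intervals."""
    survival = 1.0
    for alpha in baseline_log_hazards:
        survival *= interval_survival(alpha, trend)
    return survival

# Hypothetical parameters: one joinpoint in 2000, trend steeper afterwards.
trend_2005 = log_hazard_trend(2005, slope=-0.02, joinpoints=[2000],
                              slope_changes=[-0.03], ref_year=1990)
five_year = cumulative_survival([-1.0, -1.5, -1.8, -2.0, -2.2], trend_2005)
```

Because the year-of-diagnosis effect enters every interval through the same additive term on the log hazard scale, a negative trend raises survival in all intervals at once, which is the proportionality property described above.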

 
The next sections will describe the input data file, the JPSurv control options for data and model specification, and the JPSurv output produced.
 

Input Data File: From SEER*Stat

    JPSurv reads grouped relative survival or cause-specific survival data generated from SEER*Stat.

    The data require, at a minimum, the following variables: calendar year of diagnosis, survival time interval (years from diagnosis), number at risk at the beginning of the interval, number of deaths, number of cases lost to follow-up, and, in the case of relative survival, interval expected survival. The data may also include other covariates of interest such as cancer site, sex, stage, etc.
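
To show how the required variables relate to survival, here is a minimal Python sketch of the standard actuarial life table estimate for a single interval. It assumes that cases lost to follow-up are at risk for half of the interval on average. The function names and the numbers in the example are hypothetical, and JPSurv's internal calculations may differ.

```python
def observed_interval_survival(at_risk, deaths, lost):
    """Actuarial estimate of the probability of surviving one interval."""
    # Cases lost to follow-up are assumed to contribute half an interval.
    effective_at_risk = at_risk - lost / 2.0
    return 1.0 - deaths / effective_at_risk

def relative_interval_survival(observed, expected):
    """Observed interval survival divided by expected interval survival."""
    return observed / expected

# Hypothetical interval: 100 at risk, 10 deaths, 20 lost to follow-up,
# expected interval survival of 0.95 (a proportion, not a percentage).
observed = observed_interval_survival(at_risk=100, deaths=10, lost=20)
relative = relative_interval_survival(observed, expected=0.95)
```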

    Tips for creating the survival data in SEER*Stat.

    • Year at diagnosis needs to be included as a covariate.
    • Make sure the Display/Standard Life option in the Parameters Tab is checked. JPSurv does not read the Summary Table format.
    • In order to reflect trends in calendar years, 12 months per interval needs to be specified.
    • Export the results as text files: a data file (.txt) and a dictionary file (.dic).

Input Data File: From Delimited Text File

    JPSurv can also read data from delimited text files using common delimiters (comma, semicolon, or tab). The user needs to know which data are stored in each column in order to use the package correctly. The data require, at a minimum, the following variables: calendar year of diagnosis, survival time interval, number at risk at the beginning of the interval, number of deaths, number of cases lost to follow-up, and, in the case of relative survival, interval expected survival. The data may also include other covariates of interest such as cancer site, sex, stage, etc. Note that the survival time intervals must be equally spaced with no gaps. For example, an input data file with the intervals (1,3,4,5) will not return model output because interval 2 is missing and the interval lengths are therefore not consistent.

    NOTE: When using a delimited text file as input for the JPSurv application, the user must convert the interval expected survival column from percentages to proportions. The acceptable range of inputs is [0,1].
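
The two requirements above (no gaps in the intervals, expected survival given as proportions) can be checked before uploading a file. The Python sketch below is only a suggestion: the helper names are hypothetical, and it assumes one-year intervals numbered with consecutive integers.

```python
def intervals_are_consecutive(intervals):
    """True if the distinct interval values increase in steps of exactly 1."""
    ordered = sorted(set(intervals))
    return all(b - a == 1 for a, b in zip(ordered, ordered[1:]))

def percent_to_proportion(values):
    """Convert expected survival from percentages (0-100) to proportions (0-1)."""
    return [v / 100.0 for v in values]

def all_in_unit_range(values):
    """True if every value lies in the acceptable range [0, 1]."""
    return all(0.0 <= v <= 1.0 for v in values)

# The interval set from the text fails the check because interval 2 is missing.
has_gap = not intervals_are_consecutive([1, 3, 4, 5])

# Hypothetical expected survival column exported as percentages.
expected = percent_to_proportion([98.5, 97.0, 96.2])
ready_for_upload = all_in_unit_range(expected)
```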

Data and Model Specifications

  • Year of Diagnosis Range: Specify the range of diagnosis years used to fit the JPSurv model. For example, the data may include diagnosis years from 1975 through 2011, but the user may be interested only in trends over the last 10 years, i.e. 2002-2011.

  • Max Intervals from Diagnosis: The user can specify the maximum number of intervals from diagnosis to select a subset of the input data for the joinpoint regression model. For example, the data may contain intervals of 1 through 15 years after diagnosis. If the user is only interested in characterizing survival trends up to 5 years, 5 can be selected and the models will be run using survival up to 5 years.

  • Cohort Selection: The user selects the desired cohorts from a menu of variables, e.g. cancer site, sex, stage, etc. If multiple cohorts are selected, computation time may be long, and an e-mail address is required so that a notification can be sent when results are available.

  • Maximum Number of Joinpoints: The user can specify the maximum number of joinpoints to be tested. JPSurv will fit models with 0 up to the maximum number of joinpoints. See below for model selection criteria. For 3 or more joinpoints the calculations may be slow, and an e-mail address is required so that a notification can be sent when results are available.

  • Advanced Options:
    • Delete Last Interval: The last follow-up interval can be deleted if its data are unstable.
    • Minimum Number of Years between Joinpoints (Excluding Joinpoints): If x is selected, joinpoints will be at least x years apart. Default value is 2.
    • Minimum Number of Years before First Joinpoint (Excluding Joinpoint): If x is selected, the first joinpoint can be located at the (x+1)th or a later calendar year. Default value is 3.
    • Minimum Number of Years after Last Joinpoint (Excluding Joinpoint): If x is selected, the last joinpoint can be located at the (x+1)th calendar year counting back from the last calendar year, or earlier. Default value is 5.
    • Number of Calendar Years of Projected Survival: Specifies the calculation of projected survival up to x years from the last calendar year. Default value is 5.
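
The combined effect of the three spacing options can be sketched by enumerating the joinpoint locations they leave available. The Python example below reflects one reading of the descriptions above: each minimum counts the years strictly before, between, or after the joinpoints. The function name is hypothetical, and JPSurv's actual search may be implemented differently.

```python
from itertools import combinations

def admissible_joinpoints(years, n_joinpoints,
                          min_before=3, min_between=2, min_after=5):
    """List every set of joinpoint years allowed by the spacing constraints.

    Assumed interpretation: each minimum counts calendar years strictly
    before the first joinpoint, strictly between consecutive joinpoints,
    and strictly after the last joinpoint.
    """
    first_allowed = min_before                 # 0-based position in `years`
    last_allowed = len(years) - 1 - min_after
    positions = range(first_allowed, last_allowed + 1)
    allowed = []
    for combo in combinations(positions, n_joinpoints):
        gaps_ok = all(b - a > min_between for a, b in zip(combo, combo[1:]))
        if gaps_ok:
            allowed.append(tuple(years[i] for i in combo))
    return allowed

# Diagnosis years 2002-2011 with the default option values.
years = list(range(2002, 2012))
one_joinpoint = admissible_joinpoints(years, 1)
two_joinpoints = admissible_joinpoints(years, 2)
```

With only 10 diagnosis years and the default values, this reading leaves very few candidate locations for a single joinpoint and none for two, which shows why a short year range limits the number of joinpoints that can usefully be requested.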

Output

    Export: The user can export the cohort, model specification and results to a workspace file. The default exported workspace filename is the same as the input data filename with .jpsurv extension.

    Model Selection: JPSurv selects the best-fitting model as the one with the minimum Bayesian Information Criterion (BIC). The Akaike Information Criterion (AIC) is also provided; it tends to pick models with a higher number of joinpoints. Graphs and other output features are also available for the other fitted models, not only the final model.
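
The difference between the two criteria can be illustrated with their textbook definitions. In the Python sketch below, the log-likelihoods, parameter counts, and number of observations are made-up values, and the way JPSurv counts parameters and observations is not specified here, so treat this as an assumption-laden example rather than a reproduction of the software's output.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion (textbook form)."""
    return 2.0 * n_params - 2.0 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion (textbook form)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def select_model(fits, n_obs, criterion="bic"):
    """Return the number of joinpoints whose model minimizes the criterion.

    `fits` maps number of joinpoints -> (log-likelihood, parameter count).
    """
    scores = {}
    for n_joinpoints, (log_likelihood, n_params) in fits.items():
        if criterion == "bic":
            scores[n_joinpoints] = bic(log_likelihood, n_params, n_obs)
        else:
            scores[n_joinpoints] = aic(log_likelihood, n_params)
    return min(scores, key=scores.get)

# Made-up fits for models with 0, 1, and 2 joinpoints.
example_fits = {0: (-1050.0, 6), 1: (-1040.0, 8), 2: (-1037.0, 10)}
best_by_bic = select_model(example_fits, n_obs=200, criterion="bic")
best_by_aic = select_model(example_fits, n_obs=200, criterion="aic")
```

In this made-up example the BIC penalizes the extra parameters of the 2-joinpoint model more heavily than the AIC does, so the two criteria select different models.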


Graph/Trend Measures

There are 3 types of graphs and 2 survival trend measures, as specified below. Graphs display predicted (modeled) and observed (data) survival or interval probabilities of death for each joinpoint model and cohort. The default is the final selected model; however, the user can select other fitted models. The user can check "Show Trend Measures" and click Recalculate to display the trend summary measures. All plots in JPSurv are produced with the ggplot2 package [3] developed for the R environment.

  1. Survival vs. Year of Diagnosis Graph: The user can select 1 or more values of interval years, e.g. 5-year or 10-year survival and produce the trend graph over all available years of diagnosis. The default is 5-year survival.
    • Trend measure: Average Absolute Change in Survival by Diagnosis Year: The numbers represent the average absolute difference in survival (either relative survival or cause-specific survival) for people diagnosed in one calendar year compared to the prior year. For example, an average absolute change of 1.0 in 5-year survival from 2000 to 2009 means that survival increased on average by 1.0 survival points each year, approximately 10.0 survival points in 10 years. This trend measure depends on calendar year, and the average over calendar years is reported. It also depends on the time since diagnosis selected by the user.

  2. Death vs. Year of Diagnosis Graph: The user can select 1 or more values of interval years, e.g. the 5-year or 10-year interval probability of death, and produce the trend graph over all available years of diagnosis. The default is the 5-year interval probability of death, which is the probability of dying of cancer between the 4th and 5th year from diagnosis, given that the person was alive at the end of the 4th year.
    • Trend measure: Percent Change in the Interval Probability of Dying of Cancer by Diagnosis Year: The numbers represent the percent change in the interval probability of dying of cancer for people diagnosed in one calendar year compared to the prior year. For example, a percent change of -1.0% in the interval probability of dying of cancer from 2000 to 2009 means that the probability of dying of cancer is decreasing by 1.0% each year. Because the model is a proportional hazard (probability of cancer death) model, this trend measure is independent of time since diagnosis, so it is the same for the probability of dying in any interval (e.g. 0 to 1, 1 to 2, ... years since diagnosis).

  3. Survival vs. Time Since Diagnosis Graph: The user can select 1 or more calendar years, e.g. 1990 and 2000, and show modeled vs. observed survival by years since diagnosis.
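
The two trend measures described under graphs 1 and 2 can be sketched in Python as follows. The first function averages the year-to-year differences in predicted survival. The second assumes that the percent change is derived from the slope of a linear segment on the log hazard scale as 100·(exp(slope) − 1). Both function names and the input values are hypothetical, and the second formula is an assumption about how the measure is computed.

```python
import math

def average_absolute_change(survival_by_year):
    """Mean of the differences in survival between consecutive diagnosis years."""
    diffs = [b - a for a, b in zip(survival_by_year, survival_by_year[1:])]
    return sum(diffs) / len(diffs)

def annual_percent_change(log_hazard_slope):
    """Percent change per diagnosis year implied by a slope on the log scale."""
    return 100.0 * (math.exp(log_hazard_slope) - 1.0)

# Hypothetical predicted 5-year survival (%) for four consecutive diagnosis years.
change = average_absolute_change([60.0, 61.0, 62.5, 63.0])

# A slope of log(0.99) per year corresponds to a 1% annual decrease.
percent = annual_percent_change(math.log(0.99))
```

The second measure depends only on the slope of the segment, not on the follow-up interval, which matches the statement that it is the same for every interval since diagnosis.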

Show Trend Measures on Graph: For some of the graphs, the user can display the trend measures by selecting the checkbox "Show Trend Measures." The annotation feature is only available when there are 3 or fewer intervals and for models with 3 or fewer joinpoints.

Model Estimates: Displays the number and location of joinpoints, the parameter estimates, and standard errors.

Download Full Dataset: Provides the data, survival, probability of cancer death estimates and standard errors for the full data.

Download Graph Dataset: Provides the graph data and estimates so that users can reproduce the graphs in other software.

  1. Yu BB, Huang L, Tiwari RC, Feuer EJ, Johnson KA. Modelling population-based cancer survival trends by using join point models for grouped survival data. Journal of the Royal Statistical Society: Series A (Statistics in Society). 2009;172:405-25.
  2. Hakulinen T, Tenkanen L. Regression Analysis of Relative Survival Rates. Applied Statistics. 1987;36(3):309-17.
  3. H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2009.