Interim guidelines – choosing which mortality data source to use (2023 update)
This cancer commentary is the same as that released in 2022, except that:
- the recommended data sources for cancers new to the Cancer data in Australia report (CdiA) have been added to Appendix A.
- the recommended data source has changed for several cancers (Appendix A).
The recommended data source may change between releases of the CdiA as new data may change Australian Cancer Database and National Mortality Database comparability. Please note that the 2023 release of CdiA includes a page dedicated to describing work being undertaken related to the cancer mortality project.
Cancer data commentary 8b
Previous releases of the Cancer data in Australia report (CdiA) utilised cancer mortality statistics sourced from the National Mortality Database (NMD). The 2022 release of CdiA provides users with two different sources of cancer mortality statistics (the Australian Cancer Database (ACD) and the NMD). Please read cancer data commentary number 8 for more information about the different cancer mortality sources and why the respective statistics may differ.
General advice to help people select the data source that would best meet their needs was provided within the initial release of the 2022 CdiA. This commentary provides users with more direct assistance in selecting the most appropriate data source and outlines which data source AIHW would generally recommend for each specific cancer and reporting period.
AIHW’s recommendations are provided to help users who may want more direct advice on which cancer mortality data source to choose for each cancer. However, the recommendations in this commentary should not be taken to be prescriptive or definitive. Different analyses may lead to different recommendations. Ultimately, users of the data will need to decide which mortality data source to use, taking into consideration their specific investigation or reporting needs.
At the time of the 2022 release of the CdiA, AIHW’s cancer mortality investigations were in a preliminary stage. Given that the preliminary findings indicated that the NMD may not be as appropriate for reporting mortality for certain cancers, the 2022 release of the CdiA also included ACD mortality data. This allowed users to consider the suitability of the NMD for their specific reporting needs and offered alternative data where those needs were not met.
It is anticipated that cancer mortality investigations will have progressed much further by the time the 2023 CdiA is released. These guidelines are provided to help users while cancer mortality investigations remain ongoing; it is possible that these guidelines will change or no longer be required after cancer mortality investigations have been completed.
Where the user is focusing on cancer mortality solely for the periods where ACD mortality data is available (currently 2007 to 2017), AIHW recommends using mortality data from the ACD.
AIHW considers that mortality data from the ACD will generally be based on access to a greater depth of information, which is likely to lead to more precise cancer mortality reporting. Accordingly, where ACD actual mortality data is available (2007 to 2017) and this period meets the user’s analysis or reporting needs, the ACD is recommended.
An exception is all cancers combined, for which the NMD is more complete and includes deaths from basal and squamous cell non-melanoma skin cancer (which are excluded from the ACD).
Where the user is focused on reporting that includes more recent years, mortality data from the NMD is recommended for cancers where the NMD is sufficiently close to the ACD and the ACD is recommended in most other instances. Appendix A provides a full list of cancers reported within CdiA and which data source is recommended in each case.
While AIHW considers the ACD to generally provide more precise cancer mortality reporting, the ACD currently has a relatively limited time series and the NMD has more recent mortality data. The recency of NMD data means that 2018, 2019 and 2020 are actual data within the NMD, whereas only projected data is possible from the ACD for these years. In general, actual data are recommended over projections if both are available.
The ACD and NMD mortality data for 2021 and 2022 are both projected. The more distant a projection is from the last year of actual data, the less reliable a projection will generally be (where both use the same method, and that method uses cancer mortality trend information as its basis to derive mortality rates and counts).
With its recency and extended time series, the NMD cancer statistics will offer stronger reporting for 2018 to 2022 where the NMD actual mortality statistics appear sufficiently representative for the selected cancer. The key consideration is: ‘does the ACD mortality time series provide sufficient confidence that the NMD is providing appropriate mortality statistics for the selected cancer?’
The Cancer Data and Monitoring Unit (CDMU) of AIHW has considered which cancers within the NMD it believes are sufficiently close to be recommended for continued use for cancer mortality reporting, as well as those where the ACD may be recommended or preferred (Appendix A provides the specific recommendations for each cancer).
Methods used to consider whether NMD mortality data is sufficiently close to the ACD mortality data are described later in this commentary.
Where the user is focussing on longer term cancer mortality reporting, the NMD is recommended if it is sufficiently representative of mortality counts and rates for the selected cancer. The recommendations in Appendix A help identify these cancers.
Unlike recommendation 2 where the user has a choice of which mortality source to use, at present there is no pre-2007 mortality data using the ACD. Where the NMD is recommended for continued use for a particular cancer, the NMD data may be more comfortably used for pre-2007 cancer mortality reporting.
For cancers where the ACD is preferred or recommended, users will need to consider whether the NMD longer-term reporting data can be reliably used.
As mentioned in Recommendation 1, the NMD is recommended for reporting of the ‘all cancers combined’ reporting group. The recommendation is primarily based on the completeness of the data but there are tangential benefits of note when the NMD’s broader use is considered.
While the Cancer data in Australia report uses the NMD to report on cancer deaths, the NMD contains coded causes for all deaths. Where a study compares cancer mortality rates (that is, for all cancers combined) with other causes of death recorded in the NMD, the NMD will be more appropriate, not only because it is the recommended source but also because the comparison data is obtained from the same source and will therefore have a greater level of coherence than if the ACD were used.
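The recommendations above can be read as a rough decision rule. The sketch below is one possible reading only, not an AIHW tool: the function `suggest_source`, the `Category` enum and the `'user judgement'` outcome are hypothetical names introduced for illustration, and the rule ignores ‘Not assessed’ cancers and cancers available from a single source.

```python
from enum import Enum

# Years for which actual ACD mortality data is available, per this commentary.
ACD_FIRST_YEAR, ACD_LAST_YEAR = 2007, 2017


class Category(Enum):
    """Appendix A categories (hypothetical encoding for this sketch)."""
    ACD_RECOMMENDED = "ACD recommended"
    ACD_PREFERRED = "ACD preferred"
    NMD_CONTINUED_USE = "NMD continued use"


def suggest_source(cancer, category, first_year, last_year):
    """Hypothetical helper: return 'ACD', 'NMD' or 'user judgement'."""
    if first_year > last_year:
        raise ValueError("first_year must not be later than last_year")
    # All cancers combined: the NMD is more complete.
    if cancer.lower() == "all cancers combined":
        return "NMD"
    # Period covered entirely by actual ACD data (2007 to 2017).
    if ACD_FIRST_YEAR <= first_year and last_year <= ACD_LAST_YEAR:
        return "ACD"
    # Recent or longer-term reporting: NMD where it is sufficiently close.
    if category is Category.NMD_CONTINUED_USE:
        return "NMD"
    # No pre-2007 ACD data exists, so the user must weigh up the NMD series.
    if first_year < ACD_FIRST_YEAR:
        return "user judgement"
    # Recent years for ACD recommended or ACD preferred cancers.
    return "ACD"


print(suggest_source("Breast cancer", Category.NMD_CONTINUED_USE, 2010, 2021))
print(suggest_source("Colon cancer", Category.ACD_RECOMMENDED, 2010, 2015))
```

The `'user judgement'` outcome reflects that, for ACD recommended or ACD preferred cancers, longer-term reporting has no ACD alternative before 2007, so the choice cannot be reduced to a fixed rule.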
As discussed earlier, these interim guidelines are provided to assist data users. However, these guidelines cannot take the specific needs of individual users into account. For instance, consider an investigation about liver cancer mortality over the last 40 years in Australia. For liver cancer mortality reporting, the ACD is recommended (ACD and NMD mortality counts and trends are shown in Figure 1).
The CdiA reports that liver cancer incidence rates continue to rise over time and 5-year survival remains relatively low. The NMD provides the only source of longer-term national historical mortality reporting for this cancer. Given these trends, the increasing mortality rates reported from the NMD are to be expected. The general trend of ACD liver cancer mortality also supports the general trend of NMD liver cancer mortality.
Even though the ACD is recommended, the NMD could be considered the best data source for the reporting of liver cancer mortality over time. However, in using this data, it would be important to note that the liver cancer mortality rates presented may be overstated to some degree (approximately 25% higher than the ACD between 2007 and 2017).
Figure 1: Liver cancer deaths by mortality data source, persons, 1982 to 2022
- Actual data from the NMD is provided from 1982 to 2020 and projected data is from 2021 to 2022.
- Actual data from the ACD is provided from 2007 to 2017 and projected data is from 2018 to 2022.
Source: AIHW Australian Cancer Database 2018, AIHW National Mortality Database.
Reliability over consistency
Where an organisation undertakes many cancer mortality investigations, it may not be possible to report cancer mortality using a consistent data source. As a simple example, an organisation reports on a cancer for the 2015 year and uses the ACD as recommended. That same organisation needs to report on the same cancer using a longer time series in a different series of investigations and uses the NMD as it is sufficiently close to the ACD (which cannot be used because the ACD does not have a long enough time series). The cancer mortality information released by the organisation will therefore be inconsistent for the 2015 year. However, both sets of information released aim to provide the most informative and reliable cancer mortality information currently available.
Recommendations apply to all ages and persons
The analysis of which data source to use for the various cancers was based largely on how closely the mortality statistics align between the ACD and the NMD for each of the selected cancers. For simplicity, the recommendation of which data source to use does not change with age or sex. For cancers that occur in only one sex, the analysis for persons produces the same results as it would for the sex in which the cancer occurs.
A cancer may be categorised as ACD preferred or ACD recommended when reporting total mortality, even though the NMD data and ACD data closely align for some age groups (most likely younger ones). Kidney cancer mortality provides a useful example where the comparability of the NMD and ACD reduces with age (this data may be viewed in the Cancer mortality by age data visualisation).
Reporting mortality for multiple cancers
This commentary offers recommendations on which data source to choose for selected cancers. Complexities increase where data users wish to report on numerous cancers and the recommended sources differ. Colorectal cancer mortality by data source is used below to illustrate and discuss the complexities.
|Cancer site||National Mortality Database deaths||Australian Cancer Database deaths||Recommended data source|
|Colon cancer||1,175||3,450||3,450 (ACD)|
|Rectal cancer||3,146||1,704||1,704 (ACD)|
|Colorectal cancer||5,326||5,154||5,326 (NMD)|
- Data are projections.
- Colon and rectal cancer are coded under the ICD-10 as C18 and C19–C20, respectively. Colorectal cancer (C18–C20) deaths from the National Mortality Database include C26.0.
Sources: AIHW Australian Cancer Database 2018, AIHW National Mortality Database.
In general, it would be better to use the mortality data source that is recommended for each specific cancer type when reporting on multiple cancers within a report. Here the data source selected will be either the ACD, or NMD data that appears sufficiently consistent with the ACD. As illustrated above, an issue arises in that the colon cancer deaths and rectal cancer deaths sourced from the ACD do not sum to the total colorectal cancer deaths sourced from the NMD. This can be addressed by including notes such as ‘Projected colorectal cancer deaths do not equal projected colon cancer deaths plus projected rectal cancer deaths because they are obtained from a different data source’.
Alternatively, the data user could choose to use the ACD only. However, the additional years of mortality data available in the NMD are likely to enable a more informed projection to be produced for the number of deaths due to colorectal cancer.
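The additivity issue can be checked directly from the figures in the table above. The short sketch below uses only those published projections; the dictionary layout and the helper name `reported` are illustrative assumptions.

```python
# Projected deaths copied from the colorectal table in this commentary.
deaths = {
    "Colon cancer": {"NMD": 1175, "ACD": 3450},
    "Rectal cancer": {"NMD": 3146, "ACD": 1704},
    "Colorectal cancer": {"NMD": 5326, "ACD": 5154},
}
# Recommended source for each site, as listed in the same table.
recommended = {
    "Colon cancer": "ACD",
    "Rectal cancer": "ACD",
    "Colorectal cancer": "NMD",
}


def reported(site, source=None):
    """Deaths for a site from the given source (default: recommended source)."""
    return deaths[site][source or recommended[site]]


# Option 1: use the recommended source for each cancer (mixed sources).
mixed_parts = reported("Colon cancer") + reported("Rectal cancer")
mixed_gap = reported("Colorectal cancer") - mixed_parts

# Option 2: use the ACD for all three reporting groups.
acd_parts = reported("Colon cancer", "ACD") + reported("Rectal cancer", "ACD")
acd_gap = reported("Colorectal cancer", "ACD") - acd_parts

print(f"Mixed sources: parts = {mixed_parts}, gap to total = {mixed_gap}")
print(f"ACD only:      parts = {acd_parts}, gap to total = {acd_gap}")
```

With mixed sources the components fall 172 deaths short of the colorectal total, which is why an explanatory note is needed; with the ACD alone the components sum exactly to the total.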
There are many different scenarios for reporting mortality for multiple cancers. This commentary only touches on the issue and offers some general guidance on which data source, at present, appears to best represent mortality in various cases.
Method Part A – ACD recommended cancers
When deciding which data source to use for reporting on a selected cancer, it is important for the user to consider whether the actual ACD and NMD counts and rates are sufficiently close. Where they are close, the ACD supports the NMD’s continued use. Methods A and B are used in determining the CDMU’s assessment of whether the two data sources are ‘sufficiently close’.
For method A, lines of best fit are created separately for the ACD and NMD trends between 2007 and 2017. Where the confidence intervals of these lines do not intersect for the majority of data points (that is, 6 or more of the 11 data points between 2007 and 2017), the two series are considered to differ. Where the majority of points in the time series are significantly different, the ACD is recommended. Appendix B provides greater detail of this method.
Method Part B – ACD preferred cancer
For the remaining cancers, heuristic models were generated to identify cancers for which the NMD results were considered to be, on average, too far from the ACD actual results for the NMD to be preferred for use. This additional and subjective process classified these cancers as ‘ACD preferred’, while the remaining assessed cancers were classified as ‘NMD continued use’.
NMD continued use
The NMD has been the source of cancer mortality within the CdiA since its initial release. ‘NMD continued use’ is the category used for the cancers where the NMD is sufficiently similar to the ACD and the NMD is recommended to continue to be used to report mortality for that cancer.
The CdiA includes statistics on several general and relatively unspecific reporting groups. An example of these is ‘Cancer of overlapping and unspecified sites of the biliary tract’. This cancer reporting group consists of ICD-10 codes C24.8 (malignant neoplasm of overlapping sites of biliary tract) and C24.9 (malignant neoplasm of biliary tract unspecified). For this and other similar types of cancer reporting groups, it was not assessed whether the NMD is sufficiently close to the ACD to recommend its continued use; these are only categorised as ‘Not assessed’.
The rationale for not assessing these is that there is very little expectation for the two sources to align. Unlike other cancers such as kidney or liver cancer, where the purpose of the datasets is to measure the number of deaths for the respective cancers, these groups quantify the number of cancers that could not be precisely coded to a specific cancer.
In general, these groups are more likely to be complementary data. For example, the overlapping and unspecified sites of the biliary tract mortality counts within the ACD effectively provide the number of deaths within the biliary tract that could not be coded to a more specific site when using the ACD. When this cancer reporting group is considered in conjunction with other cancers of the biliary tract such as the gallbladder, extrahepatic bile ducts and ampullary cancers, it provides a more complete picture of the total biliary tract cancer mortality. As a complementary cancer reporting item, it is most appropriately used with other reporting information from the same mortality data source.
Similarly, cancer of unknown primary site does not measure a specific cancer. It is expected that, with the additional information available to cancer registries for coding cause of death, the ACD is likely to have fewer deaths where the primary site is unknown. Similar to the non-specific cancer reporting sites, cancer of unknown primary site is likely to be of most use when used with data from the same source.
Histology based cancer reporting groups
The following cancers are all derived through histology data: ovarian cancer and serous carcinomas of the fallopian tube; other female genital organs excluding serous carcinomas of the fallopian tube; soft tissue sarcoma; all sarcomas combined; and neuroendocrine tumours. The ACD contains histology data while the NMD does not. Accordingly, for these cancers the ACD is the sole source for mortality reporting.
The ACD does not include cause of death by histology. Therefore, to obtain mortality data for these cancers, it is assumed that if the site of the cancer identified as the cause of death is the same as the site where the relevant cancer was diagnosed, it was the cause of death (for example, if a neuroendocrine tumour was diagnosed in topography C20, and C20 was the underlying cause of death, then the neuroendocrine cancer was the cause of death). However, it is possible that another type of cancer was also diagnosed in C20 and it was the cause of death. Accordingly, it is possible that the above-mentioned cancers are overstated to some degree. The 2022 release of the CdiA is the first occasion where these cancers have mortality figures released. It is expected that the method to derive cause of death for these cancers will be investigated further in the future with the aim to minimise any possible overstatement of deaths for these cancers.
Non-melanoma skin cancer
Non-melanoma skin cancer mortality statistics from the ACD exclude basal and squamous cell carcinomas of the skin (as these are not notifiable diseases). The basal and squamous cell carcinomas of the skin are the most common type of cancer in Australia and accordingly the ACD based cancer is named ‘non-melanoma skin cancer (rare types)’. As the sole data source for non-melanoma skin cancer (rare types), the ACD is recommended for reporting mortality for non-melanoma skin cancer excluding basal and squamous cell carcinomas.
The non-melanoma skin cancer mortality from the NMD includes basal and squamous cell carcinomas of the skin. Within the CdiA, it is known as ‘non-melanoma skin cancer (all types)’. The NMD is recommended for reporting of non-melanoma skin cancer mortality.
Differences between the ACD and NMD for non-melanoma skin cancer are conceptually due to the NMD including deaths from basal and squamous cell carcinomas of the skin. Further work will be done in the cancer mortality investigations to confirm whether the NMD mortality less the ACD mortality for this cancer is likely to reliably estimate deaths from basal and squamous cell carcinomas of the skin.
|Australian Cancer Database recommended||Australian Cancer Database preferred||National Mortality Database continued use|
|Acute myeloid leukaemia||Appendiceal cancer||Acute lymphoblastic leukaemia|
|Ampullary cancer||Bone cancer||All blood cancers combined|
|Anal cancer||Immunoproliferative cancers||All cancers combined|
|Chronic myeloid leukaemia (CML)||Kidney cancer||Bladder cancer|
|Chronic myelomonocytic leukaemia (including juvenile)||Major salivary glands (cancer of the)||Brain and other central nervous system (cancer of the)|
|Colon cancer||Myelodysplastic syndromes||Brain cancer|
|Connective, subcutaneous and other soft tissues (cancer of)||Myeloproliferative neoplasms||Breast cancer|
|Endometrial cancer||Other central nervous system cancers||Cervical cancer|
|Extrahepatic bile duct cancer||Other plasma cell cancers||Chronic lymphocytic leukaemia|
|Eye cancer||Sinuses cancer||Colorectal cancer|
|Gallbladder cancer and extrahepatic bile duct cancer||Submandibular gland cancer||Gynaecological cancers|
|Gallbladder cancer||Tongue cancer||Hodgkin lymphoma|
|Head and neck cancer (excluding lip)||Urethral cancer||Kaposi sarcoma|
|Head and neck cancer (including lip)|| ||Laryngeal cancer|
|Hypopharyngeal cancer|| ||Lung cancer|
|Lip cancer|| ||Melanoma of the skin|
|Mouth cancer|| ||Middle ear cancer|
|Nasal cavity cancer|| ||Multiple myeloma|
|Oesophageal cancer|| ||Myeloproliferative neoplasms (excluding CML)|
|Oral cancer|| ||Nasopharyngeal cancer|
|Other female genital organs (cancer of)|| ||Non-Hodgkin lymphoma|
|Parotid gland cancer|| ||Oropharyngeal cancer|
|Rectal cancer (excluding rectosigmoid junction)|| ||Other blood cancers|
|Rectal cancer (including rectosigmoid junction)|| ||Other endocrine glands (cancer of)|
|Rectosigmoid junction cancer|| ||Other male genital organs (cancer of)|
|Renal pelvis cancer|| ||Other thoracic and respiratory organs (cancer of)|
|Retroperitoneal and peritoneal cancer|| ||Ovarian cancer|
|Small intestine cancer|| ||Pancreatic cancer|
|Stomach cancer|| ||Penile cancer|
|Ureteral cancer|| ||Peripheral nerves and autonomic nervous system (cancer of the)|
|Sublingual gland cancer|
- Table excludes ‘Not assessed’ cancers: unknown primary site (cancer of), other and ill-defined digestive organs (cancer of), other and ill-defined sites (cancer of), other and ill-defined sites in the lip, oral cavity and pharynx (cancer of), other and unspecified leukaemia, other and unspecified lymphoid leukaemia, other and unspecified myeloid leukaemia, overlapping and unspecified sites in biliary tract (cancer of), overlapping and unspecified sites in major salivary glands (cancer of) and overlapping and unspecified sites in urinary tract (cancer of).
- Table excludes Australian Cancer Database only cancers (all sarcomas combined; neuroendocrine tumours; non-melanoma skin cancer (rare types); other female genital organs excluding serous carcinomas of the fallopian tube (cancer of); ovarian cancer and serous carcinomas of the fallopian tube; and soft tissue sarcoma) and the National Mortality Database only cancer, non-melanoma skin cancer (all types).
- Between the 2022 and 2023 releases of CdiA, cancers that changed recommended data source were kidney cancer (ACD recommended to ACD preferred), parotid gland cancer (ACD preferred to ACD recommended), immunoproliferative cancers (NMD continued use to ACD preferred) and vaginal cancer (ACD preferred to NMD continued use).
Significance testing for the difference between mortality rates of the National Mortality Database and Australian Cancer Database (2007 to 2017)
As part of CdiA 2022, two mortality data sources were considered for reporting. The continued use of mortality data from the National Mortality Database (NMD) was compared with the data available from the 2018 Australian Cancer Database (ACD). Both data sources can be used to extract the number of cancer deaths, but the information from which a cause of death is determined differs between the ACD and NMD. For this reason, death reporting is likely to differ, and the extent of the difference may vary across cancer groups. The purpose of this investigation was to quantify the level of difference in the observed number of cancer deaths from the ACD and the NMD for various cancer groups across time.
Data from the ACD were available for reporting from 2007 to 2017 and, hence, the same observation window was used for data from the NMD. Annual crude cancer death rates for persons and all ages combined were extracted and least-squares linear regression was used to find the straight line of best-fit through the data series for each source.
The aim of this investigation is to test whether the difference between the fitted rates for the two data sources is significantly different from zero.
For year t, from 2007 to 2017 inclusive, let:
R1 (t) = fitted rate for the model based on ACD source of death
R2 (t) = fitted rate for the model based on NMD source of death
s1 (t) = standard error for fitted rate R1 (t)
s2 (t) = standard error for fitted rate R2 (t)
A 95% confidence interval for the quantity R1 (t) – R2 (t) was created under the assumption that these two fitted rates are uncorrelated. Note that this is a slightly conservative confidence interval, as the mortality rates are very likely to be positively correlated because they are measuring the same quantity; that is, both data sources are capturing cancer deaths over the same period. We therefore consider the interval
$$\bigl(R_1(t) - R_2(t)\bigr) \pm t_{9,\,0.025}\,\sqrt{s_1(t)^2 + s_2(t)^2}$$
where t9, 0.025 is the upper 2.5 percentile of the t-distribution with 9 degrees of freedom. There are 11 values of t, namely one for each year from 2007 to 2017. R1 (t) and R2 (t) are significantly different for a given value of t if and only if the confidence interval given by the expression above does not contain zero. If we define N to be the number of times that R1 (t) and R2 (t) are significantly different, then N takes a value between 0 and 11. The higher the value of N, the more evidence there is that R1 (t) and R2 (t) are significantly different overall. When N was 6 or more (more than half), it was concluded that R1 (t) and R2 (t) were significantly different and we assigned the term 'ACD recommended'. When N was less than 6, the series were assessed using other methods as highlighted in Part B of the methods section.
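The procedure described in this appendix can be sketched in a few lines of Python. This is an illustrative reading rather than the CDMU's actual code. It assumes the usual ordinary least-squares standard error of a fitted value, and that the interval half-width is the critical value multiplied by the square root of s1 (t)² + s2 (t)² (which follows from the uncorrelated assumption). The rates are made-up numbers, and the function names are hypothetical.

```python
import math

# Upper 2.5 percentile of the t-distribution with 9 degrees of freedom.
T_CRIT_9DF = 2.262


def fit_line(years, rates):
    """Least-squares line: fitted rates and their standard errors by year."""
    n = len(years)
    x_bar = sum(years) / n
    y_bar = sum(rates) / n
    sxx = sum((x - x_bar) ** 2 for x in years)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, rates)) / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * x for x in years]
    # Residual variance on n - 2 degrees of freedom (11 - 2 = 9 here).
    sigma2 = sum((y - f) ** 2 for y, f in zip(rates, fitted)) / (n - 2)
    # Assumed formula: standard error of the fitted mean at each year.
    se = [math.sqrt(sigma2 * (1 / n + (x - x_bar) ** 2 / sxx)) for x in years]
    return fitted, se


def count_significant(years, acd_rates, nmd_rates):
    """N = number of years where the interval for R1(t) - R2(t) excludes zero."""
    r1, s1 = fit_line(years, acd_rates)
    r2, s2 = fit_line(years, nmd_rates)
    n_sig = 0
    for a, b, sa, sb in zip(r1, r2, s1, s2):
        half_width = T_CRIT_9DF * math.sqrt(sa ** 2 + sb ** 2)
        if abs(a - b) > half_width:
            n_sig += 1
    return n_sig


def classify(n_sig):
    """Apply the majority rule: 6 or more of the 11 years."""
    return "ACD recommended" if n_sig >= 6 else "assess with Part B"


# Made-up crude rates for illustration only (not AIHW data).
years = list(range(2007, 2018))
noise = [0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.02, 0.04, -0.05, 0.02, -0.03]
acd = [8.0 + 0.10 * i + e for i, e in enumerate(noise)]
nmd = [10.0 + 0.12 * i - e for i, e in enumerate(noise)]

n_far = count_significant(years, acd, nmd)
n_same = count_significant(years, acd, acd)
print(n_far, classify(n_far))
print(n_same, classify(n_same))
```

In the made-up example the two series sit about 2 deaths per 100,000 apart with little scatter, so every year is significantly different and the cancer would fall into 'ACD recommended'. Comparing a series with itself gives N = 0, which would pass the cancer on to the Part B assessment.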