SEER-Medicare: About the Data Files

The SEER-Medicare data cover a large number of people, with many records per person. Given this volume, the term "SEER-Medicare data" refers not to one file but to a series of files. They are:

  • The SEER data file
  • The Medicare files, which summarize Medicare enrollment, specific healthcare services that occurred in different settings (e.g., hospitals, physician offices, outpatient clinics), and healthcare assessments (e.g., while residing in nursing homes or receiving home health care).
  • Housing Assistance Data
  • Ancillary files, which summarize characteristics of the included healthcare providers (individuals and institutions) and the geographical areas (e.g., zip codes and census tracts) in which the included Medicare beneficiaries live and/or receive care.

There are two cohorts of people included in the SEER-Medicare data: persons with a cancer diagnosis and a random sample of Medicare beneficiaries who do not have cancer. The "non-cancer" group is drawn from a random 5 percent sample of Medicare beneficiaries residing in the SEER areas. Persons in the 5 percent sample who also appear in the SEER data are removed from the non-cancer group, leaving a sample of persons with no known history of cancer. Medicare claims are available for the non-cancer group in the same format as for the persons with a cancer diagnosis. Information from the non-cancer group can be used for comparative purposes, for example to estimate the cost of care or the use of specific tests or procedures among a random sample of Medicare beneficiaries who do not have cancer. Data for the non-cancer group can also be combined with the data for the persons with a cancer diagnosis to conduct population-based analyses of testing, treatment, and costs within the SEER areas.
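
The exclusion step described above, removing persons in the 5 percent sample who also appear in the SEER data, can be pictured as a simple filter on a shared identifier. The sketch below is illustrative only: the field name `person_id`, the function name, and the toy records are assumptions made for the example, not the actual SEER-Medicare variable names or file layout.

```python
# Hypothetical field name; the real identifier variable is not named in this document.
ID_FIELD = "person_id"


def build_non_cancer_group(five_percent_sample, seer_records):
    """Return the sample records whose identifier does not appear in the SEER data."""
    # Collect every identifier present in the (toy) SEER file.
    seer_ids = {rec[ID_FIELD] for rec in seer_records}
    # Keep only sample members with no matching SEER record.
    return [rec for rec in five_percent_sample if rec[ID_FIELD] not in seer_ids]


# Toy data, invented purely for illustration.
seer_records = [{"person_id": "P002"}, {"person_id": "P004"}]
five_percent_sample = [
    {"person_id": "P001"},
    {"person_id": "P002"},  # also in SEER, so excluded from the non-cancer group
    {"person_id": "P003"},
]

non_cancer_group = build_non_cancer_group(five_percent_sample, seer_records)
print([rec[ID_FIELD] for rec in non_cancer_group])
```

In this toy example only the sample members without a matching SEER record remain, which mirrors the "no known history of cancer" definition given above.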

For persons in the cancer and non-cancer groups, investigators can link the files using a unique identifier that has been assigned to each individual.
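
As a rough illustration of linking on a per-person identifier, the sketch below groups claim records by identifier and attaches them to each person record. The field names (`person_id`, `group`, `setting`), the function name, and the sample records are hypothetical and chosen only for the example; the actual files and variable names are described in the SEER-Medicare documentation.

```python
from collections import defaultdict

# Hypothetical field name; the real identifier variable is not named in this document.
ID_FIELD = "person_id"


def link_claims_to_persons(persons, claims):
    """Attach each person's claim records using the shared identifier."""
    # Index the claims by identifier so each person lookup is a single dict access.
    claims_by_id = defaultdict(list)
    for claim in claims:
        claims_by_id[claim[ID_FIELD]].append(claim)
    # Persons with no claims get an empty list rather than being dropped.
    return {
        person[ID_FIELD]: {
            "person": person,
            "claims": claims_by_id.get(person[ID_FIELD], []),
        }
        for person in persons
    }


# Toy data, invented purely for illustration.
persons = [
    {"person_id": "P001", "group": "cancer"},
    {"person_id": "P003", "group": "non-cancer"},
]
claims = [
    {"person_id": "P001", "setting": "hospital"},
    {"person_id": "P001", "setting": "physician office"},
]

linked = link_claims_to_persons(persons, claims)
print(len(linked["P001"]["claims"]), len(linked["P003"]["claims"]))
```

Keeping persons with no claims in the result (with an empty list) is a design choice for this sketch; an analysis that needs only persons with at least one claim would filter those entries out.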

Last Updated: 27 Sep, 2023