Extract, Transform, Load (ETL) Process

The Know Your Tumor data is generated by Tempus. An automated process then extracts, transforms, and loads it into the SPARK platform, which is powered by Seven Bridges Genomics.

Know Your Tumor data updates are uploaded to the SPARK platform on a routine basis. Each update includes data for new patients as well as new data for existing patients. During each update, all prior data is replaced with the latest dataset so that the most accurate and complete data is available.

Clinical & Molecular Data

Clinical and molecular data are available as individual comma-separated values (CSV) files. All CSV files are ingested into the SPARK platform through an automated process that preserves a common patient identifier across all data sources. Files can be linked via this identifier: the clinical and molecular CSV files carry it in the ‘patient_id’ column, and the BAM, VCF, and FASTQ files include the patient ID in their filenames. This allows the end user to search, group, and download seamlessly across all Know Your Tumor data sets.
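
As an illustration of linking on the common patient identifier, the following is a minimal Python sketch that joins rows from a clinical CSV file and a molecular CSV file on ‘patient_id’. Only the ‘patient_id’ column name comes from the description above; every other column name, the sample values, and the helper names (read_rows, link_by_patient) are hypothetical and will differ from the actual Know Your Tumor files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data for illustration only. Only the 'patient_id'
# column is taken from the documentation; all other columns are made up.
CLINICAL_CSV = """patient_id,example_clinical_field
P001,value_a
P002,value_b
"""

MOLECULAR_CSV = """patient_id,example_molecular_field
P001,result_1
P001,result_2
P003,result_3
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def link_by_patient(clinical_rows, molecular_rows, key="patient_id"):
    """Attach all molecular rows to the clinical row with the same patient ID."""
    molecular_by_patient = defaultdict(list)
    for row in molecular_rows:
        molecular_by_patient[row[key]].append(row)

    linked = {}
    for row in clinical_rows:
        patient = row[key]
        linked[patient] = {
            "clinical": row,
            # Patients with no molecular rows get an empty list.
            "molecular": molecular_by_patient.get(patient, []),
        }
    return linked


linked = link_by_patient(read_rows(CLINICAL_CSV), read_rows(MOLECULAR_CSV))
for patient, record in linked.items():
    print(patient, len(record["molecular"]))
```

This sketch keeps only patients that appear in the clinical file (a left join); molecular rows for a patient with no clinical row are dropped. To read real files, replace the in-memory strings with files opened via open(path, newline="").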

Genomic Data

Genomic data, including the BAM, VCF, and FASTQ files for each case, can be downloaded from their original Tempus storage location on AWS.
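
Because the genomic filenames contain the patient ID, downloaded files can be grouped per patient once the list of IDs is known. The Python sketch below shows one way to do this. It assumes that the patient ID appears in the filename as a whole alphanumeric token separated from the rest of the name by characters such as ‘_’, ‘-’ or ‘.’; the actual Tempus naming scheme is not specified here, so the filenames, IDs, and the function name group_files_by_patient are hypothetical.

```python
import re
from collections import defaultdict


def group_files_by_patient(filenames, patient_ids):
    """Group filenames by the patient ID found in each name.

    Assumption: the ID is a whole alphanumeric token in the filename,
    so 'P01' does not accidentally match a file named 'P010_...'.
    """
    known_ids = set(patient_ids)
    grouped = defaultdict(list)
    for name in filenames:
        # Split on any run of non-alphanumeric characters.
        tokens = re.split(r"[^A-Za-z0-9]+", name)
        for token in tokens:
            if token in known_ids:
                grouped[token].append(name)
                break  # Use the first matching token only.
    return dict(grouped)


# Hypothetical filenames and IDs for illustration only.
files = [
    "P001_tumor.bam",
    "P001_tumor.vcf",
    "P010_R1.fastq.gz",
    "notes.txt",
]
grouped = group_files_by_patient(files, ["P001", "P010", "P01"])
for patient, names in grouped.items():
    print(patient, names)
```

Files whose names contain no known patient ID are left out of the result. If the real patient IDs contain separator characters themselves, the token-splitting rule would need to be adjusted.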