Molecular Data Guide

The Know Your Tumor Molecular Data Guide gives SPARK end users descriptions of, and guidance on, the molecular and genomic data tables. All of the files listed below are available in .csv format.

Note: This data guide assumes baseline knowledge of pancreatic cancer and of genomic and clinical data. It does not cover basic descriptions or explanations.

| Data Table Name | Description |
| --- | --- |
| onco_result_cnv_chr_segment | This table contains Copy Number Variation (CNV) segment data derived from genomic analysis of both solid and liquid assays. Each row represents a unique CNV segment, defined as a contiguous region of the genome with a consistent copy number state, identified within a specific chromosome and analysis. |
| onco_result_cnv_gene | This table stores data for Copy Number Variants (CNVs) annotated at the gene level. |
| onco_result_hla_genotyping_passing | This table stores HLA genotyping results. For each analyzed HLA gene, the two alleles are ordered alphanumerically by allele name and stored in hla_allele_a_two_field and hla_allele_b_two_field, so that _a holds the first and _b the second (e.g., DRB1*04:92 and DRB1*11:04, respectively). This standardized naming convention (_a, _b) is applied across the model. |
| onco_result_hla_loh_passing | This table contains harmonized HLA Loss of Heterozygosity (LOH) results for each analyzed HLA gene. The hla_allele_a_two_field and hla_allele_b_two_field columns are ordered alphanumerically by allele name, so that _a holds the first and _b the second (e.g., DRB1*04:92 and DRB1*11:04, respectively). This standardized naming convention is applied across the model. |
| onco_result_ihc_passing | This table stores immunohistochemistry (IHC) data generated at Tempus that have passed quality control filtering. |
| onco_result_immune_infiltration | This table contains immune infiltration results for tumor samples from solid assays, including the percentage of immune cells within an isolate, as well as the percentages of specific immune cell types such as B lymphocytes, CD4+ T lymphocytes, CD8+ T lymphocytes, natural killer (NK) cells, and macrophages. |
| onco_result_microorganism_passing | This table stores passing microorganism calls that meet quality criteria. It focuses on the most specific taxonomic level available: at least species_name is populated, preferably accompanied by species_subtype_name. |
| onco_result_msi_annotated | This table combines all other sources used for the main Microsatellite Instability (MSI) concept (including reported and annotated tables) and is intended to provide an easy-to-follow, layered data flow containing all available data. The values are joined to the reported results on analysis_id and MSI value. |
| onco_result_neoepitope_passing | This table contains high-quality neoepitope predictions that have passed quality filters. Each row represents a unique analysis of a chr_pos_ref_alt, including predicted binding affinities for HLA-A, HLA-B, and HLA-C alleles for the resulting target peptide. |
| onco_result_overview_gene | This table contains gene-level results for all targeted genes of a given Tempus Next Generation Sequencing assay, as defined in the onco_assay_gene_variant_type reference table. Each row indicates whether a gene was identified as significant for a given assay analysis and, if so, which variant type led to that determination. |
| onco_result_rna_gene | This table stores gene-level RNA expression calls that use DNA-normalized gene identifiers through gene ID mapping. |
| onco_result_rna_transcript | This table contains RNA transcript expression results. |
| onco_result_snv_indel_passing | This table stores short variants (Single Nucleotide Variants (SNVs) and indels) that passed filtering in the bioinformatics pipeline. Short variants are genomic alterations denoted in the 'chr, pos, ref, alt' format. Variants > 1000 bp are trimmed (both ref and alt are trimmed to 1 kb). "Passing" variants pass both the low-quality filters (defined by the Tempus variant calling pipelines) and the low-evidence filters. Low-evidence variants are non-low-quality variants supported by < 3 reads (tumor_reads if somatic, normal_reads if germline). Normal and tumor results exist on the same row for a given variant when match_type = "match". |
| onco_result_snv_indel_passing_filtered | This table contains short variant calls (Single Nucleotide Variants (SNVs) and indels) that have passed all research flags, reported with standardized gene names, annotations, and reported status (analysis, sample, patient). Short variants are genomic alterations denoted in the 'chr, pos, ref, alt' format. Variants > 1000 bp are trimmed (both ref and alt are trimmed to 1 kb). "Passing" variants pass both the low-quality filters (defined by the Tempus variant calling pipelines) and the low-evidence filters. Low-evidence variants are non-low-quality variants supported by < 3 reads (tumor_reads if somatic, normal_reads if germline). Normal and tumor results exist on the same row for a given variant when match_type = "match". |
| onco_result_sv_breakpoint_annotated | This table contains all Structural Variant (SV) calls at the breakpoint level, with reported flags applied. A breakpoint is defined as a location in the genome identified by an SV caller as one end of an SV event. |
| onco_result_tmb_annotated | This table contains values for Tumor Mutation Burden (TMB), calculated as the number of variants per megabase. |
| onco_temporal_breast_receptors | This table stores abbreviated breast cancer-related hormone receptor (HR) result rollups. |
| onco_temporal_receptor_hr_her2 | This table stores test results for hormone receptors (ER, PGR) and HER2, collapsed by collection date or, if the collection date is unknown, by result date. The table contains all ‘valid’ cellular results (positive, negative, equivocal, low, and conflict) for ER, PR, and HER2 from the oncoreported_third_party_overview_marker (RTPOM) table. Furthermore, this table enables clear data lineage by providing columns that trace upstream to RTPOM (*_onco_reported_third_party_overview_marker_id) and downstream to enriched table columns (is_receptor_status_enriched*). |
| onco_tumor_liquid_fraction_ensemble | This table contains circulating tumor DNA (ctDNA) fraction estimates for liquid assays such as xF.v2, restricted to the ensemble method of the Tempus ctDNA estimation pipeline. This method combines the results of the other methods to create a more robust estimate of the ctDNA fraction. |
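
The read-support rule described for onco_result_snv_indel_passing can be illustrated with a short sketch. This is not the Tempus pipeline. The 3-read threshold and the tumor_reads and normal_reads columns come from the table description, while the variant_origin column, its values, and the sample rows are hypothetical and used only for illustration.

```python
import csv
import io

# From the table description: variants with < 3 supporting reads are low evidence.
MIN_SUPPORTING_READS = 3


def has_enough_evidence(row: dict) -> bool:
    """Return True when the variant has at least 3 supporting reads.

    Somatic variants are judged on tumor_reads, germline variants on
    normal_reads. `variant_origin` is a hypothetical column name.
    """
    column = "tumor_reads" if row["variant_origin"] == "somatic" else "normal_reads"
    # Treat an empty cell as zero supporting reads.
    return int(row[column] or 0) >= MIN_SUPPORTING_READS


# Made-up rows standing in for a .csv export; positions are not real variants.
sample_csv = """chr,pos,ref,alt,variant_origin,tumor_reads,normal_reads
chr12,100,C,T,somatic,42,0
chr17,200,G,A,somatic,2,0
chr13,300,A,G,germline,,15
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
kept = [r for r in rows if has_enough_evidence(r)]
```

Rows in the delivered passing tables have already been through this kind of filtering, so the sketch is mainly useful for understanding why a variant with low read support is absent.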
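
The _a/_b allele ordering used by the HLA tables can be sketched as below. The example assumes that "alphanumerically ordered" means a plain string sort of the two-field allele names; the actual ordering logic in the data model may differ (for example, in how numeric fields of different lengths are compared). The function name is hypothetical.

```python
def order_allele_pair(allele_1: str, allele_2: str) -> tuple:
    """Return (allele_a, allele_b), sorted with a plain string comparison.

    Assumption: a simple lexicographic sort stands in for the
    "alphanumeric" ordering described in the guide.
    """
    allele_a, allele_b = sorted([allele_1, allele_2])
    return allele_a, allele_b


# The example alleles from the table description, given in reverse order.
pair = order_allele_pair("DRB1*11:04", "DRB1*04:92")
```

Because the ordering is by name only, _a and _b carry no biological meaning such as maternal or paternal origin.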
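
The TMB definition (number of variants per megabase) amounts to a single division. The sketch below assumes the caller already knows which variants count toward TMB and the size of the sequenced region in base pairs; neither is specified in this guide, so the function name and the input numbers are purely illustrative.

```python
BASES_PER_MEGABASE = 1_000_000


def tumor_mutation_burden(variant_count: int, region_size_bp: int) -> float:
    """Return variants per megabase for a region of `region_size_bp` bases."""
    if region_size_bp <= 0:
        raise ValueError("region_size_bp must be positive")
    return variant_count / (region_size_bp / BASES_PER_MEGABASE)


# Illustrative numbers only: 18 counted variants over a 1.5 Mb region.
tmb = tumor_mutation_burden(variant_count=18, region_size_bp=1_500_000)
```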
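
The date fallback described for onco_temporal_receptor_hr_her2 (collection date, or result date when the collection date is unknown) can be sketched as a grouping key. All field names and records here are hypothetical, and the sketch only groups results; how the table resolves multiple results on the same date is not described in this guide and is not modeled.

```python
from collections import defaultdict
from typing import Optional


def effective_date(record: dict) -> Optional[str]:
    """Prefer the collection date; fall back to the result date if it is missing."""
    return record.get("collection_date") or record.get("result_date")


# Made-up records with hypothetical field names.
records = [
    {"marker": "ER", "result": "positive",
     "collection_date": "2021-03-01", "result_date": "2021-03-05"},
    {"marker": "ER", "result": "positive",
     "collection_date": None, "result_date": "2021-03-01"},
    {"marker": "HER2", "result": "negative",
     "collection_date": None, "result_date": "2021-04-10"},
]

# Group results by marker and effective date.
grouped = defaultdict(list)
for record in records:
    grouped[(record["marker"], effective_date(record))].append(record["result"])
```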