This vignette aims to help developers migrate from the now-defunct cgdsr CRAN
package. Note that the cgdsr package code is shown for comparison only; it is
not guaranteed to work. If you have questions regarding the contents, please
create an issue at the GitHub repository:
https://github.com/waldronlab/cBioPortalData/issues
library(cBioPortalData)
Here we show the default inputs to the cBioPortal function for clarity.
cbio <- cBioPortal(
    hostname = "www.cbioportal.org",
    protocol = "https",
    api. = "/api/v2/api-docs"
)
getStudies(cbio)
## # A tibble: 399 × 13
## name description publicStudy pmid citation groups status importDate
## <chr> <chr> <lgl> <chr> <chr> <chr> <int> <chr>
## 1 Adenoid Cyst… Whole exom… TRUE 2609… Martelo… ACYC;… 0 2023-12-0…
## 2 Adenoid Cyst… Whole-exom… TRUE 2368… Ho et a… ACYC;… 0 2023-12-0…
## 3 Adenoid Cyst… Targeted S… TRUE 2441… Ross et… ACYC;… 0 2023-12-0…
## 4 Adenoid Cyst… Whole-geno… TRUE 2686… Rettig … ACYC;… 0 2023-12-0…
## 5 Adenoid Cyst… WGS of 21 … TRUE 2663… Mitani … ACYC;… 0 2023-12-0…
## 6 Adenoid Cyst… Whole-geno… TRUE 2682… Drier e… ACYC 0 2023-12-0…
## 7 Adenoid Cyst… Whole exom… TRUE 2377… Stephen… ACYC;… 0 2023-12-0…
## 8 Adenoid Cyst… Multi-Inst… TRUE 3148… Allen e… ACYC;… 0 2023-12-0…
## 9 Basal Cell C… Whole-exom… TRUE 2695… Bonilla… PUBLIC 0 2023-12-0…
## 10 Acute Lympho… Comprehens… TRUE 2573… Anderss… PUBLIC 0 2023-12-0…
## # ℹ 389 more rows
## # ℹ 5 more variables: allSampleCount <int>, readPermission <lgl>,
## # studyId <chr>, cancerTypeId <chr>, referenceGenome <chr>
Note that the studyId column is important for further queries, which take a
study identifier as input.
head(getStudies(cbio)[["studyId"]])
## [1] "acbc_mskcc_2015" "acyc_mskcc_2013" "acyc_fmi_2014" "acyc_jhu_2016"
## [5] "acyc_mda_2015" "acyc_mgh_2016"
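Since studyId values are plain character strings, ordinary base R pattern matching is one way to narrow them down before querying. The sketch below reuses the six identifiers printed above; the "acyc" prefix is only an illustrative search term, and in practice the vector would come from getStudies(cbio)[["studyId"]].

```r
# Study identifiers copied from the head(...) output above.
study_ids <- c(
    "acbc_mskcc_2015", "acyc_mskcc_2013", "acyc_fmi_2014",
    "acyc_jhu_2016", "acyc_mda_2015", "acyc_mgh_2016"
)

# Keep only identifiers that start with the (illustrative) prefix "acyc".
acyc_ids <- grep("^acyc", study_ids, value = TRUE)
acyc_ids
```

Any of the retained identifiers can then be passed as the studyId argument in the queries that follow.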
library(cgdsr)
cgds <- CGDS("http://www.cbioportal.org/")
getCancerStudies.CGDS(cgds)
- patientId.
- sampleListId identifies groups of patientId based on profile type
- sampleLists function uses studyId input to return sampleListId
For the sample list identifiers, you can use sampleLists and inspect the
sampleListId column.
samps <- sampleLists(cbio, "gbm_tcga_pub")
samps[, c("category", "name", "sampleListId")]
## # A tibble: 15 × 3
## category name sampleListId
## <chr> <chr> <chr>
## 1 all_cases_in_study All samples gbm_tcga_pu…
## 2 other Expression Cluster Classic… gbm_tcga_pu…
## 3 all_cases_with_cna_data Samples with CNA data gbm_tcga_pu…
## 4 all_cases_with_mutation_and_cna_data Samples with mutation and … gbm_tcga_pu…
## 5 all_cases_with_mrna_array_data Samples with mRNA data (Ag… gbm_tcga_pu…
## 6 other Expression Cluster Mesench… gbm_tcga_pu…
## 7 all_cases_with_methylation_data Samples with methylation d… gbm_tcga_pu…
## 8 all_cases_with_methylation_data Samples with methylation d… gbm_tcga_pu…
## 9 all_cases_with_microrna_data Samples with microRNA data… gbm_tcga_pu…
## 10 other Expression Cluster Neural gbm_tcga_pu…
## 11 other Expression Cluster Proneur… gbm_tcga_pu…
## 12 other Sequenced, No Hypermutators gbm_tcga_pu…
## 13 other Sequenced, Not Treated gbm_tcga_pu…
## 14 other Sequenced, Treated gbm_tcga_pu…
## 15 all_cases_with_mutation_data Samples with mutation data gbm_tcga_pu…
It is possible to get case_ids directly when using the samplesInSampleLists
function. The function handles multiple sampleList identifiers.
samplesInSampleLists(
    api = cbio,
    sampleListIds = c(
        "gbm_tcga_pub_expr_classical", "gbm_tcga_pub_expr_mesenchymal"
    )
)
## CharacterList of length 2
## [["gbm_tcga_pub_expr_classical"]] TCGA-02-0001-01 ... TCGA-12-0615-01
## [["gbm_tcga_pub_expr_mesenchymal"]] TCGA-02-0004-01 ... TCGA-12-0620-01
To get more information about patients, we can query with the getSampleInfo
function.
getSampleInfo(api = cbio, studyId = "gbm_tcga_pub", projection = "SUMMARY")
## # A tibble: 206 × 6
## uniqueSampleKey uniquePatientKey sampleType sampleId patientId studyId
## <chr> <chr> <chr> <chr> <chr> <chr>
## 1 VENHQS0wMi0wMDAxLTAxO… VENHQS0wMi0wMDA… Primary S… TCGA-02… TCGA-02-… gbm_tc…
## 2 VENHQS0wMi0wMDAzLTAxO… VENHQS0wMi0wMDA… Primary S… TCGA-02… TCGA-02-… gbm_tc…
## 3 VENHQS0wMi0wMDA0LTAxO… VENHQS0wMi0wMDA… Primary S… TCGA-02… TCGA-02-… gbm_tc…
## 4 VENHQS0wMi0wMDA2LTAxO… VENHQS0wMi0wMDA… Primary S… TCGA-02… TCGA-02-… gbm_tc…
## 5 VENHQS0wMi0wMDA3LTAxO… VENHQS0wMi0wMDA… Primary S… TCGA-02… TCGA-02-… gbm_tc…
## 6 VENHQS0wMi0wMDA5LTAxO… VENHQS0wMi0wMDA… Primary S… TCGA-02… TCGA-02-… gbm_tc…
## 7 VENHQS0wMi0wMDEwLTAxO… VENHQS0wMi0wMDE… Primary S… TCGA-02… TCGA-02-… gbm_tc…
## 8 VENHQS0wMi0wMDExLTAxO… VENHQS0wMi0wMDE… Primary S… TCGA-02… TCGA-02-… gbm_tc…
## 9 VENHQS0wMi0wMDE0LTAxO… VENHQS0wMi0wMDE… Primary S… TCGA-02… TCGA-02-… gbm_tc…
## 10 VENHQS0wMi0wMDE1LTAxO… VENHQS0wMi0wMDE… Primary S… TCGA-02… TCGA-02-… gbm_tc…
## # ℹ 196 more rows
- case_id.
- cancerStudy identifier
- case_list_description describes the assays
- getCaseLists and getClinicalData
We obtain the first case_list_id in the cgds object from above and the
corresponding clinical data for that case list (gbm_tcga_pub_all as the case
list in this example).
clist1 <-
    getCaseLists.CGDS(cgds, cancerStudy = "gbm_tcga_pub")[1, "case_list_id"]
getClinicalData.CGDS(cgds, clist1)
Note that a sampleListId is not required when using the
fetchAllClinicalDataInStudyUsingPOST internal endpoint. Data for all
patients can be obtained using the clinicalData function.
clinicalData(cbio, "gbm_tcga_pub")
## # A tibble: 206 × 24
## patientId DFS_MONTHS DFS_STATUS KARNOFSKY_PERFORMANC…¹ OS_MONTHS OS_STATUS
## <chr> <chr> <chr> <chr> <chr> <chr>
## 1 TCGA-02-0001 4.5041095… 1:Recurred 80.0 11.60547… 1:DECEAS…
## 2 TCGA-02-0003 1.3150684… 1:Recurred 100.0 4.734246… 1:DECEAS…
## 3 TCGA-02-0004 10.323287… 1:Recurred 80.0 11.34246… 1:DECEAS…
## 4 TCGA-02-0006 9.9287671… 1:Recurred 80.0 18.34520… 1:DECEAS…
## 5 TCGA-02-0007 17.030136… 1:Recurred 80.0 23.17808… 1:DECEAS…
## 6 TCGA-02-0009 8.6794520… 1:Recurred 80.0 10.58630… 1:DECEAS…
## 7 TCGA-02-0010 11.539726… 1:Recurred 80.0 35.40821… 1:DECEAS…
## 8 TCGA-02-0011 4.7342465… 1:Recurred 80.0 20.71232… 1:DECEAS…
## 9 TCGA-02-0014 <NA> <NA> 100.0 82.55342… 1:DECEAS…
## 10 TCGA-02-0015 14.991780… 1:Recurred 80.0 20.61369… 1:DECEAS…
## # ℹ 196 more rows
## # ℹ abbreviated name: ¹KARNOFSKY_PERFORMANCE_SCORE
## # ℹ 18 more variables: PRETREATMENT_HISTORY <chr>, PRIOR_GLIOMA <chr>,
## # SAMPLE_COUNT <chr>, SEX <chr>, sampleId <chr>, ACGH_DATA <chr>,
## # CANCER_TYPE <chr>, CANCER_TYPE_DETAILED <chr>, COMPLETE_DATA <chr>,
## # FRACTION_GENOME_ALTERED <chr>, MRNA_DATA <chr>, MUTATION_COUNT <chr>,
## # ONCOTREE_CODE <chr>, SAMPLE_TYPE <chr>, SEQUENCED <chr>, …
You can use a different endpoint to obtain data for a single sample.
First, obtain a single sampleId with the samplesInSampleLists function.
clist1 <- "gbm_tcga_pub_all"
samplist <- samplesInSampleLists(cbio, clist1)
onesample <- samplist[["gbm_tcga_pub_all"]][1]
onesample
## [1] "TCGA-02-0001-01"
Then we use the API endpoint to retrieve the data. Note that you would run
httr::content on the output to extract the data.
cbio$getAllClinicalDataOfSampleInStudyUsingGET(
    sampleId = onesample, studyId = "gbm_tcga_pub"
)
## Response [https://www.cbioportal.org/api/studies/gbm_tcga_pub/samples/TCGA-02-0001-01/clinical-data]
## Date: 2024-02-04 21:20
## Status: 200
## Content-Type: application/json
## Size: 3.31 kB
getClinicalData uses case_list_id as input without specifying the study_id, as
case list identifiers are unique to each study.
We query clinical data for the gbm_tcga_pub_expr_classical case list
identifier, which is part of the gbm_tcga_pub study.
getClinicalData.CGDS(
    x = cgds,
    caseList = "gbm_tcga_pub_expr_classical"
)
- cgdsr allows you to obtain clinical data for a case list subset (54 cases
  with gbm_tcga_pub_expr_classical) and cBioPortalData provides clinical data
  for all 206 samples in gbm_tcga_pub using the clinicalData function.
- cgdsr returns a data.frame with sampleId (TCGA.02.0009.01) but not
  patientId (TCGA.02.0009).
- cBioPortalData returns sampleId (TCGA-02-0009-01) and patientId
  (TCGA-02-0009).
- cgdsr provides case_ids with "." as the separator and cBioPortalData returns
  patientIds with "-".
You may be interested in other clinical data endpoints. For a list, use
the searchOps function.
searchOps(cbio, "clinical")
## [1] "getAllClinicalAttributesUsingGET"
## [2] "fetchClinicalAttributesUsingPOST"
## [3] "fetchClinicalDataUsingPOST"
## [4] "getAllClinicalAttributesInStudyUsingGET"
## [5] "getClinicalAttributeInStudyUsingGET"
## [6] "getAllClinicalDataInStudyUsingGET"
## [7] "fetchAllClinicalDataInStudyUsingPOST"
## [8] "getAllClinicalDataOfPatientInStudyUsingGET"
## [9] "getAllClinicalDataOfSampleInStudyUsingGET"
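When comparing old cgdsr results with cBioPortalData output, the separator difference noted earlier (TCGA.02.0009.01 versus TCGA-02-0009-01) can be bridged with base R string functions. The following is a minimal sketch, not part of either package; it assumes TCGA-style barcodes in which the patient identifier is the sample identifier without its last field, as in the examples shown in this vignette.

```r
# Dot-separated identifiers in the cgdsr style. The first appears above;
# the second is the dotted form of a sample ID shown earlier.
cgdsr_ids <- c("TCGA.02.0009.01", "TCGA.02.0001.01")

# Replace literal dots with hyphens to match the cBioPortalData form.
sample_ids <- gsub(".", "-", cgdsr_ids, fixed = TRUE)

# Assumption: for these TCGA-style barcodes the patient identifier is the
# sample identifier without its last hyphen-delimited field.
patient_ids <- sub("-[^-]+$", "", sample_ids)

data.frame(cgdsr = cgdsr_ids, sampleId = sample_ids, patientId = patient_ids)
```

Using fixed = TRUE matters here because an unescaped "." in a regular expression would match every character.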
molecularProfiles(api = cbio, studyId = "gbm_tcga_pub")
## # A tibble: 10 × 8
## molecularAlterationType datatype name description showProfileInAnalysi…¹
## <chr> <chr> <chr> <chr> <lgl>
## 1 COPY_NUMBER_ALTERATION DISCRETE Putati… Putative c… TRUE
## 2 COPY_NUMBER_ALTERATION DISCRETE Putati… Putative c… TRUE
## 3 MUTATION_EXTENDED MAF Mutati… Mutation d… TRUE
## 4 METHYLATION CONTINUOUS Methyl… Methylatio… FALSE
## 5 MRNA_EXPRESSION CONTINUOUS mRNA e… mRNA expre… FALSE
## 6 MRNA_EXPRESSION Z-SCORE mRNA e… 18,698 gen… TRUE
## 7 MRNA_EXPRESSION Z-SCORE mRNA e… Log-transf… TRUE
## 8 MRNA_EXPRESSION CONTINUOUS microR… expression… FALSE
## 9 MRNA_EXPRESSION Z-SCORE microR… microRNA e… FALSE
## 10 MRNA_EXPRESSION Z-SCORE mRNA/m… mRNA and m… TRUE
## # ℹ abbreviated name: ¹showProfileInAnalysisTab
## # ℹ 3 more variables: patientLevel <lgl>, molecularProfileId <chr>,
## # studyId <chr>
Note that we want to pull the molecularProfileId column to use in other
queries.
getGeneticProfiles.CGDS(cgds, cancerStudy = "gbm_tcga_pub")
Currently, some conversion is needed to directly use the molecularData
function if you only have Hugo symbols. First, convert to Entrez gene IDs
and then obtain all the samples in the sample list of interest.
hugoGeneSymbol to entrezGeneId
genetab <- queryGeneTable(cbio,
    by = "hugoGeneSymbol",
    genes = c("NF1", "TP53", "ABL1")
)
genetab
## # A tibble: 3 × 3
## entrezGeneId hugoGeneSymbol type
## <int> <chr> <chr>
## 1 4763 NF1 protein-coding
## 2 25 ABL1 protein-coding
## 3 7157 TP53 protein-coding
entrez <- genetab[["entrezGeneId"]]
allsamps <- samplesInSampleLists(cbio, "gbm_tcga_pub_all")
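Note that the rows of the gene table above are not in the same order as the requested symbols (NF1, TP53, ABL1). If the order matters downstream, one option is to realign with match(). The sketch below rebuilds the printed table by hand as a plain data.frame under the hypothetical name genetab_demo, so it runs without an API connection.

```r
# Gene table rebuilt by hand from the queryGeneTable output shown above.
genetab_demo <- data.frame(
    entrezGeneId = c(4763L, 25L, 7157L),
    hugoGeneSymbol = c("NF1", "ABL1", "TP53"),
    type = "protein-coding"
)

# Symbols in the order they were requested.
genes <- c("NF1", "TP53", "ABL1")

# match() gives, for each requested symbol, its row position in the table.
idx <- match(genes, genetab_demo[["hugoGeneSymbol"]])
entrez_by_symbol <- setNames(genetab_demo[["entrezGeneId"]][idx], genes)
entrez_by_symbol
```

A named vector like this keeps the symbol attached to each Entrez ID, which can make later results easier to label.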
In the next section, we will show how to use the genes and sample identifiers to obtain the molecular profile data.
The getProfileData function allows for straightforward retrieval of the
molecular profile data with only a case list and genetic profile identifiers.
getProfileData.CGDS(
    x = cgds,
    genes = c("NF1", "TP53", "ABL1"),
    geneticProfiles = "gbm_tcga_pub_mrna",
    caseList = "gbm_tcga_pub_all"
)
cBioPortalData
cBioPortalData provides a number of options for retrieving molecular profile
data depending on the use case. Note that molecularData is mostly used
internally and that the cBioPortalData function is the user-friendly method
for downloading such data.
molecularData
We use the translated entrez identifiers from above.
molecularData(cbio, "gbm_tcga_pub_mrna",
    entrezGeneIds = entrez, sampleIds = unlist(allsamps))
## $gbm_tcga_pub_mrna
## # A tibble: 618 × 8
## uniqueSampleKey uniquePatientKey entrezGeneId molecularProfileId sampleId
## <chr> <chr> <int> <chr> <chr>
## 1 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_mrna TCGA-02…
## 2 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_mrna TCGA-02…
## 3 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_mrna TCGA-02…
## 4 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_mrna TCGA-02…
## 5 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_mrna TCGA-02…
## 6 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_mrna TCGA-02…
## 7 VENHQS0wMi0wMDA0LT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_mrna TCGA-02…
## 8 VENHQS0wMi0wMDA0LT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_mrna TCGA-02…
## 9 VENHQS0wMi0wMDA0LT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_mrna TCGA-02…
## 10 VENHQS0wMi0wMDA2LT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_mrna TCGA-02…
## # ℹ 608 more rows
## # ℹ 3 more variables: patientId <chr>, studyId <chr>, value <dbl>
getDataByGenes
The getDataByGenes function automatically determines all the sample
identifiers in the study and accepts Hugo and Entrez identifiers, as well
as genePanelId inputs.
getDataByGenes(
    api = cbio,
    studyId = "gbm_tcga_pub",
    genes = c("NF1", "TP53", "ABL1"),
    by = "hugoGeneSymbol",
    molecularProfileIds = "gbm_tcga_pub_mrna"
)
## $gbm_tcga_pub_mrna
## # A tibble: 618 × 10
## uniqueSampleKey uniquePatientKey entrezGeneId molecularProfileId sampleId
## <chr> <chr> <int> <chr> <chr>
## 1 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_mrna TCGA-02…
## 2 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_mrna TCGA-02…
## 3 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_mrna TCGA-02…
## 4 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_mrna TCGA-02…
## 5 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_mrna TCGA-02…
## 6 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_mrna TCGA-02…
## 7 VENHQS0wMi0wMDA0LT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_mrna TCGA-02…
## 8 VENHQS0wMi0wMDA0LT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_mrna TCGA-02…
## 9 VENHQS0wMi0wMDA0LT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_mrna TCGA-02…
## 10 VENHQS0wMi0wMDA2LT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_mrna TCGA-02…
## # ℹ 608 more rows
## # ℹ 5 more variables: patientId <chr>, studyId <chr>, value <dbl>,
## # hugoGeneSymbol <chr>, type <chr>
cBioPortalData: the main end-user function
It is important to note that end users who wish to obtain the data as
easily as possible should use the main cBioPortalData function:
gbm_pub <- cBioPortalData(
    api = cbio,
    studyId = "gbm_tcga_pub",
    genes = c("NF1", "TP53", "ABL1"), by = "hugoGeneSymbol",
    molecularProfileIds = "gbm_tcga_pub_mrna"
)
assay(gbm_pub[["gbm_tcga_pub_mrna"]])[, 1:4]
## TCGA-02-0001-01 TCGA-02-0003-01 TCGA-02-0004-01 TCGA-02-0006-01
## ABL1 -0.1744878 -0.177096729 -0.08782114 -0.1733767
## NF1 -0.2966920 -0.001066810 -0.23626512 -0.1691507
## TP53 0.6213171 0.006435625 -0.30507285 0.3967758
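The getDataByGenes output shown earlier is in long format (one row per gene and sample), whereas the assay matrix has genes in rows and samples in columns. As a rough base R sketch of how the two layouts relate, the snippet below reshapes a small hand-built long table. The values are copied from the first two columns of the assay output above, and the data frame itself is illustrative rather than an actual API result.

```r
# Illustrative long-format table: one row per gene and sample.
long <- data.frame(
    hugoGeneSymbol = rep(c("ABL1", "NF1", "TP53"), times = 2),
    sampleId = rep(c("TCGA-02-0001-01", "TCGA-02-0003-01"), each = 3),
    value = c(-0.1744878, -0.2966920, 0.6213171,
              -0.177096729, -0.001066810, 0.006435625)
)

# Spread to a genes-by-samples matrix. With exactly one value per cell,
# mean() simply returns that value.
wide <- with(long, tapply(value, list(hugoGeneSymbol, sampleId), mean))
wide
```

The cBioPortalData function performs this kind of organization for you and additionally attaches the clinical data, so the manual reshape is mainly useful when working directly with the long tibbles.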
Similar to molecularData, mutation data can be obtained with the
mutationData function or the getDataByGenes function.
mutationData(
    api = cbio,
    molecularProfileIds = "gbm_tcga_pub_mutations",
    entrezGeneIds = entrez,
    sampleIds = unlist(allsamps)
)
## $gbm_tcga_pub_mutations
## # A tibble: 57 × 23
## uniqueSampleKey uniquePatientKey molecularProfileId sampleId patientId
## <chr> <chr> <chr> <chr> <chr>
## 1 VENHQS0wMi0wMDAxLTAxO… VENHQS0wMi0wMDA… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 2 VENHQS0wMi0wMDAxLTAxO… VENHQS0wMi0wMDA… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 3 VENHQS0wMi0wMDAzLTAxO… VENHQS0wMi0wMDA… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 4 VENHQS0wMi0wMDAzLTAxO… VENHQS0wMi0wMDA… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 5 VENHQS0wMi0wMDEwLTAxO… VENHQS0wMi0wMDE… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 6 VENHQS0wMi0wMDEwLTAxO… VENHQS0wMi0wMDE… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 7 VENHQS0wMi0wMDEwLTAxO… VENHQS0wMi0wMDE… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 8 VENHQS0wMi0wMDExLTAxO… VENHQS0wMi0wMDE… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 9 VENHQS0wMi0wMDE0LTAxO… VENHQS0wMi0wMDE… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 10 VENHQS0wMi0wMDI0LTAxO… VENHQS0wMi0wMDI… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## # ℹ 47 more rows
## # ℹ 18 more variables: entrezGeneId <int>, studyId <chr>, center <chr>,
## # mutationStatus <chr>, validationStatus <chr>, startPosition <int>,
## # endPosition <int>, referenceAllele <chr>, proteinChange <chr>,
## # mutationType <chr>, ncbiBuild <chr>, variantType <chr>, keyword <chr>,
## # chr <chr>, variantAllele <chr>, refseqMrnaId <chr>, proteinPosStart <int>,
## # proteinPosEnd <int>
getDataByGenes(
    api = cbio,
    studyId = "gbm_tcga_pub",
    genes = c("NF1", "TP53", "ABL1"),
    by = "hugoGeneSymbol",
    molecularProfileIds = "gbm_tcga_pub_mutations"
)
## $gbm_tcga_pub_mutations
## # A tibble: 57 × 25
## uniqueSampleKey uniquePatientKey molecularProfileId sampleId patientId
## <chr> <chr> <chr> <chr> <chr>
## 1 VENHQS0wMi0wMDAxLTAxO… VENHQS0wMi0wMDA… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 2 VENHQS0wMi0wMDAxLTAxO… VENHQS0wMi0wMDA… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 3 VENHQS0wMi0wMDAzLTAxO… VENHQS0wMi0wMDA… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 4 VENHQS0wMi0wMDAzLTAxO… VENHQS0wMi0wMDA… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 5 VENHQS0wMi0wMDEwLTAxO… VENHQS0wMi0wMDE… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 6 VENHQS0wMi0wMDEwLTAxO… VENHQS0wMi0wMDE… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 7 VENHQS0wMi0wMDEwLTAxO… VENHQS0wMi0wMDE… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 8 VENHQS0wMi0wMDExLTAxO… VENHQS0wMi0wMDE… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 9 VENHQS0wMi0wMDE0LTAxO… VENHQS0wMi0wMDE… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## 10 VENHQS0wMi0wMDI0LTAxO… VENHQS0wMi0wMDI… gbm_tcga_pub_muta… TCGA-02… TCGA-02-…
## # ℹ 47 more rows
## # ℹ 20 more variables: entrezGeneId <int>, studyId <chr>, center <chr>,
## # mutationStatus <chr>, validationStatus <chr>, startPosition <int>,
## # endPosition <int>, referenceAllele <chr>, proteinChange <chr>,
## # mutationType <chr>, ncbiBuild <chr>, variantType <chr>, keyword <chr>,
## # chr <chr>, variantAllele <chr>, refseqMrnaId <chr>, proteinPosStart <int>,
## # proteinPosEnd <int>, hugoGeneSymbol <chr>, type <chr>
getMutationData.CGDS(
    x = cgds,
    caseList = "gbm_tcga_pub_all",
    geneticProfile = "gbm_tcga_pub_mutations",
    genes = c("NF1", "TP53", "ABL1")
)
Copy number alteration data can be obtained with the getDataByGenes function
or with the main cBioPortalData function.
getDataByGenes(
    api = cbio,
    studyId = "gbm_tcga_pub",
    genes = c("NF1", "TP53", "ABL1"),
    by = "hugoGeneSymbol",
    molecularProfileIds = "gbm_tcga_pub_cna_rae"
)
## $gbm_tcga_pub_cna_rae
## # A tibble: 609 × 10
## uniqueSampleKey uniquePatientKey entrezGeneId molecularProfileId sampleId
## <chr> <chr> <int> <chr> <chr>
## 1 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_cna_… TCGA-02…
## 2 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_cna_… TCGA-02…
## 3 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_cna_… TCGA-02…
## 4 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_cna_… TCGA-02…
## 5 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_cna_… TCGA-02…
## 6 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_cna_… TCGA-02…
## 7 VENHQS0wMi0wMDA2LT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_cna_… TCGA-02…
## 8 VENHQS0wMi0wMDA2LT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_cna_… TCGA-02…
## 9 VENHQS0wMi0wMDA2LT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_cna_… TCGA-02…
## 10 VENHQS0wMi0wMDA3LT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_cna_… TCGA-02…
## # ℹ 599 more rows
## # ℹ 5 more variables: patientId <chr>, studyId <chr>, value <int>,
## # hugoGeneSymbol <chr>, type <chr>
cBioPortalData(
    api = cbio,
    studyId = "gbm_tcga_pub",
    genes = c("NF1", "TP53", "ABL1"),
    by = "hugoGeneSymbol",
    molecularProfileIds = "gbm_tcga_pub_cna_rae"
)
## harmonizing input:
## removing 3 colData rownames not in sampleMap 'primary'
## A MultiAssayExperiment object of 1 listed
## experiment with a user-defined name and respective class.
## Containing an ExperimentList class object of length 1:
## [1] gbm_tcga_pub_cna_rae: SummarizedExperiment with 3 rows and 203 columns
## Functionality:
## experiments() - obtain the ExperimentList instance
## colData() - the primary/phenotype DataFrame
## sampleMap() - the sample coordination DataFrame
## `$`, `[`, `[[` - extract colData columns, subset, or experiment
## *Format() - convert into a long or wide DataFrame
## assays() - convert ExperimentList to a SimpleList of matrices
## exportClass() - save data to flat files
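The message "removing 3 colData rownames not in sampleMap" appears consistent with the counts shown in this vignette: clinicalData returned 206 rows for gbm_tcga_pub, while the CNA experiment has 203 columns. The following is a toy sketch of that bookkeeping in base R. It uses made-up identifiers rather than real patient IDs and is not the actual MultiAssayExperiment code.

```r
# Hypothetical identifiers: five patients with clinical rows (colData),
# three of which have a column in the assay.
coldata_patients <- paste0("PATIENT-", 1:5)
assay_patients <- c("PATIENT-1", "PATIENT-3", "PATIENT-4")

# Patients with clinical data but no assay column.
unmatched <- setdiff(coldata_patients, assay_patients)
unmatched

# The same subtraction with the counts reported above.
n_removed <- 206 - 203
n_removed
```

Under this reading, the clinical rows that are dropped belong to patients without data in the requested molecular profile.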
getProfileData.CGDS(
    x = cgds,
    genes = c("NF1", "TP53", "ABL1"),
    geneticProfiles = "gbm_tcga_pub_cna_rae",
    caseList = "gbm_tcga_pub_cna"
)
Similar to copy number alteration data, methylation data can be obtained with
the getDataByGenes function or with the cBioPortalData function.
getDataByGenes(
    api = cbio,
    studyId = "gbm_tcga_pub",
    genes = c("NF1", "TP53", "ABL1"),
    by = "hugoGeneSymbol",
    molecularProfileIds = "gbm_tcga_pub_methylation_hm27"
)
## $gbm_tcga_pub_methylation_hm27
## # A tibble: 174 × 10
## uniqueSampleKey uniquePatientKey entrezGeneId molecularProfileId sampleId
## <chr> <chr> <int> <chr> <chr>
## 1 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_meth… TCGA-02…
## 2 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_meth… TCGA-02…
## 3 VENHQS0wMi0wMDAxLT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_meth… TCGA-02…
## 4 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_meth… TCGA-02…
## 5 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_meth… TCGA-02…
## 6 VENHQS0wMi0wMDAzLT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_meth… TCGA-02…
## 7 VENHQS0wMi0wMDA2LT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_meth… TCGA-02…
## 8 VENHQS0wMi0wMDA2LT… VENHQS0wMi0wMDA… 4763 gbm_tcga_pub_meth… TCGA-02…
## 9 VENHQS0wMi0wMDA2LT… VENHQS0wMi0wMDA… 7157 gbm_tcga_pub_meth… TCGA-02…
## 10 VENHQS0wMi0wMDA3LT… VENHQS0wMi0wMDA… 25 gbm_tcga_pub_meth… TCGA-02…
## # ℹ 164 more rows
## # ℹ 5 more variables: patientId <chr>, studyId <chr>, value <dbl>,
## # hugoGeneSymbol <chr>, type <chr>
cBioPortalData(
    api = cbio,
    studyId = "gbm_tcga_pub",
    genes = c("NF1", "TP53", "ABL1"),
    by = "hugoGeneSymbol",
    molecularProfileIds = "gbm_tcga_pub_methylation_hm27"
)
## harmonizing input:
## removing 148 colData rownames not in sampleMap 'primary'
## A MultiAssayExperiment object of 1 listed
## experiment with a user-defined name and respective class.
## Containing an ExperimentList class object of length 1:
## [1] gbm_tcga_pub_methylation_hm27: SummarizedExperiment with 3 rows and 58 columns
## Functionality:
## experiments() - obtain the ExperimentList instance
## colData() - the primary/phenotype DataFrame
## sampleMap() - the sample coordination DataFrame
## `$`, `[`, `[[` - extract colData columns, subset, or experiment
## *Format() - convert into a long or wide DataFrame
## assays() - convert ExperimentList to a SimpleList of matrices
## exportClass() - save data to flat files
getProfileData.CGDS(
    x = cgds,
    genes = c("NF1", "TP53", "ABL1"),
    geneticProfiles = "gbm_tcga_pub_methylation_hm27",
    caseList = "gbm_tcga_pub_methylation_hm27"
)
sessionInfo()
## R version 4.3.2 Patched (2023-11-13 r85521)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.18-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] survminer_0.4.9 ggpubr_0.6.0
## [3] ggplot2_3.4.4 survival_3.5-7
## [5] cBioPortalData_2.14.2 MultiAssayExperiment_1.28.0
## [7] SummarizedExperiment_1.32.0 Biobase_2.62.0
## [9] GenomicRanges_1.54.1 GenomeInfoDb_1.38.5
## [11] IRanges_2.36.0 S4Vectors_0.40.2
## [13] BiocGenerics_0.48.1 MatrixGenerics_1.14.0
## [15] matrixStats_1.2.0 AnVIL_1.14.1
## [17] dplyr_1.1.4 BiocStyle_2.30.0
##
## loaded via a namespace (and not attached):
## [1] jsonlite_1.8.8 magrittr_2.0.3
## [3] magick_2.8.2 GenomicFeatures_1.54.3
## [5] farver_2.1.1 rmarkdown_2.25
## [7] BiocIO_1.12.0 zlibbioc_1.48.0
## [9] vctrs_0.6.5 memoise_2.0.1
## [11] Rsamtools_2.18.0 RCurl_1.98-1.14
## [13] rstatix_0.7.2 BiocBaseUtils_1.4.0
## [15] htmltools_0.5.7 S4Arrays_1.2.0
## [17] progress_1.2.3 lambda.r_1.2.4
## [19] curl_5.2.0 broom_1.0.5
## [21] SparseArray_1.2.3 sass_0.4.8
## [23] bslib_0.6.1 htmlwidgets_1.6.4
## [25] zoo_1.8-12 futile.options_1.0.1
## [27] cachem_1.0.8 commonmark_1.9.1
## [29] GenomicAlignments_1.38.2 mime_0.12
## [31] lifecycle_1.0.4 pkgconfig_2.0.3
## [33] Matrix_1.6-5 R6_2.5.1
## [35] fastmap_1.1.1 GenomeInfoDbData_1.2.11
## [37] shiny_1.8.0 digest_0.6.34
## [39] colorspace_2.1-0 RaggedExperiment_1.26.0
## [41] AnnotationDbi_1.64.1 RSQLite_2.3.5
## [43] labeling_0.4.3 filelock_1.0.3
## [45] RTCGAToolbox_2.32.1 km.ci_0.5-6
## [47] fansi_1.0.6 RJSONIO_1.3-1.9
## [49] httr_1.4.7 abind_1.4-5
## [51] compiler_4.3.2 bit64_4.0.5
## [53] withr_3.0.0 backports_1.4.1
## [55] BiocParallel_1.36.0 carData_3.0-5
## [57] DBI_1.2.1 highr_0.10
## [59] ggsignif_0.6.4 biomaRt_2.58.2
## [61] rappdirs_0.3.3 DelayedArray_0.28.0
## [63] rjson_0.2.21 tools_4.3.2
## [65] httpuv_1.6.14 glue_1.7.0
## [67] restfulr_0.0.15 promises_1.2.1
## [69] gridtext_0.1.5 grid_4.3.2
## [71] generics_0.1.3 gtable_0.3.4
## [73] KMsurv_0.1-5 tzdb_0.4.0
## [75] tidyr_1.3.1 data.table_1.15.0
## [77] hms_1.1.3 car_3.1-2
## [79] xml2_1.3.6 utf8_1.2.4
## [81] XVector_0.42.0 markdown_1.12
## [83] pillar_1.9.0 stringr_1.5.1
## [85] later_1.3.2 splines_4.3.2
## [87] ggtext_0.1.2 BiocFileCache_2.10.1
## [89] lattice_0.22-5 rtracklayer_1.62.0
## [91] bit_4.0.5 tidyselect_1.2.0
## [93] Biostrings_2.70.2 miniUI_0.1.1.1
## [95] knitr_1.45 gridExtra_2.3
## [97] bookdown_0.37 futile.logger_1.4.3
## [99] xfun_0.41 DT_0.31
## [101] stringi_1.8.3 yaml_2.3.8
## [103] evaluate_0.23 codetools_0.2-19
## [105] tibble_3.2.1 BiocManager_1.30.22
## [107] cli_3.6.2 xtable_1.8-4
## [109] munsell_0.5.0 jquerylib_0.1.4
## [111] survMisc_0.5.6 Rcpp_1.0.12
## [113] GenomicDataCommons_1.26.0 dbplyr_2.4.0
## [115] png_0.1-8 XML_3.99-0.16.1
## [117] rapiclient_0.1.3 parallel_4.3.2
## [119] TCGAutils_1.22.2 ellipsis_0.3.2
## [121] readr_2.1.5 blob_1.2.4
## [123] prettyunits_1.2.0 bitops_1.0-7
## [125] scales_1.3.0 purrr_1.0.2
## [127] crayon_1.5.2 rlang_1.1.3
## [129] KEGGREST_1.42.0 rvest_1.0.3
## [131] formatR_1.14