Contributing to the Portal - One woman's journey to share data
So, I've decided to do it. Yes, I've thought about it for a long time, but now I'm ready. I'm going to take the leap and share some data for teaching. Ha ha. What a buildup. You didn't think I was going skydiving or something, did you? No, it's nothing so scary as that. I've got several datasets that I think are good. So, why has this task sat at the bottom of my to-do list?
Too much else to do? Sure, we're all busy, but this is like karmic payback for all the time that I've saved using the portal instead of endlessly searching the web for teaching data. No data to share? Well, I just said that's not my problem. Ah-ha! It's inertia! I've never done it before, so it feels daunting and I need a big push to get going. Well, friends, I'm here to take on this adventure and share it with you. Then we will all be familiar with the process and can eliminate that excuse too.
Getting ready
My first stop is the portal website, to check out the instructions for data submission. A reasonable place to start, of course. So, navigate on over to https://www.causeweb.org/tshs and select the tab at the top labeled "For Contributors" (red star).
They note three types of submissions. Mine will be "Type 1" since I'm only submitting the teaching dataset right now. They also accept teaching resources related to datasets already on the portal (or the one you are uploading). That will be a project for another time.
The required reading links to two PDF files (blue arrow), which I've opened in new tabs. The Creative Commons license is for submitting teaching materials, so that's off my radar for now. OK, back to the instructions sheet.
It's got the process laid out in steps. Basically, I have to fill out the three templates (see zip file, red arrow) and submit them for peer review. OK, the zip file has been downloaded and opened. The templates are straightforward. Ugh, here's a roadblock. Who needs to give permission for a published dataset (purple star)?
The Dataset -- Who owns it?
Let's talk specifics. I'm looking to upload The Cancer Genome Atlas - Clinical Data Resource (TCGA-CDR). It's a compilation of the clinical data collected on over 11,000 specimens processed by TCGA for multi-omic molecular profiling. This clinical dataset opens options for many types of classroom (and research) investigation about specific cancers, pan-cancer studies, derivation of outcome endpoints, survival analysis comparisons, or simply about data cleaning and the >1000 choices we had to make along the way.
TCGA is a public data resource. The TCGA-CDR is a supplemental data file to a published manuscript that has been given open access. In fact, it's linked here directly from TCGA (https://gdc.cancer.gov/about-data/publications/pancanatlas).
I'm a co-author on the paper, but I'm not sure that is sufficient. It looks like I've got homework. I'll reach out to the corresponding author, TCGA, and Cell and report back what I learn. In the meantime, please have a look at the paper and data (both linked above). Maybe you can think of a great teaching resource to submit with these data.
Stay tuned...
Got advice? Please share in the comments below. Have you submitted a teaching resource to the TSHS Portal and want to share your experience on the blog? Reach out.