Skip to content
Snippets Groups Projects
Commit 74fbe23f authored by Tim O'Donnell's avatar Tim O'Donnell
Browse files

update datasets

parent ab1d2d98
No related branches found
No related tags found
Loading
......@@ -25,7 +25,7 @@ date
cd $SCRATCH_DIR/$DOWNLOAD_NAME
############################################
# BINDING AFFINITIES
# BINDING AFFINITIES: class I
############################################
#
# Kim et al 2014 [PMID 25017736]
......@@ -36,7 +36,7 @@ wget -q https://github.com/openvax/mhcflurry/releases/download/pre-1.1/bdata.201
mkdir raw
############################################
# MS: Multiallelic
# MS: Multiallelic class I
############################################
# Bassani-Sternberg, ..., Gfeller PLOS Comp. Bio. 2017 [PMID 28832583]
# The first dataset is from this work. The second dataset is originally from:
......@@ -84,8 +84,15 @@ wget -q https://www.mcponline.org/lookup/suppl/doi:10.1074/mcp.M116.060350/-/DC1
# Hassan, ..., van Veelen Mol Cell Proteomics 2015 [PMID 23481700]
PMID=23481700
mkdir -p raw/$PMID
wget -q https://www.mcponline.org/highwire/filestream/34681/field_highwire_adjunct_files/1/mcp.M112.024810-2.xls -P raw/$PMID
wget -q https://www.mcponline.org/highwire/filestream/34681/field_highwire_adjunct_files/1/mcp.M112.024810-2.xls -P raw/$PMID
############################################
# MS: Monoallelic class II
############################################
# Abelin, ..., Rooney Immunity 2019 [PMID 31495665]
PMID=31495665
mkdir -p raw/$PMID
wget -q https://ars.els-cdn.com/content/image/1-s2.0-S1074761319303632-mmc2.xlsx -P raw/$PMID
cp $SCRIPT_ABSOLUTE_PATH .
......
......@@ -8,17 +8,11 @@ To generate this download run:
./GENERATE.sh
```
## Kim 2014
This download contains the BD2009, BD2013, and BLIND datasets from
[Dataset size and composition impact the reliability of performance benchmarks for peptide-MHC binding predictions](http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-15-241).
BD2013 (augmented with more recent data from IEDB) are used to train the production
MHCflurry models. BD2009 and BLIND are useful for performing validation on held-out data.
## Abelin et al. Immunity 2017
This download contains the peptides identified in
[Mass Spectrometry Profiling of HLA-Associated Peptidomes in Mono-allelic Cells Enables More Accurate Epitope Prediction](https://www.ncbi.nlm.nih.gov/pubmed/28228285).
The other published data sets correspond to the publications indicated in GENERATE.sh.
......@@ -109,7 +109,7 @@ releases:
default: false
- name: data_published
url: http://github.com/openvax/mhcflurry/releases/download/pan-dev1/data_published.tar.bz2
url: https://github.com/openvax/mhcflurry/releases/download/pre-1.4.0/data_published.20190920.tar.bz2
default: false
- name: data_curated
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment