# Kim 2014 Data
This download contains the BD2009, BD2013, and BLIND datasets from [Dataset size and composition impact the reliability of performance benchmarks for peptide-MHC binding predictions](http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-15-241). BD2013 (augmented with more recent data from IEDB) is used to train the production MHCflurry models. BD2009 and BLIND are useful for validation on held-out data.
These files are available on Dropbox:
* https://dl.dropboxusercontent.com/u/3967524/bdata.2009.mhci.public.1.txt
* https://dl.dropboxusercontent.com/u/3967524/bdata.20130222.mhci.public.1.txt
* https://dl.dropboxusercontent.com/u/3967524/bdata.2013.mhci.public.blind.1.txt
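To fetch the three files listed above by hand, something like the following may work. This is a minimal sketch, not the contents of `GENERATE.sh`: it assumes `bash` and `curl` are installed and that the Dropbox URLs are still reachable, and `fetch_kim2014` is a hypothetical helper name introduced only for illustration.

```bash
# Sketch: download the three Kim 2014 files into a target directory.
# Assumptions: bash, curl, and that the URLs above still resolve.

BASE_URL="https://dl.dropboxusercontent.com/u/3967524"
FILES=(
  "bdata.2009.mhci.public.1.txt"
  "bdata.20130222.mhci.public.1.txt"
  "bdata.2013.mhci.public.blind.1.txt"
)

# Hypothetical helper (not part of GENERATE.sh).
# Usage: fetch_kim2014 [destination_dir]   (defaults to the current directory)
fetch_kim2014() {
  local dest="${1:-.}"
  local f
  mkdir -p "$dest" || return 1
  for f in "${FILES[@]}"; do
    # -f: fail on HTTP errors, -L: follow redirects, -sS: quiet but report errors
    curl -fL -sS -o "$dest/$f" "$BASE_URL/$f" || return 1
  done
}
```

After sourcing the snippet, `fetch_kim2014 ./kim2014` would place the files in `./kim2014`, stopping at the first failed download.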
To generate this download, run:
```
./GENERATE.sh
```