# Kim 2014 Data

This download contains the BD2009, BD2013, and BLIND datasets from the paper [Dataset size and composition impact the reliability of performance benchmarks for peptide-MHC binding predictions](http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-15-241). BD2013, augmented with more recent data from IEDB, is used to train the production MHCflurry models. BD2009 and BLIND are useful for validation on held-out data.

These files are available on Dropbox:

 * https://dl.dropboxusercontent.com/u/3967524/bdata.2009.mhci.public.1.txt
 * https://dl.dropboxusercontent.com/u/3967524/bdata.20130222.mhci.public.1.txt
 * https://dl.dropboxusercontent.com/u/3967524/bdata.2013.mhci.public.blind.1.txt
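
If you only need the raw files, the three URLs above can be fetched by hand. The following is a minimal sketch, not the contents of `GENERATE.sh` (which may do more, or do it differently). The function names `list_urls` and `fetch_all` are hypothetical, and it assumes `curl` is installed and the links still resolve.

```shell
# Sketch only: the real GENERATE.sh may work differently.
BASE_URL="https://dl.dropboxusercontent.com/u/3967524"
FILES="bdata.2009.mhci.public.1.txt
bdata.20130222.mhci.public.1.txt
bdata.2013.mhci.public.blind.1.txt"

# Print the full URL of each dataset file, one per line.
list_urls() {
  for f in $FILES; do
    printf '%s/%s\n' "$BASE_URL" "$f"
  done
}

# Download every file into the given directory (default: current directory).
# Stops at the first failed download and returns a nonzero status.
fetch_all() {
  dest="${1:-.}"
  mkdir -p "$dest" || return 1
  for url in $(list_urls); do
    curl --fail --location --silent --show-error \
      --output "$dest/$(basename "$url")" "$url" || return 1
  done
}
```

After sourcing the snippet, `fetch_all data/` would place the three files under `data/`. Because `--fail` is set, a link that no longer resolves produces an error rather than a saved HTML error page.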

To generate this download, run:

```
./GENERATE.sh
```