Published datasets

These datasets are derived from publications and do not change.

To generate this download, run:

./GENERATE.sh

This download contains the BD2009, BD2013, and BLIND datasets from the publication "Dataset size and composition impact the reliability of performance benchmarks for peptide-MHC binding predictions".

BD2013 (augmented with more recent data from IEDB) is used to train the production MHCflurry models. BD2009 and BLIND are useful for performing validation on held-out data.
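
When a training set is augmented with newer data, it may be worth checking that no held-out measurement has leaked into it before validating. The following is a minimal sketch of such a check, not part of this download: the function name, the `(allele, peptide, measurement)` record layout, and the toy values are all assumptions made for illustration, and the actual file names and columns produced by GENERATE.sh may differ.

```python
# Hypothetical record layout: (allele, peptide, measurement).
# The real files in this download may use different columns.

def remove_held_out_overlap(train_rows, held_out_rows):
    """Return the training rows whose (allele, peptide) pair does not
    appear anywhere in the held-out rows."""
    held_out_keys = {(allele, peptide) for allele, peptide, _ in held_out_rows}
    return [row for row in train_rows if (row[0], row[1]) not in held_out_keys]


# Toy values for illustration only; they are not taken from BD2013 or BLIND.
train = [
    ("HLA-A*02:01", "SIINFEKL", 500.0),
    ("HLA-A*02:01", "GILGFVFTL", 20.0),
]
held_out = [
    ("HLA-A*02:01", "SIINFEKL", 450.0),
]

filtered = remove_held_out_overlap(train, held_out)
print(len(filtered), "training rows remain after removing held-out overlap")
```

Matching on the (allele, peptide) pair rather than on the peptide alone keeps measurements of the same peptide against other alleles in the training set; whether that is the right granularity depends on the benchmark being run.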

The other published datasets correspond to the publications indicated in GENERATE.sh.