Published datasets
These datasets are derived from publications and do not change.
To generate this download run:
./GENERATE.sh
This download contains the BD2009, BD2013, and BLIND datasets from the publication "Dataset size and composition impact the reliability of performance benchmarks for peptide-MHC binding predictions".
BD2013 (augmented with more recent data from IEDB) is used to train the production MHCflurry models. BD2009 and BLIND are useful for validation on held-out data.
The other published datasets correspond to the publications indicated in GENERATE.sh.