Published datasets

These datasets are derived from publications and do not change.

To generate this download run:

./GENERATE.sh

This download contains the BD2009, BD2013, and BLIND datasets from the publication "Dataset size and composition impact the reliability of performance benchmarks for peptide-MHC binding predictions".

BD2013 (augmented with more recent data from IEDB) is used to train the production MHCflurry models. BD2009 and BLIND are useful for validation on held-out data.
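
When a training set is augmented with newer data, it can end up containing measurements that also appear in a held-out set. The following is a minimal sketch of one way to guard against that, not the procedure MHCflurry itself uses. The record layout (allele, peptide, measurement), the function name `drop_overlap`, and the example values are all hypothetical and are not taken from the files in this download.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical record layout: (allele, peptide, measurement).
# The real files in this download may use different columns.
Record = Tuple[str, str, float]


def drop_overlap(train: Iterable[Record], held_out: Iterable[Record]) -> List[Record]:
    """Return the training records whose (allele, peptide) pair
    does not occur anywhere in the held-out records."""
    held_out_keys: Set[Tuple[str, str]] = {
        (allele, peptide) for allele, peptide, _ in held_out
    }
    return [r for r in train if (r[0], r[1]) not in held_out_keys]


# Made-up example values, for illustration only.
train = [
    ("HLA-A*02:01", "SIINFEKL", 120.0),
    ("HLA-A*02:01", "GILGFVFTL", 15.0),
    ("HLA-B*07:02", "SIINFEKL", 5000.0),
]
blind = [("HLA-A*02:01", "GILGFVFTL", 20.0)]

filtered = drop_overlap(train, blind)
print(len(filtered))
```

Matching on the (allele, peptide) pair rather than on the peptide alone keeps a peptide in the training set when it was measured against a different allele than in the held-out set.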

The other published datasets correspond to the publications indicated in GENERATE.sh.