From 40d12464d6cabfa67303e458afef387abdfcc9ca Mon Sep 17 00:00:00 2001
From: Tim O'Donnell <timodonnell@gmail.com>
Date: Thu, 21 Dec 2017 12:35:54 -0500
Subject: [PATCH] test

---
 docs/readme.template.txt | 315 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 315 insertions(+)
 create mode 100644 docs/readme.template.txt

diff --git a/docs/readme.template.txt b/docs/readme.template.txt
new file mode 100644
index 00000000..674dd28e
--- /dev/null
+++ b/docs/readme.template.txt
@@ -0,0 +1,315 @@
+MHCflurry is a Python package for peptide/MHC I binding affinity
+prediction. It provides competitive accuracy with a fast, documented,
+open source implementation.
+
+You can download pre-trained MHCflurry models fit to affinity
+measurements deposited in IEDB. See the
+"downloads_generation/models_class1" directory in the repository for
+the workflow used to train these predictors. Users with their own data
+can also fit their own MHCflurry models.
+
+Currently only allele-specific prediction is implemented, in which
+separate models are trained for each allele. The released models
+therefore support a fixed set of common class I alleles for which
+sufficient published training data is available.
+
+MHCflurry supports Python versions 2.7 and 3.4+. It uses the Keras
+neural network library via either the Tensorflow or Theano backends.
+GPUs may optionally be used for a generally modest speed improvement.
+
+If you find MHCflurry useful in your research please cite:
+
+   O'Donnell, T. et al., 2017. MHCflurry: open-source class I MHC
+   binding affinity prediction. bioRxiv. Available at:
+   http://www.biorxiv.org/content/early/2017/08/09/174243.
+
+
+Installation (pip)
+******************
+
+Install the package:
+
+   pip install mhcflurry
+
+Then download our datasets and trained models:
+
+   mhcflurry-downloads fetch
+
+From a checkout you can run the unit tests with:
+
+   pip install nose
+   nosetests .
+
+
+Using conda
+***********
+
+You can alternatively get up and running with a conda environment as
+follows. Some users have reported that this can avoid problems
+installing tensorflow.
+
+   conda create -q -n mhcflurry-env python=3.6 'tensorflow>=1.1.2'
+   source activate mhcflurry-env
+
+Then continue as above:
+
+   pip install mhcflurry
+   mhcflurry-downloads fetch
+
+
+Command-line usage
+==================
+
+
+Downloading models
+******************
+
+Most users will use pre-trained MHCflurry models that we release.
+These models are distributed separately from the source code and may
+be downloaded with the following command:
+
+We also release other "downloads," such as curated training data and
+some experimental models. To see what you have downloaded, run:
+
+
+mhcflurry-predict
+*****************
+
+The "mhcflurry-predict" command generates predictions from the
+command-line. It defaults to using the pre-trained models you
+downloaded above but this can be customized with the "--models"
+argument. See "mhcflurry-predict -h" for details.
+
+   $ mhcflurry-predict --alleles HLA-A0201 HLA-A0301 --peptides SIINFEKL SIINFEKD SIINFEKQ
+   allele,peptide,mhcflurry_prediction,mhcflurry_prediction_low,mhcflurry_prediction_high
+   HLA-A0201,SIINFEKL,5326.541919062165,3757.86675352994,7461.37693353508
+   HLA-A0201,SIINFEKD,18763.70298522213,13140.82000240037,23269.82139560844
+   HLA-A0201,SIINFEKQ,18620.10057358322,13096.425874678192,23223.148184869413
+   HLA-A0301,SIINFEKL,24481.726678691946,21035.52779725433,27245.371837497867
+   HLA-A0301,SIINFEKD,24687.529360239587,21582.590014592537,27749.39869616437
+   HLA-A0301,SIINFEKQ,25923.062203902562,23522.5793450799,28079.456657427705
+
+The predictions returned are affinities (KD) in nM. The
+"prediction_low" and "prediction_high" fields give the 5-95 percentile
+predictions across the models in the ensemble. The predictions above
+were generated with MHCflurry 0.9.2.
+
+Your exact predictions may vary slightly from these (up to about 1 nM)
+depending on the Keras backend in use and other numerical details.
+Different versions of MHCflurry can of course give results
+considerably different from these.
+
+You can also specify the input and output as CSV files. Run
+"mhcflurry-predict -h" for details.
+
+
+Fitting your own models
+***********************
+
+
+Library usage
+=============
+
+The MHCflurry Python API exposes additional options and features
+beyond those supported by the commandline tools. This tutorial gives a
+basic overview of the most important functionality. See the API
+Documentation for further details.
+
+The "Class1AffinityPredictor" class is the primary user-facing
+interface.
+
+
+   >>> import mhcflurry
+   >>> print("MHCflurry version: %s" % (mhcflurry.__version__))
+   MHCflurry version: 1.0.0
+   >>> 
+   >>> # Load downloaded predictor
+   >>> predictor = mhcflurry.Class1AffinityPredictor.load()
+   >>> print(predictor.supported_alleles)
+   ['BoLA-6*13:01', 'Eqca-1*01:01', 'H-2-Db', 'H-2-Dd', 'H-2-Kb', 'H-2-Kd', 'H-2-Kk', 'H-2-Ld', 'HLA-A*01:01', 'HLA-A*02:01', 'HLA-A*02:02', 'HLA-A*02:03', 'HLA-A*02:05', 'HLA-A*02:06', 'HLA-A*02:07', 'HLA-A*02:11', 'HLA-A*02:12', 'HLA-A*02:16', 'HLA-A*02:17', 'HLA-A*02:19', 'HLA-A*02:50', 'HLA-A*03:01', 'HLA-A*11:01', 'HLA-A*23:01', 'HLA-A*24:01', 'HLA-A*24:02', 'HLA-A*24:03', 'HLA-A*25:01', 'HLA-A*26:01', 'HLA-A*26:02', 'HLA-A*26:03', 'HLA-A*29:02', 'HLA-A*30:01', 'HLA-A*30:02', 'HLA-A*31:01', 'HLA-A*32:01', 'HLA-A*32:07', 'HLA-A*33:01', 'HLA-A*66:01', 'HLA-A*68:01', 'HLA-A*68:02', 'HLA-A*68:23', 'HLA-A*69:01', 'HLA-A*80:01', 'HLA-B*07:01', 'HLA-B*07:02', 'HLA-B*08:01', 'HLA-B*08:02', 'HLA-B*08:03', 'HLA-B*14:02', 'HLA-B*15:01', 'HLA-B*15:02', 'HLA-B*15:03', 'HLA-B*15:09', 'HLA-B*15:17', 'HLA-B*15:42', 'HLA-B*18:01', 'HLA-B*27:01', 'HLA-B*27:03', 'HLA-B*27:04', 'HLA-B*27:05', 'HLA-B*27:06', 'HLA-B*27:20', 'HLA-B*35:01', 'HLA-B*35:03', 'HLA-B*35:08', 'HLA-B*37:01', 'HLA-B*38:01', 'HLA-B*39:01', 'HLA-B*40:01', 'HLA-B*40:02', 'HLA-B*42:01', 'HLA-B*44:01', 'HLA-B*44:02', 'HLA-B*44:03', 'HLA-B*45:01', 'HLA-B*45:06', 'HLA-B*46:01', 'HLA-B*48:01', 'HLA-B*51:01', 'HLA-B*53:01', 'HLA-B*54:01', 'HLA-B*57:01', 'HLA-B*58:01', 'HLA-B*73:01', 'HLA-B*83:01', 'HLA-C*03:03', 'HLA-C*03:04', 'HLA-C*04:01', 'HLA-C*05:01', 'HLA-C*06:02', 'HLA-C*07:01', 'HLA-C*07:02', 'HLA-C*08:02', 'HLA-C*12:03', 'HLA-C*14:02', 'HLA-C*15:02', 'Mamu-A*01:01', 'Mamu-A*02:01', 'Mamu-A*02:0102', 'Mamu-A*07:01', 'Mamu-A*07:0103', 'Mamu-A*11:01', 'Mamu-A*22:01', 'Mamu-A*26:01', 'Mamu-B*01:01', 'Mamu-B*03:01', 'Mamu-B*08:01', 'Mamu-B*10:01', 'Mamu-B*17:01', 'Mamu-B*17:04', 'Mamu-B*39:01', 'Mamu-B*52:01', 'Mamu-B*66:01', 'Mamu-B*83:01', 'Mamu-B*87:01', 'Patr-A*01:01', 'Patr-A*03:01', 'Patr-A*04:01', 'Patr-A*07:01', 'Patr-A*09:01', 'Patr-B*01:01', 'Patr-B*13:01', 'Patr-B*24:01']
+
+   # coding: utf-8
+
+   # In[22]:
+
+   import pandas
+   import numpy
+   import seaborn
+   import logging
+   from matplotlib import pyplot
+
+   import mhcflurry
+
+
+
+   # # Download data and models
+
+   # In[2]:
+
+   get_ipython().system('mhcflurry-downloads fetch')
+
+
+   # # Making predictions with `Class1AffinityPredictor`
+
+   # In[3]:
+
+   help(mhcflurry.Class1AffinityPredictor)
+
+
+   # In[4]:
+
+   downloaded_predictor = mhcflurry.Class1AffinityPredictor.load()
+
+
+   # In[5]:
+
+   downloaded_predictor.predict(allele="HLA-A0201", peptides=["SIINFEKL", "SIINFEQL"])
+
+
+   # In[6]:
+
+   downloaded_predictor.predict_to_dataframe(allele="HLA-A0201", peptides=["SIINFEKL", "SIINFEQL"])
+
+
+   # In[7]:
+
+   downloaded_predictor.predict_to_dataframe(alleles=["HLA-A0201", "HLA-B*57:01"], peptides=["SIINFEKL", "SIINFEQL"])
+
+
+   # In[8]:
+
+   downloaded_predictor.predict_to_dataframe(
+       allele="HLA-A0201",
+       peptides=["SIINFEKL", "SIINFEQL"],
+       include_individual_model_predictions=True)
+
+
+   # In[9]:
+
+   downloaded_predictor.predict_to_dataframe(
+       allele="HLA-A0201",
+       peptides=["SIINFEKL", "SIINFEQL", "TAAAALANGGGGGGGG"],
+       throw=False)  # Without throw=False, you'll get a ValueError for invalid peptides or alleles
+
+
+   # # Instantiating a `Class1AffinityPredictor`  from a saved model on disk
+
+   # In[10]:
+
+   models_dir = mhcflurry.downloads.get_path("models_class1", "models")
+   models_dir
+
+
+   # In[11]:
+
+   # This will be the same predictor we instantiated above. We're just being explicit about what models to load.
+   downloaded_predictor = mhcflurry.Class1AffinityPredictor.load(models_dir)
+   downloaded_predictor.predict(["SIINFEKL", "SIQNPEKP", "SYNFPEPI"], allele="HLA-A0301")
+
+
+   # # Fit a model: first load some data
+
+   # In[12]:
+
+   # This is the data the downloaded models were trained on
+   data_path = mhcflurry.downloads.get_path("data_curated", "curated_training_data.csv.bz2")
+   data_path
+
+
+   # In[13]:
+
+   data_df = pandas.read_csv(data_path)
+   data_df
+
+
+   # # Fit a model: Low level `Class1NeuralNetwork` interface
+
+   # In[14]:
+
+   # We'll use mostly the default hyperparameters here. Could also specify them as kwargs.
+   new_model = mhcflurry.Class1NeuralNetwork(layer_sizes=[16])
+   new_model.hyperparameters
+
+
+   # In[16]:
+
+   train_data = data_df.loc[
+       (data_df.allele == "HLA-B*57:01") &
+       (data_df.peptide.str.len() >= 8) &
+       (data_df.peptide.str.len() <= 15)
+   ]
+   get_ipython().magic('time new_model.fit(train_data.peptide.values, train_data.measurement_value.values)')
+
+
+   # In[17]:
+
+   new_model.predict(["SYNPEPII"])
+
+
+   # # Fit a model: high level `Class1AffinityPredictor` interface
+
+   # In[18]:
+
+   affinity_predictor = mhcflurry.Class1AffinityPredictor()
+
+   # This can be called any number of times, for example on different alleles, to build up the ensembles.
+   affinity_predictor.fit_allele_specific_predictors(
+       n_models=1,
+       architecture_hyperparameters={"layer_sizes": [16], "max_epochs": 10},
+       peptides=train_data.peptide.values,
+       affinities=train_data.measurement_value.values,
+       allele="HLA-B*57:01",
+   )
+
+
+   # In[19]:
+
+   affinity_predictor.predict(["SYNPEPII"], allele="HLA-B*57:01")
+
+
+   # # Save and restore the fit model
+
+   # In[20]:
+
+   get_ipython().system('mkdir /tmp/saved-affinity-predictor')
+   affinity_predictor.save("/tmp/saved-affinity-predictor")
+   get_ipython().system('ls /tmp/saved-affinity-predictor')
+
+
+   # In[21]:
+
+   affinity_predictor2 = mhcflurry.Class1AffinityPredictor.load("/tmp/saved-affinity-predictor")
+   affinity_predictor2.predict(["SYNPEPII"], allele="HLA-B*57:01")
+
+
+Supported alleles and peptide lengths
+=====================================
+
+Models released with the current version of MHCflurry (1.0.0) support
+peptides of length 8-15 and the following 124 alleles:
+
+   BoLA-6*13:01, Eqca-1*01:01, H-2-Db, H-2-Dd, H-2-Kb, H-2-Kd, H-2-Kk,
+   H-2-Ld, HLA-A*01:01, HLA-A*02:01, HLA-A*02:02, HLA-A*02:03,
+   HLA-A*02:05, HLA-A*02:06, HLA-A*02:07, HLA-A*02:11, HLA-A*02:12,
+   HLA-A*02:16, HLA-A*02:17, HLA-A*02:19, HLA-A*02:50, HLA-A*03:01,
+   HLA-A*11:01, HLA-A*23:01, HLA-A*24:01, HLA-A*24:02, HLA-A*24:03,
+   HLA-A*25:01, HLA-A*26:01, HLA-A*26:02, HLA-A*26:03, HLA-A*29:02,
+   HLA-A*30:01, HLA-A*30:02, HLA-A*31:01, HLA-A*32:01, HLA-A*32:07,
+   HLA-A*33:01, HLA-A*66:01, HLA-A*68:01, HLA-A*68:02, HLA-A*68:23,
+   HLA-A*69:01, HLA-A*80:01, HLA-B*07:01, HLA-B*07:02, HLA-B*08:01,
+   HLA-B*08:02, HLA-B*08:03, HLA-B*14:02, HLA-B*15:01, HLA-B*15:02,
+   HLA-B*15:03, HLA-B*15:09, HLA-B*15:17, HLA-B*15:42, HLA-B*18:01,
+   HLA-B*27:01, HLA-B*27:03, HLA-B*27:04, HLA-B*27:05, HLA-B*27:06,
+   HLA-B*27:20, HLA-B*35:01, HLA-B*35:03, HLA-B*35:08, HLA-B*37:01,
+   HLA-B*38:01, HLA-B*39:01, HLA-B*40:01, HLA-B*40:02, HLA-B*42:01,
+   HLA-B*44:01, HLA-B*44:02, HLA-B*44:03, HLA-B*45:01, HLA-B*45:06,
+   HLA-B*46:01, HLA-B*48:01, HLA-B*51:01, HLA-B*53:01, HLA-B*54:01,
+   HLA-B*57:01, HLA-B*58:01, HLA-B*73:01, HLA-B*83:01, HLA-C*03:03,
+   HLA-C*03:04, HLA-C*04:01, HLA-C*05:01, HLA-C*06:02, HLA-C*07:01,
+   HLA-C*07:02, HLA-C*08:02, HLA-C*12:03, HLA-C*14:02, HLA-C*15:02,
+   Mamu-A*01:01, Mamu-A*02:01, Mamu-A*02:0102, Mamu-A*07:01,
+   Mamu-A*07:0103, Mamu-A*11:01, Mamu-A*22:01, Mamu-A*26:01,
+   Mamu-B*01:01, Mamu-B*03:01, Mamu-B*08:01, Mamu-B*10:01, Mamu-B*17:01,
+   Mamu-B*17:04, Mamu-B*39:01, Mamu-B*52:01, Mamu-B*66:01, Mamu-B*83:01,
+   Mamu-B*87:01, Patr-A*01:01, Patr-A*03:01, Patr-A*04:01, Patr-A*07:01,
+   Patr-A*09:01, Patr-B*01:01, Patr-B*13:01, Patr-B*24:01
-- 
GitLab