Commit a84ef111 authored by Tim O'Donnell

working on docs

parent a6875fad
@@ -49,21 +49,19 @@ help:
 	@echo " coverage to run coverage check of the documentation (if enabled)"
 	@echo " dummy to check syntax errors of document sources"
-# Added by Tim:
+# Added by Tim
 .PHONY: generate
 generate:
 	sphinx-apidoc -M -f -o _build/ ../mhcflurry
 	mhcflurry-downloads fetch models_class1_pan
 	python generate_class1_pan.py --out-dir model-info
-# Added by Tim:
-.PHONY: readme
-readme: text
-	rm -f package_readme/readme.generated.txt
-	cat package_readme/readme_header.rst \
-		_build/text/package_readme/readme.template.txt \
-		> package_readme/readme.generated.txt
-	chmod 444 package_readme/readme.generated.txt # read only
+# Added by Tim
+.PHONY: generate_model_info
+generate_model_info:
+	sphinx-apidoc -M -f -o _build/ ../mhcflurry
+	mhcflurry-downloads fetch models_class1_pan
+	python generate_class1_pan.py --out-dir model-info
 .PHONY: clean
 clean:
@@ -72,6 +70,10 @@ clean:
 	mv $(BUILDDIR)/html /tmp/html-bk
 	rm -rf $(BUILDDIR)/*
 	mv /tmp/html-bk $(BUILDDIR)/html
+# Added by Tim
+.PHONY: clean_model_info
+clean_model_info:
 	rm -rf model-info
 .PHONY: html
...
 # MHCflurry documentation
-Due to our use of `sphinxcontrib-autorun2` we unfortunately require Python 2.7
-to build to the docs. Python 3 is not supported.
 To generate Sphinx documentation, from this directory run:
 ```
...
@@ -8,24 +8,14 @@ See also the :ref:`tutorial <commandline_tutorial>`.
 .. autoprogram:: mhcflurry.predict_command:parser
    :prog: mhcflurry-predict
+.. _mhcflurry-predict-scan:
+.. autoprogram:: mhcflurry.predict_scan_command:parser
+   :prog: mhcflurry-predict-scan
 .. _mhcflurry-downloads:
 .. autoprogram:: mhcflurry.downloads_command:parser
    :prog: mhcflurry-downloads
-.. _mhcflurry-class1-train-allele-specific-models:
-.. autoprogram:: mhcflurry.train_allele_specific_models_command:parser
-   :prog: mhcflurry-class1-train-allele-specific-models
-.. _mhcflurry-calibrate-percentile-ranks:
-.. autoprogram:: mhcflurry.calibrate_percentile_ranks_command:parser
-   :prog: mhcflurry-calibrate-percentile-ranks
-.. _mhcflurry-class1-select-allele-specific-models:
-.. autoprogram:: mhcflurry.select_allele_specific_models_command:parser
-   :prog: mhcflurry-class1-select-allele-specific-models
@@ -14,12 +14,12 @@ are distributed separately from the pip package and may be downloaded with the
 .. code-block:: shell

-    $ mhcflurry-downloads fetch models_class1
+    $ mhcflurry-downloads fetch models_class1_presentation

 Files downloaded with :ref:`mhcflurry-downloads` are stored in a platform-specific
 directory. To get the path to downloaded data, you can use:

-.. command-output:: mhcflurry-downloads path models_class1
+.. command-output:: mhcflurry-downloads path models_class1_presentation
    :nostderr:

 We also release a few other "downloads," such as curated training data and some
@@ -28,6 +28,10 @@ experimental models. To see what's available and what you have downloaded, run:
 .. command-output:: mhcflurry-downloads info
    :nostderr:

+Most users will only need ``models_class1_presentation``, however, as the
+presentation predictor includes a peptide / MHC I binding affinity (BA) predictor
+as well as an antigen processing (AP) predictor.
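To make the BA/AP composition added in this hunk concrete, here is a toy, self-contained sketch of how a binding affinity score and an antigen processing score might be combined into a single presentation score. The log-transform, weights, and bias below are made up for illustration; MHCflurry's actual combination model and its coefficients are learned from data.

```python
import math

def toy_presentation_score(affinity_nm, processing_score,
                           w_ba=1.0, w_ap=1.0, bias=0.0):
    """Illustrative combination of a BA prediction (in nM) and an AP score.

    Tighter binders have LOWER nM affinities, so the affinity is
    log-transformed into a 0-1-ish scale where larger is better, then
    combined with the processing score through a logistic function.
    The weights here are placeholders, not MHCflurry's fitted values.
    """
    log_affinity = 1.0 - math.log(affinity_nm) / math.log(50000.0)
    x = w_ba * log_affinity + w_ap * processing_score + bias
    return 1.0 / (1.0 + math.exp(-x))

# A strong binder that is also well-processed should outscore a weak,
# poorly-processed one.
strong = toy_presentation_score(affinity_nm=25.0, processing_score=0.9)
weak = toy_presentation_score(affinity_nm=20000.0, processing_score=0.1)
print(strong > weak)
```

This is only meant to show why a single download (``models_class1_presentation``) suffices: the BA and AP component predictors are bundled inside it.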
 .. note::
     The code we use for *generating* the downloads is in the
@@ -37,8 +41,9 @@ experimental models. To see what's available and what you have downloaded, run:
 Generating predictions
 ----------------------
-The :ref:`mhcflurry-predict` command generates predictions from the command-line.
-By default it will use the pre-trained models you downloaded above; other
+The :ref:`mhcflurry-predict` command generates predictions for individual peptides
+(as opposed to scanning protein sequences for epitopes).
+By default it will use the pre-trained models you downloaded above. Other
 models can be used by specifying the ``--models`` argument.
 Running:
@@ -68,6 +73,38 @@ on the Keras backend and other details.
 In most cases you'll want to specify the input as a CSV file instead of passing
 peptides and alleles as commandline arguments. See :ref:`mhcflurry-predict` docs.
+Scanning protein sequences for predicted MHC I ligands
+------------------------------------------------------
+Starting in version 1.6.0, MHCflurry supports scanning proteins for MHC I binding
+peptides using the ``mhcflurry-predict-scan`` command.
+
+We'll generate predictions across ``example.fasta``, a FASTA file with two short
+sequences:
+
+.. literalinclude:: /example.fasta
+
+Here's the ``mhctools`` invocation.
+
+.. command-output::
+    mhctools
+        --mhc-predictor mhcflurry
+        --input-fasta-file example.fasta
+        --mhc-alleles A02:01,A03:01
+        --mhc-peptide-lengths 8,9,10,11
+        --extract-subsequences
+        --output-csv /tmp/subsequence_predictions.csv
+    :ellipsis: 2,-2
+    :nostderr:
+
+This will write a file giving predictions for all subsequences of the specified lengths:
+
+.. command-output::
+    head -n 3 /tmp/subsequence_predictions.csv
+
+See the :ref:`mhcflurry-predict-scan` docs for more options.
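The subsequence extraction this section relies on (the ``--extract-subsequences`` behavior) can be sketched in a few lines: enumerate every subsequence of the requested lengths, which is exactly the set of candidate peptides a scanning predictor scores. The helper below is illustrative only, not code from MHCflurry or mhctools.

```python
def extract_subsequences(sequence, lengths=(8, 9, 10, 11)):
    """Yield (start_position, peptide) for every subsequence of the
    given lengths, as a scanning predictor would enumerate them."""
    for length in lengths:
        for start in range(len(sequence) - length + 1):
            yield (start, sequence[start:start + length])

protein = "MFVFLVLLPLVSSQCVNL"  # toy 18-residue fragment, not from example.fasta
peptides = list(extract_subsequences(protein))
print(len(peptides))  # 38 candidate peptides: 11 + 10 + 9 + 8
```

This also explains the size of the output CSV: even short sequences generate dozens of rows, one per (peptide, allele) combination.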
 Fitting your own models
 -----------------------
@@ -115,42 +152,6 @@ It looks like this:
    :nostderr:
-Scanning protein sequences for predicted epitopes
--------------------------------------------------
-The `mhctools <https://github.com/hammerlab/mhctools>`__ package
-provides support for scanning protein sequences to find predicted
-epitopes. It supports MHCflurry as well as other binding predictors.
-Here is an example.
-First, install ``mhctools`` if it is not already installed:
-.. code-block:: shell
-    $ pip install mhctools
-We'll generate predictions across ``example.fasta``, a FASTA file with two short
-sequences:
-.. literalinclude:: /example.fasta
-Here's the ``mhctools`` invocation. See ``mhctools -h`` for more information.
-.. command-output::
-    mhctools
-        --mhc-predictor mhcflurry
-        --input-fasta-file example.fasta
-        --mhc-alleles A02:01,A03:01
-        --mhc-peptide-lengths 8,9,10,11
-        --extract-subsequences
-        --output-csv /tmp/subsequence_predictions.csv
-    :ellipsis: 2,-2
-    :nostderr:
-This will write a file giving predictions for all subsequences of the specified lengths:
-.. command-output::
-    head -n 3 /tmp/subsequence_predictions.csv
 Environment variables
...
@@ -54,7 +54,7 @@ extensions = [
     'sphinx.ext.viewcode',
     'sphinx.ext.githubpages',
     'numpydoc',
-    'sphinxcontrib.autorun2',
+    'sphinx_autorun',
     'sphinxcontrib.programoutput',
     'sphinxcontrib.autoprogram',
     'sphinx.ext.githubpages',
@@ -76,7 +76,7 @@ master_doc = 'index'
 # General information about the project.
 project = 'MHCflurry'
-copyright = '2019, Timothy O\'Donnell'
+copyright = 'Timothy O\'Donnell'
 author = 'Timothy O\'Donnell'
 # The version info for the project you're documenting, acts as replacement for
...
@@ -12,36 +12,24 @@ from os import mkdir
 import pandas
 import logomaker
+import tqdm
 from matplotlib import pyplot
 from mhcflurry.downloads import get_path
 from mhcflurry.amino_acid import COMMON_AMINO_ACIDS
+from mhcflurry.class1_affinity_predictor import Class1AffinityPredictor
 AMINO_ACIDS = sorted(COMMON_AMINO_ACIDS)
 parser = argparse.ArgumentParser(usage=__doc__)
 parser.add_argument(
-    "--class1-models-dir-with-ms",
+    "--class1-models-dir",
+    "--class1-models",
     metavar="DIR",
     default=get_path(
         "models_class1_pan", "models.combined", test_exists=False),
     help="Class1 models. Default: %(default)s",
 )
-parser.add_argument(
-    "--class1-models-dir-no-ms",
-    metavar="DIR",
-    default=get_path(
-        "models_class1_pan", "models.no_mass_spec", test_exists=False),
-    help="Class1 models. Default: %(default)s",
-)
-parser.add_argument(
-    "--class1-models-dir-refined",
-    metavar="DIR",
-    default=get_path(
-        "models_class1_pan_refined", "models.affinity", test_exists=False),
-    help="Class1 refined models. Default: %(default)s",
-)
 parser.add_argument(
     "--logo-cutoff",
     default=0.01,
@@ -84,6 +72,9 @@ parser.add_argument(
 def model_info(models_dir):
+    allele_to_sequence = Class1AffinityPredictor.load(
+        models_dir).allele_to_sequence
     length_distributions_df = pandas.read_csv(
         join(models_dir, "length_distributions.csv.bz2"))
     frequency_matrices_df = pandas.read_csv(
@@ -104,10 +95,21 @@ def model_info(models_dir):
     normalized_frequency_matrices.loc[:, AMINO_ACIDS] = (
         normalized_frequency_matrices[AMINO_ACIDS] / distribution)
+    sequence_to_alleles = defaultdict(list)
+    for allele in normalized_frequency_matrices.allele.unique():
+        sequence = allele_to_sequence[allele]
+        sequence_to_alleles[sequence].append(allele)
+    allele_equivalance_classes = sorted([
+        sorted(equivalence_group)
+        for equivalence_group in sequence_to_alleles.values()
+    ], key=lambda equivalence_group: equivalence_group[0])
     return {
         'length_distributions': length_distributions_df,
         'normalized_frequency_matrices': normalized_frequency_matrices,
         'observations_per_allele': observations_per_allele,
+        'allele_equivalance_classes': allele_equivalance_classes,
     }
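The equivalence-class grouping added in this hunk can be demonstrated standalone: alleles that map to the same MHC pseudosequence receive identical predictions from a pan-allele model, so they are collapsed into one class. The pseudosequence strings below are fabricated placeholders, not real MHC pseudosequences.

```python
from collections import defaultdict

# Hypothetical allele -> pseudosequence map; two alleles deliberately share
# a sequence so they land in the same equivalence class.
allele_to_sequence = {
    "HLA-A*02:01": "YFAMYQENMAHTD",
    "HLA-A*02:09": "YFAMYQENMAHTD",  # same placeholder as A*02:01
    "HLA-B*07:02": "YYSEYRNICAKTD",
}

sequence_to_alleles = defaultdict(list)
for allele, sequence in allele_to_sequence.items():
    sequence_to_alleles[sequence].append(allele)

# Sort alleles within each class, then sort classes by their first member,
# mirroring the logic in model_info() above.
allele_equivalence_classes = sorted(
    (sorted(group) for group in sequence_to_alleles.values()),
    key=lambda group: group[0])
print(allele_equivalence_classes)
```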
@@ -191,7 +193,7 @@ def go(argv):
     mkdir(args.out_dir)
     predictors = [
-        ("combined", args.class1_models_dir_with_ms),
+        ("combined", args.class1_models_dir),
     ]
     info_per_predictor = OrderedDict()
     alleles = set()
@@ -224,7 +226,6 @@ def go(argv):
     w(".. contents:: :local:", "")
-
     def image(name):
         if name is None:
             return ""
@@ -234,7 +235,7 @@ def go(argv):
     if args.max_alleles:
         alleles = alleles[:args.max_alleles]
-    for allele in alleles:
+    for allele in tqdm.tqdm(alleles):
         w(allele, "-" * 80, "")
         for (label, info) in info_per_predictor.items():
             length_distribution = info["length_distributions"]
...
@@ -5,7 +5,8 @@ MHCflurry is an open source package for peptide/MHC I binding affinity predictio
 provides competitive accuracy with a fast and documented implementation.
 You can download pre-trained MHCflurry models fit to affinity measurements
-deposited in IEDB or train a MHCflurry predictor on your own data.
+deposited in IEDB (and a few other sources)
+or train a MHCflurry predictor on your own data.
 Currently only allele-specific prediction is implemented, in which separate models
 are trained for each allele. The released models therefore support a fixed set of common
...
 sphinx
-sphinxcontrib-autorun2
+sphinxcontrib-autorun
 sphinxcontrib-programoutput
 sphinxcontrib-autoprogram
 sphinx-rtd-theme
@@ -9,3 +9,4 @@ mhctools
 pydot
 tabulate
 logomaker
+tqdm