Running
The TTT Proteotyping Pipeline can be run either manually, one step at a time,
or automatically under the control of Snakemake. The automation ensures that
each RAW input file is taken through all the steps required to produce the
final output. Together with the supplied snakemake_crontab_script.sh, it
provides a completely hands-off way of analyzing proteomics samples.
Note
The Snakemake workflow can only be run on Linux computers, as it depends on some Linux command line features.
Work directory
The Snakemake workflow requires a work directory containing the following folder structure:
0.raw
1.mzXML
2.xml
3.fasta
4.blast8
5.results
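As a minimal sketch, the folder structure above could be created with a short shell loop. The `WORK_DIR` variable and its default value are assumptions made for illustration; the actual work directory is whatever path you specify in the configfile.

```shell
#!/usr/bin/env bash
# WORK_DIR is a hypothetical variable; point it at the work directory
# that you specify in your configfile.
WORK_DIR="${WORK_DIR:-./ttt_workdir}"

# Create one subfolder per pipeline stage, as listed above.
for step_dir in 0.raw 1.mzXML 2.xml 3.fasta 4.blast8 5.results; do
    mkdir -p "$WORK_DIR/$step_dir"
done

# Show the resulting layout.
ls -1 "$WORK_DIR"
```

Because `mkdir -p` does not fail on existing directories, the loop is safe to rerun on a work directory that already holds data.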
The reference data required to run the entire workflow is usually put in a
single directory (or symlinked there), but it can (in theory) be located
anywhere in the file system. The location of every required file must be
specified in the TTT_pipeline_snakemake_config.yaml file, and this file must
be passed on the command line when invoking the workflow.
Run the Snakemake workflow
To run the Snakemake workflow, ensure that a suitable Python/Conda environment
is activated in which all the proteotyping programs and scripts are available
in PATH. The minimal command line required to start the workflow is this:
snakemake --snakefile SNAKEFILE --configfile CONFIGFILE
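Before starting the workflow, it may help to confirm that the expected programs really are in PATH. The following is only a sketch: `check_in_path` is a hypothetical helper, and the list contains just the two tools named on this page (`snakemake` and `flock`); the pipeline's own programs and scripts would need to be added to it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report every named program that cannot be found in PATH.
# Returns 0 if all programs were found, 1 otherwise.
check_in_path() {
    local prog missing=0
    for prog in "$@"; do
        if ! command -v "$prog" >/dev/null 2>&1; then
            echo "missing from PATH: $prog" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Only snakemake and flock are checked here; extend the list with the
# proteotyping programs and scripts that your installation provides.
if check_in_path snakemake flock; then
    echo "environment looks ready"
else
    echo "activate the Conda environment before starting the workflow"
fi
```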
As the work directory is specified in the configfile, the command can in theory
be run from anywhere in the file system. It is recommended, however, to invoke
the Snakemake workflow via the included snakemake_crontab_script.sh, which sets
some environment parameters to ensure reliable operation. It uses the Linux
command flock to ensure that only one instance of the workflow runs at any
time.
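The single-instance guarantee can be illustrated with a small sketch of the `flock` pattern. This is not the content of the supplied script: the lock file path and the `run_exclusively` helper are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# LOCK_FILE is an assumed location; any path writable by the user works.
LOCK_FILE="${LOCK_FILE:-/tmp/ttt_snakemake_example.lock}"

# Hypothetical helper: run a command while holding an exclusive lock.
# -n makes flock fail immediately instead of waiting when the lock is held.
run_exclusively() {
    flock -n "$LOCK_FILE" "$@"
}

# In real use the guarded command would be the workflow itself, e.g.:
#   run_exclusively snakemake --snakefile SNAKEFILE --configfile CONFIGFILE

# Demonstration: a second invocation is refused while the first holds the lock.
run_exclusively sleep 1 &
sleep 0.2
if run_exclusively true; then
    echo "lock acquired"
else
    echo "another instance is running"
fi
wait
```

With `-n`, a cron job that fires while an earlier run is still in progress exits immediately rather than queueing behind it, so overlapping schedules cannot start parallel workflows.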
Automatic invocation via crontab
The workflow can be invoked automatically at set times using the Linux built-in
crontab. To edit your personal user’s crontab, type crontab -e at the command
prompt. Add something like the following line to make the Snakemake workflow
check for new files to analyze three times daily (00:00, 12:00, 18:00):
0 0,12,18 * * * /bin/bash /PATH/TO/snakemake_crontab_script.sh
Make sure to modify the configfile (TTT_pipeline_snakemake_config.yaml) and
the crontab script file (snakemake_crontab_script.sh) to match your
environment.