Configuration¶
This page explains each value of Metaphor’s config settings, that is, the values defined in the config YAML file.
TOP LEVEL
These settings are valid for all steps in the workflow.
samples: samples.csv
QC
fastp:
activate: True
length_required: 50
cut_mean_quality: 30
extra: "--detect_adapter_for_pe"
merge_reads:
activate: True
host_removal:
activate: False
reference: ""
fastqc:
activate: True
multiqc:
activate: True
ASSEMBLY
coassembly: False Whether to perform coassembly (also known as pooled assembly). If this is true, all samples are merged together and assembled into a single file of contigs.
megahit:
preset: "meta-large"
min_contig_len: 1000
remove_intermediate_contigs: True
rename_contigs:
activate: True Whether to rename contigs so contigs and mapping files (.bam) can be imported into Anvi’o. We suggest you keep this on.
awk_command: awk '/^>/{{gsub(" |\\\\.|=", "_", $0); print $0; next}}{{print}}' {input} > {output} This is to prevent errors with the Snakemake –lint command. Don’t change it unless you know what you’re doing.
metaquast:
activate: False
coassembly_reference: "" Reference FASTA file for Metaquast to use as reference. Only required if coassembly is True.
ANNOTATION
prodigal:
activate: True
mode: "meta"
quiet: True
genes: False
scores: False
prokka:
activate: False
args: "--quiet --force"
diamond:
db: "COG2020/cog-20.dmnd" Will try to create from db_source if it doesn’t exist.
db_source: "COG2020/cog-20.fa.gz"
output_type: 6
output_format: "qseqid sseqid stitle evalue bitscore staxids sscinames"
cog_functional_parser:
activate: True
db: "COG2020"
lineage_parser:
activate: True
taxonmap: "COG2020/cog-20.taxonmap.tsv"
rankedlineage: "taxonomy/rankedlineage.dmp"
names: "taxonomy/names.dmp" Path of names file of NCBI Taxonomy
nodes: "taxonomy/nodes.dmp" Path of nodes file of NCBI Taxonomy
download_url: "https://ftp.ncbi.nih.gov/pub/taxonomy/new_taxdump/new_taxdump.tar.gz" URL to download NCBI Taxonomy database
plot_cog_functional:
activate: True
filter_categories: True
categories_cutoff: 0.01 Remove categories with mean abundance across samples smaller than this value
plot_taxonomies:
activate: True
tax_cutoff: 20 Only show the N most abundant taxa for any rank. Leave as 0 for no filtering. Low abundance taxa will be grouped as ‘Low abundance’.
colormap: "tab20c" Which matplotlib colormap to use
BINNING
cobinning: True Whether to perform cobinning. When this is true, only one binning group will be used. If False, samples will be binned according to their ‘group’ column.
vamb:
activate: True
minfasta: 100000
batchsize: 256
metabat2:
activate: True
seed: 0
preffix: "bin" Preffix of each bin, e.g. bin.1.fa, bin.2.fa, etc.
concoct:
activate: True
das_tool:
activate: True
score_threshold: 0.5
bins_report: True
POSTPROCESSING
postprocessing:
activate: True
runtime_unit: "m"
runtime_cutoff: 5
memory_unit: "max_vms"
memory_cutoff: 1
memory_gb: True