Methods
Effective bundles various tools to predict bacterial secreted proteins based on their sequence:
- EffectiveT3: prediction of Type III secretion signals
- EffectiveCCBD: detection of conserved binding domains of Type III chaperones
- EffectiveELD: secretion-system-independent prediction of secreted proteins based on eukaryotic-like domains
Type IV secreted proteins can be recognized by their C-terminal signal sequence. The program T4SEpre contains multiple models representing C-terminal sequential and position-specific amino acid compositions, possible motifs and structural features. Because the T4SEpre model based on protein secondary structure (Sse) is computationally very expensive, we have used only the sequence-based models T4SEpre_psAac and T4SEpre_bpbAac.
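To illustrate what a C-terminal amino acid composition is, the following minimal sketch computes relative residue frequencies over the last residues of a protein. It is not the T4SEpre implementation: the function name `c_terminal_composition` and the default window of 100 residues are assumptions chosen for illustration only.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def c_terminal_composition(sequence, window=100):
    """Return relative amino acid frequencies in the C-terminal window.

    `window` is a hypothetical parameter for this sketch; it is not
    taken from T4SEpre.
    """
    # Take the last `window` residues (or the whole sequence if shorter).
    tail = sequence.upper()[-window:]
    # Count only standard residues; ambiguous codes such as X are skipped.
    counts = Counter(residue for residue in tail if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}
```

A feature vector of this kind could serve as input to a classifier; the actual T4SEpre models additionally use position-specific information that this sketch does not capture.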
Multiple lines of evidence suggest that host cell organelles are targets of bacterial secreted proteins. Effective therefore includes the program Predotar, a tool that rapidly screens proteins for N-terminal targeting sequences and predicts their subcellular localization in eukaryotic host cells.
Beyond the analysis of arbitrary protein sequence collections, the new release of Effective also provides a “genome mode”, in which protein sequences from nearly complete genomes and/or metagenomic bins can be screened for the presence of three important secretion systems (types III, IV and VI). The genome mode utilizes several programs and databases:
- EffectiveELD: secretion-system-independent prediction of secreted proteins based on eukaryotic-like domains, with Z-score refinement
- CheckM: estimates the completeness of a genome based on single-copy marker genes
- COGnitor: predicts orthologous group assignments of protein sequences
- EggNOG 4.0: database of Clusters of Orthologous Groups (COGs) and Non-supervised Orthologous Groups (NOGs)
- EffectiveS346: predicts the presence and putative functionality of type III, IV and VI secretion systems
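The idea of estimating genome completeness from single-copy marker genes can be sketched as the percentage of expected markers that are found at least once. This is a deliberately simplified illustration and not CheckM's algorithm, which works with lineage-specific marker sets; the function name `estimate_completeness` and the marker identifiers in the usage example are hypothetical.

```python
def estimate_completeness(expected_markers, observed_counts):
    """Return the percentage of expected single-copy markers that were found.

    expected_markers: iterable of marker gene identifiers (hypothetical names).
    observed_counts: dict mapping marker identifier -> number of hits
                     in the genome or metagenomic bin.
    """
    expected = set(expected_markers)
    if not expected:
        raise ValueError("expected_markers must not be empty")
    # A marker counts as present if it was detected at least once.
    found = sum(1 for marker in expected if observed_counts.get(marker, 0) >= 1)
    return 100.0 * found / len(expected)


# Usage with made-up marker identifiers: three of four markers are present.
markers = ["marker_a", "marker_b", "marker_c", "marker_d"]
hits = {"marker_a": 1, "marker_b": 1, "marker_c": 2}
completeness = estimate_completeness(markers, hits)
```

A genome or bin with a low value of this kind may lack secretion system components simply because the corresponding genes were not assembled, which is why completeness matters when interpreting an absent secretion system.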
Documentation and supplementary data for the methods that were developed specifically for Effective are provided below.