Documentation

TopModel Documentation Page

Submitting a Job

The job results will be available for 7 days after the calculations are finished. After that, they will be deleted. User-entered data will not be shared with any third parties.

Email

Entering an email address is optional. If you provide an email address, a link to the results webpage will be sent to that address. Your email address will not be shared with any third parties.

Sequence Input

The sequence can be provided by pasting it into the text field or by uploading a FASTA file. TopModel predicts only protein structures; therefore, every symbol that is not a capital-letter amino acid symbol is turned into an alanine ('A'). FASTA headers are removed. The length of the entered sequence must be between 30 and 1000 residues. The server accepts only a single sequence per job, i.e., only the first sequence in a FASTA file is used. If you want to model multiple proteins, please submit the sequences separately.
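If you want to check a sequence before submitting it, the rules above can be approximated locally. The following Python snippet is a minimal sketch and not the server's actual code: the helper name prepare_sequence is hypothetical, the accepted alphabet is assumed to be the 20 standard one-letter amino acid codes, and whitespace inside sequence lines is assumed to be dropped rather than converted.

```python
# Assumption: the accepted alphabet is the 20 standard one-letter codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MIN_LEN, MAX_LEN = 30, 1000


def prepare_sequence(text: str) -> str:
    """Hypothetical helper that mimics the documented input handling."""
    residues = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if residues:
                # A header after sequence data starts a second record: stop.
                break
            continue  # the header of the first record is dropped
        # Assumption: whitespace inside a sequence line is ignored.
        residues.extend("".join(line.split()))
    # Every symbol that is not a capital amino acid letter becomes alanine.
    sequence = "".join(c if c in AMINO_ACIDS else "A" for c in residues)
    if not MIN_LEN <= len(sequence) <= MAX_LEN:
        raise ValueError(
            f"sequence length {len(sequence)} is outside {MIN_LEN}-{MAX_LEN}"
        )
    return sequence
```

Calling prepare_sequence on the content of a multi-record FASTA file returns only the first record, with lowercase letters and other non-standard symbols replaced by 'A'.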

Giving a Job Name

An optional job name can be set. The downloadable results files will then have the job name as an identifier. The job name must not be longer than 30 characters and may only consist of letters, numbers, and the underscore ('_') symbol.
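The job name rule can be expressed as a single regular expression. This is a sketch under stated assumptions: "letters" is taken to mean ASCII letters, and the function name is_valid_job_name is hypothetical.

```python
import re

# Assumption: "letters" means ASCII letters; the documentation does not
# say whether accented or non-Latin letters are accepted.
JOB_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,30}")


def is_valid_job_name(name: str) -> bool:
    """Hypothetical check; an empty name passes because it is optional."""
    return name == "" or JOB_NAME_PATTERN.fullmatch(name) is not None
```

For example, "hbb_run_01" passes this check, whereas "my job" (contains a space) and any name longer than 30 characters do not.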

Job Types

You can choose from four job types:

Normal run: TopModel will find templates, build alignments, score models and refine them.
Protect Templates: Normal run, but specified templates will not be removed during threading.
Exclude Templates: Normal run, but specified templates will be removed during threading.
Specify Templates: TopModel will skip threading and refinement. This is fast but only accurate if the specified templates are highly similar to the target sequence (> 50% identity).

Entering PDB IDs

If you choose any job type other than "Normal run", you have to enter the PDB IDs of the templates. The server accepts PDB IDs only in a specific format. A PDB ID may consist of up to 6 characters and has to start with a digit. Both upper-case and lower-case letters are accepted.
For the "Protect templates" and "Exclude templates" job types, the chain information is not needed. Therefore, any entered PDB ID longer than 4 characters will be cut down to 4 characters automatically (e.g., 3N8V_A -> 3N8V).
For the "Specify templates" job type, the chain information must be encoded in the fifth or sixth character (e.g., 3N8VA or 3N8V_A). If you use 6 characters to encode the chain information, the fifth character has to be an underscore ('_'). The individual PDB IDs must be comma-separated.
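The format rules for the template field can be sketched in Python as follows. This illustrates the rules as described here and is not the server's implementation: the function name parse_pdb_ids and the job type keywords "protect", "exclude", and "specify" are hypothetical, the ID is assumed to be exactly 4 alphanumeric characters, the chain identifier is assumed to be a single alphanumeric character, and the case of the input is left unchanged.

```python
import re

# Assumption: the ID itself is a digit followed by 3 alphanumeric characters.
BASE_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")


def parse_pdb_ids(field: str, job_type: str) -> list[str]:
    """Hypothetical parser for the comma-separated template field."""
    ids = []
    for raw in field.split(","):
        token = raw.strip()
        if len(token) > 6 or BASE_ID.fullmatch(token[:4]) is None:
            raise ValueError(f"not a valid PDB ID: {token!r}")
        pdb_id, rest = token[:4], token[4:]
        if job_type in ("protect", "exclude"):
            # Chain information is not needed: cut down to 4 characters.
            ids.append(pdb_id)
            continue
        # "specify": chain as fifth character (3N8VA) or as sixth
        # character after an underscore (3N8V_A).
        if len(rest) == 1 and rest.isalnum():
            chain = rest
        elif len(rest) == 2 and rest[0] == "_" and rest[1].isalnum():
            chain = rest[1]
        else:
            raise ValueError(f"missing or malformed chain in {token!r}")
        ids.append(f"{pdb_id}_{chain}")
    return ids
```

With this sketch, the input "3N8V_A, 1a3nB" for the "specify" job type yields the IDs 3N8V_A and 1a3n_B, and the input "3N8V_A,1A3N" for the "exclude" job type yields 3N8V and 1A3N.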

Example Run

If you want to try out the server without providing a sequence, an example run is available. Clicking submit after choosing the example sequence will present a link to a previously predicted structure. The protein used in this example is the hemoglobin subunit beta (https://www.uniprot.org/uniprot/P68871).

Interpreting the Results

Sequence Alignment

The sequence alignment is calculated by TopAligner (see the TopModel paper for an in-depth description). It is a structure-based consensus alignment between the template structures and the final, refined model from TopModel. All templates identified by TopThreader that were found to be compatible (TM-Score > 0.5) with the final model from TopModel are used for the alignment.
The last line shows the sequence conservation on a scale from 0 to 9: 9 corresponds to more than 95 % conservation, 0 corresponds to less than 5 % conservation.

TopScore

TopScore is a prediction of the lDDT error (defined as 1-lDDT score) of the protein. TopScore is both a global score (the estimated error of the whole protein) and a local score (the estimated error of each residue in the model). Low scores (blue/cyan color) correspond to residues and models with a low estimated error, while high scores (yellow/orange/red color) correspond to residues and models with a high estimated error.

TopScore Single

Like TopScore, TopScoreSingle is a prediction of the lDDT error. The only difference between the two is that TopScore considers clustering information. This makes TopScore sensitive to the quality of the entire model ensemble, whereas TopScoreSingle is independent of the model ensemble.

Identity

The sequence identity is the percentage of aligned residue pairs between the template and the target sequence in which the template residue is identical to the target residue.

Coverage

The sequence coverage is the percentage of the target sequence that has a residue-match in the alignment to the template structure.
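The identity and coverage definitions above can be illustrated with a short Python sketch. It assumes a pairwise alignment given as two gapped strings of equal length with '-' as the gap symbol; the function name identity_and_coverage and this input format are assumptions made for illustration and are not part of the TopModel output.

```python
def identity_and_coverage(target: str, template: str, gap: str = "-") -> tuple[float, float]:
    """Return (identity %, coverage %) for two aligned, gapped sequences."""
    if len(target) != len(template):
        raise ValueError("aligned sequences must have the same length")
    # Number of residues in the target sequence (gaps excluded).
    target_length = sum(1 for c in target if c != gap)
    if target_length == 0:
        raise ValueError("target sequence contains no residues")
    # Columns in which both target and template have a residue.
    aligned = [(a, b) for a, b in zip(target, template) if a != gap and b != gap]
    identical = sum(1 for a, b in aligned if a == b)
    identity = 100.0 * identical / len(aligned) if aligned else 0.0
    coverage = 100.0 * len(aligned) / target_length
    return identity, coverage
```

For the target MVHLTPEE aligned to the template MVKL--EE, six of the eight target residues are aligned, which gives a coverage of 75 %, and five of those six pairs are identical, which gives an identity of about 83.3 %.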

Predicted TM-Score

The predicted TM-Score is the score used by TopModel to rank templates. This score is calculated by multiple neural networks that use information from primary threaders, structural similarity between templates, and model quality predictions from TopScore and TopScoreSingle to predict the structural similarity between the template and the unknown native structure.

Threader

This column shows which primary threaders identified the given template. For a detailed list of the primary threaders, see the TopModel paper.

Structure

The structure of the template is displayed with the NGL Viewer. The structure is colored from the N-terminus (blue) to the C-terminus (red).

Support

If you have any questions or suggestions, please write an email to cpcweb[at]uni-duesseldorf.de.

Literature

Mulnaes, D., Porta, N., Clemens, R., Apanasenko, I., Reiners, J., Gremer, L., Neudecker, P., Smits, S.H.J., Gohlke, H.
TopModel: Template-based protein structure prediction at low sequence identity using top-down consensus and deep neural networks.
J. Chem. Theory Comput. 2020, 16, 1953-1967.

Mulnaes, D., Koenig, F., Gohlke, H.
TopSuite webserver: A meta-suite for deep learning-based protein structure and quality prediction.
J. Chem. Inf. Model. 2021, DOI: 10.1021/acs.jcim.0c01202

This server uses the NGL Viewer for visualization.
Rose, A. S., Bradley, A. R., Valasatava, Y., Duarte, J. M., Prlić, A., & Rose, P. W. (2016, July).
Web-based molecular graphics for large complexes.
In Proceedings of the 21st international conference on Web3D technology (pp. 185-186). ACM.

Rose, A. S., & Hildebrand, P. W. (2015). NGL Viewer: a web application for molecular visualization.
Nucleic acids research, 43(W1), W576-W579.
