Introduction to molecular visualization

Visualizing biomolecular structures greatly facilitates the understanding of molecular biology and biochemistry. Dynamics, chemical reactions, and macroscopic properties can be understood and interpreted at the atomic level. Since the structure of myoglobin was solved crystallographically in 1957, different approaches to representing structures have been developed. Those from the field of computer graphics have been the most successful.

Molecular models simplify the complexity of molecules in order to make them easier to learn. By focusing on one aspect at a time, one can understand systematically how the molecular machines work. Finally, all the information can be integrated into a single view. This is of paramount importance in a teaching setting.


PDB stands for "Protein Data Bank". It is a universal database of biological macromolecular structures. In principle, every structure that has been made public can be found there. Each structure is identified by a unique four-character alphanumeric code (for example, 4DFR). The standard way to access the PDB is through its web interface. Structures are stored in files with a specific layout: first, a header containing information about the system under study, experimental conditions, journal citation, etc.; second, a body with the actual coordinates of the atoms. Here is an excerpt from 4DFR (dihydrofolate reductase complexed with methotrexate).

HEADER    OXIDO-REDUCTASE                         25-JUN-82   4DFR      4DFR   3
COMPND   2 METHOTREXATE                                                 4DFR   5
SOURCE    (ESCHERICHIA $COLI B), STRAIN /MB1428$,                       4DFRF  1
SOURCE   2 A METHOTREXATE-RESISTANT MUTANT                              4DFR   7
AUTHOR    D.J.FILMAN,D.A.MATTHEWS,J.T.BOLIN,J.KRAUT                     4DFR   8
ATOM      1  N   MET A   1      24.293  59.579   4.215  1.00 39.20      4DFR 268
ATOM      2  CA  MET A   1      25.127  58.554   4.958  1.00 37.00      4DFR 269
ATOM      3  C   MET A   1      24.186  58.457   6.297  1.00 35.50      4DFR 270
ATOM      4  O   MET A   1      23.827  59.402   6.804  1.00 34.50      4DFR 271
ATOM      5  CB  MET A   1      26.311  59.103   5.385  1.00 40.80      4DFR 272
ATOM      6  CG  MET A   1      27.346  58.263   5.966  1.00 46.90      4DFR 273
ATOM      7  SD  MET A   1      28.097  59.208   7.157  1.00 52.50      4DFR 274
ATOM      8  CE  MET A   1      28.875  60.685   6.576  1.00 53.80      4DFR 275
ATOM      9  N   ILE A   2      23.920  57.222   6.665  1.00 32.60      4DFR 276
ATOM     10  CA  ILE A   2      23.091  56.867   7.886  1.00 30.00      4DFR 277
ATOM     11  C   ILE A   2      24.088  56.447   8.960  1.00 26.30      4DFR 278
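
The first columns of each line labeled "ATOM" contain the serial number of the atom, the name of the atom, the name of the residue, the chain identifier, the residue number in the sequence and the coordinates (x, y, z) in ångströms. The two numbers that follow the coordinates are the occupancy and the temperature factor (B-factor).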
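
As a minimal sketch of how these columns can be read programmatically, the Python snippet below slices one "ATOM" line at the fixed column positions used by the PDB format. The names Atom and parse_atom_line are illustrative choices, not part of any library, and the sketch ignores fields such as alternate locations and insertion codes.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int       # atom serial number
    name: str         # atom name, e.g. "CA"
    res_name: str     # residue name, e.g. "MET"
    chain: str        # chain identifier
    res_seq: int      # residue number in the sequence
    x: float          # coordinates in angstroms
    y: float
    z: float
    occupancy: float
    b_factor: float   # temperature factor


def parse_atom_line(line: str) -> Atom:
    """Parse one fixed-width ATOM record into its fields."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    # PDB records are column-based, so slice by position instead of splitting
    # on whitespace (neighbouring fields may touch each other).
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
    )


# First ATOM record of the 4DFR excerpt.
SAMPLE = "ATOM      1  N   MET A   1      24.293  59.579   4.215  1.00 39.20      4DFR 268"
atom = parse_atom_line(SAMPLE)
print(atom.name, atom.res_name, atom.chain, atom.res_seq, (atom.x, atom.y, atom.z))
```

Slicing by column rather than splitting on spaces matters in practice, because large coordinates or four-letter atom names can leave no blank between two fields.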

The easiest way to access the PDB is via the PDB code of a structure, which is often given in journal articles. If the PDB code is unknown, the database can also be searched by keyword or by author name.
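
Access by code can also be scripted. The sketch below checks that a string looks like a four-character PDB code and builds a download address from it. Both the pattern (a digit followed by three letters or digits) and the address template (the RCSB file server) are assumptions made for this illustration and are not taken from the text above; pdb_download_url is a hypothetical helper name.

```python
import re

# Assumption: download address on the RCSB file server.
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{code}.pdb"

# Assumption: a PDB code is one digit followed by three letters or digits.
_PDB_CODE = re.compile(r"[0-9][A-Za-z0-9]{3}")


def pdb_download_url(code: str) -> str:
    """Return the download address for a four-character PDB code."""
    code = code.strip()
    if not _PDB_CODE.fullmatch(code):
        raise ValueError(f"{code!r} does not look like a PDB code")
    return RCSB_DOWNLOAD.format(code=code.upper())


print(pdb_download_url("4dfr"))

# To actually retrieve the file (requires network access):
#   from urllib.request import urlopen
#   text = urlopen(pdb_download_url("4dfr")).read().decode("ascii")
```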

For example, let's try to find a dihydrofolate reductase (DHFR) complexed with methotrexate again. If we use the keywords "dihydrofolate reductase" we retrieve 126 structures. This is because there may be structures from different species, structures with different ligands, and even structures with different mutations or solved under different experimental conditions. Now we have to refine the results. To do so, we choose "Refine this Search" on the left. The next "sub-query" that we add is "Ligand name": "methotrexate". This yields 25 results. We could refine the search further to find DHFR from a specific species. The resolution of the structure is also a good selection criterion: the lower the resolution value (in Å), the finer the detail and the better the structure (note that this only applies to structures solved by X-ray crystallography and not to structures solved by NMR).
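
The resolution criterion can be sketched in a few lines of Python. The entries below are hypothetical placeholders, not real search results, and the names Entry and best_xray are illustrative. The function keeps only X-ray structures, since the criterion does not apply to NMR, and returns the one with the smallest resolution value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    code: str                    # structure identifier
    method: str                  # experimental method
    resolution: Optional[float]  # in angstroms; None when not applicable (NMR)


def best_xray(entries: List[Entry]) -> Optional[Entry]:
    """Return the X-ray entry with the smallest resolution value, if any."""
    xray = [
        e for e in entries
        if e.method == "X-RAY DIFFRACTION" and e.resolution is not None
    ]
    # A smaller value in angstroms means finer detail.
    return min(xray, key=lambda e: e.resolution) if xray else None


# Hypothetical entries for illustration only.
hits = [
    Entry("hypA", "X-RAY DIFFRACTION", 2.3),
    Entry("hypB", "SOLUTION NMR", None),
    Entry("hypC", "X-RAY DIFFRACTION", 1.7),
]
print(best_xray(hits).code)
```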



PyMOL is an open-source molecular visualization system written by Warren DeLano. It is widely used in the scientific community for its relative ease of use, its extensibility, and its strong visualization capabilities.

The user manual can be found here. Although it does not cover everything, it is the right starting point. Two concepts are key to getting started with the program, and we describe them briefly here:

Layout of the user interface

Two windows can be distinguished: the main window and the one with the external controls (upper one in the figure). The main window has three elements: the area where the molecules are displayed, a prompt for text commands (PyMOL>_ ) and a graphical menu on the right. The external controls window has a standard menu bar (File, Edit, Help...), a command input area (equivalent to the one in the main window) and a text-output area.

In the graphical menu area we find a list of the objects and selections that are currently loaded. In our case there is only one object, called "3dfr", plus the generic "all", which is used when we want to apply a modification to all objects at once. Clicking on the name of an object hides it (or shows it if it was previously hidden). To the right of the object name there are five buttons: A S H L C. Clicking one of them opens a drop-down menu:

            A -> Actions to be done on the object (center object, remove hydrogens, delete object, select atoms, copy, etc.)
            S -> Show: switch between the possible molecular representations (lines, sticks, dots, mesh, spheres, ribbons, cartoons and surfaces)
            H -> Hide: complementary to show
            L -> Labels
            C -> Color
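
The same Hide, Show, Color and Label actions are also available from PyMOL's Python interface. The following is a minimal sketch: the function name apply_basic_style and the particular choices (cartoon representation, green, a label on the first alpha carbon) are illustrative assumptions. The cmd module is passed in as an argument so that the function itself does not require PyMOL to be imported.

```python
def apply_basic_style(cmd, obj="3dfr"):
    """Apply the H, S, C and L menu actions to one object via PyMOL's cmd API."""
    cmd.hide("everything", obj)   # H -> hide all representations
    cmd.show("cartoon", obj)      # S -> show the cartoon representation
    cmd.color("green", obj)       # C -> color the whole object
    # L -> label the alpha carbon of residue 1 with its residue name
    cmd.label(f"{obj} and name CA and resi 1", "resn")


# Usage inside a PyMOL session (assumes PyMOL is installed):
#   from pymol import cmd
#   cmd.fetch("3dfr")            # downloads the structure; needs network access
#   apply_basic_style(cmd, "3dfr")
```

Each call mirrors a text command that can be typed at the PyMOL> prompt, for example "hide everything, 3dfr" or "show cartoon, 3dfr".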

Mouse controls

The mouse controls are thoroughly described at
