

Protein-DNA interactions: TATA-box-binding protein (TBP)


To initiate transcription of their genes, all eukaryotes and all archaea need basal transcription factors that recognize basal promoter elements. The first protein that binds to promoter DNA is the “TATA box binding protein” (TBP). As the name implies, it binds to a DNA element called the “TATA box”. Additional transcription factors then assemble into a large complex and recruit the RNA polymerase to the promoter, which can then start transcription of the gene.
The structures of several TBPs, binary complexes of TBP with DNA, and ternary complexes of TBP, a factor called “transcription factor B” and DNA have been determined. The structures are highly conserved from archaea to humans, indicating that this vital function developed very early in evolution. The following concepts should be addressed using the TBP/TATA box complex as an example:
  • the conformations of nucleic acids, e.g. B-DNA, Z-DNA, A-RNA;
  • site-specific recognition of DNA sequences by proteins through specific interactions;
  • the induction of conformational changes of nucleic acids upon protein binding;
  • basal transcription initiation in eukaryotes, archaea and bacteria;
  • regulation of transcription initiation by different classes of DNA-binding proteins.
Proposed book chapters: NC 26.1, 26.2, 28 or VV 29, 31, 34.3.


Reset PyMOL

PDB: 1tgh


Load the TATA fragment





Show cartoon
show cartoon, tata
hide lines, tata
Ring mode 1
set cartoon_ring_mode, 1
Ring mode 2
set cartoon_ring_mode, 2
Ring mode 3
set cartoon_ring_mode, 3

Label residues
label name C1* and byres tata, "%s - %s"%(resn, resi)

Color by residue

util.cbam resn A
util.cbac resn T
util.cbag resn G*
util.cbay resn C*




Distances for an A-T pair
show sticks, chain A and resi 7
show sticks, chain B and resi 6
distance chain B and resi 6 and name H61, chain A and resi 7 and name O4
distance chain B and resi 6 and name N1, chain A and resi 7 and name H3

Distances for a C-G pair
show sticks, chain B and resi 1
show sticks, chain A and resi 12
distance chain B and resi 1 and name O2, chain A and resi 12 and name H21
distance chain B and resi 1 and name N3, chain A and resi 12 and name H1
distance chain B and resi 1 and name H41, chain A and resi 12 and name O6




The TATA box is a specific sequence of bases in the DNA that marks where transcription of a gene into mRNA is initiated. The sequence is recognized by the TATA-box-binding protein (TBP), which is the first component of the transcription complex. TBP anchors to this region and dramatically changes the structure of the DNA. In this module we will study the interaction between TBP and DNA and one of the subsequent transcription factors.



To start, let's explore a B-DNA structure with the sequence C-G-T-A-T-A-T-A-T-A-C-G, which has been prepared for you. Download it from here and load it into PyMOL (alternatively, use the link on the left).


Show a cartoon representation. Try the different ring modes (set "cartoon_ring_mode" to 1, 2 or 3) and choose your favorite representation.








Label by residue and observe the pairings C-G and T-A.
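
As a cross-check of what the labels should show, the pairing partners can be worked out from the sequence alone. The sketch below is plain Python, not a PyMOL command. It assumes both strands are numbered from 1 at their 5' end, which matches the residue numbers used in the distance commands of this module (residue 6 of one chain pairs with residue 7 of the other). The function names are illustrative only.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the partner strand, read in its own 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def base_pairs(seq):
    """List (resi, base, partner_resi, partner_base) for an antiparallel duplex.

    Assumes both strands are numbered from 1 at their 5' end, so residue i
    pairs with residue len(seq) + 1 - i on the other strand.
    """
    partner = reverse_complement(seq)
    n = len(seq)
    return [(i, seq[i - 1], n + 1 - i, partner[n - i]) for i in range(1, n + 1)]


if __name__ == "__main__":
    tata = "CGTATATATACG"
    print(reverse_complement(tata))
    for resi, base, partner_resi, partner_base in base_pairs(tata):
        print("%s%d - %s%d" % (base, resi, partner_base, partner_resi))
```

For this particular sequence the partner strand reads the same in the 5'->3' direction, so both chains carry identical labels at each residue number.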




Color carbon atoms differently, according to the nucleotide.
Use the "util.cbaX" command: "cba" stands for "Color By Atom", and the final X is replaced by a letter for the color, for example g -> green, b -> blue, etc.

Magenta -> Adenine
Cyan -> Thymine
Green -> Guanine
Yellow -> Cytosine
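
To make the naming rule concrete, here is a small sketch in plain Python that builds the four command lines for the table above. It only assembles text to paste into PyMOL; the helper name cba_command is hypothetical, and the first-letter rule is checked here only for the four colors used in this module.

```python
# Carbon colors used in this module for each nucleotide, paired with the
# residue-name selections taken from the module's own commands.
BASE_COLORS = [
    ("magenta", "resn A"),
    ("cyan", "resn T"),
    ("green", "resn G*"),
    ("yellow", "resn C*"),
]


def cba_command(color, selection):
    """Build a util.cbaX command line; X is the first letter of the color.

    Only checked for the four colors above (cbam, cbac, cbag, cbay).
    """
    return "util.cba%s %s" % (color[0], selection)


if __name__ == "__main__":
    for color, selection in BASE_COLORS:
        print(cba_command(color, selection))
```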


Measure distances to see how many hydrogen bonds are possible within each pair. First switch to a sticks representation, then open Wizard->Measurement and left-click on the relevant atoms (a hydrogen and a hydrogen-bond acceptor).







Observe that C-G pairs allow for three hydrogen bonds (according to distance and orientation criteria), while T-A pairs allow for only two.
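
The distance and orientation criteria can be written down explicitly. The following is a minimal sketch in plain Python, not part of PyMOL. The cutoffs (hydrogen-acceptor distance of at most 2.5 Å and donor-hydrogen-acceptor angle of at least 120°) are assumed, commonly used values rather than ones prescribed by this module, and the coordinates in the example are made up for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points (in Angstrom)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_angle = dot / (distance(a, b) * distance(c, b))
    # Clamp against rounding errors before calling acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def is_hydrogen_bond(donor, hydrogen, acceptor, max_h_acc=2.5, min_angle=120.0):
    """Apply the assumed distance and angle cutoffs to one candidate bond."""
    close_enough = distance(hydrogen, acceptor) <= max_h_acc
    well_oriented = angle_deg(donor, hydrogen, acceptor) >= min_angle
    return close_enough and well_oriented


if __name__ == "__main__":
    # Made-up linear arrangement: donor, hydrogen and acceptor on one axis.
    print(is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0)))
```

The distances reported by the measurement wizard correspond to the hydrogen-acceptor distance in this sketch; the angle is what "orientation" refers to.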







Rotate the structure to identify the major and minor grooves of the DNA.
Load 1tgh
fetch 1tgh
Align DNA structures
align tata, 1tgh
delete dist*
Select DNA and TBP
select DNA, 1tgh and chain B+C
select TBP, 1tgh and chain A
Color DNA residues
util.cbam resn A
util.cbac resn T
util.cbag resn G*
util.cbay resn C*
Create scene 1
hide all
show cartoon, tata
scene F1, store
Create scene 2
hide all
show cartoon, DNA
scene F2, store

Create scene 3
show cartoon, TBP
scene F3, store


The PDB 1TGH contains the same DNA sequence, complexed with TBP. Load the structure, align it with the bare TATA sequence and color residues as previously.


Create separate selections for the DNA (chains B and C) and for the protein (chain A). Also label the DNA by base (residue) name.








Now hide everything but the free TATA DNA sequence. Save the scene into F1 (Scene -> store -> F1). Without rotating or translating the structure, hide everything but the DNA from the complex and save the scene into F2. Switch between the two scenes (press the F1 and F2 keys alternately) to observe the drastic conformational change more clearly.



Now create a third scene, with the TBP bound. Show TBP as cartoon. Store the scene in F3 and alternate between F1 and F3 for a more realistic view (the conformational change is not spontaneous but forced by the TBP).

Notice that creating scenes is a powerful way to prepare movies within PyMOL. This has been used for the case of citrate synthase.
Hide DNA
hide all
show cartoon, TBP
color white, TBP

Let's explore how TBP binds to the DNA and causes the conformational change. One of the first things we can observe is that TBP binds to the minor groove. The interaction is therefore mainly non-polar, and only a few hydrogen bonds are formed. What makes the sequence recognizable is thus its deformability (T-A pairs are held together by only two hydrogen bonds, G-C pairs by three) rather than specific chemical patterns. Now hide the DNA and color the TBP uniformly white. Observe the symmetry of the TBP structure. Overall, it is an alpha-beta protein with a saddle shape.
Show PHE in sticks
show sticks, resn phe
show cartoon, DNA



Show ARG and LYS as blue sticks
show sticks, resn arg+lys
util.cbab resn arg+lys
show spheres, DNA and name P

Show ASN as green sticks
show sticks, resn asn
util.cbag resn asn

The way the protein binds to the DNA is what causes the strong bending of the nucleic acid. The most important interaction involves four PHE side chains that insert themselves between the bases. Show the PHE residues in stick representation and observe how dramatically they disrupt the stacking of the bases. Then show the DNA in cartoon representation again.

Also, ARG and LYS residues (positively charged) are situated to contact the phosphate groups of the DNA backbone. Color these residues by atom, setting the carbon color to blue. Show the phosphate atoms as spheres.

Finally, identify the two ASN that form hydrogen bonds at the center. Color them by atom, setting the carbon atom color to green.
Create representation
create TBP, 1tgh and chain A
show surface, TBP
set transparency, 0.5
hide spheres

The last objective of this module is to learn how to make publication-quality pictures with PyMOL. We are going to depict the three interactions described previously (PHE, ARG and LYS, and ASN) in a clear way. Let's not forget that the goal of molecular visualization is to facilitate understanding. From a complex system, we want to depict particular characteristics, one or a few at a time. With this primary goal in mind, one can then try to make the picture beautiful as well.

DNA will be represented as a cartoon, and for TBP we will use a transparent surface representation. The easiest way to obtain a continuous surface is to first create a separate object for TBP. Positively charged residues should be colored blue, negatively charged ones red, and the rest white. The cartoon of the protein should also be visible through the transparent surface. The residues of interest (PHE, ASN, ARG and LYS) will be represented as sticks. DNA bases will be labeled by residue name.
Ray-trace image
bg white
ray 800, 600

Save image
png tbp-dna
Once we have the desired view, it's time to ray-trace the image. Ray tracing is a technique for producing realistic 3D images that takes into account reflections, refractions and absorptions of light rays. Within PyMOL it is extremely simple to use: the command "ray" does the work. Several parameters can be tweaked to obtain the desired picture; the most important ones are the final resolution, shadows and fog. With these, a sense of depth can be achieved. Depending on the picture, different adjustments may be appropriate.

Normally, for a publication, we would use a white background. Try several combinations of fog and shadows, and ray-trace your image at 800x600 resolution. Then, without touching the generated view (otherwise we would go back to a non-ray-traced image), save it as a PNG by typing "png filename".
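
If a journal asks for a figure of a given printed size, the pixel dimensions to pass to "ray" follow from simple arithmetic. The sketch below is plain Python that only builds the command text. The helper name ray_command and the example values (a 4 x 3 inch figure at 300 dpi) are assumptions for illustration, not requirements from this module.

```python
def ray_command(width_in, height_in, dpi):
    """Build a PyMOL 'ray' command line for a print size in inches and a dpi."""
    width_px = int(round(width_in * dpi))
    height_px = int(round(height_in * dpi))
    return "ray %d, %d" % (width_px, height_px)


if __name__ == "__main__":
    # The 800x600 image used in this module has a 4:3 aspect ratio;
    # a 4 x 3 inch figure keeps that ratio.
    print(ray_command(4, 3, 300))
```

Keeping the same aspect ratio as the viewer window avoids surprises in how the scene is framed after ray tracing.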

Ref.: Juo et al. J. Mol. Biol. 1996, 261, 239-254
