TAL Effector Site Finder Tutorial
TAL Effector Site Finder identifies candidate binding sites for a TAL effector in an input DNA sequence. It is intended for identifying binding sites for naturally occurring TAL effectors in known target genes, or for searching short DNA sequences for custom TAL effector binding sites.
TAL effector binding sites are scored using the scoring function developed by Moscou and Bogdanove in supplementary script S1 of the paper Moscou, M.J. and Bogdanove, A.J. (2009) A simple cipher governs DNA recognition by TAL effectors. Science. 326(5959):1501.
The scoring function is based on RVD-nucleotide association frequencies for known TAL effector/target pairs. Each RVD-nucleotide pair in the TAL effector/target alignment is assigned a probability score based on these association frequencies. Scores for all RVD-nucleotide pairs are summed to score the entire alignment.
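The summation described above can be sketched in a few lines of Python. This is a minimal illustration, not the published script: the frequency values in `FREQ` are made up for the example, and the use of a negative logarithm to turn a frequency into a per-pair score is an assumption (chosen so that lower totals are better, as in the tool's output). The function names are hypothetical.

```python
import math

# Hypothetical RVD-nucleotide association frequencies.
# Illustrative values only, NOT the published frequencies.
FREQ = {
    "HD": {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    "NI": {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    "NG": {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    "NN": {"A": 0.40, "C": 0.05, "G": 0.50, "T": 0.05},
}


def pair_score(rvd, nucleotide):
    """Score one RVD-nucleotide pair (assumed form: -log of the frequency)."""
    return -math.log(FREQ[rvd][nucleotide])


def alignment_score(rvds, site):
    """Sum the pair scores over an RVD list aligned to an equal-length site."""
    if len(rvds) != len(site):
        raise ValueError("RVD sequence and site must have the same length")
    return sum(pair_score(r, n) for r, n in zip(rvds, site))


def best_possible_score(rvds):
    """Score of the 'perfect' site: each RVD paired with its best nucleotide."""
    return sum(min(pair_score(r, n) for n in "ACGT") for r in rvds)


rvds = ["HD", "NI", "NG"]
print(alignment_score(rvds, "CAT"), best_possible_score(rvds))
```

With these illustrative frequencies, the site CAT pairs every RVD with its most frequent nucleotide, so its score equals the best possible score; any other site scores higher.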
TAL Effector Site Finder returns a list of the best (lowest) scoring candidate binding sites in the input DNA sequence for the TAL effector.
- Upload the DNA sequence and the RVD sequence you want to search.
To follow along with this tutorial, use the link at the top of the page to load the sample data set.
This loads both a sample DNA sequence and a TAL effector RVD sequence.
To load your own DNA sequence, copy and paste FASTA-formatted sequences into the text box or upload a file containing the sequences. Every sequence should include a FASTA identifier line (">long_sequence" in the image above), and sequences may contain only the characters A, C, G, T, and N. For more details on correct formatting, see the Help page.
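Before uploading, it can help to check a file against the formatting rules above. The following is a rough sketch of such a check; `parse_fasta` is a hypothetical helper, not part of the tool, and converting input to upper case is an assumption of this sketch (the tutorial does not say whether lower-case letters are accepted).

```python
def parse_fasta(text):
    """Parse FASTA text into {identifier: sequence}, enforcing the ACGTN alphabet."""
    records = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            if not name:
                raise ValueError("empty FASTA identifier line")
            if name in records:
                raise ValueError(f"duplicate identifier: {name}")
            records[name] = []
        else:
            if name is None:
                raise ValueError("sequence data found before a FASTA identifier line")
            seq = line.upper()  # assumption: normalize case before checking
            bad = set(seq) - set("ACGTN")
            if bad:
                raise ValueError(f"invalid characters in {name}: {sorted(bad)}")
            records[name].append(seq)
    return {n: "".join(parts) for n, parts in records.items()}


print(parse_fasta(">long_sequence\nACGT\nNNAC\n"))
```

A sequence pasted without an identifier line, or one containing characters such as U or IUPAC ambiguity codes other than N, raises a `ValueError` in this sketch.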
To enter a different RVD sequence, enter a sequence of 12 to 35 RVDs, separated by spaces. Use '*' to indicate a missing amino acid (such as N* or H*).
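The RVD input rules can be checked with a short script like the one below. It is only a sketch: `parse_rvds` is a hypothetical helper, and the pattern assumes each RVD is written as two characters, a one-letter amino acid code followed by either a second code or '*'. The tool's own validation may differ.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # standard one-letter amino acid codes
# Assumed RVD form: one amino acid, then a second amino acid or '*' (missing).
RVD_PATTERN = re.compile(f"[{AMINO_ACIDS}][{AMINO_ACIDS}*]")


def parse_rvds(text, min_len=12, max_len=35):
    """Split a space-separated RVD string and check its length and tokens."""
    rvds = text.split()
    if not min_len <= len(rvds) <= max_len:
        raise ValueError(
            f"expected {min_len} to {max_len} RVDs, found {len(rvds)}"
        )
    for rvd in rvds:
        if not RVD_PATTERN.fullmatch(rvd):
            raise ValueError(f"not a valid RVD: {rvd!r}")
    return rvds


print(parse_rvds("NI HD NG NN N* HD NI NG HD NI NG NN"))
```

A common mistake this catches is omitting the spaces between RVDs, which produces a single long token rather than a list of two-character RVDs.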
- Select additional options.
By default, TAL Effector Site Finder returns the 10 best-scoring binding sites in the DNA sequence. You can change this number to return a different number of sites.
You can also choose to scan the reverse complement of the entered DNA sequence. By default only the sequence as entered is searched. To follow along with the tutorial, leave this box unchecked (default settings).
If you scan the reverse complement of the sequence, TAL Effector Site Finder will return the specified number of sites (10 by default) for both the forward and the reverse complement sequence (20 sites total using the default settings).
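To make the site counts concrete, here is a small sketch of how a "best N per strand" selection could work. It is not the tool's implementation: `best_sites` and `toy_score` are hypothetical names, the toy score (number of mismatches against a fixed string) merely stands in for the real scoring function, and the sketch ignores any other constraints the tool places on candidate sites.

```python
import heapq

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over ACGTN."""
    return seq.translate(COMPLEMENT)[::-1]


def best_sites(dna, site_len, score_fn, n=10, scan_reverse=False):
    """Return the n lowest-scoring windows per scanned strand."""
    strands = [("plus", dna)]
    if scan_reverse:
        strands.append(("minus", reverse_complement(dna)))
    results = []
    for strand, seq in strands:
        windows = (seq[i:i + site_len] for i in range(len(seq) - site_len + 1))
        scored = ((score_fn(w), strand, w) for w in windows)
        # Lower scores are better, so keep the n smallest for this strand.
        results.extend(heapq.nsmallest(n, scored))
    return results


def toy_score(window):
    """Stand-in score: mismatches against the string 'ACG' (illustrative only)."""
    return sum(a != b for a, b in zip(window, "ACG"))


dna = "ACGTTACGAA" * 3
print(len(best_sites(dna, 3, toy_score)))                     # forward strand only
print(len(best_sites(dna, 3, toy_score, scan_reverse=True)))  # both strands
```

Because the selection is made separately for each strand, enabling the reverse-complement scan doubles the number of returned sites rather than splitting the same quota between strands.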
- Submit your query.
Optionally, you may enter your email address to receive an email with a link to your results when your job is complete.
You must also choose the length of time your results will be available for download from our server.
When you are finished, hit the Submit button.
- Retrieve your results.
After hitting Submit, you will be taken to a page detailing the progress of your job.
Either stay on this page until your job finishes, or bookmark it so you can return and download your results later.
When the job finishes, your results will appear in a table. You will also be provided with a link to download a tab-delimited text file containing your results. If you entered an email address, you will receive an email notification with a link to this page so you can retrieve your results.
- Interpret your results.
By default, TAL Effector Site Finder will always return the 10 best- (lowest-) scoring sites in the entered DNA sequence. (If you changed the number of sites in Step 2, a different number will be returned.) Not all of these Top 10 sites are likely to be bound by the TAL effector! You may find that none of these "best" sites are likely to bind. Or, if there are more than 10 good sites in the DNA sequence, the tool may not return all of the sites that the TAL effector can bind.
To determine whether a site is likely to be bound by the TAL effector, compare the Score column with the Best possible score reported above the table and at the top of the results file. Score gives the actual score for the TAL effector on the candidate binding site. Best possible score gives the score for the TAL effector on its "perfect" binding site (the site with all RVDs aligned with their most frequently associated nucleotide). The closer the Score is to the Best possible score, the more likely the TAL effector is to bind to that site.
TAL effectors and their known targets typically have Scores less than 3 times the Best possible score for the TAL effector.
In the example above, the first site has a Score identical to the Best possible score and is likely to be bound by the TAL effector. The other 9 sites have much higher scores and probably will not be bound by the TAL effector.
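The rule of thumb above can be applied to a list of scores as follows. This is a sketch only: `likely_bound` is a hypothetical helper, the factor of 3 is the guideline stated in this tutorial rather than a hard cutoff, and the scores in the example are invented, not taken from the sample results.

```python
def likely_bound(score, best_possible_score, max_ratio=3.0):
    """Apply the guideline: Score below max_ratio times the Best possible score."""
    return score < max_ratio * best_possible_score


# Invented example values (not real output from the tool).
best_possible = 5.0
scores = [5.0, 9.5, 18.2, 40.0]
for s in scores:
    verdict = "plausible" if likely_bound(s, best_possible) else "unlikely"
    print(f"Score {s:5.1f}  ratio {s / best_possible:4.1f}  {verdict}")
```

Treat the result as a first-pass filter; a site that passes the threshold is a candidate for further checking, not a confirmed binding site.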
If you checked the option to search the reverse complement, your results table will have both a Target Sequence column and a Plus Strand Sequence column. The two columns are identical for sites in the plus (as entered) DNA sequence. For sites on the minus strand, the Plus Strand Sequence is the reverse complement of the target.
Note that all binding sites are directly preceded by a T at the 5' end. The T is not shown in the output.
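The relationship between the two sequence columns and the preceding T can be illustrated with a short sketch. The function `candidate_sites` and the tuple layout are hypothetical and do not reproduce the tool's output format; the sketch only enumerates windows that directly follow a T on the strand being read and reports each one both as read on its own strand and as it appears on the plus strand.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over ACGTN."""
    return seq.translate(COMPLEMENT)[::-1]


def candidate_sites(dna, site_len, scan_reverse=False):
    """List (strand, target_sequence, plus_strand_sequence) for T-preceded windows."""
    strands = [("plus", dna)]
    if scan_reverse:
        strands.append(("minus", reverse_complement(dna)))
    rows = []
    for strand, seq in strands:
        # Start at 1 so that every window has a base immediately 5' of it.
        for i in range(1, len(seq) - site_len + 1):
            if seq[i - 1] != "T":
                continue  # the site must directly follow a T
            target = seq[i:i + site_len]  # the T itself is not included
            plus_seq = target if strand == "plus" else reverse_complement(target)
            rows.append((strand, target, plus_seq))
    return rows


for row in candidate_sites("TACGGA", 2, scan_reverse=True):
    print(row)
```

In this toy input, the minus-strand target CC corresponds to GG in the sequence as entered, and the A that follows GG on the plus strand is the complement of the T that precedes the site on the minus strand.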