Primer Design Computational Protocol

ENTM201L - General Entomology Laboratory | UC Riverside

Module 11: Designing and Validating COI Barcoding Primers

Overview

This computational protocol guides you through the systematic design and validation of DNA barcoding primers for mosquito COI gene amplification. You will use bioinformatics tools to retrieve sequences, perform alignments, design primers, and validate specificity.


Learning Objectives


Required Software and Resources

Web-Based Tools

Optional Software (Advanced)


Protocol

Part 1: Sequence Retrieval and Curation (30 minutes)

Step 1: Search GenBank for COI Sequences

  1. Navigate to NCBI Nucleotide Database:
  2. Search for mosquito COI sequences:
    • Search term: ("Culicidae"[Organism] OR "mosquito"[All Fields]) AND COI[Gene] AND 500:2000[SLEN]
    • This finds mosquito COI sequences that are between 500 and 2000 bp long
  3. Filter results:
    • Select species of interest (Aedes, Anopheles, Culex)
    • Aim for 10-20 sequences representing diverse genera
    • Prioritize reference sequences (RefSeq database)
  4. Download sequences:
    • Select all desired sequences (checkboxes)
    • Click "Send to" → "File" → Format: "FASTA"
    • Save as mosquito_coi_sequences.fasta

Step 2: Curate Sequence Dataset

  1. Open FASTA file in text editor
  2. Check sequence quality:
    • Remove sequences <600 bp (too short for primer design)
    • Remove sequences with >5% ambiguous bases (N characters)
    • Ensure taxonomic diversity (multiple genera represented)
  3. Rename headers for clarity:
    • Format: >Genus_species_AccessionNumber
    • Example: >Aedes_aegypti_KC920138
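
The curation checks in Step 2 can also be scripted instead of done by hand in a text editor. The following is a minimal sketch using only the Python standard library. The function names are illustrative, and rename_header assumes the usual NCBI FASTA header layout ("accession.version Genus species ..."), which you should confirm against your own download.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def passes_quality(seq, min_len=600, max_n_frac=0.05):
    """Apply the Step 2 thresholds: at least 600 bp and at most 5% N."""
    if len(seq) < min_len:
        return False
    return seq.count("N") / len(seq) <= max_n_frac


def curate(text, min_len=600, max_n_frac=0.05):
    """Return the FASTA records that pass both quality filters."""
    return [(h, s) for h, s in parse_fasta(text)
            if passes_quality(s, min_len, max_n_frac)]


def rename_header(header):
    """Turn 'KC920138.1 Aedes aegypti ...' into 'Aedes_aegypti_KC920138'.

    Assumes the header starts with accession.version, then genus, then species.
    """
    parts = header.split()
    accession = parts[0].split(".")[0]
    return f"{parts[1]}_{parts[2]}_{accession}"


# Made-up toy records; thresholds are lowered so they fit on one line each
demo = ">seq_ok\nACGTACGTAC\n>seq_short\nACG\n>seq_ambiguous\nACNNNNGTAC\n"
kept = curate(demo, min_len=8, max_n_frac=0.05)
print([header for header, _ in kept])
```

Taxonomic diversity still needs a manual check, since the script only tests length and ambiguity.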

Recommended Target Species

For ENTM201L Primer Design Exercise:

  • Aedes aegypti - KC920138
  • Aedes albopictus - MN736318
  • Anopheles gambiae - L20934
  • Culex quinquefasciatus - AB453206
  • Culex pipiens - EF612193

Part 2: Multiple Sequence Alignment (20 minutes)

Step 3: Align Sequences with MAFFT

  1. Go to MAFFT web server:
  2. Upload your FASTA file:
    • Click "Browse" and select mosquito_coi_sequences.fasta
  3. Select alignment strategy:
    • Method: Auto (default)
    • Output format: FASTA
  4. Submit alignment:
    • Click "Submit"
    • Wait 2-5 minutes for results
  5. Download aligned sequences:
    • Save as mosquito_coi_aligned.fasta

Step 4: Identify Conserved Regions

  1. Visualize alignment:
    • Open in text editor or alignment viewer
    • Look for columns with identical bases across all species
  2. Mark conserved regions:
    • Conserved = >90% identity across all sequences
    • These are candidate primer binding sites
    • Look for conserved regions flanking variable central region
  3. Identify variable region:
    • This is your barcode region (to be amplified)
    • Should be 500-800 bp for optimal Sanger sequencing
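
Scanning the alignment by eye works for a handful of sequences, but the same idea can be sketched in code. The example below scores each alignment column by the fraction of sequences sharing the most common base, then reports windows whose mean identity exceeds 90%. The 20-column window and the choice to treat gaps as mismatches are assumptions for illustration, not part of the protocol.

```python
from collections import Counter


def column_identity(aligned):
    """Fraction of sequences sharing the most common base in each column.

    Gap characters count as mismatches, so gapped columns score lower.
    """
    n = len(aligned)
    scores = []
    for column in zip(*aligned):
        counts = Counter(base for base in column if base != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / n)
    return scores


def conserved_windows(scores, window=20, threshold=0.9):
    """0-based start positions of windows whose mean identity exceeds threshold."""
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean > threshold:
            hits.append(start)
    return hits


# Made-up toy alignment: eight identical columns, then two fully variable ones
toy = ["ACGTACGTAA", "ACGTACGTCC", "ACGTACGTGG", "ACGTACGTTT"]
scores = column_identity(toy)
print(conserved_windows(scores, window=4))
```

Runs of adjacent window starts mark a conserved block; the variable stretch between two such blocks is the candidate barcode region.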

Part 3: Primer Design with Primer3 (30 minutes)

Step 5: Prepare Consensus Sequence

  1. Create consensus from alignment:
    • Use most common base at each position
    • Or select reference sequence (e.g., Aedes aegypti)
  2. Extract target region:
    • Include 200 bp upstream and downstream of barcode region
    • Total length: ~1000-1200 bp
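
A majority-rule consensus and the flanked target region from Step 5 can be sketched as follows. The handling of gaps (a column is dropped when a gap is its most common character) and of ties (the first character seen wins) are assumptions made for this example. Coordinates are 0-based and end-exclusive.

```python
from collections import Counter


def majority_consensus(aligned):
    """Most common character per column; columns where a gap wins are dropped.

    Ties are resolved in favor of the character seen first in the column.
    """
    consensus = []
    for column in zip(*aligned):
        base = Counter(column).most_common(1)[0][0]
        if base != "-":
            consensus.append(base)
    return "".join(consensus)


def extract_target(seq, barcode_start, barcode_end, flank=200):
    """Return the barcode region plus `flank` bases on each side, clipped to seq."""
    start = max(0, barcode_start - flank)
    end = min(len(seq), barcode_end + flank)
    return seq[start:end]


# Made-up toy alignment
toy = ["ACGT-A", "ACGTTA", "ATGT-A"]
print(majority_consensus(toy))
print(len(extract_target("A" * 1000, 300, 700)))
```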

Step 6: Design Primers in Primer3Plus

  1. Navigate to Primer3Plus:
  2. Paste your template sequence:
    • Paste consensus COI sequence (FASTA format)
  3. Set primer design parameters:
    Parameter                | Value                  | Rationale
    Primer size              | 18-25 bp (optimal: 20) | Balance specificity and synthesis cost
    Primer Tm                | 57-63°C (optimal: 60)  | Standard PCR annealing temperature
    Product size             | 600-800 bp             | Optimal for Sanger sequencing
    GC content               | 40-60%                 | Stable annealing
    Max self-complementarity | 4 bp                   | Avoid hairpin formation
    Max 3' complementarity   | 2 bp                   | Prevent primer-dimer formation
  4. Submit primer design:
    • Click "Pick Primers"
    • Review top 5 primer pairs

Step 7: Evaluate Primer Candidates

For each primer pair, record:

Select best primer pair based on:

  1. Similar Tm for forward and reverse (within 2°C)
  2. GC content 45-55%
  3. No hairpins or self-dimers
  4. Product size 650-750 bp
  5. GC clamp at 3' end (1-2 G or C bases)
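
Most of the selection criteria above are simple enough to check programmatically once Primer3Plus has reported the Tm values. The sketch below takes the Tm values as inputs rather than computing them, and it skips the hairpin and self-dimer criterion, which needs a dedicated tool. It reads "GC clamp" as a run of one or two G/C bases at the very 3' end, which is one possible interpretation. The primer sequences are made up for the demo.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a non-degenerate primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def clamp_length(primer):
    """Number of consecutive G/C bases at the 3' end."""
    run = 0
    for base in reversed(primer.upper()):
        if base not in "GC":
            break
        run += 1
    return run


def evaluate_pair(fwd, rev, tm_fwd, tm_rev, product_size):
    """Return pass/fail flags for the Step 7 criteria (hairpins not checked)."""
    return {
        "tm_within_2C": abs(tm_fwd - tm_rev) <= 2.0,
        "gc_45_55": all(45.0 <= gc_percent(p) <= 55.0 for p in (fwd, rev)),
        "product_650_750": 650 <= product_size <= 750,
        "gc_clamp": all(1 <= clamp_length(p) <= 2 for p in (fwd, rev)),
    }


# Hypothetical primer pair and Tm values, for illustration only
result = evaluate_pair("ATGCATGCATGCATGCATGC", "TTGACCATGGCAATCGTAAG",
                       tm_fwd=60.1, tm_rev=59.2, product_size=700)
for criterion, passed in result.items():
    print(f"{criterion}: {'pass' if passed else 'fail'}")
```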

Part 4: Adding Degenerate Bases (Advanced, 20 minutes)

Step 8: Identify Variable Positions in Alignment

  1. Return to your aligned sequences
  2. Examine primer binding sites:
    • Look at positions where your designed primers bind
    • Check if all species have identical bases
  3. Identify variable positions:
    • Mark positions where 2+ species differ
    • Note which bases occur at each variable position

Step 9: Add IUPAC Degenerate Codes

Code | Bases  | Meaning
R    | A or G | puRine
Y    | C or T | pYrimidine
M    | A or C | aMino
K    | G or T | Keto
S    | G or C | Strong
W    | A or T | Weak

Example:

Position 8 in alignment:
Aedes aegypti:      T
Aedes albopictus:   T
Anopheles gambiae:  A
Culex quinquefasciatus: A

Replace with: W (A or T)
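
The lookup in this example can be automated for a whole primer binding site. The sketch below maps the set of bases seen in each alignment column to the matching code from the table above. It covers only single bases and the six two-base codes, so a column with three or more different bases (or a gap) raises an error. The function names are illustrative.

```python
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("AC"): "M",
    frozenset("GT"): "K", frozenset("CG"): "S", frozenset("AT"): "W",
}


def degenerate_code(bases):
    """Map the bases observed at one alignment column to an IUPAC code."""
    key = frozenset(base.upper() for base in bases)
    if key not in IUPAC_CODES:
        raise ValueError(f"no one- or two-base IUPAC code for {sorted(key)}")
    return IUPAC_CODES[key]


def degenerate_primer(site_rows):
    """Collapse aligned primer-site sequences (equal length, no gaps) into one primer."""
    return "".join(degenerate_code(column) for column in zip(*site_rows))


# Position 8 from the example above: T, T, A, A across the four species
print(degenerate_code("TTAA"))

# Made-up four-base binding site from three species
print(degenerate_primer(["ACGT", "ACGA", "GCGT"]))
```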

Part 5: Primer Validation (30 minutes)

Step 10: Calculate Accurate Melting Temperatures

  1. Go to IDT OligoAnalyzer:
  2. Enter forward primer sequence
  3. Set parameters:
    • Oligo concentration: 0.25 µM (typical PCR condition)
    • Na+ concentration: 50 mM
    • Mg++ concentration: 1.5 mM
  4. Review results:
    • Tm (melting temperature)
    • Hairpin structures (ΔG > -3 kcal/mol is acceptable)
    • Self-dimer formation (ΔG > -5 kcal/mol is acceptable)
  5. Repeat for reverse primer

Step 11: Check Specificity with Primer-BLAST

  1. Navigate to Primer-BLAST:
  2. Enter primer sequences:
    • Forward primer
    • Reverse primer
  3. Set parameters:
    • Database: nr (non-redundant nucleotide)
    • Organism: Culicidae (taxid: 7157)
    • PCR product size: 600-800 bp
  4. Submit search
  5. Analyze results:
    • How many mosquito species amplify?
    • Are there off-target amplifications?
    • Do primers bind other genes?

Step 12: Perform In Silico PCR

  1. Use UCSC In-Silico PCR (if genome available):
  2. Select genome:
    • Aedes aegypti genome (if available)
  3. Enter primers:
    • Forward primer
    • Reverse primer
  4. Run PCR simulation
  5. Verify product:
    • Single product at expected size?
    • Correct gene (COI)?
    • No off-target products?
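
When no genome is available on the web tool, the core idea of in silico PCR can be approximated locally on any template sequence. The sketch below is a simplified stand-in, not a reimplementation of the UCSC tool: it requires exact matches (degenerate bases aside), searches only the strand as given, and ignores mismatches and thermodynamics. The template and primers in the demo are made up.

```python
import re

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT"}

# Complement table that also covers the two-base degenerate codes
COMPLEMENT = str.maketrans("ACGTRYMKSW", "TGCAYRKMSW")


def reverse_complement(seq):
    """Reverse complement of a primer, degenerate codes included."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn each base into a regex character class, e.g. W -> [AT]."""
    return "".join(f"[{IUPAC[base]}]" for base in primer.upper())


def in_silico_pcr(template, fwd, rev):
    """Return (start, end, size) for every predicted amplicon.

    The forward primer must match the template as given and the reverse
    primer must match its reverse complement downstream. Lookaheads are
    used so that overlapping binding sites are all reported.
    """
    template = template.upper()
    fwd_starts = [m.start() for m in
                  re.finditer(f"(?={primer_regex(fwd)})", template)]
    rev_site = primer_regex(reverse_complement(rev))
    rev_ends = [m.start() + len(rev) for m in
                re.finditer(f"(?={rev_site})", template)]
    return [(s, e, e - s) for s in fwd_starts for e in rev_ends
            if e - s >= len(fwd) + len(rev)]


# Made-up toy template: forward site, 10-base spacer, reverse site
toy_template = "CC" + "ACTGT" + "G" * 10 + "TGCAA" + "CC"
print(in_silico_pcr(toy_template, fwd="ACWGT", rev="TTRCA"))
```

More than one tuple in the result would indicate multiple possible products, which corresponds to the off-target question in the checklist above.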

Part 6: Comparison with AU-COI Primers (20 minutes)

Step 13: Compare Your Primers to Published Primers

AU-COI Primers (Hoque et al., 2022):

AU-COI-F: 5'-TATTTTCWACAAATCATAARGATATTGGWAC-3'
AU-COI-R: 5'-TAWACTTCWGGRTGWCCRAARAATCA-3'
Product size: 712 bp

Folmer Primers (Folmer et al., 1994):

LCO1490: 5'-GGTCAACAAATCATAAAGATATTGG-3'
HCO2198: 5'-TAAACTTCAGGGTGACCAAAAAATCA-3'
Product size: 710 bp

Compare:

  1. Primer binding locations (alignment positions)
  2. Primer lengths
  3. GC content
  4. Melting temperatures
  5. Degeneracy (number of degenerate positions, and the number of distinct sequences they encode)
  6. Product sizes
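
Several of these comparison metrics follow directly from the primer sequences and can be computed rather than counted by hand. The sketch below reports length, number of degenerate positions, fold degeneracy (the number of distinct sequences in the mix), and the possible GC range for the four published primers listed above. GC is given as a range because an R position contributes a G in only some variants. Melting temperature and binding location are not covered here.

```python
DEGENERATE = {"R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT"}

PRIMERS = {
    "AU-COI-F": "TATTTTCWACAAATCATAARGATATTGGWAC",
    "AU-COI-R": "TAWACTTCWGGRTGWCCRAARAATCA",
    "LCO1490": "GGTCAACAAATCATAAAGATATTGG",
    "HCO2198": "TAAACTTCAGGGTGACCAAAAAATCA",
}


def summarize(primer):
    """Length, degeneracy, and GC range of a (possibly degenerate) primer."""
    options = [DEGENERATE.get(base, base) for base in primer.upper()]
    fold = 1
    for opt in options:
        fold *= len(opt)
    # Minimum GC counts positions that are always G/C; maximum counts
    # positions that can be G/C in at least one variant
    gc_min = sum(all(b in "GC" for b in opt) for opt in options)
    gc_max = sum(any(b in "GC" for b in opt) for opt in options)
    n = len(options)
    return {
        "length": n,
        "degenerate_positions": sum(len(opt) > 1 for opt in options),
        "fold_degeneracy": fold,
        "gc_min_pct": round(100 * gc_min / n, 1),
        "gc_max_pct": round(100 * gc_max / n, 1),
    }


for name, seq in PRIMERS.items():
    print(name, summarize(seq))
```

The same function can be applied to your own forward and reverse primers to fill in the comparison.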

Step 14: Evaluate Design Success

Questions to answer:


Data Recording and Analysis

Primer Design Summary Table

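Parameter         | Your Forward Primer | Your Reverse Primer | AU-COI-F                        | AU-COI-R
Sequence (5'→3')  | ___                 | ___                 | TATTTTCWACAAATCATAARGATATTGGWAC | TAWACTTCWGGRTGWCCRAARAATCA
Length (bp)       | ___                 | ___                 | 31                              | 26
Tm (°C)           | ___                 | ___                 | ~55                             | ~52
GC content (%)    | ___                 | ___                 | 22.6-25.8                       | 30.8-42.3
Degeneracy (fold) | ___                 | ___                 | 8                               | 64
Product size (bp) | ___ (per pair)      |                     | 712 (per pair)                  |

Note: GC content for the AU-COI primers is a range because each R (A or G) position contributes a G in only some variants. Fold degeneracy is the number of distinct sequences in the primer mix: 2^3 = 8 for AU-COI-F (three two-fold positions) and 2^6 = 64 for AU-COI-R (six two-fold positions).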

Advanced Analysis (Optional)

BioPython Scripting for Automated Primer Design

Example Python script for primer analysis:

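import re
from itertools import product

from Bio import SeqIO
from Bio.SeqUtils import MeltingTemp as mt

# Plain bases plus the two-fold IUPAC codes from Step 9
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT"}

def tm_range(primer):
    """Return (min, max) Tm over every concrete variant of a degenerate primer.

    Nearest-neighbor tables cover unambiguous bases only, so each variant
    is scored separately. dnac1=250 (nM) matches the 0.25 µM oligo
    concentration used in Step 10.
    """
    variants = ("".join(v) for v in product(*(IUPAC[b] for b in primer)))
    tms = [mt.Tm_NN(v, Na=50, Mg=1.5, dnac1=250) for v in variants]
    return min(tms), max(tms)

# Load the template sequence (single-record FASTA)
record = SeqIO.read("aedes_coi.fasta", "fasta")
template = str(record.seq).upper()

# Define primers (AU-COI pair)
forward = "TATTTTCWACAAATCATAARGATATTGGWAC"
reverse = "TAWACTTCWGGRTGWCCRAARAATCA"

# Calculate the Tm range for each primer
for name, primer in (("Forward", forward), ("Reverse", reverse)):
    low, high = tm_range(primer)
    print(f"{name} Tm: {low:.1f}-{high:.1f}°C")

# Find the forward binding site; an exact string search cannot match
# degenerate bases, so each base becomes a regex character class
pattern = "".join(f"[{IUPAC[b]}]" for b in forward)
match = re.search(pattern, template)
forward_pos = match.start() if match else -1
print(f"Forward binds at position: {forward_pos}")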

Troubleshooting

Problem                       | Possible Cause            | Solution
No primers found              | Parameters too stringent  | Relax Tm range or product size
Primers bind non-specifically | Not conserved enough      | Choose more conserved regions
Large Tm difference           | GC content mismatch       | Adjust primer length to balance Tm
Hairpin formation             | Self-complementarity      | Redesign primer to avoid palindromes
Too many degenerate bases     | High sequence variability | Choose more conserved flanking region

Key Takeaways


Connection to Lab Activities

In this computational module, you:

  1. Designed primers for COI amplification from scratch
  2. Understood why AU-COI primers work better than Folmer primers for mosquitoes
  3. Learned to validate primer specificity bioinformatically
  4. Gained skills applicable to any molecular biology project

These primers will be used in: