flowchart LR
Target[Target Selection] --> Structure[Structure Analysis]
Structure --> Backbone[Backbone Generation<br/>RFdiffusion]
Backbone --> Sequence[Sequence Design<br/>LigandMPNN]
Sequence --> Validation[Structure Prediction<br/>AlphaFold2/ESMFold]
Validation --> Refinement[Iterate & Refine]
style Backbone fill:#f9f,stroke:#333,stroke-width:2px
style Sequence fill:#bbf,stroke:#333,stroke-width:2px
style Validation fill:#bfb,stroke:#333,stroke-width:2px
4. Reflection & Capstone Introduction
Before diving into the capstone project, take time to reflect on what you’ve learned and prepare your approach. This self-guided activity helps you consolidate your knowledge and plan your binder design project.
Live Workshop Session
📊 View slide deck
Part 1: Reflecting on Your Learning
Take 10-15 minutes to think through these questions. Consider writing your answers in a notebook or document—this reflection will help solidify your understanding and identify areas to revisit.
Structure Prediction
- What surprised you most about how AlphaFold2 or ESMFold work?
- When would you choose ESMFold over AlphaFold2? What are the trade-offs?
- How do you interpret pLDDT scores? What score range makes you confident in a prediction?
- What does the PAE matrix tell you that pLDDT alone doesn’t?
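To make the pLDDT and PAE questions concrete, here is a minimal sketch that labels per-residue pLDDT values and averages PAE over residue pairs that span two chains. The confidence bands follow the ones used by the AlphaFold Protein Structure Database; the function names and the toy numbers are illustrative assumptions, not output from a real prediction.

```python
from statistics import mean

# Confidence bands as used by the AlphaFold Protein Structure Database.
PLDDT_BANDS = [(90.0, "very high"), (70.0, "confident"), (50.0, "low")]


def plddt_band(score):
    """Map a per-residue pLDDT (0-100) to a confidence label."""
    for cutoff, label in PLDDT_BANDS:
        if score >= cutoff:
            return label
    return "very low"


def mean_interchain_pae(pae, n_target):
    """Average PAE (in Å) over residue pairs that span the two chains.

    pae is a square matrix (list of lists) with the target residues first,
    followed by the binder residues.
    """
    n = len(pae)
    cross = [
        pae[i][j]
        for i in range(n)
        for j in range(n)
        if (i < n_target) != (j < n_target)  # one residue in each chain
    ]
    return mean(cross)


# Toy complex (made-up numbers): 2 target residues, then 2 binder residues.
plddt = [92.0, 88.0, 75.0, 45.0]
pae = [
    [0.5, 1.0, 12.0, 14.0],
    [1.0, 0.5, 11.0, 13.0],
    [12.0, 11.0, 0.5, 1.5],
    [14.0, 13.0, 1.5, 0.5],
]

bands = [plddt_band(score) for score in plddt]
interface_pae = mean_interchain_pae(pae, n_target=2)
print(bands)
print(interface_pae)
```

In this toy case most residues look locally confident, yet the cross-chain PAE is high: the model is unsure how the two chains sit relative to each other, which is information pLDDT alone does not carry.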
Protein Design
- How does RFdiffusion “know” what shape to generate? What’s the role of the noise schedule?
- Why do we need sequence design (LigandMPNN) after backbone generation? Why can’t RFdiffusion output sequences directly?
- What’s the difference between unconditional and conditional generation in RFdiffusion?
- How do hotspot residues guide the design process?
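One way to see the difference between unconditional and conditional generation is to compare the options you would pass to RFdiffusion. The sketch below builds argument lists in the style documented in the RFdiffusion README (`contigmap.contigs`, `inference.input_pdb`, `ppi.hotspot_res`, and so on); check the option names against your installed version. The file paths, chain ranges, and hotspot residues are hypothetical placeholders.

```python
def rfdiffusion_args(contigs, output_prefix, num_designs,
                     input_pdb=None, hotspots=None):
    """Build option strings for RFdiffusion's scripts/run_inference.py.

    Only assembles the arguments; it does not run anything.
    """
    args = [
        f"contigmap.contigs=[{contigs}]",
        f"inference.output_prefix={output_prefix}",
        f"inference.num_designs={num_designs}",
    ]
    if input_pdb is not None:
        args.append(f"inference.input_pdb={input_pdb}")
    if hotspots:
        args.append(f"ppi.hotspot_res=[{','.join(hotspots)}]")
    return args


# Unconditional: only a chain length is specified, no target structure.
unconditional = rfdiffusion_args("100-100", "out/uncond", 10)

# Conditional binder design: keep target chain A residues 1-150 fixed,
# add a chain break (/0), then diffuse a new 50-100 residue binder that is
# steered toward the (hypothetical) hotspot residues on the target.
binder = rfdiffusion_args(
    "A1-150/0 50-100",
    "out/binder",
    10,
    input_pdb="target.pdb",
    hotspots=["A59", "A83", "A91"],
)

print(" ".join(unconditional))
print(" ".join(binder))
```

To launch a run you could pass the list to `subprocess.run(["python", "scripts/run_inference.py", *binder])` from the RFdiffusion directory. When typing the command in a shell instead, quote the bracketed options so the shell does not interpret them.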
Computational Concepts
- Why are GPUs faster than CPUs for these ML tools?
- When might a CPU actually be faster than a GPU?
- What’s the benefit of vectorization in numerical computing?
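As a small illustration of vectorization, the sketch below computes all pairwise distances between residue coordinates twice: once with explicit Python loops and once with NumPy broadcasting. The coordinates are random stand-ins rather than a real structure, and the function names are illustrative.

```python
import numpy as np


def pairwise_distances_loop(coords):
    """Distance matrix computed one pair at a time in Python."""
    n = len(coords)
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = np.sqrt(((coords[i] - coords[j]) ** 2).sum())
    return out


def pairwise_distances_vectorized(coords):
    """Same matrix computed with broadcasting, with no Python-level loops."""
    diff = coords[:, None, :] - coords[None, :, :]  # shape (n, n, 3)
    return np.sqrt((diff ** 2).sum(axis=-1))


rng = np.random.default_rng(0)
coords = rng.normal(size=(50, 3))  # 50 fake C-alpha positions

loop_result = pairwise_distances_loop(coords)
vectorized_result = pairwise_distances_vectorized(coords)
print(np.allclose(loop_result, vectorized_result))
```

Both functions return the same matrix. The vectorized version hands the whole computation to NumPy's compiled routines in a few calls, which is the same idea that lets GPUs process large batches of arithmetic at once.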
Connecting the Tools
- Sketch out the typical workflow for designing a protein binder. What tools do you use at each step?
- Where in the pipeline would you use structure prediction, and at which point does it serve as validation?
- What would you do if your designed binder had low pLDDT when you predicted its structure?
Part 2: Introduction to the Capstone Project
The capstone project brings together everything you’ve learned. You’ll design a de novo protein binder for a target of your choice.
What is Binder Design?
Binder design is formulated as: “Given a target protein (and optionally a specific epitope), design a smaller protein capable of binding the target.”
This is one of the most important problems in computational protein design, with applications in:
- Therapeutics: Blocking disease-related protein interactions
- Diagnostics: Creating detection reagents
- Research tools: Probing protein function
- Synthetic biology: Building new biological circuits
The Design Pipeline
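The stages in the flowchart at the top of this page (backbone generation, sequence design, structure prediction, then iterate and refine) can be sketched as a loop. This is only a skeleton under stated assumptions: the `generate`, `design`, `predict`, and `accept` callables are hypothetical stand-ins for RFdiffusion, LigandMPNN, AlphaFold2/ESMFold, and your own filter, and the fake scores exist only so the example runs.

```python
def run_design_rounds(generate, design, predict, accept, wanted, max_rounds):
    """Repeat generate -> design -> predict until enough designs pass."""
    accepted = []
    rounds_used = 0
    for round_index in range(1, max_rounds + 1):
        rounds_used = round_index
        for backbone in generate(round_index):
            sequence = design(backbone)
            metrics = predict(sequence)
            if accept(metrics):
                accepted.append((sequence, metrics))
        if len(accepted) >= wanted:
            break  # enough candidates; stop iterating
    return accepted, rounds_used


# Deterministic stand-ins so the sketch runs without any ML tools installed.
def fake_generate(round_index):
    return [(round_index, k) for k in range(3)]  # three "backbones" per round


def fake_design(backbone):
    round_index, k = backbone
    return f"design_r{round_index}_{k}"


def fake_predict(sequence):
    _, round_tag, k = sequence.split("_")
    # Pretend that refinement makes later rounds score higher.
    return {"mean_plddt": 60.0 + 10.0 * int(round_tag[1:]) + int(k)}


accepted, rounds_used = run_design_rounds(
    fake_generate,
    fake_design,
    fake_predict,
    accept=lambda metrics: metrics["mean_plddt"] >= 80.0,
    wanted=2,
    max_rounds=5,
)
print(rounds_used, [name for name, _ in accepted])
```

Passing the stages in as functions keeps the loop independent of any particular tool, so you can swap a stub for a real call once each step works on its own.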
Available Targets
You’ll choose from one of these curated targets, each representing a different challenge:
| Category | Targets | Challenge |
|---|---|---|
| Immune Checkpoint | PD-L1, IFNAR2, IL-7Rα | Well-studied interfaces, therapeutic relevance |
| Allergen | Bet v 1 | Smaller target, antibody-like design |
| Enzymes | TrkA, TEM-1, GM2 Activator, Beta-Glucosidase | Diverse binding sites, varied complexity |
Consider what interests you most:
- Therapeutic relevance? → PD-L1, IFNAR2, IL-7Rα
- Smaller, focused project? → Bet v 1
- Enzyme inhibition? → TEM-1, Beta-Glucosidase
- Receptor binding? → TrkA
Part 3: Planning Your Approach
Before starting the capstone, develop a plan. Answer these questions to guide your work.
Target Selection
- Which target interests you most? Why?
- What PDB structure will you use? Have you looked at it in PyMOL?
- What residues define the binding interface? (Check the target’s detail page)
- Are there any potential challenges with your target (flexible regions, glycosylation sites, etc.)?
Design Strategy
- What size binder will you aim for? (Typical range: 50-100 residues)
- Will you use hotspot conditioning in RFdiffusion? Which residues?
- How many designs will you generate at each step?
- What metrics will you use to filter designs? (pLDDT, PAE, interface contacts, etc.)
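Filtering on several metrics at once can be sketched as below. The record fields mirror the metrics listed above, but the threshold values and the example designs are illustrative assumptions, not recommended cutoffs or real results; choose your own and write them down.

```python
from dataclasses import dataclass


@dataclass
class DesignMetrics:
    name: str
    mean_plddt: float        # 0-100, higher is better
    interface_pae: float     # Å, lower is better
    interface_contacts: int  # residue pairs in contact across the interface


def passes(design, min_plddt=80.0, max_interface_pae=10.0, min_contacts=10):
    """Return True if the design clears every threshold (values are assumptions)."""
    return (
        design.mean_plddt >= min_plddt
        and design.interface_pae <= max_interface_pae
        and design.interface_contacts >= min_contacts
    )


# Made-up example records.
designs = [
    DesignMetrics("binder_001", 91.2, 6.3, 18),
    DesignMetrics("binder_002", 84.0, 14.8, 22),  # fails: interface PAE too high
    DesignMetrics("binder_003", 72.5, 5.1, 15),   # fails: pLDDT too low
    DesignMetrics("binder_004", 88.7, 4.9, 12),
]

# Keep designs that pass, ranked by interface PAE (most confident first).
kept = sorted(
    (d for d in designs if passes(d)),
    key=lambda d: d.interface_pae,
)
print([d.name for d in kept])
```

Keeping the thresholds as function parameters makes it easy to record exactly which cutoffs produced a given shortlist.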
Documentation Plan
- How will you keep track of your experiments? (Lab notebook, Jupyter notebook, README, etc.)
- What format will you use to present your work? (Report, slides, video, GitHub repo)
- What commands and settings will you record? (Think about reproducibility)
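For reproducibility, one lightweight option is to append every run to a JSON Lines file. The sketch below uses only the Python standard library; the field names, the `log_run` helper, and the example entries are assumptions you should adapt to your own project.

```python
import json
import platform
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_run(log_path, tool, command, settings, notes=""):
    """Append one run record as a single JSON line and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "command": command,
        "settings": settings,
        "notes": notes,
        "python": platform.python_version(),
    }
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


# Demo in a temporary directory; in a real project, keep the log in your repo.
with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "runs.jsonl"
    log_run(
        log_file,
        tool="RFdiffusion",
        command="python scripts/run_inference.py inference.num_designs=10",
        settings={"num_designs": 10, "hotspots": ["A59", "A83", "A91"]},
        notes="first pass, hypothetical hotspots",
    )
    log_run(
        log_file,
        tool="ESMFold",
        command="predicted in notebook",
        settings={"n_sequences": 10},
    )
    records = [
        json.loads(line)
        for line in log_file.read_text(encoding="utf-8").splitlines()
    ]

print(len(records), [r["tool"] for r in records])
```

An append-only log like this is easy to diff in version control and can be loaded later to rebuild a table of what was run with which settings.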
Part 4: Getting Started Checklist
Before moving to the capstone, make sure you can answer “yes” to these:
Head to the Capstone Project page to begin! Each target has a dedicated page with specific guidance, PDB information, and strategy tips.
Additional Resources
If you want to deepen your understanding before starting the capstone:
Papers
- RFdiffusion paper - Watson et al., Nature 2023
- AlphaFold2 paper - Jumper et al., Nature 2021
- BindCraft preprint - Pacesa et al., bioRxiv 2024
Communities
- RosettaCommons Forums
- OpenFold Discussions
- Protein design Twitter/X community (#proteindesign)