4. Reflection & Capstone Introduction

Before diving into the capstone project, take time to reflect on what you’ve learned and prepare your approach. This self-guided activity helps you consolidate your knowledge and plan your binder design project.

Live Workshop Session

🎥 Live workshop recording — Protein design roundtable discussion
📊 View slide deck

Part 1: Reflecting on Your Learning

Take 10 to 15 minutes to think through these questions. Writing your answers in a notebook or document will help solidify your understanding and identify areas to revisit.

Structure Prediction

Note: Reflection Questions
  1. What surprised you most about how AlphaFold2 or ESMFold works?
  2. When would you choose ESMFold over AlphaFold2? What are the trade-offs?
  3. How do you interpret pLDDT scores? What score range makes you confident in a prediction?
  4. What does the PAE matrix tell you that pLDDT alone doesn’t?
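As a starting point for the pLDDT question, here is a minimal sketch that reads per-residue pLDDT from a predicted structure. It relies on AlphaFold2 writing pLDDT into the B-factor column of its PDB output. The function names and the 70-point "confident" cutoff are illustrative assumptions, not course requirements. Some tools report pLDDT on a 0 to 1 scale, so check the scale of your file first.

```python
from statistics import mean


def plddt_per_residue(pdb_text: str) -> dict[tuple[str, int], float]:
    """Return {(chain, residue_number): pLDDT} read from CA atoms.

    Assumes the predictor stored pLDDT in the B-factor column
    (PDB fixed-width columns 61-66), as AlphaFold2 does.
    """
    scores = {}
    for line in pdb_text.splitlines():
        # Atom name sits in columns 13-16, chain in 22, residue number in 23-26.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resnum = int(line[22:26])
            scores[(chain, resnum)] = float(line[60:66])
    return scores


def summarize_plddt(scores: dict, confident: float = 70.0) -> dict[str, float]:
    """Mean pLDDT and the fraction of residues at or above a cutoff.

    The default cutoff of 70 is an assumption for illustration only.
    """
    values = list(scores.values())
    if not values:
        raise ValueError("no CA atoms found; is this a PDB-format file?")
    return {
        "mean_plddt": mean(values),
        "fraction_confident": sum(v >= confident for v in values) / len(values),
    }
```

To use it, read your prediction with `open(path).read()` and pass the text to `plddt_per_residue`. A per-residue view is usually more informative than the mean alone, because a high average can hide a disordered loop at the interface.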

Protein Design

Note: Reflection Questions
  1. How does RFdiffusion “know” what shape to generate? What’s the role of the noise schedule?
  2. Why do we need sequence design (LigandMPNN) after backbone generation? Why can’t RFdiffusion output sequences directly?
  3. What’s the difference between unconditional and conditional generation in RFdiffusion?
  4. How do hotspot residues guide the design process?

Computational Concepts

Note: Reflection Questions
  1. Why are GPUs faster than CPUs for these ML tools?
  2. When might CPU actually be faster than GPU?
  3. What’s the benefit of vectorization in numerical computing?
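To make the vectorization question concrete, the sketch below computes the same pairwise distance matrix twice: once with explicit Python loops and once with NumPy broadcasting. The function names are illustrative, and the coordinates are assumed to be an `(n, 3)` array such as the CA positions of a protein.

```python
import numpy as np


def pairwise_distances_loop(coords: np.ndarray) -> np.ndarray:
    """Distance matrix built one element at a time in Python."""
    n = len(coords)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            diff = coords[i] - coords[j]
            dist[i, j] = np.sqrt((diff ** 2).sum())
    return dist


def pairwise_distances_vectorized(coords: np.ndarray) -> np.ndarray:
    """Same result using broadcasting; the loops run inside NumPy."""
    # (n, 1, 3) - (1, n, 3) broadcasts to an (n, n, 3) array of differences.
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))
```

Timing both versions on a few hundred residues (for example with `timeit`) is a quick way to see the gap for yourself. The vectorized version trades memory for speed, since it materializes the full `(n, n, 3)` difference array.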

Connecting the Tools

Note: Reflection Questions
  1. Sketch out the typical workflow for designing a protein binder. What tools do you use at each step?
  2. Where in the pipeline would you use structure prediction to analyze the target? Where would you use it to validate your designs?
  3. What would you do if your designed binder had low pLDDT when you predicted its structure?

Part 2: Introduction to the Capstone Project

The capstone project brings together everything you’ve learned. You’ll design a de novo protein binder for a target of your choice.

What is Binder Design?

Binder design is formulated as: “Given a target protein (and optionally a specific epitope), design a smaller protein capable of binding the target.”

This is one of the most important problems in computational protein design, with applications in:

  • Therapeutics: Blocking disease-related protein interactions
  • Diagnostics: Creating detection reagents
  • Research tools: Probing protein function
  • Synthetic biology: Building new biological circuits

The Design Pipeline

flowchart LR
    Target[Target Selection] --> Structure[Structure Analysis]
    Structure --> Backbone[Backbone Generation<br/>RFdiffusion]
    Backbone --> Sequence[Sequence Design<br/>LigandMPNN]
    Sequence --> Validation[Structure Prediction<br/>AlphaFold2/ESMFold]
    Validation --> Refinement[Iterate & Refine]

    style Backbone fill:#f9f,stroke:#333,stroke-width:2px
    style Sequence fill:#bbf,stroke:#333,stroke-width:2px
    style Validation fill:#bfb,stroke:#333,stroke-width:2px

Available Targets

You’ll choose from one of these curated targets, each representing a different challenge:

| Category | Targets | Challenge Level |
|---|---|---|
| Immune Checkpoint | PD-L1, IFNAR2, IL-7Rα | Well-studied interfaces, therapeutic relevance |
| Allergen | Bet v 1 | Smaller target, antibody-like design |
| Enzymes | TrkA, TEM-1, GM2 Activator, Beta-Glucosidase | Diverse binding sites, varied complexity |

Tip: Choosing Your Target

Consider what interests you most:

  • Therapeutic relevance? → PD-L1, IFNAR2, IL-7Rα
  • Smaller, focused project? → Bet v 1
  • Enzyme inhibition? → TEM-1, Beta-Glucosidase
  • Receptor binding? → TrkA


Part 3: Planning Your Approach

Before starting the capstone, develop a plan. Answer these questions to guide your work.

Target Selection

Important: Planning Questions
  1. Which target interests you most? Why?
  2. What PDB structure will you use? Have you looked at it in PyMOL?
  3. What residues define the binding interface? (Check the target’s detail page)
  4. Are there any potential challenges with your target (flexible regions, glycosylation sites, etc.)?

Design Strategy

Important: Planning Questions
  1. What size binder will you aim for? (Typical range: 50-100 residues)
  2. Will you use hotspot conditioning in RFdiffusion? Which residues?
  3. How many designs will you generate at each step?
  4. What metrics will you use to filter designs? (pLDDT, PAE, interface contacts, etc.)
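One way to think about the filtering question is as a simple pass/fail function over per-design metrics. The sketch below is hypothetical throughout: the `DesignMetrics` fields, the threshold values, and the choice to rank by interface PAE are placeholders to adapt, not recommended settings.

```python
from dataclasses import dataclass


@dataclass
class DesignMetrics:
    """Metrics you might collect per design (field names are hypothetical)."""
    name: str
    mean_plddt: float        # 0-100 scale
    interface_pae: float     # mean binder-target PAE, in angstroms
    interface_contacts: int  # residue pairs in contact across the interface


def passes_filters(design: DesignMetrics,
                   min_plddt: float = 80.0,
                   max_interface_pae: float = 10.0,
                   min_contacts: int = 10) -> bool:
    """True if the design clears every threshold (defaults are placeholders)."""
    return (design.mean_plddt >= min_plddt
            and design.interface_pae <= max_interface_pae
            and design.interface_contacts >= min_contacts)


def rank_designs(designs: list[DesignMetrics], **thresholds) -> list[DesignMetrics]:
    """Keep designs that pass, ordered from lowest to highest interface PAE."""
    kept = [d for d in designs if passes_filters(d, **thresholds)]
    return sorted(kept, key=lambda d: d.interface_pae)
```

Writing the thresholds down as explicit parameters also answers part of the documentation question: the filter you applied is recorded in code rather than remembered.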

Documentation Plan

Important: Planning Questions
  1. How will you keep track of your experiments? (Lab notebook, Jupyter notebook, README, etc.)
  2. What format will you use to present your work? (Report, slides, video, GitHub repo)
  3. What commands and settings will you record? (Think about reproducibility)
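For the reproducibility question, one lightweight option is to append a record of every run to a JSON Lines file. This is only a sketch: the `log_run` and `read_runs` helpers and the record fields are assumptions chosen for illustration, and a lab notebook or README can serve the same purpose.

```python
import json
from datetime import datetime, timezone


def log_run(log_path: str, tool: str, command: str,
            settings: dict, notes: str = "") -> dict:
    """Append one run record as a single JSON line and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,          # e.g. "RFdiffusion" or "LigandMPNN"
        "command": command,    # the exact command line you ran
        "settings": settings,  # any parameters worth searching on later
        "notes": notes,
    }
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def read_runs(log_path: str) -> list[dict]:
    """Load every record back, oldest first."""
    with open(log_path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because each line is an independent JSON object, the log can be appended to from different scripts and still be parsed later, for example to find which settings produced a given design.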

Part 4: Getting Started Checklist

Before moving to the capstone, make sure you can answer “yes” to these:

Tip: Ready to Start?

Head to the Capstone Project page to begin! Each target has a dedicated page with specific guidance, PDB information, and strategy tips.


Additional Resources

If you want to deepen your understanding before starting the capstone:

Papers

Communities