Monday: Tool Installation

Overview

Monday is dedicated to installing the essential ML-based protein design and structure prediction tools on your HPC cluster. Each module guides you through installing a specific tool, from structure predictors like LocalColabFold and ESMFold to design tools like LigandMPNN and BindCraft.

By the end of Monday, you’ll be able to:
  • Request GPU resources and navigate a module-based HPC environment (CUDA, conda, containers).
  • Install and smoke-test each of the core structure prediction tools (LocalColabFold, ESMFold, Chai-1, Boltz-2) on your cluster.
  • Install and smoke-test the core design and docking tools (LigandMPNN, RFdiffusion2, DiffDock-PP, PLACER, BindCraft).
  • Explain, at a high level, what each tool takes as input, produces as output, and when you’d reach for it.
  • Troubleshoot the most common install failures (CUDA mismatch, missing weights, environment conflicts) well enough to keep moving.

Pre-work

Before starting the main modules, complete these pre-work assignments to ensure your local environment is ready.

| # | Module | Description | Status |
| --- | --- | --- | --- |
| P1 | Environment & GitHub Setup | conda, git, and GitHub | Required |
| P2 | PyMOL & VS Code | Install visualization and coding tools | Required |
| P3 | Python Refresher | Refresh Python skills for bioinformatics | Recommended |

Getting Started

Before installing individual tools, complete the HPC Setup module to ensure your environment is properly configured:

| # | Module | Description | Status |
| --- | --- | --- | --- |
| 1 | Common HPC Setup | CUDA, Conda, containers, and environment setup | Start Here |

Tool Installation Modules

| # | Tool | Description | Status |
| --- | --- | --- | --- |
| 2 | LocalColabFold | Fast AlphaFold2 structure prediction | Required |
| 3 | LigandMPNN | Context-aware protein sequence design | Required |
| 4 | RFdiffusion2 | Atom-level active site scaffolding | Required |
| 5 | ESMFold | Single-sequence structure prediction | Required |
| 6 | OpenFold | Open-source AlphaFold2 reproduction | Optional |
| 7 | Chai-1 | Multi-modal biomolecular structure prediction | Required |
| 8 | Boltz-2 | Structure + binding affinity prediction | Required |
| 9 | DiffDock-PP | Protein-protein docking | Required |
| 10 | PLACER | Protein-ligand conformational ensemble prediction | Required |
| 11 | BindCraft | End-to-end binder design pipeline | Required |
| 12 | ESM3 | Multimodal protein generation | Optional |
| 13 | RFdiffusion All Atom | All-atom protein design (predecessor to RFdiffusion2) | Optional |

Tool Categories

Structure Prediction

| Tool | Input | Output | Speed | Best For |
| --- | --- | --- | --- | --- |
| LocalColabFold | Sequence + MSA | Structure | Medium | High-accuracy single proteins |
| ESMFold | Sequence only | Structure | Fast | Quick predictions, no MSA needed |
| OpenFold | Sequence + MSA | Structure | Medium | Research, custom training |
| Chai-1 | Multi-modal | Complex structures | Medium | Proteins + ligands + nucleic acids |
| Boltz-2 | Multi-modal | Structure + affinity | Medium | Drug discovery |

Protein Design

| Tool | Input | Output | Best For |
| --- | --- | --- | --- |
| LigandMPNN | Backbone | Sequence | Sequence design with ligand context |
| RFdiffusion2 | Constraints | Backbone | Active site scaffolding |
| BindCraft | Target structure | Binder designs | End-to-end binder design |

Docking

| Tool | Input | Output | Best For |
| --- | --- | --- | --- |
| DiffDock-PP | Two proteins | Docked complex | Protein-protein docking |
| PLACER | Protein + ligand | Ensemble poses | Protein-ligand docking |

Tips for Success

  1. Start with HPC Setup - Complete Module 1 before installing any tools

  2. Use separate conda environments - Each tool should have its own environment to avoid dependency conflicts

  3. Check GPU availability - Most tools require GPU access; make sure you can request GPU nodes on your cluster

  4. Note your paths - Keep track of where you install each tool; you’ll need these paths later

  5. Test each installation - Don’t move on until you’ve verified each tool works

  6. Check shared resources - Your cluster may already host shared databases (AlphaFold, ColabFold), so look for them before downloading your own copies
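
Tips 3 to 5 can be partly automated. The following is a minimal Python sketch, not part of any tool's official install procedure. It checks whether a GPU is visible on the current node by calling `nvidia-smi -L`, and it keeps a small JSON file recording where each tool is installed and whether it has passed a smoke test. The function names, the registry layout, and any paths you pass in are illustrative assumptions.

```python
import json
import shutil
import subprocess
from pathlib import Path


def gpu_visible() -> bool:
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # no NVIDIA driver tools on this node (e.g. a login node)
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # "nvidia-smi -L" prints one line per device, each starting with "GPU".
    return result.returncode == 0 and "GPU" in result.stdout


def _load(registry: Path) -> dict:
    """Read the registry file, or return an empty dict if it does not exist."""
    return json.loads(registry.read_text()) if registry.exists() else {}


def record_install(
    registry: Path, tool: str, env_name: str, install_dir: str, tested: bool
) -> dict:
    """Add or update one tool entry and return the whole registry."""
    entries = _load(registry)
    entries[tool] = {"env": env_name, "path": install_dir, "smoke_tested": tested}
    registry.write_text(json.dumps(entries, indent=2, sort_keys=True))
    return entries


def untested(registry: Path) -> list:
    """List tools that are recorded but have not passed a smoke test."""
    entries = _load(registry)
    return sorted(name for name, info in entries.items() if not info["smoke_tested"])
```

One way to use this is to call `record_install(...)` at the end of each module with that tool's conda environment name and install directory, then run `untested(...)` before moving on to see what still needs verifying. Run `gpu_visible()` inside a GPU job rather than on a login node, since login nodes often have no GPU attached.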

Resource Overview

Approximate requirements across all tools:

| Resource | Total Needed |
| --- | --- |
| Disk Space | 50-100 GB (tools only), 2+ TB (with databases) |
| GPU RAM | 16-80 GB depending on task |
| CPU RAM | 32-64 GB |
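
Before you start installing, it can help to confirm that your target filesystem has room. The sketch below uses Python's standard `shutil.disk_usage` to compare free space against the upper end of the tools-only estimate above. The `TOOLS_ONLY_GB` constant and the helper names are assumptions for illustration, and many clusters also enforce per-user quotas that this check cannot see.

```python
import shutil

GB = 10 ** 9  # decimal gigabytes, matching the units in the table above

# Upper end of the "tools only" disk estimate; adjust for your own plan.
TOOLS_ONLY_GB = 100


def free_gb(path: str) -> float:
    """Return free space, in GB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / GB


def has_room(path: str, needed_gb: float = TOOLS_ONLY_GB) -> bool:
    """Return True if the filesystem containing `path` has at least `needed_gb` free."""
    return free_gb(path) >= needed_gb


if __name__ == "__main__":
    # "." is a stand-in; point this at the directory where you plan to install.
    target = "."
    print(f"{free_gb(target):.1f} GB free; enough for tools: {has_room(target)}")
```

If the check fails in your home directory, try the scratch or project filesystem your cluster provides, since home directories are often the smallest allocation.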

Getting Help

If you encounter issues:

  1. Check the tool’s official documentation (linked in each module)
  2. Search for existing GitHub issues on the tool’s repository
  3. Report an issue on the bootcamp site (GitHub account required)
