Overview
This tutorial covers production-scale high-throughput virtual screening (HTS) in RevBind’s RevDock/RevScreen engine. Unlike small-batch static, flexible, or ensemble docking runs, this workflow is designed for screening multi-million compound libraries efficiently using GPU-enabled parallel batch execution.
When to Use This Workflow
This workflow is for production high-throughput screening, not for running static, flexible, or ensemble docking on a small compound set. Use it when:
- You have a large compound library (hundreds of thousands to millions of compounds)
- You want to maximize GPU throughput by running multiple pipelines in parallel
- You are executing a primary screen before downstream hit filtering and validation
Key Parameters
| Parameter | Recommended Value | Notes |
|---|---|---|
| Batch size | ~150,000 compounds | Optimized for GPU-enabled throughput |
| Exhaustiveness | 8 | Do not increase — compute scales disproportionately |
| Parallelization | One pipeline per batch | Run all batches simultaneously |
Step-by-Step Walkthrough
Navigate to RevDock and RevScreen
From the Revilico OS dashboard, open RevBind, then navigate to RevDock → RevScreen. This is the production HTS interface, distinct from the static, flexible, and ensemble docking modes.
Select Your Target Protein Structure
Load the protein structure you want to screen against. Confirm the correct structure is selected before proceeding — for example, the apo (no inhibitor) form of your target receptor.
Define the Binding Site and Docking Box
Review the target site and define the screening box:
- Determine where the box should be placed based on prior structural analysis
- Identify the key amino acids in the target binding site
- Calculate the grid center for the docking region
- Adjust box position until it is centered correctly on the pocket
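One common way to get a starting grid center is to take the midpoint of the bounding box around the key pocket atoms and pad the box on every side. The sketch below illustrates that arithmetic only; it is not necessarily how RevScreen computes its grid. The function name `docking_box`, the 4 Å padding, and the coordinates are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

Coord = Tuple[float, float, float]


def docking_box(coords: Iterable[Coord], padding: float = 4.0):
    """Return (center, size) of an axis-aligned box around the given atoms.

    `padding` (in angstroms) is added on every side; 4.0 is an assumed default.
    """
    pts = list(coords)
    if not pts:
        raise ValueError("need at least one atom coordinate")
    mins = [min(p[i] for p in pts) for i in range(3)]
    maxs = [max(p[i] for p in pts) for i in range(3)]
    # Center is the midpoint of the bounding box along each axis.
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    # Size is the extent along each axis plus padding on both sides.
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size


# Hypothetical coordinates (angstroms) of atoms from key binding-site residues.
pocket_atoms = [(10.0, 4.0, -2.0), (14.0, 8.0, 2.0), (12.0, 6.0, 0.0)]
center, size = docking_box(pocket_atoms)
print("center:", center)
print("size:", size)
```

Treat the result as a first guess, then adjust the box position visually until it sits on the pocket.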
Set Exhaustiveness to 8
Set Exhaustiveness = 8 for high-throughput mode. Do not increase this value — compute cost scales disproportionately at higher settings. The goal is efficient throughput with usable screening results, not maximum conformational sampling.
Prepare Your Compound Batches
For large libraries, you will receive compounds pre-split into batches of approximately 150,000 compounds each. For a 2 million compound library, this yields roughly 13–14 batches (2,000,000 / 150,000 ≈ 13.3).
If your library is not already batched, you can:
- Ask the Revilico team to split it for you at no cost
- Use Claude Code or RevAgent to batch it yourself
Batch files are named sequentially (Batch_1.sdf, Batch_2.sdf, etc.).
Different compound libraries (e.g., Enamine REAL, custom sets) can be screened separately using the same workflow. Each library gets its own set of batch pipelines.
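If you batch the library yourself, the splitting logic is small. The following is a minimal sketch, assuming the library is a single multi-record SDF file in which every record ends with a `$$$$` line (the standard SDF record delimiter). The helper names `batch_count`, `iter_sdf_records`, and `split_sdf` are hypothetical and not part of any Revilico tool.

```python
import math
import os
from typing import Iterable, Iterator, List


def batch_count(n_compounds: int, batch_size: int = 150_000) -> int:
    """Number of batches needed; the last batch may be partial."""
    return math.ceil(n_compounds / batch_size)


def iter_sdf_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield one SDF record (as text) per '$$$$' terminator line."""
    record: List[str] = []
    for line in lines:
        record.append(line)
        if line.strip() == "$$$$":
            yield "".join(record)
            record = []
    # Keep a trailing record that lacks a terminator rather than dropping it.
    if any(part.strip() for part in record):
        yield "".join(record)


def split_sdf(path: str, out_dir: str = ".", batch_size: int = 150_000) -> List[str]:
    """Stream `path` into Batch_1.sdf, Batch_2.sdf, ... and return the file names."""
    written: List[str] = []
    out = None
    try:
        with open(path, "r", encoding="utf-8") as src:
            for i, record in enumerate(iter_sdf_records(src)):
                if i % batch_size == 0:  # start a new batch file
                    if out is not None:
                        out.close()
                    name = os.path.join(out_dir, f"Batch_{i // batch_size + 1}.sdf")
                    out = open(name, "w", encoding="utf-8")
                    written.append(name)
                out.write(record)
    finally:
        if out is not None:
            out.close()
    return written


print(batch_count(2_000_000))  # 14: thirteen full batches plus one partial
```

Calling `split_sdf("library.sdf")` (the file name is a placeholder) streams the input record by record, so the full library never has to fit in memory.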
Configure and Launch Batch 1
Select Batch 1 as your compound library. Confirm:
- The correct compound library batch is loaded
- The correct protein structure is selected
- The box and grid settings from Step 3 (Define the Binding Site and Docking Box) are confirmed
Then click Run Pipeline to launch the batch.
Parallelize All Remaining Batches
This is the key step that makes production HTS efficient. Rather than waiting for each batch to complete sequentially, run all batches in parallel:
- Clear Batch 1 from the configuration
- Load Batch 2, confirm settings, click Run Pipeline
- Repeat immediately for Batch 3, Batch 4, and so on
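In this tutorial the batches are launched through the RevScreen interface, and no scripting API is described here. Purely to illustrate the "submit everything, then wait" pattern behind the steps above, the sketch below uses a stand-in function, `run_pipeline`, that only sleeps. Both that function and `run_all` are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Dict, List, Optional


def run_pipeline(batch_name: str) -> str:
    """Hypothetical stand-in for one screening pipeline; it only sleeps."""
    time.sleep(0.1)
    return f"{batch_name}: done"


def run_all(batch_names: List[str], max_parallel: Optional[int] = None) -> Dict[str, str]:
    """Submit every batch up front, then collect results as they finish."""
    workers = max(1, max_parallel or len(batch_names))
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(run_pipeline, name): name for name in batch_names}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


batches = [f"Batch_{i}.sdf" for i in range(1, 15)]
start = time.perf_counter()
status = run_all(batches)
elapsed = time.perf_counter() - start
print(f"{len(status)} pipelines finished in {elapsed:.2f}s")
```

Because all 14 stand-in jobs are submitted at once, total time is close to the duration of one job rather than the sum of all of them, which is the same effect you get by launching every batch without waiting for the previous one.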
Download Results and Run Downstream Analytics
Once all pipelines complete, your screening data becomes available for download. Export the results and run filtration and analytics using your preferred AI agent or analysis workflow.
A follow-up tutorial will cover the full post-screen analytics step in detail.
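As a rough illustration of the filtration step, the sketch below merges per-batch results and keeps the best-scoring compounds. It assumes each result is a `(compound_id, score)` pair and that a lower (more negative) docking score is better, as in Vina-style scoring. The actual RevScreen export format is not described here, so the data layout, compound names, and the `top_hits` helper are hypothetical.

```python
import heapq
from itertools import chain
from typing import Iterable, List, Tuple

Result = Tuple[str, float]  # (compound_id, docking_score); assumed layout


def top_hits(batch_results: Iterable[Iterable[Result]], n: int = 1000) -> List[Result]:
    """Return the n best-scoring compounds across all batches (lowest score first)."""
    merged = chain.from_iterable(batch_results)
    return heapq.nsmallest(n, merged, key=lambda result: result[1])


# Hypothetical scores from two batches.
batch_1 = [("cmpd_A", -7.2), ("cmpd_B", -9.1)]
batch_2 = [("cmpd_C", -8.4), ("cmpd_D", -6.0)]
hits = top_hits([batch_1, batch_2], n=2)
print(hits)
```

`heapq.nsmallest` avoids sorting the full merged list, which matters when the combined results run to millions of rows.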
Parallelization Strategy
The core performance principle of this workflow is parallelizing across batches:
- A single 2M compound job run sequentially is slow and resource-inefficient
- Splitting into ~150K batches and running all pipelines simultaneously compresses wall-clock time significantly
- GPU utilization stays high across the cluster rather than being bottlenecked by a single large job
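The wall-clock effect can be estimated with simple arithmetic. The sketch below uses an idealized model: every pipeline processes compounds at the same rate regardless of how many run concurrently, and batches execute in "waves" limited by the number of parallel slots. The throughput of 100 compounds per second is an invented placeholder, not a measured RevScreen figure.

```python
import math
from typing import Tuple


def estimate_hours(n_compounds: int, batch_size: int,
                   rate_per_sec: float, parallel_slots: int) -> Tuple[float, float]:
    """Return (sequential_hours, parallel_hours) under the idealized model."""
    if min(n_compounds, batch_size, parallel_slots) <= 0 or rate_per_sec <= 0:
        raise ValueError("all inputs must be positive")
    n_batches = math.ceil(n_compounds / batch_size)
    sequential = n_compounds / rate_per_sec
    # Each wave lasts as long as one full batch; this is an upper bound,
    # because the final batch is usually smaller than batch_size.
    waves = math.ceil(n_batches / parallel_slots)
    parallel = waves * (batch_size / rate_per_sec)
    return sequential / 3600, parallel / 3600


# 2M compounds, ~150K per batch, 14 pipelines at once, hypothetical rate.
seq_h, par_h = estimate_hours(2_000_000, 150_000, rate_per_sec=100.0, parallel_slots=14)
print(f"sequential ~{seq_h:.1f} h, parallel ~{par_h:.2f} h")
```

With these placeholder numbers the sequential estimate is about 5.6 hours and the parallel estimate about 0.42 hours. Real speedups depend on how many pipelines the cluster can actually run at full rate at the same time.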
Next Steps
After your HTS run completes, the standard hit progression workflow is:
- RevAnalytics: Filter by docking score, select top-ranked compounds, analyze interaction fingerprints
- Flexible Docking: Re-dock your top hits with flexible residues and CNN rescoring for higher-confidence rankings
- Ensemble Docking: Run your highest-priority compounds against MD-sampled protein conformations for maximum accuracy
- RevFEP: Compute rigorous binding free energies for your final shortlist

