RevScreen - Ensemble Docking

Overview

Ensemble docking is the highest-accuracy docking mode in Rev-Bind’s Virtual Screening Engine. Rather than docking against a single static protein structure, it samples the protein’s dynamic conformational landscape via molecular dynamics (MD) simulation, then docks your compound library against multiple protein snapshots simultaneously. The result is a rich matrix of binding data that captures how compounds interact across the full dynamic behavior of the target.


Why Ensemble Docking?

	Static / Flexible Docking	Ensemble Docking
Protein representation	Single structure	Multiple MD snapshots
Conformational sampling	Limited to side-chain flexibility	Full protein dynamics
Accuracy	Good	High
Compute time	Fast	Longer
Best use case	Library-scale initial screening	Lead compound prioritization

The trade-off is clear: ensemble docking takes more time, but it provides a substantially more accurate and dynamic picture of protein-ligand interactions. Its predictions typically hold up better than those from static docking, particularly for flexible or allosteric targets.

Workflow Overview

Ensemble docking on Revilico OS involves three sequential stages:

Run a Protein-Water MD Simulation — Sample the protein’s conformational space over time
Pre-process the MD Trajectory — Extract protein snapshots at defined time intervals
Run Ensemble Docking — Dock your compound library against each snapshot and aggregate results

Stage 1: Run a Protein-Water MD Simulation

Before you can run ensemble docking, you need an MD trajectory to draw protein conformations from. This is done using Revilico’s RevMD-Aqua engine.

Navigate to Protein-Water MD

From the Revilico OS dashboard, go to Dynamic Molecular Interactions → Protein Water MD.

Name Your Simulation

Give your simulation a descriptive name (e.g., AXL-apo-50ns) to make it easy to identify when you return to ensemble docking.

Upload Your PDB File

Upload your protein structure PDB file, or drag-drop it from the file pane on the right-hand side of the screen.

Configure Force Fields and Solvent Conditions

Select your force field parameters. If you are new to molecular dynamics, the recommended defaults are appropriate for most standard protein systems. For specialized systems, you can additionally configure:

Salt and ion concentration — to match physiological or specific assay conditions
pH — to reflect your experimental environment

Use the Revilico Interpreter to get plain-language explanations of any parameter on screen. This is particularly useful if you are newer to molecular dynamics setup.

Set Simulation Length

Define your simulation time in nanoseconds. General guidance:

Minimum: 50 ns for most protein targets
Recommended: Scale upward based on protein size and the expected timescale of conformational transitions
Longer simulations capture slower motions but require more compute time

Run the Simulation

Click Run Simulation. The platform will execute the protein-water MD job. You will receive a notification when the trajectory is ready. You can find a dedicated walkthrough video for RevMD-Aqua in the RevMD tutorials.

Stage 2: Pre-process the MD Trajectory

Once your MD simulation is complete, return to the Virtual Screening Engine to prepare the protein snapshots for docking.

Navigate to Ensemble Docking → Pre-process MD

From the Virtual Screening Engine, select Ensemble Docking, then click Pre-process MD.

Select Your MD Simulation Pipeline

From the dropdown, select the MD simulation you completed in Stage 1. The platform will automatically preload the min (start time) and max (end time) values from your trajectory.

Define Your Sampling Window

Adjust the time sliders to set which portion of the trajectory you want to sample from:

Full trajectory: Leave the sliders at the preloaded min and max values to sample conformations across the entire simulation
Latter half only: Advance the start slider forward to skip the early equilibration phase and focus on post-equilibrium conformations, which are generally more biologically representative

Sampling the latter half of the trajectory — after the protein has equilibrated — typically produces higher-quality ensemble members. Early frames may still reflect the starting crystal structure rather than native dynamics.

Set the Snapshot Interval

Define the time interval (in ns) between snapshots. The number of snapshots extracted is:

Number of snapshots = (max time − min time) ÷ interval

Example: A 0–50 ns trajectory with an interval of 10 ns will produce (50 − 0) ÷ 10 = 5 snapshots, one per 10 ns increment.

More snapshots increase the diversity and representativeness of the ensemble, but proportionally increase docking compute time. A starting range of 5–10 snapshots is practical for most campaigns.
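The snapshot formula and its effect on docking workload can be sketched in a few lines of Python. This is only an illustration for planning purposes: the function names are hypothetical, rounding down when the window is not an exact multiple of the interval is an assumption, and the library size of 1,000 compounds is made up.

```python
def snapshot_count(min_time_ns: float, max_time_ns: float, interval_ns: float) -> int:
    """Number of snapshots = (max time - min time) / interval.

    Assumption: a partial interval at the end of the window is dropped.
    """
    if interval_ns <= 0:
        raise ValueError("interval_ns must be positive")
    if max_time_ns < min_time_ns:
        raise ValueError("max_time_ns must not be smaller than min_time_ns")
    return int((max_time_ns - min_time_ns) // interval_ns)


def docking_runs(n_snapshots: int, n_compounds: int) -> int:
    """Every compound is docked against every snapshot."""
    return n_snapshots * n_compounds


full = snapshot_count(0, 50, 10)         # full 0-50 ns window, 10 ns interval
latter_half = snapshot_count(25, 50, 5)  # latter half only, finer 5 ns interval
print(full, latter_half)                 # 5 5
print(docking_runs(full, 1000))          # 5000 individual docking runs
```

Halving the interval over the same window doubles the snapshot count, and therefore the number of docking runs, which is why a modest starting ensemble is usually the practical choice.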

Start Pre-processing

Click Start Preprocessing. The platform will extract protein PDB snapshots at each interval and prepare them for the docking stage. These pre-processed pipelines will appear in the ensemble docking interface.

Stage 3: Run Ensemble Docking

With your pre-processed snapshots ready, you can now run the ensemble docking campaign against all protein conformations simultaneously.

Open Ensemble Docking

In the Virtual Screening Engine, select Run Ensemble Docking.

Select Your Simulation and Pre-processed Pipeline

From the dropdown menus:

Select the MD simulation pipeline you ran in Stage 1
Select the pre-processed pipeline you created in Stage 2 (the one containing your defined snapshot interval and sampling window)

This gives you a set of PDB files, one for each protein conformation snapshot.

Set the Grid Center for Each Snapshot

This is the critical step unique to ensemble docking. Because the protein moves and flexes across snapshots, the binding pocket position shifts with each conformation, so the docking box must be defined independently for each snapshot.

For each snapshot, you have two options:

Calculate Grid Center Auto — Click this button and the platform computes the docking box center automatically based on the current protein structure. Work through each snapshot one by one.
Calculate Grid Center via Residue — Specify your known binding site residues by name to anchor the box. This is faster if you have a well-characterized pocket and ensures the box tracks the same region across all conformations.

After calculating the grid center for each snapshot, confirm the selection and verify the box is correctly centered on your pocket of interest before proceeding to the next. The protein structure will look different in each snapshot as it reflects a different point in the MD trajectory.

Repeat this process for every snapshot in your ensemble.
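To make the residue-based option more concrete, the sketch below shows one plausible way a box center could be derived from named residues: averaging the coordinates of their atoms in a PDB snapshot. This is an assumption for illustration, not a description of how the platform computes its grid center. The functions residue_centroid and atom_line are hypothetical, and the coordinates are toy data; only the fixed-column layout of PDB ATOM records is standard.

```python
def residue_centroid(pdb_text: str, residues: set[tuple[str, int]]) -> tuple[float, float, float]:
    """Mean position of all atoms in the given (chain ID, residue number) pairs."""
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        chain_id = line[21]         # PDB column 22
        res_seq = int(line[22:26])  # PDB columns 23-26
        if (chain_id, res_seq) in residues:
            # x, y, z occupy PDB columns 31-38, 39-46, 47-54
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError("no atoms matched the requested residues")
    n = len(coords)
    cx = sum(c[0] for c in coords) / n
    cy = sum(c[1] for c in coords) / n
    cz = sum(c[2] for c in coords) / n
    return (cx, cy, cz)


def atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a minimal fixed-column ATOM record for the toy example below."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Toy snapshot: two atoms in residue A:10 (the "pocket") and one distant atom.
snapshot = "\n".join([
    atom_line(1, "N", "LYS", "A", 10, 1.0, 2.0, 3.0),
    atom_line(2, "CA", "LYS", "A", 10, 3.0, 4.0, 5.0),
    atom_line(3, "CA", "GLY", "A", 50, 100.0, 100.0, 100.0),
])
center = residue_centroid(snapshot, {("A", 10)})
print(center)  # (2.0, 3.0, 4.0)
```

Running the same residue selection against each snapshot would give a different center per conformation, which mirrors why the residue-based option keeps the box on the same pocket as the protein moves.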

Add Your Compound Library

Navigate to Data Engineering within the ensemble docking interface, and drag-drop the CSV file containing your compound library.

Run the Pipeline

Click Run Pipeline. The docking algorithm will send every compound in your library through each protein snapshot — Trajectory 1, 2, 3, 4, 5, and so on — generating a complete matrix of docking data across all conformational states.

Understanding the Ensemble Docking Output

The output is a multi-dimensional dataset where each compound has a docking score against every protein snapshot. This data matrix enables several levels of analysis:

Consistent binders: Compounds that score well across most or all snapshots are likely robust, conformationally-insensitive binders. These are your highest-confidence leads for progression.
Conformation-selective binders: Compounds that score strongly against only specific snapshots may act as conformational selectors or allosteric modulators, stabilizing particular protein states. These can be highly valuable for targeted biology.
False positive filtration: A compound that scores well in one snapshot but poorly across the rest is likely a docking artifact of that specific geometry rather than a true binder. Ensemble docking dramatically reduces this class of false positives compared to single-structure docking.

Analyze your full results matrix in the RevAnalytics module.
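The categories above can be sketched as a simple rule over each compound's row of the score matrix. Everything specific here is an assumption made for illustration: the convention that lower (more negative) scores are better, the -8.0 hit threshold, the 80% cutoff for "most snapshots", the classify function, and the example scores. The exported data format and any built-in RevAnalytics classification may differ.

```python
def classify(scores: list[float], hit_threshold: float = -8.0,
             consistent_fraction: float = 0.8) -> str:
    """Label one compound from its per-snapshot docking scores.

    Assumption: lower scores are better, so a "hit" is score <= hit_threshold.
    """
    hits = sum(1 for s in scores if s <= hit_threshold)
    if hits == 0:
        return "non-binder"
    if hits / len(scores) >= consistent_fraction:
        return "consistent binder"
    if hits == 1:
        return "single-snapshot hit (possible artifact)"
    return "conformation-selective"


# Hypothetical score matrix: one row per compound, one column per snapshot.
score_matrix = {
    "cmpd_A": [-9.1, -8.7, -9.4, -8.9, -9.0],
    "cmpd_B": [-9.5, -6.0, -9.2, -5.8, -6.1],
    "cmpd_C": [-9.8, -5.1, -4.9, -5.5, -5.0],
    "cmpd_D": [-6.2, -5.9, -6.5, -6.0, -6.1],
}
labels = {name: classify(scores) for name, scores in score_matrix.items()}
for name, label in labels.items():
    print(name, label)
```

The boundary between a conformation-selective binder and a single-snapshot artifact is a judgment call; the rule used here (exactly one hit counts as a possible artifact) is deliberately crude and would normally be tuned to the ensemble size and the target.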

Next Steps

After ensemble docking, your prioritized compound list is ready for:

RevFEP — Alchemical free energy perturbation calculations for rigorously ranking your top candidates by binding affinity
RevMD-Bind — Full protein-ligand MD simulations to characterize binding stability and residence time for shortlisted leads
RevAnalytics — Deep-dive statistical analysis of your ensemble docking score matrix, interaction fingerprinting, and pose clustering
Static & Flexible Docking — Review the earlier stages of the docking workflow if you are revisiting this tutorial
