
The Problem You Are Trying to Solve

“I want to run a multiplex screen to understand how large compound libraries interact with large protein libraries, but the full experimental matrix is too expensive and time-intensive.”
Compound–Protein Interaction Discovery at Scale
A true compound × protein screen can explode into millions of combinations, and even “lightweight” biochemical assays become impractical. Teams often need a way to:
  • discover likely binders/interactors early
  • triage the search space
  • identify off-target risk and polypharmacology signals
  • focus experimental validation on the smallest, highest-value subset
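
To make the scaling concrete, here is a small worked example. The library sizes, per-assay cost, and triage fraction are hypothetical numbers chosen only for illustration; they are not Revilico figures.

```python
# Hypothetical library sizes and per-assay cost, chosen only to show the scaling.
n_compounds = 10_000
n_proteins = 500
cost_per_assay_usd = 2.0  # assumed figure, not from any real quote

# The full experimental matrix is the product of the two library sizes.
n_pairs = n_compounds * n_proteins
full_matrix_cost = n_pairs * cost_per_assay_usd

# Suppose computational triage keeps only the top 0.1% of pairs for the bench.
keep_fraction = 0.001
n_validated = round(n_pairs * keep_fraction)

print(f"pairs in full matrix: {n_pairs:,}")
print(f"full-matrix cost: ${full_matrix_cost:,.0f}")
print(f"pairs sent to the bench after triage: {n_validated:,}")
```

Even a modest compound library against a mid-sized protein panel reaches millions of pairs, which is why the workflow starts with a fast computational pass.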

Solution

This workflow uses Revilico’s binding chemistry engines to simulate compound–protein interaction likelihood at scale, then progressively increases rigor to produce a high-confidence list of protein interactors for any given compound (or compound interactors for any given protein).
The primary refinement chain is: Virtual Screening / Docking → Pose & interaction QC → (Optional) MD stability → (Optional) FEP confirmation → Ranked interactor list.
You can run this workflow in any of three modes:
  • Compound-centric: “What proteins does this compound likely bind?”
  • Target-centric: “Which compounds bind this protein?”
  • Matrix mode: “Score a compound library against a protein panel”

What Data Do I Need to Provide?

Required
  • Compound library as SMILES (CSV)
  • Protein library as structures (PDB) or sequences (if you need structure generation first)
Recommended
  • Known binding pockets or reference ligands (if available)
  • Any existing experimental binding/activity data (even small) for calibration
Optional
  • Desired selectivity constraints (e.g., avoid kinases, avoid hERG panel proteins)
  • A protein “tox/off-target panel” list (for safety screening use cases)

Workflow

  1. Assemble and Standardize Your Screening Inputs
Before screening, ensure both libraries are in usable formats. Users typically:
  • Upload compounds as a SMILES CSV
  • Upload proteins as PDB files (one or many)
  • Use platform utilities to convert/merge files as needed
This gives a clean compound library + a clean protein panel ready for screening.
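
A pre-upload sanity check on the compound CSV might look like the sketch below. This is not a platform utility: the column names `id` and `smiles` and the helper `load_compounds` are assumptions for illustration. Duplicates are detected by exact string match only, which does not catch two different SMILES strings for the same molecule.

```python
import csv
import io

def load_compounds(csv_text, id_col="id", smiles_col="smiles"):
    """Return (id, smiles) pairs with incomplete rows and duplicate SMILES removed."""
    seen = set()
    compounds = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        smiles = (row.get(smiles_col) or "").strip()
        cid = (row.get(id_col) or "").strip()
        if not smiles or not cid:
            continue  # skip rows missing an ID or a SMILES string
        if smiles in seen:
            continue  # exact-string duplicate; no chemical canonicalization here
        seen.add(smiles)
        compounds.append((cid, smiles))
    return compounds

# Hypothetical input with a padded value, a duplicate, and an empty SMILES.
example_csv = """id,smiles
cpd1,CCO
cpd2, c1ccccc1
cpd3,CCO
cpd4,
"""
compounds = load_compounds(example_csv)
print(compounds)
```

Chemistry-aware canonicalization (for example with a cheminformatics toolkit) would be the next step if the library comes from several sources.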
  2. Establish the Screening Strategy
At scale, you want a fast first pass with clear constraints. Users typically decide:
  • What counts as an “interactor” (score threshold, pose confidence, binding mode plausibility)
  • Whether to screen one compound vs many proteins, or many compounds vs a protein panel
  • Whether known binding sites exist, or whether docking should search broader pockets using blind docking approaches
Primary engines:
  • Virtual Screening Engine (static, flexible, and ensemble docking)
  • Boltz-Cofolding
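This provides a screening plan that prioritizes throughput and broad recall: the first pass should catch as many plausible candidates as possible, accepting some false positives that later steps remove.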
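
One way to pin down what counts as an "interactor" before running anything is to write the criteria down as a small, explicit configuration. The sketch below is illustrative only: the class `InteractorCriteria`, its field names, and the threshold values are hypothetical, and it assumes the common docking convention that a lower (more negative) score means stronger predicted binding.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InteractorCriteria:
    """Hypothetical first-pass criteria; all thresholds are placeholders."""
    max_score: float = -7.0           # assumed convention: lower score is better
    min_pose_confidence: float = 0.5  # assumed 0..1 confidence scale
    blind_docking: bool = False       # True when no binding site is known

def is_interactor(score, pose_confidence, criteria):
    """A pair qualifies only if it clears both the score and confidence cutoffs."""
    return (score <= criteria.max_score
            and pose_confidence >= criteria.min_pose_confidence)

criteria = InteractorCriteria()
print(is_interactor(-8.2, 0.7, criteria))  # clears both cutoffs
print(is_interactor(-6.1, 0.9, criteria))  # score too weak
```

Keeping the thresholds loose at this stage fits the recall-first goal; tighter cutoffs belong in the refinement steps.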
  3. Run High-Throughput Interaction Prediction
This is the engine-driven multiplex substitute for wet-lab screening. Users run:
  • Virtual Screening or Boltz Co-Folding (fast, high-throughput) across the protein panel and compound set
  • If screening a very large compound library, start with Static Docking at scale, which offers GPU-optimized docking speed, and then down-select the top candidates of interest.
This produces a large interaction score matrix (compound–protein pairs) with top-ranked candidates.
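
The score matrix can be read in either direction. The sketch below uses made-up scores and hypothetical helper names (`top_targets`, `top_compounds`), and assumes lower scores indicate stronger predicted binding; it does not reflect the platform's actual output format.

```python
# Toy score matrix keyed by (compound, protein). Values are invented.
scores = {
    ("cpd1", "protA"): -9.1,
    ("cpd1", "protB"): -5.4,
    ("cpd2", "protA"): -7.8,
    ("cpd2", "protB"): -8.3,
    ("cpd2", "protC"): -6.0,
}

def top_targets(scores, compound, k=2):
    """Compound-centric view: the k best-scoring proteins for one compound."""
    hits = [(prot, s) for (cpd, prot), s in scores.items() if cpd == compound]
    return sorted(hits, key=lambda hit: hit[1])[:k]

def top_compounds(scores, protein, k=2):
    """Target-centric view: the k best-scoring compounds for one protein."""
    hits = [(cpd, s) for (cpd, prot), s in scores.items() if prot == protein]
    return sorted(hits, key=lambda hit: hit[1])[:k]

print(top_targets(scores, "cpd2"))
print(top_compounds(scores, "protA"))
```

The same matrix serves compound-centric, target-centric, and matrix-mode questions; only the axis you slice along changes.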
  4. Quality Control and Reduce False Positives
Docking at scale is intentionally fast, so the next step is to remove obvious artifacts. Users typically:
  • Filter out strained ligand poses or implausible geometries by looking at intramolecular energies or 3D conformational visualizations
  • Require agreement across multiple poses (not a single outlier)
  • Prioritize conserved interactions across related proteins (if relevant)
Primary engines and utilities:
  • Flexible Docking (re-score a smaller batch, allow key residues to move)
  • Ensemble Docking (if pocket flexibility is a concern)
  • Revilico Interpreter / RevilicoGPT (automated interpretation of pose-quality trends, relating the computed results to expected experimental outcomes)
This results in a refined list of likely interactors with improved precision.
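
The first two filters above can be sketched as a simple rule. This is a simplified stand-in, not the platform's QC logic: it treats "agreement" as several low-strain poses scoring within a window of the best pose, whereas a fuller check would also compare pose geometry. The function name, thresholds, and units are assumptions.

```python
def passes_pose_qc(poses, score_window=1.0, min_agreeing=2, max_strain=10.0):
    """poses: list of (score, strain_energy) tuples; lower score is better.

    Keep a compound-protein pair only if at least `min_agreeing` low-strain
    poses score within `score_window` of the best low-strain pose.
    """
    relaxed = [(s, e) for s, e in poses if e <= max_strain]
    if not relaxed:
        return False  # every pose is too strained to trust
    best = min(s for s, _ in relaxed)
    agreeing = sum(1 for s, _ in relaxed if s - best <= score_window)
    return agreeing >= min_agreeing

consistent = [(-8.5, 3.0), (-8.1, 4.2), (-7.9, 5.0)]  # several similar poses
outlier = [(-9.5, 2.0), (-6.0, 3.0), (-5.8, 2.5)]     # one pose far ahead
strained = [(-9.0, 25.0), (-8.8, 30.0)]               # good scores, bad geometry

for name, poses in [("consistent", consistent), ("outlier", outlier), ("strained", strained)]:
    print(name, passes_pose_qc(poses))
```

Note that the "outlier" case has the best single score yet is rejected, which is the point of requiring agreement across poses.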
  5. Confirm Stability in a Physical Environment (Optional)
When you need higher confidence (or when docking is ambiguous), validate whether binding is stable over time. Users run Protein Ligand MD on the most important compound–protein pairs. What you’re looking for:
  • Stable binding mode (no immediate dissociation / unrealistic drift) with meaningful breakdowns of energetic contributions
  • Reasonable RMSD/RMSF behavior near the binding site
  • Consistent key contacts over the trajectory
This will give a stability-validated subset of compound–protein interactions.
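
As an illustration of the RMSD criterion, the sketch below flags a ligand whose heavy atoms drift too far from the starting pose. It assumes the frames are already aligned on the protein, uses the first frame as the reference, and takes a 2.0 Å cutoff purely as a placeholder; the function names are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    sq = sum((a - b) ** 2
             for pa, pb in zip(coords_a, coords_b)
             for a, b in zip(pa, pb))
    return math.sqrt(sq / len(coords_a))

def is_stable(frames, max_rmsd=2.0):
    """True if every frame stays within max_rmsd (Å) of the first frame."""
    ref = frames[0]
    return all(rmsd(ref, frame) <= max_rmsd for frame in frames[1:])

# Toy two-atom ligand: one trajectory wobbles slightly, the other drifts away.
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
wobble = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]
drifted = [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0)]

print(is_stable([start, wobble]))
print(is_stable([start, wobble, drifted]))
```

In practice a trajectory analysis library would handle alignment and atom selection; the stability rule itself stays this simple.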
  6. Quantify Binding Strength for Final Confirmation (Optional)
For the smallest shortlist, where you want thermodynamic confidence, users run:
  • MMPBSA/MMGBSA binding energy scoring analysis across longer time scales
  • ABFE to estimate absolute binding favorability for a compound–protein pair
  • RBFE to compare close analogs (e.g., when ranking within a compound series)
This produces a “gold” shortlist of interactions supported by physics-based free energy estimates, giving you more confidence when down-selecting molecules for laboratory analysis.
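
To interpret free energy outputs, it helps to convert them into more familiar quantities using the standard relation ΔG° = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The sketch below is a generic thermodynamics calculation, not a Revilico API; the input energies are invented examples.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def kd_from_dg(dg_kcal_per_mol, temperature_k=298.15):
    """Dissociation constant (M) implied by an absolute binding free energy."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature_k))

def fold_change_from_ddg(ddg_kcal_per_mol, temperature_k=298.15):
    """Affinity gain implied by a relative free energy (negative = tighter)."""
    return math.exp(-ddg_kcal_per_mol / (R_KCAL * temperature_k))

# Hypothetical ABFE result of -9.0 kcal/mol for one compound-protein pair.
kd = kd_from_dg(-9.0)
print(f"Kd ~ {kd * 1e9:.0f} nM")

# Hypothetical RBFE result: an analog binds 1.364 kcal/mol more favorably.
fold = fold_change_from_ddg(-1.364)
print(f"affinity gain ~ {fold:.1f}x")
```

A useful rule of thumb that follows from this relation: at room temperature, each 1.36 kcal/mol of binding free energy corresponds to roughly a tenfold change in affinity.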

Results

  • A ranked, high-confidence list of protein interactors for any given compound (or vice versa), including:
    • predicted binding poses
    • docking and re-scoring metrics
    • optional MD stability evidence
    • optional ABFE/RBFE thermodynamic support
This list lets a computational triage pipeline stand in for a brute-force biochemical multiplex screen, reducing the experimental burden to only the most valuable validations.

Integration with Other Engines (Optional)

Depending on your downstream goal, this workflow can connect naturally to:
  • QSAR Modeling (learn patterns from predicted/experimental activity)
  • Pharmacophore Analysis (motif extraction for cross-target similarity)
  • Generative Chemistry (design/selectivity tuning based on interaction profile)
  • ADMET-AI / Toxicity panels (early developability triage)
  • Quantum Chemistry (electronic property or reactivity checks for select pairs)

Why Revilico?

Revilico supports multiplex interaction discovery because it combines:
  • scale-first screening (Virtual Screening + Docking)
  • precision refinement (Flexible/Ensemble Docking, MD)
  • high-confidence confirmation (FEP)
  • a unified interface for results inspection, interpretation, and iteration
  • the ability to analyze all of these factors across a wide variety of multiplexed ligand–protein interaction pairs at scale through simulation
So instead of experimentally testing every combination, you computationally narrow the matrix to the few pairs that actually deserve bench time.