The Problem You Are Trying to Solve
“I have a moderately active molecule, and I want to optimize it for higher potency, better selectivity, and improved developability.”
At this stage of hit-to-lead, the goal is no longer just to find binders, but to systematically improve a known chemical series. This is challenging because gains in one dimension (potency) often come at the expense of others (selectivity, solubility, safety, or synthetic feasibility). Traditional optimization relies on iterative medicinal chemistry cycles supported by assays and structural reasoning, which can be costly and time-consuming. With modern AI-assisted workflows, we can propose and evaluate many optimization hypotheses in parallel, but only if generation, scoring, and validation are tightly integrated and interpretable. This allows multi-parameter optimization of every designed molecule before it is synthesized and sent to the lab.
Solution
This workflow enables users to iteratively optimize a moderately active molecule using Revilico’s integrated generative, analytical, and structure-based engines, with Molecular Optimization as the core driver. At a high level, the workflow:
  • Generates improved analogs of the starting molecule
  • Scores and prioritizes them using fast, complementary signals
  • Validates top candidates with higher-fidelity methods
  • Repeats the loop as needed until clear lead candidates emerge
The emphasis is on a controlled optimization loop, rather than one-off generation.
What Data Do I Need to Provide?
Required
  • Starting molecule(s) as SMILES (moderately active compounds)
  • Target protein structure (experimental or predicted)
  • Clear optimization intent (e.g. “increase potency without increasing lipophilicity”)
Optional (highly recommended)
  • Known liabilities or constraints (avoid motifs, MW limits, polarity ranges)
  • Selectivity context (off-targets or related proteins)
  • Property priorities (potency vs PK vs safety tradeoffs)
  • Historical activity or ADMET data (for QSAR guidance)
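As a rough illustration, the required and optional inputs above could be collected in a single structure before a run. This is a minimal sketch only: `OptimizationRequest`, its field names, and the file name are hypothetical and are not part of any Revilico API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OptimizationRequest:
    """Hypothetical container for the inputs listed above (not a Revilico API)."""

    # Required inputs
    start_smiles: list[str]   # moderately active starting molecule(s)
    target_structure: str     # path or ID of an experimental or predicted structure
    intent: str               # plain-language optimization goal

    # Optional, highly recommended inputs
    avoid_motifs: list[str] = field(default_factory=list)             # motifs to avoid
    max_mol_weight: Optional[float] = None                            # MW limit in Da
    off_targets: list[str] = field(default_factory=list)              # selectivity context
    property_weights: dict[str, float] = field(default_factory=dict)  # tradeoff priorities
    historical_data: Optional[str] = None                             # activity/ADMET table

    def missing_required(self) -> list[str]:
        """Return the names of required fields that are empty."""
        missing = []
        if not self.start_smiles:
            missing.append("start_smiles")
        if not self.target_structure.strip():
            missing.append("target_structure")
        if not self.intent.strip():
            missing.append("intent")
        return missing


request = OptimizationRequest(
    start_smiles=["CC(=O)Nc1ccc(O)cc1"],  # placeholder SMILES (paracetamol), illustration only
    target_structure="target.pdb",        # hypothetical file name
    intent="increase potency without increasing lipophilicity",
    max_mol_weight=500.0,
)
print(request.missing_required())
```

Checking for missing required fields up front is a small design choice that keeps an incomplete request from reaching the more expensive steps.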
Workflow
  1. Establish an Optimization Baseline
Before generating new molecules, establish what “better” means. On the Revilico Operating System, users typically:
  • Review Static Docking, Flexible Docking, or Ensemble Docking results (and/or experimental data) for the starting molecule to understand binding modes and pose stability
  • Use Pharmacophore Analysis and QSAR Modeling to identify key interactions, liabilities, and regions of the molecule that drive activity or risk
  • Decide whether optimization should be conservative (close analogs via Molecular Optimization with tight similarity constraints) or more exploratory (relaxed similarity, scaffold or substituent changes)
This baseline informs how aggressively the Molecular Optimization Engine should push the chemistry and which constraints or objectives should guide the optimization loop. Usually, the docking and activity engines establish a baseline structure-activity relationship (SAR) that can be leveraged during downstream compound generation.
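The difference between conservative and exploratory optimization can be pictured as a similarity threshold relative to the starting molecule. The sketch below uses Tanimoto similarity on sets of fingerprint bits; the bit sets and thresholds are made-up stand-ins, not real fingerprints or Revilico defaults.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical on-bit sets standing in for real molecular fingerprints.
parent = {1, 4, 7, 9, 12, 15}
close_analog = {1, 4, 7, 9, 12, 18}    # one feature swapped
scaffold_hop = {1, 4, 21, 22, 23, 24}  # most features changed

# Illustrative minimum-similarity thresholds (assumptions, not platform defaults).
MODES = {"conservative": 0.6, "exploratory": 0.15}


def allowed(candidate: set[int], mode: str) -> bool:
    """True if the candidate is similar enough to the parent for this mode."""
    return tanimoto(parent, candidate) >= MODES[mode]


for name, bits in [("close_analog", close_analog), ("scaffold_hop", scaffold_hop)]:
    print(name, round(tanimoto(parent, bits), 3),
          allowed(bits, "conservative"), allowed(bits, "exploratory"))
```

In this toy case the close analog passes both modes, while the scaffold hop is only admitted once the constraint is relaxed.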
  2. Generate Optimized Analogs
This is the core step of the workflow. Using Revilico’s Molecular Optimization Engine, users input their moderately active molecule(s) and define optimization goals. The engine then generates new “siblings” of the starting compound, iteratively improving them under a multi-objective scoring framework. This can be done through a ‘global iteration’, which allows the engine to modify any portion of the compound, or through the more conservative approach of scaffold decoration, which preserves the core motifs driving activity while iterating on the side chains or R-groups. Typical optimization goals include:
  • Improving predicted binding or docking performance
  • Staying within desirable physicochemical ranges
  • Penalizing known liabilities or unstable motifs
  • Maintaining similarity to the active series (or relaxing it, if needed)
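One common way to combine goals like these into a single number is a weighted geometric mean of per-objective desirabilities. The sketch below is an illustration under assumptions: the property names, ranges, and weights are hypothetical, and it is not a description of how the Molecular Optimization Engine scores molecules.

```python
import math


def ramp(value: float, low: float, high: float) -> float:
    """Map value linearly onto [0, 1]: 0 at or below `low`, 1 at or above `high`."""
    if high == low:
        raise ValueError("low and high must differ")
    return min(1.0, max(0.0, (value - low) / (high - low)))


def in_range(value: float, low: float, high: float, tol: float) -> float:
    """1 inside [low, high], decaying linearly to 0 at `tol` outside the range."""
    if low <= value <= high:
        return 1.0
    distance = low - value if value < low else value - high
    return max(0.0, 1.0 - distance / tol)


def score(mol: dict, weights: dict[str, float]) -> float:
    """Weighted geometric mean of per-objective desirabilities in [0, 1]."""
    parts = {
        # Docking scores are more favorable when more negative, so negate first.
        "binding": ramp(-mol["docking_score"], 6.0, 10.0),
        "logp": in_range(mol["logp"], 1.0, 3.5, tol=1.5),
        # Each flagged motif halves this term (a penalty, not a hard filter).
        "liability": 0.5 ** mol["n_flagged_motifs"],
        "similarity": ramp(mol["similarity"], 0.3, 0.6),
    }
    log_sum = 0.0
    for name, w in weights.items():
        if parts[name] == 0.0:
            return 0.0  # a zero in any weighted objective vetoes the molecule
        log_sum += w * math.log(parts[name])
    return math.exp(log_sum / sum(weights.values()))


weights = {"binding": 1.0, "logp": 1.0, "liability": 1.0, "similarity": 1.0}
mol_a = {"docking_score": -9.0, "logp": 2.8, "n_flagged_motifs": 0, "similarity": 0.7}
mol_b = {"docking_score": -9.5, "logp": 4.4, "n_flagged_motifs": 1, "similarity": 0.5}
print(round(score(mol_a, weights), 3), round(score(mol_b, weights), 3))
```

Here `mol_b` docks better than `mol_a` but scores lower overall, because its lipophilicity, flagged motif, and lower similarity all pull the geometric mean down. That is the kind of tradeoff a multi-objective score is meant to expose.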
The engine works by repeatedly:
  1. Generating candidate molecules
  2. Scoring them against defined objective functions (the engines that predict each property)
  3. Updating the generator model through reinforcement learning to favor chemistry that the scoring functions rate more highly
This balances exploitation (refining what works) with exploration (avoiding local SAR traps). The output is a ranked set of optimized molecular hypotheses, with scoring summaries explaining why each molecule was favored.
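The generate, score, update cycle can be sketched with a toy problem. The example below is a deliberately simplified stand-in for reinforcement learning (closer to a cross-entropy-style update), not the engine's actual algorithm: the "generator" is one probability table per R-group position on a fixed scaffold, and the substituent list and `TOY_VALUE` scores are invented for illustration.

```python
import random

R_GROUPS = ["H", "F", "Cl", "OMe", "CF3", "NH2"]  # hypothetical substituent choices
POSITIONS = 2                                      # two decoration sites on a fixed scaffold

# Hypothetical per-substituent contributions standing in for real scoring engines.
TOY_VALUE = {"H": 0.0, "F": 0.6, "Cl": 0.4, "OMe": 0.3, "CF3": 0.2, "NH2": 0.8}


def toy_score(mol: tuple[str, ...]) -> float:
    """Additive toy objective: sum of the substituent contributions."""
    return sum(TOY_VALUE[g] for g in mol)


def optimize(rounds=30, batch=40, top_frac=0.25, lr=0.3, explore=0.05, seed=0):
    rng = random.Random(seed)
    n = len(R_GROUPS)
    # The "generator": one categorical distribution per position, initially uniform.
    probs = [[1.0 / n] * n for _ in range(POSITIONS)]
    for _ in range(rounds):
        # 1. Generate candidates by sampling a substituent at each position.
        mols = [tuple(rng.choices(R_GROUPS, weights=p)[0] for p in probs)
                for _ in range(batch)]
        # 2. Score them and keep the best-scoring fraction.
        elite = sorted(mols, key=toy_score, reverse=True)[: max(1, int(batch * top_frac))]
        # 3. Update the generator toward what the elite set contains (exploitation),
        #    then mix in a little uniform probability (exploration).
        for pos in range(POSITIONS):
            for i, g in enumerate(R_GROUPS):
                freq = sum(m[pos] == g for m in elite) / len(elite)
                probs[pos][i] = (1 - lr) * probs[pos][i] + lr * freq
            probs[pos] = [(1 - explore) * p + explore / n for p in probs[pos]]
    # Report the most probable substituent at each position.
    best = tuple(R_GROUPS[max(range(n), key=p.__getitem__)] for p in probs)
    return best, probs


best, probs = optimize()
print(best, round(toy_score(best), 2))
```

The `lr` parameter controls how strongly each round exploits the current best candidates, while `explore` keeps every substituent reachable so the generator does not collapse onto an early local optimum.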
  3. Rapid Structural and Statistical Triage
Once a new batch of optimized analogs is generated, users typically apply fast screening layers to prioritize which molecules merit deeper analysis in other Revilico engines, depending on which properties are being optimized (activity, in this case).
Docking
  • Used to evaluate binding modes and relative affinity trends across static, flexible, and ensemble conditions
  • Helps eliminate obvious false positives
  • Provides structural intuition for SAR decisions
QSAR Modeling
  • Uses data-driven patterns to predict activity, selectivity, or developability signals
  • Scales well across larger libraries
  • Complements docking by capturing non-structural trends
At this stage, the goal is down-selection, not final validation.
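One simple way to combine two complementary signals for down-selection is to rank molecules under each signal and average the ranks. The sketch below assumes hypothetical docking scores (lower is better) and hypothetical QSAR-predicted pIC50 values (higher is better). It illustrates consensus ranking in general, not the platform's prioritization logic.

```python
def ranks(values: dict[str, float], higher_is_better: bool) -> dict[str, int]:
    """Return 1-based ranks (1 = best); ties are broken by input order."""
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    return {name: i + 1 for i, name in enumerate(ordered)}


# Hypothetical triage data for four analogs.
docking = {"A": -9.1, "B": -8.0, "C": -8.4, "D": -9.6}  # docking score, lower is better
qsar = {"A": 7.9, "B": 7.8, "C": 6.0, "D": 6.5}         # predicted pIC50, higher is better

dock_rank = ranks(docking, higher_is_better=False)
qsar_rank = ranks(qsar, higher_is_better=True)

# Average the two ranks and keep the two best molecules for deeper analysis.
consensus = {m: (dock_rank[m] + qsar_rank[m]) / 2 for m in docking}
shortlist = sorted(consensus, key=consensus.get)[:2]
print(shortlist)  # ['A', 'D']
```

Analog D has the best docking score but only a middling QSAR prediction, while A is strong on both and tops the consensus. Rank averaging is one of several possible aggregation schemes; it is shown here because it needs no score normalization.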
  4. Validate Top Candidates with Dynamics and Energetics (Optional)
For the most promising candidates, users can integrate more expensive but higher-confidence methods.
Protein-Ligand Molecular Dynamics
  • Tests binding stability over biologically relevant time scales
  • Reveals water effects, flexibility, and pose robustness
  • Helps eliminate unstable or over-fit docking poses
  • This engine also supports snapshot-based end-point free energy calculations (MM/PBSA and MM/GBSA) for more accurate energy estimates.
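Pose robustness is often summarized by how far the ligand drifts from its starting pose over a trajectory. The sketch below computes ligand RMSD per frame against the first frame and reports the fraction of frames within a cutoff. It assumes the frames are already aligned on the protein, uses made-up coordinates, and treats the 2.0 Å cutoff as an illustrative convention rather than a platform setting.

```python
import math

Frame = list[tuple[float, float, float]]  # ligand atom coordinates in Å


def rmsd(ref: Frame, frame: Frame) -> float:
    """Root-mean-square deviation between two frames with matching atom order."""
    if len(ref) != len(frame) or not ref:
        raise ValueError("frames must be non-empty and have the same number of atoms")
    sq = sum((a - b) ** 2 for p, q in zip(ref, frame) for a, b in zip(p, q))
    return math.sqrt(sq / len(ref))


def stable_fraction(trajectory: list[Frame], cutoff: float = 2.0) -> float:
    """Fraction of frames (after the first) whose RMSD to the first frame is <= cutoff."""
    if len(trajectory) < 2:
        raise ValueError("need at least two frames")
    ref = trajectory[0]
    within = sum(rmsd(ref, f) <= cutoff for f in trajectory[1:])
    return within / (len(trajectory) - 1)


def shifted(frame: Frame, dx: float) -> Frame:
    """Helper for the toy data: translate every atom by dx along x."""
    return [(x + dx, y, z) for x, y, z in frame]


# Toy two-atom "ligand": two frames stay close, one drifts 3 Å away.
start = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
trajectory = [start, shifted(start, 1.0), shifted(start, 1.0), shifted(start, 3.0)]
print(round(stable_fraction(trajectory), 2))
```

A pose that stays within the cutoff for most of the simulation is more likely to reflect a genuine binding mode than an over-fit docking artifact.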
Free Energy Perturbation (RBFE / ABFE)
  • Provides quantitative ranking within a focused chemical series
  • Particularly useful when choosing which compounds to synthesize next
  • Often used as a final filter before experimental commitment
  • Can break down the binding energy into its individual contributors, giving a more detailed picture of ligand-protein engagement.
These steps are optional but powerful when optimization decisions become costly and the chemical space under consideration narrows.
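To make relative binding free energies easier to interpret when choosing what to synthesize, a ΔΔG value can be converted into the implied fold change in dissociation constant using ΔΔG = RT ln(Kd,analog / Kd,reference). The sketch below applies that standard relation; the analog names and ΔΔG values are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_change_in_kd(ddg_kcal: float, temperature_k: float = 298.15) -> float:
    """Kd(analog) / Kd(reference) implied by ddG = dG(analog) - dG(reference)."""
    return math.exp(ddg_kcal / (R_KCAL * temperature_k))


# Hypothetical RBFE results relative to the starting molecule, in kcal/mol.
ddg = {"analog_1": -1.364, "analog_2": -0.4, "analog_3": 0.8}

# More negative ddG means tighter predicted binding, so sort ascending.
for name in sorted(ddg, key=ddg.get):
    ratio = fold_change_in_kd(ddg[name])
    print(f"{name}: ddG = {ddg[name]:+.2f} kcal/mol, Kd ratio = {ratio:.2f}")
```

At room temperature, roughly 1.36 kcal/mol corresponds to a tenfold change in Kd, so differences much smaller than that are hard to distinguish from the uncertainty of the calculation.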
  5. Iterate the Optimization Loop
Based on what you learn:
  • Refine scoring objectives
  • Adjust similarity constraints
  • Introduce new penalties or priorities
  • Re-run Molecular Optimization with updated guidance
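The adjustments above can be thought of as a small set of rules applied between rounds. The sketch below is purely illustrative: `LoopConfig`, its fields, and the heuristics (relax similarity on a plateau, raise the liability penalty when many candidates are flagged) are assumptions chosen for the example, not Revilico settings.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LoopConfig:
    """Hypothetical knobs carried from one optimization round to the next."""
    min_similarity: float = 0.6    # similarity constraint to the active series
    liability_weight: float = 1.0  # strength of the liability penalty


def revise(config: LoopConfig, best_scores: list[float],
           flagged_fraction: float, min_gain: float = 0.02) -> LoopConfig:
    """Return an updated config for the next round (illustrative heuristics only)."""
    updated = config
    # Plateau: the best score barely moved, so allow more exploratory chemistry.
    if len(best_scores) >= 2 and best_scores[-1] - best_scores[-2] < min_gain:
        relaxed = max(0.2, round(updated.min_similarity - 0.1, 2))
        updated = replace(updated, min_similarity=relaxed)
    # Many flagged candidates: penalize liabilities more strongly.
    if flagged_fraction > 0.3:
        updated = replace(updated, liability_weight=updated.liability_weight * 2)
    return updated


config = LoopConfig()
config = revise(config, best_scores=[0.70, 0.71], flagged_fraction=0.4)
print(config)
```

Keeping the configuration immutable and returning a new one each round makes it easy to log which guidance produced which batch of molecules.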
Most hit-to-lead campaigns cycle through this loop multiple times, progressively narrowing toward a small set of strong lead candidates. This helps save the time and money that would otherwise be spent synthesizing compounds that eventually fail.
Results
  • Iteratively improved compound series
  • Clear rationale for why each optimization step was taken
  • Reduced uncertainty before synthesis or experimental testing
  • A small, prioritized set of lead-like candidates ready for the next stage
When improvements are consistently supported across generation, docking, QSAR, and (optionally) physics-based validation, users can proceed with much higher confidence.
Now what?
I’ve identified optimized candidates, but what’s next?
  • After identifying these candidates and optimizing their activity and other properties using generative chemistry, you can move forward with Retrosynthesis to design and explore synthetic routes.
  • If you’d like to re-analyze the compound set for other key properties that were also flagged for optimization, the rest of the Revilico Operating System suite can be used for this as well.
Why Revilico?
Revilico enables a closed-loop optimization workflow where molecule generation, scoring, and validation are tightly integrated. Rather than relying on a single signal, users can hedge decisions across generative chemistry, structure-based modeling, and data-driven analytics, allowing optimization to move faster without sacrificing scientific control or interpretability.