RFdiffusion2

Understanding RFdiffusion2

  1. What is a theozyme, and how does RFdiffusion2 use it differently than previous models?
    1. Theozyme ("theoretical enzyme"): a theoretical, atomically precise (neutral agent) site made up of side-chain atoms positioned to stabilize a transition state
    2. RFdiffusion2: requires only those atomic positions. Full residues and sequence indices are not needed, so neither rotamers nor indices have to be specified in advance
  2. Why is removing sequence index and rotamer specification significant in enzyme design?
    1. Removing sequence index and rotamer specifications (which RFdiffusion1 required to be set explicitly):
      1. unlocks a larger design space, since the motif is no longer pinned to fixed sequence positions
      2. speeds up generation, since rotamers no longer need to be enumerated
      3. avoids brute-force enumeration of index and rotamer combinations
      4. lets the model infer optimal placements automatically
    2. TL;DR: it removes hard-coded constraints, which expands the combinatorial space exponentially
      1. damn, so everything is becoming an infinite space; the supply of bioreactors needs to increase dramatically
        1. how much cheaper does it have to get?
      2. limits in the design space
  3. How does RFdiffusion2 use 'stochastic centering' to improve inference performance?
    1. adds random translations during training so the model learns to refine the motif position, not just memorize a fixed offset. This prevents overfitting to the input geometry
  4. What are the key benchmarking results comparing RFdiffusion and RFdiffusion2?
    1. RFdiffusion2 succeeded on 41/41 enzyme cases (open questions: at what atomic precision, and why not the full theoretical space of enzymes?)
      1. RFdiffusion1 solved only 16/41
    2. generated more novel folds and worked better on complex sites
  5. What does the ORI token control, and why is it useful for scaffold design?
    1. ORI token = the origin token (open question: is it randomly generated, like a seed or a salt in a cryptographic algorithm?)
    2. it lets you guide pocket orientation (e.g., expose one end of the ligand to solvent)
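
The combinatorial argument in question 2 can be made concrete with a rough count. The sketch below assumes a scaffold with `n_positions` sequence positions, a motif of `k` catalytic residues, and `rotamers_per_residue` candidate rotamers for each one. The function name and all three numbers are illustrative assumptions, not values from the RFdiffusion2 paper.

```python
from math import perm


def explicit_spec_count(n_positions: int, k: int, rotamers_per_residue: int) -> int:
    """Count the (index, rotamer) assignments an explicit enumeration would face.

    Ordered placement of k distinct catalytic residues on n_positions sequence
    positions, times an independent rotamer choice for each residue.
    """
    return perm(n_positions, k) * rotamers_per_residue ** k


# Illustrative numbers only (assumptions, not benchmark values).
for k in (2, 3, 4):
    print(k, explicit_spec_count(n_positions=150, k=k, rotamers_per_residue=10))
```

For small `k` the count grows roughly as `(n_positions * rotamers_per_residue) ** k`, which is why fixing indices and rotamers up front forces either a narrow search or brute-force enumeration.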

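Questions 3 and 5 both concern where the coordinate origin sits relative to the motif. The NumPy sketch below illustrates the idea as these notes describe it, not RFdiffusion2's actual code. `stochastic_center` and `place_ori` are hypothetical helper names, `sigma` and `distance` are made-up values in ångströms, and treating ORI as a 3D point that marks the desired scaffold center is an assumption.

```python
import numpy as np


def stochastic_center(coords: np.ndarray, rng: np.random.Generator,
                      sigma: float = 1.0) -> tuple:
    """Center coords (N, 3) on their centroid, then add one random translation.

    The same offset is applied to every atom, so internal geometry is unchanged;
    only the position relative to the origin varies between training examples.
    sigma is an illustrative value, not one from the paper.
    """
    centered = coords - coords.mean(axis=0)
    offset = rng.normal(scale=sigma, size=3)
    return centered + offset, offset


def place_ori(ligand_xyz: np.ndarray, exposed_idx: int,
              distance: float = 8.0) -> np.ndarray:
    """Pick an origin point on the buried side of the ligand (assumed ORI role).

    The point lies `distance` from the ligand centroid, along the direction
    pointing away from the atom that should stay solvent-exposed. The exposed
    atom must not coincide with the centroid, or the direction is undefined.
    """
    centroid = ligand_xyz.mean(axis=0)
    direction = centroid - ligand_xyz[exposed_idx]
    direction = direction / np.linalg.norm(direction)
    return centroid + distance * direction


rng = np.random.default_rng(0)
# Toy linear "ligand" with three atoms along the x axis.
ligand = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.0, 0.0]])
shifted, offset = stochastic_center(ligand, rng)
ori = place_ori(ligand, exposed_idx=0)
```

Here atom 0 is the end meant to face solvent, so `ori` lands past the opposite end of the ligand along the ligand axis, where the bulk of the scaffold would be centered under the assumption above.
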
Connection to Antivenom Pipeline

  1. How could RFdiffusion2 help scaffold a de novo binder for melittin if no natural structure exists?
  2. Where would RFdiffusion2 fit in your pipeline—before or after AlphaFold2/Chai-1, and why?

5/29: way past this; built end-to-end pipelines, now working on commercial implementations