Understanding RFdiffusion2
- What is a theozyme, and how does RFdiffusion2 use it differently than previous models?
- Theozyme: a theoretical, atomically precise (neutral agent) active site made up of side-chain atoms positioned to stabilize a transition state
- RFdiffusion2: only requires those atomic positions; there is no need for full residues or sequence indices, so no rotamer or index pre-specification is needed
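A minimal sketch of what "atoms only, no indices" could look like as a data structure. This is not RFdiffusion2's actual input format: `MotifAtom`, `is_unindexed`, and the coordinates are all hypothetical, and keeping a residue type next to each atom is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MotifAtom:
    """One theozyme atom (hypothetical structure, for illustration only)."""
    element: str                          # e.g. "N", "O"
    xyz: Tuple[float, float, float]       # position in angstroms
    residue_type: Optional[str] = None    # assumed known from the theozyme, e.g. "HIS"
    sequence_index: Optional[int] = None  # left unset: the model is expected to choose it


def is_unindexed(motif: List[MotifAtom]) -> bool:
    """True when no atom has been pinned to a sequence position."""
    return all(atom.sequence_index is None for atom in motif)


# Made-up coordinates for a three-atom catalytic site.
theozyme = [
    MotifAtom("N", (1.2, 0.4, -0.8), "HIS"),
    MotifAtom("O", (3.1, -1.0, 0.5), "ASP"),
    MotifAtom("O", (-0.6, 2.2, 1.9), "SER"),
]
```

An RFdiffusion1-style specification would instead need `sequence_index` filled in (and full backbone coordinates) for every catalytic residue before generation starts.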
- Why is removing sequence index and rotamer specification significant in enzyme design?
- Removing sequence index and rotamer specifications (which had to be fixed up front in RFdiffusion1):
- unlocks a larger design space by removing sequence index constraints
- speeds generation, since there is no brute-force rotamer enumeration
- lets the model infer optimal placements automatically
- TL;DR: it removes hard-coded constraints, which expands the combinatorial space exponentially
- damn, so everything is becoming infinite space; the supply of bioreactors needs to increase dramatically
- how much cheaper does it have to get?
- limits in the design space
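A rough count of how many explicit specifications brute-force enumeration would have to cover, to make the "combinatorial space" point concrete. The scaffold length, number of catalytic residues, and rotamers per residue are made-up illustrative numbers, and `explicit_specifications` is a hypothetical helper, not part of any RFdiffusion code.

```python
import math


def explicit_specifications(scaffold_length: int,
                            n_catalytic: int,
                            rotamers_per_residue: int) -> int:
    """Count (index assignment, rotamer choice) combinations.

    Assumes each catalytic residue takes a distinct sequence position
    and picks its rotamer independently of the others.
    """
    # Ordered placement of n_catalytic distinct residues among the positions.
    index_assignments = math.perm(scaffold_length, n_catalytic)
    # One rotamer choice per catalytic residue.
    rotamer_combinations = rotamers_per_residue ** n_catalytic
    return index_assignments * rotamer_combinations


# Hypothetical case: 150-residue scaffold, 3 catalytic residues, 10 rotamers each.
print(explicit_specifications(150, 3, 10))  # 3307800000
```

Every extra catalytic residue multiplies the count again, which is why letting the model pick indices and rotamers matters more for complex sites.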
- How does RFdiffusion2 use 'stochastic centering' to improve inference performance?
- adds random translations during training so the model learns to refine the motif position, not just memorize an offset. This prevents overfitting to the input geometry
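A minimal sketch of the idea as described in the note above: center each training example on its motif, then apply one random rigid translation. This is not the actual RFdiffusion2 training code; the Gaussian noise model, the `sigma` value, and the function names are assumptions.

```python
import random


def centroid(points):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def stochastic_center(coords, motif_idx, sigma=1.0, rng=random):
    """Center coords on the motif centroid, then add a random global shift.

    coords    : list of (x, y, z) tuples for the whole example
    motif_idx : indices of the atoms that belong to the motif
    sigma     : std. dev. of the random translation in angstroms (assumed value)
    """
    c = centroid([coords[i] for i in motif_idx])
    # One shift for the whole structure, so internal geometry is unchanged.
    shift = tuple(rng.gauss(0.0, sigma) for _ in range(3))
    return [tuple(p[i] - c[i] + shift[i] for i in range(3)) for p in coords]


coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
augmented = stochastic_center(coords, motif_idx=[0, 1], sigma=1.0)
```

Because the same shift is applied to every atom, distances within the structure are preserved; only the motif's position relative to the coordinate origin changes between training examples.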
- What are the key benchmarking results comparing RFdiffusion and RFdiffusion2?
- RFdiffusion2 solved 41/41 enzyme cases (at what atomic precision, and why not the full theoretical space of enzymes?)
- RFdiffusion1 solved only 16/41
- RFdiffusion2 also generated more novel folds and worked better on complex sites
- What does the ORI token control, and why is it useful for scaffold design?
- ORI token = the origin token (is it randomly generated, like a seed or a salt in an encryption algorithm?)
- it lets you guide pocket orientation (expose one end of the ligand to solvent)
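A small geometric sketch of how an origin point could be chosen to leave one end of a ligand solvent-exposed. It assumes ORI is a 3D point marking where the bulk of the scaffold should sit relative to the motif, which is my reading of these notes and not a confirmed detail; `place_ori` and the 8 Å offset are hypothetical.

```python
def place_ori(ligand_centroid, exposed_atom, offset=8.0):
    """Return a point `offset` angstroms from the ligand centroid,
    on the side opposite the atom that should stay solvent-exposed.

    Assumption: putting the origin there biases the protein mass
    toward the buried side of the ligand.
    """
    # Vector from the exposed atom toward the centroid points into the buried side.
    direction = [c - e for c, e in zip(ligand_centroid, exposed_atom)]
    norm = sum(d * d for d in direction) ** 0.5
    if norm == 0:
        raise ValueError("exposed_atom must differ from ligand_centroid")
    return tuple(c + offset * d / norm for c, d in zip(ligand_centroid, direction))


# Ligand centered at the origin, with the atom at +x meant to face solvent.
ori = place_ori((0.0, 0.0, 0.0), (2.0, 0.0, 0.0))
print(ori)  # (-8.0, 0.0, 0.0)
```

Under this assumption the ORI point is a deliberate design input, so different placements can be swept to sample different pocket orientations.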
Connection to Antivenom Pipeline
- How could RFdiffusion2 help scaffold a de novo binder for melittin if no natural structure exists?
- Where would RFdiffusion2 fit in your pipeline—before or after AlphaFold2/Chai-1, and why?
5/29: way past this; built end-to-end pipelines, now working on commercial implementations