Continuous Control of Editing Models via Adaptive-Origin Guidance

Method

The key observation of our work is that the limitation of classifier-free guidance (CFG) in controlling editing strength arises from the dominance of the unconditional prediction at low guidance scales. In instruction-based editing settings, the unconditional prediction typically corresponds to an arbitrary manipulation of the input rather than a faithful reconstruction. Consequently, lowering the guidance scale does not produce small semantic changes around the input; it pulls the output toward that arbitrary manipulation instead.

(a) Standard CFG

Null-condition as origin. The origin is given by ε_t(∅), and the guidance direction is ε_t(c_T) - ε_t(∅).

(b) Adaptive Origin (Ours)

Null-identity interpolated origin. The origin is interpolated between the identity prediction ε_t(REC) and the standard null prediction ε_t(∅), as a function of the edit strength.

Varying CFG scale vs. AdaOr (Adaptive-Origin) edit strength. At low scales, standard CFG starts from an arbitrary edit, whereas AdaOr transitions smoothly from the input image to the target edit.

To enable smooth control over edit strength, we introduce an identity instruction: an instruction that corresponds to the identity manipulation and reproduces the input content without any semantic modification. Building on this, we introduce a guidance mechanism in which the term that dominates the prediction at low scales (i.e., the origin) is adjusted according to the desired edit strength. Specifically, the origin is an interpolation between the identity prediction ε_t(REC) and the standard unconditional prediction ε_t(∅).

By assigning greater weight to the identity term at lower edit strengths and transitioning to the standard term at higher strengths, our method enables smooth, continuous control over manipulation intensity without requiring per-edit optimization or specialized datasets.
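
As an illustration only, the two guidance rules can be sketched in Python. The standard CFG form follows the origin and direction described above. The linear interpolation weights, the choice to scale the guidance term by the edit strength, and all names (`cfg`, `adaptive_origin`, `strength`, `scale`) are assumptions made for this sketch and may differ from the exact formulation used in the method. Toy scalars stand in for noise predictions; the same arithmetic applies elementwise to arrays.

```python
def cfg(eps_null, eps_edit, scale):
    """Standard CFG: origin eps_null, direction eps_edit - eps_null."""
    return eps_null + scale * (eps_edit - eps_null)


def adaptive_origin(eps_null, eps_identity, eps_edit, strength, scale):
    """Hypothetical adaptive-origin step (linear schedule is an assumption)."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    # Origin: identity prediction at strength 0, null prediction at strength 1.
    origin = (1.0 - strength) * eps_identity + strength * eps_null
    # Assumption: the guidance term is scaled by the same edit strength.
    return origin + strength * scale * (eps_edit - eps_null)


# Toy scalar stand-ins for eps_t(null), eps_t(REC), and eps_t(c_T).
eps_null, eps_identity, eps_edit = 2.0, 0.0, 4.0

# At scale 0, standard CFG collapses to the null prediction, not the input.
low_scale_cfg = cfg(eps_null, eps_edit, scale=0.0)

# At strength 0, the sketch returns the identity prediction instead.
no_edit = adaptive_origin(eps_null, eps_identity, eps_edit, strength=0.0, scale=3.0)

# At strength 1, the sketch coincides with standard CFG at the same scale.
full_edit = adaptive_origin(eps_null, eps_identity, eps_edit, strength=1.0, scale=3.0)
```

Under these assumptions, the endpoints behave as the text describes: zero strength reproduces the identity prediction, and full strength recovers standard CFG, with intermediate strengths blending the two origins.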
