got #stablediffusion's UNet compiled to CoreML, targeting all compute units (including Neural Engine).
replaced self-attention with the MultiHeadAttention that Apple optimized for the Neural Engine.
not faster yet (need to replace cross-attention too).
https://t.co/9tdKdGIK7X
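the conversion looks roughly like this; load_unet and replace_self_attention are hypothetical stand-ins (the latter swapping in a conv2d-based attention like Apple's ml-ane-transformers MultiHeadAttention):

```python
import torch
import coremltools as ct

unet = load_unet()             # hypothetical loader for SD's UNet
replace_self_attention(unet)   # hypothetical: swap in ANE-friendly attention
unet.eval()

example = (torch.randn(2, 4, 64, 64),   # latents
           torch.tensor([500.0]),       # timestep
           torch.randn(2, 77, 768))     # text embeddings
traced = torch.jit.trace(unet, example)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=t.shape) for t in example],
    convert_to="mlprogram",
    compute_units=ct.ComputeUnit.ALL,   # CPU + GPU + Neural Engine
)
mlmodel.save("unet.mlpackage")
```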
k-diffusion added Brownian tree noise sampling, increasing the stability of convergence.
10, 15, 20, 25, 30, 35, 40, 100 step counts:
left = default noise
right = Brownian noise
default strategy has it jumping all over the place, but Brownian sampling is stable. #stablediffusion
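for reference, passing a Brownian tree noise sampler into a k-diffusion ancestral sampler looks roughly like this (denoiser and the sigma range are assumptions from a typical SD setup):

```python
import torch
from k_diffusion.sampling import (BrownianTreeNoiseSampler, get_sigmas_karras,
                                  sample_dpmpp_2s_ancestral)

sigmas = get_sigmas_karras(n=15, sigma_min=0.03, sigma_max=14.6, device='cpu')
x = torch.randn(1, 4, 64, 64) * sigmas[0]

# Brownian tree noise keeps the noise consistent across step counts,
# so increasing steps refines the same image instead of changing it
noise_sampler = BrownianTreeNoiseSampler(x, sigma_min=0.03, sigma_max=14.6, seed=42)
samples = sample_dpmpp_2s_ancestral(denoiser, x, sigmas, noise_sampler=noise_sampler)
```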
skipping the clamping-by-%ile and just denormalizing CFG20's latents to CFG7.5's abs().max() is very similar to reducing cond_scale, but not quite.
think it works out something like:
(uncond + (cond - uncond) * 20)/(20/7.5)
versus
uncond + (cond - uncond) * (20/(20/7.5))
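the "not quite": the first form also shrinks the uncond term by 7.5/20, while the second keeps uncond at full strength. a quick numeric check (a sketch; shapes and values are illustrative):

```python
import torch

torch.manual_seed(0)
uncond = torch.randn(4)
cond = torch.randn(4)

# denormalizing the CFG-20 result into CFG-7.5's range (roughly):
a = (uncond + (cond - uncond) * 20) / (20 / 7.5)
# simply reducing cond_scale to 7.5:
b = uncond + (cond - uncond) * (20 / (20 / 7.5))

# a = uncond * (7.5/20) + (cond - uncond) * 7.5
# b = uncond           + (cond - uncond) * 7.5
print(a - b)  # == uncond * (7.5/20 - 1); not zero
```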
99.99999999999%ile dynthresh
I think this shows that "clamping out n%ile outliers" is only important when you have excessive outliers. the rest of the battle is "what range of values do you span"; hence denormalizing the latents to CFG7.5's abs().max() gives us a safer range.
each step: compute CFG7.5. for each channel: flatten, center on mean, grab abs().max().
compute CFG20. for each channel: flatten, center on mean, compute the 99.x%ile of abs(); take the larger of that %ile or CFG7.5's max. clamp the channel by it, normalize, multiply by CFG7.5's max.
code coming
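until the real code lands, here's a minimal sketch of the recipe above. the function name, the double-precision quantile, and re-adding the per-channel mean at the end are my assumptions:

```python
import torch

def dynthresh(uncond, cond, cfg_big=20.0, cfg_ref=7.5, pct=0.9999999999999):
    # run CFG at both scales (uncond/cond: [batch, channels, height, width])
    big = uncond + (cond - uncond) * cfg_big
    ref = uncond + (cond - uncond) * cfg_ref

    # flatten each channel and center it on its mean
    b = big.flatten(2)
    r = ref.flatten(2)
    b_means = b.mean(dim=2, keepdim=True)
    b_centered = b - b_means
    r_centered = r - r.mean(dim=2, keepdim=True)

    # known-good ceiling: CFG7.5's per-channel abs().max()
    r_max = r_centered.abs().max(dim=2, keepdim=True).values

    # 99.x%ile of CFG20's abs(); keep whichever ceiling is larger
    q = torch.quantile(b_centered.abs().double(), pct, dim=2, keepdim=True)
    ceiling = torch.maximum(q.to(b.dtype), r_max)

    # clamp outliers, normalize to [-1, 1], rescale into CFG7.5's range
    clamped = b_centered.clamp(-ceiling, ceiling)
    out = clamped / ceiling * r_max + b_means
    return out.reshape(big.shape)
```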
made a new algorithm for dynamic thresholding in #stablediffusion
enables us to set CFG scale high (e.g. 20) without clipping, keeping dynamic range / subtlety in shadows and highlights.
we refer to a known-good CFG (7.5)'s dynamic range, which helps us pick a ceiling.
detail to follow
got the official DPM-Solver++ sampler working with #stablediffusion on Mac.
today, Cheng Lu added a trick to improve performance at <15 steps; k-diffusion probably doesn't have this yet.
dynthresh only works in pixel space; it remains an unsolved problem on latents.
https://t.co/22rDBzDWiW
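roughly, assuming the API in Cheng Lu's dpm-solver repo (model, betas, cond, uncond come from your own Stable Diffusion setup):

```python
import torch
from dpm_solver_pytorch import NoiseScheduleVP, model_wrapper, DPM_Solver

noise_schedule = NoiseScheduleVP(schedule='discrete', betas=betas)

model_fn = model_wrapper(
    model, noise_schedule,
    model_type="noise",                  # SD's UNet predicts epsilon
    guidance_type="classifier-free",
    condition=cond,
    unconditional_condition=uncond,
    guidance_scale=7.5,
)

solver = DPM_Solver(model_fn, noise_schedule, algorithm_type="dpmsolver++")
x_T = torch.randn(1, 4, 64, 64)
latents = solver.sample(x_T, steps=10, order=2, method='multistep')
```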
with DPM-Solver++(2M) sampler, we get coherent images in 5 sampler steps!
and these aren't Heun steps (where n steps = 2n-1 model calls), this is just 5 model calls! less than 3.5 secs on Mac!
Katherine released this implementation yesterday in k-diffusion — great work as usual!
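usage is simple; assuming a k-diffusion-wrapped denoiser (e.g. CompVisDenoiser) and typical SD sigmas:

```python
import torch
from k_diffusion.sampling import get_sigmas_karras, sample_dpmpp_2m

sigmas = get_sigmas_karras(n=5, sigma_min=0.03, sigma_max=14.6, device='cpu')
x = torch.randn(1, 4, 64, 64) * sigmas[0]

# 5 steps = 5 model calls; DPM-Solver++(2M) reuses the previous step's
# output instead of doing a Heun-style second evaluation
samples = sample_dpmpp_2m(denoiser, x, sigmas)
```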
classifier-free guidance:
ask the model to denoise Gaussian noise.
no condition: model predicts a salad.
shrine maiden condition: model predicts graffiti of faces.
CFG is "what makes shrine maiden different from salad", multiplied by your guidance scale.
repeat this every sampler step.
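as a one-liner sketch (names illustrative):

```python
import torch

def cfg_step(uncond: torch.Tensor, cond: torch.Tensor, scale: float) -> torch.Tensor:
    # (cond - uncond) is "what makes shrine maiden different from salad";
    # amplify it by the guidance scale on top of the unconditional prediction
    return uncond + (cond - uncond) * scale
```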