Birchlabsさんのプロフィール画像

Birchlabsさんのイラストまとめ


ML Engineer at Anlatan (@novelaiofficial). co-author of HDiT (Hourglass Diffusion Transformers). works on diffusion models and LLMs. 日本語を勉強してる。
birchlabs.co.uk

フォロー数:191 フォロワー数:4507

cubic eased multi-cond guidance
similar visual effect to latent walk, but I think it has better temporal coherence + the motion's easier to follow.
we're mixing denoising predictions, so it's a visual interpolation rather than a conceptual interpolation.

0 9

latent walk (spherical linear interpolation between word embeddings on CLIP's latent hypersphere) vs multi-cond guidance
the effect is similar, but pose changes more during latent walk

3 20

multi-cond guidance is back!
thanks to for the hardware.
diffusers-play can now do complex batching.
each sample can have different conditions, varied numbers of conditions, differing negative prompts, differing CFG scales, weighted multi-cond
https://t.co/r85Jc9EpXR

2 19

waifu-diffusion 1.5 beta is out!
trained on tag-based prompts *and* natural language
photos *and* illustrations
high-res (896x896), varied aspect ratio!
great work by as usual, and any other contributors
https://t.co/5D3ZCmGSoU

2 23

finally got the Nvidia 4090 PC set up!
Python 3.11 + CUDA 11.8 + PyTorch 2.0 alpha + xformers + Ubuntu 22.10
it's so faaaassst
thanks again for sponsoring the build!
https://t.co/xzpWuYHLgl

3 15

on Mac got 12% faster! (12 Unet steps 12.03->10.76s; avg of 10)
upgraded pytorch 789b143...acab0ed (Dec 23–Jan 6)
wonder if it's due to razarmehr's torch.linear() optimizations (made BERT 3x faster)
https://t.co/r5HBnpm5RW
more to come when macOS updates!

0 13

dropping out the final value head from cross-attention, and replacing it with a duplicate of the penultimate value head
different, but… fine?
https://t.co/3yrsOmw23Z

0 7

I'm not actually sure is *using* all of those attention heads.
we can throw away one of the value heads used by cross-attention, and get an almost-correct result.
"One Write-Head is All You Need" suggests only _query_ needs multiple heads.
https://t.co/axPGdLmlJT

3 50

it's trained on images with 640x640 area, with neither length longer than 768.
I did the maths wrong and generated an out-of-distribution image (800x768) but it came out great:

6 22

waifu-diffusion 1.4 epoch 1 is out!
supports non-square aspect ratios, triple prompt length.
includes text encoder and VAE.
thanks to the WD team (, , salt, +contributors & sponsors) and to NovelAI for sharing their training process.
https://t.co/t8vVqxnTkK

21 84