The basic idea here is to train MLPs to predict a CLIP image embedding from a text embedding (and vice versa), then normalize the embeddings and add them together.
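
A minimal PyTorch sketch of that setup; the MLP width, the cosine loss, and the 512-dim embeddings are my assumptions, not a confirmed recipe:

```python
# Minimal sketch: regress CLIP image embeddings from text embeddings.
# Dimensions, depth, and loss are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingTranslator(nn.Module):
    """MLP mapping one CLIP embedding space to the other."""
    def __init__(self, dim=512, hidden=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x):
        return self.net(x)

def train_step(model, opt, text_emb, image_emb):
    # Cosine loss suits CLIP's unit-norm geometry.
    pred = model(text_emb)
    loss = 1 - F.cosine_similarity(pred, image_emb, dim=-1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

def combine(emb_a, emb_b):
    # "Normalize and add": unit-normalize each embedding,
    # sum them, then renormalize the result.
    a = F.normalize(emb_a, dim=-1)
    b = F.normalize(emb_b, dim=-1)
    return F.normalize(a + b, dim=-1)
```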

Me reading a bunch of sh*t I don't understand: "I'm creating embeddings"

Averaging CLIP embeddings over many permutations of the form "a photograph of X", "X trending on artstation", "X rendered in Blender/Unity", etc. produces fairly dramatic improvements in some cases (see the sketch below).

e.g. "Jabba the Hutt smoking a cigar"

Left: single text prompt. Right: average over many texts.
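
A sketch of that averaging trick using the open-source `clip` package (https://github.com/openai/CLIP); the exact template list here is illustrative:

```python
# Average CLIP text embeddings over several prompt templates.
import torch
import torch.nn.functional as F
import clip  # https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

# Illustrative templates: the tweet's examples plus variants.
templates = [
    "a photograph of {}",
    "{} trending on artstation",
    "{} rendered in Blender",
    "{} rendered in Unity",
]

def averaged_text_embedding(subject):
    prompts = [t.format(subject) for t in templates]
    tokens = clip.tokenize(prompts).to(device)
    with torch.no_grad():
        emb = model.encode_text(tokens)
    emb = F.normalize(emb, dim=-1)               # unit-normalize each prompt
    return F.normalize(emb.mean(dim=0), dim=-1)  # average, then renormalize

target = averaged_text_embedding("Jabba the Hutt smoking a cigar")
```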

I tried visualizing the difference between word2vec and Poincaré embeddings.

Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation. https://t.co/bxA1cW2i9e

Learning Point Embeddings from Shape Repositories for Few-Shot Segmentation. https://t.co/Cc0lUSsRME

Cross-Lingual Contextual Word Embeddings Mapping With Multi-Sense Words In Mind. https://t.co/7gWZPkIlFW

Computing k-Modal Embeddings of Planar Digraphs. https://t.co/Q1c1oox0YR

An Exploration on Coloring t-SNE Embeddings in Python https://t.co/H6QzANtbRm
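
In the same spirit as that post, a quick self-contained example; the coloring rule (hue from angle around the centroid) is just one common choice, not necessarily the article's:

```python
# Color a 2-D t-SNE embedding by each point's angle around the centroid.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

X = np.random.randn(500, 50)  # stand-in for real feature vectors
pts = TSNE(n_components=2, init="pca", random_state=0).fit_transform(X)

centered = pts - pts.mean(axis=0)
hue = (np.arctan2(centered[:, 1], centered[:, 0]) + np.pi) / (2 * np.pi)

plt.scatter(pts[:, 0], pts[:, 1], c=hue, cmap="hsv", s=8)
plt.title("t-SNE colored by angle around the centroid")
plt.show()
```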
