To save space, is it possible to use quantized embeddings? I’d like to use binary quantized embeddings (1 bit per dimension instead of a 32-bit float) to really cut down the storage size. In my own tests, I keep more than 90% of the full-precision performance with 32x less storage!
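
For reference, here is a minimal sketch of what I mean by binary quantization. It assumes float32 embeddings, thresholding each dimension at zero, and comparing vectors by Hamming distance; the function names `binarize` and `hamming_distances` and the random test data are purely illustrative, not any library's API.

```python
import numpy as np

def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Keep only the sign of each dimension and pack 8 dimensions per byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)

def hamming_distances(query: np.ndarray, corpus: np.ndarray) -> np.ndarray:
    """Count differing bits between one packed query and each packed corpus row."""
    xor = np.bitwise_xor(corpus, query)
    return np.unpackbits(xor, axis=-1).sum(axis=-1)

# Assumption: random vectors standing in for real 1024-dimensional embeddings.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 1024)).astype(np.float32)

packed = binarize(embeddings)                       # shape (1000, 128), dtype uint8
storage_ratio = embeddings.nbytes / packed.nbytes   # 32.0 (32 bits -> 1 bit per dim)

# Rank the corpus against the first vector; smaller distance means more similar.
distances = hamming_distances(packed[0], packed)
nearest = np.argsort(distances)[:10]
```

The 32x figure comes straight from going from 32 bits to 1 bit per dimension. How much retrieval quality survives depends on the embedding model and the dataset, so it would need to be measured per use case.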