
Why quantize something that is already very small (270 MB)?


Just speculating here, but smaller models are a great fit for serverless compute like cloud functions, which also benefits from lighter computation. Don't forget that some people are dealing with hundreds of millions of documents; accelerating that by 4x may be worth a small hit to accuracy.
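To make the 4x concrete: going from float32 to int8 cuts storage (and memory bandwidth) by exactly 4x. A minimal sketch of symmetric int8 quantization in NumPy (the weight matrix here is a made-up stand-in for one layer of a model):

```python
import numpy as np

# Hypothetical weight matrix standing in for one layer of a small model.
weights = np.random.randn(1024, 1024).astype(np.float32)

# Symmetric int8 quantization: scale by the max absolute value so the
# largest weight maps to 127.
scale = np.abs(weights).max() / 127.0
q = np.round(weights / scale).astype(np.int8)

# Dequantize for use at inference time.
deq = q.astype(np.float32) * scale

print(weights.nbytes / q.nbytes)            # 4.0: int8 takes 4x less space than float32
print(np.abs(weights - deq).max() <= scale)  # True: rounding error stays within one step
```

The size saving is guaranteed; whether the rounding error matters is the "small performance hit" being weighed in the comment above.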



