AI Tools

Quantization for memory savings

Apr 2, 2025
Released
Safe harbor statement

The following is intended for informational purposes only, and may not be incorporated into any contract. No purchasing decisions should be made based on the following materials. Unity is not committing to deliver any functionality, features or code. The development, timing and release of all products, functionality and features are at the sole discretion of Unity, and are subject to change.

How to share roadmap feedback

If you have accepted functional cookies and are logged in with your Unity ID at the top right of the page, sharing feedback is as simple as clicking a card below, selecting a topic’s importance, adding your point of view, and submitting. If you prefer not to accept functional cookies or log in, you will be prompted to enter and validate an email address, so we know how to reach you when the topic evolves. For more information, read the Feedback and privacy terms.

What? Quantization converts AI model weights (the numbers that take up most of a model’s size) from relatively large, high-precision numbers (FP32) to relatively small, low-precision numbers (FP16 or INT8) to save disk space. A simple non-AI example illustrates how quantization works with images. An image is often stored at 24 bits per pixel for full RGB color, but you can downsize that to 8 bits to use fewer colors, or even 1 bit for black and white only – that is quantization.
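To make the idea concrete, here is a minimal sketch of symmetric INT8 quantization of a small weight array using NumPy. The weight values are hypothetical, and real toolchains use more sophisticated schemes (per-channel scales, calibration), but the core mapping is the same: divide by a scale so the largest weight lands at 127, round, and store as 8-bit integers.

```python
import numpy as np

# Hypothetical FP32 weights, standing in for a model layer.
weights_fp32 = np.array([0.12, -0.84, 0.33, 0.97, -0.05], dtype=np.float32)

# Symmetric INT8 quantization: map the largest absolute weight to 127.
scale = float(np.abs(weights_fp32).max()) / 127.0
weights_int8 = np.round(weights_fp32 / scale).astype(np.int8)

# De-quantize to recover approximate FP32 values at inference time.
weights_restored = weights_int8.astype(np.float32) * scale

# Each stored weight now takes 1 byte instead of 4, at the cost of
# a small rounding error bounded by half the scale.
max_error = float(np.max(np.abs(weights_restored - weights_fp32)))
```

Rounding introduces at most half a quantization step of error per weight, which is why quality loss grows as precision drops.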

Why? AI models can be very large (many gigabytes), so the file size can be an issue if you want to embed them in your Unity app. Quantizing can reduce file size with the following approximate benchmarks we tested: by 50% for FP16 and by 75% for INT8. The downside of quantization is quality loss. Calculating quality loss is tricky and subjective, so it’s best to test your results thoroughly to see if the degradation is acceptable. A good rule of thumb is that larger models degrade less with quantization.
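The size reductions above follow directly from the storage widths: FP16 uses 2 bytes per weight versus 4 for FP32, and INT8 uses 1. A quick sketch with NumPy (hypothetical weights, crude quantization for illustration only) shows the arithmetic:

```python
import numpy as np

# One million hypothetical FP32 weights (~4 MB on disk).
w_fp32 = np.random.rand(1_000_000).astype(np.float32)

w_fp16 = w_fp32.astype(np.float16)                 # 2 bytes per weight
w_int8 = np.round(w_fp32 * 127).astype(np.int8)    # 1 byte per weight

fp16_ratio = w_fp16.nbytes / w_fp32.nbytes   # 0.5  -> ~50% smaller
int8_ratio = w_int8.nbytes / w_fp32.nbytes   # 0.25 -> ~75% smaller
```

Real savings can differ slightly once file-format overhead and non-weight data (metadata, embeddings) are included, which is why the benchmarks above are approximate.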

*Note: Performance improvement with quantization is separate from the memory-saving improvement and is covered on a separate roadmap card.
