Quantization
Quantization reduces the precision of the weights and biases in a model in order to decrease memory and computational requirements. It involves converting full-precision 32-bit floating-point weights into lower-precision formats. Typically 16-bit or 8-bit quantization is used, but research on ternary and binary networks has shown promise in resource-constrained environments.
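As a concrete illustration, the sketch below quantizes a float32 weight tensor to 8-bit integers and back. It assumes NumPy and uses symmetric per-tensor quantization (a single scale derived from the largest-magnitude weight); this is one common scheme, not the only one, and the function names are hypothetical.

    import numpy as np

    def quantize_int8(weights):
        # Symmetric per-tensor quantization: map the largest-magnitude
        # weight to the int8 limit (127) and scale the rest linearly.
        # The small epsilon guards against an all-zero tensor.
        scale = max(np.max(np.abs(weights)), 1e-12) / 127.0
        q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        # Recover approximate float32 weights from the int8 values.
        return q.astype(np.float32) * scale

    w = np.random.randn(4, 4).astype(np.float32)  # full-precision weights
    q, scale = quantize_int8(w)
    w_hat = dequantize(q, scale)
    print("max abs rounding error:", np.max(np.abs(w - w_hat)))

The int8 tensor occupies one quarter of the memory of the float32 original; the price is the small rounding error printed at the end, which is bounded by half the scale per weight.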