We also recommend using BF16 as the activation precision for the model. We released the models with native quantization support. You can either use the with_python() method if your tool implements the full interface or modify the definition using with_tools(). This implementation runs in a permissive Docker container which could…