Create, Share, and Scale Enterprise AI Workflows with NVIDIA AI Workbench, Now in Beta
NVIDIA
JANUARY 30, 2024
The most common quantization used for this LoRA fine-tuning workflow is 4-bit quantization, which offers a good balance between model quality and fine-tuning feasibility on limited GPU memory. Use a sandbox environment to try the code for yourself, and notice that the base model does not perform well out of the box.
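To make the idea concrete, the sketch below illustrates the core mechanic behind 4-bit quantization: mapping float weights to 4-bit integer codes plus a per-tensor scale, then reconstructing approximate values. This is a simplified symmetric scheme for illustration only; production LoRA workflows typically use schemes such as NF4 via the bitsandbytes library, not this exact code.

```python
# Minimal, illustrative sketch of symmetric 4-bit weight quantization.
# Real fine-tuning stacks use more sophisticated schemes (e.g., NF4),
# but the storage-vs-accuracy tradeoff shown here is the same idea.

def quantize_4bit(weights):
    """Map float weights to 4-bit integer codes (-8..7) plus a scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs else 1.0  # 7 = largest positive 4-bit code
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale

def dequantize_4bit(codes, scale):
    """Reconstruct approximate float weights from codes and scale."""
    return [c * scale for c in codes]

weights = [0.12, -0.98, 0.45, 0.07, -0.33]
codes, scale = quantize_4bit(weights)
restored = dequantize_4bit(codes, scale)
# Each code fits in 4 bits, so storage drops roughly 8x versus 32-bit
# floats, at the cost of a small reconstruction error per weight.
```

In a real workflow the frozen base-model weights are stored in this compressed form, while the small LoRA adapter matrices remain in higher precision and are the only parameters updated during fine-tuning.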