Overview
This guide provides an overview of the model training process used by OGC's Fine-Tuning Service, highlighting the key aspects and configurations involved in fine-tuning Large Language Models (LLMs).
Understanding Fine-Tuning
Fine-tuning allows you to adapt a pre-trained Large Language Model to your specific use case, enhancing its performance for particular tasks or domains without retraining the entire model from scratch.
Efficient Fine-Tuning with LoRA
OGC uses a technique called Low-Rank Adaptation (LoRA) for fine-tuning, which:
- Significantly reduces computational resources and memory requirements.
- Maintains or enhances model performance compared to traditional fine-tuning methods.
- Trains only a small subset of the model's parameters while keeping the rest frozen, which speeds up the fine-tuning process (see the sketch below).
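To make this concrete, here is a minimal sketch of the low-rank update LoRA applies to a frozen linear layer. The class name, dimensions, and initialization are illustrative assumptions, not OGC's implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer augmented with a trainable low-rank update.

    Instead of updating the pre-trained weight W directly, LoRA learns two
    small matrices A and B so the effective weight becomes W + (alpha/r) * B @ A.
    """

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pre-trained weights stay frozen
        self.base.bias.requires_grad_(False)
        # Low-rank factors: only these r * (in + out) values are trained.
        self.lora_a = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, r))  # zero init: no effect at start
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling

layer = LoRALinear(768, 768, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable parameters: {trainable} of {total} ({100 * trainable / total:.1f}%)")
```

For a 768x768 layer with rank 8, only about 2% of the parameters are trainable, which is where the memory and speed savings come from.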
Training Configuration
When launching your fine-tuning job through OGC's user interface, you'll have control over several key training parameters, illustrated in the sketch after this list:
- Batch Size:
  - Adjust batch sizes to fit within available GPU memory.
  - A larger batch size can speed up training, but it also risks settling into a poor local minimum, which can delay model convergence.
- Number of Training Epochs:
  - Set how many complete passes the trainer makes over the training dataset; within each epoch, every batch is processed once to compute the loss and gradients.
- Evaluation Metrics and Early Stopping:
  - Automatic evaluation during training helps select the best-performing model.
- Learning Rate:
  - Set the base learning rate for the optimizer, which then adapts the rate dynamically for each parameter; this adaptive behavior is both effective and memory-efficient.
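For intuition, here is roughly how these settings map onto a typical Hugging Face training setup. This is a hedged sketch with illustrative values, not OGC's actual backend configuration:

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Illustrative values only; tune for your dataset and available GPU memory.
args = TrainingArguments(
    output_dir="./finetune-output",
    per_device_train_batch_size=8,   # batch size: larger is faster but needs more GPU memory
    num_train_epochs=3,              # complete passes over the training dataset
    learning_rate=2e-4,              # base rate; the optimizer adapts it per parameter
    eval_strategy="epoch",           # evaluate on the validation set every epoch
    save_strategy="epoch",           # checkpoint at the same cadence as evaluation
    load_best_model_at_end=True,     # keep the checkpoint with the best validation metric
    metric_for_best_model="eval_loss",
)

# Halts training if validation loss fails to improve for two consecutive evaluations.
early_stopping = EarlyStoppingCallback(early_stopping_patience=2)
```

The early-stopping callback is one common way to implement the automatic model selection described above: training stops once validation loss plateaus, and the best checkpoint is kept.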
As noted above, LoRA significantly reduces the number of trainable parameters, making fine-tuning faster and more memory-efficient. The key LoRA parameters you can configure are listed below, followed by a configuration sketch:
- LoRA Rank (r): Determines the dimensionality of the LoRA matrices. A smaller rank means fewer trainable parameters and faster training, but potentially lower model performance. A common range for the rank is 4 to 16.
- LoRA Alpha (alpha): A scaling factor for the LoRA weights, typically set equal to the rank or twice the rank. It controls how much the LoRA adaptation influences the original model's weights.
- LoRA Dropout (dropout): A regularization technique applied to the LoRA layers during training to prevent overfitting; it randomly sets a fraction of the LoRA activations to zero. A common value is 0.1.
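If you were setting these same knobs in code, for example with the peft library, the configuration might look like the following sketch; the target modules and values are common starting points and assumptions, not OGC defaults:

```python
from peft import LoraConfig, TaskType, get_peft_model

# Common starting points; OGC's UI exposes the same three knobs.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                 # rank: fewer trainable parameters vs. adaptation capacity
    lora_alpha=16,       # scaling factor, here set to twice the rank
    lora_dropout=0.1,    # randomly zeroes a fraction of LoRA activations during training
    target_modules=["q_proj", "v_proj"],  # assumed attention projections; model-dependent
)

# model = get_peft_model(base_model, lora_config)  # wraps an already-loaded base model
```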
Monitoring and Managing Training Jobs
OGC's Fine-Tuning Service provides intuitive tools to monitor your fine-tuning jobs, enabling you to:
- Access detailed logs of the training and validation loss to understand model performance.
- Manage checkpoints effectively: each checkpoint is evaluated on the validation loss, so the best version of your model is always retained, as in the sketch below.
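As a sketch of what this monitoring can look like downstream, the snippet below reads a hypothetical JSON-lines training log (the file name and field names are assumptions, not OGC's actual log schema) and reports the checkpoint with the lowest validation loss:

```python
import json

# Hypothetical log format (assumed, not OGC's actual schema): one JSON object
# per evaluation step, e.g. {"step": 500, "train_loss": 1.82, "val_loss": 1.95}
with open("training_log.jsonl") as f:
    records = [json.loads(line) for line in f]

# Find the checkpoint with the lowest validation loss.
best = min(records, key=lambda r: r["val_loss"])
print(f"best checkpoint: step {best['step']}, "
      f"train_loss={best['train_loss']:.3f}, val_loss={best['val_loss']:.3f}")
```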
Advantages of Fine-Tuning on OGC
- One-click fine-tuning
- Job management
- Custom dataset and Hugging Face integration
- Integration with OGC's Model Registry and Inference Endpoint
Fine-Tuning Use Cases
- Customer Support Chatbot – Fine-tune on historical tickets and help center content for automated support.
- Product Q&A Assistant – Train on catalogs and reviews to answer product-related queries.
- Legal Document Analyzer – Fine-tune on contracts to extract clauses, obligations, and risks.
- Codebase Assistant – Train on internal code and documentation to support developer queries.
- Clinical Trial Chatbot – Fine-tune on trial protocols and FAQs for real-time investigator assistance.
- Manufacturing Defect Classifier – Train vision models on proprietary defect images for quality control.