Kapampangan2English

Fine-Tuned SLM with RAG

Role: AI Engineer

The Overview

This project addresses the limited linguistic support for Kapampangan in mainstream Large Language Models (LLMs). By combining a customized Small Language Model (SLM) with a Retrieval-Augmented Generation (RAG) pipeline, the system provides precise translations grounded in verified dictionary data. The application is served via a FastAPI backend and consumed through an interactive Streamlit web interface.

Phase 1: Fine-Tuning (Pre-Project)

Before building the RAG application, the core translation intelligence had to be developed. This involved fine-tuning a base model on specific Kapampangan-English datasets.

Model & Dataset: The foundation is Qwen3-1.7B, a compact Small Language Model. It was fine-tuned on the Coco-18 Kapampangan-English dataset.
Unsloth & Quantization: To keep fine-tuning accessible and efficient, the model was quantized to 4-bit precision using Unsloth and trained with QLoRA. This substantially reduced VRAM requirements while preserving translation quality.
Benchmarking: Performance was evaluated by comparing the BLEU and chrF scores of the base model against those of the fine-tuned version to confirm a measurable improvement.
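
The benchmarking step can be illustrated with a minimal sketch. It assumes a simplified chrF-style metric (character n-gram precision and recall combined into an F-score with beta = 2) written in plain Python; it is not the exact scorer used in the project, and a library such as sacrebleu would normally be used for reported numbers. The function names and the sample sentences are hypothetical placeholders, not real model outputs.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams; this sketch strips all whitespace first."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf_like(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF-style score in [0, 100] (an approximation, not official chrF)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip n-gram orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)


def mean_score(hypotheses: list[str], references: list[str]) -> float:
    """Average the sentence-level scores over a small evaluation set."""
    scores = [chrf_like(h, r) for h, r in zip(hypotheses, references)]
    return sum(scores) / len(scores)


# Hypothetical placeholders standing in for real evaluation data.
references = ["the water is cold"]
base_outputs = ["water cold"]            # pretend output of the base model
tuned_outputs = ["the water is cold"]    # pretend output of the fine-tuned model

if __name__ == "__main__":
    print("base :", round(mean_score(base_outputs, references), 2))
    print("tuned:", round(mean_score(tuned_outputs, references), 2))
```

Running both models over the same held-out references and comparing the two averages gives the before/after comparison described above; BLEU would be computed on the same pairs with a word-level scorer.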

Phase 2: RAG Pipeline Integration

The fine-tuned model acts as the primary generation engine. To extend its vocabulary beyond the training data and reduce hallucinations, a local vector database was added.

Data Processing: Dictionary data was scraped using Selenium & BeautifulSoup, then cleaned and normalized with LLM assistance (Claude) before being embedded using all-MiniLM-L6-v2.
Vector Search: The embeddings are stored in ChromaDB. When a user queries a word, the system retrieves relevant definitions to augment the prompt for the fine-tuned SLM.
Orchestration & Serving: LangChain orchestrates the retrieval and generation phases. Inference is handled efficiently by Ollama. The entire backend is exposed via FastAPI, with a clean UI built in Streamlit.
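
The retrieve-then-augment flow described above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's code: a toy character-bigram vector stands in for the all-MiniLM-L6-v2 embeddings, an in-memory list stands in for ChromaDB, and the dictionary entries, function names, and prompt wording are hypothetical. In the real pipeline, LangChain would pass the resulting prompt to the fine-tuned model served by Ollama.

```python
import math
from collections import Counter

# Illustrative sample entries only; not the project's scraped dictionary data.
DICTIONARY = {
    "danum": "water",
    "bale": "house",
    "aldo": "day; sun",
    "mangan": "to eat",
}


def embed(text: str) -> Counter:
    """Toy embedding: character-bigram counts (stand-in for a sentence encoder)."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# In-memory "vector store": (headword, definition, embedding) triples.
INDEX = [(word, gloss, embed(word)) for word, gloss in DICTIONARY.items()]


def retrieve(query: str, k: int = 2) -> list[tuple[str, str]]:
    """Return the k entries whose embeddings are most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(INDEX, key=lambda entry: cosine(query_vec, entry[2]), reverse=True)
    return [(word, gloss) for word, gloss, _ in ranked[:k]]


def build_prompt(query: str, k: int = 2) -> str:
    """Prepend retrieved definitions to the translation request."""
    context = "\n".join(f"- {word}: {gloss}" for word, gloss in retrieve(query, k))
    return (
        "Use the dictionary entries below when translating.\n"
        f"{context}\n"
        f"Translate from Kapampangan to English: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("danum"))
```

Keeping retrieval and prompt construction as separate functions mirrors the split between the vector search and generation phases, so either side can be swapped (for example, a real embedding model and a persistent collection) without touching the other.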

Tech Stack

Python
FastAPI & Uvicorn
Streamlit
LangChain & Ollama
ChromaDB
Unsloth & QLoRA
Qwen3-1.7B
Selenium & BeautifulSoup

Pipeline Highlights

4-bit GGUF Quantization
Custom Fine-tuned Translator
RAG-augmented definition retrieval
Microservice Architecture

Case Study: Kapampangan2English Pipeline