How We Stopped Fine-Tuning and Started Querying: Real-Time RAG with DeepSeek at Rast Mobile

3 min read · Apr 21, 2025

At Rast Mobile, we recently built a fully local RAG system using DeepSeek 32B, integrated with live company databases, and we didn’t need any fine-tuning.

That’s right. No custom training. No embedding pipelines.
Just smart use of metadata, live SQL, and prompt engineering. Here’s how we did it.

You don't need to write tons of SQL. Use an LLM to generate it for you.

- The Realisation: You Don’t Need Fine-Tuning

We started this project thinking we might need to fine-tune a model to understand our client’s business data. But as we dug in, it became clear:

RAG isn’t about teaching the model new knowledge. It’s about giving it the right context at the right time.

And in our case, that context was structured business data from PostgreSQL, and we already had it.

- LLM’s Job: SQL Generator + Answer Enhancer

So instead of fine-tuning, we used the LLM in two key ways:

  1. Generate SQL based on user questions and table metadata
  2. Enhance and rephrase the result into a meaningful, user-friendly explanation

That’s it.
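The two roles above map naturally onto two prompt templates. Here is a minimal sketch; the template names, wording, and `build_sql_prompt` helper are illustrative, not the exact prompts used at Rast Mobile.

```python
# Illustrative prompt templates for the two LLM roles:
# (1) SQL generation from metadata, (2) answer enhancement.

SQL_GENERATION_PROMPT = """You are a SQL assistant for a PostgreSQL database.

Schema metadata:
{metadata}

Sample rows:
{samples}

Write a single SQL query that answers the question below.
Return only the SQL, with no explanation.

Question: {question}
SQL:"""

ANSWER_ENHANCEMENT_PROMPT = """A user asked: {question}

The database returned: {result}

Rephrase this result as one short, friendly sentence for a
non-technical reader."""


def build_sql_prompt(metadata: str, samples: str, question: str) -> str:
    # Inject live metadata and the user's question into the template.
    return SQL_GENERATION_PROMPT.format(
        metadata=metadata, samples=samples, question=question
    )
```

Keeping the "return only the SQL" instruction in the first template makes the model's output easy to execute directly, without parsing explanations out of the response.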

- Our RAG Strategy: Simple, Effective, Local

Here’s what the system does step by step:

  1. Receive natural language question from user
    → e.g., “How many orders were delivered last week?”
  2. LLM reviews metadata (table names, column types, relationships, samples)
    → No embeddings. Just descriptive metadata and sample rows in context.
  3. LLM generates SQL query
    → e.g., SELECT COUNT(*) FROM orders WHERE status = 'delivered' AND created_at >= now() - interval '7 days';
  4. The system executes the generated query against the live PostgreSQL DB
  5. LLM wraps the result into a readable, customized answer
    → “There were 187 delivered orders in the last 7 days.”
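The five steps above can be sketched end to end. This runnable version stubs the LLM and uses an in-memory SQLite database so the flow is self-contained; in production the two `llm()` calls would go to DeepSeek 32B served by vLLM, and the connection would point at PostgreSQL.

```python
import sqlite3

def llm(prompt: str) -> str:
    # Stub standing in for the local DeepSeek 32B model. A real call
    # would send the prompt to the vLLM endpoint.
    if "Write a single SQL query" in prompt:
        return ("SELECT COUNT(*) FROM orders WHERE status = 'delivered' "
                "AND created_at >= date('now', '-7 days');")
    return "There were 2 delivered orders in the last 7 days."

def answer_question(conn: sqlite3.Connection, question: str) -> str:
    # Steps 2-3: the LLM sees descriptive metadata plus the question
    # and returns SQL (no embeddings involved).
    metadata = "orders(id INTEGER, status TEXT, created_at TEXT)"
    sql = llm(f"Schema: {metadata}\nWrite a single SQL query for: {question}")
    # Step 4: execute against the live database.
    result = conn.execute(sql).fetchone()[0]
    # Step 5: the LLM wraps the raw result into a readable answer.
    return llm(f"Question: {question}\nResult: {result}\nRephrase nicely.")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, date('now'))",
    [(1, "delivered"), (2, "delivered"), (3, "pending")],
)
print(answer_question(conn, "How many orders were delivered last week?"))
```

One design note: executing model-generated SQL against a live database should go through a read-only connection or role, so a malformed or malicious query cannot modify data.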

- What Worked Well

  • No need for fine-tuning
  • No vector databases or embeddings
  • Just smart prompt templates + live metadata injection
  • Works with any SQL-compatible DB (PostgreSQL, MySQL, MSSQL, etc.)

- All On-Premise

We run everything locally using DeepSeek 32B, so data privacy is fully preserved. No API calls to OpenAI, no cloud storage. Great for enterprises with strict security policies.

- Live Use Case

The system is now running at one of our partner companies. Non-technical team members can ask business questions and get real-time answers without needing dev support or writing SQL.

It’s like giving them a data analyst who never sleeps.

- Tech Stack Summary

  • 🔁 DeepSeek 32B (running locally with vLLM)
  • 🧩 Prompt templates dynamically injected with:
    • Table names + descriptions
    • Column metadata
    • Sample rows
  • 💾 PostgreSQL (but adaptable to any SQL DB)
  • 🧠 RAG logic: Generate SQL → Query DB → Enhance Answer
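The "live metadata injection" piece of the stack can be sketched as follows: pull schema rows (as you would get them from `information_schema.columns` in PostgreSQL) and compress them into the compact description pasted into the prompt. The query string and the `format_metadata` helper are illustrative assumptions.

```python
# Query you'd run against PostgreSQL to fetch column metadata.
METADATA_QUERY = """
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position;
"""

def format_metadata(rows: list[tuple[str, str, str]]) -> str:
    # Group (table, column, type) rows into one compact line per table,
    # ready to paste into the SQL-generation prompt.
    tables: dict[str, list[str]] = {}
    for table, column, dtype in rows:
        tables.setdefault(table, []).append(f"{column} {dtype}")
    return "\n".join(f"{t}({', '.join(cols)})" for t, cols in tables.items())

rows = [
    ("orders", "id", "integer"),
    ("orders", "status", "text"),
    ("orders", "created_at", "timestamp"),
]
print(format_metadata(rows))
# orders(id integer, status text, created_at timestamp)
```

Because the metadata is fetched at query time rather than baked into the model, schema changes show up in the prompt immediately, with no retraining or re-indexing.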

- What You Can Take Away

You don't need to write tons of SQL. Use an LLM to generate it for you.

If you’re building a RAG system for structured data:

  • Skip fine-tuning for SQL-based tasks
  • Use rich metadata and prompt engineering
  • Let LLMs assist SQL writing, not replace your logic layer
  • Think of RAG as contextual I/O, not knowledge training

- Want to Learn More?

We’re now offering this solution as a service at Rast Mobile — with full support for local hosting, custom prompt engineering, and SQL integration.

Mail me or visit rastmobile.com if you want to use LLMs as live data copilots in your business.

Written by Mehmet ALP

Hello! My name is Alp and I am the founder of Rast Mobile, a software development company focused on delivering high-quality, innovative solutions to clients.
