Building a Toy Project with LlamaIndex, Next.js, and Ollama

In one of my previous blog posts, I introduced LlamaIndex—a powerful framework for building LLM applications. In this post, I’d like to take it a step further by creating a toy project using a Next.js backend paired with the Ollama provider for running local LLMs.

Let’s get started.


🛠 Requirements

To follow along, make sure you have the following installed:

  • Node.js
  • Ollama (for running models locally)

💡 Tip: Use nvm for Node.js Version Management

I highly recommend using nvm to manage multiple versions of Node.js easily. It’s a lifesaver for projects with varying version requirements.
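
For example, installing and switching to the latest LTS release is as simple as:

nvm install --lts
nvm use --lts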


🚀 Step 1: Install Ollama

Download and install Ollama from their official site:
👉 https://ollama.com/download

And pull these models:

ollama pull llama3.2
ollama pull nomic-embed-text
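
Once both pulls finish, you can confirm the models are available locally:

ollama list

You should see llama3.2:latest and nomic-embed-text:latest in the output.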

🧱 Step 2: Scaffold the Project

We’ll start by creating a LlamaIndex-powered app:

npx create-llama@latest

You'll be asked for your project name, along with a few other configuration options. The prompts look something like this:

npx create-llama@latest
Need to install the following packages:
  create-llama@latest
Ok to proceed? (y) y
✔ What is your project named? … my-app
✔ What app do you want to build? › Agentic RAG
✔ What language do you want to use? › Next.js
✔ Do you want to use LlamaCloud services? … No / Yes
✔ Please provide your LlamaCloud API key (leave blank to skip): …
✔ Please provide your OpenAI API key (leave blank to skip): …
? How would you like to proceed? › - Use arrow-keys. Return to submit.
    Just generate code (~1 sec)
    Start in VSCode (~1 sec)
❯   Generate code and install dependencies (~2 min)
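
When the scaffold finishes, switch into the new project directory (my-app in the example above); the remaining steps are all run from there:

cd my-app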

🔌 Step 3: Add Ollama Provider

To integrate Ollama, we need to install the official LlamaIndex Ollama bindings, which provide the Ollama and OllamaEmbedding classes we'll wire up in the next step:

npm install @llamaindex/ollama

⚙️ Step 4: Update Your .env File

Edit your .env file to include the following:

# The provider for the AI models to use.
MODEL_PROVIDER=ollama

# The name of LLM model to use.
MODEL=llama3.2:latest

# Name of the embedding model to use.
EMBEDDING_MODEL=nomic-embed-text:latest

Make sure the model names match those available in your local Ollama installation.
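
If you'd like to double-check from code that these names match what Ollama actually has installed, a throwaway script like the sketch below works. It's my own helper (not part of the generated app) and assumes Ollama's default API endpoint at http://localhost:11434 plus Node 18+ for the built-in fetch:

// check-models.ts: a hypothetical helper that lists the models Ollama has installed locally.
async function listLocalModels(): Promise<void> {
  // Ollama's local API exposes installed models at /api/tags.
  const res = await fetch("http://localhost:11434/api/tags");
  const data = (await res.json()) as { models: { name: string }[] };
  for (const m of data.models) {
    console.log(m.name); // expect llama3.2:latest and nomic-embed-text:latest here
  }
}

listLocalModels().catch(console.error);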


🧩 Step 5: Configure the Provider

Update app/api/chat/engine/provider.ts with the following code:

import { Settings } from "llamaindex";
import { Ollama, OllamaEmbedding } from "@llamaindex/ollama";

export function setupProvider() {
  Settings.llm = new Ollama({
    // Fall back to the chat model we pulled earlier if MODEL isn't set.
    model: process.env.MODEL ?? "llama3.2:latest",
    maxTokens: process.env.LLM_MAX_TOKENS
      ? Number(process.env.LLM_MAX_TOKENS)
      : undefined,
  });

  Settings.embedModel = new OllamaEmbedding({
    // Fall back to the embedding model we pulled earlier if EMBEDDING_MODEL isn't set.
    model: process.env.EMBEDDING_MODEL ?? "nomic-embed-text:latest",
    dimensions: process.env.EMBEDDING_DIM
      ? parseInt(process.env.EMBEDDING_DIM)
      : undefined,
  });
}

This code tells LlamaIndex to use Ollama both for LLM inference and embedding generation.
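
If you want a quick sanity check outside the Next.js routes, a small one-off script along these lines works too. This is just a sketch of mine, assuming the provider.ts path above, a running Ollama instance, and something like tsx to execute TypeScript directly:

// sanity-check.ts: a hypothetical standalone script, not part of the generated app.
import { Settings } from "llamaindex";
import { setupProvider } from "./app/api/chat/engine/provider";

async function main() {
  // Register the Ollama LLM and embedding model with LlamaIndex.
  setupProvider();

  // Ask the locally running llama3.2 model a trivial question.
  const response = await Settings.llm.complete({
    prompt: "In one sentence, what is LlamaIndex?",
  });
  console.log(response.text);
}

main().catch(console.error);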


🧪 Step 6: Generate and Run the App

Now it’s time to generate the vector index from the sample data and run your app:

npm run generate
npm run dev

And that’s it! 🎉 Open http://localhost:3000 in your browser and you’ll have a toy project running LlamaIndex on a Next.js backend, powered by local LLMs via Ollama.
