In one of my previous blog posts, I introduced LlamaIndex—a powerful framework for building LLM applications. In this post, I’d like to take it a step further by creating a toy project using a Next.js backend paired with the Ollama provider for running local LLMs.
Let’s get started.
To follow along, make sure you have the following installed:
nvm for Node.js version management
I highly recommend using nvm to manage multiple versions of Node.js easily. It's a lifesaver for projects with varying version requirements.
Download and install Ollama from their official site:
👉 https://ollama.com/download
And pull these models:
ollama pull llama3.2
ollama pull nomic-embed-text
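Before going further, it's worth checking that Ollama is running and that both models were pulled successfully. You can simply run ollama list, or use a small script like the sketch below, which assumes Ollama's default local endpoint (http://localhost:11434) and its /api/tags listing route:

// check-ollama.ts (hypothetical file name) -- list the models Ollama has locally.
async function listLocalModels(): Promise<string[]> {
  const res = await fetch("http://localhost:11434/api/tags");
  if (!res.ok) {
    throw new Error(`Ollama is not reachable: HTTP ${res.status}`);
  }
  const data = (await res.json()) as { models: { name: string }[] };
  return data.models.map((m) => m.name);
}

listLocalModels()
  .then((names) => console.log("Local models:", names.join(", ")))
  .catch((err) => console.error(err));

If the request fails, start the Ollama app (or run ollama serve) before continuing.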
We’ll start by creating a LlamaIndex-powered app:
npx create-llama@latest
You will be asked for the name of your project, along with other configuration options, something like this:
npm create llama@latest
Need to install the following packages:
create-llama@latest
Ok to proceed? (y) y
✔ What is your project named? … my-app
✔ What app do you want to build? › Agentic RAG
✔ What language do you want to use? › Next.js
✔ Do you want to use LlamaCloud services? … No / Yes
✔ Please provide your LlamaCloud API key (leave blank to skip): …
✔ Please provide your OpenAI API key (leave blank to skip): …
? How would you like to proceed? › - Use arrow-keys. Return to submit.
Just generate code (~1 sec)
Start in VSCode (~1 sec)
❯ Generate code and install dependencies (~2 min)
To integrate Ollama, we need to install the official LlamaIndex Ollama bindings:
npm install @llamaindex/ollama
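Before wiring it into the generated app, you can smoke-test the binding in isolation. The sketch below is not part of the template; it assumes the complete() and getTextEmbedding() methods exposed by LlamaIndex's LLM and embedding classes:

// ollama-check.ts (hypothetical file name) -- verify the @llamaindex/ollama binding works.
import { Ollama, OllamaEmbedding } from "@llamaindex/ollama";

async function main() {
  // Ask the local llama3.2 model for a short completion.
  const llm = new Ollama({ model: "llama3.2:latest" });
  const completion = await llm.complete({ prompt: "Say hello in one short sentence." });
  console.log(completion.text);

  // Embed a string with nomic-embed-text and report the vector size.
  const embedder = new OllamaEmbedding({ model: "nomic-embed-text:latest" });
  const vector = await embedder.getTextEmbedding("hello world");
  console.log(`Embedding dimension: ${vector.length}`);
}

main().catch(console.error);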
Edit your .env file to include the following:
# The provider for the AI models to use.
MODEL_PROVIDER=ollama
# The name of the LLM model to use.
MODEL=llama3.2:latest
# Name of the embedding model to use.
EMBEDDING_MODEL=nomic-embed-text:latest
Make sure the model names match those available in your local Ollama installation.
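If you want the app to fail fast when one of these variables is missing or misspelled, a small guard along these lines can help. This is a hypothetical helper, not something create-llama generates:

// env-check.ts (hypothetical helper) -- throw at startup if required variables are missing.
const required = ["MODEL_PROVIDER", "MODEL", "EMBEDDING_MODEL"] as const;

export function assertEnv(): void {
  const missing = required.filter((key) => !process.env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}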
Update app/api/chat/engine/provider.ts with the following code:
import { Settings } from "llamaindex";
import { Ollama, OllamaEmbedding } from "@llamaindex/ollama";

export function setupProvider() {
  // Use the local Ollama server for LLM inference.
  Settings.llm = new Ollama({
    model: process.env.MODEL ?? "llama3.2:latest",
    maxTokens: process.env.LLM_MAX_TOKENS
      ? Number(process.env.LLM_MAX_TOKENS)
      : undefined,
  });
  // ...and for embedding generation.
  Settings.embedModel = new OllamaEmbedding({
    model: process.env.EMBEDDING_MODEL ?? "nomic-embed-text:latest",
    dimensions: process.env.EMBEDDING_DIM
      ? parseInt(process.env.EMBEDDING_DIM, 10)
      : undefined,
  });
}
This code tells LlamaIndex to use Ollama both for LLM inference and embedding generation.
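Because Settings acts as LlamaIndex's global configuration object, everything downstream (index construction, retrieval, chat) picks up these models automatically. As a rough illustration, here is a sketch that is not part of the generated template and assumes LlamaIndex's standard VectorStoreIndex and query-engine API:

import { Document, VectorStoreIndex } from "llamaindex";
import { setupProvider } from "./provider";

async function demo() {
  // Registers the Ollama LLM and embedding model on Settings.
  setupProvider();

  // Build a tiny in-memory index; embeddings come from nomic-embed-text.
  const index = await VectorStoreIndex.fromDocuments([
    new Document({
      text: "LlamaIndex is a framework for building LLM applications.",
    }),
  ]);

  // Answers are generated by the local llama3.2 model.
  const queryEngine = index.asQueryEngine();
  const response = await queryEngine.query({ query: "What is LlamaIndex?" });
  console.log(response.toString());
}

demo().catch(console.error);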
Now generate the embeddings for the example data and start the dev server:
npm run generate
npm run dev
And that’s it! 🎉 You now have a toy project running LlamaIndex on a Next.js backend, powered by local LLMs via Ollama.