LlamaParseReader with TypeScript and Express.js

Parsing documents like PDFs can be a real pain. Thankfully, LlamaIndex has you covered. LlamaParseReader uses LlamaCloud to give you a simple way to parse documents and prepare them for use in your RAG applications.

I'll assume that if you are reading this article, you have already worked with LlamaIndex. This guide focuses on the document parsing that LlamaCloud offers.

If not, there is a phenomenal free video course on DeepLearning.ai, which you can watch here.

Before we start, make sure you have an API key from LlamaIndex Cloud. We will need it later, stored as LLAMA_CLOUD_API_KEY in your .env file. We also use OpenAI as our LLM, so you will need an OpenAI API key as well. You can get a key for that here.

This is an Express.js example, but the core logic can be lifted into any Node.js environment.

Install Dependencies

First, install the necessary packages:

npm install express llamaindex dotenv
npm install --save-dev @types/express typescript

Set Up .env File

Create a .env file in the root of your project and add your API keys:

LLAMA_CLOUD_API_KEY=your_api_key_here
OPENAI_API_KEY=your_api_key_here
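
A missing key tends to surface later as a confusing API error, so it can help to check the environment once at startup. The sketch below is one way to do that; findMissingEnv and assertEnv are hypothetical helper names made up for this example, not part of dotenv or llamaindex.

```typescript
// Hypothetical helpers for illustration; they are not part of dotenv or llamaindex.

// Returns the names of required variables that are missing or blank.
function findMissingEnv(
  required: readonly string[],
  env: Record<string, string | undefined>
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throws one error that lists every missing variable, so a bad .env fails fast.
function assertEnv(
  required: readonly string[],
  env: Record<string, string | undefined>
): void {
  const missing = findMissingEnv(required, env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In server.ts this could be called right after config(), for example assertEnv(["LLAMA_CLOUD_API_KEY", "OPENAI_API_KEY"], process.env).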

Write Some Code

Create a new TypeScript file, for example, server.ts. Below is the complete code to use LlamaParseReader with Express.js:

import express, { Request, Response } from "express";
import { config } from "dotenv";
import { VectorStoreIndex, OpenAI, Settings } from "llamaindex";
import { LlamaParseReader } from "llamaindex/readers/LlamaParseReader";

// Load environment variables from .env file
config();

// Set up the LLM. I tend to use gpt-3.5-turbo because it's cheap and has worked
// well for every use case I've thrown at it with local docs.
Settings.llm = new OpenAI({ model: "gpt-3.5-turbo" });

const app = express();
const port = process.env.PORT || 3000;

app.use(express.json());

app.post("/query", async (req: Request, res: Response) => {
  try {
    const query = req.body?.query;

    // A missing or empty query is a client error: answer with 400 and stop here
    if (typeof query !== "string" || query.trim() === "") {
      res.status(400).json({ error: "A non-empty query string is required." });
      return;
    }

    // Initialize the LlamaParseReader
    const reader = new LlamaParseReader({ resultType: "markdown" });

    // Load and parse the document
    const documents = await reader.loadData(
      "./src/data/writing-effectively.pdf"
    );

    // Create embeddings and store them in a VectorStoreIndex
    const index = await VectorStoreIndex.fromDocuments(documents);

    // Create a query engine
    const queryEngine = index.asQueryEngine();

    // Query the document using the query engine
    const { response } = await queryEngine.query({ query });

    // Return the response
    res.json({ response });
  } catch (err) {
    // Failures past validation (parsing, embedding, the LLM call) are server-side errors
    console.error(err);
    res.status(500).send("Something went wrong.");
  }
});

app.listen(port, () => {
  console.log(`Server is running on port ${port}`);
});
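
One thing to be aware of: the handler above parses the PDF and rebuilds the index on every request. A possible refinement is to build the index once and reuse it. The sketch below shows the general idea with a small promise-caching helper; memoizeAsync is a hypothetical name invented for this example, and the commented usage reuses only the calls already shown above.

```typescript
// Hypothetical helper for illustration: wraps an async factory so that it runs
// at most once and every caller shares the same promise.
function memoizeAsync<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = factory().catch((err) => {
        // Forget a failed attempt so the next call can try again.
        cached = undefined;
        throw err;
      });
    }
    return cached;
  };
}

// Possible usage in server.ts (assumes the imports from the listing above):
//
// const getIndex = memoizeAsync(async () => {
//   const reader = new LlamaParseReader({ resultType: "markdown" });
//   const documents = await reader.loadData("./src/data/writing-effectively.pdf");
//   return VectorStoreIndex.fromDocuments(documents);
// });
//
// Inside the route handler:
//   const index = await getIndex();
//   const queryEngine = index.asQueryEngine();
```

Caching the promise rather than the resolved value means that two requests arriving while the first parse is still running will wait on the same work instead of starting a second parse.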

Compile and Run the Code

Compile the project with the TypeScript compiler (installed earlier as a dev dependency) and run the compiled output:

npx tsc && node dist/server.js

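Make sure tsconfig.json matches that command: outDir should be dist so that server.ts compiles to dist/server.js, and esModuleInterop should be enabled so that the default import of express compiles.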
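
Once the server is running, the /query endpoint can be tried out with a small client. This is a minimal sketch that assumes Node.js 18 or later (for the built-in fetch) and the default port 3000 from the server code; buildQueryRequest and askServer are hypothetical names used only for this example.

```typescript
// Builds the fetch options for a POST to /query. The body shape ({ query })
// matches what the route handler reads from req.body.
function buildQueryRequest(query: string) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  };
}

// Sends the question to the server and returns the text of the answer.
// Assumes the server is reachable at baseUrl and replies with { response }.
async function askServer(
  query: string,
  baseUrl = "http://localhost:3000"
): Promise<string> {
  const res = await fetch(`${baseUrl}/query`, buildQueryRequest(query));
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  const data = (await res.json()) as { response: string };
  return data.response;
}
```

The Content-Type header matters here: express.json() only parses bodies sent as application/json, so without it the handler would not see the query.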

Written by Niall Maher