The addLoader() method loads data from different data sources into a RAG pipeline. You can find the signature below:

Parameters

loaderParam
string
required

The data to embed. It can be a URL, a local file, or raw content, depending on the data type. You can find the full list of supported data sources here.
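
Because a URL source is passed as a plain string, a typo such as a stray quote character is easy to miss until the loader fails. The following is a minimal sketch of a pre-check, assuming a hypothetical helper named isHttpUrl that is not part of embedJs and relies only on the standard URL constructor:

```typescript
// Hypothetical helper (not an embedJs API): returns true only when the
// string parses as an absolute http or https URL.
function isHttpUrl(value: string): boolean {
    try {
        const parsed = new URL(value);
        return parsed.protocol === 'http:' || parsed.protocol === 'https:';
    } catch {
        // The URL constructor throws a TypeError on unparseable input,
        // for example raw HTML content or a URL with a leading quote.
        return false;
    }
}
```

A check like this could run before constructing a URL-based loader. Raw content and local file paths would return false, so it only suits inputs that are expected to be web addresses.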

Usage

Load data from webpage

Code example
import { RAGApplicationBuilder, SIMPLE_MODELS } from '@llm-tools/embedjs';
import { OpenAiEmbeddings } from '@llm-tools/embedjs-openai';
import { HNSWDb } from '@llm-tools/embedjs-hnswlib';
import { WebLoader } from '@llm-tools/embedjs-loader-web';

// Build the RAG application with a model, an embedding model and a vector database
const app = await new RAGApplicationBuilder()
    .setModel(SIMPLE_MODELS.OPENAI_GPT4_O)
    .setEmbeddingModel(new OpenAiEmbeddings())
    .setVectorDatabase(new HNSWDb())
    .build();

// Load a single web page into the pipeline
await app.addLoader(new WebLoader({ urlOrContent: 'https://www.forbes.com/profile/elon-musk' }));
// Add loader completed with 4 new entries for 6c8d1a7b-ea34-4927-8823-xba29dcfc5ac

Load data from sitemap

Code example
import { RAGApplicationBuilder, SIMPLE_MODELS } from '@llm-tools/embedjs';
import { OpenAiEmbeddings } from '@llm-tools/embedjs-openai';
import { HNSWDb } from '@llm-tools/embedjs-hnswlib';
import { SitemapLoader } from '@llm-tools/embedjs-loader-sitemap';

// Build the RAG application with a model, an embedding model and a vector database
const app = await new RAGApplicationBuilder()
    .setModel(SIMPLE_MODELS.OPENAI_GPT4_O)
    .setEmbeddingModel(new OpenAiEmbeddings())
    .setVectorDatabase(new HNSWDb())
    .build();

// Load every page listed in the sitemap into the pipeline
await app.addLoader(new SitemapLoader({ url: 'https://js.langchain.com/sitemap.xml' }));
// Add loader completed with 11024 new entries for 6c8d1a7b-ea34-4927-8823-xba29dcfc5ad
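
When several sources go into the same application, the addLoader() calls can be wrapped in a small helper. The sketch below is illustrative only: addLoadersSequentially, LoaderTarget and LoadResult are hypothetical names, and the result fields entriesAdded and uniqueId are an assumption inferred from the log lines above, not a confirmed embedJs return type.

```typescript
// Assumed result shape (inferred from the "N new entries for <id>" log line);
// the real embedJs return type may differ.
interface LoadResult {
    entriesAdded: number;
    uniqueId: string;
}

// Minimal structural view of an application exposing addLoader().
interface LoaderTarget<L> {
    addLoader(loader: L): Promise<LoadResult>;
}

// Hypothetical helper: adds loaders one at a time and totals the new entries.
async function addLoadersSequentially<L>(
    app: LoaderTarget<L>,
    loaders: L[],
): Promise<{ totalEntries: number; uniqueIds: string[] }> {
    let totalEntries = 0;
    const uniqueIds: string[] = [];
    for (const loader of loaders) {
        // Await each call before starting the next so loads do not overlap
        const result = await app.addLoader(loader);
        totalEntries += result.entriesAdded;
        uniqueIds.push(result.uniqueId);
    }
    return { totalEntries, uniqueIds };
}
```

Loading sequentially keeps the order of the returned IDs aligned with the order of the loaders. Whether concurrent loading would be safe depends on the vector database in use, which this sketch does not assume.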

You can find the complete list of supported data sources here.