You can load any data from your own custom data source by implementing the BaseLoader interface. For example -

class CustomLoader extends BaseLoader<{ customChunkMetadata: string }> {

    constructor() {

        super('uniqueId');

    }



    async *getChunks() {

        throw new Error('Method not implemented.');

    }

}

Customizing the chunk size and overlap

If you want to customize the chunk size or the chunk overlap on an existing loader, all built-in loaders take an additional parameter chunkSize and chunkOverlap in their constructor. For example -

import { RAGApplicationBuilder } from '@llm-tools/embedjs';

import { OpenAiEmbeddings } from '@llm-tools/embedjs-openai';

import { HNSWDb } from '@llm-tools/embedjs-hnswlib';

import { DocxLoader } from '@llm-tools/embedjs-loader-msoffice';



const app = await new RAGApplicationBuilder()

.setModel(SIMPLE_MODELS.OPENAI_GPT4_O)

.setEmbeddingModel(new OpenAiEmbeddings())

.setVectorDatabase(new HNSWDb())

.build();



app.addLoader(new DocxLoader({ filePathOrUrl: '...', chunkOverlap: 100, chunkSize: 20 }))