PDF Proposal Knowledge Base with S3, OpenAI GPT-4o & Qdrant RAG Agent

by Joe Swink • Last updated a month ago • Source: n8n.io

Tags

AI RAG, Multimodal AI

Getting Started

This template has a two-part setup:

Ingest PDF files from S3, extract text, chunk, embed with OpenAI embeddings, and index into a Qdrant collection with metadata.
Provide a chat entry point whose Agent, backed by an OpenAI chat model, queries the same Qdrant collection as a tool and answers questions about the indexed proposals.
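
The two parts can be pictured with a small, dependency-free sketch. Everything in it is an illustrative assumption rather than the template's implementation: toy_embed is a bag-of-words stand-in for OpenAI embeddings, InMemoryIndex is a stand-in for a Qdrant collection, and the sample texts and file names are made up.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Stand-in for a vector collection: stores (vector, text, metadata)."""

    def __init__(self):
        self.points = []

    def upsert(self, text: str, metadata: dict) -> None:
        self.points.append((toy_embed(text), text, metadata))

    def search(self, query: str, top_k: int = 2):
        q = toy_embed(query)
        ranked = sorted(self.points, key=lambda p: cosine(q, p[0]), reverse=True)
        return [(text, meta) for _, text, meta in ranked[:top_k]]


# Part 1: ingestion (hypothetical chunks and source names)
index = InMemoryIndex()
index.upsert("Pricing is fixed fee with monthly invoicing", {"source": "proposal-a.pdf"})
index.upsert("The team includes two senior engineers", {"source": "proposal-b.pdf"})

# Part 2: retrieval, the step the Agent would call as a tool
hits = index.search("What is the pricing model?", top_k=1)
```

In the real workflow the retrieved chunks would be passed to the chat model as grounding context; here hits simply holds the best-matching text and its metadata.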

What it does

Lists objects in an S3 bucket, loops through keys, downloads each file, and extracts text from PDFs.
Chunks text and loads it into Qdrant with metadata for retrieval.
Exposes a chat trigger wired to an Agent using an OpenAI chat model.
Adds a Qdrant node in retrieve-as-tool mode so the Agent can ground answers in the indexed corpus.
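
As a rough illustration of the chunk-and-attach-metadata step, here is a minimal sketch assuming fixed-size character chunks with overlap. The template's actual splitter and its settings are not stated, so chunk_size, overlap, and the metadata field names (source, chunk_index) are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def to_points(s3_key: str, text: str, **chunk_kwargs) -> list[dict]:
    """Pair each chunk with metadata that points back to its source object."""
    return [
        {"text": chunk, "metadata": {"source": s3_key, "chunk_index": i}}
        for i, chunk in enumerate(chunk_text(text, **chunk_kwargs))
    ]


points = to_points("proposals/example.pdf", "abcdefghij", chunk_size=4, overlap=1)
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of indexing some text twice.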

Why it is useful

Simple pattern for building a proposal or knowledge base from PDFs stored in S3.
End-to-end path from ingestion to retrieval-augmented answers.
Easy to swap models or collections, and to extend with more tools.

Setup notes

Attach your own AWS credentials to the two S3 nodes and set your bucket name.
Attach your Qdrant credentials to both Qdrant nodes and set your collection.
Attach your OpenAI credentials to the embedding and chat nodes.
The sanitized template uses placeholders for bucket and collection names.
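
Before running the workflow it can help to confirm that no placeholder survived the setup. The sketch below is one way to express that check outside n8n. The placeholder hints are guesses, since the template's actual placeholder text is not shown here, and TemplateSettings is a hypothetical container, not something the template defines.

```python
from dataclasses import dataclass, fields

# Hypothetical markers; adjust to whatever placeholder text the template really uses.
PLACEHOLDER_HINTS = ("your_", "your-", "<", "placeholder")


@dataclass(frozen=True)
class TemplateSettings:
    s3_bucket: str
    qdrant_collection: str


def unresolved_fields(settings: TemplateSettings) -> list[str]:
    """Return the names of settings that are empty or still look like placeholders."""
    bad = []
    for f in fields(settings):
        value = getattr(settings, f.name).strip().lower()
        if not value or any(hint in value for hint in PLACEHOLDER_HINTS):
            bad.append(f.name)
    return bad


problems = unresolved_fields(TemplateSettings("YOUR_BUCKET_NAME", "proposals"))
```

An empty list means both names were filled in; a non-empty list names the settings that still need attention.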
