This n8n workflow converts PDF invoices into structured, ready-to-use JSON using AI and an XML transformation, without writing any code.
Upload form → The user uploads a PDF file.
Text extraction → The PDF content is extracted as plain text.
XML schema definition → A standard invoice structure is defined with fields such as:
AI (Gemini) → The model rewrites the PDF text as valid XML that follows the predefined schema.
XML cleanup → Extra tags, line breaks, and unnecessary formatting are removed.
JSON conversion → The XML is transformed into a clean, structured JSON object, ready for integrations, APIs, or storage.
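
The workflow itself handles these steps with n8n nodes, so no code is required. For readers who want to see what the last two steps (XML cleanup and JSON conversion) amount to, here is a minimal Python sketch of the same logic outside n8n. It is an illustration under assumptions, not the workflow's actual implementation: the function names (`clean_xml`, `element_to_value`, `xml_to_json`) and the sample field names (`invoice`, `number`, `items`, `item`) are hypothetical, and the cleanup rules assume the model wraps its answer in a Markdown code fence with some surrounding text.

```python
import json
import re
import xml.etree.ElementTree as ET


def clean_xml(raw: str) -> str:
    """Strip code fences, surrounding text, and inter-tag whitespace."""
    # Remove Markdown fence markers such as ```xml and ``` (assumed model habit).
    text = re.sub(r"```(?:xml)?", "", raw)
    # Keep only the span from the first '<' to the last '>'.
    start, end = text.find("<"), text.rfind(">")
    if start == -1 or end == -1:
        raise ValueError("no XML found in model output")
    text = text[start:end + 1]
    # Drop a leading XML declaration, if any.
    text = re.sub(r"^<\?xml[^>]*\?>", "", text)
    # Collapse line breaks and indentation between tags.
    text = re.sub(r">\s+<", "><", text)
    return text.strip()


def element_to_value(elem: ET.Element):
    """Convert an element to a string (leaf) or a dict (has children)."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_value(child)
        if child.tag in result:
            # Repeated tags (e.g. several line items) become a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(raw: str) -> str:
    """Clean the model output, parse it, and serialize it as JSON."""
    root = ET.fromstring(clean_xml(raw))
    return json.dumps({root.tag: element_to_value(root)}, indent=2)


if __name__ == "__main__":
    # Hypothetical model reply; the field names are placeholders only.
    reply = (
        "Here is the invoice:\n```xml\n<?xml version=\"1.0\"?>\n"
        "<invoice>\n  <number>INV-001</number>\n  <items>\n"
        "    <item><description>Widget</description></item>\n"
        "    <item><description>Gadget</description></item>\n"
        "  </items>\n</invoice>\n```"
    )
    print(xml_to_json(reply))
```

Note that attributes are ignored and all values stay strings in this sketch; a real pipeline would likely need type conversion for amounts and dates, and validation against the schema before trusting the output.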