Use cases
- Monitor a Google Drive folder and parse PDF, DOCX and image files into a destination folder, ready for further processing (e.g. RAG ingestion, translation, etc.).
- Keep a processing log in Google Sheets and send Slack notifications.
How it works
- Trigger: Watch a Google Drive folder for new and updated files.
- Create a uniquely named destination folder and copy the input file into it.
- Parse the file using Mistral Document, extracting content and handling non-OCRable images separately.
- Save the data returned by Mistral Document into the destination Google Drive folder (raw JSON file, Markdown files, and images) for further processing.
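The folder-naming and saving steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the workflow's actual implementation: it writes to a local directory instead of Google Drive, the function names `unique_folder_name` and `save_ocr_result` are hypothetical, and the result is assumed to be a dictionary with a `pages` list whose entries carry `index`, `markdown` and `images` (each image with an `id` and an `image_base64` string).

```python
import base64
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path


def unique_folder_name(file_name: str) -> str:
    """Build a collision-resistant folder name from the input file name (hypothetical scheme)."""
    stem = Path(file_name).stem
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{stem}_{stamp}_{uuid.uuid4().hex[:8]}"


def save_ocr_result(result: dict, output_root: Path, file_name: str) -> Path:
    """Write the raw JSON, one Markdown file per page, and any returned images."""
    dest = output_root / unique_folder_name(file_name)
    dest.mkdir(parents=True)

    # Keep the untouched response for debugging or later reprocessing.
    (dest / "raw.json").write_text(json.dumps(result, indent=2), encoding="utf-8")

    # Assumed response shape: {"pages": [{"index", "markdown", "images": [...]}]}
    for page in result.get("pages", []):
        index = page.get("index", 0)
        (dest / f"page_{index:03d}.md").write_text(
            page.get("markdown", ""), encoding="utf-8"
        )
        for image in page.get("images", []):
            data = image.get("image_base64") or ""
            # Strip an optional "data:image/...;base64," prefix before decoding.
            payload = data.split(",", 1)[1] if data.startswith("data:") else data
            if payload:
                # Use only the final path component of the id as the file name.
                (dest / Path(image["id"]).name).write_bytes(base64.b64decode(payload))
    return dest
```

Putting a timestamp and a short random suffix in the folder name means that reprocessing an updated file creates a new folder rather than overwriting earlier output.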
How to use
- Google Drive and Google Sheets nodes:
- Create Google credentials with access to Google Drive and Google Sheets. Read more about Google Credentials.
- Update all Google Drive and Google Sheets nodes (14 nodes total) to use these credentials.
- Mistral node:
- Slack nodes:
- Create Slack OAuth2 credentials. Read more about Slack OAuth2 credentials.
- Update the two Slack nodes, Send Success Message and Send Error Message:
- Set the credentials
- Select the channel where you want to send the notifications (channels can be different for success and errors).
- Create a Google Sheets spreadsheet following the steps in Google Sheets Configuration. Ensure the spreadsheet can be accessed as Editor by the account used by the Google Credentials above.
- Create a directory for input files and a directory for output folders/files. Ensure the directories can be accessed by the account used by the Google Credentials.
- Update the File Created, File Updated and Workflow Configuration nodes following the steps in the green Notes.
Requirements
- Google account with Google API access
- Mistral Cloud account with access to a Mistral API key.
- Slack account with access to a Slack app's client ID and client secret.
- Basic n8n knowledge: understanding of triggers, expressions, and credential management
Who’s it for
Anyone building a data pipeline ingesting files to be OCRed for further processing.
🔒 Security
All credentials are stored as n8n credentials. The only information stored in this workflow that could be considered sensitive is the Google Drive directory IDs and the Google Sheets spreadsheet ID. These directories and the spreadsheet should be secured according to your needs.
Need Help?
Reach out on LinkedIn or ask in the Forum!