This n8n workflow automates the process of fetching, processing, and storing tech news articles from RSS feeds into a Notion database. It retrieves articles from The Verge and TechCrunch, processes them to avoid duplicates, extracts full article content, generates summaries using an LLM, and stores the data in Notion. The workflow is designed to run on a schedule or manually for testing, with sticky notes providing clear documentation for each step.
Data in notion
Triggers :
When clicking ‘Execute workflow’
).Schedule Trigger
, disabled).Fetch Feeds :
The Verge
) and TechCrunch (TechCrunch
).Hash Creation :
Crypto
, Crypto1
) to identify unique articles efficiently.Loop Over Articles :
Loop Over Items
, Loop Over Items1
) to handle multiple articles from each feed.Duplicate Check :
Get many database pages
, Get many database pages1
) to check if an article’s hash exists. If it does, the article is skipped (If
, If1
).Fetch Full Article :
HTTP Request
, HTTP Request1
).Extract Content :
HTML
, HTML1
) using specific CSS selectors (.duet--article--article-body-component p
for The Verge, .entry-content p
for TechCrunch).Clean Data :
Code in JavaScript
, Code in JavaScript1
) processes the extracted content by removing empty paragraphs, links, and excessive whitespace, then joins paragraphs into a single string.Summarize Article :
OpenAI Chat Model
, OpenAI Chat Model1
) with a LangChain node (Basic LLM Chain
, Basic LLM Chain1
) to generate a concise summary (max 1500 characters) in plain text, focusing on main arguments or updates.Store in Notion :
Create a database page
, Create a database page1
) with fields for title, summary, date, hash, URL, source, digest status, and full article text.Credentials :