Web Loaders
These loaders load resources from the web. They do not read from the local file system.
If you'd like to write your own document loader, see this how-to. If you'd like to contribute an integration, see Contributing integrations.
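As a rough illustration of what a web loader does, the sketch below fetches a page through an injected function and wraps the extracted text in a document object. Everything here is an assumption made for illustration: `WebDocument`, `FetchText`, `htmlToText`, and `SimpleWebLoader` are hypothetical names, not LangChain's API, and the tag stripping is a naive regex pass rather than a real HTML parser.

```typescript
// Hypothetical document shape for illustration; LangChain's own Document
// class is richer than this.
interface WebDocument {
  pageContent: string;
  metadata: { source: string };
}

// Any function that resolves a URL to its raw body. Injecting it keeps the
// loader independent of the runtime (global fetch, a headless browser, a mock).
type FetchText = (url: string) => Promise<string>;

// Drop <script>/<style> blocks, strip remaining tags, collapse whitespace.
// This is a naive regex pass, not a substitute for an HTML parser.
function htmlToText(html: string): string {
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical loader: one URL in, one document out.
class SimpleWebLoader {
  private readonly url: string;
  private readonly fetchText: FetchText;

  constructor(url: string, fetchText: FetchText) {
    this.url = url;
    this.fetchText = fetchText;
  }

  async load(): Promise<WebDocument[]> {
    const html = await this.fetchText(this.url);
    return [{ pageContent: htmlToText(html), metadata: { source: this.url } }];
  }
}
```

On a runtime with a global `fetch`, the second constructor argument could be a small wrapper that requests the URL and returns the response body as text. Passing a stub instead makes the loader easy to test without network access.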
All web loaders
Name | Description |
---|---|
Playwright | Only available on Node.js. |
Apify Dataset | This guide shows how to use Apify with LangChain to load documents fr... |
AssemblyAI Audio Transcript | This covers how to load audio (and video) transcripts as document obj... |
Azure Blob Storage Container | Only available on Node.js. |
Azure Blob Storage File | Only available on Node.js. |
Browserbase Loader | Description |
College Confidential | This example goes over how to load data from the College Confidential... |
Confluence | Only available on Node.js. |
Couchbase | Couchbase is an award-winning distributed NoSQL cloud database that d... |
Figma | This example goes over how to load data from a Figma file. |
FireCrawl | This notebook provides a quick overview for getting started with... |
GitBook | This example goes over how to load data from any GitBook, using Cheer... |
GitHub | This example goes over how to load data from a GitHub repository. |
Hacker News | This example goes over how to load data from the Hacker News website,... |
IMSDB | This example goes over how to load data from the Internet Movie Scrip... |
Notion API | This guide will take you through the steps required to load documents... |
PDF files | This notebook provides a quick overview for getting started with... |
RecursiveUrlLoader | This notebook provides a quick overview for getting started with... |
S3 File | Only available on Node.js. |
SearchApi Loader | This guide shows how to use SearchApi with LangChain to load web sear... |
SerpAPI Loader | This guide shows how to use SerpAPI with LangChain to load web search... |
Sitemap Loader | This notebook goes over how to use the SitemapLoader class to load si... |
Sonix Audio | Only available on Node.js. |
Blockchain Data | This example shows how to load blockchain data, including NFT metadat... |
Spider | Spider is the fastest crawler. It converts any website into pure HTML... |
Taskade | Taskade is the ultimate tool for AI-driven writing, project managemen... |
Cheerio | This notebook provides a quick overview for getting started with... |
Puppeteer | This notebook provides a quick overview for getting started with... |
YouTube transcripts | This covers how to load YouTube transcripts into LangChain documents. |