Bluesky Dataset by Keyword for Sentiment & NLP
Researchers collect thousands of keyword-matched Bluesky posts as a clean dataset for sentiment analysis, text labeling, and NLP model training.
How it works
1. Open it on Apify
   Hit Run on Apify. It opens the tool in the cloud, with no install.
2. Set the inputs
   Adjust searchQuery, authorHandles, and maxItems (sensible defaults are pre-filled).
3. Click Run
   The tool runs on Apify's cloud and collects the data for you.
4. Export the results
   Download as JSON, CSV or Excel, or pipe straight into your app, Google Sheets, or an AI agent.
Inputs
| Field | What it does | Type |
|---|---|---|
| searchQuery | Keyword(s) to search Bluesky posts for (e.g. "artificial intelligence"). Leave empty if you instead want to scrape specific authors via Author handles. | string |
| authorHandles | Bluesky handles to scrape (e.g. bsky.app, jay.bsky.team). For each handle the actor returns the author's profile plus their recent posts. The leading @ is optional. Leave empty if using a Search query instead. | array |
| maxItems | Maximum number of posts to return per search query or per author handle. Pagination follows the API cursor until this limit is reached. | integer |
| notionConnector | Optional. Write each post as a page into your Notion when the run finishes. Authorize a Notion connector once in Settings → API & Integrations → MCP connectors, then pick it here. Leave empty to skip (default). Results are always saved to the dataset regardless. | string |
| notionParentId | Optional. The Notion data source ID of the database to write into (only used if a Notion connector is set). Leave empty to create the pages privately in your workspace instead. | string |
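
As a rough illustration of how these fields fit together, the sketch below assembles an input object in Python using the field names from the table. The helper name `build_run_input`, the default of 100 for `maxItems`, and the local validation rules are assumptions made for this example and are not part of the actor. The resulting JSON should be usable wherever the actor accepts JSON input.

```python
import json


def build_run_input(search_query="", author_handles=None, max_items=100):
    """Build an input object with the field names from the Inputs table.

    Hypothetical helper: the default max_items and the checks below are
    assumptions for this sketch, not documented actor behaviour.
    """
    # The leading @ is optional, so strip it to keep handles uniform.
    handles = [h.strip().lstrip("@") for h in (author_handles or []) if h.strip()]
    query = search_query.strip()

    # Local guard: a run with neither a query nor handles has nothing to fetch.
    if not query and not handles:
        raise ValueError("Provide a searchQuery or at least one author handle.")
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer.")

    run_input = {"maxItems": int(max_items)}
    if query:
        run_input["searchQuery"] = query
    if handles:
        run_input["authorHandles"] = handles
    return run_input


if __name__ == "__main__":
    example = build_run_input("artificial intelligence", max_items=500)
    print(json.dumps(example, indent=2))
```

The Notion fields are left out because they are optional and results are saved to the dataset either way.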
What you get
A structured dataset — each result includes fields like:
- authorHandles
- details
- searchQuery

Export every run as JSON, CSV or Excel, or send it to your app, a database, Google Sheets, or an AI agent.
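
For labeling or sentiment work, a common first step is to bucket the exported posts by the keyword that matched them. The sketch below assumes the JSON export is a list of objects and that each one carries the `searchQuery` field listed above. The function names and the `"(no query)"` bucket are hypothetical. The `details` field is left untouched because its structure is not described here.

```python
import json
from collections import defaultdict


def load_export(path):
    """Load a JSON export, assumed to be a list of result objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def group_by_query(items):
    """Bucket results by the searchQuery value that produced them."""
    groups = defaultdict(list)
    for item in items:
        # Results collected via author handles may have no query value.
        key = item.get("searchQuery") or "(no query)"
        groups[key].append(item)
    return dict(groups)


if __name__ == "__main__":
    # Tiny in-memory sample standing in for a real export file.
    sample = [
        {"searchQuery": "artificial intelligence", "details": {}},
        {"searchQuery": "artificial intelligence", "details": {}},
        {"authorHandles": ["bsky.app"], "details": {}},
    ]
    for query, posts in group_by_query(sample).items():
        print(query, len(posts))
```

Each bucket can then go to a separate labeling queue or model evaluation set.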
More use cases for Bluesky Scraper
Bluesky Brand Monitoring: Track Mentions & Engagement
Social teams can see who's posting about a brand on Bluesky, with each post's author, like count, and repost count for real-time mention tracking.
Scrape Multiple Bluesky Accounts: Posts + Profiles
Feed a list of Bluesky handles and export every account's recent posts and profile into one dataset, ready for a competitor or creator roundup.