Hacker News Scraper

Search Hacker News stories, Show HN, Ask HN, comments, or the front page by keyword and get clean JSON with points, author and comment links.

Developer & Research Tools

How it works

  1. Open it on Apify

     Hit Run on Apify: it opens the tool in the cloud, no install.

  2. Set the inputs

     Adjust query, tags, sortBy (sensible defaults are pre-filled).

  3. Click Run

     The tool runs on Apify’s cloud and collects the data for you.

  4. Export the results

     Download as JSON, CSV or Excel, or pipe straight into your app, Google Sheets, or an AI agent.

Inputs

| Field | What it does | Type |
| --- | --- | --- |
| query | Keywords to search Hacker News for (e.g. "openai", "rust async"). Leave empty to get the newest/front-page items for the chosen tag. | string |
| tags | Which kind of Hacker News items to return: stories, Show HN, Ask HN, comments, or current front page. | string |
| sortBy | Order results by Algolia relevance (best match) or by date (newest first). | string |
| minPoints | Only return items with at least this many points (upvotes). Leave empty for no minimum. Applies mainly to stories. | integer |
| maxItems | Maximum number of items to return. The actor paginates the API until it has this many (or runs out). | integer |
| notionConnector | Optional. Write each item as a page into your Notion when the run finishes. Authorize a Notion connector once in Settings → API & Integrations → MCP connectors, then pick it here. Leave empty to skip (default); results are always saved to the dataset regardless. | string |
| notionParentId | Optional. The Notion data source ID of the database to write into (only used if a Notion connector is set). Leave empty to create the pages privately in your workspace instead. | string |

What you get

A structured dataset — each result includes fields like:

author, createdAt, hnUrl, numComments, objectId, points, text, title, type, url

Export every run as JSON, CSV or Excel, or send it to your app, a database, Google Sheets, or an AI agent.

3 ready-to-run use cases

Search Hacker News Stories by Keyword: OpenAI & More

Search Hacker News for any keyword like OpenAI and get matching stories ranked by relevance, each with points, author, and a direct comment-thread link.

Scrape New Show HN Launches: Newest Products First

Founders and marketers tracking competitors get the latest Show HN posts, newest first, with product titles, points, and links to every fresh launch.

Hacker News Front Page Scraper: Live Top Stories

A snapshot of the current Hacker News front page: titles, points, authors, and comment counts for every story ranking right now, returned as structured JSON.

Hacker News Scraper

Scrape Hacker News via the public Algolia HN Search API: no API key, no login, and no anti-bot protection to work around. Search stories, Show HN, Ask HN, comments, or the current front page; sort by relevance or date; filter by minimum points.

What it does

  • Queries https://hn.algolia.com/api/v1/search (relevance) or https://hn.algolia.com/api/v1/search_by_date (newest first).
  • Paginates automatically until it has collected maxItems.
  • Returns clean, normalized JSON for each item (HTML stripped from story/comment text).
  • Deduplicates by Hacker News objectID.

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | "openai" | Keywords to search for. Empty = newest/front-page items for the chosen tag. |
| tags | select | story | story, show_hn, ask_hn, comment, or front_page. |
| sortBy | select | relevance | relevance or date (newest first). |
| minPoints | integer | (none) | Only items with at least this many points. |
| maxItems | integer | 50 | Max items to return (1–1000). |
| proxyConfiguration | proxy | off | Optional; the public API has no anti-bot protection, so no proxy is needed. Only enable it if you hit IP rate limits at very high volume. |

> front_page note: the front_page tag always returns the ~30 items currently on the HN front page (no keyword search). For topic searches use story/show_hn/ask_hn/comment instead.
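
As an illustration of how these inputs might map onto an Algolia request, here is a minimal sketch. The function name `build_request` is hypothetical and the mapping is an assumption about the actor's behaviour; `tags`, `query` and `numericFilters` are query parameters of the Algolia HN Search API.

```python
def build_request(inp):
    """Translate actor-style input into an endpoint name and query parameters."""
    sort_by = inp.get("sortBy", "relevance")
    endpoint = "search" if sort_by == "relevance" else "search_by_date"

    tags = inp.get("tags", "story")
    params = {"tags": tags}

    # front_page ignores keyword search, so the query is dropped for that tag.
    query = inp.get("query") or ""
    if query and tags != "front_page":
        params["query"] = query

    # Assumed mapping: minPoints becomes a numeric filter on the points field.
    min_points = inp.get("minPoints")
    if min_points is not None:
        params["numericFilters"] = f"points>={min_points}"

    return endpoint, params
```

For example, an input with `tags` set to `show_hn`, `sortBy` set to `date` and `minPoints` set to 100 would resolve to the `search_by_date` endpoint with a `points>=100` filter.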

Output

Each successful item:

{
  "ok": true,
  "objectId": "44159823",
  "type": "story",
  "title": "OpenAI ...",
  "url": "https://example.com/article",
  "author": "someuser",
  "points": 412,
  "numComments": 188,
  "createdAt": "2026-06-01T12:34:56.000Z",
  "text": "",
  "hnUrl": "https://news.ycombinator.com/item?id=44159823"
}

For comments, type is comment, title/url reflect the parent story, and text is the (HTML-stripped) comment body.

Some fields can be null or empty depending on the item: url is null for Ask HN and other text posts, which have no external link; points and numComments are often absent on comments; and text is an empty string for link stories. title, author, objectId and hnUrl are present for well-formed items but defensively default to null if the API omits them.
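
The normalization described above could look roughly like this. It is a sketch, not the actor's code: `strip_html` and `normalize` are hypothetical helpers, the regex-based tag stripping is a simplification, and the raw field names (`story_title`, `story_url`, `comment_text`, `story_text`, `num_comments`, `created_at`, `_tags`) are assumed from the Algolia HN response format.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")


def strip_html(value):
    """Remove tags and decode entities; missing text becomes an empty string."""
    if not value:
        return ""
    text = re.sub(r"(?i)<p>", "\n\n", value)  # keep paragraph breaks
    text = TAG_RE.sub("", text)
    return html.unescape(text).strip()


def normalize(hit):
    """Map a raw Algolia hit onto the output shape, defaulting to None."""
    is_comment = "comment" in (hit.get("_tags") or [])
    object_id = hit.get("objectID")
    raw_text = hit.get("comment_text") if is_comment else hit.get("story_text")
    return {
        "ok": True,
        "objectId": object_id,
        "type": "comment" if is_comment else "story",
        # Comments take title/url from the parent story.
        "title": (hit.get("story_title") if is_comment else hit.get("title")) or None,
        "url": (hit.get("story_url") if is_comment else hit.get("url")) or None,
        "author": hit.get("author"),
        "points": hit.get("points"),
        "numComments": hit.get("num_comments"),
        "createdAt": hit.get("created_at"),
        "text": strip_html(raw_text),
        "hnUrl": f"https://news.ycombinator.com/item?id={object_id}" if object_id else None,
    }
```

Downstream code should therefore treat url, points and numComments as optional rather than assuming a number or string is always present.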

On failure or no results, a single diagnostic row is emitted with ok:false and an errorCode (e.g. NO_RESULTS, RATE_LIMITED, NETWORK) — and nothing is charged.
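
When consuming a dataset, it can help to separate diagnostic rows from real items before further processing. The sketch below shows one possible policy, assumed for illustration: `ScrapeError` and `usable_items` are hypothetical names, NO_RESULTS is treated as an empty result, and any other errorCode raises.

```python
class ScrapeError(Exception):
    """Raised when a run reports a failure other than an empty result set."""


def usable_items(rows):
    """Return only successful rows; raise on real failures.

    A NO_RESULTS diagnostic is treated as "nothing found" and yields an
    empty list; other error codes are surfaced as exceptions.
    """
    items = [row for row in rows if row.get("ok")]
    diagnostics = [row for row in rows if not row.get("ok")]
    for diag in diagnostics:
        code = diag.get("errorCode")
        if code != "NO_RESULTS":
            raise ScrapeError(f"run failed with errorCode={code}")
    return items
```

Whether a rate limit should raise or trigger a retry is a design choice left to the caller; the point is only that rows with ok:false should never be mistaken for Hacker News items.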

Pricing

Pay-per-result: one charge per returned item (item event). Diagnostic/empty rows are never charged.