Package Registry Scraper (npm + PyPI)
Get npm and PyPI package metadata - version, license, repo, keywords, and npm download counts. Search by keyword or look up exact names. No API key needed.
How it works
1. Open it on Apify
   Hit Run on Apify. It opens the tool in the cloud, no install.
2. Set the inputs
   Adjust `registry`, `searchQuery`, `packageNames` (sensible defaults are pre-filled).
3. Click Run
   The tool runs on Apify’s cloud and collects the data for you.
4. Export the results
   Download as JSON, CSV or Excel, or pipe straight into your app, Google Sheets, or an AI agent.
Inputs
| Field | What it does | Type |
|---|---|---|
| registry | Which package registry to use. npm supports both keyword search and exact-name lookup; PyPI supports exact-name lookup only (it has no clean public search API). | string |
| searchQuery | Keywords to search the npm registry for (e.g. "react state management"). npm only; ignored for PyPI. Leave empty if you are looking up exact package names instead. | string |
| packageNames | Exact package names to look up directly. Works for both registries. For PyPI this is the only supported mode (e.g. ["requests", "fastapi"]). For npm, scoped names like "@types/node" are supported. | array |
| includeDownloads | Fetch last-month download counts for each npm package via the npm downloads API. npm only; PyPI does not expose a public download-count endpoint. Adds one request per package. Note: monthlyDownloads is null when this is off, for PyPI packages, or if the downloads API call fails for a given package (a warning is logged in that case). | boolean |
| maxItems | Maximum number of packages to return from an npm search query. Only applies to npm search; ignored for exact-name lookups. | integer |
| notionConnector | Optional. Write each package as a page into your Notion when the run finishes. Authorize a Notion connector once in Settings → API & Integrations → MCP connectors, then pick it here. Leave empty to skip (default); results are always saved to the dataset regardless. | string |
| notionParentId | Optional. The Notion data source ID of the database to write into (only used if a Notion connector is set). Leave empty to create the pages privately in your workspace instead. | string |
What you get
A structured dataset — each result includes fields like:
author, description, homepage, keywords, license, monthlyDownloads, name, registry, repository, score, url, version
Export every run as JSON, CSV or Excel, or send it to your app, a database, Google Sheets, or an AI agent.
2 ready-to-run use cases
npm Search Scraper: Downloads, License & Repo
Search the npm registry by keyword and compare each package's version, license, source repo, and monthly downloads side by side. Great for JS library research.
npm License Audit: Map package.json Deps
Feed your package.json dependencies and return the license and source repository for each npm package - a fast OSS compliance check for engineering teams.
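As a rough illustration of the license-audit use case, the sketch below turns the contents of a package.json into an input object using the documented `registry`, `packageNames`, and `includeDownloads` fields. The function name `build_license_audit_input` and the choice to read only `dependencies` and `devDependencies` are assumptions made for this example, not part of the tool.

```python
import json


def build_license_audit_input(package_json_text: str) -> dict:
    """Build an npm exact-name lookup input from package.json contents.

    Hypothetical helper for illustration; not part of the scraper itself.
    """
    manifest = json.loads(package_json_text)
    names = set()
    # Collect package names from the two most common dependency sections.
    # Version ranges are ignored because the lookup is by exact name only.
    for section in ("dependencies", "devDependencies"):
        names.update(manifest.get(section, {}).keys())
    return {
        "registry": "npm",
        "packageNames": sorted(names),
        # A license check does not need download counts, and turning them
        # off avoids one extra request per package.
        "includeDownloads": False,
    }


example_manifest = json.dumps({
    "dependencies": {"react": "^18.2.0", "@types/node": "^20.0.0"},
    "devDependencies": {"react": "^18.2.0", "jest": "^29.0.0"},
})
print(json.dumps(build_license_audit_input(example_manifest), indent=2))
```

The resulting object can be pasted into the input form as JSON; duplicate names across sections collapse into one entry.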
Pull clean, structured package metadata from the npm and PyPI public registries. No API key, no login, no anti-bot. Search npm by keyword, or look up exact packages by name on either registry — and get a single normalized shape back for both.
What you get per package
registry, name, version, description, author, homepage, repository (decoded to a browseable https URL), license, keywords, monthlyDownloads (npm), and url (the human-facing registry page).
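To make "decoded to a browseable https URL" concrete, here is a minimal sketch of how a raw npm `repository` value might be normalized. It is not the tool's actual implementation: the function name `to_browseable_url` and the set of handled forms (`git+https://`, `git://`, `ssh://git@`, `git@host:path`, and the `github:` shorthand) are assumptions chosen for illustration.

```python
import re
from typing import Optional, Union


def to_browseable_url(repository: Union[str, dict, None]) -> Optional[str]:
    """Best-effort conversion of a repository field to an https URL.

    Hypothetical helper; returns None when the value is missing or
    does not reduce to an https URL.
    """
    # npm allows either a plain string or an object with a "url" key.
    if isinstance(repository, dict):
        repository = repository.get("url")
    if not isinstance(repository, str) or not repository.strip():
        return None
    url = repository.strip()
    # Shorthand form, e.g. "github:user/repo".
    if url.startswith("github:"):
        url = "https://github.com/" + url[len("github:"):]
    # Drop the "git+" transport prefix, e.g. "git+https://...".
    if url.startswith("git+"):
        url = url[len("git+"):]
    # scp-like form, e.g. "git@github.com:user/repo.git".
    scp_like = re.match(r"^git@([^:/]+):(.+)$", url)
    if scp_like:
        url = f"https://{scp_like.group(1)}/{scp_like.group(2)}"
    # "ssh://git@host/path" and "git://host/path" become "https://host/path".
    url = re.sub(r"^(?:ssh|git)://(?:git@)?", "https://", url)
    # A trailing ".git" is not needed for browsing.
    if url.endswith(".git"):
        url = url[: -len(".git")]
    return url if url.startswith("https://") else None
```

Returning None for anything unrecognized keeps the field consistent with the nullable behaviour of `repository` in the output.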
Nullable fields. Registries don't always populate every field, so the following can be null: description, author, homepage, repository, license (and keywords may be an empty array). monthlyDownloads is null when includeDownloads is off, for PyPI packages (no public download endpoint), or if the npm downloads API call fails for a specific package — in which case a warning is logged so you know why.
Input
| Field | Notes |
|---|---|
| registry | npm or pypi. Default npm. |
| searchQuery | Keyword search, npm only (PyPI has no clean public search API). |
| packageNames | Array of exact names; works for both registries (e.g. ["requests","fastapi"], or npm scoped ["@types/node"]). |
| includeDownloads | Fetch last-month download counts for npm packages. On by default. npm only. |
| maxItems | Cap on npm search results. Default 50. |
You must provide either a searchQuery (npm) or one or more packageNames.
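If you build inputs programmatically, a client-side pre-check along these lines can catch a bad input before a run is started. This is only a sketch under assumptions: `validate_input` is a hypothetical helper, and raising `ValueError` is a choice made for the example (the tool itself reports bad input as a diagnostic row). The defaults mirror the table above.

```python
def validate_input(raw: dict) -> dict:
    """Check an input object and fill in the documented defaults.

    Hypothetical pre-check for illustration; not the tool's own validation.
    """
    registry = raw.get("registry") or "npm"
    if registry not in ("npm", "pypi"):
        raise ValueError("registry must be 'npm' or 'pypi'")

    search_query = (raw.get("searchQuery") or "").strip()
    # Drop empty or whitespace-only names.
    package_names = [
        name.strip()
        for name in (raw.get("packageNames") or [])
        if isinstance(name, str) and name.strip()
    ]

    if not search_query and not package_names:
        raise ValueError("provide a searchQuery (npm) or one or more packageNames")
    if registry == "pypi" and not package_names:
        # Keyword search is npm only, so PyPI needs exact names.
        raise ValueError("PyPI supports exact-name lookup only; use packageNames")

    return {
        "registry": registry,
        "searchQuery": search_query,
        "packageNames": package_names,
        "includeDownloads": raw.get("includeDownloads", True),
        "maxItems": raw.get("maxItems", 50),
    }
```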
Registries
- npm: keyword search via the registry search API, package detail via `registry.npmjs.org/{pkg}`, and monthly downloads via `api.npmjs.org`.
- PyPI: exact-name lookup via `pypi.org/pypi/{pkg}/json`. PyPI has no clean public search API, so a `searchQuery` with `registry=pypi` returns a single diagnostic row telling you to use `packageNames` instead.
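The endpoints listed above can be sketched as URL builders. The hosts and the `{pkg}` patterns come from this README; the exact paths for npm search (`/-/v1/search`) and last-month downloads (`/downloads/point/last-month/`) are the publicly documented npm endpoints and are assumed here, since the README does not spell them out. All function names are hypothetical.

```python
from urllib.parse import quote, urlencode


def npm_search_url(query: str, size: int = 50) -> str:
    # Assumed path: the public npm registry search endpoint.
    return "https://registry.npmjs.org/-/v1/search?" + urlencode(
        {"text": query, "size": size}
    )


def npm_package_url(name: str) -> str:
    # Scoped names keep "@" but need the slash percent-encoded,
    # so "@types/node" becomes "@types%2Fnode".
    return "https://registry.npmjs.org/" + quote(name, safe="@")


def npm_downloads_url(name: str) -> str:
    # Assumed path: the public npm downloads "point" endpoint, which is
    # assumed to accept scoped names with a literal slash.
    return "https://api.npmjs.org/downloads/point/last-month/" + quote(
        name, safe="@/"
    )


def pypi_package_url(name: str) -> str:
    return "https://pypi.org/pypi/" + quote(name, safe="") + "/json"


print(npm_search_url("react state management", size=25))
print(npm_package_url("@types/node"))
print(npm_downloads_url("@types/node"))
print(pypi_package_url("requests"))
```

Nothing here performs a request; the builders only show how a package name maps onto each registry URL.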
Output
One dataset row per package, deduped by registry + name. Packages that can't be found, and empty searches, return a single diagnostic row (ok:false) and are not charged.
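When post-processing an exported dataset, or merging exports from several runs, it can help to separate diagnostic rows from package rows and re-apply the registry + name dedupe. The sketch below assumes each row is a dict, that diagnostic rows carry `ok` set to false, and that any other row is a package row; `split_rows` is a hypothetical helper, not part of the tool.

```python
from typing import Iterable, List, Tuple


def split_rows(rows: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Return (package_rows, diagnostic_rows), deduped by registry + name.

    Assumption: a row is diagnostic only when its "ok" field is False.
    """
    seen = set()
    packages: List[dict] = []
    diagnostics: List[dict] = []
    for row in rows:
        if row.get("ok") is False:
            diagnostics.append(row)
            continue
        key = (row.get("registry"), row.get("name"))
        if key in seen:
            # Same package already kept, e.g. from an earlier run.
            continue
        seen.add(key)
        packages.append(row)
    return packages, diagnostics


merged = [
    {"registry": "npm", "name": "react", "license": "MIT"},
    {"registry": "npm", "name": "react", "license": "MIT"},
    {"registry": "pypi", "name": "requests", "license": "Apache-2.0"},
    {"ok": False, "name": "no-such-package"},
]
packages, diagnostics = split_rows(merged)
print(len(packages), len(diagnostics))  # prints "2 1"
```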
Pricing
Pay-per-result: you are charged once per successfully returned package row. Diagnostic rows (ok:false) — bad input, not-found packages, empty searches, network/registry errors — are never charged.
Proxy
These are public, no-auth registries with no anti-bot, so no proxy is needed — leave proxy off (the default). Only enable a proxy if you hit IP-based rate limits on very large runs.
Troubleshooting
- Empty/bad-input run returns a `BAD_INPUT` diagnostic row: provide a `searchQuery` (npm) or one or more `packageNames`.
- `monthlyDownloads` is null for some npm packages: the downloads API occasionally rate-limits or has no data for a package; the run logs a warning and continues with `null` for that field.
- PyPI search isn't supported (PyPI has no clean public search API); use exact `packageNames` with `registry=pypi`.
Examples
Search npm:
{ "registry": "npm", "searchQuery": "react state management", "maxItems": 25, "includeDownloads": true }
Look up Python packages:
{ "registry": "pypi", "packageNames": ["requests", "fastapi"] }