Newest Archive.org Uploads for Any Search Term
Track recently added archive.org items for any topic, sorted newest first by upload date, each with its title, date and direct link. Useful for monitoring new uploads on a topic.
How it works
1. Open it on Apify
   Hit Run on Apify: it opens the tool in the cloud, no install.
2. Set the inputs
   Adjust query, mediaType, sort (sensible defaults are pre-filled).
3. Click Run
   The tool runs on Apify's cloud and collects the data for you.
4. Export the results
   Download as JSON, CSV or Excel, or pipe straight into your app, Google Sheets, or an AI agent.
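If you export as JSON and want a slimmer CSV for your own app, the conversion can be sketched with the Python standard library. This is a minimal sketch, assuming the JSON export is an array of flat objects. The sample record and the helper name `export_to_csv` are made up for illustration; only the field names come from this page.

```python
import csv
import io
import json

# Assumed shape of a JSON export: an array of flat objects.
# Field names follow the dataset fields listed on this page; the values are invented.
SAMPLE_EXPORT = """[
  {"identifier": "example-item-1",
   "title": "Example upload",
   "publicdate": "2024-05-01T12:00:00Z",
   "url": "https://archive.org/details/example-item-1"}
]"""


def export_to_csv(json_text, columns=("title", "publicdate", "url")):
    """Convert a JSON array export into CSV text containing only the chosen columns."""
    items = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(columns))
    writer.writeheader()
    for item in items:
        # Missing keys become empty cells instead of raising an error.
        writer.writerow({col: item.get(col, "") for col in columns})
    return buffer.getvalue()


if __name__ == "__main__":
    print(export_to_csv(SAMPLE_EXPORT))
```

Keeping the column list as a parameter makes it easy to add other fields, such as downloads or creator, without touching the conversion logic.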
Inputs
| Field | What it does | Type |
|---|---|---|
| query | Keywords to search the Internet Archive for (e.g. "nasa apollo", "jazz"). Supports Lucene operators used by archive.org, e.g. "title:(grateful dead) AND year:[1977 TO 1980]". Required. | string |
| mediaType | Restrict results to one media type, or leave empty for any. texts = books/documents, audio = music/recordings, movies = video/film, software, image, web (archived sites), data, collection. | string |
| sort | Order of results. downloads = most-downloaded first, date = newest item date first, publicdate = most recently added to archive.org first, relevance = the archive's default relevance ranking. | string |
| maxItems | Maximum number of unique items to return. The actor paginates 100 per request until this many items are collected or the result set is exhausted. | integer |
| notionConnector | Optional. Write each item as a page into your Notion when the run finishes. Authorize a Notion connector once in Settings → API & Integrations → MCP connectors, then pick it here. Leave empty to skip (default); results are always saved to the dataset regardless. | string |
| notionParentId | Optional. The Notion data source ID of the database to write into (only used if a Notion connector is set). Leave empty to create the pages privately in your workspace instead. | string |
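The inputs above can be assembled and sanity-checked before a run. The following is a minimal sketch, not the actor's own validation: the helper names `build_input` and `max_requests` are hypothetical, and the default values are assumptions, since this page only says that sensible defaults are pre-filled.

```python
import math

# Allowed values copied from the Inputs table; "" means "any media type".
MEDIA_TYPES = {"", "texts", "audio", "movies", "software",
               "image", "web", "data", "collection"}
SORT_ORDERS = {"downloads", "date", "publicdate", "relevance"}
PAGE_SIZE = 100  # the table states the actor paginates 100 items per request


def build_input(query, media_type="", sort="publicdate", max_items=100):
    """Return an input object for the actor (defaults here are assumed, not documented)."""
    if not query or not query.strip():
        raise ValueError("query is required")
    if media_type not in MEDIA_TYPES:
        raise ValueError(f"unknown mediaType: {media_type!r}")
    if sort not in SORT_ORDERS:
        raise ValueError(f"unknown sort: {sort!r}")
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    return {
        "query": query,
        "mediaType": media_type,
        "sort": sort,
        "maxItems": max_items,
    }


def max_requests(max_items):
    """Upper bound on paginated requests; fewer are made if results run out first."""
    return math.ceil(max_items / PAGE_SIZE)


if __name__ == "__main__":
    run_input = build_input("title:(grateful dead) AND year:[1977 TO 1980]",
                            media_type="audio", max_items=250)
    print(run_input, max_requests(run_input["maxItems"]))
```

With maxItems set to 250, the page size of 100 implies at most three requests, which gives a rough sense of how a larger limit affects run time.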
What you get
A structured dataset. Each result includes fields like:
creator, date, description, downloads, identifier, mediaType, publicdate, subjects, title, url, year
Export every run as JSON, CSV or Excel, or send it to your app, a database, Google Sheets, or an AI agent.
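When merging several runs for monitoring, you may want to drop repeated items and re-sort by upload date yourself. Below is a minimal sketch, assuming that publicdate is an ISO 8601 timestamp such as "2024-05-01T12:00:00Z" and that identifier is unique per item. Both are assumptions about the output format, and the sample records are invented.

```python
from datetime import datetime


def parse_publicdate(value):
    """Parse an ISO 8601 timestamp; the trailing 'Z' form is an assumed format."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def newest_first(items):
    """Deduplicate by identifier (first occurrence wins) and sort by publicdate, newest first."""
    unique = {}
    for item in items:
        unique.setdefault(item["identifier"], item)
    return sorted(unique.values(),
                  key=lambda item: parse_publicdate(item["publicdate"]),
                  reverse=True)


if __name__ == "__main__":
    # Invented sample records using field names from the list above.
    merged = [
        {"identifier": "a", "title": "Older", "publicdate": "2024-01-10T08:00:00Z"},
        {"identifier": "b", "title": "Newer", "publicdate": "2024-03-02T09:30:00Z"},
        {"identifier": "a", "title": "Older", "publicdate": "2024-01-10T08:00:00Z"},
    ]
    for item in newest_first(merged):
        print(item["publicdate"], item["title"])
```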
More use cases for Internet Archive Scraper
Archive.org Book Search by Keyword to JSON
Free public-domain books from archive.org's text collection by keyword, with author, publication year and item link for every title. Ideal for researchers.