GDELT Worldwide News Scraper

Search global news with the GDELT 2.0 API. Get articles with title, URL, domain, country, language, and date. Filter by keyword, timespan, country. No API key.



How it works

1. Open it on Apify
Hit Run on Apify. The tool opens in the cloud, with nothing to install.
2. Set the inputs
Adjust query, maxItems, and sort (sensible defaults are pre-filled).
3. Click Run
The tool runs on Apify's cloud and collects the data for you.
4. Export the results
Download as JSON, CSV or Excel, or pipe straight into your app, Google Sheets, or an AI agent.

Inputs

Field	What it does	Type
`query`	Keywords to search worldwide news for. Wrap phrases in quotes (e.g. "climate change"). Combine with OR, or operators like domain:reuters.com. Very short or very common single words may be rejected by GDELT — add a second word or quote a phrase.	string
`maxItems`	Maximum number of articles to return. GDELT hard-caps a single request at 250 — higher values are automatically capped at 250.	integer
`sort`	How to order results: DateDesc (newest first), DateAsc (oldest first), or HybridRel (by relevance to the query).	string
`timespan`	Only return articles from the last N units of time, e.g. 1d, 3d, 1w, 1m, 3m. Leave empty for GDELT's default window. GDELT only covers roughly the last 3 months.	string
`sourceCountry`	Limit to articles from a given source country, appended to the query as sourcecountry:{code}. Use GDELT country codes (e.g. US, UK, FR, DE, IN). Leave empty for all countries.	string
`sourceLang`	Limit to articles in a given language, appended to the query as sourcelang:{code} (e.g. english, french, spanish, german). Leave empty for all languages.	string
`notionConnector`	Optional. Writes each article as a page in your Notion workspace when the run finishes. Authorize a Notion connector once in Settings → API & Integrations → MCP connectors, then pick it here. Leave empty to skip (default). Results are always saved to the dataset regardless.	string
`notionParentId`	Optional. The Notion data source ID of the database to write into (only used if a Notion connector is set). Leave empty to create the pages privately in your workspace instead.	string

What you get

A structured dataset — each result includes fields like:

details, query, sort, sourceCountry, sourceLang, timespan

Export every run as JSON, CSV or Excel, or send it to your app, a database, Google Sheets, or an AI agent.

2 ready-to-run use cases

AI News Scraper Worldwide, Newest First | GDELT

Newest artificial intelligence headlines from global news outlets, sorted latest-first. Returns title, URL, source, country and date for AI trend tracking.

World Cup News Scraper, Global Coverage | GDELT

FIFA World Cup articles from every country and language GDELT indexes, ranked by relevance. Each result returns headline, link, source country and date.

GDELT Worldwide News Scraper

Search worldwide news through the GDELT 2.0 DOC API — a public, no-key, no-login JSON endpoint that indexes online news from across the planet in 65+ languages. No proxy or anti-bot handling needed.

What it does

Given a search query, the actor calls the GDELT DOC API in ArtList mode and returns clean, structured articles. You can filter by recency, source country, and source language, and sort by newest, oldest, or relevance.
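As a rough illustration of the request described above, the sketch below assembles a DOC API URL in ArtList mode. The actor's own code is not shown on this page, so the function name `build_gdelt_url` is hypothetical, and the endpoint and parameter names (`mode`, `format`, `maxrecords`, `sort`, `timespan`) follow GDELT's public DOC 2.0 documentation as best understood here. It assumes `max_items` has already been capped at 250.

```python
from urllib.parse import urlencode

# Public GDELT DOC 2.0 endpoint (taken from GDELT's docs, not from the actor source).
GDELT_DOC_ENDPOINT = "https://api.gdeltproject.org/api/v2/doc/doc"


def build_gdelt_url(query, max_items=100, sort="DateDesc", timespan=None,
                    source_country=None, source_lang=None):
    """Hypothetical helper: turn actor-style inputs into a DOC API URL."""
    # Country and language filters are appended to the query as operators.
    terms = [query.strip()]
    if source_country:
        terms.append(f"sourcecountry:{source_country}")
    if source_lang:
        terms.append(f"sourcelang:{source_lang}")

    params = {
        "query": " ".join(terms),
        "mode": "ArtList",
        "format": "json",
        "maxrecords": max_items,
        "sort": sort,
    }
    # Omitting timespan leaves GDELT's default window in place.
    if timespan:
        params["timespan"] = timespan

    # urlencode percent-encodes quotes and colons and turns spaces into '+'.
    return f"{GDELT_DOC_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_gdelt_url('"climate change"', timespan="1w", source_country="US"))
```

Keeping the filters inside the `query` string mirrors how the inputs table describes `sourceCountry` and `sourceLang`.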

Input

Field	Type	Default	Notes
`query`	string	— (required)	Keywords. Quote phrases: `"climate change"`. Very short/common single words may be rejected by GDELT.
`maxItems`	integer	`100`	Capped at 250 — GDELT returns at most 250 articles per request.
`sort`	string	`DateDesc`	`DateDesc` (newest), `DateAsc` (oldest), `HybridRel` (relevance).
`timespan`	string	—	Recency window, e.g. `1d`, `3d`, `1w`, `1m`, `3m`. GDELT covers ~the last 3 months.
`sourceCountry`	string	—	Appended as `sourcecountry:{code}` (e.g. `US`, `UK`, `FR`).
`sourceLang`	string	—	Appended as `sourcelang:{code}` (e.g. `english`, `french`).
`proxyConfiguration`	object	—	Optional; not needed (public API).
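The defaults and the 250-article cap in the table can be sketched as a small normalization step. This is an assumption about how such inputs might be handled, not the actor's actual code: `normalize_input` is a hypothetical name, and raising `ValueError` for a missing query or unknown sort value is an illustrative choice.

```python
ALLOWED_SORTS = {"DateDesc", "DateAsc", "HybridRel"}
GDELT_MAX_RECORDS = 250  # per-request limit stated in the table above


def normalize_input(raw):
    """Hypothetical helper: apply defaults and limits to a raw input dict."""
    query = (raw.get("query") or "").strip()
    if not query:
        raise ValueError("query is required")

    # Default to 100 and clamp into the range 1..250.
    max_items = int(raw.get("maxItems") or 100)
    max_items = max(1, min(max_items, GDELT_MAX_RECORDS))

    sort = raw.get("sort") or "DateDesc"
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")

    return {
        "query": query,
        "maxItems": max_items,
        "sort": sort,
        # Empty strings are treated the same as "not set".
        "timespan": raw.get("timespan") or None,
        "sourceCountry": raw.get("sourceCountry") or None,
        "sourceLang": raw.get("sourceLang") or None,
    }


if __name__ == "__main__":
    print(normalize_input({"query": '"climate change"', "maxItems": 1000}))
```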

Output

Each successful row:

{
  "ok": true,
  "title": "…",
  "url": "https://…",
  "domain": "example.com",
  "sourceCountry": "United States",
  "language": "English",
  "publishedAt": "2026-06-11T12:00:00.000Z",
  "socialImage": "https://…"
}

Results are de-duplicated by URL. Each ok:true article is billed as one article charge unit. Diagnostic rows (ok:false) and empty/blocked runs are never charged.

Nullable fields: GDELT does not always populate every field. Any of title, url, domain, sourceCountry, language, publishedAt, and socialImage can be null for a given article (e.g. socialImage is often missing, and publishedAt is null when GDELT's seendate is unparseable). Rows with neither a url nor a title are dropped before charging.
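The row shaping described above (null handling, URL de-duplication, dropping rows with neither url nor title) might look roughly like the sketch below. It assumes GDELT's ArtList JSON uses the lowercase keys `url`, `title`, `domain`, `sourcecountry`, `language`, `seendate`, and `socialimage`, and that `seendate` has the form `YYYYMMDDTHHMMSSZ`. The function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_seendate(value):
    """Convert a GDELT seendate such as 20260611T120000Z to ISO 8601, else None."""
    if not value:
        return None
    try:
        parsed = datetime.strptime(value, "%Y%m%dT%H%M%SZ")
    except (TypeError, ValueError):
        return None  # unparseable seendate becomes a null publishedAt
    iso = parsed.replace(tzinfo=timezone.utc).isoformat(timespec="milliseconds")
    return iso.replace("+00:00", "Z")


def to_rows(articles):
    """Hypothetical helper: map raw articles to output rows, de-duplicated by URL."""
    rows, seen_urls = [], set()
    for art in articles:
        url = art.get("url") or None
        title = art.get("title") or None
        if url is None and title is None:
            continue  # nothing usable, dropped before charging
        if url is not None:
            if url in seen_urls:
                continue
            seen_urls.add(url)
        rows.append({
            "ok": True,
            "title": title,
            "url": url,
            "domain": art.get("domain") or None,
            "sourceCountry": art.get("sourcecountry") or None,
            "language": art.get("language") or None,
            "publishedAt": parse_seendate(art.get("seendate")),
            "socialImage": art.get("socialimage") or None,
        })
    return rows
```

Empty strings are normalized to None here so that downstream consumers only need to check for a single "missing" value.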

Diagnostics

The actor never fails silently. When a run cannot return articles, it writes a single diagnostic row (ok:false) with an errorCode and never charges for it:

BAD_INPUT — GDELT rejected the query (e.g. "query too short"). Quote phrases and avoid overly short/common terms.
NO_RESULTS — the query was valid but matched nothing. Broaden it or widen the timespan.
RATE_LIMITED / SERVER_ERROR / NETWORK — transient issues; the actor retried with backoff first.
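One way to picture the retry-then-diagnose behaviour listed above is the sketch below. The actor's real retry count, delays, and status mapping are not documented here, so everything in this block is an assumption: the mapping of HTTP 429 to RATE_LIMITED and 5xx to SERVER_ERROR, the exponential delay schedule, the `message` field, and all function names are hypothetical.

```python
import time

TRANSIENT_CODES = {"RATE_LIMITED", "SERVER_ERROR", "NETWORK"}


def error_code_for_status(status):
    """Assumed mapping from an HTTP status to a transient error code."""
    if status == 429:
        return "RATE_LIMITED"
    if 500 <= status <= 599:
        return "SERVER_ERROR"
    return None


def run_with_backoff(attempt_fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call attempt_fn until it stops returning a transient error code.

    attempt_fn returns (rows, error_code); error_code is None on success.
    """
    code = None
    for attempt in range(max_attempts):
        rows, code = attempt_fn()
        if code not in TRANSIENT_CODES:
            return rows, code  # success, or a non-retryable code like BAD_INPUT
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return [], code  # retries exhausted, report the last transient code


def diagnostic_row(code, message):
    """Single uncharged row written when no articles can be returned."""
    return {"ok": False, "errorCode": code, "message": message}
```

Passing `sleep` as a parameter keeps the retry loop easy to exercise without real waiting.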

Notes / quirks

GDELT requires the query to be URL-encoded and phrases to be quoted — the actor handles both.
On a malformed query GDELT may return a text/plain error string (sometimes with HTTP 200) or an empty body instead of JSON. The actor guards JSON.parse and surfaces a clear BAD_INPUT diagnostic.
GDELT's index covers roughly the last 3 months of news.
The actor rotates a real browser User-Agent per request attempt for retry resilience, and supports an optional proxy (proxyConfiguration). Neither is required, since GDELT is a public no-key API with no anti-bot protection, so leave the proxy unset unless you hit IP-level rate limits.
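The guarded parse mentioned in the notes above can be sketched in Python as follows (the actor itself apparently uses `JSON.parse`). This is a minimal illustration, not the actor's code: `parse_gdelt_body` is a hypothetical name, and it assumes a successful response is a JSON object with an `articles` array.

```python
import json


def parse_gdelt_body(body):
    """Return (articles, error_code) without ever raising on a bad body."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Plain-text error strings and empty bodies are not valid JSON,
        # even when they arrive with HTTP 200.
        return [], "BAD_INPUT"

    articles = payload.get("articles") if isinstance(payload, dict) else None
    if not articles:
        # Valid JSON but nothing matched the query.
        return [], "NO_RESULTS"
    return articles, None


if __name__ == "__main__":
    print(parse_gdelt_body("Your query was too short or too long."))
    print(parse_gdelt_body("{}"))
```

Returning an error code instead of raising keeps the caller's path to a diagnostic row simple: any non-None code becomes a single ok:false row.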