Daniel van Strien
danielvanstrien.bsky.social
did:plc:7e5mpxuweopubhexwqg5l3ba
You can now run SQL over 2.19 BILLION web pages — zero download.
@commoncrawl.bsky.social April 2026 crawl + URL index are on Hugging Face Storage Buckets. DuckDB reads it straight over hf:// — I counted all 2.19B in ~35s.
Or point your own agent at it 👇
https://huggingface.co/spaces/davanstrien/common-crawl-april-2026
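
A rough sketch of what a "count everything over hf://" query could look like from Python. It assumes the URL index is stored as Parquet files and that DuckDB's httpfs extension (which provides hf://) can be loaded; the path in INDEX_GLOB is a hypothetical placeholder, not the real bucket location, and the helper names are made up for illustration.

```python
def build_count_query(parquet_glob: str) -> str:
    """Build a SQL statement that counts rows across a Parquet glob."""
    # Escape single quotes so the path is a valid SQL string literal.
    escaped = parquet_glob.replace("'", "''")
    return f"SELECT count(*) AS n FROM read_parquet('{escaped}')"


def count_rows(parquet_glob: str) -> int:
    """Run the count remotely; nothing is downloaded to disk first."""
    import duckdb  # imported lazily so the query builder works without it

    con = duckdb.connect()  # in-memory database
    return con.execute(build_count_query(parquet_glob)).fetchone()[0]


# HYPOTHETICAL placeholder: replace with the actual hf:// path of the index.
INDEX_GLOB = "hf://<path-to-url-index>/*.parquet"

if __name__ == "__main__":
    print(build_count_query(INDEX_GLOB))
    # Uncomment once INDEX_GLOB points at a real location:
    # print(count_rows(INDEX_GLOB))
```

A bare count(*) on Parquet can usually be answered from file metadata rather than by scanning every row, which would be consistent with a count over billions of rows finishing in tens of seconds.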
2026-05-22T14:25:57.276Z