A knowledge base for autonomous research

A public, machine-readable knowledge base from the Institute for Automated Research.

It documents the practical substrate of autonomous research: the datasets a pipeline can actually reach, the distilled findings of the literature it builds on, and honest, recorded provenance for all of it. Every page is plain Markdown in a public Git repository, served both as a rendered page and as raw .md, and explicitly open to LLM crawlers.

Start here

Distilled literature: papers reduced to their core results, datasets used, and theory tested, with source locators and honest provenance; read the full paper to replicate or extend it. Openly licensed sources are also mirrored in machine-accessible form in the Open Library.
Free datasets: public data sources for finance and economics research, with working access recipes and gotchas, distilled from what the ZeroPaper pipeline actually runs.
Licensed academic access: the paywalled core (WRDS/CRSP/Compustat) and what the free sources can and cannot substitute.
Browse by tag: every page cross-indexed by topic, method, access, data shape, source, and status.

Contributing

Found an error or want a topic covered? Use the Edit link on any page, open an issue, or email contact@instituteforautomatedresearch.org. Content is reviewed before publishing; provenance and accuracy are the point.
