What is this project?

Formal Conjectures is an open collection of formalised statements of mathematical conjectures in Lean 4, building on the Mathlib library.

While there is a growing corpus of formalised theorems with proofs, there has been a shortage of open conjectures where only the formal statement has been written down. This repository fills that gap.

Goals

  • Benchmark for automated theorem provers. A collection of open problems provides a natural testbed for AI systems attempting to discover mathematical proofs.
  • Clarify conjecture statements. Formalisation forces precision, often revealing hidden ambiguities in informal statements.
  • Expand Mathlib. Highlighting definitions that are needed but missing from Mathlib encourages the mathematical community to fill those gaps.

Structure

Conjectures are organised by their source:

  • Erdős Problems — from erdosproblems.com
  • Wikipedia — famous named conjectures (Riemann, Collatz, …)
  • MathOverflow — community-sourced open questions
  • OEIS — sequence-related conjectures
  • arXiv / Papers / Books — from the research literature
  • Millennium Prize Problems — the Clay Institute's seven problems
  • Hilbert Problems — Hilbert's 1900 problem list
  • Green's Open Problems — Ben Green's combinatorics list
  • Kourovka Notebook — open problems in group theory
  • Written on the Wall II — and more

Categories

Every statement is tagged with a category reflecting its status:

  • Open — no solution currently accepted by the mathematical community.
  • Solved — an established solution exists (informal or formal).
  • Formally Solved — a machine-checkable formal proof exists.
  • Graduate / Undergraduate / High School — difficulty level, only included when directly related to a research-level problem.

AMS subject classification

Every statement also carries one or more AMS MSC2020 subject tags (e.g. Number Theory, Combinatorics), enabling filtering by mathematical area.

A note on accuracy

Formalising mathematical statements without proofs is inherently challenging. Subtle inaccuracies can arise where the formal statement might not perfectly capture the nuances of the original conjecture. The project relies on careful human review and plans to leverage automated tools to help identify potential misformalisations.

Licensing

All software is licensed under Apache 2.0. All other materials are licensed under CC-BY 4.0. Third-party content may be subject to different licensing requirements as noted in each file.

This is not an official Google product.

Links