---
license: apache-2.0
datasets:
- internlm/Lean-Workbook
- internlm/Lean-Github
- AI-MO/NuminaMath-CoT
language:
- en
base_model:
- Qwen/Qwen2.5-Math-7B
pipeline_tag: text-generation
library_name: transformers
tags:
- lean4
- theorem-proving
- formal-mathematics
---

<div align="center">
  <h1 style="font-size: 2.0em;">πŸš€ BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving</h1>
  <div style="display: flex; justify-content: center; gap: 8px; flex-wrap: wrap;">
    <a href="https://arxiv.org/abs/2502.03438"><img src="https://img.shields.io/badge/arXiv-2502.03438-b31b1b.svg" alt="arXiv"></a>
    <a href="https://choosealicense.com/licenses/apache-2.0/"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License: Apache 2.0"></a>
    <a href="https://github.com/leanprover-community/mathlib4"><img src="https://img.shields.io/badge/Lean-4-orange" alt="Lean 4"></a>
  </div>
  <h2>State-of-the-art tactic generation model in Lean4</h2>
</div>

This repository contains the latest tactic generator checkpoint from BFS-Prover, a state-of-the-art theorem-proving system for Lean4. While the full BFS-Prover system integrates multiple components for scalable theorem proving, we are releasing the core tactic generation model here. Given a Lean4 proof state, the model generates a tactic that transforms it into a new state, progressively working toward completing the proof.

**πŸ“„ Paper: [BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving](https://arxiv.org/abs/2502.03438)**


## ✨ Model Details

- Base Model: Qwen2.5-Math-7B
- Training Approach: 
  - Supervised Fine-Tuning (SFT) on state-tactic pairs (an illustrative pair is sketched after this list)
  - Direct Preference Optimization (DPO) using compiler feedback
- Training Data Sources:
  - Mathlib (via LeanDojo)
  - Lean-Github repositories
  - Lean-Workbook
  - Autoformalized NuminaMath-CoT dataset
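
For illustration only, a state-tactic pair serialized with the prompt format documented in the Usage section below might look as follows. The exact SFT serialization is an assumption on our part, not something confirmed by the paper:

```python
# Hypothetical SFT example: a Lean4 proof state paired with the tactic
# that closes it, joined by the ":::" separator the model expects.
# (Assumed format; the actual training pipeline is not public.)
state = "n : ℕ ⊢ n + 0 = n"
tactic = "simp"
sft_example = state + ":::" + tactic
# -> "n : ℕ ⊢ n + 0 = n:::simp"
```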

## πŸ“ˆ Performance
BFS-Prover achieves state-of-the-art performance on the MiniF2F test benchmark. Here's a detailed comparison:

### πŸ” MiniF2F Test Benchmark Results

| Prover System | Search Method | Critic Model | Tactic Budget | Score |
|---------------|---------------|--------------|---------------|--------|
| BFS-Prover  | BFS | No | Accumulative | **72.95%** |
| BFS-Prover  | BFS | No | 2048Γ—2Γ—600 | **70.83% Β± 0.89%** |
| HunyuanProver | BFS | Yes | 600Γ—8Γ—400 | 68.4% |
| InternLM2.5-StepProver | BFS | Yes | 256Γ—32Γ—600 | 65.9% |
| DeepSeek-Prover-V1.5 | MCTS | No | 32Γ—16Γ—400 | 63.5% |


### πŸ”‘ Key Advantages
- βœ… Achieves better performance without requiring a critic model (value function)
- βœ… Combined with simpler search method (BFS) rather than MCTS

## βš™οΈ Usage
- The model expects a Lean4 tactic state formatted as `"{state}:::"`.
- The `:::` separator signals the model to generate a tactic for the given state.
- The model echoes back the input state followed by the generated tactic.


```python
# Example code for loading and using the tactic generator model
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bytedance-research/BFS-Prover")
tokenizer = AutoTokenizer.from_pretrained("bytedance-research/BFS-Prover")
state = "h : x = y + 2 ⊒ x - 1 = y + 1" 
sep = ":::"
prompt = state + sep  # Creates "h : x = y + 2 ⊒ x - 1 = y + 1:::"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)  # bound output length; tactics are short
tactic = tokenizer.decode(outputs[0], skip_special_tokens=True).split(sep)[1]
print(tactic)

# Complete example:
# Input state:  "h : x = y + 2 ⊒ x - 1 = y + 1"
# Full prompt:  "h : x = y + 2 ⊒ x - 1 = y + 1:::"
# Model output: "h : x = y + 2 ⊒ x - 1 = y + 1:::simp [h]"
# Final tactic: "simp [h]"
```
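
In a tree-search setting, the generator is usually queried for several candidate tactics per state rather than a single one. A minimal sketch using sampling follows; the sampling parameters here are illustrative, not the settings used in the paper:

```python
# Sample several candidate tactics for one proof state.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bytedance-research/BFS-Prover")
tokenizer = AutoTokenizer.from_pretrained("bytedance-research/BFS-Prover")

sep = ":::"
prompt = "h : x = y + 2 ⊢ x - 1 = y + 1" + sep
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.0,           # illustrative; tune for candidate diversity
    max_new_tokens=64,
    num_return_sequences=8,    # candidate tactics to feed the search
    pad_token_id=tokenizer.eos_token_id,
)

# Deduplicate: sampling often repeats the most likely short tactics.
candidates = {
    tokenizer.decode(seq, skip_special_tokens=True).split(sep)[1]
    for seq in outputs
}
print(candidates)
```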

## πŸ“š Citation

If you use this model in your research, please cite our paper:

```bibtex
@article{xin2025bfs,
  title={BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving},
  author={Xin, Ran and Xi, Chenguang and Yang, Jie and Chen, Feng and Wu, Hang and Xiao, Xia and Sun, Yifan and Zheng, Shen and Shen, Kai},
  journal={arXiv preprint arXiv:2502.03438},
  year={2025}
}
```

## πŸ“„ License

This model is released under the [Apache 2.0 license](https://choosealicense.com/licenses/apache-2.0/).

## πŸ“§ Contact

For questions and feedback about the tactic generator model, please contact:
- Ran Xin ([email protected])
- Kai Shen ([email protected])