Research Space

Organized by folder. Papers, experiments, and methods are all listed below.

Folders

/research/

/research/formal-math/autoformalization/formalizing-mathematics-at-scale/

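Formalizing Mathematics at Scale: Paper Close Reading

A one-page guide to AutoformBot and ATLAS: a multi-agent engineering system for autoformalizing mathematics textbooks at scale.

Formal Mathematics / Autoformalization / 2026-05-31 #formal-math #autoformalization #lean #multi-agent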

/research/formal-math/autoformalization/right-symmetries-formal-theorem-proving/

What are the Right Symmetries for Formal Theorem Proving? Paper Close Reading

Uses rewriting categories to explain equivalence-preserving rewrites, success invariance, and test-time rewriting ensembles in formal theorem proving, and contrasts them with FormalEvolve's autoformalization repertoire approach.

Formal Mathematics / Autoformalization / 2026-05-29 #formal-math #formal-theorem-proving #symmetry #Lean

/research/formal-math/autoformalization/formalevolve-cheatsheet/

FormalEvolve: Paper Close Reading

A close-reading HTML walkthrough of FormalEvolve: Beyond a Single Ground Truth for Autoformalization.

Formal Mathematics / Autoformalization / 2026-05-24 #autoformalization #Lean #formal-methods #paper-reading


/research/formal-math/lectures/berkeley-agents-autoformalization-atp/

Autoformalization and Automated Theorem Proving: Formal Reasoning Meets LLMs

HTML version of Kaiyu Yang's lecture notes for Berkeley CS294/194-280, Spring 2025: the verifiability limits of SFT/RL, LeanDojo/ReProver, LIPS, autoformalization evaluation, and LeanEuclid.

Formal Mathematics / Lecture Notes / 2026-05-28 #formal-math #lecture-notes #Berkeley-CS294-280 #autoformalization

/research/formal-math/lectures/berkeley-agents-alphaproof/

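AlphaProof: When Reinforcement Learning Meets Formal Mathematics

HTML version of the AlphaProof lecture notes for Berkeley CS294/194-280, Spring 2025: Lean/Mathlib, AlphaZero-style search, IMO 2024, formalizer/prover, test-time RL, and the limits of formal mathematics.

Formal Mathematics / Lecture Notes / 2026-05-28 #formal-math #lecture-notes #Berkeley-CS294-280 #AlphaProof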


Folders

Formal Mathematics

Autoformalization

Formalizing Mathematics at Scale: Paper Close Reading

What are the Right Symmetries for Formal Theorem Proving? Paper Close Reading

FormalEvolve: Paper Close Reading

Lecture Notes

Autoformalization and Automated Theorem Proving: Formal Reasoning Meets LLMs

AlphaProof: When Reinforcement Learning Meets Formal Mathematics

Self-Evolving Agents

Coding Benchmark

FrontierSmith: Synthesizing Open-Ended Coding Problems at Scale

ICL / Agent Analysis

Cheat-Sheet ICL: In-Depth Paper Analysis

Test-Time Learning / Adaptive Memory

Dynamic Cheatsheet: In-Depth Paper Analysis

Prompt Evolution / Optimization

GEPA: Paper Close Reading

Agent Evaluation

Agents' Last Exam

Automated Capability Discovery via Foundation Model Self-Exploration

Agent Memory

EvoMemBench: Benchmarking Agent Memory from a Self-Evolving Perspective

AI Self-Evolution Forum: Close Readings

AI Self-Evolution Close-Reading Notes: From Recursive Self-Improvement to Verifiable Feedback Loops

From Self-Correction To Self-Improving

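Harness Engineering: Sometimes a Language Model Is Smart Enough, Just Not Well Guided

Can AI Achieve Self-Growth?

Can AI Self-Correct? From Decoding and Workflow to Reasoning

Is AI About to Cross the Rubicon? How Far Away Is Self-Growing AI (Part 2)

Drift Monitor: Close Readings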

Do Self-Evolving Agents Forget? Capability Degradation and Preservation in Lifelong LLM Agent Adaptation

AgentDevel

AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks

AgentXRay: White-Boxing Agentic Systems via Workflow Reconstruction

AIR: Improving Agent Safety through Incident Response

Alignment Tipping Process

Evaluating Goal Drift in Language Model Agents

MemoryGraft

OEP

Routine Chats Turn Toxic

Your Agent May Misevolve

Open-Endedness

Jeff Clune: Open-ended and AI-generating Algorithms in the Era of Foundation Models

Paper Reading

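Epistatic strength, modularity, and locus heterogeneity shape the number of local optima in fitness landscapes: Paper Close Reading

What Makes an LLM a Good Optimizer? A Trajectory Analysis of LLM-Guided Evolutionary Search

BehaveSim: Rethinking Code Similarity in LLM-Driven Automated Algorithm Design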

Fitness Landscape of LLM-Assisted Automated Algorithm Search

What Do Evolutionary Coding Agents Evolve?

Skill Optimization / Drift Monitor

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

Bitter Lesson

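Sutton WAIC 2026 Lecture

Sutton on AI's Bitter Lesson: From Historical Patterns to the Future of Foundation Models

Sutton on AI's Bitter Lesson: From Historical Patterns to the Future of Foundation Models (Plain-Text Version)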