2025

Bootstrapping Fuzzers for Compilers of Low-Resource Language Dialects Using Language Models

Sairam Vaidya, Marcel Böhme, Loris D'Antoni

arXiv Preprint 2025

We present Germinator, a dialect-agnostic and dialect-effective fuzzing approach for extensible compilers like MLIR. By automatically extracting grammars from dialect specifications and using LLMs to generate diverse seed inputs, Germinator bootstraps coverage-guided fuzzing without manual effort. Evaluated on six MLIR projects spanning 91 dialects, it improved line coverage by 10-120% and discovered 88 previously unknown bugs.
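
As a minimal illustration of the grammar-based seed-generation step, the sketch below expands a tiny hand-written context-free grammar into random strings. The grammar is a hypothetical toy, not an actual MLIR dialect grammar, and the function names are ours:

```python
import random

# Toy context-free grammar standing in for one extracted from a
# dialect specification (hypothetical rules, not real MLIR syntax).
GRAMMAR = {
    "<op>": [["<name>", "(", "<args>", ")"]],
    "<name>": [["add"], ["mul"], ["neg"]],
    "<args>": [["<val>"], ["<val>", ",", "<val>"]],
    "<val>": [["%0"], ["%1"], ["%2"]],
}

def generate(symbol="<op>", rng=random):
    """Expand a nonterminal into a random terminal string."""
    if symbol not in GRAMMAR:           # terminal: emit as-is
        return symbol
    production = rng.choice(GRAMMAR[symbol])
    return "".join(generate(s, rng) for s in production)

random.seed(0)
seeds = {generate() for _ in range(20)}
print(sorted(seeds))                    # e.g. strings like "mul(%2,%0)"
```

Seeds produced this way are syntactically valid by construction, which is what lets a coverage-guided fuzzer start from inputs that reach past the parser.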

Constrained Adaptive Rejection Sampling

Paweł Parys, Sairam Vaidya, Taylor Berg-Kirkpatrick, Loris D'Antoni

arXiv Preprint 2025

We present Constrained Adaptive Rejection Sampling (CARS), an approach that strictly improves the sample efficiency of rejection sampling without distributional distortion by adaptively ruling out constraint-violating continuations and ensuring acceptance rates improve monotonically.
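
A heavily simplified sketch of the general idea, on a toy three-token "model" (all names and probabilities are hypothetical). This version only uses the blocklist to reject known-violating prefixes early; the paper's actual algorithm goes further, redistributing the ruled-out probability mass so that acceptance rates rise monotonically while the accepted distribution stays exact:

```python
import random

# Toy next-token distribution standing in for a language model.
VOCAB = {"a": 0.5, "b": 0.3, "c": 0.2}

def violates(prefix):
    return "c" in prefix            # toy constraint: token 'c' is forbidden

def sample_token(rng):
    r, acc = rng.random(), 0.0
    for t, p in VOCAB.items():
        acc += p
        if r <= acc:
            return t
    return t                        # guard against float round-off

def adaptive_rejection(length=4, rng=random):
    """Rejection sampling with an adaptive blocklist: once a prefix is
    known to violate the constraint, any later draw reaching it is
    rejected immediately, without re-running the (possibly expensive)
    constraint check. Accepted samples still follow the exact
    conditional distribution, since only violating mass is ruled out."""
    blocked = set()
    while True:
        prefix = ""
        rejected = False
        for _ in range(length):
            prefix += sample_token(rng)
            if prefix in blocked:   # known dead end: early reject
                rejected = True
                break
            if violates(prefix):    # new violation: remember it
                blocked.add(prefix)
                rejected = True
                break
        if not rejected:
            return prefix
```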

Constrained Sampling for Language Models Should Be Easy: An MCMC Perspective

Emmanuel Anaya Gonzalez*, Sairam Vaidya*, Kanghee Park, Ruyi Ji, Taylor Berg-Kirkpatrick, Loris D'Antoni (* equal contribution)

NeurIPS 2025

We propose a new constrained sampling framework based on Markov Chain Monte Carlo (MCMC) that satisfies constraints, converges monotonically to the true conditional distribution, and generates high-quality samples in few steps.
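
To illustrate the general MCMC perspective (not the paper's framework), here is a textbook Metropolis chain targeting a toy model's distribution conditioned on a constraint. Single-token resampling is a symmetric proposal, so the acceptance ratio is just the ratio of unnormalized target scores; invalid proposals score zero and are never accepted, keeping the chain inside the constraint set. All names and probabilities are hypothetical:

```python
import random

VOCAB = {"a": 0.5, "b": 0.3, "c": 0.2}   # toy next-token distribution
TOKENS = list(VOCAB)
L = 4                                     # fixed sequence length

def valid(x):
    return "c" not in x                   # toy constraint

def score(x):
    """Unnormalized target: base-model probability if valid, else 0."""
    p = 1.0
    for t in x:
        p *= VOCAB[t]
    return p if valid(x) else 0.0

def mcmc(steps=2000, rng=random):
    x = ["a"] * L                         # start from any valid string
    for _ in range(steps):
        i = rng.randrange(L)
        proposal = x[:]                   # symmetric single-token proposal
        proposal[i] = rng.choice(TOKENS)
        # Metropolis acceptance keeps the chain on the constrained target;
        # score() is 0 for invalid proposals, so they are always refused.
        if rng.random() < score(proposal) / score(x):
            x = proposal
    return "".join(x)
```

Every state the chain visits is a valid sample, and longer runs drive the empirical distribution toward the true conditional, which is the property the abstract highlights.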
