Fuzz4All: Universal Fuzzing with Large Language Models

This is my personal note about the paper. https://arxiv.org/abs/2308.04748

Abstract

The authors propose Fuzz4All, a new fuzzing approach that uses LLMs. It addresses a limitation of existing fuzzers: each one targets a specific language, so it is hard to apply to other languages or even to other versions of the same language. Fuzz4All found 98 bugs in GCC/G++, Clang/Clang++, Z3, CVC5, Go, javac, and Qiskit.

Objective

Develop a universal fuzzing tool that can target any input language and any system.

Contributions

Universal Fuzzing
Autoprompting for Fuzzing
LLM-powered fuzzing loop
Evidence of real-world effectiveness

Methods

Autoprompting for Fuzzing

The user provides documentation of the SUT, example code snippets, or specifications, and an LLM samples multiple candidate prompts from this input.
The LLM generates multiple code snippets from each candidate prompt.
The code snippets are tested and scored, and the best-scoring prompt is selected as the input prompt.
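
The three autoprompting steps above can be sketched roughly as follows. This is only an illustration for my own understanding, not the paper's implementation: the names autoprompt, distill, generate, and is_valid are hypothetical, the LLM calls are replaced by canned stubs, and scoring a prompt by its number of unique valid snippets is an assumption made for this sketch.

```python
from typing import Callable, List


def autoprompt(
    user_input: str,
    distill: Callable[[str], str],         # hypothetical: LLM call returning one candidate prompt
    generate: Callable[[str], List[str]],  # hypothetical: LLM call returning code snippets
    is_valid: Callable[[str], bool],       # hypothetical: checks a snippet against the SUT
    num_candidates: int = 4,
) -> str:
    """Return the candidate prompt whose generated snippets score best."""
    # Step 1: sample several candidate prompts from the user input.
    candidates = [distill(user_input) for _ in range(num_candidates)]

    # Steps 2 and 3: generate snippets per candidate and score them.
    # Assumed scoring: the number of unique snippets that pass is_valid.
    def score(prompt: str) -> int:
        return len({s for s in generate(prompt) if is_valid(s)})

    return max(candidates, key=score)


def demo_autoprompt() -> str:
    """Run autoprompt with canned stubs instead of real LLM calls."""
    canned_prompts = iter(["prompt A", "prompt B", "prompt C"])
    canned_snippets = {
        "prompt A": ["int x;", "int x;", "oops"],    # 1 unique valid snippet
        "prompt B": ["int x;", "int y;", "int z;"],  # 3 unique valid snippets
        "prompt C": ["oops", "oops"],                # 0 valid snippets
    }
    return autoprompt(
        user_input="SUT documentation text",
        distill=lambda _doc: next(canned_prompts),
        generate=lambda prompt: canned_snippets[prompt],
        is_valid=lambda snippet: snippet.endswith(";"),
        num_candidates=3,
    )


print(demo_autoprompt())  # prompt B
```

With these stubs, "prompt B" wins because it yields the most unique snippets that pass the validity check.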

Fuzzing Loop

The LLM generates fuzzing inputs from the input prompt, and they are tested on the SUT.
Select a previously generated fuzzing input as an example.
Select a strategy for the next input: "generate-new", "mutate-existing", or "semantic-equiv".
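
The loop above could look roughly like the sketch below. Again, this is a simplified illustration under my own assumptions, not the paper's code: fuzzing_loop, generate, and run_sut are hypothetical names, the way the example and the strategy are appended to the prompt is made up, and both are picked uniformly at random here.

```python
import random
from typing import Callable, List

STRATEGIES = ("generate-new", "mutate-existing", "semantic-equiv")


def fuzzing_loop(
    input_prompt: str,
    generate: Callable[[str], List[str]],  # hypothetical: LLM call returning fuzzing inputs
    run_sut: Callable[[str], bool],        # hypothetical: True if the input triggers a failure
    iterations: int,
    rng: random.Random,
) -> List[str]:
    """Return every generated input for which run_sut reported a failure."""
    failures: List[str] = []
    prompt = input_prompt
    for _ in range(iterations):
        # Step 1: generate fuzzing inputs and test each one on the SUT.
        batch = generate(prompt)
        failures.extend(snippet for snippet in batch if run_sut(snippet))
        if not batch:
            prompt = input_prompt  # nothing to build on, fall back to the base prompt
            continue
        # Step 2: pick a previously generated input as the example.
        example = rng.choice(batch)
        # Step 3: pick a strategy and build the prompt for the next iteration.
        strategy = rng.choice(STRATEGIES)
        prompt = f"{input_prompt}\n{example}\n# strategy: {strategy}"
    return failures


def demo_fuzzing_loop() -> List[str]:
    """Run the loop with a fake generator and a fake SUT."""
    calls = {"n": 0}

    def fake_generate(_prompt: str) -> List[str]:
        calls["n"] += 1
        return [f"input_{calls['n']}_ok", f"input_{calls['n']}_crash"]

    return fuzzing_loop(
        input_prompt="Generate C programs",
        generate=fake_generate,
        run_sut=lambda snippet: snippet.endswith("crash"),
        iterations=3,
        rng=random.Random(0),
    )


print(demo_fuzzing_loop())  # ['input_1_crash', 'input_2_crash', 'input_3_crash']
```

The base input prompt stays fixed; only the appended example and strategy change between iterations.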

Results

Fuzz4All achieved 36.8% higher coverage on average than the baseline fuzzing tools.
Found 98 bugs, 64 of which were confirmed by developers.

Interesting points

Qiskit (a quantum computing platform) is one of the targets.
Fuzz4All is only 872 LoC (Lines of Code).

Phrase

This paper presents Fuzz4All, the first fuzzer that is universal in the sense that it can target many different input languages and many different features of these languages.