Accessing Corpora
If you want to access the corpora that we are using for your fuzz targets (synthesized by the fuzzing engines), follow these steps.
- Obtain access
- Install Google Cloud SDK
- Viewing the corpus for a fuzz target
- Downloading the corpus
- Corpus backups
Obtain access
To get access to a project’s corpora, you must be listed as the primary contact or as an auto cc in the project’s project.yaml file, as described in the New Project Guide. If you aren’t listed there, most of the links below won’t work.
Install Google Cloud SDK
The corpora for fuzz targets are stored on Google Cloud Storage. To access them, you need the gsutil tool, which is part of the Google Cloud SDK. Follow the instructions on the installation page to install the SDK, then log in with the Google account listed in your project’s project.yaml file.
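Before moving on, it can help to confirm that the SDK tools are actually on your PATH. The following is a minimal sketch, not part of the official setup; the function names have_tool and check_corpus_tools are hypothetical helpers introduced here for illustration.

```shell
# Hypothetical helper: succeed if the named command is available on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: print each missing tool and return non-zero
# if any tool needed for corpus access is absent.
check_corpus_tools() {
  missing=0
  for tool in gcloud gsutil; do
    if ! have_tool "$tool"; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

If check_corpus_tools prints nothing, running gcloud auth login lets you sign in, and gcloud auth list shows which account is currently active, so you can compare it with the address in project.yaml.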
Viewing the corpus for a fuzz target
The fuzzer statistics page for your project on ClusterFuzz contains a link to the Google Cloud console for your corpus under the corpus_size column. Click the link to browse and download individual test inputs in the corpus.
Downloading the corpus
If you want to download the entire corpus, click the link in the corpus_size column, then copy the Buckets path at the top of the page:
Copy the corpus to a directory on your machine by running the following command:
$ gsutil -m cp -r gs://<bucket_path> <local_directory>
For example, for the expat_parse_fuzzer corpus in the expat project’s bucket, this would be:
$ gsutil -m cp -r \
gs://expat-corpus.clusterfuzz-external.appspot.com/libFuzzer/expat_parse_fuzzer \
<local_directory>
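If you download corpora often, the command above can be wrapped in a small script. This is a sketch under stated assumptions: corpus_url, download_corpus and the DRY_RUN variable are hypothetical names introduced here, and the bucket name and path must still be copied from the Cloud console as described above.

```shell
# Hypothetical helper: build the gs:// URL for a corpus from a bucket
# name and a path inside that bucket.
corpus_url() {
  bucket=$1
  path=$2
  printf 'gs://%s/%s\n' "$bucket" "$path"
}

# Hypothetical helper: copy a corpus into a local directory, creating
# the directory first. With DRY_RUN=1 (an assumption of this sketch),
# the gsutil command is printed instead of executed.
download_corpus() {
  url=$1
  dest=$2
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "gsutil -m cp -r $url $dest"
    return 0
  fi
  mkdir -p "$dest" && gsutil -m cp -r "$url" "$dest"
}
```

A dry run such as DRY_RUN=1 download_corpus "$(corpus_url <bucket> <path>)" <local_directory> lets you check the command before any data is transferred.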
Corpus backups
We keep daily zipped backups of your corpora. You can access them from the corpus_backup column of the fuzzer statistics page. Downloading a backup can be significantly faster than running gsutil -m cp -r on the corpus bucket.
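Once a backup has been downloaded, it needs to be unpacked before a fuzzer can use it. The sketch below assumes the backup is an ordinary zip archive that the standard unzip tool can read; extract_backup and count_inputs are hypothetical helper names, not part of any official tooling.

```shell
# Hypothetical helper: unpack a downloaded backup archive into a
# directory, creating the directory if needed.
extract_backup() {
  archive=$1
  dest=$2
  mkdir -p "$dest" && unzip -q "$archive" -d "$dest"
}

# Hypothetical helper: count the regular files under a corpus directory,
# as a quick sanity check that the extraction produced test inputs.
count_inputs() {
  find "$1" -type f | wc -l | tr -d ' '
}
```

For instance, extract_backup <backup_file>.zip <local_directory> followed by count_inputs <local_directory> shows how many inputs the backup contained.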