chore(README.md): write proper README documentation

Matteo Settenvini 2025-09-08 16:49:07 +02:00
parent 177d252c09
commit 1881ea23c1

@ -2,8 +2,71 @@
[//]: # SPDX-License-Identifier: CC-BY-SA-4.0
# 🧹 Sysroot Cleaner
A tool to clean up sysroots for Linux embedded devices in order to save storage space.
Here, by sysroot we mean the final _target system_ filesystem, rather than the _staging folder_ that may contain intermediate cross-compilation byproducts. Note: the tool only works on files belonging to the same filesystem. This is a design choice.
## What does it do?
_Sysroot cleaner_ is a simple tool used to remove unnecessary files from a target folder that holds the filesystem of an ELF-based OS (such as Linux). This can be, for instance, a cross-compiled device target tree or a folder being prepared for a local chroot jail. It recurses across all subfolders that are **part of the same filesystem** and looks for files that can be safely removed to reduce space usage.
The full list of found files is passed to a few modules (aka "cleaners") that can decide whether to keep or remove a specific file. These are:
* **dso**: maps all ELF files and their library dependencies to a directed acyclic graph. Each library is removed, transitively, if it is unreachable from any executable binary. **Note**: Libraries that are dynamically opened at runtime need to be manually allow-listed. If there is interest, we might support [.note.dlopen](https://github.com/systemd/systemd/blob/main/docs/ELF_DLOPEN_METADATA.md) as it gains more widespread adoption.
* **allow-/block-list**: given a file of [gitignore patterns](https://git-scm.com/docs/gitignore#_pattern_format), either mark the file for keeping (if in the allowlist) or for removal (if in the blocklist).
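As a rough illustration of how the two list files might look, the sketch below writes a small blocklist and allowlist using gitignore pattern syntax. The file names match the ones used in the example script further down, but every pattern and the `myapp` plugin path are hypothetical, and it is an assumption that patterns are matched relative to the sysroot root.

```bash
#!/bin/bash
# Sketch only: the patterns and the "myapp" path are hypothetical examples.
# LIST_DIR is a placeholder for wherever you keep your list files.
LIST_DIR=${LIST_DIR:-.}

# Files to remove: documentation, headers, and static/libtool archives
# are typically not needed on a running target.
cat > "${LIST_DIR}/base.blocklist" <<'EOF'
# Documentation and development files
/usr/share/man/**
/usr/share/doc/**
/usr/include/**
*.a
*.la
EOF

# Files to keep: plugins loaded via dlopen() are not visible to the dso
# cleaner, so they have to be allow-listed by hand.
cat > "${LIST_DIR}/base.allowlist" <<'EOF'
# Plugins opened at runtime
/usr/lib/myapp/plugins/*.so
EOF
```

Since the allowlist takes precedence over other removal decisions, a plugin matched by both lists would be kept.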
## Command-line Options
Usage: `sysroot-cleaner [option…] <sysroot>`, where the mandatory `<sysroot>` argument is the path to the root of the sysroot to clean up.
Options can be:
* `-n`, `--dry-run`: Simulate operations without carrying them out.
* `--split-to <dir>`: Instead of simply removing files, move them to the given location, preserving their relative folder structure.
* `--allowlist <file>`: An allowlist of files to keep, in `.gitignore` format. Can be passed multiple times. **Note**: this will take precedence over all other removal decisions.
* `--blocklist <file>`: A blocklist of files to remove, in `.gitignore` format. Can be passed multiple times.
* `--output-dotfile <file>`: An optional path to save the file graph of the DSO cleaner in GraphViz format. Useful for debugging.
* `--ld-path <dir>`: An additional path to consider when resolving libraries, relative to the sysroot root. Its behavior is similar to that of the `LD_LIBRARY_PATH` environment variable as interpreted by the dynamic linker.
The log level can be controlled via the `LOG_LEVEL` environment variable, and can be one of: `error`, `warn`, `info`, `debug`, `trace`, or `off` (run completely silent).
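One cautious way to combine these options is to simulate the cleanup first and only then perform it, moving the removed files aside rather than deleting them. The helper below is a minimal sketch of that idea; the function name `preview_then_split` and both directory arguments are hypothetical, and only flags documented above are used.

```bash
#!/bin/bash
# Sketch only: preview_then_split is a hypothetical helper, not part of the tool.
preview_then_split() {
    local sysroot=$1      # root of the sysroot to clean
    local removed_dir=$2  # where removed files are moved to

    # Step 1: simulate the cleanup with verbose logging; nothing is changed.
    LOG_LEVEL=debug sysroot-cleaner --dry-run "${sysroot}"

    # Step 2: run for real, moving removed files to removed_dir
    # (their relative folder structure is preserved by --split-to).
    LOG_LEVEL=info sysroot-cleaner --split-to "${removed_dir}" "${sysroot}"
}
```

It could be called as `preview_then_split ./output/target ./output/removed`. In practice you would probably review the dry-run log before letting the second step run.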
## Example Usage
Assume that you have built a filesystem image, for instance through a tool like [buildroot](https://buildroot.org/downloads/manual/manual.html).
You could add a simple shell script to invoke `sysroot-cleaner`:
```bash
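#!/bin/bash
# file: post_build.sh
set -e -o pipefail

# Directory containing this script (and the list files next to it)
SCRIPT_DIR=$(realpath "$(dirname "$0")")
readonly SCRIPT_DIR
# The rootfs folder is expected as the first argument
readonly TARGET_DIR=$1

if [ ! -d "${TARGET_DIR}" ]; then
    echo "Expecting the rootfs folder as first argument" >&2
    exit 1
fi

# Base lists
allow_lists=("${SCRIPT_DIR}/base.allowlist")
block_lists=("${SCRIPT_DIR}/base.blocklist")

# Build the arguments as an array so that paths containing spaces are preserved
args=()
for list in "${allow_lists[@]}"; do args+=(--allowlist "${list}"); done
for list in "${block_lists[@]}"; do args+=(--blocklist "${list}"); done

LOG_LEVEL=info sysroot-cleaner "${args[@]}" "${TARGET_DIR}"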
```
Then, you can set `BR2_ROOTFS_POST_BUILD_SCRIPT` to invoke `post_build.sh`.
## Changelog
### v1.0.0
* Initial stable release.