Using C libraries in Rust: make a sys crate

*-sys is a naming convention for crates that help Rust programs use C ("system") libraries, e.g. libz-sys, kernel32-sys, lcms2-sys. The task of a sys crate is to expose a minimal low-level C interface to Rust (FFI) and to tell Cargo how to link with the library. Adding higher-level, more Rust-friendly interfaces for the libraries is left to "wrapper" crates built as a layer on top of the sys crates (e.g. a "rusty-image-app" may depend on high-level "png-rs", which depends on low-level "libpng-sys", which depends on "libz-sys").

Using C libraries in a portable way involves a bit of work: finding the library on the system or building it if it's not available, checking if it is compatible, finding C headers and converting them to Rust modules, and giving Cargo correct linking instructions. Often every step of this is tricky, because operating systems, package managers and libraries have their unique quirks that need special handling.

Fortunately, all this work can be done once in a build script, and published as a <insert library name>-sys Rust crate. This way other Rust programmers will be able to use the C library without having to re-invent the build script themselves.

To make a sys crate:

Read the Cargo build script documentation.
Create a new Cargo crate: cargo new --lib <library name here>-sys
In Cargo.toml add links = "<library name>". This informs Cargo that this crate links with the given C library, and Cargo will ensure that only one copy of the library is linked. Use names without any prefix/suffix (e.g. florp, not libflorp.so). Note that links is only informational and it does not actually link to anything.
Create a build.rs file in the root of the project (or specify build = "<path to build.rs>" in Cargo.toml).

For the build.rs script the general approach is:

Find the library.
Select static or dynamic linking.
Optionally build the library from source.
Expose C headers.

Things that sys crates should NOT do

Don't write any files outside Cargo's dedicated output directory (OUT_DIR). Specifically, do not try to install any packages on the system. If the required library or other dependency is missing, either report an error, or print a cargo:warning and fall back to something else.

Avoid downloading anything. There are packaging and deployment tools that require builds to run off-line in isolated containers. If you want to build the library from source, bundle the source in the Cargo crate.

The Cargo build process is supposed to be entirely self-contained and work off-line.

Finding the library

For Linux & BSD pkg-config is the best option to try first. There's the pkg-config crate for this.
On macOS with Homebrew pkg-config technically works, but is problematic. Executables dynamically linked with pkg-config's libraries are not redistributable and will crash on other machines that don't have Homebrew with the same version of the library installed. When you use pkg-config, default to static linking.
For Windows there's vcpkg, but it's rarely available. pkg-config doesn't work. You may need to search most likely directories "the hard way" (e.g. clang-sys searches C:\Program Files\LLVM). It's best to have a build-from-source fallback (example) to offer a hassle-free crate. Be aware that there are two slightly incompatible toolchains on Windows: msvc (Visual Studio, native) and gnu (MinGW, like reverse of Wine). Detect them with CARGO_CFG_TARGET_ENV. Don't mix them.
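Searching "the hard way" can be sketched with the standard library alone. The helper names and the florp library below are hypothetical. The file-name rule (name.lib for msvc, libname.a otherwise) is the usual convention for static libraries, not something every library follows.

```rust
use std::path::PathBuf;

/// Typical file name of a static library for a given CARGO_CFG_TARGET_ENV value.
/// MSVC toolchains usually use `name.lib`; GNU-style toolchains use `libname.a`.
fn static_lib_file_name(name: &str, target_env: &str) -> String {
    if target_env == "msvc" {
        format!("{name}.lib")
    } else {
        format!("lib{name}.a")
    }
}

/// Returns the first candidate directory that actually contains `file_name`.
fn find_lib_dir(candidates: &[PathBuf], file_name: &str) -> Option<PathBuf> {
    candidates
        .iter()
        .find(|dir| dir.join(file_name).is_file())
        .cloned()
}
```

In build.rs the candidates would be a short list of likely install locations, and the target_env argument would come from the CARGO_CFG_TARGET_ENV variable rather than from cfg!().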

Finally, support overriding library location with <LIBRARY_NAME>_PATH or <LIBRARY_NAME>_LIB_DIR environmental variable (example). This is necessary in some cases, such as cross-compilation and custom builds of the library (e.g. with custom features enabled, or if the library installed in /lib is 6 years old).

Once you know the directory, tell Cargo to use it by printing cargo:rustc-link-search=native=<dir>. Helper crates like pkg-config may do this for you.
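A minimal sketch of the override and the link-search directive, meant to be called from main() in build.rs. FLORP_LIB_DIR is a hypothetical variable name following the <LIBRARY_NAME>_LIB_DIR pattern, and treating an empty value as "not set" is an assumption of this sketch.

```rust
use std::ffi::OsString;
use std::path::{Path, PathBuf};

/// Turns the raw value of the override variable into a directory, if any.
/// An empty value is treated as "not set" (an assumption of this sketch).
fn lib_dir_override(value: Option<OsString>) -> Option<PathBuf> {
    value.filter(|v| !v.is_empty()).map(PathBuf::from)
}

/// The directive that tells Cargo where to search for native libraries.
fn link_search_line(dir: &Path) -> String {
    format!("cargo:rustc-link-search=native={}", dir.display())
}

/// Prints the directives and returns the overridden directory, if one was given.
fn emit_lib_dir_override() -> Option<PathBuf> {
    // Ask Cargo to re-run the build script when the user changes the override.
    println!("cargo:rerun-if-env-changed=FLORP_LIB_DIR");
    let dir = lib_dir_override(std::env::var_os("FLORP_LIB_DIR"))?;
    println!("{}", link_search_line(&dir));
    Some(dir)
}
```

When emit_lib_dir_override() returns None, the build script would continue with pkg-config or another discovery method.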

Select static or dynamic linking

You select how to link the library by printing cargo:rustc-link-lib=<name> or cargo:rustc-link-lib=static=<name>. Because most users won't configure your crate (it's likely to be a dependency of a dependency of a dependency…), you must have good, safe defaults:

Linux & BSD (except the musl target) — use dynamic linking by default. You can expect the program to be packaged as RPM/deb, which will ensure these libraries are installed for you in the right places. For the musl target default to everything static, since it's mainly used for making entirely self-contained Linux executables.
macOS — use static linking by default unless you're writing a sys crate for a library shipped with macOS. On macOS there's a strong expectation for programs to "just work" without installation of any dependencies. Dynamic linking option may be useful for Rust programs that are later bundled in an app/framework bundle, installer package, or a Homebrew formula, but in Cargo that's not the default and it's complicated to set up, so it should be opt-in. macOS also supports dynamic linking with ObjC frameworks with cargo:rustc-link-lib=framework=….
Windows — use static linking by default unless you're writing a sys crate for a library shipped with Windows. Dynamic linking option may be useful for applications having installers.
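The defaults in the list above could be encoded roughly as follows. This is only a sketch: the type and function names are made up, the inputs are the values of CARGO_CFG_TARGET_OS and CARGO_CFG_TARGET_ENV, and platforms the list doesn't mention fall through to dynamic linking as a placeholder. It also ignores the "library shipped with the OS" exception.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LinkKind {
    Static,
    Dynamic,
}

/// Default linking mode for a target, following the list above.
fn default_link_kind(target_os: &str, target_env: &str) -> LinkKind {
    match (target_os, target_env) {
        // musl is mainly used for fully self-contained executables.
        (_, "musl") => LinkKind::Static,
        ("macos", _) | ("windows", _) => LinkKind::Static,
        // Linux, BSD, and (as a placeholder) everything else.
        _ => LinkKind::Dynamic,
    }
}

/// The directive a build script prints to select the linking mode.
fn link_lib_line(kind: LinkKind, name: &str) -> String {
    match kind {
        LinkKind::Static => format!("cargo:rustc-link-lib=static={name}"),
        LinkKind::Dynamic => format!("cargo:rustc-link-lib={name}"),
    }
}
```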

Note that the pkg-config crate has a .statik() option (spelled that way because static is a reserved word in Rust), but it often doesn't do anything. You may need to verify the result (Linux: ldd, macOS: otool -L, Windows: dumpbin /dependents) and work around it.

As for the configuration itself, there are two options, both somewhat flawed:

Cargo features

In Cargo.toml you can have [features] section with static and dynamic options.

[features]
static = []
dynamic = []

Don't make either of them a default feature, because it's too hard to unset defaults in Cargo.

Pro: The features are easy to set by other crates. It's possible to configure the build entirely via Cargo.toml.

Con: Cargo features can only be set, and never unset. Once any crate anywhere makes the choice, it won't be possible to override it. Also Cargo doesn't support mutually-exclusive features, so your build.rs script will have to deal with both static and dynamic set at the same time.

Environmental variables

You can check <LIBRARY_NAME>_STATIC environmental variable to see if the library should be linked statically.

Pro: Top-level project can easily override linking, even if the sys crate is a deeply nested dependency.

Con: Cargo doesn't help managing env vars, so the proper build will require extra scripts/tools besides Cargo.

Ideally, you can support both, with env vars taking precedence over features. This allows convenient features in simple cases, and env var as a fallback for cases where features fail.
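One possible way to combine the two, sketched under a few assumptions: FLORP_STATIC is a hypothetical variable name, the values "0", "false" and the empty string are taken to mean "dynamic", and static wins when both features are enabled at once (an arbitrary tie-break, since Cargo can't prevent that combination).

```rust
/// Interprets the value of the <LIBRARY_NAME>_STATIC variable.
/// Which strings count as "off" is an assumption of this sketch.
fn parse_static_env(value: &str) -> bool {
    !matches!(value.trim(), "" | "0" | "false")
}

/// The env var takes precedence, then Cargo features, then the platform default.
fn want_static(
    env_value: Option<&str>,
    feature_static: bool,
    feature_dynamic: bool,
    default_static: bool,
) -> bool {
    if let Some(value) = env_value {
        return parse_static_env(value);
    }
    match (feature_static, feature_dynamic) {
        (true, _) => true, // also covers both features being set
        (false, true) => false,
        (false, false) => default_static,
    }
}

/// Reads the real inputs; intended to be called from main() in build.rs.
fn want_static_from_env(default_static: bool) -> bool {
    println!("cargo:rerun-if-env-changed=FLORP_STATIC");
    let env_value = std::env::var("FLORP_STATIC").ok();
    want_static(
        env_value.as_deref(),
        // Cargo sets CARGO_FEATURE_<NAME> for every enabled feature.
        std::env::var_os("CARGO_FEATURE_STATIC").is_some(),
        std::env::var_os("CARGO_FEATURE_DYNAMIC").is_some(),
        default_static,
    )
}
```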

Build from source

If the library is unlikely to be installed on the system by default, especially when you support Windows, it's nice to automatically build it from source (and link statically).

It's a massive usability improvement for users, because they can always run cargo build and get something working, instead of getting errors, searching for packages, installing dependencies, setting up search paths, etc.

Downloading of the source is tricky, so it's best to avoid it. The build script would need to depend on libraries for HTTP + TLS + unarchiving, which themselves may be bigger and more problematic to build than just bundling sources with the sys crate. Some users require builds to work off-line. So unless the library's sources are really huge, just copy them to the Rust crate (and make sure to use a compatible license for your sys crate).

To avoid having a duplicate copy of someone else's source code in your crate's git repository, you can add the C sources as a git submodule. Publishing to crates.io will copy it as a regular directory, so your crate's users won't even know it was a submodule.

git submodule add https://example/third-party.git vendor/

During development use cargo build -vv to see output from the build script. You can print cargo:warning=… to make messages user-visible. Include the name of your crate in warnings and errors, because your crate is likely to end up buried several layers deep in someone else's project.
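A small helper for such warnings might look like the sketch below. The function names and the florp-sys fallback name are illustrative. Newlines are flattened because each directive is read as a single line of the build script's output.

```rust
/// Formats a `cargo:warning=` line prefixed with the crate name.
/// Multi-line messages are flattened into one line.
fn warning_line(crate_name: &str, message: &str) -> String {
    let one_line = message
        .lines()
        .map(str::trim)
        .collect::<Vec<_>>()
        .join(" ");
    format!("cargo:warning=[{crate_name}] {one_line}")
}

/// Prints a user-visible warning; intended to be called from build.rs.
fn warn(message: &str) {
    // Cargo sets CARGO_PKG_NAME when it runs the build script.
    let name = std::env::var("CARGO_PKG_NAME").unwrap_or_else(|_| "florp-sys".to_string());
    println!("{}", warning_line(&name, message));
}
```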

For building you have two options:

Use the original build system of the library

You assume that the required build system (such as make, autotools, cmake) is already installed, and run it. There are crates like cmake and make-cmd that help with this a little bit.

You may need to translate Cargo's environmental variables into appropriate options for the build system (e.g. libgit2, libcurl) to control output directories, optimization level, debug symbols, and enable -fPIC (Rust always wants -fPIC).

Autotools (./configure) supports "out-of-tree builds". Make it use a subdirectory of OUT_DIR, so that cargo clean will also clean the temp files of the C build.
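As a rough sketch of both points, the helpers below translate Cargo's OPT_LEVEL and DEBUG values into CFLAGS and prepare an out-of-tree ./configure run. The helper names and the install subdirectory are made up, only the standard --prefix option is passed, and any DEBUG value other than "false" or "0" is treated as a request for debug info (a simplification).

```rust
use std::path::Path;
use std::process::Command;

/// Maps the values of Cargo's OPT_LEVEL and DEBUG variables to C compiler flags.
fn cflags(opt_level: &str, debug: &str) -> String {
    // Rust always wants position-independent code.
    let mut flags = vec![format!("-O{opt_level}"), "-fPIC".to_string()];
    if debug != "false" && debug != "0" {
        flags.push("-g".to_string());
    }
    flags.join(" ")
}

/// Prepares `<src_dir>/configure` to run inside `build_dir`, which should be
/// a subdirectory of OUT_DIR. `src_dir` should be an absolute path, because
/// the command runs with a different working directory.
fn configure_command(src_dir: &Path, build_dir: &Path, cflags: &str) -> Command {
    let mut cmd = Command::new(src_dir.join("configure"));
    cmd.current_dir(build_dir)
        .arg(format!("--prefix={}", build_dir.join("install").display()))
        .env("CFLAGS", cflags);
    cmd
}
```

The build script would create build_dir first, run the command with status(), and fail with a message naming the crate if the exit status is not a success.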

Pro: You stick to the documented way of building the library and don't peek inside.

Con: If the build fails, the extra indirection makes it even harder to diagnose and fix. Very likely to be painful on Windows.

Replace the build system using cc

Replacing the build system seems like a terrible idea inviting a lot of maintenance work. However, for a lot of libraries all the complexity of the build system exists only to handle differences between operating systems and broken compilers (which the cc crate handles) and searching for the library's own dependencies (which other *-sys crates can do for you).

In practice it often turns out that just giving the list of .c files to the cc crate, with one or two define()s, is enough to build the code properly! Try running make --dry-run VERBOSE=1 to see the files and macros needed.

If you need to make a config.h file for the library, don't modify the source directory. Instead, write the config header to the OUT_DIR and set the out dir in include path first.
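Generating the header could look like the following sketch. The function name, the generated-include subdirectory and the example defines are hypothetical; the real set of defines depends entirely on the library.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes `config.h` under `out_dir` and returns the directory that should
/// come first on the include path.
fn write_config_header(out_dir: &Path, defines: &[(&str, &str)]) -> io::Result<PathBuf> {
    let include_dir = out_dir.join("generated-include");
    fs::create_dir_all(&include_dir)?;

    let mut text = String::from("/* Generated by build.rs; do not edit. */\n");
    for (name, value) in defines {
        text.push_str(&format!("#define {name} {value}\n"));
    }
    fs::write(include_dir.join("config.h"), text)?;
    Ok(include_dir)
}
```

In build.rs, out_dir would be the value of the OUT_DIR variable, and the returned directory would be passed to the cc builder's include() before any other include directories.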

Pro: The cc crate handles integration with Cargo, even for cross-compilation. It also handles things you'd rather not do, like reading Windows Registry to find a working copy of Visual Studio.

Con: It's easy for small/medium and mature projects, but may be daunting for big and fast-moving projects.

What about closed-source libraries?

Unfortunately, Cargo crates (and crates.io) are not suited for distribution of binaries. Use another package manager (e.g. apt/RPM, chocolatey) to distribute a pre-compiled shared library (cdylib) and in the sys crate expect it to be pre-installed.

Customization

C libraries often control enabling/disabling features via #define FOO_SUPPORTED. It's a good idea to translate this into Cargo features. If the C library has some features enabled by default, set Cargo's default features the same way.

[features]
default = ["foo"]
foo = []
bar = []

// In build.rs: `cc` is a builder from the cc crate.
let mut cc = cc::Build::new();
if cfg!(feature = "foo") {
    cc.define("FOO_SUPPORTED", Some("1"));
}
if cfg!(feature = "bar") {
    cc.define("BAR_SUPPORTED", Some("1"));
}

Header files

There are two places where you may need to expose C library's header (.h) files:

To C code in other -sys crates (optional).
To Rust code using your sys crate.

The first case is simpler. Make sure you have links in your Cargo.toml:

[package]
name = "foobar-sys"
links = "foobar"

Print cargo:include=/path/to/include with the directory where the library's own .h files are (cargo:include is not a special name, you can use any cargo:<name>=<value> to provide more information). Use join_paths/split_paths to list multiple directories.

println!("cargo:include={}", absolute_path_to_include_dir.display());

If you need to make a relative path absolute, use dunce::canonicalize(), because on Windows fs::canonicalize() returns \\?\-prefixed paths that many C tools and build systems can't handle.

In your crate's documentation instruct others to read DEP_<your links name>_INCLUDE env variable if they need the headers (e.g. libz → libpng):

cc.include(env::var_os("DEP_FOOBAR_INCLUDE").expect("Oops, DEP_FOOBAR_INCLUDE should have been set"));
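For more than one include directory, the join_paths/split_paths round trip mentioned above might be sketched like this. The helper names are made up, and the lossy string conversion assumes the paths are valid UTF-8.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Publishing side: joins several include directories into one value
/// for `println!("cargo:include={}", value)`.
fn include_value(dirs: &[PathBuf]) -> Result<String, env::JoinPathsError> {
    let joined = env::join_paths(dirs)?;
    // Assumes UTF-8 paths; other bytes would be replaced lossily.
    Ok(joined.to_string_lossy().into_owned())
}

/// Consuming side: splits the value of DEP_<links name>_INCLUDE back
/// into individual directories.
fn parse_include_value(value: &OsStr) -> Vec<PathBuf> {
    env::split_paths(value).collect()
}
```

join_paths returns an error if a directory name contains the platform's path separator character, so that case is reported rather than silently producing a broken list.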

Bindgen

For Rust you will need to translate C headers into a Rust module containing extern "C" {} declarations. This can be done automatically with bindgen, but there are some things to consider.

Bindgen has an option to translate C enum to Rust enum. It's nice to have enums, but Rust has extra requirements: the enum must contain only valid values at all times. That's a guarantee in safe Rust. If the C side violates it, the result is undefined behavior in Rust: it can "poison" the Rust code and crash it or make it misbehave. If the definition says enum Foo {A=1, B=2}, but C somewhere returns (enum Foo)3, it can't be a Rust enum.
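A hand-written sketch of the safer alternative: keep an integer wrapper at the FFI boundary, where any value is valid, and convert to a real Rust enum only through a checked conversion. The type names are illustrative, and c_int as the underlying type is an assumption (the C compiler chooses the actual type of an enum).

```rust
use std::convert::TryFrom;
use std::os::raw::c_int;

/// FFI-facing type: any integer the C side returns is a valid value.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FooRaw(pub c_int);

impl FooRaw {
    pub const A: FooRaw = FooRaw(1);
    pub const B: FooRaw = FooRaw(2);
}

/// Rust-side enum, only ever created through the checked conversion below.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Foo {
    A = 1,
    B = 2,
}

impl TryFrom<FooRaw> for Foo {
    /// The unexpected raw value is handed back to the caller.
    type Error = c_int;

    fn try_from(raw: FooRaw) -> Result<Self, Self::Error> {
        match raw.0 {
            1 => Ok(Foo::A),
            2 => Ok(Foo::B),
            other => Err(other),
        }
    }
}
```

With this split, a stray (enum Foo)3 from C becomes an ordinary Err(3) instead of an invalid enum value.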

Macros, inline functions, C++

If C headers use inline functions, you can use Citrus to translate function bodies. Macros containing code and C++ templates need to be translated by hand (e.g. macro → fn) or wrapped in C functions in your crate and compiled to a private static library. Bindgen supports a subset of C++, but you may need to write a C wrapper for C++ classes (example).
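To illustrate the macro → fn translation, suppose a header contained the two function-like macros quoted in the comments below. Both macros and the florp names are invented for this example.

```rust
use std::os::raw::c_uint;

/// Hand translation of the hypothetical C macro
/// `#define FLORP_MAKE_VERSION(major, minor) (((major) << 16) | (minor))`
#[inline]
pub const fn florp_make_version(major: c_uint, minor: c_uint) -> c_uint {
    (major << 16) | minor
}

/// Hand translation of the hypothetical C macro
/// `#define FLORP_VERSION_MAJOR(v) ((v) >> 16)`
#[inline]
pub const fn florp_version_major(version: c_uint) -> c_uint {
    version >> 16
}
```

Unlike the macros, the functions have fixed argument types, so the translation has to pick the integer type that the C callers actually use.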

Stable ABI?

There's a question whether you run bindgen once and ship that file, or whether you run bindgen every time the project is built. It depends on how incompatible different versions of the C library may be.

If the C library has a stable and portable ABI: new versions only add new functions and everything is backwards compatible, then you can pre-generate. It will be faster to build (no dependency on bindgen and clang), and you can even tweak the built file manually. Make sure it works for both 32 and 64-bit architectures (usually #[repr(C)] just works, but you'll need to disable generation of bindgen's layout tests, because they're architecture-specific).

If there are different, incompatible versions of the library in the wild, then you need to use bindgen as a Rust library and run it from your build.rs to generate fresh bindings for every user. The generated file must be included somewhere in your crate's lib.rs. Use include!(concat!(env!("OUT_DIR"),"/filename.rs"));.

Different major versions

If differences between major versions of the C library are small (e.g. only new functions added, or just a couple of struct fields changed), then you could try to automatically adapt to the version or use Cargo features to enable the new features (e.g. mozjpeg-sys supports different ABI versions, clang-sys has features for LLVM versions).

If versions of the C library are totally different and incompatible, then either have separate crates (foo1-sys & foo2-sys) or at least use different major versions of your sys crate for different major versions of the C library, so that Cargo will know that they're not compatible.

Cross-compilation

Rust can build executables and libraries for systems other than the one it's running on, e.g. build Linux programs on macOS, or build 32-bit libraries on a 64-bit OS.

Your build.rs program may be running on a different architecture than the one being compiled. This means that all your size_of checks and #[cfg]/cfg!() macros in build.rs may be for a wrong machine! For querying target system or CPU use CARGO_CFG_TARGET_OS/CARGO_CFG_TARGET_ARCH/CARGO_CFG_TARGET_POINTER_WIDTH env vars instead (run rustc --print cfg for a full list). The only exception is cfg(feature = "…") checks, which are safe to use in cross-compilation.

pkg-config will automatically bail out when it detects cross-compilation (when env var HOST != TARGET). If you're searching for libraries on disk in other ways, also keep in mind that the host system may not be compatible with the target being built.
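A sketch of querying the target instead of the host, using the variables named above. The helper names are made up; the pure functions take the variable values as arguments so they can be checked without running under Cargo.

```rust
/// Pointer size of the *target* in bytes, from the value of
/// CARGO_CFG_TARGET_POINTER_WIDTH (e.g. "32" or "64").
/// `std::mem::size_of::<usize>()` in build.rs would describe the host instead.
fn target_pointer_bytes(pointer_width: &str) -> Option<usize> {
    pointer_width.parse::<usize>().ok().map(|bits| bits / 8)
}

/// Compares the values of the HOST and TARGET variables set by Cargo.
fn is_cross_compiling(host: &str, target: &str) -> bool {
    host != target
}

/// Reads the real variables; returns None when not run by Cargo.
fn describe_target() -> Option<String> {
    let os = std::env::var("CARGO_CFG_TARGET_OS").ok()?;
    let arch = std::env::var("CARGO_CFG_TARGET_ARCH").ok()?;
    let width = std::env::var("CARGO_CFG_TARGET_POINTER_WIDTH").ok()?;
    let bytes = target_pointer_bytes(&width)?;
    Some(format!("{os}/{arch}, {bytes}-byte pointers"))
}
```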

Linking surprises

Write tests in your sys crate's lib.rs to reference as many C symbols as you can. Linkers often work "lazily" and won't notice any problems with the library unless it's actually used from Rust.
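Such a test can be as simple as calling one cheap function. In the sketch below strlen from the C runtime stands in for a function of your library, only because it is already linked into every Rust program on common platforms; a real sys crate would reference its own symbols. The block is written as unsafe extern "C", which needs Rust 1.82 or newer; older compilers take a plain extern "C" block instead.

```rust
use std::os::raw::c_char;

// Stand-in declaration. In a real sys crate these come from bindgen
// or are written by hand for the wrapped library.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Calls through to C, so the linker has to resolve the symbol.
pub fn c_strlen_of_hello() -> usize {
    // Safety: the byte string literal is NUL-terminated and lives for the whole call.
    unsafe { strlen(b"hello\0".as_ptr().cast()) }
}

#[cfg(test)]
mod link_tests {
    #[test]
    fn links_strlen() {
        assert_eq!(super::c_strlen_of_hello(), 5);
    }
}
```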

In external tests (in tests/ directory) and other crates make sure to include extern crate <your lib>_sys;. The C library won't be linked unless the crate is actually referenced from code, even if it's set as a dependency in Cargo.toml! (In the 2018 edition and later, a use <your lib>_sys; serves the same purpose.)

Documentation

Have a good README (and the readme key in Cargo.toml) with clearly stated requirements and configuration options (especially env vars).

However, don't bother documenting individual FFI functions in Rust. Sys crates by definition don't change behavior of the C library and don't add anything that isn't already in the C version, so for function-specific information send users to the original C documentation (e.g. libc intentionally doesn't document any function). If you want to make the library easier to use, it's better to spend effort on making a second crate with a higher-level interface.

Bus factor 1

Nobody can be expected to support their crate 24/7 — forever. From time to time crate authors become unavailable, but their crates need an update (e.g. urgent security or compatibility fixes). It's a huge pain for users.

When you publish your crate on crates.io, consider adding someone else as a co-owner of your crate. There's a "Manage owners" link on the crate's page, or you can add your GitHub team. If you can't think of anyone, add me (kornelski).

Thanks to Michael Bryan, Mark Summerfield and other rustaceans for their feedback.
