The elephant in the room: Rust interop with C++

- Are you stuck living in a C++ world even if you didn't choose it? (cough VFX cough)
- Do you prefer the ergonomics of cargo add to those of apt-get install some-random-dev-header until things work?
- Do you like a language whose motto is to "empower everyone to build reliable and efficient software"?
Then follow me down this beautiful 🦀-shaped hole...
A wor(l)d of advice
I'm not a Rust expert, and I'm definitely not a C++ or C expert. I'm just a Rustacean having fun. My knowledge of C++ comes mostly from hacking around in OpenFrameworks, so that should say it all.
Introduction
To keep it simple but meaningful, let's imagine that we have a list of points and a target point, and we want to know which point in the list is closest to the target.
Since the world is full of beautiful Rust libraries that solve this problem we'll just pick one of them.
And because we're stuck in a C++ world, we'll want to deliver "something" that integrates our new Rust solution into the existing C++ project. This moves us forward in our goal of potentially writing new small bits of the project in Rust, without forcing the whole project to pivot to a whole new language.
Rust crate
Let's create a new library crate and add kiddo, one of the top-ranked crates related to KD-Trees (after searching on https://crates.io/search?q=kdtree&sort=downloads):
cargo new --lib point-lookup
cd point-lookup
cargo add kiddo
Now let's write the actual Rust function to return the closest point.
Given the APIs of kiddo, and given that we'll need to interop with C++ (or C), I like to keep things simple and return the index of the closest point.
A first pass for the lib.rs file might look like this:
#[derive(Debug, Copy, Clone)]
pub struct Point {
    x: f32,
    y: f32,
    z: f32,
}

pub fn find_nearest(points: &[Point], target: &Point) -> u64 {
    let mut tree = kiddo::KdTree::new();
    for (i, point) in points.iter().enumerate() {
        tree.add(&[point.x, point.y, point.z], i as u64);
    }
    let nearest_one = tree.nearest_one::<kiddo::SquaredEuclidean>(&[target.x, target.y, target.z]);
    nearest_one.item
}
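Before trusting the tree, it can help to have something dumb to compare it against. Here is a minimal sketch of a brute-force version, assuming the same `Point` shape (fields made `pub` here only to keep the sketch easy to poke at). The name `find_nearest_brute_force` is mine, not part of kiddo, and it uses only the standard library:

```rust
/// Same shape as the article's `Point` (fields are `pub` only for this sketch).
#[derive(Debug, Copy, Clone)]
pub struct Point {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// Squared Euclidean distance between two points.
fn squared_distance(a: &Point, b: &Point) -> f32 {
    let (dx, dy, dz) = (a.x - b.x, a.y - b.y, a.z - b.z);
    dx * dx + dy * dy + dz * dz
}

/// Hypothetical helper: linear scan over all points, O(n) per query.
/// Returns `None` when `points` is empty.
pub fn find_nearest_brute_force(points: &[Point], target: &Point) -> Option<u64> {
    points
        .iter()
        .enumerate()
        .min_by(|(_, a), (_, b)| {
            squared_distance(a, target).total_cmp(&squared_distance(b, target))
        })
        .map(|(i, _)| i as u64)
}

fn main() {
    // Ten points along the y axis: y = 0, 10, 20, ..., 90.
    let points: Vec<Point> = (0..100)
        .step_by(10)
        .map(|y| Point { x: 0.0, y: y as f32, z: 0.0 })
        .collect();
    let target = Point { x: 0.0, y: 12.0, z: 0.0 };
    // prints Some(1): the point at y = 10 is the closest to y = 12
    println!("{:?}", find_nearest_brute_force(&points, &target));
}
```

For a handful of points this is probably all you need; the KD-tree starts to pay off when there are many points and many queries.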
To make this Rust library compile into a shared library that can be used from C/C++, I'll add this section to our Cargo.toml:
[lib]
crate-type = ["cdylib"]
Now we can check that the crate builds fine:
$ cargo check
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.06s
And builds in ~4 seconds on my machine:
$ cargo build --release 2>&1 | tail -n 1
Finished `release` profile [optimized] target(s) in 4.32s
Kiddo offers other distance metrics, like kiddo::Manhattan, but choosing the best one is beside the point of this exercise.
For the sake of KISSing, we also won't focus on trying to 'cache' the creation of the tree, so that if we have to perform multiple queries against the same points we don't need to recreate it every time. I'll take that as a side quest for a future article (you can imagine having code that first creates the tree struct and then asks the client to pass it to any function that requires it).
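To give an idea of what that side quest could look like, here is a rough sketch of the usual "opaque handle" pattern: one function builds the index and hands back a pointer, one queries it, one frees it. Everything here is hypothetical (`PointIndex`, `point_index_new`, `point_index_find_nearest`, `point_index_free` are names I made up), and the handle does a linear scan as a stand-in for the kiddo tree so that the sketch only needs the standard library. In a real crate the exported functions would also carry the same no-mangle attribute used elsewhere in this article.

```rust
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Point {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// Hypothetical opaque handle. The real thing would own the kiddo tree;
/// this stand-in just owns a copy of the points.
pub struct PointIndex {
    points: Vec<Point>,
}

/// Builds the index once. Returns null if `points` is null.
///
/// # Safety
/// If non-null, `points` must point to `num_points` valid `Point`s.
pub unsafe extern "C" fn point_index_new(
    points: *const Point,
    num_points: usize,
) -> *mut PointIndex {
    if points.is_null() {
        return std::ptr::null_mut();
    }
    let points = unsafe { std::slice::from_raw_parts(points, num_points) }.to_vec();
    Box::into_raw(Box::new(PointIndex { points }))
}

/// Queries an existing index. Returns `u64::MAX` if the index holds no points.
///
/// # Safety
/// `index` must come from `point_index_new` and must not have been freed.
pub unsafe extern "C" fn point_index_find_nearest(
    index: *const PointIndex,
    target: &Point,
) -> u64 {
    let index = unsafe { &*index };
    let mut best = (u64::MAX, f32::INFINITY);
    for (i, p) in index.points.iter().enumerate() {
        let (dx, dy, dz) = (p.x - target.x, p.y - target.y, p.z - target.z);
        let d = dx * dx + dy * dy + dz * dz;
        if d < best.1 {
            best = (i as u64, d);
        }
    }
    best.0
}

/// Frees an index created by `point_index_new`. Null is ignored.
///
/// # Safety
/// `index` must not be used after this call.
pub unsafe extern "C" fn point_index_free(index: *mut PointIndex) {
    if !index.is_null() {
        drop(unsafe { Box::from_raw(index) });
    }
}

fn main() {
    let points: Vec<Point> = (0..100)
        .step_by(10)
        .map(|y| Point { x: 0.0, y: y as f32, z: 0.0 })
        .collect();
    let target = Point { x: 0.0, y: 12.0, z: 0.0 };
    unsafe {
        let index = point_index_new(points.as_ptr(), points.len());
        let nearest = point_index_find_nearest(index, &target);
        println!("nearest index: {nearest}");
        point_index_free(index);
    }
}
```

The trade-off is that the C++ side now owns a lifetime problem: it has to call the free function exactly once, and never touch the handle afterwards.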
Now let's make a few surgical changes to get our code ready to be used as a C library.
First, instead of using f32, we can switch to core::ffi::c_float.
The Rust docs at https://doc.rust-lang.org/std/os/raw/type.c_float.html mention that
This type will almost always be f32, [..but..] the standard technically only guarantees that it be a floating-point number, and it may have less precision than f32 or not follow the IEEE-754 standard at all.
Given that my level of trust is generally 0, I'd normally model everything with c_float instead of f32. However, for legibility (and since my LSP shows that pub type c_float = f32) here I'll mostly keep f32.
Then, let's also use the repr(C) attribute on our struct. This tells the compiler to lay the struct out in memory the way the C ABI would (I don't know precisely what that means, but it's nice to leave the implementation details to the rustc folks, no?).
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Point {
    x: core::ffi::c_float,
    y: core::ffi::c_float,
    z: core::ffi::c_float,
}
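If you'd rather not take the layout on faith, a small sketch like the following lets the compiler check it for you. It assumes the same `Point` as above (fields made `pub` for the sketch) and uses `core::mem::offset_of!`, which needs a reasonably recent toolchain. The `const _` assertions fail the build if any assumption stops holding:

```rust
use core::mem::{align_of, offset_of, size_of};

#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Point {
    pub x: core::ffi::c_float,
    pub y: core::ffi::c_float,
    pub z: core::ffi::c_float,
}

// With repr(C), fields stay in declaration order. Three floats need no
// padding between them, so the struct is exactly three floats wide.
const _: () = assert!(size_of::<Point>() == 3 * size_of::<core::ffi::c_float>());
const _: () = assert!(align_of::<Point>() == align_of::<core::ffi::c_float>());
const _: () = assert!(offset_of!(Point, x) == 0);
const _: () = assert!(offset_of!(Point, y) == size_of::<core::ffi::c_float>());
const _: () = assert!(offset_of!(Point, z) == 2 * size_of::<core::ffi::c_float>());

fn main() {
    println!(
        "size = {}, align = {}, offsets = ({}, {}, {})",
        size_of::<Point>(),
        align_of::<Point>(),
        offset_of!(Point, x),
        offset_of!(Point, y),
        offset_of!(Point, z),
    );
}
```

Without repr(C), Rust makes no promise about field order, so none of these offset checks would be something to rely on.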
Now let's look at find_nearest. To tell the compiler that we want this function to be callable from C, we'll mark it as extern "C".
We'll also add the #[no_mangle] attribute so that the symbol doesn't get mangled and we can call it by name from the C side.
Since slices have no C equivalent, instead of &[Point] we'll have to receive a pointer (*const Point) together with a length (num_points).
To go back to a slice, we'll use core::slice::from_raw_parts, which means our function will have to contain some unsafe (aka "trust me bro") code.
/// Given a list of `points`, return the index of the one closest to `point`.
/// # Safety
///
/// `points` should be a valid pointer to at least `num_points` elements.
/// The same safety considerations recommended in [core::slice::from_raw_parts] should apply.
#[no_mangle]
pub unsafe extern "C" fn find_nearest(
    points: *const Point,
    num_points: usize,
    point: &Point,
) -> core::ffi::c_uint {
    let points = unsafe { core::slice::from_raw_parts(points, num_points) };
    let mut tree = kiddo::KdTree::new();
    for (i, point) in points.iter().enumerate() {
        tree.add(&[point.x, point.y, point.z], i as u64);
    }
    let nearest_one = tree.nearest_one::<kiddo::SquaredEuclidean>(&[point.x, point.y, point.z]);
    nearest_one.item as core::ffi::c_uint
}
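One thing the "trust me bro" version doesn't do is check for a null pointer, which `from_raw_parts` does not accept even when the length is 0. A possible guard, sketched with a hypothetical helper called `slice_from_c` (my name, not a standard function), could look like this:

```rust
/// Hypothetical helper: turn a (pointer, length) pair coming from C into a
/// slice, returning `None` for null instead of passing it to `from_raw_parts`.
///
/// # Safety
/// If non-null, `ptr` must be properly aligned and point to `len` initialized
/// values of `T` that stay alive and unmodified for the lifetime `'a`, and the
/// total size in bytes must not exceed `isize::MAX`.
pub unsafe fn slice_from_c<'a, T>(ptr: *const T, len: usize) -> Option<&'a [T]> {
    if ptr.is_null() {
        return None;
    }
    Some(unsafe { core::slice::from_raw_parts(ptr, len) })
}

fn main() {
    let values = [1.0_f32, 2.0, 3.0];

    // A valid pointer round-trips back into the same slice.
    let ok = unsafe { slice_from_c(values.as_ptr(), values.len()) };
    println!("{:?}", ok);

    // A null pointer is rejected up front.
    let bad = unsafe { slice_from_c::<f32>(core::ptr::null(), 3) };
    println!("{:?}", bad);
}
```

This only catches null. A dangling pointer or a wrong length still can't be detected from the Rust side, so the safety comment on the exported function remains the real contract.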
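We're now also returning core::ffi::c_uint instead of just u64 (note that the `as` cast silently truncates any index that doesn't fit).
The width of C integer types depends on the platform. For example, quoting the docs for the related c_ulong type from https://doc.rust-lang.org/stable/core/ffi/type.c_ulong.html :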
This type will always be u32 or u64. Most notably, many Linux-based systems assume an u64, but Windows assumes u32. The C standard technically only requires that this type be an unsigned integer with the size of a long, although in practice, no system would have a ulong that is neither a u32 nor u64.
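Since the size of these types can differ from u64, narrowing an index with `as` can quietly wrap around. Here is a small sketch of a checked alternative, assuming we're happy to reserve one value as an error marker. `INVALID_INDEX` and `to_c_index` are hypothetical names, not something the article's crate defines:

```rust
use core::ffi::c_uint;

/// Hypothetical sentinel meaning "the index did not fit".
pub const INVALID_INDEX: c_uint = c_uint::MAX;

/// Converts a tree item index to `c_uint`, mapping out-of-range values
/// to `INVALID_INDEX` instead of wrapping around.
pub fn to_c_index(item: u64) -> c_uint {
    c_uint::try_from(item).unwrap_or(INVALID_INDEX)
}

fn main() {
    // One past the wrap-around point: too large for `c_uint`.
    let too_big = u64::from(c_uint::MAX) + 2;

    println!("checked small: {}", to_c_index(1));
    println!("checked big:   {}", to_c_index(too_big));
    // The plain cast wraps around and yields a plausible-looking index.
    println!("cast big:      {}", too_big as c_uint);
}
```

The C++ caller would then need to compare the result against the sentinel before indexing into its vector.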
Let's now write some unit tests. To simplify things, let's make a short initializer for points:
impl Point {
    pub fn new(x: f32, y: f32, z: f32) -> Self {
        Self { x, y, z }
    }
}
And then let's just ensure that the basics are working as expected:
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_find_nearest() {
        let points = (0..100)
            .step_by(10)
            .map(|y| Point::new(0.0, y as f32, 0.0))
            .collect::<Vec<_>>();
        let target_point = Point::new(0.0, 12.0, 0.0);
        let p_id = unsafe { find_nearest(points.as_ptr(), points.len(), &target_point) };
        assert_eq!(p_id, 1);
    }
}
Which seems to be the case:
$ cargo test --lib 2>/dev/null | grep result
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Then, let's integrate this in the C++ code by using find_nearest() there.
At the root of our crate (next to Cargo.toml), we'll need a build.rs file to generate the required headers.
We can use cbindgen to help us. Let's add it to the build dependencies:
[build-dependencies]
cbindgen = "0.28.0"
And then let's add the bare minimum code to a build.rs file:
fn main() {
    let crate_dir = std::env::var("CARGO_MANIFEST_DIR").unwrap();
    println!("Generating C/C++ header");
    cbindgen::Builder::new()
        .with_crate(crate_dir)
        .generate()
        .expect("Unable to generate bindings")
        .write_to_file("include/rust_point_lookup.h");
}
This code will now be invoked every time we run cargo build and will (re)generate the include/rust_point_lookup.h file.
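If regenerating the header on every build ever becomes annoying, Cargo lets a build script declare which files it actually depends on by printing `cargo:rerun-if-changed=<path>` lines. Here is a minimal sketch of that idea; the list of paths is my assumption about what matters for this crate, and `rerun_directives` is a hypothetical helper, so adjust it to whatever cbindgen really reads in your project:

```rust
/// Hypothetical helper: builds one `cargo:rerun-if-changed=...` line per path.
fn rerun_directives(paths: &[&str]) -> Vec<String> {
    paths
        .iter()
        .map(|p| format!("cargo:rerun-if-changed={p}"))
        .collect()
}

fn main() {
    // Assumed inputs: the Rust source that defines the exported items,
    // plus the build script itself.
    for line in rerun_directives(&["src/lib.rs", "build.rs"]) {
        // Cargo reads these lines from the build script's stdout.
        println!("{line}");
    }
    // ...the cbindgen::Builder call from build.rs would follow here.
}
```

Once at least one such line is printed, Cargo stops using its default "re-run when anything in the package changes" behavior and only looks at the listed paths.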
In my case, running
cargo build --release 2>/dev/null && cat include/rust_point_lookup.h
produces this:
#include <cstdarg>
#include <cstdint>
#include <cstdlib>
#include <ostream>
#include <new>
struct Point {
    float x;
    float y;
    float z;
};

extern "C" {

unsigned int find_nearest(const Point *points, uintptr_t num_points, const Point *point);

} // extern "C"
C++ side
For the sake of making it easy to follow along, I'll make a dedicated dir for our C++ sources:
mkdir -p my-cpp-project && cd $_
touch main.cpp
This is gonna be the content of main.cpp:
#include <cstdint>
#include <cstdio>
#include <iostream>
#include <string>
#include <vector>

#include "rust_point_lookup.h"

auto main(int argc, char *argv[]) -> int {
    Point target = Point{0.0, 12.0, 0.0};
    std::vector<Point> points;
    for (int y = 0; y < 100; y += 10) {
        points.push_back(Point{0.0, (float)y, 0.0});
    }
    // Use the element count: sizeof(points) is the size of the vector object
    // itself, not of the data it holds.
    uintptr_t len = points.size();
    unsigned int nearest_i = find_nearest(points.data(), len, &target);
    Point nearest = points[nearest_i];
    printf("Inputs points:\n");
    for (Point p : points) {
        printf("\tPoint {x: %f, y: %f, z: %f}\n", p.x, p.y, p.z);
    }
    printf(
        "The closest point to {x: %f, y: %f, z: %f} is {x: %f, y: %f, z: %f}\n",
        target.x, target.y, target.z, nearest.x, nearest.y, nearest.z);
    return 0;
}
Then, inside the same dir, I'll create a quick bash script to build+link everything together:
touch run.sh && chmod +x run.sh
The contents of run.sh will be:
#!/usr/bin/env bash
set -e
# set -x
pushd .. || false
cargo build --release
popd || false
lib_name=point_lookup
mkdir -p lib
cp ../target/release/lib$lib_name.so lib
bin_name=find-nearest
# Prerequisites:
# sudo apt-get install bear
# sudo apt-get install g++-12
# sudo apt-get install patchelf
bear -- clang++ main.cpp -o "./$bin_name" \
-std=c++11 \
-I/usr/include/c++/11 \
-I/usr/include/x86_64-linux-gnu/c++/11 \
-L/usr/lib/x86_64-linux-gnu \
-I../include \
-Llib \
-l$lib_name
rm -f a.out
clang++ main.cpp \
-I../include \
-Llib \
-l$lib_name
patchelf --add-rpath "$(pwd)/lib" "./$bin_name"
"./$bin_name"
Results
Now if I execute ./run.sh, I get this:
$ ./run.sh
[..redacted..]
Compiling point-lookup v0.1.0 ([..redacted..])
Finished `release` profile [optimized] target(s) in 0.29s
[..redacted..]
Inputs points:
Point {x: 0.000000, y: 0.000000, z: 0.000000}
Point {x: 0.000000, y: 10.000000, z: 0.000000}
Point {x: 0.000000, y: 20.000000, z: 0.000000}
Point {x: 0.000000, y: 30.000000, z: 0.000000}
Point {x: 0.000000, y: 40.000000, z: 0.000000}
Point {x: 0.000000, y: 50.000000, z: 0.000000}
Point {x: 0.000000, y: 60.000000, z: 0.000000}
Point {x: 0.000000, y: 70.000000, z: 0.000000}
Point {x: 0.000000, y: 80.000000, z: 0.000000}
Point {x: 0.000000, y: 90.000000, z: 0.000000}
The closest point to {x: 0.000000, y: 12.000000, z: 0.000000} is {x: 0.000000, y: 10.000000, z: 0.000000}
Yay! This matches our unit tests in Rust. We've just successfully called Rust code from our C++ project.
Happy tinkering!
Credits
The image for this blogpost was created in Affinity Designer using the following sources:
- Family African Bush Elephant Kafue by Timothy A. Gonsalves, licensed under CC BY-SA 4.0
- C Programming Language SVG by ElodinKaldwin. This logo image consists only of simple geometric shapes or text. It does not meet the threshold of originality needed for copyright protection, and is therefore in the public domain. Although it is free of copyright restrictions, this image may still be subject to other restrictions of trademark. Fair use.
- C++ Programming Language SVG by Jeremy Kratz. This logo image consists only of simple geometric shapes or text. It does not meet the threshold of originality needed for copyright protection, and is therefore in the public domain. Although it is free of copyright restrictions, this image may still be subject to other restrictions of trademark. Fair use.