Status quo of an AWS engineer: Debugging overall performance loss
Alan's service is working better and better, but performance is still lagging from where he hoped it would be. It seems to be about 20% slower than the Java version! After calling in Barbara to help him diagnose the problem, Alan identifies one culprit: Some of the types in Alan's system are really large! The system seems to spend a surprising amount of time just copying bytes. Barbara helped Alan diagnose this by showing him some hidden rustc flags, tinkering with his perf setup, and a few other tricks.
There is still a performance gap, though, and Alan's not sure where it could be coming from. There are a few candidates:
- Perhaps they are not using tokio's scheduler optimally.
- Perhaps the memory allocation costs introduced by the
#[async_trait]
are starting to add up.
Alan tinkers with jemalloc and finds that it does improve performance, so that's interesting, but he'd like to have a better understanding of why.