Notes from: How to Performance by Dr. Eileen Uchitelle
Benchmarking vs. Profiling
Benchmarking measures how slow your code currently is. Benchmarking tools also give you an empirical demonstration that a change improved things.
Profiling tells us which parts of our code are slowing us down. Profilers are used to isolate the slow methods.
Both benchmarking and profiling tools must be used together to improve performance. Using only one will not be enough.
Steps to improve performance
Get a baseline
The baseline is the starting metric that later attempts are measured against, so you know whether improvements are actually being made.
Time is not a good baseline for most performance work. Many things affect total time, such as garbage collection or caching, so it varies quite a bit between runs. If you wanted to use time, you would have to average many runs, and it usually takes quite a bit more work to get a definite answer.
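To see why a single timing makes a shaky baseline, the sketch below times the same block several times with Ruby's standard Benchmark.realtime and reports the spread. The workload (build_strings) and the run count are invented for illustration and are not from the talk.

```ruby
require "benchmark"

# Hypothetical workload: allocates many short strings so GC can kick in.
def build_strings(count = 50_000)
  Array.new(count) { |i| "item-#{i}" }
end

# Time the same block several times; each sample is wall-clock seconds.
samples = Array.new(10) { Benchmark.realtime { build_strings } }

mean = samples.sum / samples.size
variance = samples.sum { |s| (s - mean)**2 } / samples.size
std_dev = Math.sqrt(variance)

puts format("min %.4fs  max %.4fs", samples.min, samples.max)
puts format("mean %.4fs  std dev %.4fs", mean, std_dev)
```

The gap between the minimum and maximum is the noise that any single timing would hide.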
A better baseline is to use benchmark-ips (https://github.com/evanphx/benchmark-ips). The gem measures how many times the specified code can run in 1 second, so more iterations means faster code. This takes the guesswork out of how many runs you need to do to get a good average. It also provides a standard deviation.
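A minimal sketch of a benchmark-ips run, assuming the gem is installed. The two methods being compared are hypothetical stand-ins for a "before" and "after" version of the same code, and the short time and warmup values are chosen only to keep the example quick.

```ruby
# Guarded require so the script still loads when the gem is missing.
begin
  require "benchmark/ips"
rescue LoadError
  warn "benchmark-ips is not installed: gem install benchmark-ips"
end

# Hypothetical "before" version: builds a new String on every step.
def join_with_plus(words)
  result = ""
  words.each { |word| result += word }
  result
end

# Hypothetical "after" version: appends to one String in place.
def join_with_append(words)
  result = +""
  words.each { |word| result << word }
  result
end

words = Array.new(200) { |i| "word#{i}" }

report = nil
if defined?(Benchmark::IPS)
  report = Benchmark.ips do |x|
    # Short windows for the sketch; the gem's defaults run longer.
    x.config(time: 1, warmup: 1)
    x.report("String#+")  { join_with_plus(words) }
    x.report("String#<<") { join_with_append(words) }
    x.compare! # prints how the entries rank against each other
  end
end
```

Each report line shows iterations per second with its deviation, which is the number to record as the baseline.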
Find the slow spots
A nice profiler for Ruby is ruby-prof. It is an industry standard and is fast because it uses C extensions.
The ruby-prof output is often confusing and contains a lot of noise. The call stack printer is one of the easier printers to use for understanding how much time is spent in each area of the call stack.
Look for locations that take up a larger amount of time that are not at the top of the call stack, and start there.
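A rough sketch of producing the call stack report with ruby-prof, assuming the gem is installed. The workload (process and slow_lookup) and the output path are hypothetical; the measure mode is set to wall time explicitly.

```ruby
require "tmpdir"

# Guarded require so the script still loads when the gem is missing.
begin
  require "ruby-prof"
rescue LoadError
  warn "ruby-prof is not installed: gem install ruby-prof"
end

# Hypothetical workload with one deliberately slow inner method.
def slow_lookup(list, value)
  list.index(value) # linear scan on every call
end

def process(list)
  list.map { |value| slow_lookup(list, value) }
end

output_path = File.join(Dir.tmpdir, "call_stack.html")

if defined?(RubyProf)
  # Measure wall-clock time rather than CPU time.
  profile = RubyProf::Profile.new(measure_mode: RubyProf::WALL_TIME)
  profile.start
  process((1..2_000).to_a)
  result = profile.stop

  # The call stack printer writes an HTML tree of the call stack.
  File.open(output_path, "w") do |file|
    RubyProf::CallStackPrinter.new(result).print(file)
  end
  puts "Wrote #{output_path}"
end
```

Open the HTML file in a browser and expand the tree to find the deeper frames that account for a large share of the time.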
Stackprof is another profiler like ruby-prof, but it samples the stack, so you can focus on a specific point instead of the bigger picture that ruby-prof provides. Run the code under stackprof around 3000 times (for a broad enough sample size) to narrow down the slow spots.
While stackprof is great for pinpointing problems, it struggles when the slow code lives in anonymous modules or in dynamically created methods.
RubyVM.stat is a method on the Ruby VM that supplies information about the current method cache. This is useful for seeing whether your method cache is getting busted between runs. Method caching is important in Ruby code because the cost of creating new globals and constants is high.
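A minimal sketch of sampling with stackprof, assuming the gem is installed. The rendering methods are hypothetical. The 3000 repetitions happen inside one profiling session so the sampler has enough stack samples to work with.

```ruby
# Guarded require so the script still loads when the gem is missing.
begin
  require "stackprof"
rescue LoadError
  warn "stackprof is not installed: gem install stackprof"
end

# Hypothetical hot path to sample.
def render_row(id)
  "<tr><td>#{id}</td><td>#{id.to_s.rjust(8, '0')}</td></tr>"
end

def render_table(rows = 500)
  (1..rows).map { |id| render_row(id) }.join
end

profile = nil
if defined?(StackProf)
  # Sample the stack every 1000 microseconds of wall time.
  profile = StackProf.run(mode: :wall, interval: 1000) do
    3_000.times { render_table }
  end

  # Print the ten frames that showed up in the most samples.
  StackProf::Report.new(profile).print_text(false, 10)
end
```

Frames with a high share of samples are the places to look at first.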
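A small sketch of checking RubyVM.stat before and after a piece of work. RubyVM is specific to MRI, and the counter names in the returned hash differ between Ruby versions, so the helper (vm_stat_diff, a made-up name) only reports whichever counters changed. The cache-busting work in the block is a hypothetical example.

```ruby
# Returns a hash of the RubyVM.stat counters that changed while the
# block ran, mapped to how much each one moved.
def vm_stat_diff
  before = RubyVM.stat
  yield
  after = RubyVM.stat
  after.each_with_object({}) do |(key, value), changed|
    delta = value - before.fetch(key, 0)
    changed[key] = delta unless delta.zero?
  end
end

# Hypothetical work that may invalidate caches: extending an object at
# runtime and defining a new constant.
changed = vm_stat_diff do
  helper = Module.new do
    def greet
      "hi"
    end
  end
  Object.new.extend(helper)
  Object.const_set(:TEMP_SETTING_FOR_DEMO, 1)
end

puts "Counters that moved: #{changed.inspect}"
```

Wrapping a request or a test run in the same helper shows whether the counters keep climbing between runs.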
Fix and verify
ABB: Always be benchmarking. Every time you make a change to improve performance you have to benchmark to verify the change actually improved the performance.
If you are on OS X, use wall time instead of CPU time, since CPU time measurement has a bug on Macs.
Eventually, they got to a point where they were making micro-improvements and the time spent tweaking wasn’t paying off in performance gains. It usually takes quite a few micro-improvements to see any substantial gain.
Most of the time, people immediately jump to caching to solve their performance problems. Caching comes with its own costs and can often be expensive. Try to speed up your code first without caching, and check whether caching would only be a band-aid on a larger issue.
GC can often have a huge impact on performance. A popular library for inspecting how garbage collection interacts with Ruby code is AllocationTracer. It will tell you how many objects are being allocated by Ruby and where they are in your code. This type of tool, like stackprof, is specific and helps the user focus on the important bits.
When improving your code for GC, the goal is to minimize time spent in GC rather than to chase the overall speed of your code, since less time spent in GC should reduce the time spent overall.
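As a rough stand-in for what AllocationTracer reports in more detail, the sketch below uses Ruby's bundled objspace library (not the AllocationTracer gem) to record where objects are allocated. The build_labels method is hypothetical.

```ruby
require "objspace"

# Hypothetical method that allocates a new String per element.
def build_labels(ids)
  ids.map { |id| "user-#{id}" }
end

sites = []
# Record the file and line of every object allocated inside the block.
ObjectSpace.trace_object_allocations do
  build_labels([1, 2, 3]).each do |label|
    sites << [
      label,
      ObjectSpace.allocation_sourcefile(label),
      ObjectSpace.allocation_sourceline(label)
    ]
  end
end

# Tally how many of the traced objects came from each file:line.
counts = sites.group_by { |_label, file, line| "#{file}:#{line}" }
              .transform_values(&:size)
counts.each { |site, count| puts "#{count} objects allocated at #{site}" }
```

AllocationTracer gives a similar per-location view across a whole run, along with extra details such as object type.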
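One way to see how much of a run is spent in GC is Ruby's built-in GC::Profiler. The sketch below is a minimal example under that assumption; the churn workload is hypothetical.

```ruby
# Hypothetical allocation-heavy workload.
def churn(rounds = 200_000)
  rounds.times.map { |i| "row-#{i}" }.size
end

GC::Profiler.enable
GC::Profiler.clear

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
churn
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

# Seconds spent in garbage collection while the profiler was enabled.
gc_seconds = GC::Profiler.total_time
GC::Profiler.disable

puts format("total %.4fs, in GC %.4fs", elapsed, gc_seconds)
```

Comparing the GC figure before and after a change shows whether the change reduced GC work, independent of other noise in the total.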
One way they reduced time in GC was by freezing strings to reduce the number of strings allocated. However, before getting overzealous with freezing strings, prove that the allocations are a bottleneck. As a larger note: prove the bottleneck before solving it and avoid premature optimizations.
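The effect of freezing a string literal can be checked by counting allocations around a block. This is a minimal sketch using GC.stat from core Ruby; the helper and method names are made up, and it assumes the file does not already enable frozen string literals.

```ruby
# Count objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Returns a fresh String on every call.
def plain_label
  "status"
end

# A frozen literal can be reused instead of being allocated again.
def frozen_label
  "status".freeze
end

runs = 10_000
plain = allocations { runs.times { plain_label } }
frozen = allocations { runs.times { frozen_label } }

puts "plain literal:  #{plain} objects allocated"
puts "frozen literal: #{frozen} objects allocated"
```

A count like this helps prove that the allocations exist, but it says nothing about whether they are the bottleneck.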