Refcounting is significantly slower under most circumstances. You are literally putting a bunch of atomic increments/decrements into your code (if you can't prove that the given object is only used from a single thread) which are crazy expensive operations on modern CPUs, evicting caches.