
I also had that NOP thing happen to me once: the program with the extra NOP ran 2x as fast! It took a couple of days until I finally figured out what was going on.

After much investigation, what I found out was that the original code without the NOP was actually running at only 1/2 the speed that it should. Due to very bad luck, the addresses of the jump targets in the inner loop were placed in a configuration where the branch predictor failed to predict the jumps (perhaps because of collisions in the internal "hash tables" used by the branch predictor). Any nudge to the executable would get the program out of the pathological configuration. Using a different compiler version, a different OS, or a different CPU model all did the trick. But the most fun, of course, was that adding or removing a NOP also made the difference :)
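
To make the collision idea concrete, here is a toy model in Python. It is only a sketch of my guess above, not how any real CPU works: the table size, the modulo "hash", the 2-bit counters, and the addresses are all made up for illustration, and real predictors also use branch history and other tricks. Two branches with opposite behavior share one table slot until a one-byte nudge (the NOP) moves the second one to a different slot.

```python
# Toy branch predictor: a small table of 2-bit saturating counters,
# indexed by the low bits of the branch address.
# TABLE_SIZE and the indexing scheme are hypothetical, chosen for illustration.
TABLE_SIZE = 16


def count_mispredictions(branches, iterations=1000):
    """Run a loop body made of (address, taken) branches and count misses."""
    counters = [1] * TABLE_SIZE  # every slot starts at "weakly not taken"
    misses = 0
    for _ in range(iterations):
        for address, taken in branches:
            slot = address % TABLE_SIZE  # the toy "hash"
            predicted_taken = counters[slot] >= 2
            if predicted_taken != taken:
                misses += 1
            # Train the counter toward the actual outcome, saturating at 0 and 3.
            if taken:
                counters[slot] = min(3, counters[slot] + 1)
            else:
                counters[slot] = max(0, counters[slot] - 1)
    return misses


# Made-up addresses: both branches land in slot 0 and fight over one counter.
colliding = [(0x100, True), (0x110, False)]
# Same code with one extra byte (a NOP) before the second branch: slot 1.
nudged = [(0x100, True), (0x111, False)]

print(count_mispredictions(colliding))  # 2000: every single prediction is wrong
print(count_mispredictions(nudged))     # 1: only the very first prediction misses
```

In this toy the colliding layout keeps flipping the shared counter back and forth, so each branch undoes what the other one just taught it. Shifting one address by a single byte is enough to give each branch its own counter.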


