
> I was impressed, but I also realized that this meant that it was suboptimal for minimizing temporary registers. However, I couldn't think of some algorithm that would multiply it the way I did when I wrote that assembly, since I hadn't really had an algorithm in mind when writing it.

If the compiler were minimising touched architectural registers, it could use the same multiplication method it already does, but output this code:

  lea r8, [r9 + 4 * r9]
  lea r8, [r8 + r8]
Or this:

  lea r8, [r9 + 4 * r9]
  add r8, r8
If it used your multiplication method, but with the register allocation decisions it appears to use, it could output this code:

  lea rdi, [r9 + 8 * r9]
  lea r8, [rdi + r9]
So the compiler's decision to use RDI as a temporary looks to me like purely a register allocation decision, independent of the multiplication method (though it may be affected by the lea/add decision).
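As a quick sanity check, here is a rough Python model of the three sequences above. It only models the arithmetic (64-bit wraparound), not flags or timing, and the helper names `lea`, `times10_lea_lea`, `times10_lea_add` and `times10_via_9x` are made up for illustration. Each sequence works out to 10 * r9: the first two compute 5x and then double it, the third computes 9x and adds x.

```python
MASK = (1 << 64) - 1  # model 64-bit registers with wraparound


def lea(base, index=0, scale=1):
    """Model `lea dst, [base + scale * index]` (hypothetical helper)."""
    assert scale in (1, 2, 4, 8)  # the only scales x86 addressing allows
    return (base + scale * index) & MASK


def times10_lea_lea(r9):
    r8 = lea(r9, r9, 4)  # lea r8, [r9 + 4 * r9]  -> 5 * r9
    r8 = lea(r8, r8)     # lea r8, [r8 + r8]      -> 10 * r9
    return r8


def times10_lea_add(r9):
    r8 = lea(r9, r9, 4)    # lea r8, [r9 + 4 * r9]  -> 5 * r9
    r8 = (r8 + r8) & MASK  # add r8, r8             -> 10 * r9
    return r8


def times10_via_9x(r9):
    rdi = lea(r9, r9, 8)  # lea rdi, [r9 + 8 * r9]  -> 9 * r9
    r8 = lea(rdi, r9)     # lea r8, [rdi + r9]      -> 10 * r9
    return r8
```

The first two variants only ever touch r8 and r9, while the third one is the only place RDI shows up, and even there the first lea could have targeted r8 instead.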


You're right. I don't know why, but I didn't realize that neither method of multiplication requires more than two registers here. The use of a temporary register was probably coincidental. Maybe the compiler wouldn't have used an extra register if it had needed to spill to the stack to free one up, and would instead have emitted one of the alternatives you wrote. Thanks for letting me know.



