For the calculator tool, I'd suggest instead having the model generate JavaScript with temperature set to 0 (and a system prompt along the lines of: "Generate native Javascript code only. Don't provide any explanations. Don't import any extraneous libraries"), then evaluating that code in a sandboxed VM. Deno is a good candidate for this because it is secure by default: filesystem and network access are turned off unless you explicitly grant them. You can use something like deno-vm [1] to run the code in a separate process from your own, too.

Using GPT-4 as the model works even better. In my experience it outperforms Wolfram Alpha in many cases, so I wonder why OpenAI chose to integrate with Wolfram Alpha for this. GPT-4 was able to solve some really complex math problems I threw at it.
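Roughly what I have in mind, as a minimal sketch rather than a finished implementation. All the helper names here (`buildCalculatorRequest`, `extractCode`, `wrapAsWorkerScript`) are made up for illustration, the request shape is just the usual model/temperature/messages layout, and the fence stripping and Web Worker style wrapper are my own assumptions about how you'd clean the output and hand it to a sandbox:

```typescript
// Hypothetical helpers; only the prompt text and temperature 0 come from the comment above.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ChatRequest {
  model: string;
  temperature: number;
  messages: ChatMessage[];
}

const SYSTEM_PROMPT =
  "Generate native Javascript code only. Don't provide any explanations. " +
  "Don't import any extraneous libraries";

// Build the request body. Temperature 0 keeps the generated code as repeatable as possible.
function buildCalculatorRequest(question: string, model: string = "gpt-4"): ChatRequest {
  return {
    model,
    temperature: 0,
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: question },
    ],
  };
}

// Assumption: the model may still wrap its answer in a Markdown code fence.
// The fence is built with repeat() so this snippet contains no literal fence.
const FENCE = "`".repeat(3);
const FENCED_CODE = new RegExp(
  FENCE + "(?:javascript|js)?\\s*\\n([\\s\\S]*?)" + FENCE,
);

// Return just the code: the fenced body if there is one, else the whole reply.
function extractCode(reply: string): string {
  const fenced = reply.match(FENCED_CODE);
  return (fenced ? fenced[1] : reply).trim();
}

// Wrap the generated code as a worker script that posts back the value of the
// last expression. Assumption: the sandbox exposes the Web Worker `self` API.
// JSON.stringify embeds the code as a safely escaped string literal.
function wrapAsWorkerScript(code: string): string {
  return "self.postMessage(eval(" + JSON.stringify(code) + "));";
}
```

You'd send the object from `buildCalculatorRequest` to the chat API, run the reply through `extractCode`, and pass the string from `wrapAsWorkerScript` to whatever sandbox you use (with deno-vm, check its README for the exact worker API). The `eval` only ever runs inside the sandboxed process, never in your own.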
[1]: https://www.npmjs.com/package/deno-vm