The potential for scamming with this is limitless. Elderly people were already vulnerable to phone calls from their "relatives" back when the voices didn't even sound that close. Can you imagine what the hit rate on these scams is going to be when the voice is nearly identical to their relative's? Also, at some point I expect that just answering the phone and saying "Hello" will be enough for some AI model to zero-shot clone your voice with enough fidelity to fool most people. Tech like this is going to absolutely destroy what little remains of voice conversations over the phone.