Seems like I didn't get the idea across clearly. Audio files are binary data, but they're encoded in a specific format, and a matching codec decodes that binary into samples the sound card can actually play.
Way back in the day (not as far back as punch cards, but before diskettes), binary streams were persisted to tape. Not the backup tape used today, but regular audio cassettes. The data was converted into an audio stream, similar in nature to what you'd hear during a modem handshake. On the receiving end, the audio stream was decoded back into the original binary data, bit for bit (well, given how limited error correction was back then, sometimes you'd have to try again). The key point being, the audio was just a transport layer for binary data, which could have been anything -- images, code, text, audio files, etc.
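To make the "audio as a transport layer" idea concrete, here's a rough sketch of the general technique (frequency-shift keying: one tone for a 0 bit, another for a 1 bit). This is not the exact scheme any particular tape format or modem used. The sample rate, baud rate, tone frequencies and function names are all values I picked for illustration, and it assumes the receiver already knows exactly where the first bit starts, with no clock drift and no error correction.

```python
import math

# All constants below are illustrative assumptions, not a real standard.
SAMPLE_RATE = 48_000                    # audio samples per second
BAUD = 300                              # bits per second
FREQ_ZERO = 1200.0                      # tone (Hz) that represents a 0 bit
FREQ_ONE = 2400.0                       # tone (Hz) that represents a 1 bit
SAMPLES_PER_BIT = SAMPLE_RATE // BAUD   # 160 samples per bit


def modulate(data: bytes) -> list[float]:
    """Turn bytes into audio samples: one tone burst per bit, MSB first."""
    samples = []
    for byte in data:
        for shift in range(7, -1, -1):
            freq = FREQ_ONE if (byte >> shift) & 1 else FREQ_ZERO
            for n in range(SAMPLES_PER_BIT):
                samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def tone_energy(window: list[float], freq: float) -> float:
    """Energy of `window` at `freq`, by correlating with a sine and a cosine."""
    i = sum(s * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
            for n, s in enumerate(window))
    q = sum(s * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
            for n, s in enumerate(window))
    return i * i + q * q


def demodulate(samples: list[float]) -> bytes:
    """Recover bytes by picking the stronger of the two tones in each bit slot."""
    bits = []
    for start in range(0, len(samples) - SAMPLES_PER_BIT + 1, SAMPLES_PER_BIT):
        window = samples[start:start + SAMPLES_PER_BIT]
        is_one = tone_energy(window, FREQ_ONE) > tone_energy(window, FREQ_ZERO)
        bits.append(1 if is_one else 0)

    out = bytearray()
    for k in range(0, len(bits) - 7, 8):    # drop any trailing partial byte
        byte = 0
        for bit in bits[k:k + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    payload = b"any binary data"
    assert demodulate(modulate(payload)) == payload
```

The decoder only has to decide which of two tones is louder in each slot, not reproduce the waveform exactly, so a moderate amount of noise on the "tape" doesn't change the recovered bytes. A real system would also need a leader tone or start bits for synchronization, plus checksums.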
Sounds like GP is suggesting digitally encoding/modulating the audio bitstream, like using the speaker as a modem, which would be far less lossy than trying to record the analog signal.