I don’t use Signal because I think the servers it relies on are compromised: SVR and the RAM encryption it depends on (Intel: SGX, AMD: SEV) already have exploited vulnerabilities (see reference below).
But to answer your question in a helpful way anyway:
I use the Threema messenger app and have tested for you how it handles audio files.
Conclusion: Threema plays the audio file via the ear speaker, as expected.
So it is basically up to the messenger software whether you are allowed to use the smartphone like a phone or whether you have to hold it in front of your mouth like a jam sandwich when you want to listen to a received audio message.
And if it is not supported by the operating system and/or hardware, automatic switching from ear speaker to external speaker and vice versa is not possible.
I also tested the latter with my Threema audio file, but my Fairphone 3+ did not switch from one speaker to the other. So I think /e/OS and/or the Fairphone 3+ hardware could not detect the phone being brought to my ear and therefore did not switch the speaker.
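
To illustrate the decision I mean, here is a minimal sketch in plain Java. It is not taken from Threema, Signal, or /e/OS; the class, enum, and parameter names are all hypothetical. It assumes the messenger gets a proximity reading and the sensor's maximum range from the operating system, and that a reading below the maximum range means "near" (as far as I know, many Android proximity sensors only report near or far in this way). If no working sensor is exposed, the app can only stick to its own default speaker.

```java
// Hypothetical sketch: these names are illustrative and not from any real messenger.
public class SpeakerRouting {
    enum Output { EAR_SPEAKER, EXTERNAL_SPEAKER }

    /**
     * Picks the speaker for an audio message.
     *
     * sensorAvailable: whether OS and hardware expose a working proximity sensor
     * distanceCm:      current proximity reading (assumed to be in centimetres)
     * maxRangeCm:      the sensor's maximum range; readings below it count as "near"
     * appDefault:      the speaker the messenger uses when it cannot switch automatically
     */
    static Output chooseOutput(boolean sensorAvailable, float distanceCm,
                               float maxRangeCm, Output appDefault) {
        if (!sensorAvailable) {
            // No sensor support: automatic switching is not possible.
            return appDefault;
        }
        boolean near = distanceCm < maxRangeCm;
        return near ? Output.EAR_SPEAKER : Output.EXTERNAL_SPEAKER;
    }
}
```

With a working sensor, a reading of 0 against a maximum range of 5 selects the ear speaker, and a reading equal to the maximum range selects the external speaker. Without a sensor, the result is always whatever default the messenger chose, which would match what I saw on my Fairphone 3+.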