The most important thing in a service is often not just what is said, but how it's said. Is the speaker sharing a joke? Are they speaking with serious, pastoral care? So much of the meaning is carried in the tone, timing, and emphasis of the original speaker.
When you're closed off inside a text-to-speech audio feed, you miss all of that. You're disconnected from the feeling in the room and the warmth of the person speaking. You might get the words, but you risk missing the heart.
"I have been to a number of comedy gigs recently that have been signed by a BSL interpreter at the side of the stage. It's wonderful to see how they express the emotion and the humour through their physical expressions and their gestures on top of the sign language. The communication is so much more than just doing the signs. It's a full-body experience that communicates the richness of the comedy."
So it is with translation. At the moment, virtually all translation systems in practical use pass through a text stage, and that step strips out the original emotion and emphasis. Once lost, that nuance can never be fully put back in by a text-to-speech engine.