STT - Speech to text

MaxO · January 6, 2024, 7:33pm

Interesting topic and information, thank you all!
When activating Futo it says that the app will be able to read all passwords on any site where you use it. Is that the same for all STT models?
Or shouldn't I worry about it, and is it even a good sign that they tell me?

Lyerbird · January 8, 2024, 11:53am

That appears to be the same for every input app (Keyboard, voice app) you may add. Just try to deactivate a keyboard and re-activate it and you’ll see that you’ll get the same message. The app is open source and offline. So no data/passwords should be transferred anyways.

Gquirt · January 13, 2024, 1:16am

As Lyerbird said, it theoretically shouldn’t be an issue. If you’re worried about it sending data, you can always take away its ability to access the internet from the Settings app. I’ve done this with a few privacy-invading apps such as Gboard and the Pixel camera app.

Do know that FUTO apps use a weird pseudo open source license. The code is viewable, but they can take down forks of the project for any reason. Not ideal imo, but certainly better than Gboard.

petefoth · January 13, 2024, 7:20am

The Futo licence can be found here. They do say:

This temporary license is intended to protect our intellectual property while we work towards a more open and permissive license. This license is subject to be replaced with one that will grant you more rights, not less.

which sounds reasonable to me.

Gquirt · January 13, 2024, 7:20pm

Fair enough. Though nothing actually makes them replace that license with something better. I’ll choose to withhold my trust until it actually happens.

Won’t stop me from using cool free stuff, just that I wouldn’t build super critical infrastructure on it or contribute to the project myself.

meonkeys · February 6, 2024, 6:14pm

Wow, FUTO voice input performance is impressive. That’s great, I was stopping by to try to figure out how to get STT from a self-hosted Whisper server because Dicio doesn’t work well for me. But if FUTO (Whisper) works that well right on the device, why not use that?

For folks using FUTO: I’m not seeing any integration with the AOSP keyboard. Are you? I was expecting a mic icon to show up or something. If I use AnySoftKeyboard, I do see a mic icon and that does trigger FUTO STT.

I also asked in their chat: https://chat.futo.org/#narrow/stream/24-general/topic/how.20to.20trigger.20STT.20with.20AOSP.20keyboard

Agreed re: FUTO license. I hope they go with something standard.

More info on the company:

w1900 · February 7, 2024, 9:39pm

Integration with AOSP keyboard is working - you just have to enable the mic icon in the keyboard settings. Go to
Settings - System - Languages and input - On-screen keyboard - Android keyboard (AOSP) - Preferences.
Toggle the Voice input key switch to enable the icon.

Infinity · February 10, 2024, 9:50am

I’ve installed Futo, activated it and all, but when I hit the mic in SwiftKey the only option I am offered is to download “Google Voice Search”. I really don’t know what I am overlooking or what else I can try…

Lyerbird · February 10, 2024, 11:46am

In this Git issue it is suggested to open Futo at least once after an Android reboot. So, for this test only, you could try to reboot and open Futo once before you try to use the mic button.

Infinity · February 10, 2024, 12:05pm

Unfortunately this doesn’t change anything. Keeps asking for Google Voice Search.

Lyerbird · February 10, 2024, 12:26pm

Please also see the Help Menu inside the Futo app. It may have some relevant instructions or hints.
You could try to uninstall and reinstall Futo.

That said, Futo has a forum, and that might be a better place to troubleshoot this issue. Or you could open a new thread in this /e/OS forum.

demux · March 12, 2024, 5:56pm

I never really bothered with STT, knowing that most of the common STT applications use cloud backends for the audio transcription. But FUTO’s approach of running Whisper locally with trained speech models is great!

This also works fine for my major use case: voice-querying, from my phone, my private Ollama AI with an Open WebUI front end running on my home PC.

Edit: Open WebUI is pretty powerful in terms of features. It comes with built-in Whisper support for STT running on the GPU, which works well on localhost. But for the mic to work in the browser, it requires an HTTPS connection. So either I have to add a reverse proxy in between or stick with FUTO Voice Input for mobile use.
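
For context on the HTTPS requirement: browsers only expose microphone access (`getUserMedia`) in what they call a secure context, which roughly means an HTTPS page or a page served from the local machine itself. That would explain why the mic works on localhost but not when the same server is opened from a phone over plain HTTP. Below is a minimal Python sketch of that rule. It is a simplification, not the browser's actual logic, and the function name and the LAN address in the examples are hypothetical.

```python
from ipaddress import ip_address
from urllib.parse import urlsplit


def is_probably_secure_context(url: str) -> bool:
    """Rough approximation of the browser 'secure context' rule.

    Simplified on purpose: real browsers apply further conditions.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Any HTTPS origin is treated as potentially trustworthy.
    if parts.scheme == "https":
        return True

    # Plain HTTP is only accepted for loopback names and addresses.
    if parts.scheme == "http":
        if host == "localhost" or host.endswith(".localhost"):
            return True
        try:
            return ip_address(host).is_loopback
        except ValueError:
            return False  # an ordinary hostname, not an IP literal

    return False


if __name__ == "__main__":
    # 192.168.1.50 is a made-up LAN address for the home PC.
    for candidate in (
        "http://localhost:8080",
        "http://192.168.1.50:8080",
        "https://192.168.1.50",
    ):
        print(candidate, is_probably_secure_context(candidate))
```

Under this rule, a phone reaching the home PC over plain HTTP falls outside the secure context, which is where a reverse proxy terminating HTTPS would come in.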

Does anybody know why FUTO Voice Input did not make it to the F-Droid repo?

Hermano · March 12, 2024, 8:24pm

It is in F-Droid, but you need to add the repo.
https://app.futo.org/fdroid/repo/

demux · March 13, 2024, 6:18am

This looks more like FUTO’s own repository, which can be added to the F-Droid client app as an extra source, like a PPA.

So there must have been a reason that FUTO Voice Input did not make it into the main F-Droid repo or was rejected by them.

GaelDuval · March 13, 2024, 7:46am

Well, Futo is using the OpenAI Whisper API… (which is extremely good and fast but running on OpenAI servers… )

tyxo · March 13, 2024, 7:58am

What do you mean by that? It’s working completely offline. It doesn’t even try to make a connection to the servers of OpenAI.

GaelDuval · March 13, 2024, 8:20am

I might be wrong, just noticed they were using Whisper. But maybe it’s running the Whisper model on the device? (I haven’t dived into the source code)

GaelDuval · March 13, 2024, 8:27am

Sadly, Futo is NOT covered by an open source license https://gitlab.futo.org/alex/voiceinput/-/blob/master/FTL_LICENSE.md

tyxo · March 13, 2024, 8:33am

Yes, I think it’s running the whisper model. I am using the whisper model on my desktop. Here I use a tool called DSNote (speech note). It’s a very great tool.

tyxo · March 13, 2024, 8:35am

This is indeed an issue that others reported too. It’s not really clear why the developers of this tool use such a license, but I think they have something specific in mind. For example, getting a larger user base and then switching to something commercial. I don’t know.