STT - Speech to text

Hi,

Is there any work going on for a future version of /e/OS regarding STT (speech to text) or a smart assistant? This is quite an important feature for me (e.g. for dictating chats or notes), and I’ve been waiting for it for a long time.
There is this post about Elivia, but I couldn’t find any further info about it.
Maybe Mycroft’s Mimic3 (Git) could be used on Android, also as a smart assistant, or Mozilla’s DeepSpeech STT (in the future). Or is e.foundation waiting for Mozilla’s DeepSpeech? It seems to have been fairly inactive on Git for more than a year.
According to the German magazine Linux Magazin, DeepSpeech is more or less complete and can be used by app developers.

It doesn’t look like work on Elivia, a client/server model, ever progressed beyond proof of concept.

My impression is that on-device STT engines have progressed to the point of being usable. I played with Vosk inside Dicio not long ago and found it fun for simple assistant tasks (see “[FEATURE PROPOSAL] Open Source Voice Assistant on /e/ - #49 by tcecyk”). It’s on F-Droid.

If you look around on GitHub, apps depending on Vosk or DeepSpeech and local language models pop up. I think it’s a space to watch, rather than off-device services.

If you really need this functionality, it is possible using Google’s Gboard keyboard, which can be installed from Aurora Store. Yes, it’s Google software, and I guess it works by sending the speech to Google’s servers for processing (using the Google Cloud ‘Speech-to-Text’ API), but it does work without needing to log in to a Google account.

Yes, indeed. Gboard does work, but Google could (I guess) recognize my voice from previous usage of Google STT.
Additionally, Gboard potentially receives all the input I type on the virtual keyboard, which is not acceptable to me. Gboard does work in the Shelter app, but then things become complicated.
I don’t need the functionality in absolute terms. I can still use my phone without it, but I would find it very useful (as would many others).

Dicio is nice, but doesn’t have many features (at least not in my language, German) and has no STT, as far as I can see.
Genie Smart Assistant looks nice, but it doesn’t seem to have an Android app.
I’ve searched a lot, but didn’t find any usable STT app for Android :man_shrugging: besides Google’s Gboard.

Dicio: not many features, but dictation works in German if you have downloaded the accompanying model (the small one seemed good enough). I think it’s a matter of integration.

On GitHub you can find “Sayboard”, an app that combines Vosk with a keyboard IME, but it’s very barebones and, I think, downloads the large models, which take up lots of space (the GUI is also hard to click in dark mode), so I hadn’t linked it yet.
LocalSTT isn’t publicly packaged yet afaik (beyond the Catalan/Spanish prebuilt), so I can’t speak to its quality. I should give it a go.

Edit: I misstated this. Correct, there is no IME input through Dicio. I meant to say “it will faithfully translate your sentence to text”. Dictation really is not a function yet; it’s command/assistant focused.

I noticed today that /e/ experimented with Vosk about a year ago and packaged alphacep/vosk-android-demo with additional integration (language selection) as foundation.e.voice. You have to sign the APK yourself before installing it.

If you activate it as an input method (slide to on, as for the AOSP keyboard), you can use it through the mic icon on any input field… though when I do, I can’t see how to return to text input and am stuck :slight_smile:

Go to Settings and deactivate the input method again :wink:.
At least that’s what I did to get my normal keyboard back. Perhaps there’s something more convenient?
But the speech to text works really well; I was pleasantly surprised.

Thanks for your hints. Basically, I’m just a user who is looking for a dictation app that is open source and relatively easy to install on Android. It should work fairly reliably (not many bugs), and the STT quality should be acceptable in my language (German). It should be possible to integrate the STT app into a keyboard.
I understand that LocalSTT, Sayboard or foundation.e.voice may be candidates for this.
The recommendation is to try out foundation.e.voice.

…there are no easy options currently, [only] prototype stage…

Android STT Dictation

LocalSTT - see Forum
Sayboard
foundation.e.voice (based on vosk) - Package signing required → Forum → more Info + (recommended)

Gboard from Google - available on Aurora Store. Can be used in combination with Shelter (a privacy app), but it’s not very practical to switch often between Gboard and an open-source keyboard app, i.e. to freeze and unfreeze the Gboard app. Also, Gboard installed with Shelter can only be used with apps installed in Shelter. See forum

Elivia - no APK available
Mycroft - no APK available. It’s a smart assistant hardware box.
Mozilla DeepSpeech - no APK available. It’s just a model for developers to use.
Vosk - no APK available. It’s a backend that doesn’t work on its own.

Smart Assistant

Dicio - Android (F-droid) APK available
Mycroft - hardware box
Genie - no Android APK available. Only for Linux

Nice overview. Yes, there are no easy options currently; everything is at the prototype stage.

For the voice/text method switch: I had System → Accessibility → Voice recognition → Shortcut for VR enabled when fumbling around, and that icon stopped the keyboard symbol in the bottom right from showing up.

With this out of the way, I’d recommend that you sign the foundation.e.voice APK (4 commands) and give it a try. You’ll need the Android SDK tools. If you run a Debian-based Linux, those are already packaged.

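# Align the unsigned APK, then check the alignment
zipalign -p 4 app-release-unsigned.apk app-release-unsigned-aligned.apk
zipalign -c 4 app-release-unsigned-aligned.apk
# Create a signing key (keytool prompts for a password and identity details)
keytool -genkey -v -keystore my.keystore -keyalg RSA -keysize 2048 -validity 10000 -alias app
# Sign the aligned APK in place
apksigner sign --ks-key-alias app --ks my.keystore app-release-unsigned-aligned.apk
# Install the signed APK on the connected device
adb install app-release-unsigned-aligned.apk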
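
If you end up re-signing more than once, the align/sign steps can be wrapped in a small helper. This is only a sketch: the function names `run` and `sign_apk` and the `DRY_RUN` switch are made up for illustration, it assumes the keystore was already created with the keytool command above, and the extra `apksigner verify` step at the end is my addition to check that the signature took.

```shell
#!/bin/sh
# Sketch: align, sign and verify an unsigned APK with the Android SDK tools.
# Set DRY_RUN=1 to print the commands instead of running them.

# Run a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# sign_apk <unsigned.apk> [keystore] [key-alias]
# Assumes the keystore already exists (see the keytool command above).
sign_apk() {
    in_apk="$1"
    keystore="${2:-my.keystore}"
    key_alias="${3:-app}"
    # foo.apk -> foo-aligned.apk
    out_apk="${in_apk%.apk}-aligned.apk"

    run zipalign -p 4 "$in_apk" "$out_apk" || return 1
    run zipalign -c 4 "$out_apk" || return 1
    run apksigner sign --ks-key-alias "$key_alias" --ks "$keystore" "$out_apk" || return 1
    # Extra step, not in the original commands: confirm the APK verifies.
    run apksigner verify --verbose "$out_apk" || return 1
}
```

Source the file and call `DRY_RUN=1; sign_apk app-release-unsigned.apk` first to see which commands would run, then unset `DRY_RUN` to actually sign. The `adb install` step is left out on purpose so you can check the verify output before installing.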

You’ll need to enable both input methods in the Language and Keyboard settings: the AOSP keyboard and the newly added Voice Recognition.

Before first use, start the VR app and let it download the language model (~50 MB?).
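
If you get stuck in the settings UI, the input methods can also be listed, enabled and selected over adb with Android's `ime` shell command. A rough sketch, not something tested on /e/OS: the helper names `run`, `list_imes` and `use_ime` are made up, and the ID `com.android.inputmethod.latin/.LatinIME` in the usage note is the stock AOSP keyboard ID, which I am only assuming matches your build. Take the real IDs (including the one for the voice app) from the list output.

```shell
#!/bin/sh
# Sketch: list, enable and select Android input methods over adb.
# Set DRY_RUN=1 to print the adb commands instead of running them.

# Run a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the IDs of all installed input methods, one per line.
list_imes() {
    run adb shell ime list -a -s
}

# use_ime <ime-id>: enable the input method and make it the active one.
use_ime() {
    ime_id="$1"
    if [ -z "$ime_id" ]; then
        echo "usage: use_ime <ime-id>" >&2
        return 2
    fi
    run adb shell ime enable "$ime_id" || return 1
    run adb shell ime set "$ime_id" || return 1
}
```

For example, `list_imes` shows what is installed, and `use_ime com.android.inputmethod.latin/.LatinIME` should bring back the normal keyboard if the voice input method has taken over.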

On my phone it’s quite simple to switch: when both keyboards are enabled, there is a small button/icon that appears to the right of the Recents button when either keyboard is active. Pressing this will allow you to switch between keyboards.

I use AnySoftKeyboard and I can’t see a button to switch between keyboards. Even if there were a button, I couldn’t switch to Gboard because I can’t see it in my Android settings. This is because it’s installed with Shelter, so I can use Gboard only with apps installed in Shelter. However, the main apps I would use Gboard with (e.g. chat app, car navigation or note app) are not installed in Shelter.
Installing Gboard without Shelter is not an option for me.

@tcecyk thanks for your hints. I’ve updated the summary and will try out foundation.e.voice.

I don’t use Shelter. Gboard is installed in the ‘normal’ profile, so switching is easy :slight_smile:

I can switch between different keyboards and language settings by pressing and holding the space bar. I think there is a specific setting for this.

It’s not working for me with AnySoftKeyboard, and even if it did, the Gboard keyboard would not appear there since it’s installed in Shelter.
I know that the space bar is supposed to work with AnySoftKeyboard and other keyboards, but I don’t know why it’s not working for me, and I don’t need it currently anyway.

Finally, Dicio has released a version with a working STT service. :tada:
It is a first version, so don’t expect too much (e.g. there are no punctuation marks), but I find it useful and it saves me time when dictating text for SMS, chat or notes, compared with typing on the keyboard.

True. Works quite well for me too.

Dicio seems to recognize text well enough. Nice!

I’m a bit lost in this thread… is there some way to trigger Dicio to input text instead of me typing, in any arbitrary app?

I can copy and paste what I say into Dicio, but that’s quite a few taps (see below). I’m looking for something more like what Gboard does when you tap on the mic icon (and I’d rather not use Gboard).

Right now my “quite a few taps” process is to:

  1. open Dicio
  2. touch the hamburger menu icon
  3. touch “Speech to text service”
  4. say something
  5. touch the copy button, and finally:
  6. paste that text into another app

I tried fiddling with Android OS settings a bunch and was able to set Dicio as the “digital assistant”, but this doesn’t seem to do anything useful.

If you use another keyboard app (AnySoftKeyboard), it will recognize the availability of Dicio when you tap the microphone. The service will not be available instantly, though. System IME integration as voice input (or by intent) is currently lacking.

For both issues there are entries at:

My recommendation is FUTO Voice Input, for several reasons.

  1. Its speech recognition is extremely good, so most words are recognized correctly.
  2. In contrast to Dicio, it recognises capital letters, punctuation and the end of sentences.
  3. If you speak more than one language, it can be set to detect the language automatically and switch to it.
  4. Completely offline
  5. No known trackers

https://gitlab.futo.org/alex/voiceinput
https://github.com/futo-org/voice-input

It works well with AnySoftKeyboard.
