[FEATURE PROPOSAL] Open Source Voice Assistant on /e/

hi all,
https://f-droid.org/en/packages/ai.susi/
this one is just for the home:
https://f-droid.org/en/packages/io.homeassistant.android/
Rik

1 Like

haha Susi and Leo :slight_smile:

This is an excellent idea, Julien, and I am happy to see that Manoj says it is on the /e/ roadmap.

Suggesting building on an existing project such as Mycroft is brilliant, also.

2 Likes

I personally don’t like them, but they are popular and from an accessibility standpoint they’re crucial. However, if voice recordings are uploaded and analysed on a server, this needs to be communicated very clearly, so people can make an informed choice about using them.

1 Like

Same here.
I find it annoying when people make loud calls or don’t hold their cell phones to their ears. It’s loud enough around us already. I don’t need people talking with their mobile phones. Or with Alexa or Siri or what they’re all called. That just makes us all more stressed.

Sure, for people with certain handicaps it’s OK, but please not for everyone, everywhere.

1 Like

It is not really a dictation app, but I use VoiceTra (Voice Translator) and save the text output.
I haven’t found any other dictation app with decent privacy.
If someone has found a better workaround, I would love to hear it.
I’m using a OnePlus 3.

As an aside, on my Samsung Galaxy Tab, running stock Android 7.1.2, very few non-Google dictation apps work properly.

Mozilla’s DeepSpeech can do speech-to-text entirely on your device, without any connection to cloud providers. It only comes with trained models for English, but other languages are on the way. However, it is a developer tool for this kind of task, not a simple, end-user-friendly app.

https://hacks.mozilla.org/2019/12/deepspeech-0-6-mozillas-speech-to-text-engine/

This assistant needs to connect to an account. It lets you connect to your own server, but I can’t find out how to set one up.

Well, I understand that, but everyone should be happy if we can make our phone easier to use…

Voice assistants are good, but they are neither the best nor the only way to make your phone super efficient and easy to use. We could try something simpler and quieter, such as a visible “screen assistant” (think of Apple’s little round thing, I dunno its name :sweat_smile:). It is silent by default, and can become audible if you turn on a switch (with a TTS engine behind it, which /e/ doesn’t have…). It would show a list of actions depending on the app you’re using, and could say hello and chat with you (using text) if connected to some AI.

Actually this “screen assistant” can be quite cute if you replace the boring default icon with an animated pony or an anime figure! :smiling_face_with_three_hearts: That’s quite sweet! :smiling_face_with_three_hearts:

1 Like

Hi all,

I recently looked into this and found:
https://almond.stanford.edu/about/get-almond
and
https://www.openjarvis.com/

I haven’t tried them yet but will see if they’re any good. Unfortunately they’re not available on F-Droid, only on the Play Store, and the OpenJarvis one seems to have some doppelgangers, so it may be difficult to find the trustworthy version. I’ll also try out Susi; I’d not seen that one before.

Unfortunately though, reliable open source apps which do STT and voice control seem to be very rare, or not available yet. Since voice control is becoming an increasingly common (even the most common in some countries/demographics) way to interact with smartphones, I think this is something that the /e/ team should have high on their priority list.

Cheers :slightly_smiling_face:

An app that does STT actually exists. It’s called Kõnele, and it is available on F-Droid. It does support English, although it provides better support for Estonian. For the best experience on /e/OS, /e/ should set up an official server, although that is easier said than done…

1 Like

Hi @Xenium,

I installed Konele but I can’t get it to recognise English. This only appears to be available in the grammar mode, but even then I can’t get it to work. Does it only work in specific apps?

Everyone :slightly_smiling_face:

Almond doesn’t have a STT engine, it needs it from another app. Thus far I have not tested it much.

Open Jarvis is intended for home automation running on a Raspberry Pi, so this Android app only works with that.

I also found SUSI.AI but didn’t use it yet because it needs a login. I don’t know if this includes a STT engine.

There is a DeepSpeech demo app which can be built for Android here:


I don’t know exactly what it does because I’ve not tried it yet.

More generally, I found that STT engines for Android are very rare! To understand this I’ll explain what I’ve found.

First, we can break down the “voice assistant” into two parts:

  • STT engine to convert voice to text
  • AI “assistant” which interprets the resulting text to commands to execute on the phone or over the internet

We can have two separate apps for this and that may actually be an easier way to do it (also the STT engine can be used for voice typing as well).
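As a rough illustration of that split, here is a minimal Python sketch. Everything in it is made up for the example: `fake_stt`, `interpret`, `Command` and the three rules are hypothetical and do not come from any of the apps mentioned in this thread. The stand-in "STT engine" just decodes bytes as text so the example runs without a microphone or a model; a real engine would sit behind the same function signature.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Command:
    """A command the phone could execute (hypothetical structure)."""
    action: str
    argument: str = ""


def fake_stt(audio: bytes) -> str:
    """Stand-in for a real STT engine: treats the 'audio' as UTF-8 text."""
    return audio.decode("utf-8")


# Hypothetical rules mapping recognised text to actions.
RULES = [
    (re.compile(r"^call (?P<arg>.+)$"), "call"),
    (re.compile(r"^set (?:a )?timer for (?P<arg>.+)$"), "timer"),
    (re.compile(r"^open (?P<arg>.+)$"), "open_app"),
]


def interpret(text: str) -> Optional[Command]:
    """The 'assistant' part: turn recognised text into a command, or None."""
    cleaned = text.strip().lower()
    for pattern, action in RULES:
        match = pattern.match(cleaned)
        if match:
            return Command(action, match.group("arg"))
    return None


def assistant(audio: bytes, stt: Callable[[bytes], str] = fake_stt) -> Optional[Command]:
    """Voice assistant = STT engine followed by the interpreter."""
    return interpret(stt(audio))


def dictate(audio: bytes, stt: Callable[[bytes], str] = fake_stt) -> str:
    """Voice typing reuses the same STT engine without the interpreter."""
    return stt(audio)
```

Because the STT step is just a function passed in, the same engine can serve both `assistant` and `dictate`, which is the reason for keeping the two parts separate.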

There’s also a subtlety here: “Assistant” apps have a clear place to be set as default in the Android settings. For STT engines, however, it is not so clear; there are in fact three places:

  • Voice Input setting - in default apps, assist app, you can set a default app for voice input. Konele provides this service but very surprisingly most voice keyboards and assistant apps do not
  • Virtual keyboard setting - Apps which can do their own STT for typing but don’t make it available to other apps, e.g. Gboard
  • Assistant app! Assistant apps which do their own STT but don’t make it available for other uses, e.g. Cortana and Alexa are like this.

So, for a true STT engine we need it to provide the voice input service (and others, this is well described in the Konele docs).

Most voice typing apps, including keyboards that can do STT, actually use Google’s STT engine, so if Google isn’t present they don’t work.

Try to find a non-Google STT engine which you can use with other apps - they exist, but are almost never open source. E.g. Swift Keyboard, Dragon Dictation.

So far, the most complete option I have found overall is Kõnele, but it needs a network connection and so far is only available in Estonian (unless you can get it to work in English or provide an English language server).

It would be preferable to have an on-device STT engine, but these are quite demanding on resources. Apparently DeepSpeech with TensorFlow Lite can run on an RPi4, so it should be okay on most smartphones.

So is this voice assistant for /e/ on the roadmap, does anyone know?

Cheers :slightly_smiling_face:

1 Like

Nice analysis, @madbilly :smile:

Now you don’t have to test SUSI.AI, because I did that a few months ago. It doesn’t include any STT; it just provides a messenger-like interface to “chat” with the AI.

I have Kõnele installed on two phones. It works on both (occasionally), but turns out to be miserably inaccurate. Imagine most of the "b"s you speak being recognized as "p"s. :fearful:

Actually Mozilla is building a voice assistant, but it’s only a plugin and is currently in beta. I have tested it too; it’s not that accurate, but usable anyway…

1 Like

Hi @Xenium,

Thanks for the info on SUSI.AI, I’m glad I don’t need to test it! :smiley:

Good to know your experience with Konele. I presume that the absence of any English (or other language) servers for it means that it’s not easy to set up and/or may not work very well.

You mean Firefox Voice plugin for Firefox desktop? Yes I also signed up for the beta of this but haven’t used it much. I would prefer an app on Android to one on desktop, since for desktop my habits of mouse and keyboard are pretty difficult to shake!

Cheers :slightly_smiling_face:

2 Likes

So it seems we have to wait for Elivia… I just switched completely to /e/ OS, including a locked bootloader (OnePlus 7T Pro), and have been searching for a simple assistant for months… Most of the stuff is incredibly outdated or in an alpha state. It’s really interesting how many approaches exist, but none is really finished. The Mycroft approach uses Google voice recognition, so it is not an option either, at least not unless that changes:

Taken from their github page:

It implements voice recognition and Text To Speech (TTS) via Google API’s at the moment, but that may change soon.

I REALLY like the approach of having the whole thing on the device itself without any need of a network connection or own server but again… alpha :wink:
https://github.com/Tadashi-Hikari/Sapphire-Assistant-Framework

So there has been no development on Elivia for 9 months, which leads me to the conclusion that there will be nothing (usable) anytime soon… not even in months. I hope I am wrong, but that is the frustrating result of my research.

Apropos alpha: Alpha Cephei’s VOSK (offline speech recognition on Android) is being used in Dicio, which can handle some languages and basic tasks. I haven’t found either mentioned in this forum.

It’s fun to play around with and gets small releases. It is still a toy, but surprisingly good for an on-device solution. The language models you need to download can be large. It’s available via F-Droid, and thus the /e/ Apps, or from the GitHub releases section.

2 Likes

Dicio seems to have some potential, I tried it out myself today. As you said pretty limited still, but it does a decent job at what skills it does have, especially for a relatively new project.

Vosk is also impressive. I would argue a privacy-conscious STT engine is more important than an assistant at this stage of /e/. I see the lack of STT for texting on the go as a big stumbling block for adoption of /e/ by the masses.

I hope both of these teams have better luck than their predecessors…

3 Likes

After some searching I found this project on GitHub. Felicis has forked VOSK and packaged it as an input method for Android; basically it’s a voice keyboard. It’s labeled pre-release and Felicis calls it a proof of concept, but it is working, at least in English. German is also available but I can’t comment on that.

Unfortunately, this one and only release is from 10 months ago. However, on another fork Felicis did mention they have been too busy to get back to this but welcome anyone to jump in. I wish that I had the skills! It looks like the ultimate goal is to wrap it into a ‘recognition service’ that will make Konele work offline.

merge with konele: change to RecognitionService · Issue #1 · Felicis/vosk-android-demo · GitHub

2 Likes

The biggest problem with Mycroft is that it really depends on Mycroft’s servers. If you read the statement on their own pages, Mycroft probably won’t be usable without sending your (audio) data to their (closed) servers.

An alternative:
@Blort@social.tchncs.de :link: Blort™ (Unofficial) 🚫: "Looking for a #FOSS #VoiceAssistant with a much b…" - Mastodon

Looking for a #FOSS #VoiceAssistant with a much better #Privacy policy than #Mycroft (which just ate #Rhasspy)? #Genie is looking amazing, and uses unique algo’s to understand requests that #Amazon #Alexa and #GoogleAssistant can’t!

1 Like