MediaPortal Forums HTPC/MediaCenter

#1, 2006-05-21, 18:45
myVoice Plugin: Progress report

Hey all,

As I've stated on another thread, I've started working on a speech recognition system for MP utilizing MS Speech 5.1.
Currently it does the following:

When you start MP, the only thing it understands is the keyword "my computer".
If you say the keyword, it switches to recognizing the following commands:
-move up
-move down
-move left
-move right
-previous menu
-home
-exit
-my music
-my videos
-settings
-select item
etc...(many words not tested yet)

Phrases such as "my music" will take you to that window from whatever screen you're on.
If you don't say anything it understands in 5 seconds, it reverts back to just understanding the keyword.
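
The two-mode behaviour described above (keyword only, then full commands until five seconds pass without a recognized phrase) can be sketched as a small state machine. This is an illustration in Python, not the plugin's own code; the class name `VoiceModeSwitch`, its parameters, and the choice to restart the five-second window after each recognized command are assumptions rather than details taken from the plugin.

```python
import time


class VoiceModeSwitch:
    """Hypothetical two-mode listener: keyword-only until woken, then full commands."""

    def __init__(self, keyword, commands, timeout=5.0, clock=time.monotonic):
        self.keyword = keyword.lower()
        self.commands = {c.lower() for c in commands}
        self.timeout = timeout
        self.clock = clock          # injectable so the timeout can be tested
        self.awake_since = None     # None means keyword-only mode

    def _expire_if_idle(self):
        # Fall back to keyword-only mode once the window has elapsed.
        if self.awake_since is not None:
            if self.clock() - self.awake_since >= self.timeout:
                self.awake_since = None

    def hear(self, phrase):
        """Return the command to run, or None if the phrase is ignored."""
        self._expire_if_idle()
        phrase = phrase.strip().lower()
        if self.awake_since is None:
            if phrase == self.keyword:
                self.awake_since = self.clock()   # switch to full recognition
            return None
        if phrase in self.commands:
            # Assumption: a recognized command restarts the five-second window.
            self.awake_since = self.clock()
            return phrase
        return None   # unrecognized phrases do not extend the window
```

Passing the clock in as a parameter keeps the timeout logic testable without real waiting; a real plugin would more likely drive the fallback from a timer event.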

My next steps are:
-further explore MP's interface and add the corresponding voice commands
-add modifiers (ex: move down three)
-integrate playlist/video/music selection through artist, genre, etc
-multi-language support (could be SDK-specific, though)?
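
One way the planned modifiers could work is to peel a trailing number word off the recognized phrase and treat it as a repeat count. This is a minimal sketch in Python, assuming a hypothetical `parse_command` helper and a small hand-written number table; the plugin itself might handle this inside the speech grammar instead.

```python
# Hypothetical number table; only a few spoken counts are covered.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def parse_command(phrase, known_commands):
    """Split a phrase such as 'move down three' into (command, repeat count).

    Returns None when the remaining words are not a known command.
    """
    words = phrase.lower().split()
    count = 1
    # A trailing number word is read as the modifier.
    if len(words) > 1 and words[-1] in NUMBER_WORDS:
        count = NUMBER_WORDS[words[-1]]
        words = words[:-1]
    command = " ".join(words)
    if command not in known_commands:
        return None
    return command, count
```

A caller would then send the mapped action `count` times, so "move down three" behaves like three separate "move down" commands.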

What I could use:
-General feedback
-beta tester(s)
-collaborators

I'm holding off on a large public beta for now: I'm looking for an easier way to distribute the MS Speech engine, since most people don't need most of the SDK.

That's about it for now. I'll use this thread to update my progress, so please feel free to post or PM me.
-- booyakasha

#2, 2006-05-21, 19:00

Needless to say, this will be one hell of an improvement.

I'd love to be on call as a beta tester for your work, and of course to help with any ideas I can provide!!
-- smnnekho

#3, 2006-05-21, 19:48

I just love this

Would it be possible for the end user to choose the speech engine, or do we have to use Microsoft's?

In my opinion, multilingual support should not be a priority.
English and French are the most important in the beginning.
__________________
htpc- all in one.
-- aasmund Nordal

#4, 2006-05-22, 09:04

Hello!

I've already worked on a similar plugin, but I had a problem with popups when using voice as a remote: MP froze whenever a popup menu (F9) opened.
Have you solved this problem, or does your plugin not have it?

I don't think there is a French speech engine (I couldn't find one). Perhaps there is a French Canadian one (I read about it on a forum, but I couldn't find that either).

Good job.
-- guilhem

#5, 2006-05-22, 12:49

Quote:
Originally Posted by aasmund Nordal
In my opinion, multilingual support should not be a priority.
English and French are the most important in the beginning.

In my opinion, multilingual support is important, because it increases acceptance within the family. My two sons (10 and 8 years old) don't speak English or French, so it is a must.

If it works, they won't use IR or the touchscreen anymore.

eagle
-- eagle

#6, 2006-05-22, 13:16

IMHO, the highest priority should be to make this as configurable as possible. That depends on how the 'keywords' are stored. If they are internal, there would be no real chance of changing them, but if they are stored externally with reference IDs, it would be customizable (at least so that you could say 'my computer', 'Cassiopeia', 'dude', 'KITT' or whatever (-; as the activating keyword, etc.).

A question about the engine: I guess it really recognizes the words, meaning you don't have to record a command and save it as a keyword (which wouldn't be very (multi-)user friendly), but you have 'My computer' written in the source and the engine just recognizes it?
-- smnnekho

#7, 2006-05-22, 13:37

MS Speech is speech-to-text, so it decodes the speech into text; there is no need to "record" a command. The downside is that you need to speak English pretty well (maybe not for these simple commands, though).

Waiting with great expectations.
-- zion22

#8, 2006-05-22, 17:28

guilhem: I'm not using SendKeys. In an early test I ran into this freezing problem as well when using that method. Now I just send messages (GUI and action) to MP directly.

smnnekho: Currently none of the recognition is really hardcoded, except the switch between recognition modes. An XML file contains many of the mappings from MP's actions/windows to voice, so it's quite easy to change something like "my music" to "my tunes", or "my computer" to "KITT".

aasmund Nordal: It's built specifically for the MS Speech 5.1 engine (free and easy to integrate with C#). If I run into too many problems down the line, especially with multi-language support, I'll look into using another engine (Sphinx?).

As for the multi-language support, that's something I'll probably look at in more detail once I get the full functionality working.
-- booyakasha

#9, 2006-05-22, 18:44

I would second that multi-language support shouldn't have priority. IMHO English would please most of us (and I'm not even a native speaker).

The priority should be to get this working well in one language before making it work for others. Of course, none of this holds if implementing other languages turns out not to be much of a problem...

2 questions / 1 suggestion:

Will the engine recognize a specific word even if it's inside a whole sentence? Just because of the effect on visitors (-; It's much more impressive to embed the command words (not the activating keyword) in a whole sentence.

Will the XML files containing the 'keywords' and the 'gui.actions' be universal? Meaning, can you add your own commands, even for your own gui.actions? Or will you implement every conceivable action anyway? (-;

Suggestion (only if you haven't thought about it already):

I would think about a tiny sound prompt. When the computer is ready to take commands after you say "my computer" (KITT was just a joke, by the way), a small beep-beep like in Star Trek would be cool. It should be as gentle as possible, though, so it doesn't get annoying. And another signal for the end of the 'listening sequence', so that if something went wrong you don't keep talking for 10 minutes (-;
-- smnnekho

#10, 2006-05-23, 06:56

Quote:
Will the engine recognize a specific word even if it's inside a whole sentence? Just because of the effect on visitors (-; It's much more impressive to embed the command words (not the activating keyword) in a whole sentence.

Not quite sure what you mean here. Currently I have a 1-to-1 relationship between a phrase/word and an action, although I'll add more. For example, a cursor-up command could be mapped to both "move up" and "up".

Quote:
Will the XML files containing the 'keywords' and the 'gui.actions' be universal? Meaning, can you add your own commands, even for your own gui.actions? Or will you implement every conceivable action anyway?

The gui/window actions currently mimic the structure already contained in a file called keymap.xml, found in your MediaPortal folder. I'm basically adding a <voice> tag to that (through a secondary file, so as not to change the original). For me to send MP a command, MP has to know about it. I'm not sure yet how this applies to other people's plugins.
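
The secondary-file idea can be illustrated roughly like this: a base file lists the actions the application knows, an overlay adds one or more <voice> phrases per action, and phrases for unknown actions are dropped. This is a sketch in Python under loud assumptions: the XML layouts, every element name other than <voice>, the action ids, and the `build_phrase_table` helper are made up for illustration, and the real keymap.xml may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the application's own key map.
BASE_KEYMAP = """
<keymap>
  <action><id>3</id><description>Move up</description></action>
  <action><id>4</id><description>Move down</description></action>
</keymap>
"""

# Hypothetical secondary file: same action ids, plus <voice> phrases.
VOICE_OVERLAY = """
<voicemap>
  <action><id>3</id><voice>move up</voice><voice>up</voice></action>
  <action><id>4</id><voice>move down</voice></action>
  <action><id>99</id><voice>do magic</voice></action>
</voicemap>
"""


def build_phrase_table(base_xml, overlay_xml):
    """Map each spoken phrase to an action id that the base file knows."""
    base = ET.fromstring(base_xml.strip())
    known_ids = {action.findtext("id") for action in base.iter("action")}
    table = {}
    for action in ET.fromstring(overlay_xml.strip()).iter("action"):
        action_id = action.findtext("id")
        if action_id not in known_ids:
            continue  # the application would not understand this action
        for voice in action.findall("voice"):
            phrase = (voice.text or "").strip().lower()
            if phrase:
                table[phrase] = int(action_id)
    return table


PHRASES = build_phrase_table(BASE_KEYMAP, VOICE_OVERLAY)
```

Because several <voice> entries can point at one action, both "move up" and "up" end up on the same id, and renaming a phrase only means editing the overlay file.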

As for your last suggestion, the last thing I added was a beep when switching to full recognition mode, and an exclamation sound when switching to "my computer" mode. I'm still not completely happy with that solution, as I think a visual representation would be better, but I'm not sure how to do that at this point.
-- booyakasha

All times are GMT +1.