
scratch-vui's People

Contributors

dependabot[bot], quachtina96

Forkers

annoogy

scratch-vui's Issues

Support "I don't know"

In the linear project creation process, a child is asked:

"What do you want to call it [the project]?"
"What's the next step?"

They might not know.

Current behavior:
1)
"What do you want to call it [the project]?"
"I don't know"
// Scratch ignores anything that doesn't start with "Scratch".
2)
"What do you want to call it [the project]?"
"Scratch I don't know"
"Cool! When you say Scratch I don't know, I'll play the project. What's the first step?"
"Scratch I don't know"
"I heard you say I don't know. That doesn't match any Scratch commands."

Desired behavior:
"What do you want to call it [the project]?"
"Scratch I don't know"
"That's okay. Before you leave the project, I'll ask you to name it. What's the first step?"
"Scratch I don't know"
"A random Scratch command you could use is: ______. <skippable explanation?>"

Change listening mode: toggle requiring the user to say "Scratch" before everything

Someone who is very intentional or quiet might not want to have to say "Scratch" before every Scratch command.

Examples of Supported Interactions (from user):
"Stop making me say Scratch"
"Stop making me say Scratch before every scratch command when I'm editing a project"
"Stop making me say Scratch when I'm editing a project"
"Only listen when I say Scratch"
"Always listen to me even if I don't say Scratch"

Reduce the number of misrecognized commands.

"Play note fifty for zero point two five beats" is recognized as "play note 54.25 beats"
"Play note fifty for zero point two five beats" is recognized as "play note 54 0.25 beats"

Basically, "for" is misrecognized as "four".

Some things we can try

  • add Scratch command structures to the grammar
  • attempt to match rhymes/homophones when a command cannot be parsed validly?

This makes me think there could be some way to probabilistically determine the most likely speech, or the desired form of the speech... but how would you generalize this? How might I build a model for this?
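One hedged sketch of the "match against the grammar" idea: generate homophone substitutions of the misrecognized text and keep the first candidate the command grammar accepts. The grammar and homophone table below are illustrative assumptions covering only the "play note" example:

```javascript
// Illustrative homophone table: digits the recognizer emits in place
// of function words, including "for" fused into a preceding number.
const HOMOPHONES = [
  [/\b4\b/g, 'for'],       // standalone "4" heard for the word "for"
  [/(\d+)4\b/g, '$1 for'], // "... for" fused into the number, e.g. "54"
];

// Toy grammar for a single Scratch command structure (an assumption).
const GRAMMAR = /^play note \d+ for \d*\.?\d+ beats$/i;

// Try substitutions until one yields a grammatical command.
function repairUtterance(utterance, grammar = GRAMMAR) {
  if (grammar.test(utterance)) return utterance;
  for (const [pattern, replacement] of HOMOPHONES) {
    const candidate = utterance.replace(pattern, replacement);
    if (grammar.test(candidate)) return candidate;
  }
  return null; // no grammatical repair found
}
```

Note the repair is still ambiguous (the same fused digits could have come from different number words), which is exactly where a probabilistic model over candidates would help.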

Support use of projects as instructions to be made in other projects.

e.g. When I make a project called "say my name in cat language" whose instructions are "play the meow sound", can I make a new project called "i really think you are a cat" and have its instruction be "say my name in cat language 5 times"?

This also raises another question: beyond simple reuse, should projects be able to be recursive? Thinking about recursive functions... useful recursive functions usually take inputs or operate on some variable that is maintained and updated outside of the recursive helper function. Currently, projects do not provide a way for the user to provide input... but that would be doable!

Support project descriptions!

In the same way that well documented code often has a README file and in-line comments, our system could support the same thing. The README may be analogous to the project page.

User: "Add a description to the project" or "Add a message to the viewer/user" or "add a project description" or "Here's what the user has to know" or "add project instructions"
Scratch: "Tell me about the project" or "What's the project description"
User: "This project is about ...."

  • speech recognition --> text saved as description; speech synthesis
  • users can record their own project descriptions

Should the system automatically insert "that's it" for simple conditionals/loops?

This would let ScratchNLP parse the instruction properly (it expects every instruction to end with "that's it"):
"if one plus one equals two, play the meow sound (that's it)"

One argument against (multicommand conditional case):

  • a user might not say "That's it" until the NEXT utterance.
    "if one plus one equals two, play the meow sound"
    "play the chomp sound"
    "play the bing sound 2 times"
    "that's it"
    ^ Using questioning/guidance can be a way to guide the flow of creating multi-command conditionals and loops.

Those utterances could then be combined to create the right output. Is this expected behavior, or one that is harder to understand?

Explore using syntactic-semantic rule structure to create a more flexible interface.

Based on the work we did in ScratchNLP and in lab 3 of 6.863, can we integrate that system into the Scratch VUI coding interface?

It seems like the perfect use case because it's where we want the flexibility and knowledge to lie. By being able to handle different parts of speech and questions, we can provide users with access to a set data model that can be really meaningful and intuitive.

Some things I anticipate/some questions to explore

  • handling punctuation
  • running on the client side (a JavaScript-based version/port of the code? --> could be better in the long run so the server doesn't have to take care of the computation)
  • what would the semantic rules be? need to revisit how lab 3 worked to understand the data model being used
  • how would this rule interface interact with the state machine and the ability to match actions, etc.?

Handle the variation caused by thinking out loud and children's speech

Talking to figure out something or wanting to rephrase is a natural occurrence in conversational speech. The system will need a way to handle this variation in a non-frustrating way.

IDEAS

  • remove filler words
  • ignore incomplete sentences in utterances/commands
  • "Scratch, hold on" (to pause any interpretation of what is being said). "Ready, Scratch?"

Allow users to create projects that take in inputs (basically functions).

When designing the scratch-vui system, I imagined that projects would be modular and composable, so that users could reuse their projects to create even more complex behavior. The way projects are activated in Scratch-VUI ("Scratch, <project_name>") frames projects as commands, or things that Scratch might do. That kind of makes sense... but it doesn't actually provide a framework for users to provide input (when programming and when interacting with the project itself).

I propose the following solution.

  1. Explicitly provide the values at the time the project is called/referenced: "Scratch, <project_name> with <variable_name> as ___ (and ... and <variable_name> as ___)"
  2. When running a project that wants inputs, Scratch will ask the user for any values that were not provided when the project was run.
  3. When creating a project that uses an input-taking project as an instruction, Scratch will ask the user for the values while the user builds up the project.

There is more to figure out around actually modifying these values (plus considering how these fit into the Scratch project representation).
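Parsing the proposed "with ... as ..." invocation syntax could look like the sketch below; the function name, return shape, and example values are hypothetical:

```javascript
// Parse "Scratch, <project_name> with <var> as <value> and <var> as <value>".
// Returns { project, inputs } or null if the utterance doesn't match.
function parseProjectCall(utterance) {
  const m = utterance.match(/^scratch,? (.+?)(?: with (.+))?$/i);
  if (!m) return null;
  const call = { project: m[1], inputs: {} };
  if (m[2]) {
    for (const pair of m[2].split(/ and /i)) {
      const kv = pair.match(/^(.+?) as (.+)$/i);
      if (kv) call.inputs[kv[1]] = kv[2];
    }
  }
  return call;
}
```

A naive split on "and" breaks when a value (or a project name containing "with") includes those words — one more thing to figure out in the grammar.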

Add a "hello" getting started sequence for new users.

Based on certain metrics for knowing how experienced a user is, I could introduce certain kinds of vocabulary as scaffolding, and give the user the ability to skip it if they so desire.

One element of scaffolding I'd like to create is a sort of "hello" getting-started sequence.

Design:
"Hi, I'm Scratch, a tool for you to build and interact with Scratch projects. Scratch projects are computer programs that you can play, interact with, and share. You can create Scratch projects by telling me instructions. I keep track of these instructions, and when you say the name of the project, I will follow the instructions step by step."

Trigger listening mode with "Scratch" alone

Maintain state for when "Scratch" was said by itself in the last utterance. For example:

"Scratch"
"How many projects do I have"

should get the response to "scratch how many projects do i have", instead of no response because "How many projects do I have" did not begin with the trigger word.
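The armed-trigger state could be a single flag carried across utterances; `preprocess` here is a hypothetical stand-in for the recognition handler, not the real scratch-vui code:

```javascript
// True when the previous utterance was the wake word by itself.
let triggeredLastUtterance = false;

function preprocess(utterance) {
  const text = utterance.trim().toLowerCase();
  if (text === 'scratch') {
    triggeredLastUtterance = true; // wake word alone: arm for next turn
    return null;                   // nothing to execute yet
  }
  if (triggeredLastUtterance) {
    triggeredLastUtterance = false; // consume the armed trigger
    return `scratch ${text}`;       // treat as if wake-word-prefixed
  }
  return text.startsWith('scratch') ? text : null;
}
```

The flag is consumed after one utterance, so a stray "Scratch" doesn't leave the system permanently hot.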

Validate Inputs

Problem:
You can name a project anything,
e.g. project name = "scratch create a new project", and when you're done, you can't call the project because "scratch create a new project" will always trigger project creation.

Expected Behavior:
The system responds, saying that you can't use that name and asks for a different name.
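A sketch of the expected behavior: check a candidate name against the command triggers before accepting it. The trigger list below is an illustrative stand-in for the real ScratchAction triggers:

```javascript
// Stand-in for the real ScratchAction trigger regexes.
const RESERVED_TRIGGERS = [
  /create a new project/i,
  /delete (?:the )?.+ project/i,
  /how many projects/i,
];

// A name is invalid if any command trigger would fire on it.
function isValidProjectName(name) {
  return !RESERVED_TRIGGERS.some((trigger) => trigger.test(name));
}
```

When validation fails, the system would say the name can't be used and re-prompt, rather than silently storing an uncallable project.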

Give users option to spend less time listening to ScratchVUI interface

For new users, it's important to give guidance and have them hear Scratch-VUI out. However, responses can be long. We want the user to be empowered and engaged, able to go at their own pace without having to hear Scratch-VUI out every time (especially if it's repetitive).

  • concise mode (different set of strings are used to communicate -- see the connectToVM_cypress branch for strings.js)
  • audio cues
  • skippable speech synthesis

On audio cues
When a user gets a command wrong, they have to hear, over and over:
"I heard you say ____, that's not a Scratch command." This was initially designed assuming two failure modes:

  1. the speech recognition mishears you
  2. you say something that isn't a Scratch command

Some missing failure modes:

  3. you say something, but you know you messed up, so you want to start over
  4. you say something, but you're still trying to figure it out as you're talking... and while you pause, it processes what you said

Mode 4 can be addressed by only listening when the user says "scratch" or otherwise triggers listening. If this were a mobile app, I could imagine it working with a touch screen: one big screen where the entire surface is a button.

On skippable speech synthesis
Inspired by T.E.D.: as you navigate through options, you're able to cut off the last thing being said (because of arrow keys and changes in focus). In our system, we assume a screenless experience, but that doesn't mean there can't be buttons. Maybe there can be a skip button or a "skip" audio cue.

"Test the project" vs "Play the project" from inside the project.

Not sure if this is needed, but when the user is inside a project, editing it, they might want to test the project without having to "See Inside" again afterward. A "Test the project" command would save the user from having to "Play the project" and then "see inside" again.

Clarify ambiguous parses of Scratch Commands

"When you've given an instruction that has more than one meaning, I will ask for clarification."

User says something...
Scratch says: "I understand that as # different things. 1. [insert some representation of the program that is unambiguous] 2. [insert another representation that is unambiguous]"

link to issue #35

Rethink the relationship between context and state in the system.

Right now, the state machine maintains a set of contexts on the general system navigation level and on the project level. What about on a finer scale? Or in a context that touches on both the project and the general system?

Is my current implementation based on contexts too rigid?

For example, say we want to support the behavior of confirming a (dangerous) act before executing:

Delete a project

User: Scratch, delete the say hello project.
Scratch: Are you sure you want to delete the say hello project?
User: Yes.

Confirm a step in the program

User: Scratch, say hello.
Scratch: Like this? hello
User: No. Scratch, say jello.
Scratch: Like this? jello
User: Yes.
Scratch: Okay, what's the next step?

Understand pronouns as references to antecedents of the previous utterance.

User: Scratch, how many projects do I have?
Scratch: You have 3 projects.
User: What are they called?
Scratch: Give me a compliment, big water bottle, and get ready for the dance party.
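A deliberately tiny sketch of this idea: record the entities from Scratch's last answer and substitute them for pronouns in the next utterance. Real anaphora resolution is much harder; `answer` and its entity list are hypothetical:

```javascript
// Antecedents mentioned in Scratch's most recent answer.
let lastEntities = [];

// Scratch's speaking step records what it just referred to.
function answer(text, entities) {
  lastEntities = entities;
  return text;
}

// Replace pronouns in the user's next utterance with the antecedents.
function resolvePronouns(utterance) {
  if (lastEntities.length === 0) return utterance;
  return utterance.replace(/\b(they|them|it)\b/gi, () => lastEntities.join(' and '));
}
```

This one-turn memory covers the example above but would need a richer discourse model for anything beyond the immediately previous utterance.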

Support ability to ask about Scratch Commands.

When someone is inside a project or creating a project, they might ask

  • what scratch commands are there?
  • what are the scratch commands ?

and want to dive deeper

  • what's an example?
    (leading into tutorial mode)

I want to explore workflows / ideal flows through the interface and document them...

Allow user to interrupt Scratch via speech.

//stopTalking
ScratchAction.General.stopTalking = new Action({
  "trigger":/stop talking|stop playing (?:the)? ?project|stop (?:the)? ?project/,
  "idealTrigger":"stop talking",
  "description":"skip what Scratch is saying"
});

Currently, this is difficult to resolve because speech synthesis will get picked up by speech recognition. To handle this, the microphone is turned off, but this means that the user will not be heard via voice if they try to interrupt Scratch.

Provide meaningful feedback.

Instead of saying "I don't know how to do that", give more helpful error messages at the lowest level of parsing that succeeded.

e.g.
User's goal: play the project called "give me water bottle"
Actual problem: Scratch doesn't have a project called "give me water bottle"
Scratch says: "I heard you say scratch give me water bottle. I don't know how to do that."

Should Scratch say...
"I heard you say scratch give me water bottle. There is no project called give me water bottle."

Even better, can we follow up with
Do you want to create one?

"play" trigger acts on Scratch commands instead of just Scratch VUI projects.


This is tricky because we want projects to be treated like commands... but not the other way around.
Ideas:

  • syntactically distinguish between playing a project and playing a sound. (Be more strict about how we play projects). "Play the (projectname) project"
  • check that the "project name" is actually a project name + if not, consider it to be a sound (validate that it's a sound)
  • check that the "project name" is actually a project name + if so, ask the user whether they meant to play the project or play a sound ...
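The ideas above could be sketched as a lookup-order rule; `projects` and `soundLibrary` are illustrative arguments, not the real data structures:

```javascript
// Decide whether "play <argument>" means a project or a sound.
// Returns { kind, name } or null when the user should be asked.
function resolvePlay(argument, projects, soundLibrary) {
  // Stricter form "play the (projectname) project" is unambiguous.
  const m = argument.match(/^the (.+) project$/i);
  if (m) return { kind: 'project', name: m[1] };
  if (projects.includes(argument)) return { kind: 'project', name: argument };
  if (soundLibrary.includes(argument)) return { kind: 'sound', name: argument };
  return null; // unknown: ask the user for clarification
}
```

Checking projects before sounds encodes "projects are treated like commands, but not the other way around"; the null case is where the third idea (asking the user which they meant) would kick in.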

Breaking up long commands.

Examples:

do the following 10 times play the meow sound play the chomp sound thats it
listen and wait. if the speech is knock knock say who's there thats it
if the speech is knock knock say who's there thats it
when the project starts listen and wait and then if the speech is knock knock say who's there thats it thats it.

Looking at example 1 ("do the following 10 times..."): we can generally collect commands until "thats it" and then send the whole thing to ScratchNLP for the parse.

plan:

  • ScratchNLP will notify of a partial parse by returning some signal :) ScratchInstruction.parse will understand that signal and keep building until "thats it" or another end phrase like "end the if statement", "end if", or "end loop"
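The plan above could be sketched as a small buffering layer in front of the parser. The end-phrase list and the `parse` callback are assumptions; ScratchNLP's real partial-parse signal may differ:

```javascript
// End phrases that close a buffered multi-utterance command.
const END_PHRASES = /\b(thats it|that's it|end the if statement|end if|end loop)\b/i;

// Returns an accept(utterance) function: buffers utterances until an
// end phrase arrives, then hands the joined command to `parse`.
function makeCommandBuffer(parse) {
  let pending = [];
  return function accept(utterance) {
    pending.push(utterance);
    if (!END_PHRASES.test(utterance)) return null; // keep collecting
    const command = pending.join(' ');
    pending = [];
    return parse(command); // full command ready
  };
}
```

A closure keeps the buffer per-conversation, so an abandoned half-command could also be discarded on a "nevermind"-style reset.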

Support more phrases for returning from the InsideProject state (editing flow)

prototype.js:136 nevermind
scratch_instruction.js:59 Error: Scratch does not know how to 'nevermind'
at Function.jsonToScratch (scratch_instruction.js:137)
at ScratchInstruction.getSteps (scratch_instruction.js:56)
at new ScratchInstruction (scratch_instruction.js:20)
at cstor.handleUtterance (scratch_project.js:121)
at ScratchProjectManager.handleUtterance (scratch_project_manager.js:188)
at cstor.handleUtterance (scratch_state_machine.js:67)
at SpeechRecognition.recognition.onresult (prototype.js:138)
scratch_project_manager.js:87 sayingI heard you say nevermind
scratch_project_manager.js:87 sayingThat doesn't match any Scratch commands.
prototype.js:136 go back
scratch_instruction.js:59 Error: Scratch does not know how to 'go'
at Function.jsonToScratch (scratch_instruction.js:137)
at ScratchInstruction.getSteps (scratch_instruction.js:56)
at new ScratchInstruction (scratch_instruction.js:20)
at cstor.handleUtterance (scratch_project.js:121)
at ScratchProjectManager.handleUtterance (scratch_project_manager.js:188)
at cstor.handleUtterance (scratch_state_machine.js:67)
at SpeechRecognition.recognition.onresult (prototype.js:138)
scratch_project_manager.js:87 sayingI heard you say go back
scratch_project_manager.js:87 sayingThat doesn't match any Scratch commands.
prototype.js:136 I'm done
scratch_instruction.js:59 Error: Scratch does not know how to 'i'm'
at Function.jsonToScratch (scratch_instruction.js:137)
at ScratchInstruction.getSteps (scratch_instruction.js:56)
at new ScratchInstruction (scratch_instruction.js:20)
at cstor.handleUtterance (scratch_project.js:121)
at ScratchProjectManager.handleUtterance (scratch_project_manager.js:188)
at cstor.handleUtterance (scratch_state_machine.js:67)
at SpeechRecognition.recognition.onresult (prototype.js:138)
scratch_project_manager.js:87 sayingI heard you say i'm done
scratch_project_manager.js:87 sayingThat doesn't match any Scratch commands.

Rework the ScratchProject State Machine

The state transitions don't actually make sense...

When a user goes HOME --> Inside Existing Project, the project state is 'create'.

  • create a way to jump to the appropriate state (if creating a new project instance to represent an existing project)
  • reevaluate the states used.

Speech recognition often misrecognizes speech.

Speech Recognition Common Mishaps:

"step" and "stop" (step #)
"the inside" versus "see inside"

Idea:

How might I verify that the grammar is actually helping the situation? Or, how might I improve the grammar?

Expand the sound library + refine interface for recognizing sounds.

"What sounds do you have"/"What sounds do I have"/"What sounds do you know"

  • "lots of sounds... here's a few" and play a few sounds (pull request #27)

"Can you make the/a ___ sound"

  • direct match to name (pull request #27)

  • automatically map a kind of sound to the existing library (via tags)

  • lots of sounds... pick 3 random categories... [to implement, need categorization, need way to explore categories of sound]

Another idea:
Utilize the freesound API to search for sounds and return those.
[The blank could be a SOUND_NAME or a description of what the sound is like. Could implement by using the freesound API to search for and get particular sounds. Also consider synonyms for describing or searching sounds.]
