XOMiArm3 - OMiLAB Robotic Arm Experiment 3

Keywords: Ontology, Speech Recognition

Use Case

Scenario

Mixing a cocktail with a Dobot arm by receiving speech input and subsequently incorporating the required ingredients.

Flow of activities

  1. The user opens the web-interface in Google Chrome and sends an instruction by using one of these options:
    1. Search bar
    2. List view
    3. Speech input
  2. If the user gives a voice input, the recognized text is matched against the cocktail database.
  3. The cocktail ID is sent to the backend (rule-based system), which gathers the ingredients, their respective coordinates and the specified order.
  4. Instructions are transmitted to the robot arm.
  5. The robot arm performs the movements.

Move to ingredient → grab tube → move to cocktail glass → tilt and empty the tube → move to the center → move to original ingredient position → drop empty tube → move to starting position → move to next ingredient …
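
The movement cycle above repeats once per ingredient, so it can be sketched as a small plan generator. This is only an illustration in Python: the function name plan_steps and the sample ingredient names are hypothetical and not taken from the project's code.

```python
# Step names mirror the movement cycle described in the text.
CYCLE = [
    "move to ingredient",
    "grab tube",
    "move to cocktail glass",
    "tilt and empty the tube",
    "move to the center",
    "move to original ingredient position",
    "drop empty tube",
    "move to starting position",
]


def plan_steps(ingredients):
    """Expand an ordered ingredient list into (ingredient, step) pairs."""
    return [(name, step) for name in ingredients for step in CYCLE]


if __name__ == "__main__":
    # Sample ingredient names are made up for this sketch.
    for ingredient, step in plan_steps(["rum", "lime juice"]):
        print(f"{ingredient}: {step}")
```

Because the plan is a flat list, the order of the ingredients is preserved and each tube is returned before the next one is picked up.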

Actors

User that is remotely controlling the robot arm via a web interface.

Precondition

  • All ingredients are refilled and at their dedicated positions
  • The robot arm is waiting for instructions and in the starting position

Postcondition

  • Selected cocktail has been mixed successfully
  • Used tubes are completely empty and stand upright at their original positions

 

Problem Statement

  • Planning Problem: Reaching the goal state in a complex environment

To prepare a cocktail successfully with the robot arm, we have to address several problems. We need to map each cocktail dynamically to its ingredients and must take into account the exact positions of the ingredients and the cocktail glass. This can be accomplished by implementing a rule-based system.

The integration of a voice input feature requires additional processing steps. We cannot assume that users give standardized voice commands. Therefore, we compare the recognized text with our cocktail database and search for occurrences of cocktail names. For example, if someone says “Can I have one Mojito, please”, the algorithm should ignore the surrounding words and apply the rules for mixing a Mojito.
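
A minimal sketch of this matching step, written in Python for illustration only (the project's frontend is JavaScript). The cocktail list, the IDs, and the function names normalize and find_cocktail are assumptions; the text does not describe the actual matching algorithm beyond searching for occurrences.

```python
import re

# Illustrative database; the real cocktail list and IDs are not given in the text.
COCKTAILS = {1: "Mojito", 2: "Cuba Libre", 3: "Caipirinha"}


def normalize(text):
    """Lowercase the text and keep only runs of ASCII letters and digits."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def find_cocktail(transcript, cocktails=COCKTAILS):
    """Return the ID of the first cocktail named in the transcript, else None."""
    # Padding with spaces makes the substring test match whole words only.
    padded = f" {normalize(transcript)} "
    for cocktail_id, name in cocktails.items():
        if f" {normalize(name)} " in padded:
            return cocktail_id
    return None
```

With this sketch, find_cocktail("Can I have one Mojito, please") returns the ID assigned to the Mojito, and an utterance that names no known cocktail returns None.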

Additionally, the use of liquids requires much more accurate movements of the robot arm. This can be solved, for example, by following an experimental approach. Various textures on the tubes (ingredients) can be tested to increase the slip resistance and by adding multiple intermediary steps we can ensure that the movements and transitions are much smoother.

 

Experiment

This experiment consists of a simple web application that communicates with an industrial-grade robot arm via a RESTful API. The user interface, which can be accessed from the Google Chrome web browser, allows the user to specify the desired cocktail by speech recognition. Alternatively, the user may enter the name of the cocktail manually into a search bar. The given input is then analyzed and processed in the backend of the application by means of a rule-based engine. As soon as the rule to be applied has been determined, a specified set of REST calls is sent to the robot arm to invoke the cocktail mixing procedure. The REST API and the firmware that controls the robot are provided by the OMiLAB platform.

 

Knowledge Engineering Concept

As already mentioned, we applied two KE concepts in our project: speech recognition in the frontend and a simple rule-based engine in the backend of our application.

Speech recognition is very popular nowadays and has been integrated into many devices in both the private and industrial sectors (e.g., smartphones, entertainment systems and smart homes). This technology gives users another way of communicating with a device and reduces barriers. In our example, we wanted to implement an input method that is as easy and intuitive as possible. Therefore, we integrated multiple options for users to order a cocktail (that is, to send instructions to the robot), including a voice input feature. If a user enables the feature on the website and starts speaking, our script automatically evaluates the input and looks for cocktails in the database. After a successful command, the robot starts automatically and grabs the required ingredients. The speech recognition feature is based on the W3C Web Speech API (https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html) that is available in the Google Chrome browser.

The mapping between cocktails and ingredients has been realized with a simple rule-based engine in the backend. For many cocktail recipes it is very important to follow a strict sequence in which the ingredients are added. The rule engine therefore considers both the ingredients and the specified order.
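
The lookup performed by such a rule engine can be sketched as follows. The example is in Python purely for illustration (the backend itself is written in PHP), and all names, IDs and coordinates are made up; only the idea of a nested structure holding cocktails, ingredients, positions and order comes from the text.

```python
# Illustrative data only: names, IDs, and coordinates are made up for this sketch.
INGREDIENTS = {
    "rum": {"x": 200, "y": -80},
    "lime juice": {"x": 200, "y": 0},
    "soda": {"x": 200, "y": 80},
}

# One rule per cocktail; "order" fixes the sequence in which ingredients are added.
RULES = {
    1: {"name": "Mojito", "order": ["lime juice", "rum", "soda"]},
}


def resolve(cocktail_id, rules=RULES, ingredients=INGREDIENTS):
    """Return (name, x, y) for each ingredient of a cocktail, in recipe order."""
    rule = rules.get(cocktail_id)
    if rule is None:
        raise KeyError(f"no rule for cocktail id {cocktail_id}")
    return [
        (name, ingredients[name]["x"], ingredients[name]["y"])
        for name in rule["order"]
    ]
```

Keeping the ingredient positions separate from the recipes means that a new cocktail only needs a new rule entry, provided its ingredients already have a position.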

Results

 

Validation Environment

The validation environment used in this experiment consists of:

  • Industrial-grade robot arm (Dobot)
  • Internet capable device for accessing the web-interface
  • Cocktail glass and small tubes
  • Various cocktail ingredients

 

We implemented a client-server architecture to enable users to control the robot arm remotely. A JavaScript-based frontend takes the user input (search bar, list or voice input) and matches it against the cocktail database. If a match is found, an AJAX call with the cocktail ID is sent to the server (backend). The backend is composed of a rule-based engine written in PHP. All the cocktails, the ingredients and the mapping between them are stored in a nested JSON object. Based on the given cocktail, the ingredients with their x/y-coordinates are selected and the instructions are sent to the Dobot arm by using the REST interface (http://austria.omilab.org/omirob/dobot2/rest/positionXYZRG).

To control the gripper of the robot arm in addition to navigating to a specified position, an HTTP request can be sent to the resource /positionXYZRG. X, Y and Z stand for the transmitted coordinates, R defines the rotation angle of the end tool, and G specifies the opening and closing angle of the gripper. To effectively change the values of R and G, at least one of the coordinates X, Y or Z needs to be modified as well. The voltage of the gripper servo is only kept stable while moving and is released if the arm stands still. Hence, to transport an object from one position to another, there must be constant movement with no stops in between.
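
The constraint that R and G only take effect when the position changes can be checked before any request is sent. The following Python sketch is an assumption-laden illustration: the Pose type, its field names and the function ineffective_steps are hypothetical, and the actual request format of the REST resource is not described here.

```python
from typing import NamedTuple


class Pose(NamedTuple):
    """Hypothetical container for one XYZRG target."""
    x: float
    y: float
    z: float
    r: float  # rotation angle of the end tool
    g: float  # opening/closing angle of the gripper


def ineffective_steps(poses):
    """Return indices of poses that change R or G without changing X, Y or Z."""
    bad = []
    for i in range(1, len(poses)):
        prev, cur = poses[i - 1], poses[i]
        moved = (cur.x, cur.y, cur.z) != (prev.x, prev.y, prev.z)
        tool_changed = (cur.r, cur.g) != (prev.r, prev.g)
        if tool_changed and not moved:
            bad.append(i)
    return bad
```

A planned sequence for which ineffective_steps returns an empty list respects the rule described above; any reported index marks a step where a small change to X, Y or Z would have to be added.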

We opted for a modular design that can be easily extended with additional cocktails, ingredients or movements. For example, the Dobot arm could also grab fresh mint leaves in the future.

 

Web interface

 

Dobot arm

 
 

 

 

 

Web-interface:   http://michaeloppermann.com/cocktails

Video:    https://youtu.be/HOya9EZQgpQ