Gesture and Voice THETA by Wichai Tossamartvorakul

Download Voice THETA From Store

Take pictures with the Theta V by voice command or by showing your hand to the camera

Story

Abstract

The Theta is an amazing 360-degree camera. You can shoot without worrying about the angle, because it captures everything around you. However, your hand always ends up in the picture, because you need a finger to press the shutter button. This project solves that problem. You can place the Theta V at a distance, as you would a normal camera, and trigger it with your voice or by letting the camera detect your hand. It then takes the picture after a 5-second delay.

Project Background

This project uses the TensorFlow sample for the Theta V by @mktshhr at https://github.com/mktshhr/tensorflow-theta as a starting point. Because running voice recognition and image recognition at the same time would make the project too complex, we decided to develop two separate plugins, one for voice control and one for gesture control. For an explanation of the TensorFlow sample for the Theta, see the article by Craig Oda at https://medium.com/theta360-guide/howto-build-tensorflow-apps-for-ricoh-theta-1b64da06a0bd.

TensorFlow does most of the work in this project. As you may know, training a TensorFlow model is time consuming, and you need many samples to get precise results. Because we had little time, we decided to use the pretrained model from the sample, which means you cannot recognize words or objects beyond those the model already provides. In our experiments, the word “on” was recognized more accurately than the other words, so we use “on” to activate the camera and “stop” to exit the plugin. For gesture control, we use the hand detection model from Victor Dibia at https://github.com/victordibia/handtracking. In our experiments, it was hard to classify different gestures with the Theta V camera, so we use simple hand detection to activate the camera.
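As a rough illustration of how the two recognized words could be mapped to plugin actions, here is a minimal Python sketch. It is not the plugin's actual code (the plugin is an Android application). The names `ACTIONS` and `action_for` and the `MIN_SCORE` threshold are hypothetical and chosen only for this example.

```python
from typing import Optional

# Hypothetical confidence threshold; the value used by the plugin is not stated here.
MIN_SCORE = 0.7

# Only words that the pretrained model already knows can be mapped to actions.
ACTIONS = {"on": "take_picture", "stop": "exit_plugin"}


def action_for(label: str, score: float, min_score: float = MIN_SCORE) -> Optional[str]:
    """Return the action for a recognized word, or None if it should be ignored."""
    if score < min_score:
        # Low-confidence results are dropped to avoid accidental triggers.
        return None
    # Words outside the two commands (for example "go") map to no action.
    return ACTIONS.get(label.lower())
```

Any other word in the model's vocabulary, or a low-confidence match, is simply ignored in this sketch.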

Project implementation

Developing a Theta V plugin is just like developing a normal Android application. The difference is that you have no screen for output and no keyboard for input. You only have four buttons for input:

o Shutter button

o Power/Sleep button

o Wireless (Wi-Fi) button

o Mode button (Video, Camera, Live)

In this project we use Android Studio for development and Gradle to build. Download the projects from GitHub: https://github.com/wtos03/gesture_theta for gesture control and https://github.com/wtos03/voice_theta for voice control.

To develop plugins, you need to enable developer mode on the Theta by submitting an application to Ricoh with your Theta's serial number. For details on Theta plugin development, go to

https://api.ricoh/products/theta-plugin/

To check output and test, you need Vysor (https://www.vysor.io/), which provides a virtual screen for the Theta V. It is very helpful when you want to see the output of the program.

Project configuration and setup

When you install the program for the first time, you need to grant it permission to use the camera, microphone, and storage. Otherwise the program exits, and the red LED blinks and then stops. That is why you need Vysor to set up the program's permissions. However, if you install the plugin from the store, this step is not necessary. The Theta runs one plugin at a time. Start it by pressing the mode button until the LED turns from blue to white. You can set the default plugin from either the Theta mobile application or the PC application.

When the plugin starts, the LED turns white. If you started the voice_theta plugin, say “on” to take a picture. The camera waits 5 seconds before taking the picture, which gives you time to pose. To stop the plugin, say “stop”; the plugin exits and the LED turns blue again. While the plugin is running, you can short-press the mode button to switch to video mode and say “on” to start recording. However, voice recognition does not work while video is recording, so you must press the shutter button to stop recording, and the plugin then exits. Restart the plugin if you want to continue using voice commands.
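The voice plugin's behavior can be summarized as a small state table. The Python sketch below is only a model of the behavior described in this section, not code from the plugin. The state names, event names, and the `step` function are hypothetical, and any combination not described in the text is treated as "nothing happens".

```python
from typing import Optional, Tuple

# (state, event) -> (next state, action). Names are illustrative only.
TRANSITIONS = {
    ("camera", "say_on"): ("camera", "take_picture_after_5s"),
    ("camera", "say_stop"): ("exited", "exit_plugin"),
    ("camera", "mode_button"): ("video", None),
    ("video", "say_on"): ("recording", "start_video"),
    # Voice recognition is off while recording, so only the shutter button works.
    ("recording", "shutter_button"): ("exited", "stop_video_and_exit"),
}


def step(state: str, event: str) -> Tuple[str, Optional[str]]:
    """Return (next_state, action); unknown combinations leave the state unchanged."""
    return TRANSITIONS.get((state, event), (state, None))
```

For example, `step("recording", "say_stop")` leaves the state at `"recording"`, which mirrors the fact that voice commands are ignored during video recording.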

For the gesture control plugin, just place your hand in front of the camera. It automatically takes a picture after a 5-second delay, the same as the voice plugin. In our experiments, placing your hand about 30 to 50 cm in front of the camera gave the best results. However, this depends on the background and lighting, which is a common problem in image recognition. To improve performance, we would need to develop further algorithms that separate the object from the background, for example using OpenCV.
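To make the trigger-then-delay behavior concrete, here is a minimal Python sketch that turns a stream of hand detection scores into shutter times. It assumes that a detection is ignored while a countdown is already running, which the text above does not state explicitly. The `MIN_SCORE` threshold and the function name `shutter_times` are hypothetical.

```python
from typing import Iterable, List, Tuple

SHUTTER_DELAY_S = 5.0  # delay described in the text
MIN_SCORE = 0.6        # hypothetical detection threshold, not taken from the plugin


def shutter_times(detections: Iterable[Tuple[float, float]],
                  min_score: float = MIN_SCORE,
                  delay_s: float = SHUTTER_DELAY_S) -> List[float]:
    """Map (timestamp_s, hand_score) pairs to the times at which a picture is taken."""
    shots: List[float] = []
    busy_until = float("-inf")  # end of the countdown currently in progress
    for timestamp, score in detections:
        # Assumption: detections during a running countdown do not start a new one.
        if score >= min_score and timestamp >= busy_until:
            busy_until = timestamp + delay_s
            shots.append(busy_until)
    return shots
```

With detections at 1 s and 2 s, only the first one starts a countdown, so a single picture is taken at 6 s.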

Plugin usage

Refer to the following diagrams and sample video for how to use the plugins.

Voice Theta

Side Button

The LED shows the working status

The shutter button can be used to take a picture manually in camera mode and to stop recording in video mode.

Gesture Theta

Side Button

The LED shows the working status

Video showing how to operate the plugin

Troubleshooting

Sometimes the Theta V cannot connect to Android or the PC. You may need to reset the Theta using the following steps.

  1. Set the Theta’s wireless to local mode (green Wi-Fi) by pressing the Wi-Fi select button (center button).

  2. Connect the PC to the Theta. Unless you have changed it, the password is the numeric part of the serial number. For example, for serial number YL01234567 the password is 01234567.

  3. Open a terminal on the PC and post the reset command to the Theta using curl or a similar tool.

$ curl -d '{"name": "camera.reset"}' -H "Content-Type: application/json" -X POST http://192.168.1.1/osc/commands/execute

  4. On a Mac, you may also try connecting to a different USB port.
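If curl is not available, the same reset request can be sent from a script. The Python sketch below uses only the standard library and mirrors the curl command above. The helper names are hypothetical, and `default_password` generalizes from the single serial number example given in the steps (it strips the leading letters), which is an assumption.

```python
import json
import string
import urllib.request

# Address taken from the curl command above.
THETA_URL = "http://192.168.1.1/osc/commands/execute"


def default_password(serial: str) -> str:
    """Default Wi-Fi password: the serial number without its leading letters (assumed rule)."""
    return serial.lstrip(string.ascii_letters)


def build_reset_request(url: str = THETA_URL) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    body = json.dumps({"name": "camera.reset"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reset_camera(timeout_s: float = 10.0) -> dict:
    """Send the reset command; the PC must already be connected to the Theta."""
    with urllib.request.urlopen(build_reset_request(), timeout=timeout_s) as response:
        return json.load(response)
```

Call `reset_camera()` only while the PC is connected to the Theta's wireless network; otherwise the request will time out or fail to connect.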

Schematics

TensorFlow activity

This diagram shows the interaction between the TensorFlow classes and CameraActivity.