SLAM with Ricoh Theta using OpenVSLAM

Has anyone seen this? I have been doing some SLAM research recently and stumbled upon this: https://github.com/xdspacelab/openvslam

It is an open-source SLAM (simultaneous localization and mapping) framework that advertises support for equirectangular videos like those recorded (or streamed) from a Theta. It supports monocular camera setups, where you use a single camera and structure-from-motion to estimate the camera's actual six-degree-of-freedom (6-DoF) trajectory.
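For reference, running OpenVSLAM on an equirectangular recording looks roughly like this, as I understand it from the repo's examples. The YAML keys, vocab file name, and paths are my assumptions from reading the docs, so double-check against the repo before relying on them:

```shell
# Minimal camera config for an equirectangular (360) video.
# Key names follow the example configs shipped with OpenVSLAM.
cat > theta_equirect.yaml <<'EOF'
Camera.name: "RICOH THETA"
Camera.setup: "monocular"
Camera.model: "equirectangular"
Camera.fps: 30.0
Camera.cols: 1920
Camera.rows: 960
Camera.color_order: "RGB"
EOF

# Feed a pre-recorded Theta video through the mapping example:
# -v = ORB vocabulary file, -c = camera config, -m = input video.
./run_video_slam -v ./orb_vocab.dbow2 -c ./theta_equirect.yaml -m ./theta_recording.mp4
```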

To be clear, this is REALLY cool. Projects like this make autonomous robotics MUCH more accessible. The computer vision technology they are using, ORB features for keyframing, is a cutting-edge approach that has only really become popular in the past four years. Technology like this allows a robot to navigate and map its environment autonomously in real time. A single camera is an order of magnitude cheaper than the LIDAR / LIDAR+IMU / camera+IMU setups that are usually used for these kinds of things. I even found a report from someone who claims to have successfully used the software with a Theta V.

Anybody have experience with this? I am finally settled and think I might pick up a Theta to build a bot and test this out.


It works well. I have tested it with the Z1.


Sickkkk. Did you test it with a pre-recorded video, or did you use some method like UVC to stream video to the system live? It seems like that would require a bit more work, but to me that's a much more interesting use case.


I tested it with pre-recorded video. Haven't tested it live yet; I'm still figuring out that part.


At the moment, the UVC webcam technique can't be used with the THETA on Linux, as we don't have a streaming driver yet. I've heard that some people in the community are working on a driver.

It may work on macOS. I don't think you can get webcam detection with WSL on Windows. Stack Overflow post from April 2018.

Thanks for the heads up. Maybe I’ll hunt down the folks looking at making a driver!

The Linux driver now exists.

https://theta360.guide/special/linuxstreaming/
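Roughly, the build steps from that guide look like this. Repo names and paths are from memory, so treat this as a sketch and follow the linked guide for the authoritative steps:

```shell
# Build the patched libuvc that understands the THETA's UVC stream.
git clone https://github.com/ricohapi/libuvc-theta.git
cd libuvc-theta
mkdir build && cd build
cmake ..
make && sudo make install
cd ../..

# Build the sample viewer/loopback apps.
git clone https://github.com/ricohapi/libuvc-theta-sample.git
cd libuvc-theta-sample/gst
make

# View the live stream (camera plugged in via USB and in live-streaming mode).
./gst_viewer
```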

Someone was asking about VSLAM during the meetup today. If anyone tests it, please post.

I’m currently trying to get it working with the Z1, but I’m having trouble. I got the Z1 to work with gst_viewer and gst_loopback to get the camera under /dev/video2. The problem I’m having is getting OpenVSLAM to see the video feed. When I run ./run_camera_slam, OpenVSLAM only reads one frame from the video feed and then won’t read any more. So basically it just freezes after one frame.

I don’t have a solution. I asked a question about OpenVSLAM on a YouTube video.


Have you tried cutting the video resolution down to 2K for a test?

I believe the hardware requirements are very high to use OpenVSLAM in real time. Does OpenVSLAM work for you with a normal 4K webcam in real time?

OK, so I’ve narrowed down the problem to the fps with v4l2loopback. Seems like loopback is slowing it down because I don’t have something set right; on gst_viewer I get realtime. Before I plug in the camera, I use the Android app and set the video resolution to "FullHD (1920x960)". I’m not sure if this is the right way to change the live-feed resolution. Anyway, I plug the camera in, start my laptop, open a terminal in the v4l2loopback dir, and enter the command

sudo modprobe v4l2loopback exclusive_caps=1 max_buffers=8 width=1920 height=960

then I check my /dev/video* with the command

ls -ltrh /dev/video*

and see that I created crw-rw----+ 1 root video 81, 2 Dec 9 06:32 /dev/video2

I check gst_viewer to see that it works, and it does.

Then I enter the ./gst_loopback command from the /gst dir.

I open VLC and select my capture device as /dev/video2 and press play.

VLC displays one new frame immediately, then the next frame about 20 seconds later, and then it seems to get slower but I’m not counting. That’s my problem.

I’m not sure what type of information you would need to help me with this problem so just tell me what you would need to know to help me.

1 Like

The video resolution set in the Android app is likely the video-to-file resolution, not the live-stream resolution.

There is significant documentation here:
https://theta360.guide/special/linuxstreaming/

Did you set qos=false ?

if (strcmp(cmd_name, "gst_loopback") == 0)
    pipe_proc = "decodebin ! autovideoconvert ! "
        "video/x-raw,format=I420 ! identity drop-allocation=true !"
        "v4l2sink device=/dev/video0 qos=false sync=false";

You can change the video resolution of the stream like this
https://github.com/codetricity/libuvc-theta-sample

Check video resolution

$ v4l2-ctl --list-formats-ext
ioctl: VIDIOC_ENUM_FMT
	Index       : 0
	Type        : Video Capture
	Pixel Format: 'YU12'
	Name        : Planar YUV 4:2:0
		Size: Discrete 1920x960
			Interval: Discrete 0.033s (30.000 fps)

Information that might help:

  • CPU (such as x86 and roughly if it is fast)
  • GPU setup, primarily if you are using a discrete NVIDIA graphics card or if you are using the integrated GPU on the x86 chip

If the problem persists after you set qos=false in the code, you next need to verify that you are using hardware acceleration, not software decoding.
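One quick way to check is to list the decoder elements GStreamer can see and turn up decodebin's logging to watch which one gets autoplugged. Element names vary by install, so adjust the grep patterns as needed:

```shell
# List hardware-capable H.264 decoders GStreamer knows about.
# vaapi* = Intel/AMD VA-API, nv*/omx* = NVIDIA, avdec_h264 = software fallback.
gst-inspect-1.0 | grep -Ei 'vaapi|nvdec|nvv4l2|omx|avdec_h264'

# Run the loopback with verbose decodebin logging and watch which
# decoder element is chosen for the THETA's H.264 stream.
GST_DEBUG=decodebin:4 ./gst_loopback 2>&1 | grep -i 'decoder'
```

If the only match is avdec_h264, decoding is happening in software on the CPU, which would explain a low framerate at 4K.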

System Specs

 System:
      Kernel: 5.4.0-56-generic x86_64 bits: 64 compiler: gcc v: 9.3.0 
      Desktop: Gnome 3.36.4 Distro: Ubuntu 20.04.1 LTS (Focal Fossa) 
    Machine:
      Type: Laptop System: Micro-Star product: GL65 9SD v: REV:1.0 
    CPU:
      Topology: Quad Core model: Intel Core i5-9300H bits: 64 type: MT MCP 
      arch: Kaby Lake rev: A L2 cache: 8192 KiB 
      flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx 
      bogomips: 38400 
      Speed: 1568 MHz min/max: 800/4100 MHz Core speeds (MHz): 1: 973 2: 803 
      3: 942 4: 804 5: 957 6: 957 7: 817 8: 960 
    Graphics:
      Device-1: Intel UHD Graphics 630 vendor: Micro-Star MSI driver: i915 
      v: kernel bus ID: 00:02.0 
      Device-2: NVIDIA TU116M [GeForce GTX 1660 Ti Mobile] 
      vendor: Micro-Star MSI driver: nvidia v: 450.80.02 bus ID: 01:00.0 
      Display: x11 server: X.Org 1.20.8 driver: modesetting,nvidia 
      unloaded: fbdev,nouveau,vesa resolution: 1920x1080~120Hz 
      OpenGL: renderer: GeForce GTX 1660 Ti/PCIe/SSE2 v: 4.6.0 NVIDIA 450.80.02 
      direct render: Yes 
    Info:
      Processes: 291 Uptime: 12h 18m Memory: 15.49 GiB used: 2.99 GiB (19.3%) 
      Init: systemd runlevel: 5 Compilers: gcc: 9.3.0 Shell: bash v: 5.0.17 
      inxi: 3.0.38 

Graphics Drivers

    sudo lshw -C display
  *-display                 
       description: VGA compatible controller
       product: TU116M [GeForce GTX 1660 Ti Mobile]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
       configuration: driver=nvidia latency=0
       resources: irq:149 memory:a4000000-a4ffffff memory:90000000-9fffffff memory:a0000000-a1ffffff ioport:4000(size=128) memory:a5000000-a507ffff
  *-display
       description: VGA compatible controller
       product: UHD Graphics 630 (Mobile)
       vendor: Intel Corporation
       physical id: 2
       bus info: pci@0000:00:02.0
       version: 00
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msi pm vga_controller bus_master cap_list rom
       configuration: driver=i915 latency=0
       resources: irq:146 memory:a3000000-a3ffffff memory:80000000-8fffffff ioport:5000(size=64) memory:c0000-dffff

I did not add qos=false

if (strcmp(cmd_name, "gst_loopback") == 0)
		pipe_proc = "decodebin ! autovideoconvert ! "
			"video/x-raw,format=I420 ! identity drop-allocation=true !"
			"v4l2sink device=/dev/video2 sync=false";
	else
		pipe_proc = " decodebin ! autovideosink sync=false";

When I run the command v4l2-ctl --list-formats-ext I get this

$ v4l2-ctl --list-formats-ext
ioctl: VIDIOC_ENUM_FMT
	Type: Video Capture

	[0]: 'MJPG' (Motion-JPEG, compressed)
		Size: Discrete 1280x720
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 320x180
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 320x240
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 352x288
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 424x240
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 640x360
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 640x480
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 848x480
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 960x540
			Interval: Discrete 0.033s (30.000 fps)
	[1]: 'YUYV' (YUYV 4:2:2)
		Size: Discrete 1280x720
			Interval: Discrete 0.100s (10.000 fps)
		Size: Discrete 320x180
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 320x240
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 352x288
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 424x240
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 640x360
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 640x480
			Interval: Discrete 0.033s (30.000 fps)
		Size: Discrete 848x480
			Interval: Discrete 0.050s (20.000 fps)
		Size: Discrete 960x540
			Interval: Discrete 0.067s (15.000 fps)

I’m away from home, so I can’t test it with qos=false and at 2K resolution right now, but I will test it in about 12 hours and report back.
Thank you for your help. This is my first Linux project ever, so I’m very new.

The output of v4l2-ctl in your post is from the built-in webcam on your laptop. To see the THETA, most likely on /dev/video1, do this after you start v4l2loopback.

Note the device specification.

$ v4l2-ctl --list-formats-ext --device  /dev/video1
ioctl: VIDIOC_ENUM_FMT
    Type: Video Capture

    [0]: 'YU12' (Planar YUV 4:2:0)
        Size: Discrete 1920x960
            Interval: Discrete 0.033s (30.000 fps)

I am hopeful that you will see better framerates when you set qos=false in the C code and then recompile the code.

Feel free to post again. You are several steps away. If you can get decent framerates with v4l2loopback on /dev/video1 (I’m assuming your laptop webcam is on /dev/video0), then test it with VLC or gst-launch-1.0 or another video player before moving to SLAM.
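For the framerate check, something like this should show the actual fps coming off the loopback device (assuming the THETA ends up on /dev/video1; change the device to match yours):

```shell
# Pull from the v4l2loopback device and overlay the measured framerate.
gst-launch-1.0 v4l2src device=/dev/video1 \
    ! videoconvert \
    ! fpsdisplaysink video-sink=autovideosink text-overlay=true
```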

I was able to get it working flawlessly in 4K. Thank you very much, sir. Now it’s time to do it over Wi-Fi lol


Yeah, I just got it to work with my Z1. Took two days for a noob, but soo cool.


Well done! If you get a chance to post screenshots, I’d love to see them.

This screenshot shows a real-time path generated by OpenVSLAM from my Ricoh Theta Z1 over USB.


Wow, congratulations!! Nice job.

See this post by Zac for RTSP transmission between two devices. He’s using gscam on ROS to get the output.

You can also modify the C code in gst_viewer.c and experiment with the pipeline.

For example, this will send it to a specific IP device.

pipe_proc = " decodebin ! jpegenc ! rtpjpegpay ! udpsink host=192.168.2.100 port=5000 qos=false sync=false";

You can get the IP addresses of the devices on your network with:

sudo arp-scan --interface=eth0 --localnet

Change the interface to your Wi-Fi card. The example above is Ethernet.

On a Jetson Nano, I am receiving the RTSP stream with this and viewing it as equirectangular. The pipeline is specific to a Jetson. However, you can get a basic idea of the concept.

gst-launch-1.0 udpsrc port=5000 !  application/x-rtp,encoding-name=JPEG,payload=26 ! rtpjpegdepay ! jpegdec ! videoscale ! video/x-raw,width=640,height=320 ! nveglglessink
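On a regular x86 machine (no Jetson), the same receive pipeline should work if you swap the Jetson-specific sink for a generic one. This variant is untested on my side:

```shell
# Generic receiver: same pipeline with nveglglessink replaced by autovideosink.
gst-launch-1.0 udpsrc port=5000 \
    ! application/x-rtp,encoding-name=JPEG,payload=26 \
    ! rtpjpegdepay ! jpegdec \
    ! videoscale ! video/x-raw,width=640,height=320 \
    ! autovideosink
```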

If you get something working, please post.

Hi @ryleymcc and @craig. I’m considering using a RICOH THETA V or the same Z1 for the same purpose, but I have some doubts. I’m not working with an NVIDIA GPU; I don’t know if that will be a problem for getting the stream from the USB. Also, is latency a problem when creating the map in real time?

If you can tell me about your experience using the Theta Z1 to create a map and localize within it, that would be very helpful. For example, whether it works fine for a small map but becomes impossible for a larger one, and how to tune the configuration parameters for the Theta to use it with OpenVSLAM properly.

Thanks a lot,

Albert Arlà