Linux Demos and Examples

Demos

These demos use an NVIDIA Jetson Nano live streaming from a THETA V. Processing is done with Python 3 and OpenCV 4.4. Scroll down for code.

DetectNet

Running live on Jetson Nano with RICOH THETA Z1.

DetectNet with SSD-Mobilenet-v2 was applied both to a single frame to assess accuracy and to the live stream to assess frame rate. It works well on both.

Video demo with Jetson Nano.

See Jetson Nano inference benchmarks.

Code is available at GitHub - dusty-nv/jetson-inference: Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

There is super small text in the green box that says, “person”. The system accurately detected the only person in the image.

It is 88.6 percent confident that I am a person. Nice.

Despite the distorted view of my feet, the program does detect the human form.

Even at night, in low-light conditions with me on the side of the shutter button, the program did detect me.

However, there were many frames where I was not detected.

To proceed, you will likely need a database of fisheye or equirectangular images to build your own model.

Sample Code

import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.gstCamera(1280, 720, "/dev/video0")
display = jetson.utils.glDisplay()

while display.IsOpen():
    img, width, height = camera.CaptureRGBA()
    detections = net.Detect(img, width, height)
    display.RenderOnce(img, width, height)
    display.SetTitle("RICOH THETA Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))

OpenCV Python

Works on live stream.

Procedure

  • install libuvc-theta
  • install libuvc-theta-sample
  • install v4l2loopback
  • load the v4l2loopback kernel module and verify that /dev/video0 or equivalent shows the THETA stream (see the sketch after this list)
  • run a Python script with cv2
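
A minimal sketch of the module-loading and verification steps, assuming libuvc-theta-sample was built under ~/libuvc-theta-sample (the path and the resulting device number are placeholders and will vary on your system):

# load the loopback kernel module
sudo modprobe v4l2loopback

# feed the THETA stream into the loopback device
cd ~/libuvc-theta-sample/gst
./gst_loopback

# in another terminal, confirm the device exists (requires v4l-utils)
v4l2-ctl --list-devices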

I recommend recompiling OpenCV 4.4 from source code. The build may take about 2.5 hours if you compile on the Nano.
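
After the build, you can check which OpenCV build Python picks up and whether GStreamer support is enabled (useful for the pipelines below). This check is just a suggestion, not part of the original procedure:

python3 -c "import cv2; print(cv2.__version__)"
python3 -c "import cv2; print(cv2.getBuildInformation())" | grep -i gstreamer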

Simple Python cv2 Test

Frame resize test.

import cv2

# 0 is the video device number; if the THETA loopback device is
# /dev/video1, /dev/video2, etc. on your system, change it accordingly
cap = cv2.VideoCapture(0)

# Check if the webcam is opened correctly
if not cap.isOpened():
    raise IOError("Cannot open webcam")

while True:
    ret, frame = cap.read()
    frame = cv2.resize(frame, None, fx=0.25, fy=0.25, interpolation=cv2.INTER_AREA)
    cv2.imshow('Input', frame)

    c = cv2.waitKey(1)
    if c == 27:
        break

cap.release()
cv2.destroyAllWindows()

Build OpenCV

One script to install OpenCV 4.3 is from AastaNV here.

The script I used is from mdegans here.
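
A rough sketch of running that build script, assuming the link above points to mdegans' nano_build_opencv repository and that build_opencv.sh takes the OpenCV version as an argument (check the linked repository for the actual usage):

git clone https://github.com/mdegans/nano_build_opencv.git
cd nano_build_opencv
./build_opencv.sh 4.4.0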

Canny Edge Detection Test

import sys
import argparse
import cv2
import numpy as np

def parse_cli_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--video_device", dest="video_device",
                        help="Video device # of USB webcam (/dev/video?) [0]",
                        default=0, type=int)
    arguments = parser.parse_args()
    return arguments

# On versions of L4T previous to L4T 28.1, flip-method=2
# Use the Jetson onboard camera
def open_onboard_camera():
    return cv2.VideoCapture(0)

# Open an external usb camera /dev/videoX
def open_camera_device(device_number):
    return cv2.VideoCapture(device_number)


def read_cam(video_capture):
    if video_capture.isOpened():
        windowName = "main_canny"
        cv2.namedWindow(windowName, cv2.WINDOW_NORMAL)
        cv2.resizeWindow(windowName,1280,720)
        cv2.moveWindow(windowName,0,0)
        cv2.setWindowTitle(windowName,"RICOH THETA OpenCV Python Demo")
        showWindow=3  # Show all stages
        showHelp = True
        font = cv2.FONT_HERSHEY_PLAIN
        helpText="'Esc' to Quit, '1' for Camera Feed, '2' for Canny Detection, '3' for All Stages. '4' to hide help"
        edgeThreshold=40
        showFullScreen = False
        while True:
            if cv2.getWindowProperty(windowName, 0) < 0: # Check to see if the user closed the window
                # This will fail if the user closed the window; Nasties get printed to the console
                break
            ret_val, frame = video_capture.read()
            gray=cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            blur=cv2.GaussianBlur(gray,(7,7),1.5)
            edges=cv2.Canny(blur,0,edgeThreshold)
            if showWindow == 3:  # Need to show the 4 stages
                # Composite the 2x2 window
                # Feed from the camera is color (BGR), the others are grayscale
                # To composite, convert the gray images to color.
                # All images must be of the same type to display in a window
                frameRs=cv2.resize(frame, (640,360))
                grayRs=cv2.resize(gray,(640,360))
                vidBuf = np.concatenate((frameRs, cv2.cvtColor(grayRs,cv2.COLOR_GRAY2BGR)), axis=1)
                blurRs=cv2.resize(blur,(640,360))
                edgesRs=cv2.resize(edges,(640,360))
                vidBuf1 = np.concatenate( (cv2.cvtColor(blurRs,cv2.COLOR_GRAY2BGR),cv2.cvtColor(edgesRs,cv2.COLOR_GRAY2BGR)), axis=1)
                vidBuf = np.concatenate( (vidBuf, vidBuf1), axis=0)

            if showWindow==1: # Show Camera Frame
                displayBuf = frame 
            elif showWindow == 2: # Show Canny Edge Detection
                displayBuf = edges
            elif showWindow == 3: # Show All Stages
                displayBuf = vidBuf

            if showHelp == True:
                cv2.putText(displayBuf, helpText, (11,20), font, 1.0, (32,32,32), 4, cv2.LINE_AA)
                cv2.putText(displayBuf, helpText, (10,20), font, 1.0, (240,240,240), 1, cv2.LINE_AA)
            cv2.imshow(windowName,displayBuf)
            key=cv2.waitKey(10)
            if key == 27: # Check for ESC key
                cv2.destroyAllWindows()
                break
            elif key==49: # 1 key, show frame
                cv2.setWindowTitle(windowName,"Camera Feed")
                showWindow=1
            elif key==50: # 2 key, show Canny
                cv2.setWindowTitle(windowName,"Canny Edge Detection")
                showWindow=2
            elif key==51: # 3 key, show Stages
                cv2.setWindowTitle(windowName,"Camera, Gray scale, Gaussian Blur, Canny Edge Detection")
                showWindow=3
            elif key==52: # 4 key, toggle help
                showHelp = not showHelp
            elif key==44: # , key, lower canny edge threshold
                edgeThreshold=max(0,edgeThreshold-1)
                print ('Canny Edge Threshold Maximum: ',edgeThreshold)
            elif key==46: # . key, raise canny edge threshold
                edgeThreshold=edgeThreshold+1
                print ('Canny Edge Threshold Maximum: ', edgeThreshold)
            elif key==74: # Toggle fullscreen; This is the F3 key on this particular keyboard
                # Toggle full screen mode
                if showFullScreen == False : 
                    cv2.setWindowProperty(windowName, cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)
                else:
                    cv2.setWindowProperty(windowName, cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_NORMAL) 
                showFullScreen = not showFullScreen

    else:
        print("camera open failed")



if __name__ == '__main__':
    arguments = parse_cli_args()
    print("Called with args:")
    print(arguments)
    print("OpenCV version: {}".format(cv2.__version__))
    print("Device Number:",arguments.video_device)
    if arguments.video_device==0:
      video_capture=open_onboard_camera()
    else:
      video_capture=open_camera_device(arguments.video_device)
    read_cam(video_capture)
    video_capture.release()
    cv2.destroyAllWindows()

OpenPose

OpenPose works on the live stream with JetPack 4.3, but not with JetPack 4.4.

Usage Examples

stream to YouTube with ffmpeg

From Paul Gullett's post.

ffmpeg -f lavfi -i anullsrc \
-f v4l2 -s 3840x1920 -r 10 -i /dev/video0 \
-vcodec libx264 -pix_fmt yuv420p -preset ultrafast \
-strict experimental -r 25 -g 20 -b:v 2500k \
-codec:a libmp3lame -ar 44100 -b:a 11025 -bufsize 512k \
-f flv rtmp://a.rtmp.youtube.com/live2/secret-key

As my knowledge of ffmpeg is weak, I simplified Paul’s video pipeline.

ffmpeg -f lavfi -i anullsrc -f v4l2 -s 1920x960 -r 10 -i /dev/video2 \
-vcodec libx264 -pix_fmt yuv420p \
-b:v 2500k \
-codec:a libmp3lame -ar 44100 -b:a 11025 -bufsize 512k \
-f flv rtmp://a.rtmp.youtube.com/live2/$SECRET_KEY

stream to another computer with gstreamer

From zdydek's post.

In gst_viewer.c:

pipe_proc = " rtph264pay name=pay0 pt=96 ! udpsink host=127.0.0.1 port=5000 sync=false ";

With gst-rtsp-server:

./test-launch "( udpsrc port=5000 ! application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264 ! rtph264depay ! h264parse ! rtph264pay name=pay0 pt=96 )"

Receive on ROS.

GSCAM_CONFIG="rtspsrc location=rtspt://10.0.16.1:8554/test latency=400 drop-on-latency=true ! application/x-rtp, encoding-name=H264 ! rtph264depay ! decodebin ! queue ! videoconvert"  roslaunch gscam_nodelet.launch

Simplified computer to computer streaming with rtsp and gstreamer

This was tested going from an x86 machine to a Jetson Nano, with the THETA Z1 connected to the x86 Linux machine. It does not currently work with the Jetson as the sender.

On the x86 computer sending the THETA video, modify the pipeline in gst_viewer.c.

This example has the IP address hardcoded in. Switch to a variable in your code.

pipe_proc = " decodebin ! jpegenc ! rtpjpegpay ! udpsink host=192.168.2.100 port=5000 qos=false sync=false";

If you are looking for the IP address of the receiver, you can use arp-scan on the command line.

Example:

sudo arp-scan --interface=eth0 --localnet

On the receiving device, in this case an NVIDIA Jetson Nano:

$ cat receive_udp.sh 
gst-launch-1.0 udpsrc port=5000 !  application/x-rtp,encoding-name=JPEG,payload=26 ! rtpjpegdepay ! jpegdec ! videoscale ! video/x-raw,width=640,height=320 ! nveglglessink

If you’re on x86, change nveglglessink to autovideosink. You may want to make the width and height bigger as well.
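
For example, a receive pipeline for an x86 machine might look like this (the same command as above with autovideosink swapped in and a larger display size; adjust to taste):

gst-launch-1.0 udpsrc port=5000 ! application/x-rtp,encoding-name=JPEG,payload=26 ! rtpjpegdepay ! jpegdec ! videoscale ! video/x-raw,width=1280,height=640 ! autovideosink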

Save to File

From Les Wu's (aka snafu666) post.

Using the v4l2loopback capability and the THETA V loopback example, here are two example GStreamer pipelines to grab the video:

As a lossless Huffman-encoded raw file:

gst-launch-1.0 v4l2src device=/dev/video99 ! video/x-raw,framerate=30/1 \
! videoconvert \
! videoscale \
! avenc_huffyuv \
! avimux \
! filesink location=raw.hfyu

And with default H.264 encoding on a Jetson:

gst-launch-1.0 v4l2src device=/dev/video99 ! video/x-raw,framerate=30/1 \
! nvvidconv \
! omxh264enc \
! h264parse ! matroskamux \
! filesink location=vid99.mkv

Pro tip: when you load the v4l2loopback module, use the video_nr option to create the video device at a high number so it does not get displaced by plug-and-play (PnP) of other cameras.
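
For example, to create the loopback device at /dev/video99 as used in the pipelines above (99 is just a convention; any unused high number works):

sudo modprobe v4l2loopback video_nr=99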

The Huffyuv format produces large files. VLC can play them.

Here’s a shot of me playing a file that I generated with Les’s pipeline.

On x86, this is the pipeline I used to save to an H.264 file.

$ gst-launch-1.0 v4l2src device=/dev/video2 ! video/x-raw,framerate=30/1 ! autovideoconvert ! nvh264enc ! h264parse ! matroskamux ! filesink location=vid_test.mkv

Example of playing the file with gst-launch:

gst-launch-1.0 playbin uri=file:///path-to-file/vid_test.mkv

Community member Nakamura_Lab indicated that he experienced significant frame loss when using lossless Huffman encoding to save to file on the Xavier NX. He could save to file with H.264; however, for his use case with multiple face detection, he needed higher resolution than H.264 provided.

He ended up using H.265 encoding to save to file, which provided both high quality and no frame loss.

gst-launch-1.0 v4l2src num-buffers=600 \
device=/dev/video0 \
! video/x-raw \
! nvvidconv \
! nvv4l2h265enc \
! h265parse \
! qtmux \
! filesink location=test.mp4 -e

Stream From Raspberry Pi 4 to a Windows PC

Thanks to Shun Yamashita of fulldepth for this solution to stream the Z1 video to a Raspberry Pi 4 over USB and then restream it to a Windows PC.

This is the process:

  • Use GStreamer to stream UDP (RTP) to the Windows PC
  • Do not decode H.264 on the Raspberry Pi; the Windows machine handles decoding, and the RPi4 likely cannot hardware-decode 4K H.264
  • Tested with a Raspberry Pi 4 Model B with 4 GB of RAM running Raspberry Pi OS

On the Raspberry Pi, the following was changed in the GStreamer pipeline in gst/gst_viewer.c.

src.pipeline = gst_parse_launch(
        " appsrc name=ap ! queue ! h264parse ! queue"
        " ! rtph264pay ! udpsink host=192.168.0.15 port=9000",
        NULL);
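
A hedged sketch of a matching receiver on the Windows PC, assuming GStreamer for Windows is installed and the address/port match the udpsink settings above:

gst-launch-1.0 udpsrc port=9000 caps="application/x-rtp, media=video, clock-rate=90000, encoding-name=H264, payload=96" ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! autovideosink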

RViz in ROS2 Dual-Camera Setup with OpenCV

A robot displays live images from two THETA V cameras in RViz on ROS2 using the following pipeline and OpenCV.

credit: H. Usuba from Kufusha - Robotics development

original discussion

thetauvcsrc mode=2K ! queue ! h264parse ! nvv4l2decoder ! queue ! nvvidconv ! video/x-raw, format=BGRx ! videoconvert ! video/x-raw, format=BGR ! queue ! appsink
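
To sanity-check this pipeline outside of ROS2 and OpenCV, you can swap the appsink for a display sink and run it with gst-launch (assuming the GStreamer plugin that provides thetauvcsrc is installed):

gst-launch-1.0 thetauvcsrc mode=2K ! queue ! h264parse ! nvv4l2decoder ! queue ! nvvidconv ! video/x-raw,format=BGRx ! videoconvert ! autovideosink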