RICOH THETA Z1 Firmware 3.01.1 - Adds Single-Fisheye, simultaneous recording of 2 videos, 50min video length

A major version update to the RICOH THETA Z1 firmware was released on June 27, 2023. The new firmware 3.01.1 adds changes to video, including new video formats.

The four new formats are:

{"type": "mp4","width": 3648,"height": 3648, "_codec": "H.264/MPEG-4 AVC", "_frameRate": 2} *1
{"type": "mp4","width": 3648,"height": 3648, "_codec": "H.264/MPEG-4 AVC", "_frameRate": 1} *1
{"type": "mp4","width": 2688,"height": 2688, "_codec": "H.264/MPEG-4 AVC", "_frameRate": 2} *1
{"type": "mp4","width": 2688,"height": 2688, "_codec": "H.264/MPEG-4 AVC", "_frameRate": 1} *1

The new mode outputs two fisheye videos, one for each lens. The MP4 file whose name ends with _0 is the video from the front lens; _1 is from the back lens. This mode does not record an audio track to the MP4 files.

The new video formats can record up to 3,000 seconds (50 minutes). These are low-framerate video formats intended for frame extraction. There is no audio track in the video files.
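To select one of these formats programmatically, the videoFileFormat option can be set through the camera's WebAPI. A minimal Python sketch, assuming the camera is in access-point mode at the usual 192.168.1.1 endpoint:

```python
import json
import urllib.request

# Default endpoint when the camera is in access-point mode
# (adjust for client mode).
THETA_URL = "http://192.168.1.1/osc/commands/execute"

def build_set_video_format(width=3648, height=3648, fps=1):
    """camera.setOptions payload selecting one of the new dual-fisheye
    formats; values are copied from the format list above."""
    return {
        "name": "camera.setOptions",
        "parameters": {
            "options": {
                "videoFileFormat": {
                    "type": "mp4",
                    "width": width,
                    "height": height,
                    "_codec": "H.264/MPEG-4 AVC",
                    "_frameRate": fps,
                }
            }
        },
    }

def set_video_format(**kwargs):
    """POST the payload to the camera."""
    req = urllib.request.Request(
        THETA_URL,
        data=json.dumps(build_set_video_format(**kwargs)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `set_video_format(width=2688, height=2688, fps=2)` would select the 2.7K 2 fps format instead.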

The listFiles command is also updated to show the framerate of the video file.
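A hedged sketch of reading the new field with Python's standard library (parameters follow WebAPI v2.1; the camera is assumed to be in access-point mode):

```python
import json
import urllib.request

# Default endpoint when the camera is in access-point mode.
THETA_URL = "http://192.168.1.1/osc/commands/execute"

def build_list_files(entry_count=10):
    """Payload for camera.listFiles, restricted to video files."""
    return {
        "name": "camera.listFiles",
        "parameters": {
            "fileType": "video",
            "entryCount": entry_count,
            "maxThumbSize": 0,
            "_detail": True,
        },
    }

def list_video_framerates():
    """Print each video's URL and its framerate, if reported."""
    req = urllib.request.Request(
        THETA_URL,
        data=json.dumps(build_list_files()).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        entries = json.load(resp)["results"]["entries"]
    for e in entries:
        # _frameRate appears for videos recorded with firmware 3.01.1+
        print(e["fileUrl"], e.get("_frameRate"))
```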

The USB API and the Bluetooth API have also been updated.

Upgrade Process

You can use either a mobile phone or a Windows or Mac desktop to update the firmware. This example uses a mobile phone.

image

image

image

The camera flashes, reboots, and shows the firmware upgrade screen.

image

image

New Feature Testing

The new video format produces two video files for every video.

image

Each video is a single fisheye.

The default bitrate is 20 Mbps for each video.

image

The file size is pretty small for these 5-second test clips.

Above the red line, the files are the 3.6K format. Below the red line, the files are the 2.7K format.

image

Sample Videos

R0010222_0.MP4 - Google Drive

R0010223_0.MP4 - Google Drive

R0010221_1.MP4 - Google Drive

R0010223_1.MP4 - Google Drive

R0010221_0.MP4 - Google Drive

R0010222_1.MP4 - Google Drive

Android Demo app

image

We’re working on this app for testing. If you want to download the test version, send @jcasman a note.

The app can also set image bitrate.

image

Hi @craig,
Thanks for sharing! I was experimenting with the same earlier… My only question is: can the desktop app stitch these into one equirectangular video file? That would be cool and would help @Juantonto’s process a lot! Stitched 8K video coming from the Z1 would be superb…

I don’t think there is any way to stitch the two videos into a single equirectangular video right now. I don’t think the RICOH desktop app will work.


I installed the new Z1 firmware and did several tests. I’m confirming that it works for me. I do not have new information to add.

I tested recording video. I did not test the new, longer recording time.

I upgraded my Z1 firmware using my mobile phone, an iPhone 14. It was a smooth process and took about 5 minutes.

Here’s output in the THETA Bitrate Tester app that I was using:

I set the mode to video and the video format to 3.6K at 2 fps.

I took several test videos. This is a screenshot from the RICOH THETA File Transfer for Mac app. The two halves of the video appear to be roughly the same size. (I only took a 5-second video, so the file sizes are small.)

This is a screenshot from Quicktime, showing both halves of the video.

I don’t see a way to stitch the two halves using the RICOH THETA desktop app. I am guessing you could use PTGui or your own stitching algorithm.

I recorded a 50 minute video, but I was not able to successfully test playing it on my Linux laptop. I was able to play the 5 second video clips with VLC on Ubuntu.

I submitted the app to Google Play on June 23. It’s still under review as of July 3, 2023. It would be good to learn more about the app submission process.


Sounds great.
What I would love to see added is a flat color profile.
Higher bitrate, resolution, and 10-bit color would also be great, but those could be hardware-constrained.

What is a “flat color profile”? I am not a good photographer and do not know photography terms.

The Z1 has DNG/RAW format of the images.

I just watched the video below.

Is “flat color profile” kind of like RAW?

This is for images, not video, right?

That’s great to have. Such a pity they did not implement this feature in the THETA V or S with their corresponding sizes; it would be great to get that kind of high-resolution, low-framerate video.
Also, it seems they did not change the camera orientation information in each frame, which would be needed for a good frame stitch, wouldn’t it?

I quickly made a test with JavaScript and it works great, so I already have some videos in the new format.

It may be in an undisclosed proprietary format. I have not checked yet.

Do you need the IMU data per frame? It may not be possible to get the data track, but we can look.

It would be great to have that data per frame!
Note that I’m working in a JavaScript environment for now.

Thanks in advance

I believe the Z1 3.6K (two-file) video has the IMU data in CaMM format.

Yes, it has the IMU data in CaMM format.

You can easily access the data using exiftool like this:

exiftool -ee -V3 path/to/your/video.MP4 > path/to/results.txt

In the generated file you will find all the data from the camera. Look for lines like:

Track2 Type='camm' Format='camm', Sample 1 of 3642 (16 bytes)

Note that this is from a video with only 18 frames; the camera saves many samples for the accelerometer and gyro, in this case 3,642.

Track2 Type='camm' Format='camm', Sample 1 of 3642 (16 bytes)
 147e567: 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
SampleTime = 0
SampleDuration = 0
camm2 (SubDirectory) -->
- Tag 'camm' (16 bytes):
 147e567: 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
+ [BinaryData directory, 16 bytes]
| AngularVelocity = 0 0 0
| - Tag 0x0004 (12 bytes, float[3]):
|  147e56b: 00 00 00 00 00 00 00 00 00 00 00 00             [............]
Track2 Type='camm' Format='camm', Sample 2 of 3642 (16 bytes)
 147e577: 00 00 03 00 00 13 c7 3f c0 27 9e 40 00 28 e3 3d [.......?.'.@.(.=]
SampleTime = 0
SampleDuration = 0.005
camm3 (SubDirectory) -->
- Tag 'camm' (16 bytes):
 147e577: 00 00 03 00 00 13 c7 3f c0 27 9e 40 00 28 e3 3d [.......?.'.@.(.=]
+ [BinaryData directory, 16 bytes]
| Acceleration = 1.55526733398438 4.94235229492188 0.110916137695312
| - Tag 0x0004 (12 bytes, float[3]):
|  147e57b: 00 13 c7 3f c0 27 9e 40 00 28 e3 3d             [...?.'.@.(.=]

You can find camm2 and camm3 as stated in the documentation:

  case 2:
    float gyro[3];
  break;
        
  case 3:
    float acceleration[3];
  break;
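Based on the hex dumps above, each 16-byte sample starts with 2 reserved bytes and a little-endian uint16 packet type, followed by three little-endian floats. A small Python sketch (my own decoding, not official RICOH code):

```python
import struct

def parse_camm_sample(data: bytes):
    """Parse one 16-byte CAMM sample: 2 reserved bytes, a little-endian
    uint16 packet type, then three little-endian floats.
    Type 2 = angular velocity (rad/s), type 3 = acceleration (m/s^2)."""
    packet_type = struct.unpack_from("<H", data, 2)[0]
    values = struct.unpack_from("<3f", data, 4)
    label = {2: "AngularVelocity", 3: "Acceleration"}.get(packet_type, "Unknown")
    return label, values

# The second sample from the exiftool dump above:
sample = bytes.fromhex("00000300" "0013c73f" "c0279e40" "0028e33d")
print(parse_camm_sample(sample))  # acceleration triple, matching the dump
```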
          

If you want to calculate pitch and roll from those values, I’m using this function:

import numpy as np

def calculate_pitch_roll(acceleration):
    x, y, z = acceleration
    pitch = np.arctan2(x, np.sqrt(y**2 + z**2))
    roll = np.arctan2(y, np.sqrt(x**2 + z**2))
    return np.degrees(pitch), np.degrees(roll)

Hope it helps anybody working with this kind of data.


@Jordi_Vallverdu, this is fantastic! I loved this so much that I got our summer intern to run some tests on it.

On a Mac, she was able to transfer the 3.6K 2fps files from the camera with the RICOH THETA File Transfer App. Using exiftool, she created several results.txt files for different experiments.

For a 6-second video, there are about 2,600 samples: roughly 1,300 for angular velocity and 1,300 for acceleration.

The sample time is 6.61 for 2,641 samples.
It seems like two samples arrive every 5 ms.

EXAMPLE OF DATASETS:

3rd Test: When the camera is on its side on a table (not moving)

¼ through dataset:
| AngularVelocity = 0.000274658203125 0.0014190673828125 -0.0003204345703125
| Acceleration = -9.86866760253906 0.156494140625 -0.1824951171875

½ through dataset:
| AngularVelocity = 0.000274658203125 -0.0004119873046875 -0.0003204345703125
| Acceleration = -9.85910034179688 0.137344360351562 -0.175308227539062

¾ through dataset:
| AngularVelocity = 0.000274658203125 -0.02178955078125 0.00152587890625
| Acceleration = -9.85430908203125 0.134963989257812 -0.189666748046875

4th Test: When a human holding the camera walks

¼ through dataset:
| AngularVelocity = -0.13238525390625 0.370697021484375 -0.201797485351562
| Acceleration = -0.620407104492188 11.134765625 -0.6514892578125

½ through dataset:
| AngularVelocity = -0.351089477539062 -0.153427124023438 -0.148651123046875
| Acceleration = 1.66712951660156 8.70126342773438 1.2723388671875

¾ through dataset:
| AngularVelocity = 0.04046630859375 -0.476577758789062 -0.0515289306640625
| Acceleration = -2.14703369140625 9.71821594238281 -1.56315612792969

Her initial conclusions are below.

CONCLUSIONS:

The AngularVelocity and Acceleration values vary the most when a human is moving the camera, as shown in the dataset. Since the camm data has a timestamp for each reading, we can match the angular velocity and acceleration to the video frame.
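For example, mapping a camm SampleTime to a frame index at the 2 fps rate could be as simple as this (`frame_for_sample` is a hypothetical helper, not a camera API):

```python
def frame_for_sample(sample_time, fps=2):
    """Map a camm SampleTime in seconds to the nearest video frame
    index, assuming a constant frame rate (2 fps for the new format)."""
    return round(sample_time * fps)

# Last sample of the 6.61 s capture maps to a frame near the end:
print(frame_for_sample(6.61))
```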


Next Steps for Intern?

Is there any easy way to use a library to get the path?

Do people use libraries such as this? GitHub - xioTechnologies/Fusion

This article had some information on the algorithm to get a path.
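Before reaching for a full AHRS library, a minimal complementary-filter sketch might be a good first experiment (illustrative only; the axis convention and tuning constants are assumptions):

```python
import math

def complementary_filter(samples, dt=0.005, alpha=0.98):
    """Blend integrated gyro pitch with accelerometer pitch.

    `samples` is a list of ((gx, gy, gz), (ax, ay, az)) tuples in the
    camm axis convention used in the tests above; dt matches the ~5 ms
    sample spacing seen in the exiftool dump."""
    pitch = 0.0
    for (gx, gy, gz), (ax, ay, az) in samples:
        # Integrate angular velocity (rad/s) to predict the new pitch,
        gyro_pitch = pitch + gy * dt
        # then correct drift with the accelerometer's gravity estimate
        # (same pitch formula as calculate_pitch_roll earlier in the thread).
        accel_pitch = math.atan2(ax, math.sqrt(ay**2 + az**2))
        pitch = alpha * gyro_pitch + (1 - alpha) * accel_pitch
    return math.degrees(pitch)
```

With a static camera (like the 3rd test above), the output converges to the accelerometer-only pitch; with motion, the gyro term smooths out the acceleration noise.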



Absolutely, it’s unfortunate that the update isn’t available for the THETA V. If RICOH could implement this in future updates for other THETA models as well, it would significantly enhance their value.

I’ve delved deeply into this subject and discovered some crucial insights, also thanks to the community. Special mention to @craig who initially directed me towards the new format – that was my starting point.

Because of this, I’d like to contribute back by sharing a repository that compiles all of my research. Inside, you’ll find documentation on the camm format and some Python scripts that streamline the data extraction process.

The parse_camm.py file is particularly noteworthy; it extracts IMU information and converts it into a useful JSON format.

Feel free to explore it here: Extract Camm GitHub Repository

I hope you find it valuable and can build upon it.


Thank you for building this repo. The CAMM Data Description section in particular was super useful for me in trying to understand what is being recorded.

The ‘camm’ track in a video file is a specific data track containing motion-related information that the camera’s built-in sensors have captured. This motion data comprises readings from the gyroscope (Angular Velocity) and accelerometer (Acceleration), both provided in the Inertial Measurement Unit (IMU) coordinate system.

The gyroscope data represents the rotational speed around each of the XYZ axes in radians/second. On the other hand, the accelerometer data signifies the linear acceleration along each of the XYZ axes in meters/second^2. The relationship between the IMU coordinate system and the camera coordinate system is defined by the applications and usually recommended to be aligned by the camera manufacturers.

If the camera coordinate system is defined by the application, does that mean there is no standard way of extracting the data? Is it unique for each camera?


@Jordi_Vallverdu, do you also need the GPS information? I think it is in the CAMM track. Do you plan to add the case 5 GPS data to your library to extract it from the videos?

Camera Motion Metadata Spec | Street View Publish API | Google for Developers

  case 5:
    double latitude;
    double longitude;
    double altitude;
  break;
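If the Z1 does write type-5 packets (I have not verified that it does), decoding them would be a small extension of the type 2/3 parsing. A sketch following the spec layout above (`parse_camm_gps` is a hypothetical helper):

```python
import struct

def parse_camm_gps(data: bytes):
    """Decode a CAMM type-5 packet: 2 reserved bytes, a little-endian
    uint16 type (must be 5), then latitude, longitude, and altitude as
    three little-endian doubles (28 bytes total)."""
    packet_type = struct.unpack_from("<H", data, 2)[0]
    if packet_type != 5:
        raise ValueError(f"not a GPS packet: type {packet_type}")
    lat, lon, alt = struct.unpack_from("<3d", data, 4)
    return lat, lon, alt

# Synthetic packet for illustration (Tokyo-ish coordinates):
pkt = struct.pack("<HH3d", 0, 5, 35.6586, 139.7454, 25.0)
print(parse_camm_gps(pkt))  # → (35.6586, 139.7454, 25.0)
```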