Sigmedia Stereo Video Database

The database is designed to be a resource that provides test data for those working in the field of stereo cinema post-production. It consists of sequences shot in indoor and outdoor locations under both controlled and uncontrolled lighting conditions. It also contains footage shot on side-by-side and mirror stereo rigs as well as footage shot on a steadicam and mirror rig.

Sigmedia Database Updated

The database has now been updated to include ambisonics B-format audio recordings along with some of the sequences. Along with the B-format recordings we are providing some simple MATLAB scripts that convert a B-format file to a format suitable for playback over stereo speakers or a 5.1 speaker array. Details of the video and audio formats used for these sequences are given in the amended sections below. The update also includes some new footage shot on a mirror rig which we have recently acquired. We would be interested in receiving any feedback with regard to shooting scenarios or types of artefacts which you think would be beneficial to add to the database.

The Experimental Setup

The database contains sequences shot on two stereo camera rigs: a side-by-side rig and a mirror rig. The side-by-side configuration consists of two Iconix HD-RH1 cameras mounted on an Inition 'bolt' side-by-side rig. The data was captured on Flash XDR units and recorded in a 4:2:2 XDCAM format using the "xd5e" codec. The sequences have a spatial resolution of 1920x1080 pixels and a frame rate of 25 frames per second.

The mirror rig is the P+S Technik Standard Mirror Rig, on which two Sony PMW-EX3 cameras are mounted. The data is encoded to the MPEG-2 standard using a 4:2:0 variable bit rate encoder with a maximum bit rate of 35 Mbps. As with the side-by-side configuration, all sequences have a resolution of 1920x1080 pixels and a frame rate of 25 fps progressive.

The ambisonics recordings were obtained using either a soundfield microphone or a Zoom H-2 recorder and were subsequently converted to ambisonics B-format. The soundfield microphone allows for periphonic (full-sphere) recording of the sound field, so these recordings contain 4 channels: W, X, Y and Z. The Zoom recorder only allows for horizontal (planar) recordings, so there is no Z channel in this case.
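
As a rough illustration of how horizontal B-format can be turned into a stereo feed, the sketch below forms two virtual microphones from the W, X and Y channels. This is not the MATLAB code distributed with the database. It assumes the traditional B-format convention (W carries a 1/sqrt(2) gain, X points forward, Y points left), and the function name and the `spread_deg` and `pattern` parameters are hypothetical choices made for this example.

```python
import math

def decode_bformat_to_stereo(w, x, y, spread_deg=45.0, pattern=0.5):
    """Decode horizontal B-format samples into (left, right) sample lists.

    Assumptions (not taken from the database documentation):
      - W has a 1/sqrt(2) gain, X points forward, Y points left.
      - Two virtual microphones are aimed at +spread_deg (left) and
        -spread_deg (right).
      - pattern: 1.0 = omni, 0.5 = cardioid, 0.0 = figure-of-eight.
    """
    theta = math.radians(spread_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    left, right = [], []
    for ws, xs, ys in zip(w, x, y):
        omni = pattern * math.sqrt(2.0) * ws        # pressure component
        front = (1.0 - pattern) * cos_t * xs        # front-back component
        side = (1.0 - pattern) * sin_t * ys         # left-right component
        left.append(omni + front + side)
        right.append(omni + front - side)
    return left, right

# Usage: a source located hard left (W = s/sqrt(2), X = 0, Y = s)
# should come out louder in the left output than in the right.
signal = [1.0, -0.5]
w = [s / math.sqrt(2.0) for s in signal]
left, right = decode_bformat_to_stereo(w, [0.0, 0.0], signal)
```

The Z channel is ignored here, so the same sketch applies to both the soundfield and the Zoom recordings.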

Sequence Information

The majority of sequences in the database are LZW-compressed TIFF sequences archived in the tar.gz format. The exception is the sequences with accompanying audio, which are compressed using an H.264 encoder at a variable bit rate of 25 Mbps. This was necessary as the sequences with audio are much longer on average. TIFF versions of these sequences may be made available on request.

For the H.264-encoded files, the left and right views each contain a stereo audio track, and the four ambisonics B-format channels are encoded onto these tracks. The left view contains the W and X channels, with the W channel in the left stereo channel and the X channel in the right stereo channel. The right view contains the Y channel (left stereo channel) and, where it exists, the Z channel (right stereo channel). These tracks are encoded using an AAC codec with a sample rate of 48 kHz and a bit rate of 192 kbps. The ambisonics channels are also provided separately as uncompressed mono audio files in the WAV format. The sample rate is again 48 kHz, with 24 bits per sample.
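
The channel layout described above can be sketched as a small helper that regroups the two stereo tracks into named B-format channels. It assumes the audio of each view has already been decoded into a sequence of (left, right) sample pairs by some external tool; the function name and the `has_z` flag are hypothetical and are not part of the database tools.

```python
def assemble_bformat(left_view_stereo, right_view_stereo, has_z=True):
    """Regroup the stereo tracks of the two views into B-format channels.

    left_view_stereo:  sequence of (W, X) sample pairs from the left view.
    right_view_stereo: sequence of (Y, Z) sample pairs from the right view.
    has_z: set to False for horizontal-only recordings, where the right
           stereo channel of the right view carries no Z signal.
    """
    channels = {
        "W": [pair[0] for pair in left_view_stereo],
        "X": [pair[1] for pair in left_view_stereo],
        "Y": [pair[0] for pair in right_view_stereo],
    }
    if has_z:
        channels["Z"] = [pair[1] for pair in right_view_stereo]
    return channels

# Usage with two made-up sample frames per view:
bformat = assemble_bformat([(0.1, 0.2), (0.3, 0.4)],
                           [(0.5, 0.6), (0.7, 0.8)])
# bformat["W"] == [0.1, 0.3] and bformat["Z"] == [0.6, 0.8]
```

Because the tracks in the video files are AAC-compressed, the separately provided WAV files are likely the better source when audio quality matters.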

There are two versions of each sequence in the database: the un-processed "raw" sequence and the colour-corrected sequence. More details of the processing applied can be found in our paper from CVMP '10 [1]. Each sequence folder also contains low-resolution, low-bit-rate video thumbnails of each of the archived sequences.

File Naming Convention for the Video-Only Sequences

For sequences without accompanying audio, file names in the database have the format "name"."rig"."location"."mount"."processing"."view"."index".tiff

  • name - a unique ID for each sequence.
  • rig - the stereo rig on which the sequence was shot. The values are
    1. sbs - for sequences shot on the side-by-side rig.
    2. ind - for sequences shot on the mirror rig.
  • location - the location where the sequence is shot. The potential values are
    1. std - for sequences shot in an indoor studio. It is implied that all sequences in this location are shot under controlled lighting conditions.
    2. ind - for other indoor locations. Control over the lighting is limited.
    3. otd - for outdoor locations.
  • mount - an ID that describes how the rig is mounted. It is either fix for tripod-mounted shots or scm for steadicam-mounted shots.
  • processing - describes the level of processing applied to the sequences. Possible values are
    1. raw - for un-processed sequences.
    2. col - for colour-corrected sequences.
  • view - the left and right stereo views.
  • index - a four digit frame index (%04d format).
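
The naming convention above can be sketched as a small parser. This is a minimal illustration rather than an official tool: it assumes the "name" tag itself contains no dots, and the sample file name (including the sequence name "seq01" and the view value "left") is hypothetical, since the view values are not listed above.

```python
from typing import NamedTuple

class FrameName(NamedTuple):
    name: str
    rig: str
    location: str
    mount: str
    processing: str
    view: str
    index: int

def parse_frame_name(filename):
    """Split a video-only frame file name into its seven tags."""
    parts = filename.split(".")
    if len(parts) != 8 or parts[-1] != "tiff":
        raise ValueError(f"unexpected file name: {filename!r}")
    name, rig, location, mount, processing, view, index, _ext = parts
    # The index is a four-digit, zero-padded frame number (%04d).
    if len(index) != 4 or not (index.isascii() and index.isdigit()):
        raise ValueError(f"bad frame index in {filename!r}")
    return FrameName(name, rig, location, mount, processing, view, int(index))

def format_frame_name(frame):
    """Rebuild the file name from its tags (inverse of parse_frame_name)."""
    return ".".join([frame.name, frame.rig, frame.location, frame.mount,
                     frame.processing, frame.view, f"{frame.index:04d}",
                     "tiff"])

# Usage with a hypothetical file name:
frame = parse_frame_name("seq01.sbs.std.fix.raw.left.0042.tiff")
# frame.rig == "sbs", frame.location == "std", frame.index == 42
```

The parser deliberately does not validate the tag values, so it keeps working if further rigs or locations are added to the database.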

File Naming Convention for the Sequences with Audio

The video files adopt the convention "name"."rig"."location"."mount"."processing"."view"."channels".mp4. The same values as above are used for the first 6 tags. The channels tag describes the ambisonics channels encoded in the mp4 file. The value is WX for the left view, representing the W and X ambisonics channels, and is either YZ or Y for the right view. The wav files follow the convention "name"."channel".wav where "channel" is W, X, Y or Z depending on the B-format channel it represents.
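
A similar sketch for the sequences with audio is shown below. It parses an mp4 file name, expands the channels tag into individual channel names, and lists the matching wav file names. The function names, the sequence name "seq01" and the view value "left" are hypothetical, and the "name" tag is again assumed to contain no dots.

```python
VIDEO_TAGS = ("name", "rig", "location", "mount",
              "processing", "view", "channels")

def channels_from_tag(tag):
    """Expand a channels tag (WX, YZ or Y) into a list of channel names."""
    if tag not in ("WX", "YZ", "Y"):
        raise ValueError(f"unknown channels tag: {tag!r}")
    return list(tag)

def parse_video_name(filename):
    """Split an mp4 file name into a dict of its tags."""
    parts = filename.split(".")
    if len(parts) != 8 or parts[-1] != "mp4":
        raise ValueError(f"unexpected file name: {filename!r}")
    info = dict(zip(VIDEO_TAGS, parts[:7]))
    info["channels"] = channels_from_tag(info["channels"])
    return info

def wav_names(sequence_name, has_z):
    """List the wav file names for a sequence; Z is absent for planar audio."""
    channels = ["W", "X", "Y"] + (["Z"] if has_z else [])
    return [f"{sequence_name}.{channel}.wav" for channel in channels]

# Usage with a hypothetical file name:
info = parse_video_name("seq01.sbs.std.fix.col.left.WX.mp4")
# info["channels"] == ["W", "X"]
files = wav_names(info["name"], has_z=False)
# files == ["seq01.W.wav", "seq01.X.wav", "seq01.Y.wav"]
```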

Accessing the Database

Unfortunately, due to copyright issues, we require users wishing to access the database to register. You can register by following the link on the left of the page. We may offer physical distribution of the database on an external hard drive upon request should there be sufficient demand. All queries regarding the database should be forwarded to corrigad_at_tcd_dot_ie or anil_dot_kokaram_at_tcd_dot_ie.

Acknowledgements

We would like to acknowledge the authors of [1] for their efforts in gathering the material for this database. We would also like to thank John Squires and Fionnuala Conway from Trinity College Dublin as well as Charlie Perera from Inition for their assistance during the recording of the database. This work was funded in part by the Science Foundation Ireland (SFI) PI project CAMP, the EU FP7 project i3DPost and Adobe Systems Inc.

Publications

  1. D. Corrigan, F. Pitié, V. Morris, A. Rankin, M. Linnane, G. Kearney, M. Gorzel, M. O’Dea, C. Lee and A. Kokaram. A Video Database for the Development of Stereo-3D Post-Production Algorithms. Conference on Visual Media Production (CVMP), London, 2010.

Video Playback for Stereo-3D

We have been looking for a video player capable of playing in stereo-3D from separate left and right video files. We recently found one called Bino, which is based on the FFmpeg libraries and so plays a wide variety of video formats. We have tried it on the compressed videos in the database and it seems to work. It seems a good option for previewing the sequences in stereo before committing to downloading the high-quality HD TIFF sequences.

The website for Bino is www.nongnu.org/bino
