This chapter provides an overview of how the camera operates under various conditions, and introduces the low-level software interface that picamera utilizes.
The Pi’s camera has a discrete set of input modes. On the V1 camera these are as follows:
# | Resolution | Aspect Ratio | Framerates | Video | Image | FoV | Binning
---|---|---|---|---|---|---|---
1 | 1920x1080 | 16:9 | 1-30fps | x | | Partial | None
2 | 2592x1944 | 4:3 | 1-15fps | x | x | Full | None
3 | 2592x1944 | 4:3 | 0.1666-1fps | x | x | Full | None
4 | 1296x972 | 4:3 | 1-42fps | x | | Full | 2x2
5 | 1296x730 | 16:9 | 1-49fps | x | | Full | 2x2
6 | 640x480 | 4:3 | 42.1-60fps | x | | Full | 4x4
7 | 640x480 | 4:3 | 60.1-90fps | x | | Full | 4x4
Note
This table is accurate as of firmware revision #656. Firmware revisions prior to this had a more restricted set of modes, and all video modes used a partial FoV. Use `sudo apt-get dist-upgrade` to upgrade to the latest firmware.
On the V2 camera, these are:
# | Resolution | Aspect Ratio | Framerates | Video | Image | FoV | Binning
---|---|---|---|---|---|---|---
1 | 1920x1080 | 16:9 | 0.1-30fps | x | | Partial | None
2 | 3280x2464 | 4:3 | 0.1-15fps | x | x | Full | None
3 | 3280x2464 | 4:3 | 0.1-15fps | x | x | Full | None
4 | 1640x1232 | 4:3 | 0.1-40fps | x | | Full | 2x2
5 | 1640x922 | 16:9 | 0.1-40fps | x | | Full | 2x2
6 | 1280x720 | 16:9 | 40-90fps | x | | Partial | 2x2
7 | 640x480 | 4:3 | 40-90fps | x | | Partial | 2x2
Modes with full field of view (FoV) capture from the whole area of the camera’s sensor (2592x1944 pixels for the V1 camera, 3280x2464 for the V2 camera). Modes with partial FoV capture from the center of the sensor. A combination of FoV limiting and binning is used to achieve the requested resolution.
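To make the binning arithmetic concrete, here is a small pure-Python sketch (not part of picamera; the function name is illustrative) showing how binning reduces a capture area to a mode’s output resolution. Some modes additionally crop a few pixels, which this ignores:

```python
def binned_resolution(capture_area, binning):
    # Each axis of the capture area is divided by the binning factor:
    # an n x n bin averages n*n sensor pixels into one output pixel.
    (w, h), (bx, by) = capture_area, binning
    return (w // bx, h // by)

# Full V1 sensor (2592x1944) with 2x2 binning gives mode 4's 1296x972;
# full V2 sensor (3280x2464) with 2x2 binning gives mode 4's 1640x1232.
```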
The image below illustrates the difference between full and partial FoV for the V1 camera:
While the various FoVs for the V2 camera are illustrated in the following image:
The input mode can be manually specified with the sensor_mode parameter in the PiCamera constructor (using one of the values from the # column in the tables above). This defaults to 0, indicating that the mode should be selected automatically based on the requested resolution and framerate. The rules governing which input mode is selected are as follows:
A few examples are given below to clarify the operation of this heuristic (note these examples assume the V1 camera module):
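As a rough, unofficial model of how such a heuristic can behave, the following pure-Python sketch picks the smallest V1 mode whose framerate range covers the requested framerate and whose resolution covers the requested resolution in both dimensions. This is a simplification (the firmware's actual rules differ in detail, e.g. in aspect-ratio handling), and all names here are illustrative, not part of picamera:

```python
# (mode, width, height, min_fps, max_fps) taken from the V1 mode table above
V1_MODES = [
    (1, 1920, 1080, 1.0, 30.0),
    (2, 2592, 1944, 1.0, 15.0),
    (3, 2592, 1944, 0.1666, 1.0),
    (4, 1296, 972, 1.0, 42.0),
    (5, 1296, 730, 1.0, 49.0),
    (6, 640, 480, 42.1, 60.0),
    (7, 640, 480, 60.1, 90.0),
]

def select_mode(width, height, framerate):
    # Candidates: framerate within the mode's range, and the mode's
    # resolution at least as large as the request in both dimensions
    # (the GPU can downscale, but not upscale, the sensor output).
    candidates = [
        (w * h, m) for (m, w, h, lo, hi) in V1_MODES
        if lo <= framerate <= hi and w >= width and h >= height
    ]
    if not candidates:
        raise ValueError("no mode satisfies the request")
    # Prefer the smallest qualifying capture area
    return min(candidates)[1]
```

For example, under this model a request for 1296x730 at 30fps selects mode 5, while a full-resolution 2592x1944 request at 10fps can only be satisfied by mode 2.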
There are additional limits imposed by the GPU hardware that performs all image and video processing:
This section attempts to provide detail of what picamera is doing “under the hood” in response to various method calls.
The MMAL layer below picamera presents the camera with three ports: the still port, the video port, and the preview port. The following sections describe how these ports are used by picamera and how they influence the camera’s resolutions.
Firstly, the still port. Whenever this is used to capture images, it (briefly) forces the camera’s mode to one of the two supported still modes (see Camera Modes) so that images are captured using the full area of the sensor. It also uses a strong de-noise algorithm on captured images so that they appear higher quality.
The still port is used by the various capture() methods when their use_video_port parameter is False (which it is by default).
The video port is somewhat simpler in that it never changes the camera’s mode. The video port is used by the start_recording() method (for recording video), and is also used by the various capture() methods when their use_video_port parameter is True. Images captured from the video port tend to have a “grainy” appearance, much more akin to a video frame than the images captured by the still port (this is because the still port uses a slower, more aggressive de-noise algorithm).
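As a concrete sketch of the two capture paths (this assumes a Raspberry Pi with the camera module attached; the import is guarded so the code can at least be parsed elsewhere, and the function and file names are illustrative):

```python
try:
    from picamera import PiCamera
except ImportError:          # not running on a Pi with picamera installed
    PiCamera = None

def capture_via_both_ports(still_path='still.jpg', video_path='video.jpg'):
    # The default capture() uses the still port: the sensor briefly switches
    # to a full-FoV still mode and the stronger de-noise pass is applied.
    # With use_video_port=True the current video-mode frame is captured
    # without a mode switch, giving a faster but "grainier" result.
    with PiCamera() as camera:
        camera.capture(still_path)
        camera.capture(video_path, use_video_port=True)
```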
The preview port operates more or less identically to the video port. The preview port is always connected to some form of output to ensure that the auto-gain algorithm can run. When an instance of PiCamera is constructed, the preview port is initially connected to an instance of PiNullSink. When start_preview() is called, this null sink is destroyed and the preview port is connected to an instance of PiPreviewRenderer. The reverse occurs when stop_preview() is called.
The camera provides various encoders which can be attached to the still and video ports for the purpose of producing output (e.g. JPEG images or H.264 encoded video). A port can have a single encoder attached to it at any given time (or nothing if the port is not in use).
Encoders are connected directly to the still port. For example, when capturing a picture using the still port, the camera’s state conceptually moves through these states:
As you have probably noticed in the diagram above, the video port is a little more complex. In order to permit simultaneous video recording and image capture via the video port, a “splitter” component is permanently connected to the video port by picamera, and encoders are in turn attached to one of its four output ports (numbered 0, 1, 2, and 3). Hence, when recording video the camera’s setup looks like this:
And when simultaneously capturing images via the video port whilst recording, the camera’s configuration moves through the following states:
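From the application’s point of view the splitter shuffling happens entirely inside the firmware; a typical capture-while-recording session looks like the following sketch (assuming a Pi with the camera module attached; file names are arbitrary):

```python
try:
    from picamera import PiCamera
except ImportError:
    PiCamera = None  # illustration only when not on a Pi

def record_and_snapshot(video='video.h264', image='frame.jpg'):
    with PiCamera(resolution=(1280, 720), framerate=30) as camera:
        camera.start_recording(video)   # H.264 encoder on one splitter output
        camera.wait_recording(5)
        # use_video_port=True attaches an image encoder to another splitter
        # output, so the recording continues uninterrupted
        camera.capture(image, use_video_port=True)
        camera.wait_recording(5)
        camera.stop_recording()
```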
When the resize parameter is passed to one of the aforementioned methods, a resizer component is placed between the camera’s ports and the encoder, causing the output to be resized before it reaches the encoder. This is particularly useful for video recording, as the H.264 encoder cannot cope with full resolution input (the GPU hardware can only handle frame widths up to 1920 pixels). Hence, when performing full frame video recording, the camera’s setup looks like this:
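For instance, a full-frame recording on the V1 camera can be set up as in the sketch below: sensor_mode 2 is the full-FoV 2592x1944 mode from the table above, and the resize parameter scales frames down before the H.264 encoder sees them (the 1024x768 target and file name are arbitrary choices, and the import is guarded for readability off the Pi):

```python
try:
    from picamera import PiCamera
except ImportError:
    PiCamera = None

def record_full_frame(output='fullframe.h264'):
    # sensor_mode=2 selects the full-FoV 2592x1944 mode on the V1 camera;
    # the resizer scales frames down before the H.264 encoder, which cannot
    # accept frame widths above 1920 pixels.
    with PiCamera(sensor_mode=2, framerate=15) as camera:
        camera.start_recording(output, resize=(1024, 768))
        camera.wait_recording(10)
        camera.stop_recording()
```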
Finally, when performing unencoded captures an encoder is (naturally) not required. Instead data is taken directly from the camera’s ports. However, various firmware limitations require acrobatics in the pipeline to achieve requested encodings.
For example, in older firmwares the camera’s still port cannot be configured for RGB output (due to a faulty buffer size check). However, it can be configured for YUV output, so in this case picamera configures the still port for YUV output, attaches a resizer (configured with the same input and output resolution), then configures the resizer’s output for RGBA (the resizer doesn’t support RGB for some reason). It then runs the capture and strips the redundant alpha bytes from the data.
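The final alpha-stripping step can be illustrated in pure Python; this is a sketch of the operation (picamera performs an equivalent strip internally), dropping every fourth byte of a tightly packed RGBA buffer:

```python
def strip_alpha(rgba):
    # Convert a tightly packed RGBA buffer to RGB by dropping every fourth
    # byte (the redundant alpha channel).
    if len(rgba) % 4:
        raise ValueError("buffer length is not a multiple of 4")
    rgb = bytearray(len(rgba) // 4 * 3)
    rgb[0::3] = rgba[0::4]  # red
    rgb[1::3] = rgba[1::4]  # green
    rgb[2::3] = rgba[2::4]  # blue
    return bytes(rgb)
```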
Recent firmwares fix the buffer size check, so on these picamera (since release 1.11) simply configures the still port for RGB output:
A further complication is the “OPAQUE” encoding. This is the most efficient encoding to use when connecting MMAL components as it simply passes pointers around under the hood rather than full frame data. However, not all OPAQUE encodings are equivalent:
The new mmalobj layer introduced in picamera 1.11 is aware of these OPAQUE encoding differences and attempts to configure connections between components with the most efficient formats possible. However, it is not aware of firmware revisions so if you’re playing with MMAL components via this layer be prepared to do some tinkering to get your pipeline working.
Please note that even the description above is almost certainly far removed from what actually happens at the camera’s ISP level. Rather, what has been described in this section is how the MMAL library exposes the camera to applications which utilize it (these include the picamera library, along with the official raspistill and raspivid applications).
In other words, by using picamera you are passing through (at least) two abstraction layers which necessarily obscure (but hopefully simplify) the “true” operation of the camera.