Understanding video compression
What is video compression?
Video compression is the process of encoding video data in such a way that it consumes less space than the original (raw) data, making it easier to store or to transmit over a network or the Internet. This is achieved by eliminating redundant and non-essential data from the original data stream.
Video compression is performed by a video codec that implements one or more compression algorithms. Typically, compression works by removing repetitive image and audio data from a video. For example, a video may show the same background in frame after frame, repeat the same sound several times, or contain detail that viewers cannot perceive. Video compression removes or approximates such data in order to reduce the video file size.
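As a toy illustration of redundancy removal (a run-length encoder, not an actual video codec), repeated values can be collapsed into (value, count) pairs, which is essentially how a uniform background costs almost nothing to store:

```python
def rle_encode(data):
    """Toy run-length encoder: collapse runs of repeated values into (value, count) pairs."""
    encoded = []
    i = 0
    while i < len(data):
        count = 1
        while i + count < len(data) and data[i + count] == data[i]:
            count += 1
        encoded.append((data[i], count))
        i += count
    return encoded

# A row of pixels with a large uniform background compresses very well:
row = [0] * 90 + [255] * 10     # 100 pixel values
print(rle_encode(row))          # [(0, 90), (255, 10)] -- 2 pairs instead of 100 values
```

Real codecs use far more sophisticated techniques (transforms, quantization, entropy coding), but the underlying idea is the same: redundancy in the data is replaced by a shorter description of it.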
What are the drawbacks of using video compression?
Once a video is compressed, its original format is changed into a different format (depending on the codec used). The video player must support that video format or be integrated with the compressing codec to play back the video file.
Another important thing to be aware of when compressing/decompressing video is the actual processing power required to perform these operations. This can vary greatly depending on the codec used. All of this processing also introduces latency; latency can be extremely important for real-time applications (using PTZ on a remote camera in a video surveillance system for example), but less important for non real-time applications (playing back a movie from an attached storage drive for instance).
In terms of image quality, the data lost through video compression should normally be imperceptible. This is because codecs preferentially discard visual detail that falls within the limitations of human perception with regard to color and motion. However, if the compression is very aggressive (to produce the smallest file size possible), the drop in image quality becomes noticeable.
What are the differences and similarities between MJPEG, H.264 and H.265?
IONODES encoders support some of the most popular video codecs on the market: MJPEG, H.264 and H.265 (support varies based on hardware platform, so not all of them may be available on a specific product). The main difference between H.264/H.265 and MJPEG is that MJPEG only compresses individual frames of video, while H.264/H.265 compresses across frames.
For MJPEG, each frame of video is compressed by itself, just as if you were compressing a series of JPEG images manually. For H.264, some frames are compressed by themselves (called I, or intra, frames) while most frames only record changes from the previous frame (called P, or predicted, frames). This can save a significant amount of bandwidth compared to MJPEG, which encodes each frame anew (reductions of up to 80% are possible).
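The I-frame/P-frame idea can be sketched with simple frame differencing. Real H.264 prediction is far more sophisticated (motion-compensated blocks, multiple reference frames); this sketch only shows the concept of storing a full frame once and then only the changes:

```python
def encode_stream(frames):
    """Store the first frame whole (the 'I-frame'); for each later frame,
    store only the pixels that changed from the previous one ('P-frames')."""
    stream = [("I", frames[0][:])]
    for prev, cur in zip(frames, frames[1:]):
        changes = [(i, v) for i, (p, v) in enumerate(zip(prev, cur)) if p != v]
        stream.append(("P", changes))
    return stream

def decode_stream(stream):
    """Rebuild every frame by applying each P-frame's changes to the previous frame."""
    frames = [stream[0][1][:]]
    for kind, changes in stream[1:]:
        frame = frames[-1][:]
        for i, v in changes:
            frame[i] = v
        frames.append(frame)
    return frames

# A mostly static scene where only one 'pixel' changes per frame:
frames = [[0] * 8, [0] * 7 + [9], [0] * 6 + [9, 9]]
encoded = encode_stream(frames)
print(encoded[1])                          # ('P', [(7, 9)]) -- one change, not 8 values
assert decode_stream(encoded) == frames    # lossless round trip
```

Note how each P-frame here stores a single changed pixel instead of the whole frame; the more static the scene, the greater the saving.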
H.265 was designed as a successor to the widely used H.264. In comparison to H.264, H.265 offers from 25% to 50% better data compression at the same level of video quality, or substantially improved video quality at the same bit rate. This becomes increasingly important as resolution goes up from 1080p to 4K and beyond.
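To put the savings in concrete terms, some back-of-the-envelope storage arithmetic helps. The bit rates below are illustrative assumptions, not measured figures for any particular product or scene:

```python
def storage_gb(bitrate_mbps, hours):
    """Storage consumed by a constant-bit-rate stream, in gigabytes."""
    bits = bitrate_mbps * 1_000_000 * hours * 3600
    return bits / 8 / 1_000_000_000  # bits -> bytes -> GB

h264_mbps = 8.0                      # hypothetical 1080p H.264 stream
h265_mbps = h264_mbps * 0.5          # assuming the best-case ~50% savings

print(storage_gb(h264_mbps, 24))     # 86.4 GB per day
print(storage_gb(h265_mbps, 24))     # 43.2 GB per day
```

At 4K resolutions the absolute bit rates are several times higher, which is why the percentage savings of H.265 become increasingly valuable as resolution goes up.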
Whether you use MJPEG, H.264 or H.265, it is important to know the complexity of your scene. By scene complexity, we mean how much activity is occurring in the scene being captured. For instance, a person talking in front of a white wall is far less 'complex' than a crowded stadium. In general, the more colors, shapes, objects and movements in a scene, the more complex that scene will be (time of day and scene lighting also play a part). Compression will always be limited by this scene complexity. All compression depends on discovering patterns and representing those patterns with shorter codes. The more complex or the more seemingly random the data, the harder it is to compress.
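The effect of seemingly random data on compressibility is easy to demonstrate with a general-purpose lossless compressor such as zlib (a byte compressor, not a video codec, but the same principle applies):

```python
import os
import zlib

size = 100_000
uniform = bytes(size)       # a flat 'white wall': one value repeated 100,000 times
noisy = os.urandom(size)    # high-entropy stand-in for a 'crowded stadium'

print(len(zlib.compress(uniform)))  # a few hundred bytes at most
print(len(zlib.compress(noisy)))    # roughly 100,000 bytes: no pattern to exploit
```

The uniform buffer shrinks by several orders of magnitude, while the random buffer barely shrinks at all; a video encoder faces the same limit when the scene offers little repetition to exploit.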