In this short article, I walk through the main steps of transcoding and optimizing video for the web. I share basic tips and examples to consistently automate these processes in-house using ffmpeg, a powerful open-source tool.
Although I make a living building and deploying video processing and delivery services, here I'll play devil's advocate: if progressive video is enough for you and the following steps cover your needs, you likely don't need anything else.
I assume you start from the first version of a finished video provided by a videographer or media agency. This is the pristine video. The first thing to be aware of is that the pristine sets the maximum resolution and the maximum quality.
A good practice for the web is to set a format, resolution, and quality for pristines and stick to them. For instance: H.264 in MP4, 4K resolution, and near-maximum visual quality (crf=17). Requiring providers to stick to this definition avoids last-minute (bad) surprises. You might need to define more than one resolution, with different aspect ratios, depending on the web layout.
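One way to enforce such a definition is a small acceptance check on every incoming file. This is only a sketch: the `check_pristine` function, the file names, and the 3840-pixel threshold are illustrative, not a standard; the codec name and width are the values ffprobe would report.

```shell
# Hypothetical acceptance check for incoming pristines.
check_pristine() {
  codec=$1   # e.g. "h264", as printed by ffprobe
  width=$2   # luma width in pixels
  if [ "$codec" = "h264" ] && [ "$width" -ge 3840 ]; then
    echo "accepted"
  else
    echo "rejected: codec=$codec width=$width"
  fi
}

# In practice the two values come from ffprobe, for instance:
#   ffprobe -v error -select_streams v:0 \
#     -show_entries stream=codec_name,width -of csv=p=0 pristine.mp4
check_pristine h264 3840   # accepted
check_pristine hevc 3840   # rejected
```

Rejecting a non-conforming pristine at ingestion time is much cheaper than discovering it at delivery time.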
The variants or derivatives
From the pristine video you'll derive the variants with the formats, resolutions, and quality that you intend to deliver. As with images, progressive videos should be delivered with a responsive approach: it doesn't make sense to deliver a heavy Full HD version only to render it in a small viewport. Breakpoints should be defined to ensure proper coverage of your audience.
```html
<video width="100%" controls>
  <source src="Sintel_Trailer.1080p.DivX_Plus_HD.mp4" type="video/mp4; codecs=hevc">
  <source src="Sintel_Trailer.1080p.DivX_Plus_HD.webm" type="video/webm; codecs=vp9">
  <source src="Sintel_Trailer.1080p.DivX_Plus_HD.m4v" type="video/mp4">
</video>
```
If you decide to serve several formats, then you need transcoding. Assuming a pristine in H.264, you can use ffmpeg to convert to H.265 (HEVC) and to VP9. For instance:
```shell
# H.265
ffmpeg -i input.mp4 -c:v libx265 -crf 23 -tag:v hvc1 -pix_fmt yuv420p \
  -color_primaries 1 -color_trc 1 -colorspace 1 \
  -movflags +faststart -an output.mp4

# VP9 (note: -b:v 0 puts libvpx-vp9 in constant-quality mode;
# -movflags applies only to the MP4/MOV muxer, so it is dropped here)
ffmpeg -i input.mp4 -c:v libvpx-vp9 -crf 30 -b:v 0 -speed 3 -pix_fmt yuv420p \
  -color_primaries 1 -color_trc 1 -colorspace 1 -an output.webm
```
As noted, to be responsive you'd serve several resolutions. To define your breakpoints, look at the page layout and your traffic analytics (devices and screens). Once they are defined, ffmpeg lets you easily rescale a video. For instance:
```shell
# Rescale to Full HD (scale=1920:-2 keeps the height even, as yuv420p requires)
ffmpeg -y -i input.mp4 -vf scale=1920:-2 -c:v libx264 -crf 23 -profile:v high \
  -pix_fmt yuv420p -color_primaries 1 -color_trc 1 -colorspace 1 \
  -movflags +faststart -an output.mp4
```
Remember to always downscale from the pristine and avoid upscaling. Upscaling invents pixels that add weight and consume bandwidth, while the browser can do the same scaling at no such cost. Moreover, upscaling produces artifacts, especially blur. Just define the pristine and the breakpoints with enough resolution.
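The breakpoint loop and the downscale-only rule can be sketched together. This hypothetical `renditions` helper prints one ffmpeg command per breakpoint, skipping any breakpoint wider than the source (pipe its output to `sh` to actually run the encodes); the breakpoint widths and file names are examples, and in practice the source width would come from ffprobe.

```shell
# Print one H.264 rendition command per breakpoint, never upscaling.
renditions() {
  input=$1
  src_width=$2   # in practice, read with ffprobe
  shift 2
  for w in "$@"; do
    # skip breakpoints wider than the source: no upscaling
    if [ "$w" -le "$src_width" ]; then
      echo "ffmpeg -y -i $input -vf scale=${w}:-2" \
           "-c:v libx264 -crf 23 -profile:v high -pix_fmt yuv420p" \
           "-movflags +faststart -an ${input%.*}_${w}.mp4"
    fi
  done
}

renditions pristine.mp4 1920 640 1280 1920 3840   # 3840 is silently skipped
```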
Video compression for the web is lossy, and lossy means discarding information to reduce weight. Depending on how much information is discarded, the content of the video, and the coding standard you're using, at some point you'll notice increasing visual artifacts (blur, blocking, mosquito noise).
When choosing compression settings, there are two main approaches: setting a visual quality or setting a bitrate.
Ensure visual quality
In this case you should set the crf parameter, which is a proxy for visual quality. This is the default configuration in ffmpeg.
The underlying metric behind the crf is different for each coding standard, so you'll need to find the value that meets your expectations for each video format. A good starting point is ffmpeg's default value.
```shell
# Example of quality definition based on crf
ffmpeg -y -i input.mp4 -c:v libx264 -crf 23 -profile:v high \
  -pix_fmt yuv420p -color_primaries 1 -color_trc 1 -colorspace 1 \
  -movflags +faststart -an output.mp4
```
While the benefit of a crf-based policy is assured visual quality, the main risk is bitrate peaks in videos with rich textures and fast motion.
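One way to tame those peaks while keeping crf is x264's capped-CRF mode: adding `-maxrate` and `-bufsize` engages the VBV model, so quality-driven encodes cannot exceed the cap for long. The helper below only assembles and prints the command (pipe to `sh` to run it); the function name and the 2500k/5000k values are illustrative, not recommendations.

```shell
# Build a capped-CRF ffmpeg command: quality-driven, but peaks limited by VBV.
capped_crf() {
  # $1 input, $2 output, $3 crf, $4 maxrate, $5 bufsize (often ~2x maxrate)
  echo "ffmpeg -y -i $1 -c:v libx264 -crf $3 -maxrate $4 -bufsize $5" \
       "-profile:v high -pix_fmt yuv420p -movflags +faststart -an $2"
}

capped_crf input.mp4 output.mp4 23 2500k 5000k
```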
Ensure bandwidth usage
If you want to control the bandwidth used by the video, then you need to set the bitrate.
```shell
# Example of compression with a target (average) bitrate
ffmpeg -y -i input.mp4 -c:v libx264 -b:v 2500k -profile:v high \
  -pix_fmt yuv420p -color_primaries 1 -color_trc 1 -colorspace 1 \
  -movflags +faststart -an output.mp4
```
The downside is that, to avoid ending up with many videos of very poor visual quality, you'll have to set a fairly high bitrate, which means delivering many other videos much heavier than needed. I'd rather base compression on crf than on bitrate.
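That said, if a bitrate budget is mandatory, two-pass encoding spends it more evenly than a single pass: the first pass analyzes the video, the second allocates bits accordingly. The sketch below only prints the two commands (pipe to `sh` to run them); the helper name and the 2500k value are examples.

```shell
# Build the two commands of a two-pass average-bitrate encode.
two_pass() {
  # $1 input, $2 output, $3 average bitrate
  # Pass 1: analysis only, output discarded to the null muxer
  echo "ffmpeg -y -i $1 -c:v libx264 -b:v $3 -pass 1 -an -f null /dev/null"
  # Pass 2: the actual encode, guided by the pass-1 log
  echo "ffmpeg -y -i $1 -c:v libx264 -b:v $3 -pass 2 -profile:v high" \
       "-pix_fmt yuv420p -movflags +faststart -an $2"
}

two_pass input.mp4 output.mp4 2500k
```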
For pros: HLS and per-title encoding
Although HLS packaging and per-title encoding can certainly be done using ffmpeg, they involve more complex aspects (quality ladder selection, renditions, player support) that are out of the scope of this basic-tips article.
Here I have gone through some basic tips covering a simple pipeline for transcoding progressive videos for web delivery.