I want to expand on Tobias' 3D encoding guide. While it provides a good starting point, encoding the video in three separate steps (downsample, split, merge) is quite wasteful, both in the time it takes and in the quality lost along the way.
Instead, it is possible to do it all in one go:
ffmpeg -i INPUT -filter_complex...
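To illustrate the idea, here is a rough sketch of what a single-pass filter graph can look like. It is not the command above, whose filter graph is elided here, and it rests on assumptions of my own: a full-width side-by-side input, a half-resolution top-bottom output, and the placeholder names FILTER, INPUT and OUTPUT. The filters used (scale, split, crop, vstack) are standard ffmpeg filters.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the command from the post.
# Assumed: side-by-side input, half-resolution top-bottom output.

# One filter graph performs all three steps in memory, with a single encode:
#   scale  -> downsample to half width and half height
#   split  -> duplicate the scaled stream
#   crop   -> keep the left half of one copy, the right half of the other
#   vstack -> merge the two halves, left half on top
FILTER='[0:v]scale=iw/2:ih/2,split=2[a][b];[a]crop=iw/2:ih:0:0[l];[b]crop=iw/2:ih:iw/2:0[r];[l][r]vstack=inputs=2[out]'

# INPUT and OUTPUT are placeholder variables; encode only if INPUT exists.
if [ -f "${INPUT:-}" ]; then
    ffmpeg -i "$INPUT" -filter_complex "$FILTER" \
        -map '[out]' -map '0:a?' -c:a copy "${OUTPUT:-out.mkv}"
fi
```

Because the intermediate streams never leave the filter graph, nothing is written to disk between steps and the video is compressed only once, at the end.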