I think many people would argue that they should be listed multiple times, once for each audio stream. I just can’t bring myself to do that, as it kind of explodes: what if there are multiple angles in the video as well? Then you need to list all the angle/channel combinations, which means each gets listed at least 4 times (2 angles × 2 audio streams at a minimum). Also, with videos it is much more common for the 5.1 and stereo mixes to be downmixes/upmixes of one another, which would really be the same recording anyway, and that is difficult to determine.
Not really answering the question, just throwing out more things to consider.