Render time and previewing definitely get messy with that many different source videos. One trick I stumbled on was to standardize all the individual videos first. I kind of arbitrarily decided everything would be cropped to a portrait 3:4 aspect ratio, so I did that, added the person's name, and rendered each one as a new file. Then in the main project I could focus on the grid layouts, transitions, and so forth. I originally tried it just because it was an easier way to make the titles work, but it ended up making a huge difference in rendering time. The project had, I think, 15-17 singers, and an initial draft took around 4 hours to render. Pre-processing all the videos took 1-2 hours total, but the main project render time dropped to a little over an hour. Dealing with all the different aspect ratios, resolutions, formats, codecs, and random other crap that everyone's phone happens to spit out adds a lot more overhead than I realized going into it.
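
For anyone who would rather script the standardizing step than do it by hand, here is a rough Python sketch of the 3:4 crop math, plus an ffmpeg command line built from it. This is a swapped-in approach, not the workflow described above: ffmpeg, the 720x960 output size, the encoder settings, and the function names are all assumptions for illustration, and the name title is left out.

```python
def center_crop_3x4(width, height):
    """Return (crop_w, crop_h, x, y) for the largest centered 3:4 portrait crop."""
    # The crop is limited by whichever runs out first: the height or the width.
    crop_h = min(height, width * 4 // 3)
    crop_w = crop_h * 3 // 4
    # Round down to even numbers, since many encoders want even dimensions.
    # This can leave the ratio very slightly off 3:4 for odd source sizes.
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    x = (width - crop_w) // 2
    y = (height - crop_h) // 2
    return crop_w, crop_h, x, y


def ffmpeg_args(src, dst, width, height, out_w=720, out_h=960):
    """Build an ffmpeg argument list that crops to 3:4 and scales to one size.

    width/height are the source's displayed dimensions (after any rotation).
    The output size and codec choices are assumptions, not recommendations.
    """
    cw, ch, x, y = center_crop_3x4(width, height)
    vf = f"crop={cw}:{ch}:{x}:{y},scale={out_w}:{out_h}"
    return ["ffmpeg", "-i", src, "-vf", vf,
            "-c:v", "libx264", "-crf", "18", "-c:a", "aac", dst]


if __name__ == "__main__":
    # Hypothetical file names, just to show the shape of the command.
    print(" ".join(ffmpeg_args("singer01.mov", "singer01_std.mp4", 1920, 1080)))
```

The point is the same as above: every source ends up with one size, one aspect ratio, and one codec before it ever reaches the main project.
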
As far as process goes, we did something that combines some of the ideas already mentioned here. The choir teacher created a video that the students would play on their laptop or Chromebook (with headphones) while recording with their phones. The video included the score and a small view of the conductor in one corner. It would start with a "One, two, three, *clap*" followed by a pause, and then the lead-in to the music--which included a metronome, piano, and one person singing each voice part. We had one person pause their phone between the clap and the start of the music, but for the most part it was pretty easy to get everything lined up, and they mostly stayed in time with each other.
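
The clap gives every recording a common reference point, so lining things up comes down to finding it in each take and shifting by the difference. A minimal sketch of that idea in Python, assuming the audio has already been decoded into lists of samples between -1.0 and 1.0; the 0.5 threshold and the function names are made up for illustration, and a take where the phone was paused after the clap would still need to be lined up by ear.

```python
def first_loud_sample(samples, threshold=0.5):
    """Index of the first sample whose absolute level reaches threshold, or None."""
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return i
    return None


def trim_offsets(clap_indices):
    """Samples to trim from the start of each take so all the claps coincide."""
    earliest = min(clap_indices.values())
    return {name: idx - earliest for name, idx in clap_indices.items()}


if __name__ == "__main__":
    # Tiny fake takes: the "clap" is the first sample at or above the threshold.
    takes = {
        "alto": [0.0, 0.0, 0.9, 0.1],
        "bass": [0.0, 0.0, 0.0, 0.0, 0.8],
    }
    claps = {name: first_loud_sample(s) for name, s in takes.items()}
    print(trim_offsets(claps))
```

Dividing an offset by the sample rate gives the shift in seconds.
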
I used Vegas Pro for the video editing--which has some good and bad characteristics. I felt like the process of deciding, "Okay, I've got this many people, so I need to make them all this size and do x rows of y" was pretty clumsy. Instead, I actually used Inkscape to just draw out some squares to get the size and spacing correct. There was a little math involved because Vegas counts from the center of everything and Inkscape uses the bottom left corner, but once I had that figured out it was pretty quick and easy to create position presets and assign the source videos to them.
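
The "little math" is roughly this: lay out the cells measured from the bottom left corner, then convert each one to the position of its center relative to the center of the frame. A rough Python sketch, assuming 3:4 cells, a uniform gap, and pixel units; the function names are hypothetical, and the direction of the vertical axis in the editor is an assumption, so flip the sign of the second value if up and down come out reversed.

```python
def grid_cells(frame_w, frame_h, rows, cols, gap):
    """3:4 cells as (x, y, w, h), with the origin at the frame's bottom left."""
    # Largest cell that fits both across and down, leaving `gap` around each cell.
    max_w = (frame_w - gap * (cols + 1)) / cols
    max_h = (frame_h - gap * (rows + 1)) / rows
    cell_h = min(max_h, max_w * 4 / 3)
    cell_w = cell_h * 3 / 4
    # Center the whole grid in the frame.
    grid_w = cols * cell_w + (cols - 1) * gap
    grid_h = rows * cell_h + (rows - 1) * gap
    left = (frame_w - grid_w) / 2
    bottom = (frame_h - grid_h) / 2
    return [(left + c * (cell_w + gap), bottom + r * (cell_h + gap), cell_w, cell_h)
            for r in range(rows) for c in range(cols)]


def corner_to_center(cell, frame_w, frame_h):
    """Convert a bottom-left-origin cell to its center's offset from frame center."""
    x, y, w, h = cell
    return (x + w / 2 - frame_w / 2, y + h / 2 - frame_h / 2)


if __name__ == "__main__":
    # 18 slots in a 1920x1080 frame: 3 rows of 6, 10 px between cells.
    for cell in grid_cells(1920, 1080, rows=3, cols=6, gap=10):
        print(cell, corner_to_center(cell, 1920, 1080))
```

Each printed pair is one preset: the cell's size and where its center sits relative to the middle of the frame.
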
The choir teacher did the audio editing separately while I worked on the video. It was easier for me to time-align everything in Vegas first, export WAV files to send to him from that, and then add the mixed version as a new stereo track when he was done. I can't speak much to his process other than that we both acknowledged one of the hardest parts is deciding what "good enough" means under the circumstances. It's a bunch of mediocre phone mics in terrible acoustical environments, so it's never going to sound as good as you'd like. At some point you have to accept what's realistic with what you have and not waste endless hours trying to make it just a little bit better.