I gave up and wrote my own code. GPUImage is great, but it has a lot of open issues and the code is not very easy to read for OpenGL noobs like me.
I learned more by writing it myself. All roads lead to Rome, and all of GPUImage is basically based on a few examples out there on the net anyhow.
Well, Brad Larson has just commented on that request:
There currently is no way to do this kind of audio mixing. Only one audio source is used at a time.
Merging video tracks works but merging audio doesn't, and the reason is how the pipeline works. For video merging there are two GPUImageMovie instances. Each one reads frames, uploads the image buffer to the GPU, grabs a reference to the rendered texture, and passes it to the GPUImageMovieWriter instance. GPUImageMovieWriter then binds the two texture references to the chroma key shader's input sampler textures and renders the result. Finally, the rendered output texture is grabbed and written to the output video.
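To make the video side concrete, here is a rough CPU sketch of what a chroma key blend does per pixel. It is plain Python, not GPUImage code: the function name, the green key colour, and the threshold are all my own assumptions, and the real shader runs on the GPU and may well compute the match differently.

```python
from math import sqrt

def chroma_key_blend(foreground, background, key=(0, 255, 0), threshold=100.0):
    """Replace foreground pixels close to `key` with the background pixel.

    Frames are flat lists of (r, g, b) tuples of equal length.
    `key` and `threshold` are illustrative defaults, not GPUImage values.
    """
    if len(foreground) != len(background):
        raise ValueError("frames must have the same number of pixels")

    output = []
    for fg_pixel, bg_pixel in zip(foreground, background):
        # Euclidean distance in RGB space between this pixel and the key colour.
        distance = sqrt(sum((c - k) ** 2 for c, k in zip(fg_pixel, key)))
        # Close to the key colour: show the background through.
        output.append(bg_pixel if distance < threshold else fg_pixel)
    return output

if __name__ == "__main__":
    fg = [(0, 255, 0), (200, 10, 10)]   # one green-screen pixel, one red pixel
    bg = [(1, 2, 3), (4, 5, 6)]
    print(chroma_key_blend(fg, bg))
```

The point is that both inputs are just images of the same size, so combining them is a per-pixel operation that a shader handles naturally.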
Audio tracks, however, work quite differently, because you can't directly mix the two CMSampleBufferRef outputs of the audio tracks. How about using AVMutableComposition in GPUImageMovieWriter to mix the two audio tracks, and feeding the result to the asset writer as the audio source for the final output video?
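For anyone wondering what "mixing" would actually involve, here is a minimal sketch in plain Python, assuming both tracks are already decoded to 16-bit signed PCM at the same sample rate and channel layout. The function name and the gain parameters are hypothetical, and this is not how GPUImage or AVFoundation implement it. It only shows that mixing means combining the decoded samples themselves.

```python
INT16_MIN = -32768
INT16_MAX = 32767

def mix_pcm16(track_a, track_b, gain_a=0.5, gain_b=0.5):
    """Mix two lists of 16-bit PCM samples into one list.

    Assumes both tracks share the same sample rate and channel layout.
    The shorter track is treated as silence once it runs out.
    """
    length = max(len(track_a), len(track_b))
    mixed = []
    for i in range(length):
        sample_a = track_a[i] if i < len(track_a) else 0
        sample_b = track_b[i] if i < len(track_b) else 0
        # Weighted sum of the two samples.
        value = int(round(sample_a * gain_a + sample_b * gain_b))
        # Clamp to the 16-bit range so loud passages clip instead of wrapping.
        mixed.append(max(INT16_MIN, min(INT16_MAX, value)))
    return mixed

if __name__ == "__main__":
    print(mix_pcm16([1000, 2000], [3000], gain_a=1.0, gain_b=1.0))
```

So the two audio outputs would have to be decoded and summed sample by sample before anything is handed to the writer, which is why letting a composition or audio mix do that work upstream seems like the simpler route.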