The only point where I think we might lose too much flexibility with Smeulf's concept is this: the focused player does not only receive the player control commands. It is also the player that provides the data for the "Currently Playing" screen, and it is the player whose playlist is currently being edited. So in Smeulf's concept, imagine you're watching a football game with the sound muted and you want to listen to music in the secondary player, which could be quite a common scenario. To add music to the music playlist, you would need to keep the music player in fullscreen the whole time you're browsing through your ML. I think that is neither acceptable nor what we want to achieve with our player picture in the background. We WANT the user to always be able to see the most interesting player in the background. And in my example scenario, you have two different interests: the visual interest is on the player showing the football game, while the control focus must be on the audio player...
In my concept, you keep a way to focus any player manually... It's "just" a kind of priority for choosing the next focused player. If a player is fullscreen, it gains the focus, but this doesn't mean the other player needs to be fullscreen to be focused by hand... If a new player is started in fullscreen, or if a player goes from PIP to fullscreen, it's auto-focused, and you have to switch or manually focus the PIP player to gain control of it. I hope this makes it clearer.
To add the music, just focusing the music player should be sufficient, right?
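To make the rule concrete, here is a rough sketch of how I picture it. All names (`Player`, `PlayerManager`, `start`, `set_fullscreen`, `focus`) are made up for illustration and are not taken from the actual code base. I also assume that only one player is fullscreen at a time and that the very first player gets the focus when nothing else has it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Player:
    # Hypothetical player slot; "fullscreen" False means it is shown as PIP.
    name: str
    fullscreen: bool = False


@dataclass
class PlayerManager:
    # Hypothetical manager that only tracks fullscreen state and focus.
    players: List[Player] = field(default_factory=list)
    focused: Optional[Player] = None

    def start(self, player: Player, fullscreen: bool) -> None:
        """Register a new player; starting in fullscreen auto-focuses it."""
        self.players.append(player)
        if fullscreen:
            self.set_fullscreen(player)
        elif self.focused is None:
            # Assumption: with no focused player yet, the new one gets focus.
            self.focused = player

    def set_fullscreen(self, player: Player) -> None:
        """Make one player fullscreen (the others become PIP) and focus it."""
        for p in self.players:
            p.fullscreen = p is player
        self.focused = player

    def focus(self, player: Player) -> None:
        """Manual focus: changes control focus only, never the layout."""
        self.focused = player


# The football / music scenario from the discussion:
manager = PlayerManager()
football = Player("football")
music = Player("music")

manager.start(football, fullscreen=True)   # football is fullscreen and focused
manager.start(music, fullscreen=False)     # music starts as PIP, focus unchanged
manager.focus(music)                       # focus the music player by hand
```

After the last call the football game is still the fullscreen picture, while the control focus (and so the playlist being edited) is on the music player.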
A user option is also a good way, I agree, FreakyJ.
Cheers.