Why not just initiate 3D acceleration when it's required, if it's even required?
Listening to MP3s and browsing pictures surely can't need 3D. I can even turn off 3D acceleration completely in Windows Media Player and play an Xvid full screen using 20% CPU.
Because many elements of the current interface require it.
The interface itself uses a pseudo-3D engine that gives us the zooming, rotating, and transparency effects. These are things the GPU already does in hardware, so they are offloaded from the CPU. Transparency in particular is a huge CPU hog, and using the GPU to perform the calculations saves a lot of processing power.
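To give a rough feel for why transparency is so expensive on the CPU, here is a minimal Python sketch of per-pixel alpha blending. This is not MP's actual code; the function names (`blend_pixel`, `blend_layer`, `blends_per_second`) are made up for illustration, and a real software blender would be far more optimized than this.

```python
def blend_pixel(src, dst, alpha):
    """Blend one RGB pixel over another.

    alpha is the opacity of src: 0.0 is invisible, 1.0 is fully opaque.
    """
    return tuple(round(s * alpha + d * (1.0 - alpha)) for s, d in zip(src, dst))


def blend_layer(src_rows, dst_rows, alpha):
    """Blend a whole layer over a background, one pixel at a time.

    Both layers are lists of rows, each row a list of (r, g, b) tuples.
    """
    return [
        [blend_pixel(s, d, alpha) for s, d in zip(src_row, dst_row)]
        for src_row, dst_row in zip(src_rows, dst_rows)
    ]


def blends_per_second(width, height, fps):
    """Number of pixel blends needed per second for one full-screen layer."""
    return width * height * fps


if __name__ == "__main__":
    # One transparent full-screen layer at 1024x768, redrawn 60 times a second.
    print(blends_per_second(1024, 768, 60))
```

Every transparent layer adds another full pass like this, with a few multiplies and adds per colour channel for each pixel. The GPU does the same arithmetic in dedicated hardware, which is the saving I am talking about.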
MP also uses VMR9 rendering for displaying video. VMR9 allows video to be rendered onto a 3D plane and enables hardware features like bilinear and trilinear filtering by default for better picture quality. When you use MP's aspect functions while watching a video, MP is actually manipulating a 3D plane to display the desired aspect ratio.
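As a rough illustration of the aspect ratio idea, the sketch below works out how big the video plane would need to be on screen for a few common modes. It is only a sketch under my own assumptions: the function `fit_quad` and the mode names `"letterbox"`, `"zoom"` and `"stretch"` are hypothetical and are not MP's real API or its actual mode list.

```python
def fit_quad(video_aspect, screen_w, screen_h, mode="letterbox"):
    """Return the (width, height) of the video plane in screen pixels.

    video_aspect is width / height of the source video, e.g. 16 / 9.
    Mode names here are hypothetical, chosen only for this example.
    """
    screen_aspect = screen_w / screen_h

    if mode == "stretch":
        # Ignore the video's aspect ratio and fill the whole screen.
        return float(screen_w), float(screen_h)

    if mode == "letterbox":
        # Whole picture visible: match the screen width when the video is wider.
        match_width = video_aspect >= screen_aspect
    elif mode == "zoom":
        # Fill the screen and let the excess fall off the edges.
        match_width = video_aspect <= screen_aspect
    else:
        raise ValueError(f"unknown mode: {mode}")

    if match_width:
        return float(screen_w), screen_w / video_aspect
    return screen_h * video_aspect, float(screen_h)


if __name__ == "__main__":
    # A 16:9 video on a 1024x768 (4:3) display.
    for mode in ("letterbox", "zoom", "stretch"):
        print(mode, fit_quad(16 / 9, 1024, 768, mode))
```

Once the plane's size is known, changing the aspect mode is just a matter of resizing one textured rectangle, and the hardware filtering takes care of scaling the picture.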
One thing I would really like to see for the future of MP is a completely 3D interface, with buttons and text that exist entirely in a 3D world. Panning, zooming, rotations, etc. would all be smooth and seamless. With the 3D foundation already in place, it's just a natural progression for us to implement this.