Read up on Amdahl's Law. In short, the law says that each additional thread you throw at the parallelizable part of a workload gives you a smaller performance boost (speedup) than the one before. So going from a single thread to two threads might give you a 50% performance boost. Moving up to three threads might give you another 20%, but going to four threads might give you next to nothing. Total computation time can never be lower than the time it takes to finish the sequential portion of the workload.
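To put rough numbers on that, here is a minimal sketch of the Amdahl's Law formula, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction and n is the thread count. The 60% figure is purely an assumption picked for illustration, not a measurement from any real game.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Overall speedup predicted by Amdahl's Law for a fixed-size task."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


# Assumed for illustration only: 60% of the work can be parallelized.
P = 0.6

for n in range(1, 5):
    print(f"{n} thread(s): {amdahl_speedup(P, n):.2f}x")

# Upper bound as the thread count grows without limit: 1 / serial fraction.
print(f"limit: {1.0 / (1.0 - P):.2f}x")
```

With that assumed 60%, the gains work out to roughly 43% for the second thread, 17% for the third and 9% for the fourth, and the total can never get past 2.5x no matter how many threads you add.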
The problem is, a game consists mostly of sequential workload (very little parallel workload), which CANNOT be multithreaded. Think about it: in gaming you use a controller to give constant input/feedback. The computer uses those inputs to calculate and deliver the output (what you see on the TV or screen). This is a sequential workload: what happens next is unknown until the input is received. It is a dependent workload and cannot be multithreaded. A workload can only be multithreaded if it does not depend on the results of other work.
A lot of games only multithread minor workloads (e.g. audio processing).
Think about it: the PS3 has 7 working SPEs (very close to 7 threads), and one is reserved for the background OS. So a game only has access to 6 SPEs. Did you see any game take advantage of that? Did you see a push for more threads? No.
People need to realize that gaming will never be as parallel as, say, video encoding (the computer encodes the video from multiple timestamps at once, then puts the pieces together). There will be games that can take advantage of more than 4 threads, but those will be few.
It'll only get harder and harder to multithread (you'll only see fewer and fewer heavily multithreaded games).
While Amdahl's Law is certainly true (the math behind it is actually fairly trivial), I'm not sure it applies in this case in the way you describe.
Amdahl's Law concerns the total compute time for a given task. It follows that, as you increase the number of threads used by the parallelizable portions of the task, the performance increase (percentage-wise) over the previous case with one less thread quickly diminishes, because the sequential portion of the code takes so long in comparison.
In regards to games, this may mean that higher frame rates (above, say, 60Hz or 120Hz) may not be achievable through parallelization: there will always be that part of the code that holds the rest of it back. What is possible, however, is to perform more calculations per frame, given that said calculations are parallelizable. In fact, you can use Amdahl's Law to show that, given enough threads, you can cram an absurdly large amount of computation into a task (assuming said calculations are parallelizable) while having very little effect on total compute time.
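A quick back-of-the-envelope sketch of that point (this scaled-workload way of looking at it is usually credited to Gustafson's Law). The 10 ms serial and 6 ms parallel figures are made up for illustration, and the model is idealized: it ignores synchronization overhead and assumes the parallel work splits perfectly evenly.

```python
def frame_time_ms(serial_ms: float, parallel_ms: float, threads: int) -> float:
    """Idealized frame time: serial part plus evenly split parallel part."""
    return serial_ms + parallel_ms / threads


# Assumed numbers: 10 ms of strictly sequential work per frame,
# plus 6 ms of parallelizable work when run on a single thread.
SERIAL_MS = 10.0
PARALLEL_MS = 6.0

one_thread = frame_time_ms(SERIAL_MS, PARALLEL_MS, 1)

# Eight times as much parallel work, spread over eight threads.
eight_threads = frame_time_ms(SERIAL_MS, PARALLEL_MS * 8, 8)

print(f"1 thread, 1x work: {one_thread:.1f} ms")
print(f"8 threads, 8x work: {eight_threads:.1f} ms")

# The serial part alone still caps the frame rate, however many threads exist.
print(f"frame rate ceiling: {1000.0 / SERIAL_MS:.0f} fps")
```

Under these assumptions both cases come out at 16 ms per frame, so the eight-thread version does eight times the parallel work for free, while the frame rate ceiling set by the serial part stays at 100 fps.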
What can be parallelized in games? Well, to start with, AI is very easy to parallelize. Note that this is one of the specific points they made about CoD, regardless of how much public mocking the statement received. So, for zombie games, way more zombies in play at once (whatever that new zombie game is). More complex actions and responses from the AI in racing games (Forza) and sports games (the new EA games). RTS games would also benefit a lot from this (look at how much of an issue single-thread bottlenecking is in StarCraft II, and how long AI decisions take in Civ V because they are done sequentially).
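As a rough sketch of why AI tends to split up nicely: if every agent decides its next action from a read-only snapshot of the world, no agent has to wait on another. Everything here (`Zombie`, `decide`, `decide_all`, the one-dimensional world) is hypothetical and only meant to show the structure. Note that in CPython the GIL keeps threads from speeding up CPU-bound work like this, so a real engine would do the same thing with native threads or a job system.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Zombie:
    """Hypothetical AI agent; frozen so worker threads can only read it."""
    ident: int
    x: float


def decide(zombie: Zombie, player_x: float) -> str:
    """Pick an action using only this agent's state and the shared snapshot."""
    if zombie.x < player_x:
        return "move_right"
    if zombie.x > player_x:
        return "move_left"
    return "attack"


def decide_all(zombies, player_x: float, workers: int = 4):
    """Run every agent's decision on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda z: decide(z, player_x), zombies))


horde = [Zombie(0, -1.0), Zombie(1, 3.0), Zombie(2, 0.0)]
print(decide_all(horde, player_x=0.0))
```

The decisions are independent because they only read the previous frame's state. Applying the chosen actions to the world is the part that still has to be coordinated.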
Complex physics engines can also be easily parallelized (although not as easily as AI, I would expect). Everybody likes physics. Each object in the game could have its own thread dedicated to calculating how it will respond to the environment in the next frame.
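Here is a minimal sketch of that idea, with one change that is my own assumption rather than what the post describes: instead of a literal thread per object, the objects are handed to a small fixed pool of workers, since spawning thousands of threads would cost more than it saves. `Body`, `step` and `step_world` are hypothetical names, and the "physics" is just gravity plus a bounce off the ground, with no object-to-object collisions (those are where the real coordination headaches start). As with any CPU-bound Python threading, the GIL means this shows the structure only, not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

GRAVITY = -9.81  # m/s^2, acting along the y axis


@dataclass(frozen=True)
class Body:
    """Hypothetical physics object: height and vertical velocity only."""
    y: float
    vy: float


def step(body: Body, dt: float) -> Body:
    """Advance one body by dt seconds; depends on no other body."""
    vy = body.vy + GRAVITY * dt
    y = body.y + vy * dt
    if y < 0.0:
        # Hit the ground at y = 0: bounce back up with half the speed.
        y, vy = 0.0, -vy * 0.5
    return Body(y, vy)


def step_world(bodies, dt: float, workers: int = 4):
    """Compute the next frame's state for every body on a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda b: step(b, dt), bodies))


world = [Body(10.0, 0.0), Body(0.0, -1.0)]
print(step_world(world, dt=0.1))
```

Each call returns a new list rather than modifying the old one, so every worker reads the current frame and writes only its own piece of the next frame.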
Basically, as you also stated, the only reason things aren't more heavily parallelized right now is that it's a total pain in the ass to do (from a software engineer's viewpoint, at least), and nobody has developed any sort of SDK to make it any easier. This is what I think will have the largest influence on the success of the next gen consoles, and will cause one or the other to come out on top.
Developers also had the excuse (up until now, at least) that they were developing for the lowest common denominator in the market. As such, a lot of time spent working on parallelization would be completely wasted in many cases. So, when basing their business decisions on this, they decided against wasting time on that and put work in elsewhere (fixing bugs, adding features).
So, basically, while it's not as easy to multithread games as people say, it's also not that hard, either. Now that the next gen systems both have 8 cores (fuck you, 3-core WiiU), I'm hoping that some of the previous excuses won't apply anymore, and that the power of the ecosystem will force the software to evolve. Only time will tell, though.