"memory segmentation and memory protection" might make this a bit harder if you want to do more embedded hardware type things. Without that requirement there are any number of development boards with ARM processors of various stripes, usually the older cores of the sort that appeared in the GBA or DS rather than the more modern stuff seen in Android type devices (and I guess the 3DS as well). On said boards you could stick anything you like.
Intel did release a sort of embedded version of x86 which would have all that, but they are a bit fiddly.
http://uk.mouser.com/Semiconductors...-6hpef?Keyword=143236997&FS=True&Ntk=P_MarCom
http://www.intel.com/content/www/us...-which-intel-processor-fits-your-project.html
http://www.intel.com/content/www/us/en/do-it-yourself/edison.html
http://www.intel.co.uk/content/www/uk/en/do-it-yourself/galileo-maker-quark-board.html
AMD do have some stuff as well
http://www.amd.com/en-us/products/embedded/processors/lx/geode-lx-db800
"good CPU for a homebrew computer"
I kind of went there above, but just to say it: that will depend entirely upon what you want to do. Such things run from microcontrollers, through FPGAs, into ARM and traditional embedded processors (a massive field in itself), all before landing on things that basically are x86/x64 desktop processors. Power requirements, computing ability, cost, ease of coding and more all vary within each of those. Unless you make a clone, or very nearly a clone, of something else then chances are you will be doing all the software for it yourself, and that means you will probably not have a lot of it unless you want to go 24/7 on it. On "memory segmentation and memory protection" I should also mention you can do quite a bit of this in software --
http://www.uclinux.org/ being a port of Linux aimed at processors without a memory management unit (MMU).
Multiple processors is at once something special and nothing special. The trick is getting them to talk to each other in a useful way: if you only have them tied together by something with high latency and low throughput it will be a multi-processor system, but depending upon what computing task you are doing it might not be any good. If they need each other's results then that is not good; if one just needs to say "yes it is complete" then that is different. Amdahl's law is more for parallel computing/supercomputing but the underlying logic will be the same as what you face.
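
To put rough numbers on the Amdahl's law point, here is a minimal sketch in Python. The function name `amdahl_speedup` and its parameters are made up for illustration, and it only covers the textbook form of the law (speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of processors). It does not model the latency or throughput of whatever links the processors, which would only make the real figures worse.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Ideal speedup from Amdahl's law.

    parallel_fraction: share of the work (0.0 to 1.0) that can be split
                       across processors; the rest stays serial.
    processors:        number of processors sharing the parallel part.
    Both names are illustrative, not taken from any library.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many processors exist;
    # only the parallel part shrinks as processors are added.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Assumed workload: 90% of the job can be done in parallel.
    for n in (1, 2, 4, 16, 256):
        print(n, "processors:", round(amdahl_speedup(0.9, n), 2), "x")
```

With that assumed 90% parallel workload, 16 processors only get you 6.4x, and no number of processors gets you past 10x, because the serial 10% never gets any faster. Tasks where the processors keep waiting on each other's results behave like a workload with a large serial fraction.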