OK, I finally managed to find those physical address changes and successfully ported HBL/Loadiine to 4.0.x. I'll give a brief explanation of my investigation, since it might help someone port HBL to even earlier firmwares.
So first of all, look at this page:
http://wiiubrew.org/wiki/Physical_Memory HBL uses the physical addresses of root.rpx and loader + coreinit.rpl and relies on the fact that they are always the same. But that's only true for 4.1.0+ firmwares.
The best tool for experimenting with these things is rpc from libwiiu. It lets us call OSEffectiveToPhysical from a Python shell, which makes experiments with addresses simple, since we don't need to recompile the app to test other addresses. The other great thing about this tool is that it doesn't require kernel access, which makes the work much faster.
So, first of all, we need to confirm that the physical layout actually changed. Go here
http://wiiubrew.org/wiki/Cafe_OS and look through the virtual mapping table. The virtual range 0x01000000 - 0x01800000 is always mapped to loader and coreinit. So we simply call OSEffectiveToPhysical(0x01000000), and on 4.0.x we get 0x4D000000 (it's 0x32000000 on 4.1+, so that's a pretty big difference).
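To make the difference concrete, here's a rough Python sketch that translates an address inside the loader/coreinit window for each firmware. Only the base address was actually checked with OSEffectiveToPhysical; the assumption that the whole window is mapped linearly is mine, and the names LOADER_PHYS_BASE and loader_virt_to_phys are made up for illustration.

```python
# Virtual window that is mapped to loader + coreinit (from the Cafe OS page).
LOADER_VIRT_BASE = 0x01000000
LOADER_VIRT_END = 0x01800000  # exclusive

# Results of OSEffectiveToPhysical(0x01000000) per firmware family.
LOADER_PHYS_BASE = {
    "4.0.x": 0x4D000000,
    "4.1+": 0x32000000,
}

def loader_virt_to_phys(vaddr, firmware):
    """Translate an address in the window, ASSUMING a linear mapping."""
    if not LOADER_VIRT_BASE <= vaddr < LOADER_VIRT_END:
        raise ValueError("address outside the loader/coreinit window")
    return LOADER_PHYS_BASE[firmware] + (vaddr - LOADER_VIRT_BASE)

# How far the area moved between the two layouts.
shift = LOADER_PHYS_BASE["4.0.x"] - LOADER_PHYS_BASE["4.1+"]
print(hex(loader_virt_to_phys(0x01000000, "4.0.x")), hex(shift))
```

So any hardcoded physical address that points into this area on 4.1+ would need to be shifted by that amount on 4.0.x, if the linear assumption holds.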
But I still needed to find root.rpx, since HBL uses the free space reserved for root.rpx to load ELFs/RPLs. My first idea was really horrible. I decided to map the physical memory I was interested in to 0xA0000000 and then dump it using rpc. The problem was that the dumps were inconsistent. Occasionally the memory that was reserved for coreinit came back as a dump full of zeroes or garbage. On top of that, dumping is very slow. So I wasted several days with zero results.
So I decided to be smarter. I found an interesting virtual range: 0x10000000 - 0x50000000. It seems to be mapped to different parts of the MEM2 region depending on the RAMPID of the running process (all of this is described on the two pages linked above). The problem here is that we run our rpc from the browser, so we can only find out the address of the "Background app" area (according to the Cafe OS page). But that's at least something. So we call OSEffectiveToPhysical(0x10000000) and get 0x30000000 (the address of root.rpx on 4.1+, lol).
Then I made a desperate move. I knew for sure that the mapping for static regions, like 0x01000000 to coreinit and that 0xA0000000 stuff, is specified in a table in the kernel, and we know the addresses of those tables (since the kernel exploit updates this table to change the mapping of 0xA0000000). So I was pretty sure there had to be a similar table for 0x10000000 that specifies where that area should be mapped depending on RAMPID. So I simply searched the kernel for the value 0x30000000, and to my surprise there were only two occurrences of that value, and one of them was the actual map table.
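The search itself is trivial. Here's a minimal sketch of how it could be done over a kernel dump in Python. It assumes the value is stored as a word-aligned big-endian 32-bit word (the Wii U's PowerPC cores are big-endian); find_u32 and fake_dump are hypothetical names, and the buffer below is synthetic, not real kernel data.

```python
import struct

def find_u32(dump, value, step=4):
    """Return the offsets of every aligned big-endian 32-bit word equal to value."""
    needle = struct.pack(">I", value)
    hits = []
    for off in range(0, len(dump) - 3, step):
        if dump[off:off + 4] == needle:
            hits.append(off)
    return hits

# Synthetic stand-in for a kernel dump: four 32-bit words.
fake_dump = struct.pack(">4I", 0x1, 0x30000000, 0x2000000, 0xDEADBEEF)
print([hex(off) for off in find_u32(fake_dump, 0x30000000)])
```

On a real dump you would then look at the words around each hit to see which one sits inside something that looks like a table.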
So now let's look at this table and see what happened to the physical layout. First of all, the table seems to consist of 3-field tuples (RAMPID, PhysicalAddressStart, RangeLength). And there are two very similar tables, one right after the other. The only difference between them is the last entry, which is probably the difference in memory size for devkits. Next, let's look at the actual tables for different firmwares. I give (RAMPID, PhysicalAddressStart, RangeLength), and for the last entry the alternative value from the second table in parentheses.
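If you want to decode such a table from a dump instead of reading it by eye, something like the sketch below should do. Note the assumptions: I'm treating every field as a big-endian 32-bit word and the entries as packed back to back with no padding, which I haven't stated above as a verified fact. parse_map_table and the sample bytes are made up for illustration.

```python
import struct

ENTRY_SIZE = 12  # three 32-bit fields per entry (assumed layout)

def parse_map_table(blob, count, offset=0):
    """Decode `count` consecutive (RAMPID, start, length) entries."""
    entries = []
    for i in range(count):
        rampid, start, length = struct.unpack_from(">3I", blob, offset + i * ENTRY_SIZE)
        entries.append((rampid, start, length))
    return entries

# Synthetic sample holding the first two 5.3.2 entries.
sample = struct.pack(">6I", 5, 0x28000000, 0x8000000, 1, 0x30000000, 0x2000000)
for rampid, start, length in parse_map_table(sample, 2):
    print(rampid, hex(start), hex(length))
```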
Let's start with 5.3.2.
5, 0x28000000, 0x8000000 - Home Menu
1, 0x30000000, 0x2000000 - root.rpx
6, 0x33000000, 0x1000000 - Error display
4, 0x34000000, 0x1C000000 - Background app memory
7, 0x50000000, 0x80000000 (7, 0x50000000, 0x40000000) - Foreground app memory
Nothing special here. Everything is the same as on the Physical Memory page, so we can use the names from there to get a better understanding of who is who. The only range that is not in the table is 0x32000000, 0x1000000 - loader + coreinit.
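You can see that missing range directly by looking for holes between consecutive entries. A small sketch using the 5.3.2 values listed above (TABLE_532 and find_gaps are just names I picked):

```python
# (RAMPID, PhysicalAddressStart, RangeLength) for 5.3.2, first table.
TABLE_532 = [
    (5, 0x28000000, 0x8000000),   # Home Menu
    (1, 0x30000000, 0x2000000),   # root.rpx
    (6, 0x33000000, 0x1000000),   # Error display
    (4, 0x34000000, 0x1C000000),  # Background app memory
    (7, 0x50000000, 0x80000000),  # Foreground app memory
]

def find_gaps(table):
    """Return (start, length) for every hole between consecutive entries."""
    ordered = sorted(table, key=lambda entry: entry[1])
    gaps = []
    for (_, start, length), (_, next_start, _) in zip(ordered, ordered[1:]):
        end = start + length
        if end < next_start:
            gaps.append((end, next_start - end))
    return gaps

for start, length in find_gaps(TABLE_532):
    print(hex(start), hex(length))
```

The only hole is exactly the loader + coreinit range.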
Next, let's continue with 4.1.0.
4, 0x28000000, 0x8000000 - Home Menu
2, 0x30000000, 0x2000000 - root.rpx
6, 0x32000000, 0x1000000 - loader + coreinit.rpl
3, 0x33000000, 0x1000000 - Error display
1, 0x34000000, 0x1C000000 - Background app memory
7, 0x50000000, 0x80000000 (7, 0x50000000, 0x40000000) - Foreground app memory
As you can see, the sizes, addresses and order of the areas are the same as in 5.3.2. But there is a loader + coreinit entry here. And it seems like later firmwares changed the RAMPIDs.
And finally 4.0.0
4, 0x28000000, 0x8000000 - Home Menu
1, 0x30000000, 0x1C000000 - Background app memory
3, 0x4C000000, 0x1000000 - Error display
6, 0x4D000000, 0x1000000 - loader + coreinit.rpl
2, 0x4E000000, 0x2000000 - root.rpx
7, 0x50000000, 0x80000000 (7, 0x50000000, 0x40000000) - Foreground app memory
As you can see, even though the addresses and the order changed (the entries seem to actually be ordered by address), the RAMPIDs are the same as in 4.1.0, so we can easily identify the areas.
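Since the RAMPIDs match, pairing up the areas between 4.1.0 and 4.0.0 is just a lookup by RAMPID. Here's a short sketch with the values from the two lists above; the dict layout and the moved_regions helper are my own illustration, not anything taken from HBL.

```python
NAMES = {
    4: "Home Menu", 2: "root.rpx", 6: "loader + coreinit.rpl",
    3: "Error display", 1: "Background app memory", 7: "Foreground app memory",
}

# RAMPID -> (PhysicalAddressStart, RangeLength), first table of each firmware.
TABLE_410 = {
    4: (0x28000000, 0x8000000), 2: (0x30000000, 0x2000000),
    6: (0x32000000, 0x1000000), 3: (0x33000000, 0x1000000),
    1: (0x34000000, 0x1C000000), 7: (0x50000000, 0x80000000),
}
TABLE_400 = {
    4: (0x28000000, 0x8000000), 1: (0x30000000, 0x1C000000),
    3: (0x4C000000, 0x1000000), 6: (0x4D000000, 0x1000000),
    2: (0x4E000000, 0x2000000), 7: (0x50000000, 0x80000000),
}

def moved_regions(new, old):
    """Return the RAMPIDs whose start address differs between two tables."""
    return sorted(rampid for rampid in new if new[rampid][0] != old[rampid][0])

for rampid in moved_regions(TABLE_410, TABLE_400):
    print(NAMES[rampid], hex(TABLE_410[rampid][0]), "->", hex(TABLE_400[rampid][0]))
```

That gives the 4.0.0 addresses for root.rpx and loader + coreinit.rpl, which are the two that matter for HBL.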
Big thanks to everyone who replied to this thread. And special thanks to
@dimok for helping me in IRC.