Well, I "may" have a solution to my problem that still lets me keep the 1024x1024 texture as the output target. I guess I should let you all know the purpose of this whole thing, so people can understand what I'm trying to do and offer insight or possible ways to improve it.
Mode 7.
I'm testing ways to improve Mode 7 in blargSNES under hardware rendering. In the WIP version I pushed to GitHub long ago, Mode 7 was handled much the same way tile scanning is done for all the other backgrounds. That could get CPU-heavy depending on the scale, rotation, and mid-frame changes (for perspective Mode 7) of the layer, to the point where it would be better to run under software rendering in various scenarios.
So, my idea was to render the entire Mode 7 layer to its own texture (a fixed 1024x1024 pixel area), and then simply plot that texture down. That would make the process more static than dynamic, and it would also cover scenarios where the layer is zoomed out so far that it wraps, since a texture can handle that. Of course, as I mentioned earlier, rendering to a texture of that size is slow.
The latest solution, which I thought up the other night, builds on the method that's already required to render to a 1024x1024 texture in the first place: render smaller sections. In this case, though, don't render all of the sections in a given frame. Perhaps just half or a quarter of them, depending on whether a section needs to be updated. The sections that don't get updated will just show what was retained from their last update, regardless of any color, VRAM, or layer changes made since. I'd imagine most people would not notice it except when Mode 7 is first used, as the texture would likely be blank, but that depends on the situation.
I'm thinking that changes to the color palette would flag all 4 sections as needing an update (but they would not all be updated in the same frame), changes to tile cels would flag only the sections that specific tile cel is used in, and changes to individual tiles on the layer would flag only the one section the change was made in.
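The flagging rules above could be tracked with something like the following. This is only a rough sketch, not what blargSNES does: it assumes the four sections are 512x512 quadrants (64x64 tiles each) of the 128x128 Mode 7 map, and every name here (`Mode7Cache`, `m7_map_write`, `m7_pick_sections`, and so on) is made up for illustration. A per-section use count for each of the 256 tiles is what lets a tile cel change flag only the sections that actually use that cel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define M7_MAP_TILES     128   /* Mode 7 tilemap is 128x128 tiles (1024x1024 px) */
#define M7_SECTION_TILES 64    /* assumption: four 512x512 px quadrants */
#define M7_NUM_SECTIONS  4
#define M7_ALL_DIRTY     0x0Fu

/* Hypothetical bookkeeping structure; not taken from blargSNES. */
typedef struct {
    uint8_t  map[M7_MAP_TILES * M7_MAP_TILES];     /* tile index per map cell */
    uint16_t use_count[M7_NUM_SECTIONS][256];      /* cells per section using each tile */
    uint8_t  dirty;                                /* bit s set = section s needs a redraw */
    uint8_t  next;                                 /* round-robin start for the next frame */
} Mode7Cache;

static int m7_section_of(int x, int y) {
    return (y / M7_SECTION_TILES) * 2 + (x / M7_SECTION_TILES);
}

static void m7_init(Mode7Cache *c) {
    memset(c, 0, sizeof *c);
    /* The map starts as all tile 0, so tile 0 is used by every cell. */
    for (int s = 0; s < M7_NUM_SECTIONS; s++)
        c->use_count[s][0] = M7_SECTION_TILES * M7_SECTION_TILES;
    c->dirty = M7_ALL_DIRTY;
}

/* Palette change: every section is stale. */
static void m7_palette_changed(Mode7Cache *c) {
    c->dirty = M7_ALL_DIRTY;
}

/* Tile cel (graphics) change: flag only the sections that use that tile. */
static void m7_tile_gfx_changed(Mode7Cache *c, uint8_t tile) {
    for (int s = 0; s < M7_NUM_SECTIONS; s++)
        if (c->use_count[s][tile])
            c->dirty |= (uint8_t)(1u << s);
}

/* Tilemap write: flag only the section containing the cell. */
static void m7_map_write(Mode7Cache *c, int x, int y, uint8_t tile) {
    assert(x >= 0 && x < M7_MAP_TILES && y >= 0 && y < M7_MAP_TILES);
    uint8_t *cell = &c->map[y * M7_MAP_TILES + x];
    if (*cell == tile)
        return;
    int s = m7_section_of(x, y);
    c->use_count[s][*cell]--;
    c->use_count[s][tile]++;
    *cell = tile;
    c->dirty |= (uint8_t)(1u << s);
}

/* Pick up to `budget` dirty sections for this frame, round-robin so that
   no section starves. Clears their dirty bits and returns how many were picked. */
static int m7_pick_sections(Mode7Cache *c, int budget, int out[M7_NUM_SECTIONS]) {
    int n = 0;
    for (int i = 0; i < M7_NUM_SECTIONS && n < budget; i++) {
        int s = (c->next + i) % M7_NUM_SECTIONS;
        if (c->dirty & (1u << s)) {
            c->dirty = (uint8_t)(c->dirty & ~(1u << s));
            out[n++] = s;
        }
    }
    if (n > 0)
        c->next = (uint8_t)((out[n - 1] + 1) % M7_NUM_SECTIONS);
    return n;
}
```

Each frame, the renderer would call `m7_pick_sections` with a budget of 1 or 2 and redraw only the sections it returns. Everything else keeps whatever was drawn into the texture last time.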
So that's where I am right now. I would, however, appreciate anything that would help with the actual rendering to the large texture, such as which GPU settings would give the fastest processing. I'm simply rendering a pre-made list (with the polygon coordinates already set, like <0,0>, <8,0>, <24,16>, etc.) where the only thing that changes is which "tiles" are taken from the source texture (which holds 256 8x8 pixel tiles). I figured that no culling, no depth processing, etc. would help, but I may be missing some other things.
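To illustrate the "only the source tile changes" part, here is a rough sketch of refreshing just the texture coordinates while the position list stays untouched. It assumes the 256 tiles are laid out 16x16 in a 128x128 source texture, coordinates in texels with the origin at the top left, and four corners per quad in top-left, top-right, bottom-left, bottom-right order. All of that is guesswork for illustration, as are the names `TexCoord`, `m7_tile_texcoords`, and `m7_fill_texcoords`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SRC_TILES_PER_ROW 16   /* assumption: 256 tiles arranged 16x16 (128x128 px) */
#define TILE_PX           8

/* Hypothetical vertex attribute: a texel coordinate in the source texture. */
typedef struct { int16_t u, v; } TexCoord;

/* Write the four corner texcoords for one tile's quad. */
static void m7_tile_texcoords(uint8_t tile, TexCoord out[4]) {
    int16_t u0 = (int16_t)((tile % SRC_TILES_PER_ROW) * TILE_PX);
    int16_t v0 = (int16_t)((tile / SRC_TILES_PER_ROW) * TILE_PX);
    int16_t u1 = (int16_t)(u0 + TILE_PX);
    int16_t v1 = (int16_t)(v0 + TILE_PX);
    out[0] = (TexCoord){ u0, v0 };   /* top-left */
    out[1] = (TexCoord){ u1, v0 };   /* top-right */
    out[2] = (TexCoord){ u0, v1 };   /* bottom-left */
    out[3] = (TexCoord){ u1, v1 };   /* bottom-right */
}

/* Refresh texcoords for a run of map cells; the pre-made positions are not touched. */
static void m7_fill_texcoords(const uint8_t *tiles, size_t count, TexCoord *out) {
    assert(tiles != NULL && out != NULL);
    for (size_t i = 0; i < count; i++)
        m7_tile_texcoords(tiles[i], &out[i * 4]);
}
```

Keeping positions and texcoords in separate arrays like this would mean a section update only rewrites the smaller texcoord buffer, though whether that is actually faster depends on how the GPU fetches the attributes.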