So I've just spent a good part of an hour messing with my accidentally, newly acquired O2DS. Originally I was going to install some seeds with FBI, so I clicked the icon, then proceeded to OCR the screen with my phone to figure out what I was doing, because none of these apps have sound cues, and some have interfaces that are confusing to a blind person like me.
The OCR was a bit shit, so I moved on to Be My AI, which uses OpenAI's GPT-4 Vision as its backend, and got super good results. I even managed to start installing the seed for Rhythm Paradise Megamix, but it looks like it wasn't found, because I got an error.
I then proceeded to play with the Settings app, and even got through the account-linking portion, which led me to Mii Maker, where I partially succeeded in making a Mii.
There were lots of hiccups, though. For example, lots of unnecessary information about my surroundings was being read out, since I was shooting photos of the screen rather than getting a direct image from it. Sometimes I would shoot the photo incorrectly and would have to wait another 10 seconds for it to scan the image and give me the results.
This got me thinking about an idea I'd had before acquiring this console a couple of days ago: a script which, when triggered with a controller keybind (like the Rosalina menu, as an example), would scan the current screen and then report back using a speech synthesizer such as eSpeak or Flite. This would avoid the need to have my phone in my other hand basically at all times, as well as making some games a lot more playable.
Would this even be technically possible? I know it wouldn't work under AGB_FIRM or TWL_FIRM, since those don't run in 3DS mode, but it would still save me a lot of frustration whenever I think about doing something and then go, "Shit, guess I have to find a sighted person again!"
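Just to make the idea concrete, here's the kind of flow I'm imagining, written as C-style pseudocode. To be clear, every function name below is made up by me; none of these are real Luma3DS, Rosalina, or libctru calls, it's just the shape of the thing:

```
/* Pure sketch -- all of these functions are hypothetical placeholders,
 * not real homebrew APIs. The idea is a Rosalina-style hook that fires
 * on a button combo. */
void on_screenreader_hotkey(void)
{
    /* 1. Grab a copy of the top-screen framebuffer (400x240 pixels) */
    uint8_t *pixels = capture_top_framebuffer();        /* hypothetical */

    /* 2. Run an on-device OCR pass over the captured pixels */
    char text[1024];
    run_ocr(pixels, 400, 240, text, sizeof(text));      /* hypothetical */

    /* 3. Speak the recognized text with an embedded synth
     *    such as eSpeak or Flite */
    speak_text(text);                                   /* hypothetical */
}
```

The hard parts, as far as I can guess, would be steps 2 and 3: fitting an OCR engine and a speech synthesizer into the console's memory and CPU budget, not the framebuffer capture itself.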
If someone would be willing to attempt this, you would have my eternal gratitude, and I'd even be able to pay you for making such a tool! I'm not knowledgeable in C at all, and given that I found learning Python somewhat difficult, I don't imagine C will be a walk in the park in comparison, lol.