Potential homebrew request and technical questions, GPT Vision and Espeak for O3DS and N3DS systems. Is this even possible?

Mudb0y

Member
OP
Newcomer
Joined
Feb 21, 2024
Messages
22
Trophies
0
Age
18
XP
64
Country
Poland
So I've just spent a good part of an hour messing with my accidentally newly acquired O2DS. Originally I was going to install some seeds with FBI so I click the icon, then proceed to OCR the screen to know what I'm doing because none of these apps have any sound cues and sometimes confusing interfaces to a blind person like me.
The OCR was a bit shit so I moved on to Be My AI, which uses OpenAI's GPT4 Vision as it's backend and managed to get super good results. I even did manage to get to installing the seed for Rhythm Paradise Megamix but looks like they weren't found because I got an error.
I then proceeded to play with the settings app, and even got to getting through the account linking portion which led me to Mii Maker in which I partially succeeded in making a mii.
There were lots of hickups though, for example lots of unnecessary information was being red out about my surroundings given I was shooting photos of the screen and not getting a direct image from it. Sometimes I would shoot the photo incorrectly and would have to wait another 10 seconds for it to scan the image and give me the results.
This got me thinking about an idea I had before acquiring this console a couple days ago of a script which when trigger with a controller keybind, like the Rosalina menu as an example, would scan the current screen, then report back using a speech synthesizer such as Espeak or Flite. This would avoid the need for having my phone in my other hand basically at all times, as well as making some games a lot more playable.
Would this even technically be possible? I know it wouldn't work in AGB firm or TWL firm because these don't work in 3DS mode, but it would still give me a lot less frustration when I think about doing something then go "Shit, guess I have to find a sighted person again!"
If someone would be willing to attempt this, you would have my eternal grattitude and I would even be able to pay you for making such a tool! I'm not knowledgeable in C at all and given I found learning Python somewhat dificult, I don't imagine C will be a walk in the park in comparison lol.
 

ack

Well-Known Member
Member
Joined
Jan 30, 2020
Messages
285
Trophies
0
XP
638
Country
United States
probably not, I doubt the 3ds is powerful enough to do OCR that is going to be useful for you. You could probably send a screenshot off to a server somewhere and have it send sound back but you'd have to implement WiFi and all that in luma. I think the best solution would be to get a 3ds with a capture card mod and then have OCR running on your computer for the feed it's being sent, and then have a script that runs the OCR and says the output when you press a key.
 
  • Like
Reactions: Deepdive543443

Deepdive543443

New Member
Newbie
Joined
Oct 25, 2023
Messages
3
Trophies
0
Age
23
XP
40
Country
China
From my previous experience on porting vision models to 3DS, models like OCR and Object Detection usually takes time and memory. Only a few extremely light-weight models will works. With operating system and gaming running in background, resource management will be a challenging task. Streaming 3DS graphic output to PC and have OCR and others running on PC would be a better approach
 

Mudb0y

Member
OP
Newcomer
Joined
Feb 21, 2024
Messages
22
Trophies
0
Age
18
XP
64
Country
Poland
From my previous experience on porting vision models to 3DS, models like OCR and Object Detection usually takes time and memory. Only a few extremely light-weight models will works. With operating system and gaming running in background, resource management will be a challenging task. Streaming 3DS graphic output to PC and have OCR and others running on PC would be a better approach
My idea was to only take the screenshot on the console's end, the OCR would be done by GPT Vision and it would simply send back the result as speech but as @ack mentioned this might not be possible, I wasn't aware Luma doesn't have wi-fi capabilities. I still wish some apps had accessibility though, in cases like FBI it's possible to navigate them without it mostly fine to install CIAs but then you get to apps like Universal Updator which are basically unusable when you're blind.
 

Mudb0y

Member
OP
Newcomer
Joined
Feb 21, 2024
Messages
22
Trophies
0
Age
18
XP
64
Country
Poland
The question remains: why not just stream everything to your PC using Snickerstream, and run whatever OCR program you like on your PC?
I was going to do this but you can't do that with the OG 3DS systems, and I was curious if a solution that was more portable than that was possible.
 

Site & Scene News

Popular threads in this forum

General chit-chat
Help Users
  • No one is chatting at the moment.
    K3Nv2 @ K3Nv2: Lol rappers still promoting crypto