Stop Copy-Pasting from Images: Build a Universal Screen Translator with Python

Lingo-Live started with a frustration I’m sure you’ve felt too.
Have you ever tried copying text from a YouTube video?
Or translating a Japanese error message inside a game?
Yeah. You can’t.
Because it’s not text — it’s just pixels.
Most of us end up doing one of two things:
painfully typing everything by hand, or
pulling out our phones and using Google Lens, holding it up to the screen like it’s 2010.
It’s clunky. It breaks focus. And honestly, we can do better.
So I built Lingo-Live — a sleek desktop app that lets you translate anything you see on your screen instantly.
The Superpower We Wanted
I didn’t want just another translation app. I wanted something that felt like a superpower.
That meant it had to be:
Invisible – runs quietly in the background
Instant – hit a hotkey, select an area, get a translation
Modern – glassy UI, dark mode, blur effects, no Windows-95 vibes
Press Ctrl + Alt + T, drag over any part of your screen, and boom — translated text appears on top of whatever you’re doing.
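If you're curious what a global hotkey like that can look like in code, here is a minimal sketch. It assumes the pynput library, which may not be what Lingo-Live actually uses, and the helper names `to_pynput_hotkey` and `listen_for_hotkey` are made up for illustration.

```python
def to_pynput_hotkey(combo: str) -> str:
    """Turn a readable combo like 'Ctrl + Alt + T' into pynput's '<ctrl>+<alt>+t'."""
    parts = [p.strip().lower() for p in combo.split("+") if p.strip()]
    if not parts:
        raise ValueError("empty hotkey")
    # pynput writes named keys (ctrl, alt, f5, ...) in angle brackets
    # and single characters as-is.
    return "+".join(f"<{p}>" if len(p) > 1 else p for p in parts)


def listen_for_hotkey(combo: str, on_trigger) -> None:
    """Block the calling thread and run on_trigger each time combo is pressed."""
    from pynput import keyboard  # assumption: pip install pynput

    with keyboard.GlobalHotKeys({to_pynput_hotkey(combo): on_trigger}) as listener:
        listener.join()
```

Calling `listen_for_hotkey("Ctrl + Alt + T", start_selection)` from a background thread would keep the listener alive while the UI runs, where `start_selection` is whatever function opens the region picker.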
The Secret Sauce: How Lingo-Live Works
Python made this possible. It’s basically a Swiss Army knife for building tools like this.
Here’s how everything comes together.
1. The “Glass” Overlay
The trickiest part was creating a window that stays on top without being annoying.
I used CustomTkinter to build a frameless, translucent overlay that feels light and modern.
Key details:
Always on top so translations stay visible
Semi-transparent so you can still see context underneath
Frameless — no ugly title bar; custom drag-and-drop instead
The result feels less like an app and more like a layer on your desktop.
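Those three details map onto a handful of standard Tk window calls. The sketch below is one way to do it, not the actual Lingo-Live source: the function names, the 0.85 opacity, and the window size are all assumptions, and transparency behaves differently across platforms (on Linux it usually needs a compositor).

```python
def apply_overlay_style(window, alpha: float = 0.85) -> None:
    """Make a Tk-style window frameless, always on top, and semi-transparent."""
    window.overrideredirect(True)        # drop the native title bar and borders
    window.attributes("-topmost", True)  # keep the window above other windows
    window.attributes("-alpha", alpha)   # 0.0 is invisible, 1.0 is opaque


def dragged_geometry(pointer_x: int, pointer_y: int, grab_dx: int, grab_dy: int) -> str:
    """Geometry string that keeps the grabbed point under the pointer while dragging."""
    return f"+{pointer_x - grab_dx}+{pointer_y - grab_dy}"


def demo() -> None:
    """Open a small overlay. Needs a display and `pip install customtkinter`."""
    import customtkinter as ctk

    app = ctk.CTk()
    app.geometry("360x120+200+200")
    apply_overlay_style(app)
    ctk.CTkLabel(app, text="Translated text goes here").pack(expand=True, fill="both")

    grab = {"dx": 0, "dy": 0}

    def on_press(event):
        # Remember where inside the window the user grabbed it
        grab["dx"], grab["dy"] = event.x, event.y

    def on_drag(event):
        app.geometry(dragged_geometry(event.x_root, event.y_root, grab["dx"], grab["dy"]))

    app.bind("<ButtonPress-1>", on_press)
    app.bind("<B1-Motion>", on_drag)
    app.bind("<Escape>", lambda _event: app.destroy())
    app.mainloop()
```

Because a frameless window has no title bar to grab, the drag handlers do the moving themselves: they remember where the click landed and reposition the window as the pointer moves. Run `demo()` to try it.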
2. The Eyes (OCR)
When you trigger the hotkey, Lingo-Live doesn’t try to “read the screen.”
Instead, it:
Lets you select a region
Takes a screenshot of just that area
Sends it to Tesseract OCR to extract text from the pixels
Conceptually, it looks like this:
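from PIL import ImageGrab
screenshot = ImageGrab.grab(bbox=(x1, y1, x2, y2))  # bbox is (left, top, right, bottom)
text = ocr_engine.extract_text(screenshot)  # ocr_engine: placeholder for a Tesseract wrapper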
That’s where the magic starts — turning images into actual text.
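For a more concrete picture, here is a minimal sketch of the select, capture, and OCR steps. It assumes Pillow for the screenshot and pytesseract as the Tesseract binding (the Tesseract binary and the language data you need, such as `jpn`, have to be installed separately). The function names are hypothetical, not taken from the repo.

```python
def drag_to_bbox(start, end):
    """Normalize two drag corners into (left, top, right, bottom)."""
    (x1, y1), (x2, y2) = start, end
    left, right = sorted((x1, x2))
    top, bottom = sorted((y1, y2))
    if left == right or top == bottom:
        raise ValueError("selection is empty")
    return left, top, right, bottom


def ocr_region(start, end, lang="eng"):
    """Screenshot the dragged region and return the text Tesseract finds in it."""
    from PIL import ImageGrab  # pip install pillow
    import pytesseract         # pip install pytesseract

    image = ImageGrab.grab(bbox=drag_to_bbox(start, end))
    return pytesseract.image_to_string(image, lang=lang).strip()
```

The normalization step matters because users drag in every direction: starting at the bottom-right corner and dragging up-left would otherwise produce a box with negative width.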
3. The Brain (Translation)
Once OCR gives us something like こんにちは, we need a translation that actually makes sense.
This is where Lingo.dev comes in.
Instead of raw dictionary swaps, it handles context properly, which makes a huge difference — especially for UI text, error messages, and game dialogue.
The result feels natural, not robotic.
4. The Voice (Text-to-Speech)
Sometimes you don’t want to read. You just want to hear it.
So I added Edge TTS, which uses the same high-quality voices found in Microsoft Edge.
Now Lingo-Live can read translations out loud — great for pronunciation or just staying hands-free.
“Fish are vertebrate animals that live in water…”
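Here is roughly what that can look like with the edge-tts Python package. Treat it as a sketch: the voice table and the helper names are assumptions (the voice names are real Edge neural voices, but which ones Lingo-Live uses is a guess), and synthesis needs a network connection.

```python
# Assumed defaults; swap in whichever voices you prefer.
DEFAULT_VOICES = {
    "en": "en-US-AriaNeural",
    "ja": "ja-JP-NanamiNeural",
    "es": "es-ES-ElviraNeural",
}


def pick_voice(lang_code: str, fallback: str = "en-US-AriaNeural") -> str:
    """Map a language code such as 'ja' or 'ja-JP' to a voice name."""
    return DEFAULT_VOICES.get(lang_code.split("-")[0].lower(), fallback)


async def speak_to_file(text: str, lang_code: str, path: str = "translation.mp3") -> str:
    """Synthesize text with edge-tts and write the audio to an MP3 file."""
    import edge_tts  # pip install edge-tts

    await edge_tts.Communicate(text, pick_voice(lang_code)).save(path)
    return path
```

Since `speak_to_file` is a coroutine, you would run it with something like `asyncio.run(speak_to_file("Hello there", "en"))` and then play the resulting file with the audio library of your choice.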
5. Leveling Up: AI Summarization
Full translations are great, but sometimes you just want the gist.
So I added a Summarize button powered by Google Gemini.
Here’s what happens:
The translated text is sent to Gemini
It returns a clean, one-sentence summary
You get the point instantly
Perfect for skimming foreign articles, long error messages, or RPG dialogue dumps.
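A sketch of that round trip, assuming the `google-generativeai` package: the prompt wording, the 4000-character cap, and the model name `gemini-1.5-flash` are my assumptions rather than what the app ships with, so adjust them to whatever your API key supports.

```python
def build_summary_prompt(text: str, max_chars: int = 4000) -> str:
    """Wrap translated text in an instruction asking for a one-sentence summary."""
    snippet = text.strip()[:max_chars]  # cap very long selections
    return (
        "Summarize the following text in exactly one sentence. "
        "Reply with the sentence only.\n\n"
        f"Text:\n{snippet}"
    )


def summarize(text: str, api_key: str, model_name: str = "gemini-1.5-flash") -> str:
    """Send the prompt to Gemini and return its reply."""
    import google.generativeai as genai  # pip install google-generativeai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    return model.generate_content(build_summary_prompt(text)).text.strip()
```

Keeping the prompt builder separate from the network call makes it easy to tweak the wording without touching the API code.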
6. Make It Yours: Settings That Actually Matter
I didn’t want Lingo-Live to feel rigid, so I built a full settings system backed by JSON.
You can:
Change the hotkey (Alt + Z? Sure.)
Switch themes (dark mode is the correct choice)
Pick different fonts (Roboto > Segoe UI, fight me)
Best part?
All changes apply instantly — no restarts, no reloads.
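One common way to get "apply instantly" behaviour is to let parts of the UI subscribe to setting changes. The sketch below uses only the standard library. The keys, defaults, and the `Settings` class are illustrative assumptions, not the app's real schema.

```python
import json
from pathlib import Path

# Assumed keys and defaults, for illustration only
DEFAULTS = {"hotkey": "ctrl+alt+t", "theme": "dark", "font": "Roboto"}


class Settings:
    """JSON-backed settings that notify listeners on every change."""

    def __init__(self, path):
        self.path = Path(path)
        self._listeners = []
        stored = {}
        if self.path.exists():
            try:
                stored = json.loads(self.path.read_text(encoding="utf-8"))
            except json.JSONDecodeError:
                stored = {}  # corrupt file: fall back to defaults
        if not isinstance(stored, dict):
            stored = {}
        # Defaults first, so new keys appear even in older settings files
        self.values = {**DEFAULTS, **stored}

    def subscribe(self, callback):
        """Register callback(key, value) to run whenever a setting changes."""
        self._listeners.append(callback)

    def set(self, key, value):
        """Update one setting, save to disk, and tell every listener."""
        if key not in DEFAULTS:
            raise KeyError(key)
        self.values[key] = value
        self.path.write_text(json.dumps(self.values, indent=2), encoding="utf-8")
        for callback in self._listeners:
            callback(key, value)
```

With this shape, the overlay could subscribe a callback that restyles itself when `theme` or `font` changes, and the hotkey listener could re-register when `hotkey` changes, so nothing needs a restart.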
Conclusion
Building your own tools is one of the most satisfying parts of being a developer.
Lingo-Live solves a problem I run into constantly: text that’s trapped inside images, videos, and games. Instead of working around it, I built something that feels fast, modern, and genuinely useful.
If you’ve ever rage-typed a foreign error message at 2 AM, this app is for you.
Lingo.dev makes localization feel effortless—turning a painful, error-prone task into a smooth, developer-friendly experience.
Check out the repo at https://github.com/Samar-365/lingo_live, clone the code, and stop copy-pasting from pixels.
Special thanks to @sumitsaurabh927 and @maxprilutskiy for their continuous guidance throughout the hackathon, and for giving us this great opportunity.
Happy coding!