I made Vimium for the Linux desktop | Use Linux without a mouse

Feb 04, 2025 · 6 minute read

Introducing Hints

If you're a fan of Vimium and wish you could navigate graphical user interfaces (GUIs) the same way you navigate browsers, then you're in for a treat! I’ve recently created Hints, an open-source application for Linux desktops that lets you interact with GUIs using your keyboard. Here's how I made it and why it might be the productivity boost you're looking for.

What Is Vimium, and Why Does It Matter?

Before diving into Hints, let’s quickly talk about Vimium, the inspiration behind it all. Vimium is a browser extension that lets you navigate the web using Vim-like keybindings. For example, you can open a new tab by pressing 't', scroll through pages using 'hjkl', and more. The real gem in Vimium is the 'f' key, which labels the clickable elements on a webpage with short key sequences, so you can follow links without touching the mouse.

While this functionality is amazing in browsers, there’s one significant limitation: it only works within the browser. And even there, it doesn’t cover everything. So, I wanted to bring this functionality to the Linux desktop, enabling mouse-free navigation across applications, just like Vimium.

Enter Hints: Vimium for Linux Desktops

That’s where Hints comes in. Hints finds clickable elements in your Linux desktop applications and lets you interact with them (left-click, right-click, hover, and drag) without needing a mouse. It’s a productivity booster, especially for power users who want a more keyboard-driven experience.

I’ve built Hints using Python, and it’s completely free and open-source. You can check out the project on GitHub if you want to try it for yourself.

Why Was Hints Needed on Linux?

Over the years, I’ve found myself longing for an application like this, especially when browsing Reddit and seeing others express the same frustration. Unfortunately, there weren’t any solutions like this for Linux. While macOS and Windows had apps like Homerow or Shortcat (for Mac) and Hunt-n-peck or Fluent Search (for Windows), Linux didn’t have an equivalent.

I believe the reason for Linux not having an equivalent is rooted in accessibility. These apps work by leveraging each operating system’s accessibility layer to detect elements in the GUI. But on Linux, the situation is trickier. It took a lot of trial and error, but I was able to create a working solution. Here’s how I tackled it.

The Challenge of Linux Accessibility

Linux doesn't have a universal accessibility solution across all GUIs. This made it much more challenging to create a program like Hints. I explored a few different methods to get clickable elements, starting with computer vision using OpenCV. The idea was to take a snapshot of the application and figure out where clickable items were based on the image. But this approach lacked fine control and was far too slow to be practical.

Next, I looked into machine learning: object-detection models built on YOLO, open-source models on Roboflow Universe, and multimodal AI models like LLaVA, to better understand which elements were clickable. But again, the models I found were incomplete and slow, making the whole process impractical. So, I needed a more efficient method.

Using AT-SPI for UI Interaction

Eventually, I found a solution in AT-SPI (Assistive Technology Service Provider Interface). Developed by the GNOME Foundation, AT-SPI is the closest thing Linux has to an accessibility solution that works across various GUI toolkits. By leveraging AT-SPI, I could query applications for their elements directly, which is far more efficient than guessing from screenshots.

However, the road wasn’t easy. AT-SPI is written in C, and I needed a Python interface to interact with it. There aren’t many examples of how to use AT-SPI online, which made things more difficult. Fortunately, I know C and was able to dig into the source code to make it work. I also learned a lot from Accerciser, a GNOME program built on pyatspi, a Pythonic wrapper for AT-SPI. Hints itself doesn't use pyatspi, since it’s deprecated.

How Hints Works

Now, let me walk you through how Hints works. At its core, Hints uses AT-SPI to gather the UI elements that can be interacted with. You bind a keyboard shortcut to launch Hints. When you press that shortcut, it displays hints for the currently focused window, and typing a hint's keys acts on the matching UI element.
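To make the idea of a "hint" concrete, here is a minimal sketch of how labels could be assigned to elements and narrowed down as you type. This is an illustration, not the code Hints ships: the function names `generate_labels` and `narrow`, the home-row alphabet, and the sample element names are all assumptions made for the example.

```python
from itertools import islice, product


def generate_labels(count, alphabet="asdfghjkl"):
    """Return `count` unique labels, all of the same length.

    Equal-length labels mean no label is a prefix of another, so a
    typed sequence always resolves to at most one element.
    """
    if len(alphabet) < 2 or len(set(alphabet)) != len(alphabet):
        raise ValueError("alphabet needs at least two distinct characters")
    if count <= 0:
        return []
    # Find the shortest label length that yields enough combinations.
    length = 1
    while len(alphabet) ** length < count:
        length += 1
    combos = product(alphabet, repeat=length)
    return ["".join(chars) for chars in islice(combos, count)]


def narrow(hints, typed):
    """Keep only the hints whose label starts with what was typed so far."""
    return {label: target for label, target in hints.items()
            if label.startswith(typed)}


# Hypothetical elements found in the focused window.
elements = ["File menu", "Edit menu", "Save button"]
hints = dict(zip(generate_labels(len(elements)), elements))
remaining = narrow(hints, "s")
```

With three elements and a nine-letter alphabet, one key press is enough to pick an element. Once there are more elements than letters, every label grows to two characters.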

I also incorporated a backup method using OpenCV for cases where AT-SPI isn’t available (e.g., in terminal applications or for GUI frameworks that don't support AT-SPI).

Understanding AT-SPI's Role

AT-SPI works by querying accessible applications for their elements, like buttons, menus, and links. For most applications that implement the necessary APIs, this process is fast and efficient. However, there’s a catch: fast querying relies on AT-SPI's collection interface. If the application doesn’t implement this interface, the process becomes much slower, as Hints has to query every element recursively. This can cause significant delays, especially in applications with a lot of elements (e.g., messaging apps like nheko, since Qt does not support the AT-SPI collection interface).
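The cost difference is easier to see with a toy model. The sketch below does not talk to AT-SPI at all: `Node`, `CountingApp`, and `find_clickable` are hypothetical names, and each method call on `CountingApp` stands in for one round trip to the application. In libatspi the rough equivalents are a single `atspi_collection_get_matches` call versus repeated `atspi_accessible_get_child_at_index` calls.

```python
from dataclasses import dataclass, field

CLICKABLE_ROLES = {"push button", "menu item", "link"}


@dataclass
class Node:
    role: str
    children: list = field(default_factory=list)


def walk(node):
    """Yield a node and all of its descendants."""
    yield node
    for child in node.children:
        yield from walk(child)


class CountingApp:
    """Toy application that counts simulated round trips."""

    def __init__(self, root, has_collection):
        self.root = root
        self.has_collection = has_collection
        self.round_trips = 0

    def get_children(self, node):
        # Fallback path: one round trip per node visited.
        self.round_trips += 1
        return node.children

    def get_matches(self, roles):
        # Collection path: the app filters its own tree in one round trip.
        self.round_trips += 1
        return [n for n in walk(self.root) if n.role in roles]


def find_clickable(app):
    if app.has_collection:
        return app.get_matches(CLICKABLE_ROLES)
    found, stack = [], [app.root]
    while stack:  # explicit stack instead of recursion, same traversal
        node = stack.pop()
        if node.role in CLICKABLE_ROLES:
            found.append(node)
        stack.extend(app.get_children(node))
    return found


def build_tree():
    # One frame holding 3 panels with 4 buttons each: 16 nodes in total.
    return Node("frame", [
        Node("panel", [Node("push button") for _ in range(4)])
        for _ in range(3)
    ])


fast = CountingApp(build_tree(), has_collection=True)
slow = CountingApp(build_tree(), has_collection=False)
fast_found = find_clickable(fast)
slow_found = find_clickable(slow)
```

Both paths find the same 12 buttons, but the fallback needs one round trip per node (16 here) while the collection path needs one. Real applications can expose thousands of nodes, which is where the delay comes from.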

Despite these challenges, for most applications, Hints works pretty well. And even if it’s a bit slow for certain apps, it’s still a vast improvement over not having this functionality at all.

Hints' Support for Wayland

When I first released Hints, many users asked about Wayland support. The good news is that Hints now supports Wayland! However, unlike X11, Wayland doesn’t allow applications to easily snoop on each other to figure out window positions. To get around this, I’ve implemented custom solutions for different Wayland compositors like Sway and Hyprland. Mouse control is also trickier on Wayland. Currently Hints uses ydotool to simulate mouse actions, but it doesn’t work reliably across all Wayland compositors. This is something I’m still working on.
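As a rough illustration of what a compositor-specific solution can look like, the sketch below finds the focused window's rectangle in a tree shaped like the JSON that `swaymsg -t get_tree` prints, then builds the ydotool commands for a left click at a point inside that window. It is a sketch under stated assumptions, not how Hints necessarily does it: the helper names `find_focused_rect` and `click_commands` are hypothetical, the sample tree is hand-written, element positions are assumed to be known relative to the window, and the ydotool arguments assume the 1.x command-line syntax.

```python
def find_focused_rect(node):
    """Depth-first search of a sway tree node for the focused container."""
    if node.get("focused"):
        return node["rect"]
    for child in node.get("nodes", []) + node.get("floating_nodes", []):
        rect = find_focused_rect(child)
        if rect is not None:
            return rect
    return None


def click_commands(rect, rel_x, rel_y, button=0xC0):
    """Build ydotool argv lists for a click at a window-relative point.

    0xC0 is ydotool's code for left button down + up (0xC1 is right click).
    The commands are returned, not executed.
    """
    x = rect["x"] + rel_x  # window origin + offset inside the window
    y = rect["y"] + rel_y
    return [
        ["ydotool", "mousemove", "--absolute", "-x", str(x), "-y", str(y)],
        ["ydotool", "click", f"0x{button:02X}"],
    ]


# Hand-written stand-in for the output of `swaymsg -t get_tree`.
sample_tree = {
    "type": "root", "focused": False,
    "rect": {"x": 0, "y": 0, "width": 1920, "height": 1080},
    "nodes": [{
        "type": "workspace", "focused": False,
        "rect": {"x": 0, "y": 0, "width": 1920, "height": 1080},
        "floating_nodes": [],
        "nodes": [
            {"type": "con", "focused": False,
             "rect": {"x": 0, "y": 0, "width": 960, "height": 1080}},
            {"type": "con", "focused": True,
             "rect": {"x": 960, "y": 0, "width": 960, "height": 1080}},
        ],
    }],
}

rect = find_focused_rect(sample_tree)
commands = click_commands(rect, rel_x=10, rel_y=20)
```

Each command list could be passed to `subprocess.run`. Whether the pointer actually lands on the requested absolute position depends on the compositor, which is one reason mouse control is unreliable across Wayland setups.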

Installation and Future Improvements

One area where users have given feedback is that Hints can be a little tricky to install, primarily due to the fragmented nature of Linux desktop environments. There’s no universal package manager or installation method that works across all Linux distros, window managers, or desktop environments.

I’m actively working on making the installation process easier and have documented steps for various environments in the Hints wiki. However, given the diversity of Linux setups, it’s an ongoing challenge.

As for future improvements, I’m working on better support for different window managers, faster element querying, and further optimizations for Wayland support. And as always, the project will continue evolving based on community feedback.

Conclusion

Building Hints has been an exciting and rewarding journey. It’s an application I’ve wanted for a long time, and I’m thrilled to share it with the Linux community. If you’re a Vimium fan or just someone looking to speed up your workflow on Linux, I highly recommend giving Hints a try.

You can find the source code, installation instructions, and everything you need on GitHub. Let me know what you think, and feel free to contribute if you’re interested!

Happy navigating!