DYPA Published a 30MB PDF - I Built a Search Engine for It in a Couple of Hours — Pantelis Theodosiou

When DYPA (the Greek Public Employment Service) released the results for their recent training program for unemployed and employed citizens, thousands of people rushed online to find out if they were accepted.

Except… there was a small problem.

Instead of a nice little search box, we got a 30MB PDF.

Yes. A single giant PDF file. No filters, no search. Just thousands of lines containing something called a KAYAS (ΚΑΥΑΣ – aka the submission code for your application). The only way to find out if you made it? Scroll. For minutes. Maybe hours.

I thought to myself: “Surely we can do better than this.”
So I built a tool.

👨‍💻 Code on GitHub

The idea

The goal was simple:
Give people a fast way to check if their KAYAS code is in the “accepted” or “rejected” list.

I didn’t want people to have to deal with the PDF madness, so I downloaded the massive document and decided to break it down.

How I built it

The process:

PDF slicing
I split the original PDF into two:
- One with all the accepted KAYAS codes
- One with all the rejected ones
That way each file maps to exactly one status, which made the later steps easier to work with.
Text extraction with Linux tools
Using the power of the terminal:
pdftotext file.pdf - | grep -E '^[0-9]{10}$' > kayas.txt
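That pulls every KAYAS code out of the text content: pdftotext dumps the PDF as plain text to stdout (the lone - instead of an output file name), and grep keeps only the lines that consist of exactly 10 digits.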
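
For anyone repeating this on both split files, here is a minimal sketch of a slightly more defensive variant of that one-liner. The file names accepted.pdf and rejected.pdf and the extract_codes helper are assumptions for illustration, not names from the actual project. It strips the form-feed characters pdftotext inserts at page breaks (they can end up glued to the first line of a page and defeat the ^ anchor) and drops duplicate codes.

```shell
#!/bin/sh
# Sketch only: file names and the helper name are hypothetical.

# Read text on stdin, print the unique lines that are exactly ten digits.
extract_codes() {
  # Remove form feeds (page breaks) and carriage returns before matching.
  tr -d '\f\r' | grep -E '^[0-9]{10}$' | sort -u
}

# One output file per status, so each code list maps to a single status.
for status in accepted rejected; do
  if [ -f "$status.pdf" ]; then
    pdftotext "$status.pdf" - | extract_codes > "$status.txt"
    echo "$status: $(wc -l < "$status.txt") codes"
  fi
done
```

pdftotext also has a -nopgbrk flag that suppresses the page-break characters, which would make the tr step unnecessary.
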
Pocketbase for storing the results
A lightweight and fast backend, perfect for projects like this. I stored each KAYAS with a status: "accepted" or "rejected".
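
To give an idea of what loading the codes can look like, here is a rough sketch that posts one record per code to Pocketbase's REST endpoint for creating records. Everything specific in it is an assumption: the collection name kayas, the field names code and status, the local URL, and a PB_TOKEN variable holding an auth token that is allowed to create records. It is not the import script used for the live site.

```shell
#!/bin/sh
# Sketch only: collection name, field names, URL, and PB_TOKEN are hypothetical.
PB_URL="${PB_URL:-http://127.0.0.1:8090}"
COLLECTION="${COLLECTION:-kayas}"

# Build the JSON body for one record. Codes are digits only and the status is
# a fixed word, so no JSON escaping is needed here.
record_json() {
  printf '{"code":"%s","status":"%s"}' "$1" "$2"
}

# POST every code in a file (one per line) with the given status.
import_file() {
  status="$1"
  file="$2"
  while IFS= read -r code; do
    record_json "$code" "$status" |
      curl -fsS -X POST "$PB_URL/api/collections/$COLLECTION/records" \
        -H 'Content-Type: application/json' \
        -H "Authorization: $PB_TOKEN" \
        --data-binary @- > /dev/null
  done < "$file"
}

for status in accepted rejected; do
  if [ -f "$status.txt" ]; then
    import_file "$status" "$status.txt"
  fi
done
```

One request per code is slow for thousands of rows but keeps the sketch short. A unique index on the code field would also guard against importing the same code twice.
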
A clean SPA frontend
Built using Vue 3, with PrimeVue for UI components and TailwindCSS for styling. The UI is simple: a search field for the KAYAS code, a table that presents the results, and instant feedback on whether you made it or not.
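
The lookup behind a search box like this boils down to one filtered query against the records list. The sketch below shows that query from the command line rather than from Vue, to stay in the same language as the extraction step. It assumes a collection named kayas with code and status fields that is publicly readable, a local Pocketbase URL, and that jq is installed for reading the JSON response. All of those are illustrative choices, not details of the deployed app.

```shell
#!/bin/sh
# Sketch only: collection name, field names, and URL are hypothetical.
PB_URL="${PB_URL:-http://127.0.0.1:8090}"
COLLECTION="${COLLECTION:-kayas}"

# Succeed only if the argument is exactly ten digits.
is_valid_kayas() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "${#1}" -eq 10 ]
}

# Print "accepted", "rejected", or "not found" for a given code.
lookup_kayas() {
  code="$1"
  if ! is_valid_kayas "$code"; then
    echo "invalid code: expected exactly 10 digits" >&2
    return 2
  fi
  # -G sends the url-encoded parameters as a query string on a GET request.
  curl -fsS -G "$PB_URL/api/collections/$COLLECTION/records" \
    --data-urlencode "filter=(code='$code')" \
    --data-urlencode "perPage=1" |
    jq -r '.items[0].status // "not found"'
}
```

Calling lookup_kayas with a ten-digit code prints the stored status. Validating the input first means only digits ever reach the filter expression, so a malformed code never turns into a query.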

Tech stack

Linux (Debian-based distro)
pdftotext, grep for parsing
Pocketbase – self-hosted, lightweight backend
Vue 3 + PrimeVue + TailwindCSS
Deployed here: brs-dypa.theodosiou.me

Why I made this

Let’s be honest: when a government platform gives you a 30MB PDF and says “Good luck”, it’s hard not to laugh.

But this isn’t just about DYPA. This is about how digital services should work in 2025.

Information should be:

Easy to access
Searchable
Fast
Mobile-friendly

So, this is my tiny contribution toward better digital public services. And hopefully, a helpful tool for anyone trying to find their result without getting lost in PDF purgatory.

Want to use or contribute?

Feel free to:

Fork the GitHub repo
Deploy your own version

Let’s build better tools, even when we’re given… not-so-great ones.

If you enjoyed this post or the tool helped you, share it with a friend. Or with someone at DYPA 😅