DYPA Published a 30MB PDF - I Built a Search Engine for It in a Couple of Hours
Greece's employment service dropped a giant PDF of results. I parsed it with Linux tools, stuffed the codes into PocketBase, and shipped a Vue search UI.
DYPA released training program results and thousands of people went hunting for their KAYAS code (submission ID). The official answer was a 30MB PDF. Scroll until your thumb falls off.
I figured we could do better in an afternoon.
👨‍💻 Code on GitHub
Goal
One field. Type your KAYAS. See accepted or rejected. No PDF archaeology.
How I built it
Split the PDF into accepted and rejected lists so parsing wasn't one monster file.
Terminal extraction:
pdftotext file.pdf - | grep -E '^[0-9]{10}$' > kayas.txt
The regex keeps only lines that consist of exactly ten digits, so everything else in the text dump is discarded.
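If you want to reproduce the split-and-extract step, here is a rough sketch of how it could look. The file names accepted.pdf, rejected.pdf and kayas.csv are placeholders of mine, not the names used in the repo. It also strips form feeds, because pdftotext puts one at each page break and a code on the first line of a page would otherwise fail the `^...$` match.

```shell
#!/bin/sh
# Keep only lines that are exactly ten digits, dropping duplicates.
# tr removes form feeds (pdftotext page breaks) and stray carriage returns.
extract_codes() {
  tr -d '\f\r' | grep -E '^[0-9]{10}$' | sort -u
}

# Append a status column so each line becomes "code,status".
label_codes() {
  sed "s/\$/,$1/"
}

# Hypothetical file names; this part runs only when both PDFs are present.
if [ -f accepted.pdf ] && [ -f rejected.pdf ]; then
  {
    echo "code,status"
    pdftotext accepted.pdf - | extract_codes | label_codes accepted
    pdftotext rejected.pdf - | extract_codes | label_codes rejected
  } > kayas.csv
fi
```

The result is a two-column CSV, which is a convenient shape for loading into any database.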
Stored rows in PocketBase with a status field (accepted or rejected). Fast enough for a weekend tool, self-hosted, no Postgres ceremony.
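The import itself can be done over PocketBase's REST API. This is a minimal sketch, not the script from the repo: the collection name kayas, the field name code, the PB_URL default, and the PB_TOKEN variable are all assumptions. Only the status field comes from the setup described above. It expects a "code,status" CSV with a header row.

```shell
#!/bin/sh
# Assumed values: adjust to your own instance and collection.
PB_URL="${PB_URL:-http://127.0.0.1:8090}"
COLLECTION="${COLLECTION:-kayas}"

# Build the JSON body for one record. Inputs are digits and a fixed word,
# so no JSON escaping is needed here.
record_json() {
  printf '{"code":"%s","status":"%s"}' "$1" "$2"
}

# POST each "code,status" line of a CSV (header skipped) to PocketBase.
# PB_TOKEN is assumed to hold an auth token allowed to create records.
import_csv() {
  tail -n +2 "$1" | while IFS=, read -r code status; do
    record_json "$code" "$status" |
      curl -sS -X POST "$PB_URL/api/collections/$COLLECTION/records" \
        -H 'Content-Type: application/json' \
        -H "Authorization: $PB_TOKEN" \
        --data-binary @- > /dev/null
  done
}
```

Usage would be something like `PB_TOKEN=... import_csv kayas.csv`. One request per row is slow but simple, which is fine for a one-off load.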
Frontend: Vue 3, PrimeVue table, Tailwind. Search box, instant result, done.
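Under the hood, a search box like this only needs one filtered request against the records endpoint. The Vue code lives in the repo. What follows is a curl sketch of the equivalent lookup, assuming the same hypothetical kayas collection and code field as before, and a list rule that allows public reads.

```shell
#!/bin/sh
# Assumed values: adjust to your own instance and collection.
PB_URL="${PB_URL:-http://127.0.0.1:8090}"
COLLECTION="${COLLECTION:-kayas}"

# Succeeds only for a string of exactly ten digits.
is_kayas() {
  case "$1" in
    *[!0-9]*|'') return 1 ;;
  esac
  [ "${#1}" -eq 10 ]
}

# Build the PocketBase filter expression for one code.
lookup_filter() {
  printf "code='%s'" "$1"
}

# Fetch at most one matching record; the JSON response lists it under "items".
lookup() {
  is_kayas "$1" || { echo "not a 10-digit code" >&2; return 2; }
  curl -sS -G "$PB_URL/api/collections/$COLLECTION/records" \
    --data-urlencode "filter=$(lookup_filter "$1")" \
    --data-urlencode "perPage=1"
}
```

Validating the input before building the filter keeps quotes and operators out of the query, so a user cannot smuggle their own filter expression into the search box.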
Live: brs-dypa.theodosiou.me
Stack
Debian-ish Linux, pdftotext + grep, PocketBase, Vue 3, PrimeVue, Tailwind.
Why bother
Government UX jokes write themselves, but the point is serious. Public info should be searchable, fast, and usable on a phone. PDFs are fine for archival. They're hostile for lookup.
Fork it
Repo is on GitHub. Deploy your own if the next PDF drop looks familiar.
If this helped you find your result, pass it along. Maybe someone at DYPA sees it too.