VDAB has a database of over 11,000 competencies, skills, and knowledge elements, and this database keeps growing. One of the biggest challenges is avoiding duplicate competencies and grouping competencies that resemble each other. Given the size of the database, it is no longer possible to do this manually. The aim of this project was to build an AI-based solution that automatically links similar competencies.
Our solution
Our matching algorithm works in several steps. Using the Google Translate API, we first translate everything from Dutch and French into English. This gives us four different "languages": Dutch, French, English (translated from Dutch), and English (translated from French). After this phase, we compute the word embeddings, for which we chose fastText from Facebook. Next, we build sentence embeddings using both classical and machine-learning-based methods. We rank each of these sentence embeddings and, based on continuous feedback from VDAB, retrain our models to keep the matching correct.
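To make the embedding and ranking steps more concrete, here is a minimal sketch of one possible "classical" approach: averaging word vectors into a sentence embedding and ranking candidates by cosine similarity. This is an illustration under stated assumptions, not the code used in the project. The tiny `TOY_VECTORS` table is a hypothetical stand-in for real fastText vectors, and the function names are invented for this example.

```python
import math

# Hypothetical toy word vectors (assumption): stand-ins for the
# high-dimensional fastText vectors a real pipeline would load.
TOY_VECTORS = {
    "repair": [0.9, 0.1, 0.0],
    "fix": [0.8, 0.2, 0.1],
    "engines": [0.1, 0.9, 0.0],
    "motors": [0.2, 0.8, 0.1],
    "bake": [0.0, 0.1, 0.9],
    "bread": [0.1, 0.0, 0.8],
}


def sentence_embedding(text, vectors):
    """Average the vectors of all known words; None if no word is known."""
    words = [w for w in text.lower().split() if w in vectors]
    if not words:
        return None
    dim = len(next(iter(vectors.values())))
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_matches(query, candidates, vectors):
    """Return (candidate, score) pairs sorted from most to least similar."""
    q = sentence_embedding(query, vectors)
    scored = []
    for candidate in candidates:
        e = sentence_embedding(candidate, vectors)
        # Candidates without any known word get the lowest neutral score.
        score = cosine(q, e) if q is not None and e is not None else 0.0
        scored.append((candidate, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rank_matches("repair engines", ["bake bread", "fix motors"], TOY_VECTORS)
for candidate, score in ranked:
    print(f"{score:.3f}  {candidate}")
```

With these toy vectors, "fix motors" ranks above "bake bread" for the query "repair engines", even though the two phrases share no words. In a real setting, the averaged vectors could be replaced by a learned sentence encoder, and the feedback from reviewers could be used to retrain or reweight the ranking.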
Results
The application allows VDAB to quickly clean up their database and search for matching competencies. The custom-built user interface provides a feedback loop that continuously keeps the algorithms up to date. This saves VDAB a significant amount of time in their search queries.