How Does an ML-Based Job-Resume Matching Algorithm Work?

Missing a quality candidate among a flood of applications is a common problem for recruitment teams worldwide. Not only is it virtually impossible for recruiters to scan every application they receive, but this mundane task also takes up their valuable time.

To address this issue, we at Skillate created our AI-based matching engine to ensure that recruitment teams never miss out on a quality candidate. This is how it works:

How does an AI-based Matching Engine work?

Step 1: Vocabulary and AI training models

To begin with, we needed to train the Skillate system on a wide variety of profiles so that it understands the terms and vocabulary used in applications. The Skillate AI engine is trained on over 120 million profiles and almost 3 million jobs to learn the following information:

Skills Similarity

The Skillate system has a vocabulary of 44,345 skills across domains. The system represents every skill as a vector and measures how related two skills are by the cosine similarity of their vectors. Example:

  •  Sim(Java, Spring) - 0.8
  •  Sim(Java, Sales) - 0.1
  •  Sim(Sales, Business Development) - 0.9
  •  Sim(Sales, Customer Success) - 0.6

Since sales and business development are complementary skills, the system places them in the same cluster, which gives the pair a high cosine similarity score. The same holds for Java and the Spring framework.

Sales and Java, however, have very little in common, so the pair gets a low cosine score.
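The cosine scores above can be illustrated with a few lines of Python. This is a minimal sketch and not Skillate's implementation: the three-dimensional vectors are made-up toy values (real skill vectors are learned from data and have many more dimensions), and the names `SKILL_VECTORS`, `cosine_similarity`, and `sim` are illustrative assumptions.

```python
import math

# Toy skill vectors, invented for illustration only.
# The dimensions loosely stand for (software, selling, customer-facing).
SKILL_VECTORS = {
    "Java": [0.9, 0.1, 0.0],
    "Spring": [0.8, 0.0, 0.1],
    "Sales": [0.0, 0.9, 0.4],
    "Business Development": [0.1, 0.8, 0.5],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as unrelated to everything.
    return dot / (norm_a * norm_b)


def sim(skill_a, skill_b):
    """Look up two skills by name and compare their vectors."""
    return cosine_similarity(SKILL_VECTORS[skill_a], SKILL_VECTORS[skill_b])


for pair in [("Java", "Spring"), ("Java", "Sales"), ("Sales", "Business Development")]:
    print(pair, round(sim(*pair), 2))
```

With these toy vectors, related pairs such as Java and Spring score close to 1, while Java and Sales score close to 0. The exact numbers depend entirely on the vectors chosen.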


So, if a candidate has worked at an AI SaaS company, but as a Business Development Manager, the system recognizes business development as the dominant skill from the context in the resume and clusters the candidate with BD or sales profiles rather than with AI or ML experts.

Company Industry Mapping

The other part is mapping companies to their respective industries, so that when a JD asks for applicants from certain sectors, those profiles are given higher preference.
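One way to picture this mapping is a lookup table plus a small score bonus. The sketch below is only an assumption about how such a preference could be applied: the company names are fictional, and `COMPANY_INDUSTRY`, `INDUSTRY_BOOST`, `industry_of`, and `industry_bonus` are hypothetical names, not part of the Skillate system.

```python
# Hypothetical company-to-industry table with fictional companies.
COMPANY_INDUSTRY = {
    "acme bank": "Banking",
    "globex software": "Software",
    "initech retail": "Retail",
}

# Assumed bonus added when a past employer belongs to a preferred industry.
INDUSTRY_BOOST = 0.2


def industry_of(company):
    """Return the mapped industry, or None when the company is unknown."""
    return COMPANY_INDUSTRY.get(company.strip().lower())


def industry_bonus(candidate_companies, preferred_industries):
    """Return the boost if any past employer is in a preferred industry."""
    preferred = {name.lower() for name in preferred_industries}
    for company in candidate_companies:
        industry = industry_of(company)
        if industry is not None and industry.lower() in preferred:
            return INDUSTRY_BOOST
    return 0.0


print(industry_bonus(["Acme Bank", "Initech Retail"], ["Banking"]))
```

A flat bonus keeps the example short; a real system might instead weight the bonus by how long the candidate worked in the industry.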

College Ranking Based on the QS World University Rankings:
The Skillate system uses the QS World University Rankings to index universities across the world. This parameter is especially helpful for recruiters who emphasize a candidate’s educational background or only consider applicants from a specific university tier. It helps the few companies that value pedigree above everything else, typically for roles with 0-3 years of experience.
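As a rough illustration of tier-based filtering, the sketch below buckets a ranking position into tiers and filters candidates by tier. The tier boundaries (top 100, top 500) and the names `university_tier` and `filter_by_tier` are assumptions made for this example; the candidates and their ranks are invented.

```python
def university_tier(qs_rank):
    """Bucket a ranking position into a tier; None means unranked."""
    if qs_rank is None:
        return "unranked"
    if qs_rank < 1:
        raise ValueError("rank must be a positive integer")
    if qs_rank <= 100:   # Assumed boundary for the top tier.
        return "tier 1"
    if qs_rank <= 500:   # Assumed boundary for the middle tier.
        return "tier 2"
    return "tier 3"


def filter_by_tier(candidates, allowed_tiers):
    """Keep only candidates whose university falls in an allowed tier."""
    return [c for c in candidates if university_tier(c["qs_rank"]) in allowed_tiers]


# Invented candidates with invented ranks.
candidates = [
    {"name": "Candidate A", "qs_rank": 42},
    {"name": "Candidate B", "qs_rank": 310},
    {"name": "Candidate C", "qs_rank": None},
]

print([c["name"] for c in filter_by_tier(candidates, {"tier 1", "tier 2"})])
```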

Title Mapping with Role/Seniority Level and Role Category:
Skillate understands the various job titles and the different terms used to describe equivalent levels in an organizational hierarchy.
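A simple way to see what title mapping involves is a keyword-based normalizer that turns differently worded titles into the same (seniority level, role category) pair. This is a hand-written sketch, not the learned mapping described above: the numeric levels, the keyword lists, and the function name `map_title` are all assumptions.

```python
import re

# Assumed seniority keywords and levels; the first match in list order wins.
SENIORITY_KEYWORDS = [
    ("principal", 4),
    ("lead", 4),
    ("senior", 3),
    ("sr", 3),
    ("junior", 1),
    ("jr", 1),
    ("intern", 0),
]
DEFAULT_LEVEL = 2  # Assumed level when no seniority keyword is present.

# Assumed role phrases and categories; more specific phrases come first.
ROLE_CATEGORIES = [
    ("data scientist", "Data Science"),
    ("machine learning", "Data Science"),
    ("business development", "Sales"),
    ("sales", "Sales"),
    ("engineer", "Engineering"),
    ("developer", "Engineering"),
]


def map_title(title):
    """Return (seniority_level, role_category) for a free-text job title."""
    words = re.findall(r"[a-z]+", title.lower())
    level = DEFAULT_LEVEL
    for keyword, keyword_level in SENIORITY_KEYWORDS:
        if keyword in words:  # Whole-word match, so "international" is not "intern".
            level = keyword_level
            break
    normalized = " ".join(words)
    category = "Other"
    for phrase, phrase_category in ROLE_CATEGORIES:
        if phrase in normalized:
            category = phrase_category
            break
    return level, category


for title in ["Sr. Software Developer", "Senior Software Engineer", "Business Development Manager"]:
    print(title, "->", map_title(title))
```

Here "Sr. Software Developer" and "Senior Software Engineer" map to the same pair, which is the behavior the title mapping is meant to provide.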

Major Similarity

Just as we generate cosine scores for skills, we do the same for fields of education (majors).

For example, a candidate with a degree in CS or IT would be given more preference for an IT role than someone who studied Economics but mentions Computer Science in their resume.



Understanding commonly used abbreviations and industry terminology is also one of our features.
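Abbreviation handling can be pictured as a normalization pass that expands short forms before any matching happens. The sketch below assumes a small hand-written dictionary; `ABBREVIATIONS` and `expand_abbreviations` are hypothetical names, and a production system would need context to resolve ambiguous short forms.

```python
import re

# Assumed abbreviation table; a real one would be far larger and context-aware.
ABBREVIATIONS = {
    "ml": "machine learning",
    "nlp": "natural language processing",
    "bd": "business development",
    "cs": "computer science",
    "jd": "job description",
}


def expand_abbreviations(text):
    """Replace each known abbreviation (case-insensitive) with its long form."""

    def replace(match):
        token = match.group(0)
        return ABBREVIATIONS.get(token.lower(), token)

    # Match whole alphabetic tokens so "ml" inside "html" is left alone.
    return re.sub(r"[A-Za-z]+", replace, text)


print(expand_abbreviations("ML and NLP engineer with a CS degree"))
```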

Step 2: Resume Indexing

The Skillate resume parser uses deep learning to extract information from the most complex resumes. You can learn its detailed working here.

However, extracting the information is only the first step in indexing a profile; we also need to map the different pieces of information present in the resume. For example:
Suppose the candidate is a Principal Software Engineer at Accenture with an MSc from UIUC, majoring in CSE, in 2011. Apart from parsing this information, we also index additional information derived from the training data.


The same derived information is computed for all of a candidate’s current and past experience and education. This helps the system understand the candidate’s experience in a specific industry, their seniority level, and how fast they have progressed in their career.
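To make the derived signals concrete, the sketch below computes two of them from an already-enriched work history: years spent in a given industry and a rough career progression rate. The `Experience` record, its numeric `seniority_level`, and both formulas are assumptions for illustration; the history itself uses fictional companies.

```python
from dataclasses import dataclass


@dataclass
class Experience:
    company: str
    industry: str         # Assumed to come from a company-industry mapping.
    seniority_level: int  # Assumed numeric level derived from the job title.
    start_year: int
    end_year: int


def years_in_industry(experiences, industry):
    """Total years the candidate has worked in the given industry."""
    return sum(e.end_year - e.start_year for e in experiences if e.industry == industry)


def progression_rate(experiences):
    """Seniority levels gained per year between the first and latest role."""
    if len(experiences) < 2:
        return 0.0
    ordered = sorted(experiences, key=lambda e: e.start_year)
    first, latest = ordered[0], ordered[-1]
    elapsed = latest.start_year - first.start_year
    if elapsed <= 0:
        return 0.0
    return (latest.seniority_level - first.seniority_level) / elapsed


# Invented work history for illustration.
history = [
    Experience("Globex Software", "Software", 1, 2011, 2014),
    Experience("Initech Systems", "Software", 2, 2014, 2017),
    Experience("Acme Bank", "Banking", 4, 2017, 2020),
]

print(years_in_industry(history, "Software"), progression_rate(history))
```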

  • Finding the core competencies and ignoring irrelevant skills

It is often difficult to gauge a candidate’s primary skill set from their CV. To find the best fit, it is important to identify the relevant skills. This is done in two phases:

  1. First, the Skillate system finds all the possible sets of skills that can be extracted from the resume.
  2. Then, for every skill, we check the candidate’s proficiency on the basis of their work experience, projects, and publications.

For example, for the skill “Machine Learning”, we compute the skill proficiency on the basis of:

Work Experience: The candidate’s current and past designations, such as whether they work as a “Data Scientist” or “Machine Learning Engineer”, and whether they have worked with related libraries and techniques such as Theano, TensorFlow, deep learning, and SVM. We also consider how long and how recently the candidate has worked with these “Machine Learning”-related skills.

Projects: Whether the candidate has worked on projects such as “Video Classification using AI”, “Text Classification”, or “Sentiment Analysis”, which use AI-related skills.

Education: Whether the candidate’s major is related to the skill. For example, a computer science candidate is more likely to be a good fit for “Machine Learning” skills than candidates from other disciplines.

So, we calculate the proficiency rating for every possible skill and use these ratings to determine the core skills that are matched against the job.
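The two phases can be sketched as a scoring function over pieces of evidence, followed by a threshold that separates core skills from incidental ones. Everything numeric here is an assumption made for illustration: the source weights, the three-year freshness half-life, the duration cap, and the threshold. The names `Evidence`, `proficiency`, and `core_skills` are hypothetical.

```python
from dataclasses import dataclass

# Assumed weights: work experience counts more than projects or education.
SOURCE_WEIGHT = {"work": 1.0, "project": 0.5, "education": 0.3}
HALF_LIFE_YEARS = 3.0      # Assumed: evidence counts half as much every 3 years.
MAX_DURATION_FACTOR = 5.0  # Assumed cap so very long tenures do not dominate.


@dataclass
class Evidence:
    source: str         # "work", "project", or "education"
    years: float        # How long the skill was used.
    years_since: float  # How long ago the skill was last used.


def freshness(years_since):
    """Exponential decay: 1.0 for current use, 0.5 after one half-life."""
    return 0.5 ** (years_since / HALF_LIFE_YEARS)


def evidence_score(item):
    """Weight one piece of evidence by its source, duration, and freshness."""
    duration_factor = min(1.0 + item.years, MAX_DURATION_FACTOR)
    return SOURCE_WEIGHT[item.source] * duration_factor * freshness(item.years_since)


def proficiency(items):
    """Sum the scores of all evidence for a single skill."""
    return sum(evidence_score(item) for item in items)


def core_skills(evidence_by_skill, threshold=1.0):
    """Return skills at or above the threshold, strongest first."""
    scores = {skill: proficiency(items) for skill, items in evidence_by_skill.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [skill for skill in ranked if scores[skill] >= threshold]


# Invented evidence extracted from a hypothetical resume.
resume_evidence = {
    "Machine Learning": [
        Evidence("work", 3.0, 0.0),
        Evidence("project", 0.5, 1.0),
        Evidence("education", 0.0, 0.0),
    ],
    "Photoshop": [Evidence("project", 0.2, 6.0)],
}

print(core_skills(resume_evidence))
```

In this example the short, six-year-old Photoshop project falls below the threshold, so only "Machine Learning" is kept as a core skill.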

Step 3: Job Indexing

The job indexing algorithm is used to understand the job description in detail. Most JDs are written in an unstructured format, and the system needs to convert them into a structured one. The better the understanding of the job, the better the matching results. At Skillate, we have created an AI-powered JD assistant that provides real-time feedback to recruiters.


We first classify the content of the JD into two categories, “Current Skills and Expertise” and “Roles and Responsibilities”, using an NLP-based classification model.
For example:

  1. “The candidate should have expertise in Java” will be classified as “Current Skills and Expertise”, and “The candidate will be responsible for building web applications using Spring Framework” will be classified as “Roles and Responsibilities”.
  2. From the above sentences, “Spring Framework” will be given a higher weight than “Java”. This is because the recruiter would prefer candidates who know both “Spring Framework” and “Java”.
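The example above can be sketched in Python. Note that this uses a simple keyword rule as a stand-in for the NLP-based classification model, which is not described in enough detail here to reproduce. The cue phrases, the section weights (1.0 and 1.5), the skill list, and the function names are all assumptions.

```python
import re

# Assumed cue phrases; a keyword rule standing in for the NLP classifier.
RESPONSIBILITY_CUES = ("will be responsible", "responsible for")

# Assumed weights: skills named in responsibilities count more.
SECTION_WEIGHT = {
    "Current Skills and Expertise": 1.0,
    "Roles and Responsibilities": 1.5,
}

# Assumed skill vocabulary, limited to the skills in this example.
KNOWN_SKILLS = ("Spring Framework", "Java")


def classify_sentence(sentence):
    """Label a JD sentence with one of the two categories."""
    lowered = sentence.lower()
    if any(cue in lowered for cue in RESPONSIBILITY_CUES):
        return "Roles and Responsibilities"
    return "Current Skills and Expertise"


def weigh_skills(job_description):
    """Return the highest section weight under which each skill appears."""
    weights = {}
    for sentence in re.split(r"(?<=[.!?])\s+", job_description.strip()):
        section = classify_sentence(sentence)
        for skill in KNOWN_SKILLS:
            # Word boundaries keep "Java" from matching inside "JavaScript".
            if re.search(r"\b" + re.escape(skill) + r"\b", sentence, re.IGNORECASE):
                weights[skill] = max(weights.get(skill, 0.0), SECTION_WEIGHT[section])
    return weights


jd = (
    "The candidate should have expertise in Java. "
    "The candidate will be responsible for building web applications "
    "using Spring Framework."
)
print(weigh_skills(jd))
```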

After computing the core skills, we normalize and compute the additional fields (Designation, Industry, Major, and Degree) in the same way.

At Skillate, we have created one of the world’s most sophisticated resume parsing and matching engines, which can extract information with up to 93% accuracy from the most complex resumes around the world.

In this part, we discussed the first three steps: Vocabulary and AI training models, Resume Indexing, and Job Indexing. We will discuss the final two parts, Matching and Chatbots, in the second part of this blog.
