Using Data Science to Redefine Talent Evaluation

- Varun Aggarwal

Talent evaluation in the current job market is critical, as businesses cannot risk a bad hire. With the need to bring highly skilled people on board, the effectiveness of current evaluation techniques is itself in question. Assessments based on the traditional multiple-choice question (MCQ, or closed-response) format have limited value to offer, since candidates get to ‘choose’ the right answer instead of ‘finding’ one. A jobseeker can apply elimination techniques to arrive at the answer even without real understanding. There is also no real way to test the candidate’s creativity or thought process, since the correct answer is already available as one of the options.

There has been a need for assessments in which the candidate can submit a free-flowing answer (an open response) that can be graded automatically. This type of test provides a much more holistic assessment but has typically required human expertise, and with the scale of operations and the volume of hiring involved, human evaluation alone cannot keep pace. Thanks to the growing amount of data around us and recent progress in data science algorithms, it is now becoming possible to do these assessments automatically. Crowd intelligence adds an edge to what data science can do, creating very powerful assessments.

Testing software proficiency

Let us look at the example of programming skills. Companies spend large amounts of money interviewing candidates for software engineering jobs. Interviewers look for a few different things in a candidate:

  • Is the candidate’s ‘approach’ to solving the problem correct? Even if the candidate’s program does not pass the test cases, a candidate with the correct thought process is still considered.
  • Is the code readable, maintainable and scalable? The industry spends enormous time just managing badly written code: code that is not readable, cannot be reused by another developer, is prone to bugs and is not modular.
  • Is the code efficient? Does it run in optimal time, or is it slow?

In the last couple of years, data science algorithms have emerged that can test all of this automatically! There are tools in which a person writes an actual computer program and the algorithm automatically generates a report covering the candidate’s thought process and the maintainability and efficiency of the code.
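To make the idea concrete, here is a minimal sketch of what such an automated grader might measure. The function names, the test-case format and the maintainability proxy (counting branching constructs in the syntax tree) are all illustrative assumptions; real assessment tools use far richer machine-learnt models of thought process and code quality.

```python
import ast
import time

def grade_submission(source, func_name, test_cases):
    """Hypothetical grader sketch: checks correctness against test
    cases, times execution, and computes a crude maintainability
    proxy from the code's structure."""
    namespace = {}
    exec(source, namespace)            # load the candidate's code
    func = namespace[func_name]

    passed = 0
    start = time.perf_counter()
    for args, expected in test_cases:
        if func(*args) == expected:
            passed += 1
    elapsed = time.perf_counter() - start

    # Crude maintainability proxy: count branching nodes in the AST.
    tree = ast.parse(source)
    branches = sum(isinstance(n, (ast.If, ast.For, ast.While))
                   for n in ast.walk(tree))

    return {
        "correctness": passed / len(test_cases),
        "runtime_sec": elapsed,
        "branch_count": branches,
    }

# A candidate submission, received as a string by the test platform
candidate_code = """
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
"""

report = grade_submission(candidate_code, "fib",
                          [((0,), 0), ((1,), 1), ((10,), 55)])
print(report["correctness"])  # 1.0
```

A real system would, in addition, try to classify the *approach* (for example, recognising an iterative versus a recursive solution) so that a candidate with sound logic but a minor bug is not rejected outright.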

Such a tool provides great efficiency. In a recent case, a sample of 90,000 US college students took an automated programming assessment. The tool identified 16% of candidates who had the right thought process but did not pass the test cases. On the other hand, it found 20% very strong programmers who wrote maintainable and efficient code.

Other key areas of application

Let us switch to another area of large demand in the market: grading of spoken English. There is no real way to assess a candidate’s spoken English in the MCQ format. Computers could not previously evaluate free speech, but data science coupled with crowdsourcing can now solve such problems. A candidate’s spoken English sample is transcribed by the crowd, and a data science algorithm works with this transcription (together with the voice sample) to grade pronunciation, fluency, content and grammar accurately. Such techniques can grade speech as well as human experts can.
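As a toy illustration of the text side of such a pipeline, a grading model might extract simple fluency features from the crowd-produced transcript. Everything below is a hypothetical sketch: the feature names, the filler-word list and the duration input are assumptions, and production systems would combine these with acoustic models of the voice sample itself.

```python
def fluency_features(transcript, duration_sec):
    """Hypothetical sketch: derive simple fluency features from a
    crowd transcription of a spoken-English sample."""
    words = transcript.lower().split()
    fillers = {"um", "uh"}  # assumed filler-word list

    # Speaking rate: words per minute over the recorded duration.
    wpm = len(words) / (duration_sec / 60)
    # Hesitation: fraction of words that are fillers.
    filler_rate = sum(w.strip(",.") in fillers for w in words) / len(words)
    # Vocabulary richness: distinct words over total words.
    lexical_diversity = len(set(words)) / len(words)

    return {
        "words_per_min": wpm,
        "filler_rate": filler_rate,
        "lexical_diversity": lexical_diversity,
    }

sample = "Um I think that data science um can really change hiring"
features = fluency_features(sample, duration_sec=6)
print(features)
```

Features like these would feed a model trained against expert human grades, which is what lets the automated score track human judgment.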

These mechanisms involve part of the evaluation being done by humans and the rest by machines. Combined, they form a very powerful tool.

The article originally appeared in PC Quest