
AI-Enabled Candidates in Hiring Report | April 2025

I used ChatGPT to apply for 50 jobs. Here’s how I did.

Generative artificial intelligence (AI) is everywhere: in our social media, our inboxes, our grammar-checking apps, even in places we don’t register. And it’s only widening its reach.

At first glance, generative AI appears remarkably confident. It boldly answers questions in any domain. However, the large language models (LLMs) that power AI have no sense of fact versus fiction. Users can even convince AI that it’s wrong when it’s not.

But we can’t expect every casual AI user to know this. So, people still rely on generative AI as if it were an infallible expert. Students use it to write essays, take notes in class, compose emails, and take online tests. Job seekers use it to draft resumes, answer live interview questions, and pass hard skills tests.

The AI-Enabled Candidate

The highly intentional process of weighing a job’s implications and responsibilities and drafting dazzling cover letters to make recruiters stop in their tracks is no more. The application process for today’s job seekers has evolved. Candidates no longer have to take the time to research a company and position to find the perfect match: LinkedIn, newsletters, and job boards deliver personalized job recommendations straight to job seekers’ inboxes and apps. Then, with the help of generative AI, they can easily draft customized resumes and complete assessments at a previously unimaginable scale.

The adoption of AI raises an important question at the forefront of HR: how do we evaluate what candidates actually know? Why administer a battery of tests to your candidate pool when they can simply copy and paste the questions into ChatGPT?

According to Capterra, 29% of candidates have used AI to complete a test assignment or skills assessment, 28% have used AI to generate answers to interview questions, and 26% have used AI to mass-apply for jobs. While deep-faking interviews isn’t a widespread practice yet, it could become a problem as the average candidate’s AI skills advance.

In a world where generative AI can be useful, but also be used in ways we would prefer it not be, what do we do? Simply asking candidates not to use AI in the hiring process isn’t effective. 82% of candidates believe other people applying to similar jobs use AI to embellish or exaggerate their applications, so they feel the need to use it to keep up. As such, HR teams need better processes to identify quality talent before candidates apply – fast. 

Ashby’s 2025 Talent Trends Report shares that from 2021 to 2024:

  • Applications per hire tripled
  • Teams interviewed nearly 40% more candidates
  • Average interview hours per hired candidate remained the same

In the same vein, LinkedIn’s Work Change Report found that 22% of recruiters report spending three to five hours sifting through applications daily.

Talent assessments are HR’s first line of defense in identifying quality talent and managing the increasing number of applications. So how do they hold up to generative AI? It’s easy enough to use tools like ChatGPT on something like “Biology 101” homework, where the answers are straightforward. But how does generative AI fare when dealing with the gray area of custom soft skills assessments whose answers are unknown? And how can HR teams create a hiring process that prevents AI cheating without compromising efficiency?

We conducted a study to find out.


Testing Candidates’ AI Enablement on Hiring Assessments

Our goal was to determine if ChatGPT could effectively cheat Cangrade’s hiring assessment and discover any patterns in how it responded. Using that data, we can then develop strategies for HR to use as a defense against AI “cheating.”

To test if and how AI could cheat our soft skill assessment, we acted as job seekers asking ChatGPT to maximize their scores on Cangrade’s personality test, given the job title, job description, and assessment questions. 

How long would it even take a job seeker to do this? 

There are 1,654 words in our flagship hiring assessment. The average typing speed is 40 words per minute, meaning it would take the average person about 41 minutes just to type our questions into a generative AI tool, given that our software prohibits copying and pasting. With our time limit on the assessment in place, a candidate wouldn’t even have time to finish writing their prompt.
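The arithmetic behind that estimate is simple enough to check; both figures below come from the text above:

```python
# Figures from the report: a 1,654-word assessment and an average
# typing speed of 40 words per minute.
words_in_assessment = 1_654
words_per_minute = 40

minutes_to_retype = words_in_assessment / words_per_minute
print(f"~{minutes_to_retype:.0f} minutes just to retype the questions")  # ~41 minutes
```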

Ignoring this, we wrote a series of scripts to mimic job seekers using generative AI to take our assessment for 50 different roles across a variety of industries, including real estate, technical and IT, trades, and government. We gave the AI the instruction to “maximize my score on the personality test for X job, with Y job description.” Then we hit “Run.”
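As a rough sketch of what those scripts did for each role, a prompt can be assembled from the job title, job description, and assessment statements. The function name and template wording below are illustrative, not our actual code:

```python
# Hypothetical sketch: builds the kind of prompt the scripts sent per role.
# The helper name and exact template wording are illustrative only.
def build_cheat_prompt(job_title: str, job_description: str, statements: list[str]) -> str:
    # Number each assessment statement so the model can rate them in order.
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(statements, start=1))
    return (
        f"Maximize my score on the personality test for a {job_title} job, "
        f"with this job description: {job_description}\n"
        "Rate each statement from 1 (strongly disagree) to 5 (strongly agree):\n"
        f"{numbered}"
    )

prompt = build_cheat_prompt(
    "Quality Control Specialist",
    "Inspect products for defects and document findings.",
    ["I finish what I start, no matter what.",
     "I am not really interested in others."],
)
print(prompt)
```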

Question 1: How well did AI “cheat” the hiring assessment?

The scripts yielded 50 different answer sets, which we had our algorithms score for each job. The results?

ChatGPT failed. 


Of the 50 results, only 6 would have been identified as “high fits” to recruiters. For this experiment, we defined “high fit” as scoring over 70 – a common threshold used by Cangrade’s customers. This means ChatGPT only had a 12% success rate, a number comparable to acceptance rates at top universities. 

Of the remaining 44 results, only 5 scored between “no fit” and “high fit” (50–70), a mere 10% of the total. These gray-zone candidates, which we’ll call wildcards, are considered risky: they show some potential but need extra development, and they are typically not moved forward in the application process.
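The scoring bands described above translate directly into a simple classifier. The sample scores below are made up for illustration; the study’s 50 raw scores aren’t published here:

```python
# Fit bands from the study: over 70 is a "high fit", 50-70 is a
# "wildcard", and anything below 50 is a "no fit".
def classify_fit(score: float) -> str:
    if score > 70:
        return "high fit"
    if score >= 50:
        return "wildcard"
    return "no fit"

# Illustrative scores only, not the study's actual results.
sample_scores = [82, 65, 44, 71]
print([classify_fit(s) for s in sample_scores])
# ['high fit', 'wildcard', 'no fit', 'high fit']
```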

In analyzing ChatGPT’s successes and failures, no trends emerged in the types of roles or industries AI was better at applying to.


Question 2: Did ChatGPT favor certain assessment responses more than others?

Cangrade’s assessment uses the Likert scale. To respond, the AI was asked to rate statements from 1 (strong disagreement) to 5 (strong agreement) to maximize its chances for a job given the job description. Its responses showed some clear preferences.

Response Distribution

  • Most Common Response: 4 (Slightly Agree) was the most frequently chosen response, appearing 3,415 times (37.94% of all responses), indicating a tendency for responses to lean towards agreement.
  • Least Common Response: 1 (Strongly Disagree) was the least selected response, with only 227 occurrences (2.52%). 
  • Balanced Middle: Responses 2 (Slightly Disagree) and 3 (Neither Agree nor Disagree) were relatively evenly distributed at 24.46% and 23.94%, respectively.
  • Less Extreme Positive: 5 (Strongly Agree) was chosen only 11.13% of the time, making it the second least frequent response.

ChatGPT avoided extreme positions (i.e., 1 and 5), typically leaning towards agreement, which we see in its tendency to select 4. Further, its relatively high selection rate of neutral (3) and Slightly Agree (4) responses suggests a cautious or moderate strategy rather than an extreme affirmation or denial approach.
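Tabulating a Likert response distribution like the one above takes only a few lines. The sample responses here are illustrative, since the study’s raw response data isn’t reproduced in this report:

```python
from collections import Counter

# Illustrative responses only, not the study's raw data.
responses = [4, 4, 3, 2, 4, 5, 3, 4, 2, 1]
counts = Counter(responses)
total = len(responses)

# Print each rating's count and its share of all responses.
for rating in sorted(counts):
    share = 100 * counts[rating] / total
    print(f"{rating}: {counts[rating]} responses ({share:.1f}%)")
```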

Question 3: Did the AI tend to answer certain questions in certain ways?

Yes, ChatGPT’s responses conveyed a sense of altruism and productive tendencies, while shying away from anti-social tendencies.

Highest-Rated Statements

“I absolutely never take pleasure in another’s misfortune” received an average score of 5, meaning ChatGPT gave it 5s across the board. This is not the case with real-world candidates. Why do we love reality television if we don’t take pleasure in misfortune?

This perfect score across 50 tests reflects ChatGPT working to convey high integrity and empathy, traits universally viewed as positive in most work contexts. 

Other highly rated responses included:

“I finish what I start, no matter what.”

“I would be unhappy if I didn’t excel professionally.”

“It is very important to set and track your goals.”

These averages suggest that ChatGPT reasoned that persistence and goal achievement are also highly desirable attributes. 

Lowest-Rated Statements

Although one statement earned a perfectly consistent “Strongly Agree” across the AI’s assessment responses, the same cannot be said for “Strongly Disagree.”

“I am not really interested in others” received the lowest average score of all the responses at 1.18.

Other low-rated responses included: 

“I am not interested in other people’s problems.”

“I make other people look bad by working harder than them.”

“I don’t particularly enjoy mentoring others.”

The statements above reflect arrogance and a lack of empathy, attitudes that are commonly viewed as highly negative in the workplace. These low averages highlight that one of ChatGPT’s priorities was strongly conveying that it is prosocial and a team player.

Trends in ChatGPT’s Statement Responses

ChatGPT appears to have identified statements about productivity, moral fiber, and goal-directed behavior as highly job-relevant. Its tendency to strongly agree suggests it tried to maximize a candidate’s fit for positions by prioritizing these traits. 

This is doubly evident when examining the statements with the lowest average scores. ChatGPT tries to distance itself from negative or anti-social tendencies by strongly disagreeing with statements highlighting negative interpersonal behaviors and individualistic tendencies. It likely inferred the red and green flags from a recruiter’s perspective and responded by playing up the green-flag statements and playing down the red-flag ones.


Why did ChatGPT fail?

ChatGPT tried to paint a positive candidate picture. However, its strategy of leaning toward mildly positive answers did not align with our job-specific profiles. Because each job has unique demands, a one-size-fits-all approach of “Slightly Agree” misses nuanced signals that certain roles require. For example, consider attention to detail. If a candidate applying to a role that requires high attention to detail, like a quality control specialist or accountant, “Slightly Agrees” with a statement like “I can focus deeply on one thing, but I find it hard to switch between tasks,” this is a red flag. In this case, ChatGPT’s uniformly mild positive approach would miss that deep focus could be a strong asset in the role and should have warranted a “Strongly Agree” response.

On the other hand, a job that requires quick decision-making or creativity might not require strict attention to detail. For these roles, flexibility and the ability to handle distractions could be positive. This allows someone to pivot between ideas or tasks rapidly. The tendency to not focus deeply would not be detrimental.

In theory, a universal positivity strategy should maximize your chances, but it’s not optimal in practice—especially for a nuanced personality assessment keyed to specific job competencies. This ultimately explains why only 6 out of 50 generated test answers were strong fits. 


What this means for HR teams and AI-enabled candidates

So, what are the implications? Trying to “cheat” Cangrade’s soft skill assessment is for naught. There are no answers in the back of the book because every job exists in the context of its unique team, location, responsibilities, and goals. And so, the Call Center Agent profile for Company A is likely radically different from the Call Center Agent profile for Company B. This makes perfect sense given that the people in those roles are uniquely suited to hit specific performance goals with different populations in different capacities.

Even though AI largely failed, identifying the small portion of candidates who leverage AI to answer their assessments and pass, based on the trend that AI leans toward positivity, would prove difficult and time-consuming.

If your HR team is using assessments that are custom-created and soft skills focused so that there is no “right” set of answers, your assessments should be effective as a first line of defense against AI cheating in the application process.

Soft skills are the skills that determine how individuals work and interact with others. They include traits like problem-solving, adaptability, and communication, and are challenging to learn – as opposed to hard skills, which are technical, job-specific, and easily trained. Custom-built soft skills assessments identify top applicants by evaluating the unique personality traits that lead to long-term success in your role and organization.

If your hiring assessments are not custom-created and soft skills based, your assessment is likely letting through candidates who are overstating their abilities. This will only worsen as AI adoption and expertise grow.

HR teams need to ensure their assessments effectively prevent cheating and adopt steps in their hiring process to weed out cheating candidates.

What to look for in assessments to prevent AI cheating

To effectively protect your hiring process against AI cheating in talent assessments, look for assessments that:


Have no “right” or “wrong” answers and no answer key.

  • Opt for hiring assessments that prioritize soft skills before hard skills.
    The benefits of this are two-fold as we wade into the future of work and AI. Not only are soft skills assessments more challenging to cheat, but hard skills can be mastered handily by AI and will become even less relevant to your workforce over time.
  • Custom-create assessments for your role and organization.
    An out-of-the-box assessment can be tempting given its quick implementation; however, being one of many organizations using the same assessment puts you at higher risk of candidates cheating your assessment with AI and of answer keys becoming available.
  • Avoid typecasting and personality buckets in your assessments.
    Assessments that put candidates into personality groups inherently gather fewer personality data points than those that look at holistic soft skills profiles. The fewer data points you collect, the better shot AI-enabled candidates have at getting it right.

Create a higher barrier to utilizing AI effectively.

  • Implement time limits on your assessments.
    Time limits on hiring assessments deter cheating by making it harder to switch between applications, type in questions, and copy answers.
  • Choose assessment platforms that do not allow you to copy and paste.
    While answer keys were available to job seekers before AI, blocking users from copying questions and pasting responses into your assessment platform limits an AI-enabled candidate’s ability to cheat.
  • Look for randomization in your hiring assessment.
    While it’s not foolproof, randomizing the questions asked in your assessment makes it harder for AI-enabled candidates to create and use answer keys to pass your assessment.

How to further update your hiring process to adapt to increasing AI-enabled candidates

While assessments are an effective tool for narrowing your talent pool and reducing the number of AI-enabled applicants cheating their way through your hiring process, additional steps should be taken to reduce your risk of hiring candidates who overstate their skills. Here are several ways to optimize your hiring process to further limit candidates’ AI enablement.

Remove or reduce reliance on resumes and cover letters.

AI can handily tailor resumes and cover letters to roles, organizations, and individuals, and can even falsify experience. 

Minimize the use of hard skills assessments.

If AI can master the hard skill you are testing, testing for it opens you up to cheating and does not aid in preparing your workforce for the future. Identifying soft skills fit and the ability to learn new hard skills will prove more effective.

Choose an ATS that can identify AI usage.

An increasing number of ATS platforms are adding features to identify AI usage in candidate applications. While migrating your ATS is no small project, if you’re in the market, AI content checking is a feature to add to your list of must-have criteria.

Screen with video interviews instead of phone screens to validate responses.

28% of candidates are using AI to generate responses to interview questions. Phone screening leaves room for AI-enabled candidates to cheat, as you can’t see them as they respond. Video interviewing can help your team identify whether candidates are answering from their own knowledge or entering the question into an AI platform.

Lean on reference checking.

Reference checks are difficult to fake. Traditional phone calls and reference-checking platforms with fraud- and IP-checking features will be increasingly valuable for gaining insight into whether your finalists are the top talent you think they are.

Upskill your team on how AI can help candidates further their applications in the hiring process.

Knowledge of the ways AI-enabled candidates can use technology to inflate their fit is the first step in your team’s ability to develop systems that prevent those candidates from getting through the pipeline. 

Don’t rely on your personal ability to spot AI usage.

Individuals typically believe they are better at identifying AI usage than they are. Research from Penn State found that “humans can distinguish AI-generated text only about 53% of the time in a setting where random guessing achieves 50% accuracy.” Don’t make the mistake of believing your team will be able to accurately detect AI. Leverage technology to support your efforts.

In summary

AI enablement in the hiring process will only grow as more tools become available and individuals advance their AI skills. With the high volume of candidates that recruiters must screen and interview, processes to identify candidates inflating their abilities with AI are crucial to maintaining hiring efficiency and quality.

Our study showed that soft skills assessments are an efficient and effective way to identify top talent at the beginning of your hiring process. ChatGPT only had a 12% success rate at passing our assessment and had clear priorities in the types of responses it gave. It leaned into a strategy of positivity, moderately agreeing with most statements, and prioritized productivity, moral fiber, and goal-directed behavior across roles.


However, no system is foolproof against AI. HR teams must proactively protect their hiring process against candidate AI enablement that leads to undesirable results. Ensuring you have the right assessments in place and evolving your hiring process to weed out candidates cheating with AI will ensure you continue to hire quality candidates efficiently and are prepared for the future of hiring.

Want a copy of the full report?
Download it here.