3 Of The Best AI Text Detection Tools Available (According To Our Own Testing)

By Emma Street Dec. 30, 2024 2:45 pm EST

Two male humans in suits sitting on chairs with a robot sitting between them

Stokkete/Shutterstock

Generating content with AI has been possible since the early 2010s, but it really took off in November 2022 with the release of OpenAI's ChatGPT, built on GPT-3.5. Suddenly, everybody seemed to be generating essays, job application letters, social media posts, and even poetry by simply typing a prompt into a large language model (LLM). Alongside the development of artificially generated text, there's been an equally rapid growth of AI detection tools. A recent study (PDF) by the Center for Democracy & Technology found that 68% of US educators have used AI detection tools to check students' essays. So what are these tools, and how do they work?

Detectors are a way of using artificial intelligence to spot artificial intelligence. They use algorithms trained on human-written and AI data to analyze linguistic patterns like repetitive phrasing or unnatural word frequency. Some also look for inconsistencies or superficial reasoning. Most AI detection tools return a percentage score indicating how much text is likely to be human-written and how much is AI-generated. It's a challenging job, given that LLMs are getting better all the time. We tested different AI detection tools using four pieces of text, two written by humans and two produced by ChatGPT, which you can read about in more detail in the Methodology section.

If you need to check someone else's text to see if it was written by a robot, check out our three best tools below. If, however, you want to use AI detection tools to test your own AI-generated work so you can submit it and not get caught, then beware. Our tests showed a huge difference in results across different applications. If whoever checks your work uses a different AI detection tool, you could still get caught out.

QuillBot is best for unlimited checks

Screenshot of QuillBot AI Detection result showing AI generated text and a score of 66%

Emma Street/QuillBot

QuillBot performed well in tests, identifying both examples of non-AI written content successfully, with scores of 100% human. It also recognized the AI-generated content, although it did think that 34% of the AI factual prose and 7% of the AI fiction writing were written by a human. It's fast, free to use, and can check English, Spanish, German, and French text. You can either paste in text or upload DOCX or PDF documents. The free version limits you to 1,200 words at a time, but there's no limit to how many checks you can run, so you can still check longer texts if you break them into chunks of 1,200 words or fewer.

Alternatively, you can pay for a premium version ($8.33 per month, billed annually) for unlimited text length. QuillBot also has other features, including a Paraphraser, plagiarism checker, and content summarizer. As well as its percentage scores for "AI-generated" and "Human-written" content, there are two other options: "AI-generated and AI-refined" and "Human-written and AI-refined." These are only available in the English language version. However, in my tests, the scores for these categories were 0% across all four documents.
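
If you want to stay on the free plan, the chunking workaround mentioned above is easy to automate. The snippet below is a minimal sketch, not anything QuillBot provides: the function name is made up for illustration, and it assumes that counting whitespace-separated words is close enough to how the tool counts them. It also flattens paragraph breaks, so treat the output as something to paste in, not to publish.

```python
def split_into_chunks(text, max_words=1200):
    """Split text into pieces of at most max_words whitespace-separated words.

    Hypothetical helper: 1,200 is the free-plan limit quoted in this article.
    """
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    sample = "word " * 2500  # stand-in for a 2,500-word essay
    chunks = split_into_chunks(sample)
    print([len(chunk.split()) for chunk in chunks])  # [1200, 1200, 100]
```

Each chunk is scored on its own, so a long document ends up with several separate results and no single overall figure.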

Sapling is best for quick results

Screenshot of Sapling AI Detector page showing a large text box containing text from the Hitchhiker's Guide to the Galaxy

Emma Street/Sapling

Sapling returns results almost instantly and was completely successful in identifying the AI-generated non-fiction and both human-written texts correctly. However, it didn't manage to spot the AI fiction and returned a result of only 26% AI-generated. The free version doesn't require a login, although you will have access to more features if you create an account.

The free plan limits you to 2,000 characters per search, but you can upgrade to one of the paid plans from $25 per month or subscribe annually to pay the equivalent of $12 per month. The paid plan removes the character limit and gives you access to other features like web integration and an autocomplete feature that you can use on CRM and email platforms like Salesforce or Gmail. You can, however, get along perfectly well with the free version as long as you don't mind the character limit. It's fast, easy to use, and accepts pasted text, DOCX files, and PDFs.

Smodin is best for non-fiction text

Screenshot of Smodin AI Detection results showing AI generated fiction and a score of 13%

Emma Street/Smodin

Smodin recognized that the human-generated content in our test was 0% AI-generated. It also reported that the AI factual text was 96% AI. It performed less well when spotting the ChatGPT-written fiction piece, giving it a score of 16% AI. Many of the detectors were less accurate at understanding the difference between human-written and AI-generated fiction. If you only need a tool that works for non-fiction, then Smodin works well.

It generated our results in under ten seconds, and you can upload DOCX or PDF files. However, it comes with one big drawback. Its free plan is extremely limited, allowing you only five checks per week. The cheapest paid-for plan is $15 per month at the time of writing (but cheaper if you pay annually), which makes it pricier than QuillBot or Sapling. This includes additional features like AI text generation, plagiarism detection, summarizer, and translator. If you stick with the free plan, Smodin has a handy bar on the front page, which makes it easy to see how many entries you have available and how long you've got until the next refresh.

Runners Up: AI detection tools that didn't quite make the top three

Two screenshots overlapping on a beige background. Screenshots show Winston and Copyleaks AI detection pages

Emma Street

Copyleaks was a top contender for a place in the top three. A drawback was that, unlike the previous tools, the free version doesn't provide percentages for the amount of text it detects as AI. It either says, "This is human text," or "AI Content Detected." It was 100% accurate with the four text documents, but when I gave it a piece of writing that was half human-written and half AI-generated, it simply returned a result of "AI Content Detected." This makes the free version a bit of a blunt tool, but you get more nuance with paid versions.

Winston AI provided decent results. However, the limitations of its free plan meant that I could only test three of the four example texts before I ran out of credit. It gave the AI-generated factual piece a score of 85%, which was less accurate than Sapling or Smodin, but it was better able to detect AI fiction than most of the other detectors.

The AI detection tools to avoid

Image showing overlapping screenshots of Undetectable AI, GPT Zero, and Merlin AI detector web pages

Emma Street

GPTZero allows you to scan text using its Basic Scan for free without a requirement to sign up. It allows you three scans before you need to sign up for a free account, although the free account gives you fewer features than many other free accounts. You can only see highlighted likely AI passages if you opt for a paid account. In tests, it performed well with the factual pieces, giving the AI-generated text and human-written text scores of 98% and 0%, respectively. However, it was unable to tell the difference between human and ChatGPT fiction. Douglas Adams' writing received a score of 59% human, making it only marginally more human, in GPTZero's opinion, than the AI-generated sci-fi, which got 58%.

Undetectable AI claims to check text against multiple AI detection tools, including QuillBot and Sapling. Yet its results didn't match those we got when using the tools directly. Every one of the four test articles came back as human-written. I did manage to make it detect some AI content by pasting in some shockingly bad ChatGPT examples, but the writing needs to be unnatural and cliche-ridden before Undetectable thinks it was produced by AI.

The absolute worst AI detection tool we tested was Merlin AI, whose scores bore very little resemblance to how the writing examples were produced. My factual article resulted in a score of 40% AI, so it did at least consider it to be slightly more human than the GPT version, which scored 78%. When it came to detecting AI fiction, it was completely off-beam. ChatGPT's story returned 45% AI-generated, while the preface to the "Hitchhiker's Guide to the Galaxy" was, in Merlin's opinion, 97% AI-generated, which is quite a feat for a book published in 1979.

Methodology

Close up picture of a woman's hands typing on a laptop

PeopleImages.com - Yuri A/Shutterstock

We only tested text-based AI detection tools, although similar image and video tools are also available. We focused on products that were free to use, although all came with advanced paid options. We used four pieces of text; two were factual articles, and two were pieces of fiction. For the factual content, I used the words from my entirely human-written LinkedIn article. I then generated an article of similar length with the same title on ChatGPT.

To see how good the tools were at spotting original fiction and AI-generated stuff, I used the preface to Douglas Adams' "Hitchhiker's Guide to the Galaxy." Then I took the first eighteen words ("Far out in the uncharted backwaters of the unfashionable end of the western spiral arm of the Galaxy...") and told ChatGPT to use them as a starting point for the first 600 words of a sci-fi novel. I removed any framing text around ChatGPT's answers but did not make any other changes.

The AI detection tools were scored on their accuracy. I also took into account how easy they were to use and gave higher rankings to tools without overly restrictive limitations on their free plans. In judging the results, we considered that false positives (where human-written text was reported as AI) were a bigger problem than false negatives, where AI content was missed. This is because, as AI models continue to improve, some non-human-generated content is bound to slip through the net, annoying as that might be. However, the consequences of human-written prose being flagged as AI are much more serious and may be completely undeserved.
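
To make the false positive and false negative distinction concrete, here is a rough sketch that tallies both kinds of error for three of the tools, using only the percentages quoted in this article. It is an illustration, not the scoring method used for the rankings. Two things in it are assumptions: the 50% cut-off for calling a text "flagged as AI", and the conversion of GPTZero's "% human" fiction scores into "% AI" by subtracting from 100. The names `REPORTED_SCORES` and `count_errors` are made up for the example.

```python
# Each entry: (description, is_ai_written, reported_ai_percent)
REPORTED_SCORES = {
    "Smodin": [
        ("human factual", False, 0),
        ("human fiction", False, 0),
        ("AI factual", True, 96),
        ("AI fiction", True, 16),
    ],
    "GPTZero": [
        ("human factual", False, 0),
        ("AI factual", True, 98),
        ("human fiction", False, 41),  # reported as 59% human (assumed 100 - 59)
        ("AI fiction", True, 42),      # reported as 58% human (assumed 100 - 58)
    ],
    "Merlin AI": [
        ("human factual", False, 40),
        ("AI factual", True, 78),
        ("human fiction", False, 97),
        ("AI fiction", True, 45),
    ],
}


def count_errors(samples, threshold=50):
    """Return (false_positives, false_negatives) for one tool.

    A text counts as flagged when its AI score is at or above the
    threshold. The 50% default is an assumption for illustration only.
    """
    false_positives = sum(
        1 for _, is_ai, score in samples if not is_ai and score >= threshold
    )
    false_negatives = sum(
        1 for _, is_ai, score in samples if is_ai and score < threshold
    )
    return false_positives, false_negatives


if __name__ == "__main__":
    for tool, samples in REPORTED_SCORES.items():
        fp, fn = count_errors(samples)
        print(f"{tool}: {fp} false positive(s), {fn} false negative(s)")
```

With that cut-off, all three tools miss the AI fiction, but only Merlin AI also flags a human-written text, which is the more serious kind of error.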