Things to Consider When Using ChatGPT

How Does ChatGPT Compare to Similar AI Tools? Does It Ever Go Wrong?

Pranath Fernando

Oct 14, 2024

AI Tool: ChatGPT
Level: Beginner
Access: Free

This guide is part 2 of the free course Learn how to use ChatGPT.

In this guide, we will cover:

Can ChatGPT make mistakes?
Potential pitfalls of LLMs
The deeper reason why LLMs make mistakes
Hallucination Rates of ChatGPT
Differences between ChatGPT and similar tools
Limitations of ChatGPT

Can ChatGPT Make Mistakes?

ChatGPT, like other Large Language Models (LLMs), has undergone extensive training, helping it give impressive responses with consistency.

However, it is also important to understand that this technology is relatively new and not infallible, and can occasionally make mistakes for several reasons.

For instance, ChatGPT can make factual errors, especially with complex or specialised information.

It is crucial to double-check all content generated by ChatGPT for tasks requiring accurate information.

Also, ChatGPT can generate biased responses because the data it was trained on is itself biased to some extent.

Potential Pitfalls of LLMs

LLMs may provide inaccurate or misleading answers, which are known as AI hallucinations. This can happen for a range of reasons such as:

Ambiguity in Input Questions: When your question lacks clarity, the AI may make inaccurate assumptions, leading to incorrect responses.
Gaps in Knowledge: If the required information is absent from the AI's training data, it may struggle to provide accurate answers.
Inaccurate Information: Some of the sources that AI learns from, such as web pages or social media, may contain inaccurate or false information.
Design Limitations: ChatGPT can be constrained by its inability to access real-time information and to understand context beyond its training cut-off date. However, some versions of ChatGPT do have access to live information.

The Deeper Reason Why LLMs Make Mistakes

LLMs don’t live in the world like we do; they don’t have eyes or ears, and they can’t smell or taste things.

Their world is text.

Learning to generate human-like content is not the same as knowing the difference between what is real and true.

Fundamentally, LLMs can’t tell the difference between what's real and what's not, and that is why it's a significant challenge to eliminate all mistakes made by LLMs.

Hallucination Rates of ChatGPT

Hallucination rates vary between LLMs. It is generally understood that rates for many popular LLMs range between 2.5% and 8.5%, with some higher at around 15%.

Recent studies have shown that ChatGPT can have significantly higher hallucination rates than other AI tools. For instance, a study on the performance of ChatGPT and Bard in replicating the results of human-conducted systematic reviews found that the hallucination rates for ChatGPT were 28.6% for GPT-4 and 39.6% for GPT-3.5.

Moreover, the study highlighted that these hallucinations were not just minor errors but often involved significant inaccuracies, such as incorrect titles, authors, or publication years.

This underscores the importance of verifying information generated by ChatGPT, especially for critical tasks like research.
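To make the idea of a hallucination rate concrete, here is a minimal Python sketch of how such a figure could be calculated after checking AI-generated references by hand. It is not the method used in the study above: the CheckedReference class, the hallucination_rate function, and the sample data are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckedReference:
    """One AI-generated reference after a human has checked it (hypothetical structure)."""
    title: str
    verified: bool  # True if a person confirmed the reference exists as cited


def hallucination_rate(references):
    """Return the share of references that could not be verified, from 0.0 to 1.0."""
    if not references:
        raise ValueError("need at least one checked reference")
    unverified = sum(1 for ref in references if not ref.verified)
    return unverified / len(references)


# Made-up sample data, for illustration only
checked = [
    CheckedReference("Paper A", True),
    CheckedReference("Paper B", False),
    CheckedReference("Paper C", True),
    CheckedReference("Paper D", True),
]

rate = hallucination_rate(checked)
print(f"Hallucination rate: {rate:.1%}")
```

In this made-up sample, one of the four references could not be verified, so the rate is 25%. A real evaluation would need many more references and a clear rule for what counts as verified.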

Differences Between ChatGPT and Claude

While ChatGPT has many similarities to other AI tools like Claude, there are key differences:

Model Performance and Updates: ChatGPT's GPT-4 is a highly advanced model, but its performance can vary compared to other models like Claude 3 Opus.
Feature Sets and Capabilities: ChatGPT has a comprehensive feature set, including the GPT Store for custom GPTs. However, Claude handles complex or lengthy information better due to its larger context window.
Ethical Guidelines and User Expectations: ChatGPT adheres to ethical standards, but its focus may differ from other LLMs like Claude, which prioritises transparency and explainability.
Conversation Handling: ChatGPT excels in handling nuanced and context-dependent conversations, allowing for more human-like interactions. However, Claude provides more detailed and structured responses, making it easier to work with for AI content marketing.

Limitations of ChatGPT

While ChatGPT can do many tasks, it does have some key limitations.

Accuracy Issues: ChatGPT can make factual errors, especially with complex or specialised information. It is crucial to double-check all content generated by ChatGPT.
Biased Answers: ChatGPT may generate biased responses due to the data it was trained on, which can lead to the propagation of harmful stereotypes and biased results.
Lack of Human Insight: ChatGPT lacks human-level common sense and emotional intelligence, which can lead to nonsensical or overly literal responses.
Overly Long or Wordy Answers: ChatGPT’s training datasets encourage it to cover a topic from many different angles, answering questions in every way it can conceive of. This can make ChatGPT’s answers overly formal, redundant, and very lengthy.
Trouble Generating Long-Form Content: ChatGPT struggles with producing long-form structured content, making it better suited for shorter pieces like summaries or brief explanations.

We can also better understand ChatGPT’s limitations in comparison to Claude.

Comparison of Writing Styles

A comparison of writing styles between ChatGPT and Claude reveals that ChatGPT's writing style is more polished and blog-appropriate but often repetitive and reliant on clichéd phrases. In contrast, Claude's output is more natural sounding, specific and detailed, although it may require more editing to refine the tone of voice.

Logic and Reasoning

Both ChatGPT and Claude perform well in logic and reasoning tasks, but they approach these tasks differently. ChatGPT provides more detailed responses, while Claude delivers its output in shorter, punchier paragraphs.

Practical Applications

In practical applications, the choice between ChatGPT and Claude depends on the specific needs of the task. For example, Claude might be better suited for tasks requiring detailed and structured responses, while ChatGPT might be more appropriate for tasks requiring nuanced and context-dependent conversations.

Ethical Considerations

Ethical considerations also play a crucial role in choosing between ChatGPT and Claude. For instance, Claude's focus on transparency and explainability may make it more suitable for tasks where ethical considerations are paramount.

Next in this course: Using ChatGPT for personal tasks

The FuturAI
