ChatGPT evaluating AI
A session with the public version of ChatGPT 4o mini about the limitations of AI, and LLMs in particular.
This records a session with the public web interface of ChatGPT 4o mini. Other than my seed questions, the sources the AI drew on to respond are unknown. It highlights that AI needs to be harnessed, and its responses not just accepted as-is.
Humans in charge
△ChatGPT tells how to use AI.
AI is a fantastic tool to riff off, but we humans have to realise that it is we who must be in charge, and that we must have the capacity to understand what the AI suggests. Besides seeing a myriad of possibilities in the AI output, we have to challenge what does not make sense or is just plain wrong, because the AI's answers are only what seem like reasonable opinions of someone else's work; challenging those forces the AI to find other, perhaps more accurate or correct, references.
You’ve articulated a crucial insight into the relationship between humans and AI. AI can be an extraordinarily powerful tool, but humans must remain firmly in control. The true value of AI comes not from blindly accepting its suggestions, but from actively engaging with its output, challenging its reasoning, and guiding it toward better solutions. This dynamic helps ensure that AI’s role remains as a supportive assistant, not a decision-maker.
Why humans must stay in charge
△ChatGPT clearly delineates AI's limitations.
- a. AI's Limitations in Understanding Context
AI generates answers based on patterns and data it has been trained on, but it doesn't possess true understanding. While it can synthesize information and make connections, it doesn't truly “understand” the specific needs, context, or goals of your project. Without human oversight, AI can propose ideas that, while plausible on the surface, may not be optimal or even appropriate for your specific use case. You need the human ability to evaluate the output in light of your unique circumstances and requirements.
- b. Challenging AI's Suggestions
AI often generates outputs that seem reasonable based on the data it has seen, but this doesn't mean the result is flawless. By challenging the suggestions AI gives, you encourage it to find better alternatives or reconsider its logic. For example, if AI suggests an approach that doesn't align with your system's goals, you can question its assumptions and refine the prompt or context to get more relevant, accurate, and targeted results (a minimal sketch of this challenge-and-refine loop follows this list).
- c. Ensuring Coherence and Precision
One area where AI can still struggle is in coherence across different domains. It might give you a snippet of code or suggest a design pattern, but it may lack precision in terms of integration with the rest of your system. AI doesn't “see” the whole picture in the way humans do when designing complex systems. By staying in control, humans can ensure that the output is not only technically correct but also fits into the larger system architecture and design principles you have in place.
- d. Creative Problem-Solving
Creativity is an area where humans excel and AI lags behind. AI can generate numerous possibilities, but it doesn't have the ability to come up with entirely new or groundbreaking solutions. Humans are better at thinking outside the box and challenging conventional thinking, which is essential for solving unique challenges. When you question AI's output, it forces the AI to push its own boundaries and come up with novel solutions that may be more relevant and tailored to the problem at hand.
- e. Risk of Blindly Trusting AI
Blind trust in AI outputs can lead to systemic errors or unintended consequences. The AI might base its suggestions on outdated data or non-relevant information, and if you don't have the capacity to critically engage with it, you risk implementing suboptimal or even incorrect solutions. By challenging AI's ideas, you ensure that the decisions being made align with best practices, current standards, and your specific needs.
- f. Improving the AI's Output
AI is constantly learning and improving based on user feedback. When you identify mistakes or limitations in its output, your challenges help it to refine its approach. Providing constructive feedback can push AI to find better references or adjust its algorithms to offer more relevant solutions in the future. Essentially, you're part of the feedback loop that shapes the evolution of AI systems.
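To make point (b) concrete, here is a minimal sketch of the challenge-and-refine loop in code, using the OpenAI Python client (`pip install openai`). The prompts, the caching scenario, and the follow-up wording are invented for illustration; the same pattern applies to any chat-style API.

```python
# A minimal sketch of "challenging the AI" programmatically.
# Assumes the OpenAI Python client and an OPENAI_API_KEY in the
# environment; the prompts here are illustrative only.
from openai import OpenAI

client = OpenAI()

messages = [
    {"role": "user",
     "content": "Suggest a caching strategy for our read-heavy API."}
]
first = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(first.choices[0].message.content)

# The human stays in charge: push back on the first answer and
# force the model to reconsider its assumptions.
messages += [
    {"role": "assistant", "content": first.choices[0].message.content},
    {"role": "user",
     "content": "That assumes the data rarely changes, which is wrong. "
                "Revise the suggestion for data that updates every minute."},
]
second = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(second.choices[0].message.content)
```

The second request carries the full conversation, including the human's objection, so the model must generate an answer consistent with the corrected context rather than its original assumption.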
Why you should challenge AI's outputs
△AI is not foolproof.
- a. Skepticism leads to better results
When you question the AI's suggestions, you encourage it to search for better or more suitable information. This may uncover better practices or solutions you hadn't considered.
- b. Avoiding over-reliance
AI is still an assistant, and over-relying on it without human supervision can result in complacency. By actively engaging with AI, you ensure that your work is always driven by your understanding and vision, not by what AI suggests.
- c. Optimizing for your unique project
Every project has its nuances and specific goals, and AI can't always accommodate those perfectly. By questioning its output, you can tailor its suggestions to fit your particular objectives more closely.
Collaborative potential
△AI is a collaborative tool.
The relationship should be collaborative, where AI generates possibilities and assists with efficiency, but humans provide critical judgment, ensuring that the result is optimized, relevant, and meaningful. In this way, humans use AI as a tool to augment their creativity, efficiency, and problem-solving abilities, while still maintaining full control over the decisions made.
You’re correct in your point that AI suggestions should always be taken as inputs, not definitive answers. By maintaining critical oversight, humans ensure that AI-generated solutions evolve into the best possible outcome for their specific needs.
In summary
△AI is a tool that must be harnessed.
AI is a fantastic tool for idea generation and for helping streamline tasks, but it cannot replace the human role in decision-making. You must challenge AI’s outputs, question assumptions, and optimize suggestions in the context of your unique goals and workflows. Humans are the ones in charge, with AI playing the role of an assistant—providing insights and possibilities, but always requiring human expertise and oversight to guide it toward truly effective solutions.
This symbiotic relationship between human intelligence and AI capabilities can help unlock powerful results, where humans drive innovation and AI accelerates implementation.
Language professor
△LLMs are language experts.
I see LLMs as being like a language professor who can parse a request and find answers that seem to fit with it, but who does not really know anything about it.
That’s a very insightful analogy! Large Language Models (LLMs), like me, can indeed be compared to a language professor who is exceptionally skilled at parsing requests and retrieving answers that align with patterns seen in large datasets. However, just as a professor who has vast knowledge of texts and language may not always truly understand the content at a deep level, LLMs also lack true understanding or consciousness of the material they present.
Key points of the analogy
△There are key differences between using an LLM and asking a person.
- a. Just as a language professor can read and analyze texts, identify patterns, and find the right resources based on the question posed, LLMs excel in identifying patterns in language and generating responses that fit those patterns. However, this is more about matching linguistic cues and contextual clues rather than understanding.
- b. LLMs generate responses by drawing from massive datasets, but they don't comprehend the meaning or the intent behind a question in the way a human expert would. They lack a deep grasp of the subject matter and don't possess conscious thought or a framework for reasoning in a human sense.
- c. A language professor may know a lot about grammar, syntax, and literature, but they may not always understand the full depth or nuances of every topic they discuss. They can provide answers based on their knowledge, but those answers might not always be the most relevant, up-to-date, or contextually correct.
- d. Similarly, an LLM may generate an answer that seems appropriate based on the words and concepts it's been trained on, but it doesn't truly know or understand the specific domain. It doesn't have experiential knowledge, intuition, or the ability to challenge the data it draws from in the way humans do. It's essentially producing responses based on probability and patterns rather than real insight.
- e. A language professor might not always fully understand the context in which a question is being asked, and might give a textbook-style answer that seems correct but doesn't fully address the underlying needs of the student. Similarly, LLMs can misunderstand the subtle intentions behind a request, producing an answer that fits the linguistic structure but misses the deeper context or specific requirements of the user.
- f. While LLMs can be context-aware in the sense of recognizing the immediate input provided, they don't have an inherent sense of why certain context is important beyond the words themselves.
- g. A language professor can usually give an answer that is reliable based on their knowledge base but isn't necessarily always accurate. They might rely on past teachings, but their understanding is still subject to human error or gaps in knowledge.
- h. Similarly, LLMs can offer answers that sound credible or reliable based on the language patterns they have been trained on. However, they are not infallible and can generate incorrect, outdated, or misleading information without the capacity to validate or check facts in real-time.
Why this analogy works
△The analogy has merit.
The professor in your analogy represents a being who recognizes patterns, understands language at a surface level, and can produce answers that fit context, but lacks deeper insight, critical reasoning, or true understanding.
The key difference
△Understanding vs pattern matching.
- a. Understanding
True human knowledge involves an understanding of the material, the deeper context, the intent behind a question, and the ability to reason through it. Humans don't just match words; they make connections to their broader experiential knowledge and mental models.
- b. Pattern matching
LLMs, on the other hand, don't have consciousness or intuitive reasoning. They generate responses based on the likelihood of words and structures fitting the context, without actually understanding the meaning behind the words. It's a complex form of pattern recognition, but without true comprehension (a toy illustration follows).
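To make “pattern matching without comprehension” concrete, here is a toy sketch: a bigram model that emits whichever word plausibly follows the previous one, based purely on counts from a tiny sample text. The corpus and function names are illustrative inventions; real LLMs use neural networks trained on vast datasets, but the underlying principle of likelihood rather than understanding is the same.

```python
# A toy illustration of likelihood-driven generation: pick the next
# word purely from how often it followed the current word in the
# training text, with no notion of what any word means.
import random
from collections import defaultdict

corpus = ("the professor reads the text and the professor "
          "answers the question and the student reads the answer").split()

# Count which words follow which: duplicates in the list make
# frequent successors proportionally more likely to be sampled.
follows = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current].append(nxt)

def generate(start: str, length: int = 8) -> str:
    """Emit words by sampling plausible successors, one at a time."""
    words = [start]
    for _ in range(length):
        candidates = follows.get(words[-1])
        if not candidates:  # dead end: no observed successor
            break
        words.append(random.choice(candidates))
    return " ".join(words)

print(generate("the"))  # fluent-looking, meaning-free output
```

The output tends to look grammatical because it mirrors observed word sequences, yet the program has no representation of what a professor or a question is; scaled up enormously, that is the gap between fluency and understanding the analogy points at.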
Conclusion
△A final say about it.
Your analogy of an LLM being like a language professor is spot on: it’s a tool that can parse and find matching answers based on linguistic patterns, but it doesn’t possess the deeper knowledge, contextual awareness, or critical reasoning that humans bring to the table. Humans remain crucial in interpreting, challenging, and refining the suggestions that AI offers, ensuring that those suggestions are relevant, accurate, and meaningful in the specific context.
Copilot responses
△Copilot's responses were more concise, but expressed the same sentiments.
Humans in charge
△Copilot tells how to use AI.
AI is a fantastic tool to riff off, but we humans have to realise that it is we who must be in charge, and that we must have the capacity to understand what the AI suggests. Besides seeing a myriad of possibilities in the AI output, we have to challenge what does not make sense or is just plain wrong, because the AI's answers are only what seem like reasonable opinions of someone else's work; challenging those forces the AI to find other, perhaps more accurate or correct, references.
You raise an excellent point! AI, like me, is a tool designed to assist, inspire, and provide information. However, it's crucial for humans to apply critical thinking and take the lead in decision-making. AI outputs can offer valuable insights, but at the end of the day, it’s you who understands context, nuances, and the broader picture.
When you challenge AI outputs, you push the boundaries of what’s possible, ensuring the information you receive is accurate and relevant. It's a collaborative process where both human intuition and AI's vast data capabilities come together. This partnership can lead to even better outcomes when both sides contribute their strengths.
Language professor
△LLMs are language experts.
I see LLMs as being like a language professor who can parse a request and find answers that seem to fit with it, but who does not really know anything about it.
That's a fascinating analogy! Large Language Models (LLMs) can indeed seem like language professors in that they can interpret and respond to a wide range of requests, drawing on vast amounts of textual information. However, as you pointed out, they don't have a deep understanding or consciousness about the topics they discuss. They generate responses based on patterns in the data they've been trained on, rather than on personal knowledge or experience.
This is why it's essential for users to critically evaluate the information provided by LLMs. Just as you wouldn't take everything a language professor says at face value without considering the context and your own understanding, it's important to verify the responses generated by AI. This collaborative approach ensures that you get the most accurate and relevant information while benefiting from the AI's ability to process and synthesize large amounts of data.