ChatGPT is improving at an alarming rate. From a fledgling scholar to a candidate for Mensa, the newest version of ChatGPT (ChatGPT-4) is leaps and bounds ahead of the previous model, ChatGPT-3.5. OpenAI's ChatGPT-4 is reportedly one of the largest language models created to date. The biggest difference between ChatGPT-4 and its predecessors is its ability not only to pass different examinations, but to score highly on them. Among the other improvements, ChatGPT-4 can now be spoken to in twenty-six different languages, and it can recognize images and even describe them to visually impaired people.
ChatGPT-4 can pass a multitude of tests with scores in the upper percentiles, beating out its predecessor, ChatGPT-3.5. Here’s a sample of the results:
- The bar exam: GPT-4 was in the 90th percentile with a score of 298 out of 400. GPT-3.5 came in the 10th percentile.
- The SAT: GPT-4 scored 1400 out of 1600, ranking in the 89th percentile of test-takers. GPT-3.5 scored 1260.
- AP exams: GPT-4 received a 5 on Art History, Biology, Environmental Science, Macroeconomics, Microeconomics, Psychology, Statistics, US Government, and US History, according to OpenAI. GPT-3.5 received a 5 only on Art History and Psychology.
- Sommelier exams: GPT-4 has also passed the Introductory Sommelier, Certified Sommelier, and Advanced Sommelier exams with scores of 92%, 86%, and 77%, respectively. GPT-3.5 had a less discerning palate, earning marks of 80%, 58%, and 46%.
ChatGPT-4 can even pass the US Medical Licensing Examination.
This shouldn’t come as a shock, since ChatGPT is trained on enormous amounts of text, questions and answers included. But many don’t find this improvement in the technology worrisome. Among them is Daniel Van Boom, a journalist at CNET, who writes: “After years of artificial intelligence hype, ChatGPT solidified the idea that AI probably will disrupt many industries. But disruption can mean change rather than destruction. In many cases, that change will be for the better. Education is a prime candidate for improvement. The industry is notoriously slow moving, and can be prodded by AI without the risk of mass layoffs since teachers’ jobs are typically more secure than many others.” And because users are capped at fifty questions every three hours, it’s still not practical to use ChatGPT to cheat on lengthy examinations like the bar exam.
On the other hand, a statement signed by the heads of ChatGPT’s developer, OpenAI, doesn’t inspire much confidence that ChatGPT’s rapidly growing intellect won’t lead to catastrophe. Alongside many AI experts, they have warned of the need to address the risk of human extinction associated with AI.
The statement reads: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”
Members of the OpenAI team who signed the statement include its CEO, Sam Altman, and its chief scientist, Ilya Sutskever. The list of signatories also includes the CEO of Google DeepMind, many university professors, and public figures such as Bill Gates. Ethics will always be a concern as ChatGPT rapidly evolves, but the onus of responsibility will always fall squarely on OpenAI.
However, ChatGPT-4 isn’t an omnipotent supercomputer yet. Its limitations include a knowledge base that covers only events up until 2021, a lack of common sense and emotional intelligence, and difficulty understanding context. Unlike search engines, it also cannot provide sources for any of the information it gives, and it has trouble generating long-form, structured content.
ChatGPT also lacks the ability to multitask, has problems with accuracy and grammar, and gives potentially biased responses. Lastly, ChatGPT and similar AI models require such large computing resources that running them can be expensive and can call for additional specialized software and hardware, so the technology is not quite scalable yet.
I hope the one test ChatGPT can never pass is the Turing test.



































