GPT-4 “Self-Reflection” resulted in a 30% increase in test performance

OpenAI’s newest language model, GPT-4, can not only generate a wide variety of human-like text but also design and run tests to evaluate and improve its own output. This “reflection” technique has allowed GPT-4 to make significant progress on several difficult tests, raising its accuracy by roughly 30% in relative terms (for example, from 67% to 88% on HumanEval).

GPT-4 is OpenAI’s most advanced system, following GPT, GPT-2 and GPT-3, and is a large multimodal model: it accepts image and text inputs and outputs text. It relies on deep learning, using artificial neural networks to mimic human writing.

Researchers Noah Shinn and Ashwin Gopinath wrote in their paper, “We developed a novel technique that allows the AI agent to mimic human self-reflection and evaluate its own performance. GPT-4 adds some extra steps to complete various tests that allow it to design its own tests to check its own answers, identify errors and shortcomings, and then modify its own solutions based on the findings.”
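The loop the researchers describe (attempt, self-test, reflect, retry) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: `reflection_loop`, `toy_generate`, `toy_evaluate` and `toy_reflect` are hypothetical names, and the toy functions stand in for calls to a language model so that the example runs on its own.

```python
from typing import Callable, List, Tuple

# Hypothetical callable types; in a real system each would wrap a model call.
Generate = Callable[[str, List[str]], str]    # (task, reflections) -> candidate
Evaluate = Callable[[str], Tuple[bool, str]]  # candidate -> (passed, feedback)
Reflect = Callable[[str, str], str]           # (candidate, feedback) -> lesson


def reflection_loop(task: str, generate: Generate, evaluate: Evaluate,
                    reflect: Reflect, max_trials: int = 3) -> Tuple[str, int, bool]:
    """Retry a task, feeding lessons from failed attempts into the next one."""
    reflections: List[str] = []
    candidate = ""
    for trial in range(1, max_trials + 1):
        candidate = generate(task, reflections)
        passed, feedback = evaluate(candidate)
        if passed:
            return candidate, trial, True
        # Record what went wrong so the next attempt can take it into account.
        reflections.append(reflect(candidate, feedback))
    return candidate, max_trials, False


def toy_generate(task: str, reflections: List[str]) -> str:
    # Assumption for the demo: the first attempt is wrong, any reflection fixes it.
    if reflections:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"


def toy_evaluate(candidate: str) -> Tuple[bool, str]:
    # Run the candidate code and check it against one self-written test.
    namespace: dict = {}
    exec(candidate, namespace)
    result = namespace["add"](2, 3)
    if result == 5:
        return True, "all tests passed"
    return False, f"add(2, 3) returned {result}, expected 5"


def toy_reflect(candidate: str, feedback: str) -> str:
    return f"Previous attempt failed: {feedback}"


solution, trials, solved = reflection_loop(
    "Write add(a, b)", toy_generate, toy_evaluate, toy_reflect
)
print(solved, trials)
```

In this toy run the first attempt fails its own test and the second succeeds. With a real model, the reflections would be added to the prompt for the next attempt.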

On the HumanEval coding test, GPT-4 increased its accuracy from 67% to 88% using a self-reflection loop.

GPT-4 can critique its own work by designing and executing tests, which can significantly improve its performance, as the AlfWorld test results show.

The team used this technique to run several different performance tests on GPT-4. In the HumanEval test, GPT-4 had to solve 164 never-before-seen Python programming problems; its accuracy was 67% without reflection and rose to 88% with it. In the AlfWorld test, the AI had to make decisions and solve multi-step tasks by performing permitted actions in a variety of interactive environments. With reflection, GPT-4 improved its accuracy from 73% to 97%, failing only 4 tasks. In the HotPotQA test, GPT-4 had access to Wikipedia and answered 100 questions that required parsing content and reasoning across multiple supporting documents; its accuracy rose from 34% to 54% with reflection.
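To put these figures side by side, the short sketch below computes the absolute gain (in percentage points) and the relative gain for each test. The before and after numbers are taken from the paragraph above; the `gains` helper is introduced here only for illustration.

```python
# Before/after accuracy (percent) as reported in the article.
results = {
    "HumanEval": (67, 88),
    "AlfWorld": (73, 97),
    "HotPotQA": (34, 54),
}


def gains(before: float, after: float) -> tuple:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = after - before
    relative = 100 * (after - before) / before
    return absolute, relative


for name, (before, after) in results.items():
    absolute, relative = gains(before, after)
    print(f"{name}: +{absolute} points, {relative:.0f}% relative gain")
```

The absolute gains are 21, 24 and 20 points. Relative to the starting accuracy, that is about 31% for HumanEval, 33% for AlfWorld and 59% for HotPotQA.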

This study shows that solutions to AI problems sometimes rely on the AI itself. The approach is somewhat like generative adversarial networks, in which two AIs improve each other’s skills: one tries to generate pictures that look real, while the other tries to tell which pictures are fake and which are real. In this case, however, GPT-4 is both writer and editor, improving the quality of its output through self-reflection.

Stephen Cruise (https://www.techgoing.com)
Stephen Cruise is a senior editor with 11 years of experience covering the latest smartphones, EVs, PC gaming, consoles, and tech.
