GPT-4 “Self-Reflection” resulted in a 30% increase in test performance

OpenAI’s newest language model, GPT-4, can not only generate a wide variety of human-like text but also design and run tests to evaluate and improve its own output. This “reflection” technique has let GPT-4 make significant progress on several difficult benchmarks, improving performance by around 30%.

GPT-4 is OpenAI’s most advanced system, following GPT, GPT-2 and GPT-3, and is a large multimodal model that accepts image and text inputs and outputs text. It relies on deep learning, using artificial neural networks to imitate human writing.

Researchers Noah Shinn and Ashwin Gopinath wrote in their paper, “We developed a novel technique that allows the AI agent to mimic human self-reflection and evaluate its own performance. GPT-4 adds some extra steps to complete various tests that allow it to design its own tests to check its own answers, identify errors and shortcomings, and then modify its own solutions based on the findings.”
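The loop the researchers describe (attempt, test, reflect, revise) can be illustrated with a small sketch. This is not the authors' code: `toy_model`, `run_tests`, `reflect` and `reflection_loop` are hypothetical names invented for this illustration, the "model" is a canned stand-in rather than GPT-4, and the tests are written by hand here, whereas the paper has the model design its own.

```python
from typing import Callable, List, Optional, Tuple

TestCase = Tuple[Tuple[int, int], int]  # ((arguments), expected result)


def toy_model(task: str, reflections: List[str]) -> str:
    """Hypothetical stand-in for a language model (a real system would call GPT-4).

    It returns a buggy draft first and a corrected draft once it has feedback.
    """
    if not reflections:
        return "def add(a, b):\n    return a - b\n"  # buggy first attempt
    return "def add(a, b):\n    return a + b\n"      # revised after reflecting


def run_tests(source: str, tests: List[TestCase]) -> List[str]:
    """Execute the candidate code and return one message per failing test."""
    namespace: dict = {}
    exec(source, namespace)  # acceptable for a toy; never exec untrusted code
    failures = []
    for args, expected in tests:
        got = namespace["add"](*args)
        if got != expected:
            failures.append(f"add{args} returned {got}, expected {expected}")
    return failures


def reflect(failures: List[str]) -> str:
    """Turn raw test failures into a note the next attempt can use."""
    return "Previous attempt failed: " + "; ".join(failures)


def reflection_loop(
    task: str,
    model: Callable[[str, List[str]], str],
    tests: List[TestCase],
    max_rounds: int = 3,
) -> Tuple[Optional[str], int]:
    """Attempt, test, reflect, retry. Returns (solution or None, rounds used)."""
    reflections: List[str] = []
    for round_number in range(1, max_rounds + 1):
        candidate = model(task, reflections)
        failures = run_tests(candidate, tests)
        if not failures:
            return candidate, round_number
        reflections.append(reflect(failures))
    return None, max_rounds


if __name__ == "__main__":
    hand_written_tests = [((1, 2), 3), ((0, 0), 0)]
    solution, rounds = reflection_loop("add two numbers", toy_model, hand_written_tests)
    print(f"solved after {rounds} round(s)")
    print(solution)
```

The point of the sketch is the control flow: the failure messages from one round are the only new information the next round receives, which is what lets a second attempt differ from the first.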

On the HumanEval coding test, GPT-4 increased its accuracy from 67% to 88% using a self-reflection loop.

By designing and executing tests to critique its own output, GPT-4 can significantly improve its performance, as the ALFWorld test results show.

The team used this technique to run several performance tests on GPT-4. On HumanEval, which consists of 164 never-before-seen Python programming problems, GPT-4’s accuracy rose from 67% to 88% with reflection. On ALFWorld, the AI has to make decisions and solve multi-step tasks by choosing among permitted actions in a variety of interactive environments; with reflection, accuracy improved from 73% to 97%, with only 4 tasks failing. On HotpotQA, GPT-4 was given access to Wikipedia and answered 100 questions that require parsing content and reasoning across multiple supporting documents; its accuracy rose from 34% to 54% with reflection.
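The article does not say how the headline 30% figure was calculated. As a rough check, the short script below takes the before and after scores reported above and computes both the absolute gain (in percentage points) and the relative gain (as a share of the baseline). The `reported` table and the `gains` helper are written for this illustration only.

```python
# Scores as reported in the article: (baseline %, with reflection %).
reported = {
    "HumanEval": (67, 88),
    "ALFWorld": (73, 97),
    "HotpotQA": (34, 54),
}


def gains(before: float, after: float) -> tuple:
    """Return (absolute gain in points, relative gain in percent of baseline)."""
    absolute = after - before
    relative = 100 * (after - before) / before
    return absolute, relative


if __name__ == "__main__":
    for name, (before, after) in reported.items():
        absolute, relative = gains(before, after)
        print(f"{name}: +{absolute} points, {relative:.0f}% relative to baseline")
```

On these numbers the absolute gains are 21, 24 and 20 points, and the relative gains are roughly 31%, 33% and 59%, so the 30% in the headline is closest to the relative improvement on the first two tests.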

This study suggests that solutions to AI problems sometimes lie in the AI itself. The approach is somewhat like generative adversarial networks, in which two AIs improve each other’s skills: one tries to generate pictures that look real, while the other tries to tell which pictures are fake and which are real. In this case, however, GPT-4 is both writer and editor, improving the quality of its output through self-reflection.

Stephen Cruise
https://www.techgoing.com
Stephen Cruise is a senior editor with 11 years of experience covering the latest smartphones, EVs, PC gaming, consoles, and tech.
