
OpenAI introduces GPT-4 large language model: outperforms humans in many tests


OpenAI has just announced GPT-4, the latest version of its large language model (an API waitlist is open).

GPT-4 solves problems more accurately than its predecessors, and the multimodal model can also generate and edit creative or technical writing, outperforming them in advanced reasoning (the current public version of ChatGPT is based on GPT-3.5). As many had guessed, Microsoft's new Bing chat is indeed built on GPT-4.

In addition, the company is testing GPT-4's image input capability with its partner Be My Eyes (a smartphone app that recognizes and describes scenes for blind and low-vision users; the upcoming GPT-4-powered feature works like an enhanced version of familiar AI image recognition).

In addition to the introductory website, OpenAI also provides a technical paper describing GPT-4's capabilities, as well as a System Card detailing its limitations.

OpenAI plans to provide text-only access to GPT-4 through ChatGPT and its commercial API, though there is still a wait. As it.com notes, GPT-4 is currently only available to ChatGPT Plus subscribers, via an optional GPT-4 mode for conversations, with a limit of 100 messages every 4 hours.

API pricing is about 3 cents per 1,000 prompt tokens (roughly 750 words) and 6 cents per 1,000 completion tokens of the same rough length; prompts and responses are billed at different rates.
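As a rough illustration of how that pricing adds up, here is a minimal sketch based only on the rates quoted above; the helper function and the 750-words-per-1,000-tokens rule of thumb are approximations for the example, not OpenAI's billing code.

```python
# Rough cost estimate for a GPT-4 (8K context) request, using the rates quoted
# above: $0.03 per 1,000 prompt tokens and $0.06 per 1,000 completion tokens.
# Real billing is per token; word counts are only an approximation.

PROMPT_RATE_PER_1K = 0.03      # USD per 1,000 prompt tokens
COMPLETION_RATE_PER_1K = 0.06  # USD per 1,000 completion tokens
WORDS_PER_1K_TOKENS = 750      # rough rule of thumb

def estimate_cost(prompt_words: int, completion_words: int) -> float:
    """Estimate the USD cost of one request from approximate word counts."""
    prompt_tokens = prompt_words / WORDS_PER_1K_TOKENS * 1000
    completion_tokens = completion_words / WORDS_PER_1K_TOKENS * 1000
    return (prompt_tokens / 1000 * PROMPT_RATE_PER_1K
            + completion_tokens / 1000 * COMPLETION_RATE_PER_1K)

if __name__ == "__main__":
    # A ~750-word question plus a ~750-word answer: about $0.03 + $0.06 = $0.09.
    print(f"${estimate_cost(750, 750):.2f}")
```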

GPT-4 is described as "bigger" than previous versions: it was trained on more data and has more parameters, which also makes it more expensive to run.

In terms of tasks, GPT-4 is better than its predecessor at following complex natural-language instructions and generating technical or creative content, and it can also work at greater length: it supports generating and processing up to 32,768 tokens (about 25,000 words), enabling longer content creation and text analysis than before.
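For illustration, a minimal sketch of requesting a long completion through OpenAI's Python client follows. The model name "gpt-4-32k" for the 32,768-token variant, the pre-1.0 `openai` package interface, and the placeholder API key are assumptions based on OpenAI's published API at the time, so check the current documentation before relying on them.

```python
# Minimal sketch of a chat completion request against the long-context GPT-4
# variant. Assumes the pre-1.0 `openai` Python package; the "gpt-4-32k" model
# (32,768-token context) may require separate API access.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

response = openai.ChatCompletion.create(
    model="gpt-4-32k",
    messages=[
        {"role": "system", "content": "You are a careful technical editor."},
        {"role": "user", "content": "Summarize the following long report..."},
    ],
    max_tokens=4000,  # tokens reserved for the reply; prompt + reply must fit in 32,768
)

print(response["choices"][0]["message"]["content"])
```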

OpenAI says GPT-4 gives fewer wrong answers, is less likely to go off-topic or wander into disallowed subjects, and even performs somewhat better than humans on many standardized tests.

For example, GPT-4 scores in the top 10% of test takers on a simulated bar exam, the top 7% on the SAT Reading test, and the top 11% on the SAT Math test. By contrast, GPT-3.5 generally scored around the bottom 10% on the bar exam. Those results would also hold up on graduate-level admissions exams.

Of course, AI is still AI: OpenAI also says that GPT-4 is not perfect at the moment and still falls short of humans in many scenarios.

The model reportedly still suffers from "hallucinations", fabricating content and getting facts wrong, and it tends to insist that it is right even when it is wrong. OpenAI urges users to keep its limitations in mind, such as social bias, hallucinations, and susceptibility to adversarial prompts.
