Google's Gemini quietly strengthens, rapidly narrowing the gap with GPT-4o and even surpassing it in Chinese!

Wallstreetcn

2024.05.30 05:21


In the Chinese test, Gemini Pro and Advanced both surpassed GPT-4o, ranking first and second respectively

Although Gemini was overshadowed by OpenAI when it was first unveiled, Google has been quietly iterating on the model, and the gap with OpenAI's latest large model, GPT-4o, has narrowed significantly. The latest test results show that Gemini 1.5 Pro/Advanced ranks 2nd in comprehensive testing, approaching GPT-4o, while the lightweight Gemini 1.5 Flash ranks 9th, surpassing Llama-3-70b and approaching GPT-4.

Compared with the April versions, the free Gemini Pro and Flash are significantly more capable. Their context length can reach 1 million tokens, nearly eight times GPT-4's 128,000 tokens.

Gemini's Chinese capabilities are also impressive. In Chinese testing, Gemini Pro and Advanced both surpass GPT-4o, ranking first and second respectively.

Gemini also ranks near the top in Hard Prompts testing, where large models face more challenging questions. Gemini 1.5 Pro ranks second in this test, behind only GPT-4o.

Gemini's results also rank at the top when the confidence intervals of the large models' scores are compared.

It is worth mentioning that two weeks ago, when Google updated Gemini to coincide with the release of GPT-4o, Gemini was almost ridiculed for its weak capabilities. According to evaluations from multiple tech blogs, even though Google had been improving 1.5 Pro for several months, it still could not match OpenAI's latest GPT-4o model in common-sense reasoning, multimodal capabilities, or coding. Its only highlight was the larger context window. Gemini's rapid progress since then shows that the AI industry still has a deep technical foundation.