The costliest and most powerful OpenAI model.
Supported context length
Price for prompt tokens*
Price for response tokens*
*Note: Data as of 11/14/2023
Here's how GPT-4-0613 performed across all three task types
Digging deeper, here’s a look at how GPT-4-0613 performed across specific datasets
|Tasks|Insights|Dataset Name|Dataset Performance|
|---|---|---|---|
|QA without RAG|The model performs very well, which shows low bias and good factual knowledge.|Truthful QA||
|QA with RAG|The model excels at this task, which demonstrates strong reasoning and comprehension skills. Compared to other models, it shows better mathematical skills, performing well on DROP.|MS Marco||
|Long form text generation|The model excels at this task, which demonstrates a strong ability to generate long text without factual errors.|Open Assistant||
💰 Cost insights
A very costly model that you can safely use for any task. It is 30x costlier than GPT-3.5. One caveat is its low quota limits, which can be a problem in production. Since GPT-3.5 variants perform close to it, you can consider using them instead.
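To make the 30x figure concrete, here is a minimal sketch of how a per-request cost comparison could be computed. The function name `request_cost`, the token counts, and the baseline prices are hypothetical placeholders chosen for illustration, not actual OpenAI pricing; only the 30x ratio comes from the text above.

```python
def request_cost(prompt_tokens, response_tokens,
                 prompt_price_per_1k, response_price_per_1k):
    """Return the dollar cost of one request, given per-1K-token prices."""
    prompt_cost = (prompt_tokens / 1000) * prompt_price_per_1k
    response_cost = (response_tokens / 1000) * response_price_per_1k
    return prompt_cost + response_cost


# Hypothetical baseline prices (placeholders, NOT real pricing).
BASE_PROMPT_PRICE = 0.001    # dollars per 1K prompt tokens
BASE_RESPONSE_PRICE = 0.002  # dollars per 1K response tokens

# Cost ratio stated in the text: the larger model is 30x costlier.
COST_RATIO = 30

# An illustrative request: 1,500 prompt tokens and 500 response tokens.
cheap = request_cost(1500, 500, BASE_PROMPT_PRICE, BASE_RESPONSE_PRICE)
expensive = request_cost(
    1500, 500,
    BASE_PROMPT_PRICE * COST_RATIO,
    BASE_RESPONSE_PRICE * COST_RATIO,
)

print(f"cheaper model:  ${cheap:.4f} per request")
print(f"costlier model: ${expensive:.4f} per request")
print(f"ratio: {expensive / cheap:.0f}x")
```

Substituting the real per-token prices for the placeholders, and your own typical token counts, gives a rough estimate of how much a switch between the two models would change spend at your request volume.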