Details | Value |
---|---|
Developer | OpenAI |
License | NA (private model) |
Model parameters | NA (private model) |
Supported context length | 16k |
Price for prompt token | $0.5/Million tokens |
Price for response token | $1.5/Million tokens |
Chainpoll Score (Short Context) | 0.84 |
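
The per-token prices listed above can be turned into a rough per-request cost. The following is a minimal sketch, not an official pricing calculator: the function name `estimate_cost` and the example token counts are hypothetical, and only the two prices come from the details above.

```python
# Prices taken from the model details (USD per million tokens).
PROMPT_PRICE_PER_MILLION = 0.5
RESPONSE_PRICE_PER_MILLION = 1.5


def estimate_cost(prompt_tokens: int, response_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    prompt_cost = prompt_tokens * PROMPT_PRICE_PER_MILLION / 1_000_000
    response_cost = response_tokens * RESPONSE_PRICE_PER_MILLION / 1_000_000
    return prompt_cost + response_cost


# Hypothetical request sizes, chosen only for illustration.
cost = estimate_cost(prompt_tokens=4_000, response_tokens=158)
print(f"Estimated cost: ${cost:.6f} per request")
```

Because response tokens cost three times as much as prompt tokens, long outputs dominate the bill faster than long prompts do.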
Digging deeper, here’s a look at how gpt-3.5-turbo-0125 performed across specific datasets:
Tasks | Task insight | Cost insight | Dataset | Context adherence | Avg response length |
---|---|---|---|---|---|
Short context RAG | The model demonstrates good reasoning and comprehension skills, excelling at short context RAG. It also shows decent mathematical proficiency, as evidenced by its performance on the DROP and ConvFinQA benchmarks. It ranks among the weakest small closed-source models, behind Haiku and Gemini Flash. | We wish it performed well, but it is a costly model for the performance it offers. We recommend using Gemini Flash or Haiku, which are 2x cheaper. | Drop | 0.78 | 158 |
Short context RAG | | | Hotpot | 0.85 | 158 |
Short context RAG | | | MS Marco | 0.91 | 158 |
Short context RAG | | | ConvFinQA | 0.83 | 158 |
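
As a quick sanity check on the table, the per-dataset context adherence scores can be averaged and compared with the reported short context score of 0.84. This is a sketch under an assumption: the source does not say how the overall score is aggregated, so the unweighted mean used here is only one plausible reading.

```python
from statistics import mean

# Context adherence per dataset, copied from the table above.
adherence = {
    "Drop": 0.78,
    "Hotpot": 0.85,
    "MS Marco": 0.91,
    "ConvFinQA": 0.83,
}

# Assumption: an unweighted mean across datasets.
average = mean(list(adherence.values()))
weakest = min(adherence, key=adherence.get)
strongest = max(adherence, key=adherence.get)

print(f"Mean context adherence: {average:.2f}")
print(f"Weakest dataset: {weakest}, strongest dataset: {strongest}")
```

The unweighted mean rounds to the same value as the reported short context score, and the spread shows that Drop is the hardest of the four datasets for this model while MS Marco is the easiest.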