A large SOTA model of its time, which you can safely skip now.
UAE's Technology Innovation Institute (TII)
Supported context length
Price for prompt tokens*
Price for response tokens*
*Note: Data based on 11/14/2023
Here's how Falcon-40b-instruct performed across all three task types:
Digging deeper, here's a look at how Falcon-40b-instruct performed across specific datasets:
| Tasks | Insights | Dataset Name | Dataset Performance |
|---|---|---|---|
| QA without RAG | The model does not perform well, showing bias and errors in its factual knowledge. | TruthfulQA | |
| QA with RAG | The model performs poorly here, demonstrating weak reasoning and comprehension skills. It scores much lower on DROP than on the other datasets, a sign of weak mathematical skills. | MS Marco | |
| Long-form text generation | The model performs near satisfactorily, showing an ability to generate long text without factual errors. | Open Assistant | |
💰 Cost insights
The model scores low across all the tasks. It is 4x cheaper than GPT-3.5 and 2x cheaper than the Llama 70b variant. We suggest using Zephyr-7b-beta instead of this model.