One of the first instruction-tuned decoder models in the industry, released by MosaicML. It was built by finetuning MPT-7B on open-source datasets.
- Supported context length
- Price for prompt tokens*
- Price for response tokens*
*Note: Data based on 11/14/2023
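For context, here's a minimal sketch of loading the model and running a prompt through it with Hugging Face `transformers`. It assumes the published `mosaicml/mpt-7b-instruct` checkpoint and enough GPU memory for a 7B model in bfloat16; treat it as an illustration rather than a tuned setup.

```python
# Minimal sketch: load MPT-7B-Instruct and generate a completion.
# Assumes `transformers` and `torch` are installed and the
# `mosaicml/mpt-7b-instruct` checkpoint on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mosaicml/mpt-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# MPT ships a custom architecture, so remote code must be trusted.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# Alpaca-style instruction template; adjust to match the prompt
# format documented in the model card you are targeting.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain retrieval-augmented generation in one sentence.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```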
Here's how MPT-7B-Instruct performed across all three task types.
Digging deeper, here's a look at how MPT-7B-Instruct performed across specific datasets:
| Task | Insights | Dataset |
| --- | --- | --- |
| QA without RAG | The model performs poorly, showing high bias and frequent errors in factual knowledge. | TruthfulQA |
| QA with RAG | The model performs poorly here as well, demonstrating weak reasoning and comprehension skills. It scores far lower on DROP than on the other datasets, a sign of weak mathematical skills. | MS MARCO |
| Long-form text generation | The model is highly error-prone when generating long-form text. | Open Assistant |
💰 Cost insights
The model scores low across all tasks. It is 13x cheaper than GPT-3.5 and 6x cheaper than the Llama 70B variant. We suggest using Zephyr-7B-beta instead.