How Much Better is OpenAI’s Newest GPT-3 Model?
We evaluate davinci-003 across a range of classification, summarization, and generation tasks. Using Scale Spellbook, the platform for large language model apps, we show where davinci-003 significantly outperforms the prior version and where it still has room to improve.