Meta AI Open Sourced BlenderBot 3, A 175B Parameter Model that can Chat About Every Topic and Organically Improve Its Knowledge
The new release represents a major improvement compared to previous versions.
--
I recently started an AI-focused educational newsletter that already has over 125,000 subscribers. TheSequence is a no-BS (meaning no hype, no news, etc.) ML-oriented newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. Please give it a try by subscribing below:
BlenderBot has been one of the most exciting open source projects in the natural language understanding (NLU) space over the last few years. Initially released by Meta AI in 2020, BlenderBot provides a conversational interface that can chat with users about almost any topic by actively mining the internet for domain knowledge. The second release of BlenderBot delivered improved long-term memory and internet search capabilities. Last week, Meta AI open sourced BlenderBot 3, which represents a major improvement in scale compared to its predecessors.
The first notable improvement in BlenderBot 3 is its size. The new release features an astonishing 175-billion-parameter architecture based on the OPT-175B language model. This represents a 58x size increase over BlenderBot 2.
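To get a feel for what "based on OPT-175B" means in practice, here is a minimal sketch of generating text from a model in the same OPT family using the Hugging Face transformers library. This is purely illustrative and not Meta's code: the full OPT-175B weights are gated and require requesting access from Meta, so the example uses the much smaller, publicly available facebook/opt-1.3b checkpoint as a stand-in.

```python
# Illustrative sketch only: a small OPT checkpoint standing in for OPT-175B,
# which requires requesting access from Meta.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # stand-in for the 175B checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Tell me something interesting about open source chatbots."
inputs = tokenizer(prompt, return_tensors="pt")

# Plain sampled decoding from the base language model; BlenderBot 3 layers
# internet search, long-term memory, and response filtering on top of this.
outputs = model.generate(**inputs, max_new_tokens=60, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The point of the sketch is simply that BlenderBot 3 starts from a large pretrained language model of this kind and then adds the conversational machinery described below.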
One of the main promises of BlenderBot 3 is its ability to improve itself organically based on the feedback collected in active conversations. The model obviously builds on the capabilities of its two previous releases, which include internet search, long-term memory, personality, and empathy. Additionally, BlenderBot 3 inherited over 1,000 pretrained skills. Despite those inherited capabilities, building an architecture that improves over time is far from an easy endeavor. BlenderBot 3 achieves this with an architecture called Director. The architecture generates responses using two main mechanisms: