Last week, Mighty AI, formerly known as Spare5, announced a new $14 million funding round to expand its crowdsourced artificial intelligence (AI) training platform. The round was led by Intel Capital and included GV (Google Ventures), Accenture Ventures and previous Spare5 investors.
The investment in Mighty AI serves as the strongest validation yet of a new but very important discipline in the AI ecosystem: Training Data as a Service (TDaaS). Conceptually, TDaaS provides a cloud model for curating the training data used by AI algorithms. Mighty AI innovates on the traditional TDaaS concept by leveraging crowdsourcing techniques that can simplify and streamline the availability of training data for AI solutions.
TDaaS platforms are banking on the long-term viability of supervised and semi-supervised AI models. While most AI experts agree that unsupervised algorithms are the future of AI, they also agree that the technology needs to evolve before unsupervised AI solutions become practical. For the time being, supervised models are predominant in AI solutions. In many mission-critical AI solutions, the training of supervised AI models is cost-prohibitive. As a result, companies such as Google or IBM have been investing tens of millions of dollars to bring in domain experts who can help train and build knowledge into their AI solutions. Crowdsourced TDaaS models can help to drastically democratize the training of AI solutions while also providing many other interesting benefits.
5 Benefits of TDaaS
1 — Elastic Scalability of AI Training: Some AI problems require more training data and more intensive training processes than others. Crowdsourced TDaaS solutions can elastically scale the curation of training data while controlling costs.
2 — Consistent Training Data Management Tools: Crowdsourced TDaaS solutions such as Mighty AI provide consistent tools and a consistent user experience for managing training data across different AI models.
3 — Consistent Training Data APIs and Libraries: TDaaS platforms provide a consistent programming model for accessing training data across different AI solutions.
4 — Training Data Continuous Curation and Monitoring: In AI systems, training is a continuous exercise that extends throughout the lifecycle of an AI solution. TDaaS platforms provide the infrastructure for continuously capturing, updating and curating training data relevant to AI solutions.
5 — Public and Proprietary Training Data: Although the current generation of TDaaS solutions is focused on leveraging public data, the same model should extend to private, domain-specific data in the near future.
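To make the idea of crowdsourced curation more concrete, here is a minimal Python sketch of one common way to turn several crowd workers' answers into a single training label: majority voting with an agreement threshold. This is an illustration only, not a description of how Mighty AI or any other TDaaS platform works; the function name `aggregate_labels`, the `min_agreement` parameter and the sample item IDs are all hypothetical.

```python
from collections import Counter


def aggregate_labels(annotations, min_agreement=0.6):
    """Majority-vote each item's crowd labels.

    Hypothetical helper: items whose top label wins less than
    `min_agreement` of the votes are set aside for expert review.
    """
    accepted = {}
    needs_review = []
    for item_id, labels in annotations.items():
        if not labels:
            # No crowd answers yet, so nothing to vote on.
            needs_review.append(item_id)
            continue
        label, votes = Counter(labels).most_common(1)[0]
        agreement = votes / len(labels)
        if agreement >= min_agreement:
            accepted[item_id] = (label, agreement)
        else:
            needs_review.append(item_id)
    return accepted, needs_review


# Made-up sample data: three items, each labeled by several workers.
annotations = {
    "img-001": ["cat", "cat", "dog"],
    "img-002": ["car", "truck", "bus"],
    "img-003": ["tree", "tree", "tree", "tree"],
}
accepted, needs_review = aggregate_labels(annotations)
print(accepted)
print(needs_review)
```

In this sketch, items with clear consensus flow straight into the training set, while contested items are routed to a reviewer. Adding more workers per item raises label quality at a higher cost, which is the kind of trade-off an elastic crowdsourced model lets a team tune.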
5 Ideas for the Roadmap of TDaaS Platforms
Some of the following ideas might be part of the immediate roadmap of TDaaS platforms:
1 — Interoperability with Cloud ML Platforms: Solutions such as Mighty AI should provide seamless interoperability with cloud AI/ML services such as AWS ML, Azure ML or Google Cloud ML.
2 — Integration with AI Training Tools: TDaaS solutions should integrate with the new generation of AI training tools such as the ones released by DeepMind, OpenAI, Microsoft, Google and other AI solution providers.
3 — AgaaS Interoperability: Algorithm as a Service (AgaaS) platforms such as Algorithmia should integrate with TDaaS platforms in order to improve the testing and validation of the AI algorithms published on those platforms.
4 — AI Training Effectiveness Monitoring: TDaaS platforms should include APIs and tools to monitor and validate the effectiveness of specific training datasets and models.
5 — AI Training Tools: It may seem obvious, but the AI ecosystem desperately needs better training tools and experiences. TDaaS platforms are in a unique position to address that limitation.
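As a rough illustration of the effectiveness monitoring described in idea 4, the Python sketch below tracks a model's accuracy on a fixed holdout set across successive versions of a training dataset and flags any version that makes results worse. It assumes a model has already been retrained on each version; the names `accuracy` and `flag_regressions`, the `tolerance` parameter and the sample predictions are hypothetical and do not come from any real TDaaS API.

```python
def accuracy(predictions, truth):
    """Fraction of holdout examples predicted correctly."""
    correct = sum(p == t for p, t in zip(predictions, truth))
    return correct / len(truth)


def flag_regressions(history, tolerance=0.01):
    """Return dataset versions whose holdout accuracy fell by more
    than `tolerance` compared with the previous version."""
    flagged = []
    for (_, prev_acc), (version, cur_acc) in zip(history, history[1:]):
        if prev_acc - cur_acc > tolerance:
            flagged.append(version)
    return flagged


# Made-up holdout labels and per-version model predictions.
truth = ["cat", "dog", "cat", "dog"]
history = [
    ("v1", accuracy(["cat", "dog", "dog", "dog"], truth)),
    ("v2", accuracy(["cat", "dog", "cat", "dog"], truth)),
    ("v3", accuracy(["dog", "dog", "cat", "cat"], truth)),
]
flagged = flag_regressions(history)
print(history)
print(flagged)
```

A check like this would let a team notice quickly when a new batch of curated data degrades a model, and roll back or re-curate before the regression reaches production.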