One of the main differences between human and artificial intelligence (AI) agents is that the latter remain constrained to single tasks. I am not even talking about specific domains, but about tasks within those domains. To this day, one of the greatest challenges in AI remains finding effective ways to reuse knowledge representations across models. In deep learning theory, a small sub-discipline known as multitask learning is attempting to address that challenge.
Using the notion of a computational graph is a simple way to visualize deep learning models. In that representation, a deep learning model typically has one input layer, a large number of hidden layers and an output layer. The outputs of one layer serve as inputs to the layers at the next level. Given the large number of nodes in each layer, reusing knowledge representations across them becomes both very appealing and very difficult. Not surprisingly, multitask learning has received a lot of attention among deep learning researchers and technologists.
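The layered structure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a real network: the weights are made up, and there is no nonlinearity or training, only the flow of outputs from one layer into the next.

```python
# A minimal sketch of a deep model as a computational graph: the output
# of each layer becomes the input to the next layer. Weights are
# hypothetical values chosen purely for illustration.

def layer(inputs, weights):
    """One dense layer: each output node sums its weighted inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

x = [1.0, 2.0]                             # input layer
h = layer(x, [[0.5, -0.5], [1.0, 1.0]])    # hidden layer
y = layer(h, [[1.0, 0.5]])                 # output layer
print(y)  # → [1.0]
```

In this picture, "reusing knowledge" means taking the trained weight matrices of some layers and plugging them into another model's graph, which is exactly what the techniques below attempt.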
Multitask learning is a very broad term that encompasses several related techniques. Think about how humans reuse and apply knowledge. Sometimes, when performing a task, we apply knowledge we learned during a previous task. Other times, we need to adapt what is essentially the same knowledge and the same task to different domains. In the deep learning world, those variations of multitask learning are known as transfer learning and domain adaptation, respectively.
Transfer learning is a form of multitask learning applied when a model needs to perform different tasks and knowledge representations should be reused across those tasks. Let’s consider the domain of language learning. When mastering a second or third language, we intuitively reuse many concepts and practices from our native language. Among languages that share similar origins, such as those derived from Latin (Spanish, French, Italian, Portuguese…), there are many commonalities in syntax, verb conjugation, pronunciation, grammar and many other aspects. In a polyglot deep learning model, we can envision a transfer learning technique that reuses the knowledge representations from layers designed for one language and applies them to layers focused on another language.
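A minimal sketch of that idea, with hypothetical toy classes rather than a real framework: a shared encoder carries the representation learned on one task, and only a small task-specific head is new for the second task. The class names, the token weights and the bias value are all invented for illustration.

```python
# Transfer learning sketch: reuse a frozen, pretrained representation
# (SharedEncoder) for a new task that only trains its own head.

class SharedEncoder:
    """Layers whose learned weights are reused, unchanged, across tasks."""
    def __init__(self, weights):
        self.weights = weights  # pretend these were trained on language A
    def encode(self, tokens):
        # Toy "representation": score each token with the shared weights.
        return [self.weights.get(t, 0.0) for t in tokens]

class TaskHead:
    """A small task-specific layer, trained only for the new task."""
    def __init__(self, bias=0.0):
        self.bias = bias
    def predict(self, features):
        return sum(features) + self.bias

# Representation learned on task A (e.g., one language) ...
encoder = SharedEncoder({"hola": 1.0, "adios": -1.0})
# ... reused as-is for task B; only the head is new.
head_b = TaskHead(bias=0.5)
print(head_b.predict(encoder.encode(["hola", "hola"])))  # → 2.5
```

In a real deep learning model the encoder would be a stack of trained layers with frozen weights, but the division of labor is the same: shared representation, per-task output layers.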
The principle of domain adaptation is complementary to transfer learning in the sense that the target task remains fundamentally the same and what varies is the distribution of the input. Let’s move our examples to the field of image analysis and take an algorithm that recognizes objects such as shoes, handbags or dresses in fashion designer pictures. Typically, those images include a model posing with the target items. In another scenario, we would like to build a model that recognizes the same objects (shoes, handbags and dresses) in images of social events that include multiple people, each wearing different items. While the knowledge created by the first model can clearly be applied in the second scenario, it would have to be represented in a way that generalizes across both domains (fashion vs. social pics). That’s the role of domain adaptation.
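One very simple form of that idea can be sketched as follows, assuming made-up feature values: the classifier (the task) is identical in both domains, and only the input statistics shift, so adapting the inputs per domain lets the same decision rule work on both.

```python
# Domain adaptation sketch: same task, shifted input distribution.
# Raw feature values differ between the source domain (studio catalog
# photos) and the target domain (social-event photos); standardizing
# per domain lets one fixed classifier serve both. All numbers are
# hypothetical.

def standardize(xs):
    """Map features to zero mean / unit spread within one domain."""
    mean = sum(xs) / len(xs)
    spread = (max(xs) - min(xs)) or 1.0
    return [(x - mean) / spread for x in xs]

def classify(x, threshold=0.0):
    """Same task in both domains: a fixed decision rule on the feature."""
    return "shoe" if x > threshold else "other"

studio = [10.0, 12.0, 8.0]   # bright studio images: large raw values
social = [1.0, 1.2, 0.8]     # dim social images: same objects, shifted values

# After per-domain standardization, the same classifier applies to both.
print([classify(x) for x in standardize(studio)])  # → ['other', 'shoe', 'other']
print([classify(x) for x in standardize(social)])  # → ['other', 'shoe', 'other']
```

Real domain adaptation methods learn a shared feature space rather than a hand-written normalization, but the goal is the same: one representation that generalizes across input distributions.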
We can imagine that multitask learning will be an essential element in the quest to achieve general AI. In many modern deep learning frameworks, there are some distinctions between transfer and multitask learning. Notably, transfer learning is mostly used with supervised models, while multitask learning can be applied to both supervised and unsupervised algorithms.