AI techniques have proven highly successful in many critical applications, such as object recognition and speech recognition. However, these successes have relied on collecting enormous datasets and annotating them carefully by hand, a process that is expensive, time-consuming, and in many scenarios infeasible because sufficient data is simply not available. Transfer learning offers a solution to these problems by leveraging data a machine has seen in the past to solve future problems from only a few annotated examples. This research focuses on the challenges of transfer learning and aims to develop algorithms that can learn from multiple heterogeneous tasks, moving beyond low-level task similarity to enable broader transfer across distinct tasks. Such algorithms will be broadly applicable in areas including computer vision and natural language processing, substantially reducing the dependence on large amounts of annotated data and, consequently, the cost and time of deploying and maintaining AI systems.