Transfer learning is a deep learning technique that reuses a pre-trained machine learning model for a purpose that's different from, yet related to, its original function. The basic idea is that the knowledge developed on the original task can be reapplied to help the AI system learn the new task more effectively. Because training large models from scratch requires considerable processing power, transfer learning is most commonly applied in language processing and computer vision.
Transfer learning typically begins with a model trained on a large dataset for a 'source task'; this pre-trained model forms the core of the new machine learning system.
Once this pre-training is complete, programmers identify which parts of the model pertain to the new task and which parts need to be retrained. For example, if a model were trained to identify motorcycles, the components that handle general image recognition could be reused to train it to identify shoes. Instead of retraining an entire system from scratch, programmers can simply build on the foundation of the original system.
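The reuse-and-retrain idea above can be sketched in a few lines. The following is a minimal illustration using NumPy, in which a random matrix stands in for a feature extractor pre-trained on the source task; in a real system those weights would come from training on a large dataset, and only the new output layer (the 'head') is trained on the target task.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained feature extractor: in practice these weights
# would come from training on a large source task (e.g. motorcycle images).
W_frozen = rng.normal(size=(4, 8))  # frozen: never updated below

def features(x):
    # The reused representation learned on the source task.
    return np.tanh(x @ W_frozen)

# A small target-task dataset (e.g. a new binary classification problem).
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Only the new head is trained; the feature extractor stays fixed.
w_head = np.zeros(8)
lr = 0.5
for _ in range(500):
    p = 1 / (1 + np.exp(-(features(X) @ w_head)))  # sigmoid output
    grad = features(X).T @ (p - y) / len(y)        # logistic-loss gradient
    w_head -= lr * grad

preds = (1 / (1 + np.exp(-(features(X) @ w_head))) > 0.5).astype(float)
accuracy = (preds == y).mean()
```

Despite never updating the frozen weights, the small trained head reaches good accuracy on the new task, because the reused representation already captures useful structure.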
The most significant benefit of transfer learning is that it saves considerable time and resources compared to training a model from scratch. It's also valuable for scenarios with limited data or when there are only unlabelled datasets available.
Other benefits of transfer learning include:
Transfer learning also helps with another long-standing problem in machine learning: model brittleness.
Machine learning models are often accurate only within environments similar to their original training data. Changes to the operating environment directly and adversely affect their accuracy, and a significant enough change may require full retraining.
Although machine learning has become significantly more advanced over the years, this remains a persistent problem, closely related to a better-known failure mode: overfitting.
Essentially, overfitting occurs when a machine learning model cannot generalize and instead skews toward its training data. This can happen for several reasons:
For instance, a machine learning model trained to recognize oranges but only trained with photos that show the fruit in a bowl might assume the bowl is a characteristic of an orange, and struggle to identify an orange growing on a tree in the wild.
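The same failure to generalize can be seen in a toy setting. Below is a small NumPy sketch (an illustrative example, not tied to the orange scenario) that fits noisy data drawn from a simple underlying line with both a simple model and an overly flexible one: the flexible model scores lower error on the training points but higher error on unseen points.

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny noisy training set sampled from a simple underlying line y = 2x.
x_train = np.linspace(0, 1, 8)
y_train = 2 * x_train + rng.normal(scale=0.2, size=8)

# A larger, noise-free set of unseen points from the same line.
x_test = np.linspace(0, 1, 100)
y_test = 2 * x_test

def fit_and_errors(degree):
    # Fit a polynomial of the given degree, then measure mean squared
    # error on both the training points and the unseen test points.
    coefs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    return train_err, test_err

simple_train, simple_test = fit_and_errors(1)    # matches the true pattern
overfit_train, overfit_test = fit_and_errors(7)  # memorizes the noise
```

The degree-7 polynomial passes almost exactly through every training point, so its training error is near zero; but because it has also fit the noise, it generalizes worse to the unseen points than the simple line does, which is overfitting in miniature.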
Provided the pre-trained model was built on an adequately sized, high-quality dataset, transfer learning is far less prone to the issues above. This does not, however, mean that it is flawless. There are certain circumstances in which transfer learning can fail, including: