Overview: Transfer Learning
• As discussed in lecture, transfer learning plays an essential role in many vision tasks.
• Torchvision provides many model architectures, along with pre-trained weights that were trained on the large, general-purpose ImageNet dataset.
Overview: MTL (Multi-Task Learning)
• Multi-Task Learning (MTL) is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias.
• It does this by learning tasks in parallel while using a shared
representation; what is learned for each task can help other tasks be learned better.
• In this assignment, you will gain experience in transfer learning and MTL. You are to implement a multi-task model that predicts the category and attributes of a fashion item.
DeepFashion
• DeepFashion is a large-scale clothing dataset from The Chinese University of Hong Kong.
• The dataset has over 800K images (different angles and different scenes).
• Each image in the dataset is labeled with:
1. 50 categories (multi-class)
2. 1000 attributes (multi-label)
3. Bounding box
4. Landmarks
• Example labels for one image: Category: 0 (dress); Attributes: floral, maxi
• For this assignment, 10 categories were selected from the source dataset, leaving 55,845 images.
• 15 attributes were selected to compose this dataset.
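To make the two label types concrete, here is a minimal sketch of how one sample could be encoded: the category as a single integer index, and the attributes as a multi-hot vector of length 15. Only "floral", "maxi", and category 0 (dress) come from the slides; the remaining attribute names, their order, and the helper `encode_sample` are placeholders assumed for illustration.

```python
NUM_CATEGORIES = 10  # from the assignment

# Only "floral" and "maxi" appear in the source; the other 13 names and the
# ordering of the list are placeholders, not the real attribute vocabulary.
ATTRIBUTE_NAMES = ["floral", "maxi"] + [f"attr_{i}" for i in range(2, 15)]


def encode_sample(category_id, attribute_names):
    """Return (category index, multi-hot attribute list) for one image."""
    if not 0 <= category_id < NUM_CATEGORIES:
        raise ValueError(f"category_id must be in [0, {NUM_CATEGORIES})")
    multi_hot = [0.0] * len(ATTRIBUTE_NAMES)
    for name in attribute_names:
        # list.index raises ValueError for an unknown attribute name.
        multi_hot[ATTRIBUTE_NAMES.index(name)] = 1.0
    return category_id, multi_hot


# The example from the slides: category 0 (dress), attributes floral and maxi.
category, attributes = encode_sample(0, ["floral", "maxi"])
print(category)
print(attributes)
```

The category target is a single index because exactly one class applies, while the attribute target has one independent 0/1 entry per attribute because several can apply at once.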
Your task
• Build a deep network (it may start from a pre-trained one) that predicts the category and attributes of an item simultaneously (multi-tasking).
• The output has two parts:
  • Category (multi-class classification): each image is classified into exactly 1 of 10 categories.
  • Attribute (multi-label classification): each image can carry one or more of the 15 attributes.
• Consider the choice of activation and loss function for each output.
• Note: DO NOT build two separate models.
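One possible shape for such a model is sketched below: a single shared backbone feeding two linear heads, trained with a summed loss. This is an illustration under stated assumptions, not the reference solution. The names `MultiTaskNet` and `multitask_loss`, the `attr_weight` parameter, the 0.5 threshold, and the tiny stand-in backbone are all hypothetical; in practice a pre-trained backbone would likely take the stand-in's place. The sketch pairs softmax cross-entropy with the category head and per-attribute sigmoid binary cross-entropy with the attribute head, which is a common pairing for multi-class and multi-label outputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CATEGORIES = 10  # from the assignment
NUM_ATTRIBUTES = 15  # from the assignment


class MultiTaskNet(nn.Module):
    """One shared backbone with two task-specific linear heads."""

    def __init__(self, backbone: nn.Module, feat_dim: int):
        super().__init__()
        self.backbone = backbone
        self.category_head = nn.Linear(feat_dim, NUM_CATEGORIES)
        self.attribute_head = nn.Linear(feat_dim, NUM_ATTRIBUTES)

    def forward(self, x):
        feats = self.backbone(x)  # shared representation
        return self.category_head(feats), self.attribute_head(feats)


def multitask_loss(cat_logits, attr_logits, cat_target, attr_target,
                   attr_weight=1.0):
    # Multi-class: softmax cross-entropy over the 10 category logits.
    cat_loss = F.cross_entropy(cat_logits, cat_target)
    # Multi-label: independent sigmoid + binary cross-entropy per attribute.
    attr_loss = F.binary_cross_entropy_with_logits(attr_logits, attr_target)
    return cat_loss + attr_weight * attr_loss


# Tiny stand-in backbone so the sketch runs quickly (an assumption).
tiny_backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
model = MultiTaskNet(tiny_backbone, feat_dim=8)

# Random stand-in batch: 4 RGB images with matching targets.
images = torch.randn(4, 3, 64, 64)
cat_target = torch.randint(0, NUM_CATEGORIES, (4,))
attr_target = torch.randint(0, 2, (4, NUM_ATTRIBUTES)).float()

cat_logits, attr_logits = model(images)
loss = multitask_loss(cat_logits, attr_logits, cat_target, attr_target)
loss.backward()  # gradients from both tasks flow into the shared backbone

# Inference: argmax for the category, a per-attribute threshold for attributes.
cat_pred = cat_logits.argmax(dim=1)
attr_pred = torch.sigmoid(attr_logits) > 0.5
```

Both heads return raw logits because the two loss functions used here apply the softmax and sigmoid internally. The relative weight of the two losses is a tunable choice.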