
Real-time Domain Adaptation in Semantic Segmentation

TA: Antonio Tavera (antonio.tavera@polito.it)

Link to this file: shorturl.at/prCKP

 

OVERVIEW
The main objective of this project is to become familiar with the task of Domain Adaptation applied to Real-time Semantic Segmentation networks. The student should understand the general approaches to Domain Adaptation in Semantic Segmentation and the main reasons to apply them to real-time networks. Before starting, the student should read [1] [2] [3] to get familiar with the tasks. The student should be able to replicate the network proposed in [2]. As the next step, the student should implement and modify an Adversarial Domain Adaptation algorithm, as in [6]. For the last part of the project, the student should implement a variation, selected from a set of possible ideas.

 

 

GOALS
1.    Read [1][2][3][4][5] and get familiar with “Semantic Segmentation”, “Real-time networks”, “Domain Adaptation” and the datasets used;
2.    Replicate the experiments detailed in the following;
3.    Implement the Domain Adaptation branch and perform the experiments detailed in the following;
4.    Implement and test a variation of the project.

 

 

1st STEP) RELATED WORKS
Reading papers to get familiar with the task

Before starting, it is mandatory to take time to familiarize yourself with the tasks of Semantic Segmentation, Domain Adaptation and Real-time Semantic Segmentation. It is essential to understand the main problems and the main solutions proposed in the literature to tackle them. In more detail, read:

-       [1][2] to understand Semantic Segmentation and real-time solutions;
-       [3] to get familiar with the several solutions for unsupervised domain adaptation in Semantic Segmentation, focusing principally on adversarial methods;
-       [4] [5] to get familiar with the datasets that will be used in this project;
-       [6] to get familiar with adversarial training techniques.

2nd STEP) IMPLEMENTING AND TESTING REAL-TIME SEMANTIC SEGMENTATION NETWORK
Defining the baseline/upper bound for the domain adaptation phase

For this step you can assume for simplicity that your validation set is the same as the test set. Therefore:

-       Model: BiSeNet [2] (link)
-       Dataset: a subset of Cityscapes [4] (download here)
-       Training Set: Train folder
-       Validation Set = Test Set: Val folder
-       Training epochs: 50 epochs
-       Backbone: ResNet-101 (pre-trained on ImageNet) (or ResNet-18)
-       Semantic Classes: 19
-       Metrics: Pixel Accuracy and Mean Intersection over Union (mIoU) [read this to understand the metrics]
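To make the two metrics concrete, here is a minimal NumPy sketch that accumulates a confusion matrix and derives Pixel Accuracy and mIoU from it. It is an illustration, not the evaluation script of the project. Assumptions not stated in this document: labels are stored as train IDs in the range 0..18, pixels to be ignored carry the value 255, and the function names are hypothetical.

```python
import numpy as np


def confusion_matrix(pred, label, num_classes=19, ignore_index=255):
    """Rows are ground-truth classes, columns are predicted classes."""
    # Assumption: pixels labelled `ignore_index` are excluded from the metrics.
    mask = label != ignore_index
    idx = num_classes * label[mask].astype(np.int64) + pred[mask].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def pixel_accuracy(cm):
    """Fraction of valid pixels whose predicted class equals the label."""
    return np.diag(cm).sum() / cm.sum()


def mean_iou(cm):
    """Mean over classes of intersection / union, skipping absent classes."""
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    valid = union > 0  # classes that never appear are left out of the mean
    return (intersection[valid] / union[valid]).mean()
```

In practice the confusion matrix would be summed over all validation images before the two metrics are computed, so that mIoU is a dataset-level figure rather than an average of per-image scores.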

 

 

Complete the table below using the same hyperparameters as in the paper:

Table 1)
Experiment                                        | Accuracy (%) | mIoU (%)
BiSeNet (50 epochs + ResNet-101(18) as backbone)  | 71.5         | 45.6
 

 

The results above will be the upper bound (the maximum accuracy/mIoU) that the student wants/tries to reach for the Domain Adaptation phase.
 

 

3rd STEP) IMPLEMENTING UNSUPERVISED ADVERSARIAL DOMAIN ADAPTATION. MAKE THE FRAMEWORK LIGHTWEIGHT

Perform adversarial training with labeled synthetic data (source) and unlabelled real-world data (target). Replace the discriminator's convolutions with their lightweight counterparts to make the whole network real-time.

You can assume:

-       Source Synthetic Labeled Dataset: GTA5 [5]
        -       A subset of this dataset is provided here. (The folder is the same as above, which contains Cityscapes and GTA5.)
        -       Implement a loader class for the GTA5 synthetic dataset.
        -       Pay attention to the semantic classes. You have to select just the 19 in common with Cityscapes. The JSON file provided to you within the dataset indicates the correct mapping between GTA5 and Cityscapes.

 

 

-       Target Real-World Unlabelled Dataset: Cityscapes [4]
        -       The same as for step 2; note that the semantic labels are not used during training.
-       Test Set: val folder
-       Semantic classes: 19
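The class selection for GTA5 can be sketched as a lookup table that maps raw label IDs to the 19 training classes and sends everything else to an ignore value. This is only a sketch under assumptions: the layout of the provided JSON file is not described in this document, so the mapping is passed in as a plain dictionary; the ignore value 255 and the function name `remap_labels` are hypothetical choices.

```python
import numpy as np


def remap_labels(label, id_to_train_id, ignore_index=255):
    """Map raw label IDs to train IDs; unmapped IDs become `ignore_index`.

    `label` is an integer array with values in 0..255.
    `id_to_train_id` is a dict such as one loaded from a JSON file
    (assumption: keys are raw IDs, possibly as strings, values are train IDs).
    """
    lut = np.full(256, ignore_index, dtype=np.uint8)
    for raw_id, train_id in id_to_train_id.items():
        lut[int(raw_id)] = train_id  # JSON keys arrive as strings
    return lut[label]
```

A loader class would typically apply this to each label image right after reading it, so that the segmentation loss only ever sees the 19 shared classes plus the ignore value.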

 

 

-       Implement the discriminator, as in [6].
-       Rewrite the training file to perform adversarial domain adaptation between the source and the target domain.
-       Take the same parameters as in step 2 and perform training. What is the maximum accuracy/mIoU reached when testing on the Cityscapes test data?
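As a starting point for the discriminator and the adversarial losses, here is a minimal PyTorch sketch in the spirit of [6]. It is not the reference implementation. The layer widths (64-128-256-512-1), the 4x4 stride-2 convolutions, the domain label convention (source = 0, target = 1) and all function names are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

SOURCE_LABEL, TARGET_LABEL = 0.0, 1.0  # assumed convention


class FCDiscriminator(nn.Module):
    """Fully convolutional discriminator over softmax segmentation maps."""

    def __init__(self, num_classes=19, ndf=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes, ndf, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf, ndf * 2, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 2, ndf * 4, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 4, ndf * 8, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 8, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x)  # one domain logit per spatial location


def domain_bce(d_out, domain_label):
    """Binary cross-entropy between discriminator logits and a domain label."""
    return F.binary_cross_entropy_with_logits(
        d_out, torch.full_like(d_out, domain_label))


def generator_adv_loss(disc, target_logits):
    """Segmentation-network loss: make target predictions look like source.

    The discriminator weights are expected to be frozen during this step.
    """
    return domain_bce(disc(F.softmax(target_logits, dim=1)), SOURCE_LABEL)


def discriminator_loss(disc, source_logits, target_logits):
    """Discriminator loss: tell source predictions from target predictions."""
    # detach() stops gradients from reaching the segmentation network
    d_src = disc(F.softmax(source_logits.detach(), dim=1))
    d_trg = disc(F.softmax(target_logits.detach(), dim=1))
    return 0.5 * (domain_bce(d_src, SOURCE_LABEL) + domain_bce(d_trg, TARGET_LABEL))
```

A training iteration would then combine the supervised segmentation loss on the source batch with `generator_adv_loss` (scaled by a small weight) to update the segmentation network, followed by `discriminator_loss` to update the discriminator with its own optimizer. With the assumed widths the discriminator has 2,781,121 parameters, which is in line with the 2.781 M reported in Table 2.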

 

-       Measure and report the total number of parameters and Floating Point Operations (FLOPs) (search for a library to do this). Report the results in the table below:

Table 2)
Experiment                     | Accuracy (%) | mIoU (%) | Total Parameters | FLOPs
Adversarial Domain Adaptation  | 66.8         | 28.8     | 2.781 M          | 30.89 × 10^9
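Before reaching for a dedicated library, it can help to see what such a tool measures. The following PyTorch sketch counts trainable parameters and the multiply-accumulate operations (MACs) of the convolution layers using forward hooks. It is a simplified illustration: only `nn.Conv2d` layers are counted, bias additions and activations are ignored, and the function names are hypothetical. Profiling libraries differ on whether they report MACs or FLOPs (a common convention is 1 MAC = 2 FLOPs), so check which one a given tool returns before filling in the table.

```python
import torch
import torch.nn as nn


def count_parameters(model):
    """Total number of trainable parameters."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


def count_conv_macs(model, example_input):
    """Multiply-accumulates of all nn.Conv2d layers for one forward pass."""
    total = 0

    def hook(module, inputs, output):
        nonlocal total
        kh, kw = module.kernel_size
        # each output element needs (in_channels / groups) * kh * kw MACs
        total += output.numel() * (module.in_channels // module.groups) * kh * kw

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.Conv2d)]
    was_training = model.training
    model.eval()
    with torch.no_grad():
        model(example_input)
    model.train(was_training)  # restore the original mode
    for handle in handles:
        handle.remove()
    return total
```

The MAC count depends on the spatial size of `example_input`, so the input resolution should be reported together with the figure.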
 

 

-       Replace each convolution of the adversarial discriminator with a lightweight depthwise-separable convolution and perform training again. Measure the number of parameters and FLOPs. Have they changed? Report the results in the table below:

Table 3)
Experiment                                 | Accuracy (%) | mIoU (%) | Total Parameters | FLOPs
Lightweight Adversarial Domain Adaptation  | 70.6         | 30.7     | 189.424 K        | 21.47 × 10^8
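A depthwise-separable convolution splits a standard k×k convolution into a per-channel k×k (depthwise) convolution followed by a 1×1 (pointwise) convolution. Ignoring biases, the weight count drops from k²·C_in·C_out to k²·C_in + C_in·C_out. The PyTorch sketch below shows one possible way to build a lightweight discriminator this way. The channel widths, the 4x4 stride-2 kernels, the absence of biases and the names `DepthwiseSeparableConv2d` and `lightweight_discriminator` are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv2d(nn.Module):
    """Depthwise k x k convolution followed by a pointwise 1 x 1 convolution."""

    def __init__(self, in_ch, out_ch, kernel_size=4, stride=2, padding=1):
        super().__init__()
        # groups=in_ch gives one spatial filter per input channel
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size, stride, padding,
                                   groups=in_ch, bias=False)
        # the 1x1 convolution mixes channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


def lightweight_discriminator(num_classes=19, ndf=64):
    """Five depthwise-separable blocks with LeakyReLU between them."""
    channels = [num_classes, ndf, ndf * 2, ndf * 4, ndf * 8]
    layers = []
    for c_in, c_out in zip(channels[:-1], channels[1:]):
        layers += [DepthwiseSeparableConv2d(c_in, c_out),
                   nn.LeakyReLU(0.2, inplace=True)]
    layers.append(DepthwiseSeparableConv2d(channels[-1], 1))  # domain logit
    return nn.Sequential(*layers)
```

Under these assumptions the model has 189,424 parameters, which agrees with the 189.424 K reported in Table 3. That agreement is only a consistency check on the sketch; other layer choices would give a different count.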
4th STEP) IMPROVEMENTS
Select one variation for the project among the ones proposed:

a)   Image-to-image translation to improve domain adaptation: FDA [7]
You have to implement FDA, which is a fast and parameterless image-to-image translation algorithm, to improve the overall domain adaptation performance. Test it and compare to the step 3 results.

b)   Image-to-image translation to improve domain adaptation: LAB [8]
You have to implement LAB (section 3.1), which is a fast and parameterless image-to-image translation algorithm, to improve the overall domain adaptation performance. Test it and compare to the step 3 results.

c)   Pseudo-labelling of the target domain as in BDL [9]
Generate pseudo-labels for the target domain by implementing the “Max Probability Threshold” (MPT) defined in the BDL method. Test it and compare to the step 3 results.

d)   Replace the real-time semantic segmentation network with a different state-of-the-art model and compare the results. How can you further improve them?

e)   Address the class imbalance problem in semantic segmentation, e.g. by modifying the segmentation loss.

f)   Propose your own extension.
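For option a), the core idea of FDA [7] is to replace the low-frequency amplitude spectrum of a source image with that of a target image while keeping the source phase. The NumPy sketch below illustrates that idea. It assumes both images are float arrays of identical shape (C, H, W); the function name and the parameter name `beta` (the relative size of the swapped low-frequency window) are choices made for this example, and the sketch should be checked against the paper before being used for experiments.

```python
import numpy as np


def fda_source_to_target(src, trg, beta=0.01):
    """Give `src` the low-frequency amplitude of `trg`, keeping the phase of `src`.

    src, trg: float arrays of shape (C, H, W) with identical shapes.
    beta: fraction of min(H, W) used as half-size of the swapped window.
    """
    fft_src = np.fft.fft2(src, axes=(-2, -1))
    fft_trg = np.fft.fft2(trg, axes=(-2, -1))
    amp_src, pha_src = np.abs(fft_src), np.angle(fft_src)
    amp_trg = np.abs(fft_trg)

    # move the zero frequency to the centre so the low band is a square window
    amp_src = np.fft.fftshift(amp_src, axes=(-2, -1))
    amp_trg = np.fft.fftshift(amp_trg, axes=(-2, -1))

    _, h, w = src.shape
    b = int(np.floor(min(h, w) * beta))
    ch, cw = h // 2, w // 2
    amp_src[:, ch - b:ch + b + 1, cw - b:cw + b + 1] = \
        amp_trg[:, ch - b:ch + b + 1, cw - b:cw + b + 1]

    amp_src = np.fft.ifftshift(amp_src, axes=(-2, -1))
    mixed = amp_src * np.exp(1j * pha_src)  # new amplitude, original phase
    return np.real(np.fft.ifft2(mixed, axes=(-2, -1)))
```

The translated source image keeps the source labels, so it can be fed to the segmentation network in place of the original source image. Values may fall slightly outside the original intensity range, so clipping before any further processing may be needed.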

AT THE END
-       Deliver PyTorch scripts for all the required steps.

-       Deliver this file with the tables filled in.

-       Write a complete PDF report (in paper style). The report should contain a brief introduction, a related work section, a methodological section describing the algorithm that you are going to use, an experimental section with all the results and discussions, and a final brief conclusion. Follow this link to open and create the template for the report.

EXAMPLE OF QUESTIONS YOU SHOULD BE ABLE TO ANSWER AT THE END OF THE PROJECT
-       What is Semantic Segmentation?

-       What is a Domain Shift?

-       What is Domain Adaptation?

-       What are the most common solutions to perform domain adaptation in Semantic Segmentation?

-       What are the main reasons to use real-time Semantic Segmentation?

-       How does the adversarial learning technique work for domain adaptation?

-       What are the main limitations of domain adaptation?

-       What is a depthwise-separable convolution?

REFERENCES
[1]   “A Brief Survey on Semantic Segmentation with Deep Learning”, Shijie Hao, Yuan Zhou, Yanrong Guo, PDF

[2]   “BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation”, Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang, PDF

[3]   “A Review of Single-Source Deep Unsupervised Visual Domain Adaptation”, Sicheng Zhao, Xiangyu Yue, Shanghang Zhang, Bo Li, Han Zhao, Bichen Wu, Ravi Krishna, Joseph E. Gonzalez, Alberto L. Sangiovanni-Vincentelli, Sanjit A. Seshia, Kurt Keutzer, PDF

[4]   “The Cityscapes Dataset for Semantic Urban Scene Understanding”, M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, B. Schiele, PDF

[5]   “Playing for Data: Ground Truth from Computer Games”, Stephan Richter, Vibhav Vineet, Stefan Roth, Vladlen Koltun, PDF

[6]   “Learning to Adapt Structured Output Space for Semantic Segmentation”, Yi-Hsuan Tsai, Wei-Chih Hung, Samuel Schulter, Kihyuk Sohn, Ming-Hsuan Yang, Manmohan Chandraker, PDF

[7]   “FDA: Fourier Domain Adaptation for Semantic Segmentation”, Yanchao Yang, Stefano Soatto, PDF

[8]   “Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation”, Jianzhong He, Xu Jia, Shuaijun Chen, Jianzhuang Liu, PDF

[9]   “Bidirectional Learning for Domain Adaptation of Semantic Segmentation”, Yunsheng Li, Lu Yuan, Nuno Vasconcelos, PDF

[10]  “Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization”, Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra, PDF

[11]  You can find code for a lot of adversarial domain adaptation methods here.
