NSYSU-Attention Solved

Overview
•    In Assignment #5, we implemented an image captioning model.

•    In this assignment, you will add an attention module.

•    You are free to use pre-trained models like ResNet or LSTM as your backbone structure.

Image Captioning
 
Image captioning is an interdisciplinary research problem that stands between computer vision and natural language processing.

ref: https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning



 
Image Captioning with Attention 


Flickr8k Dataset
•    Flickr8k-Images-Captions

•    Collected by Alexander Mamaev.

•    Intended for sentence-based image description and search.

•    Consists of 8,091 images, each paired with five different captions.

A child in a pink dress is climbing up a set of stairs in an entry way .
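Because each image comes with several captions, a data loader usually starts by grouping captions under their image. The following is a minimal sketch of that step, assuming the captions are stored as CSV with an `image,caption` header row. That layout, the `group_captions` helper, the file name `example.jpg`, and the second caption are illustrative assumptions, not part of the assignment skeleton.

```python
import csv
import io
from collections import defaultdict


def group_captions(lines):
    """Map each image filename to a list of tokenized captions.

    `lines` is any iterable of CSV lines whose first row is a header
    (assumed here to be "image,caption").
    """
    captions = defaultdict(list)
    reader = csv.reader(lines)
    next(reader)  # skip the assumed header row
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        image, caption = row[0], row[1]
        # Lowercase and split on whitespace; the captions are already
        # tokenized with spaces around punctuation.
        captions[image].append(caption.lower().split())
    return dict(captions)


# Hypothetical two-line sample; the real dataset has five captions per image.
sample = io.StringIO(
    "image,caption\n"
    "example.jpg,A child in a pink dress is climbing up a set of stairs in an entry way .\n"
    "example.jpg,A hypothetical second caption for the same image .\n"
)
captions_by_image = group_captions(sample)
```

With the real file, `captions_by_image` should hold one entry per image, each with five token lists.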

Assignment #6 Dataset
We use the same dataset as Assignment #5.

 

Your task
•    We provide a code skeleton for you.

•    https://colab.research.google.com/drive/15zLahwl2Ud8TAJIaUyh79SFee3hEb45_?usp=sharing

•    Design an image captioning model with an attention module.

•    To achieve high accuracy, you’ll need to experiment with different filter sizes, different numbers of layers, and the other design principles discussed in class to find the network architecture that works best.

•    You’ll also need to try data augmentation, dropout, and batch normalization, as well as different optimizers and other tricks, to boost performance.
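One possible starting point for the attention module is soft (additive) attention in PyTorch, similar in spirit to the tutorial referenced above. This is a sketch only, not the skeleton's required design. The class name `AdditiveAttention`, the dimension values, and the 7×7 feature grid (49 locations) are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class AdditiveAttention(nn.Module):
    """Soft attention over the spatial locations of an encoder feature map."""

    def __init__(self, encoder_dim, decoder_dim, attention_dim):
        super().__init__()
        self.encoder_proj = nn.Linear(encoder_dim, attention_dim)
        self.decoder_proj = nn.Linear(decoder_dim, attention_dim)
        self.score = nn.Linear(attention_dim, 1)

    def forward(self, encoder_out, decoder_hidden):
        # encoder_out:    (batch, num_locations, encoder_dim)
        # decoder_hidden: (batch, decoder_dim)
        enc = self.encoder_proj(encoder_out)                   # (batch, num_locations, attention_dim)
        dec = self.decoder_proj(decoder_hidden).unsqueeze(1)   # (batch, 1, attention_dim)
        scores = self.score(torch.tanh(enc + dec)).squeeze(2)  # (batch, num_locations)
        alpha = torch.softmax(scores, dim=1)                   # weights sum to 1 per image
        # Weighted sum of encoder features gives the context vector.
        context = (encoder_out * alpha.unsqueeze(2)).sum(dim=1)  # (batch, encoder_dim)
        return context, alpha


# Example with random tensors standing in for CNN features and an LSTM state.
torch.manual_seed(0)
attention = AdditiveAttention(encoder_dim=512, decoder_dim=256, attention_dim=128)
features = torch.randn(4, 49, 512)  # assumed 7x7 grid flattened to 49 locations
hidden = torch.randn(4, 256)
context, alpha = attention(features, hidden)
```

At each decoding step, `context` would typically be concatenated with the current word embedding and fed to the LSTM. `alpha` can be reshaped to the feature grid to visualize where the model is looking.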
