NSYSU-Attention Solved

Overview
• In Assignment #5, we implemented an image captioning model.

• In this assignment, you will add an attention module.

• You are free to build on existing models, such as a pre-trained ResNet or an LSTM, as your backbone.

Image Captioning

Image captioning is an interdisciplinary research problem at the intersection of computer vision and natural language processing.

ref: https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning

Image Captioning with Attention

Flickr8k Dataset
• Flickr8k-Images-Captions

• Collected by Alexander Mamaev.

• Sentence-based image description and search

• Consists of 8,091 images, each paired with five different captions

Example caption: A child in a pink dress is climbing up a set of stairs in an entry way .
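Because each image comes with several tokenized captions, a captioning model usually needs a word-to-index vocabulary before training. The following is a minimal sketch of that step, not the course skeleton's code. The function names `build_vocab` and `encode`, the special tokens, and the `min_count` parameter are all assumptions made for illustration.

```python
from collections import Counter

# Special tokens are an assumption; the course skeleton may use different ones.
SPECIALS = ["<pad>", "<start>", "<end>", "<unk>"]

def build_vocab(captions, min_count=1):
    """Map each sufficiently frequent lowercase token to an integer id."""
    counts = Counter(tok for cap in captions for tok in cap.lower().split())
    vocab = {tok: i for i, tok in enumerate(SPECIALS)}
    for tok, n in sorted(counts.items()):
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab

def encode(caption, vocab):
    """Turn a caption into ids, wrapped in <start> ... <end>."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in caption.lower().split()]
    return [vocab["<start>"]] + ids + [vocab["<end>"]]

caption = "A child in a pink dress is climbing up a set of stairs in an entry way ."
vocab = build_vocab([caption])
print(encode(caption, vocab))
```

Words that do not appear in the vocabulary map to `<unk>`, so raising `min_count` is one simple way to shrink the output layer of the decoder.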

Assignment #6 Dataset
We use the same dataset as Assignment #5.

Your task
• We provide a code skeleton for you.

• https://colab.research.google.com/drive/15zLahwl2Ud8TAJIaUyh79SFee3h Eb45_?usp=sharing

• Design an image captioning model with an attention module.

• To achieve high accuracy, you’ll need to experiment with different filter sizes, different numbers of layers, and other design principles discussed in class to find the network architecture that works best.

• You’ll also need to try data augmentation, dropout, and batch normalization, as well as different optimizers and other tricks, to boost performance.
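To make the attention module in the task list concrete, here is a minimal sketch of soft attention in plain Python. At each decoding step it scores every image region against the decoder's hidden state, normalizes the scores with a softmax, and returns the weighted sum of region features as a context vector. The function names, the vector shapes, and the dot-product scoring are assumptions made for brevity. An actual solution would more likely use a learned scoring network written in a deep learning framework.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention(region_features, hidden):
    """
    region_features: list of N vectors, one per image region, each of length D
    hidden:          decoder hidden state, a vector of length D
    Returns (context, weights): the attention-weighted sum of the region
    features and the N attention weights, which sum to 1.
    """
    # Dot-product score between each region and the hidden state (assumption).
    scores = [sum(f * h for f, h in zip(feat, hidden)) for feat in region_features]
    weights = softmax(scores)
    dim = len(hidden)
    context = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return context, weights

# Toy example: three regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
context, weights = soft_attention(regions, hidden)
print(weights)
print(context)
```

In a captioning decoder, the context vector is typically concatenated with the previous word embedding and fed to the LSTM at each step, so the model can look at different regions when generating different words.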
