NSYSU-Image Captioning Solved

Overview
• In the previous assignments, we implemented multiple image classification tasks.

• In this assignment, you will design and train a neural network that combines a CNN and an RNN to process an input image and output a word sequence that describes the image.

• You are free to use pre-trained models like ResNet or LSTM as your backbone structure.
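The CNN-plus-RNN combination described above can be sketched as an encoder-decoder pair. This is only a minimal illustration in PyTorch, not the course skeleton: the class names `EncoderCNN` and `DecoderRNN`, the layer sizes, and the choice of a tiny from-scratch CNN (instead of a pre-trained ResNet) are all assumptions made for brevity.

```python
import torch
import torch.nn as nn

class EncoderCNN(nn.Module):
    """Tiny CNN that maps an image to one feature vector (hypothetical stand-in for ResNet)."""
    def __init__(self, embed_size):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool -> (batch, 64, 1, 1)
        )
        self.fc = nn.Linear(64, embed_size)

    def forward(self, images):
        return self.fc(self.features(images).flatten(1))  # (batch, embed_size)

class DecoderRNN(nn.Module):
    """LSTM that predicts the caption token by token, conditioned on the image feature."""
    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Teacher forcing: the image feature is the first input step,
        # followed by every caption token except the last one.
        token_emb = self.embed(captions[:, :-1])
        inputs = torch.cat([features.unsqueeze(1), token_emb], dim=1)
        hidden, _ = self.lstm(inputs)
        return self.fc(hidden)  # (batch, caption_length, vocab_size)

# Smoke test on random data (all sizes below are arbitrary assumptions).
torch.manual_seed(0)
vocab_size = 50
encoder = EncoderCNN(embed_size=32)
decoder = DecoderRNN(embed_size=32, hidden_size=64, vocab_size=vocab_size)
images = torch.randn(4, 3, 64, 64)
captions = torch.randint(1, vocab_size, (4, 12))
logits = decoder(encoder(images), captions)
# Index 0 is assumed to be the padding token and is ignored by the loss.
loss = nn.CrossEntropyLoss(ignore_index=0)(
    logits.reshape(-1, vocab_size), captions.reshape(-1)
)
```

With this layout, the output at step t is trained to predict caption token t, so the image feature alone must predict the first token. Swapping in a pre-trained backbone would only change `EncoderCNN`.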

Image Captioning
Image captioning is an interdisciplinary research problem at the intersection of computer vision and natural language processing.

Flickr8k Dataset
• Flickr8k-Images-Captions

• Collected by Alexander Mamaev.

• Sentence-based image description and search

• Consists of 8,091 images, each paired with five different captions

A child in a pink dress is climbing up a set of stairs in an entry way .
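Before captions like the one above can be fed to an RNN, they are usually converted to integer sequences. The following is a rough sketch of that step using only the standard library; the special tokens, the helper names (`tokenize`, `build_vocab`, `encode`), and the padding length are assumptions of this example, not part of the dataset or the course skeleton.

```python
from collections import Counter

# Special tokens are an assumption of this sketch.
PAD, START, END, UNK = "<pad>", "<start>", "<end>", "<unk>"

def tokenize(caption):
    """Lowercase, split on whitespace, and drop tokens with no letters or digits."""
    return [tok for tok in caption.lower().split() if any(ch.isalnum() for ch in tok)]

def build_vocab(captions, min_count=1):
    """Map each word seen at least min_count times to an integer id."""
    counts = Counter(tok for cap in captions for tok in tokenize(cap))
    words = sorted(w for w, c in counts.items() if c >= min_count)
    return {w: i for i, w in enumerate([PAD, START, END, UNK] + words)}

def encode(caption, vocab, max_len):
    """Wrap the caption in START/END, truncate if needed, and pad to max_len."""
    tokens = tokenize(caption)[: max_len - 2]  # leave room for START and END
    ids = [vocab[START]] + [vocab.get(t, vocab[UNK]) for t in tokens] + [vocab[END]]
    return ids + [vocab[PAD]] * (max_len - len(ids))

captions = ["A child in a pink dress is climbing up a set of stairs in an entry way ."]
vocab = build_vocab(captions)
ids = encode(captions[0], vocab, max_len=24)
```

In practice the vocabulary would be built from the training captions only, and a `min_count` above 1 is a common way to send rare words to `<unk>`.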

Assignment #5 Dataset
8,091 images

captions

Your task
• We provide a code skeleton for you.

• https://colab.research.google.com/drive/1E96yjndJyBTAEEcgSthRyAVqd4H1WcW?usp=sharing

• Design a convolutional neural network to do image captioning.

• The images provided are of different resolutions. You’ll need to resize the images into a fixed size of your own choice.

• To achieve high accuracy, you’ll need to experiment with different filter sizes, different numbers of layers, and other design principles discussed in class to find the network architecture that works best.

• You’ll also need to try data augmentation, dropout, batch normalization as well as different optimizers and other tricks to boost performance.
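The resizing and augmentation steps listed above might look like the sketch below. It uses only core PyTorch; the 224x224 target size, the function names `preprocess` and `collate`, and the choice of a random horizontal flip as the augmentation are assumptions for illustration. Note that this resize does not preserve aspect ratio, and that flipping can contradict captions that mention left or right.

```python
import torch
import torch.nn.functional as F

def preprocess(image, size=(224, 224), train=False, flip_prob=0.5):
    """Resize a float image tensor of shape (3, H, W) to (3, size[0], size[1]).

    During training, the image is mirrored horizontally with probability flip_prob.
    """
    out = F.interpolate(
        image.unsqueeze(0), size=size, mode="bilinear", align_corners=False
    ).squeeze(0)
    if train and torch.rand(1).item() < flip_prob:
        out = torch.flip(out, dims=[2])  # flip along the width axis
    return out

def collate(images, size=(224, 224), train=False):
    """Stack images of different resolutions into a single batch tensor."""
    return torch.stack([preprocess(img, size, train) for img in images])

# Two random "images" with different resolutions (placeholder data).
raw = [torch.rand(3, 300, 500), torch.rand(3, 480, 320)]
batch = collate(raw)
```

Dropout and batch normalization belong inside the model rather than in preprocessing, and alternative optimizers can be tried by swapping the optimizer class in the training loop.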

Things you cannot do
• You cannot copy trained models from others.

• You cannot copy a whole page of code from the Internet.

Any violation will result in no points!
