Data Engineering-Project Solved

Examples of Tasks
Below you can find examples of valid tasks for each of the project phases. You can use them, or you can invent new ones (appreciated).

NOTE we are not interested in the actual result but in the design process. We recommend the group to work together and brainstorm, try different things and carry on a discussion for each tasks

Data Cleansing

Remove non-meme entries from Know Your Meme

Remove data that does not conform to the schema

Remove memes with bad words or sensitive content

Make the content structure uniform, e.g., by clustering similar tags
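
The cleansing steps above could be sketched as follows. This is only an illustration: the field names (`title`, `url`, `category`, `about`, `tags`), the required-field list, and the blocklist are assumptions, not part of the actual Know Your Meme dump. Tag "clustering" is reduced here to simple normalisation of case, spacing, and hyphens; a real solution would probably need something stronger.

```python
# Field names, REQUIRED_FIELDS and BLOCKLIST are assumptions made for illustration.
REQUIRED_FIELDS = ("title", "url", "category", "tags")
BLOCKLIST = {"badword"}  # placeholder; substitute a real word list


def conforms(entry: dict) -> bool:
    """True if all required fields exist and have the expected basic types."""
    if any(field not in entry for field in REQUIRED_FIELDS):
        return False
    if not isinstance(entry["category"], str):
        return False
    tags = entry["tags"]
    return isinstance(tags, list) and all(isinstance(t, str) for t in tags)


def is_clean(entry: dict) -> bool:
    """True if no blocklisted word appears in the title or the 'about' text."""
    words = f"{entry['title']} {entry.get('about', '')}".lower().split()
    return not any(word in BLOCKLIST for word in words)


def normalize_tags(tags: list[str]) -> list[str]:
    """Merge trivially different spellings: case, surrounding spaces, hyphens."""
    return sorted({t.strip().lower().replace("-", " ") for t in tags if t.strip()})


def cleanse(entries: list[dict]) -> list[dict]:
    cleaned = []
    for entry in entries:
        if not conforms(entry):
            continue  # does not match the (assumed) schema
        if entry["category"].lower() != "meme":
            continue  # non-meme entry such as an event or a person
        if not is_clean(entry):
            continue  # contains a blocklisted word
        cleaned.append({**entry, "tags": normalize_tags(entry["tags"])})
    return cleaned
```

Each filter is a separate function so that the group can discuss and replace the criteria one at a time.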

Data Augmentation/Enrichment

Process the memes' text fields (about, origin) to include a bag of words

Download instances related to each meme template

Link meme templates to ImgFlip meme templates

Extract temporal information from text fields

Link the memes to knowledge graphs: DBpedia, Wikidata, YAGO

Include new data by using extra APIs: DBpedia Spotlight or Google Vision (we have credits if you need some)
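
Two of the enrichment tasks above, the bag of words and the temporal information, can be prototyped without external services. The sketch below assumes each meme is a dict with `about` and `origin` string fields. The stopword list is a tiny placeholder, and "temporal information" is narrowed to four-digit years from 1900 to 2099, which is a simplification rather than a complete date parser.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "was"}
# Matches standalone four-digit years between 1900 and 2099.
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")


def bag_of_words(text: str) -> dict[str, int]:
    """Count lowercase tokens, ignoring stopwords."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return dict(Counter(t for t in tokens if t not in STOPWORDS))


def extract_years(text: str) -> list[int]:
    """Return the distinct years mentioned in the text, sorted ascending."""
    return sorted({int(y) for y in YEAR_RE.findall(text)})


def enrich(meme: dict) -> dict:
    """Return a copy of the meme with 'bow' and 'years' fields added."""
    text = " ".join(meme.get(field, "") for field in ("about", "origin"))
    return {**meme, "bow": bag_of_words(text), "years": extract_years(text)}
```

For example, `enrich({"about": "Doge is a meme.", "origin": "Posted in 2010."})` adds a `bow` dict and `years` equal to `[2010]`.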

Data Transformations

Convert the memes into RDF/Labelled Property Graph

Create a relational model (ER) for any of the available datasets
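
As one possible starting point for the RDF conversion, the sketch below serialises a meme as N-Triples lines using only the standard library. The `example.org` namespaces, the `parent` and `tag` properties, and the input field names are all hypothetical; a real solution would more likely reuse an existing vocabulary and an RDF library.

```python
from urllib.parse import quote

BASE = "http://example.org/meme/"  # hypothetical namespace for meme resources
PROP = "http://example.org/prop/"  # hypothetical namespace for properties
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def iri(title: str) -> str:
    """Build an IRI from a title: lowercase, hyphenated, percent-encoded."""
    slug = title.strip().lower().replace(" ", "-")
    return f"<{BASE}{quote(slug, safe='-')}>"


def literal(value: str) -> str:
    """Escape a string as a plain N-Triples literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(meme: dict) -> list[str]:
    """Emit one triple for the label, one for the parent, and one per tag."""
    subject = iri(meme["title"])
    triples = [f"{subject} <{RDFS_LABEL}> {literal(meme['title'])} ."]
    if meme.get("parent"):
        triples.append(f"{subject} <{PROP}parent> {iri(meme['parent'])} .")
    for tag in meme.get("tags", []):
        triples.append(f"{subject} <{PROP}tag> {literal(tag)} .")
    return triples
```

Modelling the parent as an IRI rather than a literal is the design choice worth discussing here: it makes the parent a node that other triples can point to.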

Analysis

The queries should be written in two different query languages, e.g., SQL and Cypher (MongoDB and SPARQL are also fine).

What are the most popular memes across country/website/community?

How many memes include a parent relation?

Design issues: define the grouping criteria; design the relational schema

Given a meme m, what are the related memes m' in the dataset?

Return all the meme pairs that are not directly related to each other but are related to at least two of the same memes

Return all the memes that share an entity but are not directly related

Design issues: define what "related" means (e.g., a link of any kind exists); identify node identity; encode the properties adequately
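
Several of the queries above can be tried out in SQL with Python's built-in `sqlite3` module. The schema below (tables `meme`, `related`, `meme_entity`, and the `link` view) and the sample rows are invented for illustration. The sketch also assumes that "related" is symmetric, which is exactly the kind of design decision the group should debate. The second query language (e.g., Cypher) is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meme (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                   parent_id INTEGER REFERENCES meme(id));
CREATE TABLE related (src INTEGER REFERENCES meme(id),
                      dst INTEGER REFERENCES meme(id));
CREATE TABLE meme_entity (meme_id INTEGER REFERENCES meme(id), entity TEXT);
-- Assumption: relatedness is symmetric, so expose every link in both directions.
CREATE VIEW link AS
    SELECT src AS a, dst AS b FROM related
    UNION
    SELECT dst AS a, src AS b FROM related;

INSERT INTO meme VALUES (1, 'A', NULL), (2, 'B', 1), (3, 'C', 1), (4, 'D', NULL);
INSERT INTO related VALUES (1, 2), (1, 3), (4, 2), (4, 3);
INSERT INTO meme_entity VALUES (1, 'dog'), (4, 'dog'), (2, 'cat');
""")

# How many memes include a parent relation?
with_parent = conn.execute(
    "SELECT COUNT(*) FROM meme WHERE parent_id IS NOT NULL"
).fetchone()[0]


def related_to(meme_id: int) -> list[int]:
    """Given a meme m, return the ids of the memes directly linked to it."""
    rows = conn.execute("SELECT b FROM link WHERE a = ? ORDER BY b", (meme_id,))
    return [row[0] for row in rows]


# Pairs with no direct link that share at least two related memes.
common_neighbours = conn.execute("""
    SELECT x.a AS m1, y.a AS m2, COUNT(*) AS shared
    FROM link x JOIN link y ON x.b = y.b AND x.a < y.a
    WHERE NOT EXISTS (SELECT 1 FROM link d WHERE d.a = x.a AND d.b = y.a)
    GROUP BY x.a, y.a
    HAVING COUNT(*) >= 2
    ORDER BY m1, m2
""").fetchall()

# Pairs with no direct link that share at least one entity.
shared_entity = conn.execute("""
    SELECT DISTINCT e1.meme_id AS m1, e2.meme_id AS m2
    FROM meme_entity e1 JOIN meme_entity e2
      ON e1.entity = e2.entity AND e1.meme_id < e2.meme_id
    WHERE NOT EXISTS (SELECT 1 FROM link d
                      WHERE d.a = e1.meme_id AND d.b = e2.meme_id)
    ORDER BY m1, m2
""").fetchall()
```

The `x.a < y.a` condition reports each unordered pair once. In a graph database the same questions become pattern matches over paths of length two, which is a useful point of comparison for the discussion.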
