ECE684-Document vectors Solved

Document classification/TF-IDF

Explore how term-document matrices and weightings can be used for document classification. You will be attempting to distinguish between documents from different categories in the Brown corpus.

Use the provided script as a starting point. Before beginning, read it and make sure you understand what it does. Then implement three kinds of document vectors:

1. Raw counts of terms in each document.

2. TF-IDF weighting, using the specific scheme described by Jurafsky and Martin (ch. 6).

3. Another weighting of your own invention/discovery. This may be another TF-IDF variant, or something else entirely.
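
As a rough illustration of the first two vector types, here is a minimal sketch in plain Python. It does not use the provided script, so the function names `raw_counts` and `tf_idf` and the toy documents are hypothetical. The weighting assumes the form tf = log10(count + 1) and idf = log10(N / df), which is how one version of Jurafsky and Martin ch. 6 states it; other editions of the chapter define tf slightly differently, so check the formula against the edition assigned in class.

```python
import math
from collections import Counter


def raw_counts(docs):
    """Build a document-by-term count matrix from tokenized documents.

    Returns (vocab, matrix), where matrix[d][t] is the number of times
    vocab[t] occurs in docs[d].
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    matrix = []
    for doc in docs:
        row = [0] * len(vocab)
        for tok, count in Counter(doc).items():
            row[index[tok]] = count
        matrix.append(row)
    return vocab, matrix


def tf_idf(matrix):
    """Reweight a count matrix with an assumed TF-IDF scheme.

    Assumption: tf = log10(count + 1), idf = log10(N / df), where N is the
    number of documents and df is the number of documents containing the term.
    """
    n_docs = len(matrix)
    n_terms = len(matrix[0]) if matrix else 0
    # Document frequency: in how many documents does each term appear?
    df = [sum(1 for row in matrix if row[t] > 0) for t in range(n_terms)]
    idf = [math.log10(n_docs / d) if d else 0.0 for d in df]
    return [
        [math.log10(count + 1) * idf[t] for t, count in enumerate(row)]
        for row in matrix
    ]


# Toy example (hypothetical data, not the Brown corpus).
docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat", "sat"],
    ["the", "bird", "flew"],
]
vocab, counts = raw_counts(docs)
weights = tf_idf(counts)
```

In this toy example "the" appears in every document, so its idf is log10(3/3) = 0 and its weight is zero everywhere, which is the behavior that makes TF-IDF useful for telling categories apart. A third weighting could be dropped in by writing another function with the same signature as `tf_idf`.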
