The ability to be an effective member of a virtual team is highly valued in the data science job market.
----------- o -----------
Using any of the three classifiers described in chapter 6 of Natural Language Processing with Python, and any features you can think of, build the best name gender classifier you can.
Begin by splitting the Names Corpus into three subsets: 500 names for the test set, 500 names for the dev-test set, and the remaining names (roughly 6,900) for the training set. Then, starting with the example name gender classifier, make incremental improvements. Use the dev-test set to check your progress. Once you are satisfied with your classifier, check its final performance on the test set.
How does the performance on the test set compare to the performance on the dev-test set? Is this what you'd expect?
Source: Natural Language Processing with Python, exercise 6.10.2.
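One possible starting point for the split and the baseline is sketched below. It is a minimal sketch, not a prescribed solution: the helper names `split_names`, `misclassified`, and `evaluate`, along with the fixed random seed, are assumptions made for illustration. The last-letter feature follows the book's example classifier. The NLTK import sits inside `evaluate` because it needs the Names Corpus to be installed (for example via `nltk.download("names")`).

```python
import random


def gender_features(name):
    """Baseline feature from the book's example: the last letter of the name."""
    return {"last_letter": name[-1]}


def split_names(labeled_names, n_test=500, n_devtest=500, seed=0):
    """Shuffle a list of (name, gender) pairs and return (train, devtest, test).

    The seed is an illustrative choice that keeps the split reproducible,
    so that successive feature sets are compared on the same dev-test names.
    """
    shuffled = list(labeled_names)
    random.Random(seed).shuffle(shuffled)
    test = shuffled[:n_test]
    devtest = shuffled[n_test:n_test + n_devtest]
    train = shuffled[n_test + n_devtest:]
    return train, devtest, test


def misclassified(classifier, extract, labeled_names):
    """Return (true_label, guess, name) triples for the names the classifier gets wrong."""
    errors = []
    for name, label in labeled_names:
        guess = classifier.classify(extract(name))
        if guess != label:
            errors.append((label, guess, name))
    return sorted(errors)


def evaluate(extract=gender_features, seed=0):
    """Train Naive Bayes on the training split and score it on the dev-test split.

    The test split is returned untouched so it can be scored once, at the end.
    """
    import nltk
    from nltk.corpus import names  # assumes the Names Corpus is installed

    labeled = ([(n, "male") for n in names.words("male.txt")] +
               [(n, "female") for n in names.words("female.txt")])
    train, devtest, test = split_names(labeled, seed=seed)
    classifier = nltk.NaiveBayesClassifier.train(
        [(extract(n), g) for n, g in train])
    devtest_accuracy = nltk.classify.accuracy(
        classifier, [(extract(n), g) for n, g in devtest])
    return classifier, devtest_accuracy, devtest, test
```

Calling `evaluate()` returns the trained classifier, its dev-test accuracy, and the two held-out splits. Passing the classifier and the dev-test split to `misclassified` lists the errors that can guide the next feature to try. Score the test split only after the feature set is frozen.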