Contest Description
Data
The Rotten Tomatoes movie review data set is a corpus of movie reviews used for sentiment analysis, originally collected by Pang and Lee. In their work on sentiment treebanks, Socher et al. used Amazon's Mechanical Turk to create fine-grained labels for all parsed phrases in the corpus. The train and test data sets are tab-separated files with phrases from the Rotten Tomatoes data set. Each sentence has been parsed into many phrases by the Stanford parser. A quick glance at the raw training data set shows us: ...
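As a rough illustration of the tab-separated layout described above, here is a minimal Python sketch that reads such a file and groups phrases by the sentence they were parsed from. The column names (PhraseId, SentenceId, Phrase, Sentiment) and the sample rows are assumptions made up for this example; they are placeholders, not the contents of the actual contest files.

```python
import csv
import io


def read_phrases(stream):
    """Parse a tab-separated stream with a header row into a list of dicts."""
    # QUOTE_NONE keeps quote characters inside phrases as literal text.
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(row) for row in reader]


def group_by_sentence(rows, key="SentenceId"):
    """Collect the phrases belonging to each sentence id."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row["Phrase"])
    return groups


# Hypothetical sample data: column names and rows are invented placeholders.
SAMPLE = (
    "PhraseId\tSentenceId\tPhrase\tSentiment\n"
    "1\t1\ta placeholder sentence about a film\t2\n"
    "2\t1\ta placeholder sentence\t2\n"
    "3\t1\tfilm\t2\n"
    "4\t2\tanother placeholder\t3\n"
)

rows = read_phrases(io.StringIO(SAMPLE))
groups = group_by_sentence(rows)
print(len(rows), "phrases in", len(groups), "sentences")
```

To read a real file instead of the in-memory sample, pass an open file handle (opened with newline="") to read_phrases; the file name would depend on how the contest data was downloaded.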
Read full article from Movie Review Sentiment Analysis with Vowpal Wabbit | MLWave