Getting Started with Kaggle, a Big Data Competition Platform - wepon's Column - Blog Channel - CSDN.NET




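... machine learning, data mining, and related knowledge to build a model, solve the problem, and produce results, which are then submitted. If a submission meets the evaluation criteria and ranks first among all participants, it wins the competition's generous prize money. For more, see: 大数据众包平台 (Big Data Crowdsourcing Platform).

Below I introduce Kaggle with screenshots and text. Start at the Kaggle website. The competition in the left screenshot is a "101" competition and the one on the right is a "Playground" competition. Both are practice competitions and are well suited to beginners. The best way to get started with Kaggle is to complete competitions at the 101 and Playground levels on your own. The second part of this article walks through "Digit Recognition", one of the 101 competitions.

Click through to the "Digit Recognition" competition. It is a practice competition on recognizing the digits 0 to 9. "Competition Details" describes the competition and states the problem participants need to solve. "Get the Data" is the download page: participants use this data to train their models and produce results. The data is usually provided in csv format.

Of the files provided, train.csv contains the training samples and test.csv contains the test samples. Because this is a practice competition, two reference solutions are also provided: knn_benchmark.R and rf_benchmark.R. The former is a kNN program written in R and the latter is a random forest program written in R. Their results are knn_benchmark.csv and rf_benchmark.csv respectively. I covered csv files in detail in my previous article: 【Python】csv模块的使用 ([Python] Using the csv module).

Once you have results, the next step is to submit them under "Make a submission". The submitted file must be in csv format. If you saved your results in result.csv, click "Click or drop submission here" and select result.csv to upload it. The system then measures the accuracy of your submission and ranks it.

Besides "Competition Details", "Get the Data", and "Make a submission", the sidebar entries "Home", "Information", "Forum", and so on also provide information about the competition, including rankings, rules, tutorials...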
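The workflow described above (train on train.csv, predict, save to result.csv) can be sketched in Python with the standard csv module. This is a minimal sketch and not the competition's reference solution. It assumes each training row is a label followed by integer pixel values, with one header row, and that the submission uses an `ImageId,Label` header. The excerpt does not state either layout, so check the competition's data page. The function names `load_labeled_rows`, `knn_predict`, and `write_submission` are hypothetical helpers introduced here for illustration.

```python
import csv
from collections import Counter


def load_labeled_rows(path):
    """Read a CSV whose first column is the label and the rest are pixels.

    Assumption: one header row, then integer values on every row.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        return [(int(row[0]), [int(v) for v in row[1:]]) for row in reader]


def knn_predict(train, pixels, k=3):
    """Classify one sample by majority vote among its k nearest training rows."""
    def sq_dist(a, b):
        # Squared Euclidean distance; the square root is not needed for ranking.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(train, key=lambda item: sq_dist(item[1], pixels))[:k]
    votes = Counter(label for label, _ in nearest)
    return votes.most_common(1)[0][0]


def write_submission(path, predictions):
    """Write predictions to a csv file (assumed header: ImageId,Label)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["ImageId", "Label"])
        for image_id, label in enumerate(predictions, start=1):
            writer.writerow([image_id, label])
```

With the real files, you would load train.csv with `load_labeled_rows`, call `knn_predict` on each row of test.csv, and pass the list of predicted labels to `write_submission("result.csv", ...)`. A pure-Python kNN like this compares every test sample against every training sample, so it is only practical on a small subset of the data.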

Read full article from Getting Started with Kaggle, a Big Data Competition Platform - wepon's Column - Blog Channel - CSDN.NET

