10 questions about big data and data science - Data Science Central




  • Should companies embrace big data? Which ones (start-ups, big companies, tech companies, retail, health care)? And how: using vendors, outsourcing, or hiring employees? How do you measure ROI on big data? Should they use redundant data to consolidate KPIs?
  • What do you consider to be big data? I tend to think of big data as anything 10 times larger (in terms of megabytes per day) than the maximum you are used to. Also, sparse data might not be as big as it looks, but it can still be costly to process. Is there a price per megabyte for big data storage, big data transfers, and big data analytics?
  • How did you become interested in data science?
  • What is the difference between data science, statistics, machine learning, and data engineering? Do you think a hybrid (cross-disciplinary) role would be helpful, either to small companies or to the analytic practitioner, as it opens up more job opportunities?
  • What kind of training do you recommend for future data scientists? Any specific program in mind?
  • How can university professors be more involved in teaching students how to process real, live, big data sets? Should curricula be adapted, outdated material removed, new material introduced?
  • During the first year of my PhD program, I worked part-time for a small high-tech company, in partnership with my stats lab. This was a great experience: I was exposed to the real world and decently paid to do my PhD (in Belgium in 1988). How can such initiatives be encouraged in the US?
  • Besides Hadoop-like and graph database environments, do you see other technologies that would make data plumbing easier for big data?
  • Does it make sense to try to structure unstructured data (using tags, NLP, taxonomies, etc.)?
  • Can you name 5 business activities that would benefit most from big data, and 5 that would benefit least?


