Count number of circle intersections - PrismoSkills



Count number of circle intersections - PrismoSkills

Count number of circle intersections Problem: Given an array of integers where array[i] represents the radius of a circle at point (0,i), find the number of intersections of all the circles formed by the above array. Example: If the given array is [2,1,3], then 3 circles would be formed whose end-points would be: [-2,2], [0,2] and [-1,5]. Solution: An O(n^2) solution can be done very simply using the following properties (assuming R1 >= R2): Two circles do not intersect if their centers are more than R1 + R2 distance apart. Two circles intersect once if their centers are exactly R1 + R2 distance apart. Two circles intersect twice if their centers are between R1 - R2 and R1 + R2 apart. Two circles intersect once if their centers are exactly R1 - R2 distance apart. Two circles do not intersect if their centers are less than R1 - R2 distance apart (one lies inside the other). Using the above properties, compare each circle with the rest of the circles and count the number of intersections.
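
As a rough sketch of that pairwise check (assuming circle i is centered at position i with radius array[i], as in the example, and counting tangent circles as one intersection point):

public class CircleIntersections {
    static int countIntersections(int[] radii) {
        int count = 0;
        for (int i = 0; i < radii.length; i++) {
            for (int j = i + 1; j < radii.length; j++) {
                int d = Math.abs(i - j);                    // distance between centers
                int sum = radii[i] + radii[j];
                int diff = Math.abs(radii[i] - radii[j]);
                if (d > sum || d < diff) continue;          // too far apart, or one inside the other
                else if (d == sum || d == diff) count += 1; // tangent circles: one point
                else count += 2;                            // two intersection points
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countIntersections(new int[]{2, 1, 3}));  // prints 3 for the example
    }
}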

Read full article from Count number of circle intersections - PrismoSkills


Find missing number in 4 billion integers - PrismoSkills



Find missing number in 4 billion integers - PrismoSkills

Find missing number in 4 billion integers Problem: Using 1 GB of memory only, find the missing numbers in 4 billion integers stored in a file. Solution: One integer takes 4 bytes, so 4 billion integers would take: 4 bytes * (4 * 10^9) = 16 GB. Since we have only 1 GB of memory, we cannot get all the integers into memory at once. Even if we could get all numbers into memory, we would need some data-structure like an array where each integer from the 4 billion lot would be kept and then we will scan the array to find holes. The array to store the 4 billion integers need not be of type int[]. It can be of type boolean[] also since all we need is a true/false to know whether an int is present or not. Memory required to store 4 billion booleans = 1 bit * (4 * 10^9) = 4 Gb = 0.5 GB (small b means bit and capital B means bytes). So this seems doable with 1 GB of memory. We have a boolean array as boolean numberPresent[4*1000*1000*1000],
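
A sketch of how the one-bit-per-value idea can be realized (the file name numbers.bin and the raw 4-byte binary format are assumptions): a long[] bit set is used rather than boolean[], both because a single Java array cannot hold 4 billion elements and because 2^32 bits is only 512 MB, well within the 1 GB budget.

import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;

public class MissingNumbers {
    public static void main(String[] args) throws IOException {
        long[] bits = new long[1 << 26];             // 2^26 longs * 64 bits = one bit per 32-bit value
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream("numbers.bin")))) {
            while (true) {
                long v = in.readInt() & 0xFFFFFFFFL; // treat each int as unsigned
                bits[(int) (v >>> 6)] |= 1L << (v & 63);
            }
        } catch (EOFException endOfFile) {
            // finished reading the file
        }
        // Any clear bit marks a missing number; print the first one found.
        for (long v = 0; v < (1L << 32); v++) {
            if ((bits[(int) (v >>> 6)] & (1L << (v & 63))) == 0) {
                System.out.println("missing: " + v);
                break;
            }
        }
    }
}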

Read full article from Find missing number in 4 billion integers - PrismoSkills


Five hidden features that make Android Lollipop worth the upgrade



Five hidden features that make Android Lollipop worth the upgrade

The latest version of Android is a big upgrade. Android 5.0 Lollipop is here and beginning to make its way to more and more devices as manufacturers and carriers slowly send out updates. You can read more about the latest big-time mobile operating system in our full Android Lollipop review , but you'll also want to read on as we dig a little deeper into a handful of the most compelling new features and upgrades that make the fifth major revision of Android the sweetest yet. ART makes everything better That's not Warhol, that's Android RunTime (ART), which is the new run-time environment that completely replaces the old Dalvik VM in Lollipop. Google claims that ART "improves app performance and responsiveness" and is 64-bit compatible. So far, in our testing on a Nexus 9 , ART seems to deliver the goods. Even with Lollipop's focus on animations all the time in Material design, apps run smoothly and with almost no lag,

Read full article from Five hidden features that make Android Lollipop worth the upgrade


algorithm - java indexof(String str) method complexity - Stack Overflow



algorithm - java indexof(String str) method complexity - Stack Overflow

The complexity of Java's implementation of indexOf is O(m*n) where n and m are the length of the search string and pattern respectively.

What you can do to improve the complexity is to use, e.g., the Boyer-Moore algorithm to intelligently skip comparing parts of the string which cannot match the pattern.
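
For illustration, here is a minimal sketch of the Boyer-Moore-Horspool variant (a simplified form of Boyer-Moore that keeps only the bad-character shift table), not Java's own indexOf; it returns the index of the first match, or -1:

public class IndexOfHorspool {
    static int indexOf(String text, String pattern) {
        int n = text.length(), m = pattern.length();
        if (m == 0) return 0;
        int[] shift = new int[Character.MAX_VALUE + 1];
        java.util.Arrays.fill(shift, m);                  // default: skip the whole pattern length
        for (int i = 0; i < m - 1; i++)
            shift[pattern.charAt(i)] = m - 1 - i;         // distance from the end of the pattern
        for (int i = 0; i + m <= n; ) {
            int j = m - 1;
            while (j >= 0 && text.charAt(i + j) == pattern.charAt(j)) j--;
            if (j < 0) return i;                          // full match at position i
            i += shift[text.charAt(i + m - 1)];           // skip based on the last char of the window
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("hello world", "world"));   // 6
    }
}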


Read full article from algorithm - java indexof(String str) method complexity - Stack Overflow


Check pairs divisible by K - PrismoSkills



Check pairs divisible by K - PrismoSkills

Problem: Given an array having an even number of integers, find if the array has N/2 pairs of integers such that the sum of each pair is divisible by a given number k. Example: {12, 30, 20, 22} with k = 6 returns true, since (12, 30) and (20, 22) both sum to 42. A normal approach would be to try all the pairs possible and see if the array holds the above property. However, that would be a very costly approach in terms of efficiency, and an O(n) solution is possible for this problem by pairing up remainders modulo k. Code:

public class CheckPairDivisibility {
    public static void main(String[] args) {
        boolean isPairs = checkPairs(new int[]{12, 30, 20, 22}, 6);
        System.out.println(isPairs);
    }

    static boolean checkPairs(int[] nums, int k) {
        // Debug prints
        System.out.println("k = " + k + ", input = " + java.util.Arrays.toString(nums));

        // For each number in the array, calculate its modulus and update the relevant count
        int[] modulusCounts = new int[k];
        for (int num : nums)
            modulusCounts[num % k]++;

        // Numbers with remainder 0 must pair among themselves
        if (modulusCounts[0] % 2 != 0)
            return false;

        // Remainder r must pair with remainder k - r
        for (int r = 1; r <= k / 2; r++) {
            if (2 * r == k) {
                // remainder k/2 pairs with itself, so its count must be even
                if (modulusCounts[r] % 2 != 0) return false;
            } else if (modulusCounts[r] != modulusCounts[k - r]) {
                return false;
            }
        }
        return true;
    }
}

Read full article from Check pairs divisible by K - PrismoSkills


Splay trees - PrismoSkills



Splay trees - PrismoSkills


Read full article from Splay trees - PrismoSkills


Introduction to a SkipList Data Structure in Java : The Coders Lexicon



Introduction to a SkipList Data Structure in Java : The Coders Lexicon

Hello everyone! The other day I was approached by macosxnerd101, a newly minted moderator at Dream.In.Code , and asked to contribute to a new thread coming up on Java data structures. Now most of the new programmers here know of the basic data structures like binary trees, arrays, heaps or stacks. Many of these topics I knew would be covered by other contributors to the thread. So I decided to introduce you to one that is based on our old friend the linked list, but with a new twist. It is called the SkipList (aka JumpList). We will talk a bit about what it is, how to build one in Java and what the heck they are good for. So sit right back and lets throw around some data structures on this episode of the Programming Underground! To understand what a Skiplist is, we have to understand what a linked list is. For those of you who don't know, a linked list is a collection of objects which are linked together by one "node" pointing to the next one in the chain.
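
Here is a hypothetical minimal Java sketch of a SkipList (not the article's implementation): each node carries an array of forward pointers, one per level, and coin flips decide how tall a newly inserted node is, which is what lets searches skip ahead.

import java.util.Random;

public class SkipListSketch {
    static final int MAX_LEVEL = 16;

    static class Node {
        int value;
        Node[] next;                       // next[i] = successor at level i
        Node(int value, int levels) { this.value = value; this.next = new Node[levels]; }
    }

    final Node head = new Node(Integer.MIN_VALUE, MAX_LEVEL);
    final Random rng = new Random();

    boolean contains(int target) {
        Node node = head;
        for (int level = MAX_LEVEL - 1; level >= 0; level--)
            while (node.next[level] != null && node.next[level].value < target)
                node = node.next[level];   // skip ahead on the higher, sparser levels
        Node candidate = node.next[0];
        return candidate != null && candidate.value == target;
    }

    void insert(int value) {
        Node[] update = new Node[MAX_LEVEL];
        Node node = head;
        for (int level = MAX_LEVEL - 1; level >= 0; level--) {
            while (node.next[level] != null && node.next[level].value < value)
                node = node.next[level];
            update[level] = node;          // last node before the insert point on this level
        }
        int levels = 1;
        while (levels < MAX_LEVEL && rng.nextBoolean()) levels++;  // coin flips pick the height
        Node fresh = new Node(value, levels);
        for (int level = 0; level < levels; level++) {
            fresh.next[level] = update[level].next[level];
            update[level].next[level] = fresh;
        }
    }

    public static void main(String[] args) {
        SkipListSketch list = new SkipListSketch();
        for (int v : new int[]{5, 1, 9, 3}) list.insert(v);
        System.out.println(list.contains(3) + " " + list.contains(4));  // true false
    }
}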

Read full article from Introduction to a SkipList Data Structure in Java : The Coders Lexicon


Stable marriage problem - PrismoSkills



Stable marriage problem - PrismoSkills

Stable marriage problem Puzzle: Given n men and n women and a preference order of marriage from each one of them, how to marry these 2n bachelors such that their marriages are stable. A marriage is stable when both partners in a couple cannot find anyone else (higher in their priority list) available for marriage. In other words, a marriage is stable when every person gets their most desired partner subject to availability. Example: Given men m1 and m2 and women w1 and w2. Preference order of m1 is w1,w2 and that of m2 is also w1,w2. Preference order of w1 is m1,m2 and that of w2 is also m1,m2. Possible marriages among the above are: m1,w1 and m2,w2 m1,w2 and m2,w1 The second system of marriages is less stable than the first one because there exists a combination where both m1 and w1 can find a better spouse as per their choice. While there is no such possibility in the first case, hence it's more stable. Solution: An algorithm to do this is as follows:
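
The classic solution is the Gale-Shapley proposal algorithm; the sketch below follows the textbook version (preference lists are arrays of partner indices, best first) and may differ in detail from the article's own steps.

import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class StableMarriage {
    // menPref[m] and womenPref[w] list partner indices, most preferred first.
    static int[] match(int[][] menPref, int[][] womenPref) {
        int n = menPref.length;
        int[] wifeOf = new int[n];                 // wifeOf[m] = woman matched to man m
        int[] husbandOf = new int[n];              // husbandOf[w] = man matched to woman w
        Arrays.fill(wifeOf, -1);
        Arrays.fill(husbandOf, -1);
        int[][] rank = new int[n][n];              // rank[w][m]: how woman w ranks man m (lower = better)
        for (int w = 0; w < n; w++)
            for (int i = 0; i < n; i++)
                rank[w][womenPref[w][i]] = i;
        int[] next = new int[n];                   // index of the next woman each man will propose to
        Deque<Integer> freeMen = new ArrayDeque<>();
        for (int m = 0; m < n; m++) freeMen.add(m);
        while (!freeMen.isEmpty()) {
            int m = freeMen.poll();
            int w = menPref[m][next[m]++];         // propose to the next woman on his list
            if (husbandOf[w] == -1) {              // she is free: engage
                husbandOf[w] = m; wifeOf[m] = w;
            } else if (rank[w][m] < rank[w][husbandOf[w]]) {
                wifeOf[husbandOf[w]] = -1;         // she prefers m: her current partner becomes free
                freeMen.add(husbandOf[w]);
                husbandOf[w] = m; wifeOf[m] = w;
            } else {
                freeMen.add(m);                    // rejected: he stays free
            }
        }
        return wifeOf;
    }

    public static void main(String[] args) {
        // The example from the text: both men prefer w1 then w2, both women prefer m1 then m2.
        int[][] menPref = {{0, 1}, {0, 1}};
        int[][] womenPref = {{0, 1}, {0, 1}};
        System.out.println(Arrays.toString(match(menPref, womenPref)));  // [0, 1]: m1-w1, m2-w2
    }
}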

Read full article from Stable marriage problem - PrismoSkills


Schönhage-Strassen algorithm - Wikipedia, the free encyclopedia



Schönhage–Strassen algorithm - Wikipedia, the free encyclopedia

The Schönhage–Strassen algorithm is based on the Fast Fourier transform (FFT) method of integer multiplication . This figure demonstrates multiplying 1234 × 5678 = 7006652 using the simple FFT method. Number-theoretic transforms in the integers modulo 337 are used, selecting 85 as an 8th root of unity. Base 10 is used in place of base 2^w for illustrative purposes. Schönhage–Strassen improves on this by using negacyclic convolutions. The Schönhage–Strassen algorithm is an asymptotically fast multiplication algorithm for large integers . It was developed by Arnold Schönhage and Volker Strassen in 1971. [1] The run-time bit complexity is, in Big O notation , O(n log n log log n) for two n-digit numbers. The algorithm uses recursive Fast Fourier transforms in rings with 2^(2^n) + 1 elements, a specific type of number theoretic transform . The Schönhage–Strassen algorithm was the asymptotically fastest multiplication method known from 1971 until 2007, when a new method, Fürer's algorithm ,

Read full article from Schönhage–Strassen algorithm - Wikipedia, the free encyclopedia


Rope data structure for efficient string manipulation - PrismoSkills



Rope data structure for efficient string manipulation - PrismoSkills

Rope is a binary-tree-like data structure used for efficient string operations.
Most of the operations like index, split, insert, delete etc. take O(log N) time in a rope, which makes it ideal for text editors.
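
A hypothetical minimal sketch of the indexing idea (not the article's code): each internal node stores the length of its left subtree as a "weight", so charAt can descend in O(log N) for a balanced rope.

public class RopeSketch {
    RopeSketch left, right;
    String piece;          // non-null only for leaf nodes
    int weight;            // length of the left subtree (or of piece, for a leaf)

    static RopeSketch leaf(String s) {
        RopeSketch n = new RopeSketch();
        n.piece = s;
        n.weight = s.length();
        return n;
    }

    static RopeSketch concat(RopeSketch l, RopeSketch r) {
        RopeSketch n = new RopeSketch();
        n.left = l;
        n.right = r;
        n.weight = length(l);              // weight = total length of the left subtree
        return n;
    }

    static int length(RopeSketch n) {
        return n.piece != null ? n.piece.length() : length(n.left) + length(n.right);
    }

    static char charAt(RopeSketch n, int i) {
        if (n.piece != null) return n.piece.charAt(i);
        return i < n.weight ? charAt(n.left, i) : charAt(n.right, i - n.weight);
    }

    public static void main(String[] args) {
        RopeSketch rope = concat(leaf("hello "), leaf("world"));
        System.out.println(charAt(rope, 7));   // 'o'
    }
}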

Read full article from Rope data structure for efficient string manipulation - PrismoSkills


The 20 best Black Friday deals | The Verge



The 20 best Black Friday deals | The Verge

November 26, 2014 November 25, 2014 Warm up your credit cards Black Friday is right around the corner, and there are plenty of great deals to be had. We've been  covering the deals for weeks now , but if you want to cut through the mess and just score the best deals you can find, you've come to the right place. As to be expected, this year there are lots of deals to be had on TVs small and large, 1080p to 4K. You can also get a great price on last year's iPads, which are still better tablets than pretty much anything save for this year's iPads. If you're in the market for a smartwatch or fitness tracker, you can save some money on some really great options this weekend. And if you want to pick up a laptop or new headphones, there are deals to be found too. Keep in mind that the best deals won't last long and many of them are limited to Friday itself (or in rare occasions, Thursday too). To win the Black Friday game, you have to be aggressive and quick,

Read full article from The 20 best Black Friday deals | The Verge


[DAEMON-229] Windows 2003 server 64bit and JDK1.6.0_29 64bit -- Failed creating java - ASF JIRA



[DAEMON-229] Windows 2003 server 64bit and JDK1.6.0_29 64bit -- Failed creating java - ASF JIRA

Give it a couple of hours till the server sync is made and
http://commons.apache.org/daemon/binaries.html
will have a better explanation for the cpu subdir.

In the meantime:
amd64 - AMD64/EM64T
ia64 - Intel Itanium 64

FYI, ia64 is the standard naming for the Itanium processor.
amd64 was used for what we now know as x86_64 (or just x64).
However we will stick with amd64 for the 1.0.x branch because
many users depend on the naming scheme.


Read full article from [DAEMON-229] Windows 2003 server 64bit and JDK1.6.0_29 64bit -- Failed creating java - ASF JIRA


Build windows service from java application with procrun | Jörg Lenhard's Blog



Build windows service from java application with procrun | Jörg Lenhard's Blog

Recently, I needed to adjust a Java application to be able to run as a Windows service, which is the Windows equivalent to a Unix daemon. There are several libraries available for this kind of task and after some reading, I chose Apache procrun which turned out to be a good choice. The configuration was less painful than expected and the tool seems quite powerful, due to a large set of configuration options. Also, building a Windows service or a Unix daemon should not make much of a difference. I used a 64-bit Windows 7 and a 32-bit Java 7 JVM for this tutorial. Everything you need to know about procrun really is on the page that is linked above. There even is a basic tutorial. However, the documentation is not overly verbose and it took me some time to get everything up and running, so I think a more verbose basic tutorial will not hurt. To begin with, you need a Java application. You can build a service from more or less any application and here is a simple class, SomeService package com.
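
As an illustration of the kind of class the tutorial builds (assumptions: procrun is run in jvm mode and pointed at static start/stop methods via --StartMethod/--StopMethod, and the class name SomeService comes from the post), a minimal service class might look like this:

public class SomeService {
    private static volatile boolean running = false;

    public static void start(String[] args) throws InterruptedException {
        running = true;
        while (running) {                     // the service's main loop
            System.out.println("service is alive");
            Thread.sleep(5000);
        }
    }

    public static void stop(String[] args) {
        running = false;                      // causes the start loop to exit
    }

    // For local testing outside procrun.
    public static void main(String[] args) throws InterruptedException {
        start(args);
    }
}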

Read full article from Build windows service from java application with procrun | Jörg Lenhard's Blog


Apache Spark RDD API Examples



Apache Spark RDD API Examples

La Trobe University Bundoora, Victoria 3086 Email: z.he@latrobe.edu.au   Authors of examples: Matthias Langer and Zhen He Email addresses: m.langer@latrobe.edu.au, z.he@latrobe.edu.au These examples have only been tested for Spark version 1.1. We assume the functionality of Spark is stable and therefore the examples should be valid for later releases. The RDD API By Example RDD is short for Resilient Distributed Dataset. RDDs are the workhorse of the Spark system. As a user, one can consider an RDD as a handle for a collection of individual data partitions, which are the result of some computation. However, an RDD is actually more than that. On cluster installations, separate data partitions can be on separate nodes. Using the RDD as a handle one can access all partitions and perform computations and transformations using the contained data. Whenever a part of an RDD or an entire RDD is lost, the system is able to reconstruct the data of lost partitions by using lineage information.

Read full article from Apache Spark RDD API Examples


Scala Tutorial - Maps, Sets, groupBy, Options, flatten, flatMap | Java Code Geeks



Scala Tutorial – Maps, Sets, groupBy, Options, flatten, flatMap | Java Code Geeks

November 26, 2014 4:11 pm Scala Tutorial – Maps, Sets, groupBy, Options, flatten, flatMap Preface This is part 7 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on the links page of the Computational Linguistics course I’m creating these for. Additionally you can find this and other tutorial series on the JCG Java Tutorials page. Lists (and other sequence data structures, like Ranges and Arrays) allow you to group collections of objects in an ordered manner: you can access elements of a list by indexing their position in the list, or iterate over the list elements, one by one, using for expressions and sequence functions like map, filter, reduce and fold. Another important kind of data structure is the associative array, which you’ll come to know in Scala as a Map. (Yes, this has the unfortunate ambiguity with the map function, but their use will be quite clear from context.)

Read full article from Scala Tutorial – Maps, Sets, groupBy, Options, flatten, flatMap | Java Code Geeks


Spark the fastest open source engine for sorting a petabyte - Databricks



Spark the fastest open source engine for sorting a petabyte – Databricks

October 10, 2014 | by Reynold Xin Update November 5, 2014: Our benchmark entry has been reviewed by the benchmark committee and Spark has won the Daytona GraySort contest for 2014! Please see this new blog post for update . Apache Spark has seen phenomenal adoption, being widely slated as the successor to Hadoop MapReduce, and being deployed in clusters from a handful to thousands of nodes. While it was clear to everybody that Spark is more efficient than MapReduce for data that fits in memory, we heard that some organizations were having trouble pushing it to large scale datasets that could not fit in memory. Therefore, since the inception of Databricks, we have devoted much effort, together with the Spark community, to improve the stability, scalability, and performance of Spark. Spark works well for gigabytes or terabytes of data, and it should also work well for petabytes. To evaluate these improvements, we decided to participate in the Sort Benchmark .

Read full article from Spark the fastest open source engine for sorting a petabyte – Databricks


java - Producing a sorted wordcount with Spark - Code Review Stack Exchange



java - Producing a sorted wordcount with Spark - Code Review Stack Exchange

My method using Java 8

As addendum I'll show how I would identify your problem in question and show you how I would do it.

Input: An input file, consisting of words. Output: A list of the words sorted by frequency in which they occur.

Map<String, Long> occurenceMap = Files.readAllLines(Paths.get("myFile.txt"))
        .stream()
        .flatMap(line -> Arrays.stream(line.split(" ")))
        .collect(Collectors.groupingBy(i -> i, Collectors.counting()));

List<String> sortedWords = occurenceMap.entrySet()
        .stream()
        .sorted(Comparator.comparing((Map.Entry<String, Long> entry) -> entry.getValue()).reversed())
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());

This will do the following steps:

  1. Read all lines into a List<String> (care with large files!)
  2. Turn it into a Stream<String>.
  3. Turn that into a Stream<String> by flat mapping every String to a Stream<String> splitting on the blanks.
  4. Collect all elements into a Map<String, Long> grouping by the identity (i -> i) and using as downstream Collectors.counting() such that the map-value will be its count.
  5. Get a Set<Map.Entry<String, Long>> from the map.
  6. Turn it into a Stream<Map.Entry<String, Long>>.
  7. Sort by the reverse order of the value of the entry.
  8. Map the results to a Stream<String>, you lose the frequency information here.
  9. Collect the stream into a List<String>.

Beware that the line .sorted(Comparator.comparing((Map.Entry<String, Long> entry) -> entry.getValue()).reversed()) should really be .sorted(Comparator.comparing(Map.Entry::getValue).reversed()), but type inference is having issues with that and for some reason it will not compile.

I hope the Java 8 way can give you interesting insights.


Read full article from java - Producing a sorted wordcount with Spark - Code Review Stack Exchange


Parallel Programming With Spark



Parallel Programming With Spark
Improves efficiency through:
» In-memory computing primitives
» General computation graphs

Spark originally written in Scala, which allows
concise function syntax and interactive use
Recently added Java API for standalone apps

Concept: resilient distributed datasets (RDDs)
» Immutable collections of objects spread across a cluster
» Built through parallel transformations (map, filter, etc)
» Automatically rebuilt on failure
» Controllable persistence (e.g. caching in RAM) for reuse

Main Primitives
Resilient distributed datasets (RDDs)
» Immutable, partitioned collections of objects
Transformations (e.g. map, filter, groupBy, join)
» Lazy operations to build RDDs from other RDDs
Actions (e.g. count, collect, save)
»Return a result or write it to storage

RDD Fault Tolerance
RDDs track the series of transformations used to
build them (their lineage) to recompute lost data

Easiest way: Spark interpreter (spark-shell)
» Modified version of Scala interpreter for cluster use
Runs in local mode on 1 thread by default, but
can control through MASTER environment var:
MASTER=local ./spark-shell # local, 1 thread
MASTER=local[2] ./spark-shell # local, 2 threads
MASTER=host:port ./spark-shell # run on Mesos

First Stop: SparkContext
Main entry point to Spark functionality
Created for you in spark-shell as variable sc

Creating RDDs
// Turn a Scala collection into an RDD
sc.parallelize(List(1, 2, 3))
// Load text file from local FS, HDFS, or S3
sc.textFile("file.txt")
sc.textFile("directory/*.txt")
sc.textFile("hdfs://namenode:9000/path/file")
// Use any existing Hadoop InputFormat
sc.hadoopFile(keyClass, valClass, inputFmt, conf)

val nums = sc.parallelize(List(1, 2, 3))
// Pass each element through a function
val squares = nums.map(x => x*x) // {1, 4, 9}
// Keep elements passing a predicate
val even = squares.filter(_ % 2 == 0) // {4}
// Map each element to zero or more others
nums.flatMap(x => 1 to x) // => {1, 1, 2, 1, 2, 3}

Working with Key-Value Pairs
Spark’s "distributed reduce" transformations
operate on RDDs of key-value pairs
Scala pair syntax:
val pair = (a, b) // sugar for new Tuple2(a, b)
Accessing pair elements:
pair._1 // => a
pair._2 // => b

Some Key-Value Operations
val pets = sc.parallelize( List(("cat", 1), ("dog", 1), ("cat", 2)))
pets.reduceByKey(_ + _) // => {(cat, 3), (dog, 1)}
pets.groupByKey() // => {(cat, Seq(1, 2)), (dog, Seq(1))}
pets.sortByKey() // => {(cat, 1), (cat, 2), (dog, 1)}
reduceByKey also automatically implements
combiners on the map side

val lines = sc.textFile("hamlet.txt")
val counts = lines.flatMap(line => line.split(" "))
 .map(word => (word, 1))
 .reduceByKey(_ + _)

val visits = sc.parallelize(List(
  ("index.html", "1.2.3.4"),
  ("about.html", "3.4.5.6"),
  ("index.html", "1.3.3.1")))
val pageNames = sc.parallelize(List(
  ("index.html", "Home"),
  ("about.html", "About")))
visits.join(pageNames)
// ("index.html", ("1.2.3.4", "Home"))
// ("index.html", ("1.3.3.1", "Home"))
// ("about.html", ("3.4.5.6", "About"))
visits.cogroup(pageNames)
// ("index.html", (Seq("1.2.3.4", "1.3.3.1"), Seq("Home")))
// ("about.html", (Seq("3.4.5.6"), Seq("About")))

Controlling The Number of Reduce Tasks
All the pair RDD operations take an optional
second parameter for number of tasks
words.reduceByKey(_ + _, 5)
words.groupByKey(5)
visits.join(pageViews, 5)
Can also set spark.default.parallelism property

Using Local Variables
Any external variables you use in a closure will
automatically be shipped to the cluster:
val query = Console.readLine()
pages.filter(_.contains(query)).count()
Some caveats:
» Each task gets a new copy (updates aren’t sent back)
» Variable must be Serializable
» Don’t use fields of an outer object (ships all of it!)

sample(): deterministically sample a subset
union(): merge two RDDs
cartesian(): cross product
pipe(): pass through external program

Task Scheduler
Runs general task graphs
Pipelines functions where possible
Cache-aware data reuse and locality
Partitioning-aware to avoid shuffles
Please read full article from Parallel Programming With Spark

Java BlockingQueue Example implementing Producer Consumer Problem



Java BlockingQueue Example implementing Producer Consumer Problem

java.util.concurrent.BlockingQueue is a Queue that supports operations that wait for the queue to become non-empty when retrieving and removing an element, and wait for space to become available in the queue when adding an element. BlockingQueue doesn't accept null values and throws NullPointerException if you try to store a null value in the queue. BlockingQueue implementations are thread-safe. All queuing methods are atomic in nature and use internal locks or other forms of concurrency control. The BlockingQueue interface is part of the Java collections framework and it's primarily used for implementing the producer-consumer problem. We don't need to worry about waiting for space to be available for the producer or an object to be available for the consumer in BlockingQueue, as it's handled by the implementation classes of BlockingQueue. Java provides several BlockingQueue implementations such as ArrayBlockingQueue, LinkedBlockingQueue, PriorityBlockingQueue, SynchronousQueue etc.
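
As an illustration of that pattern, here is a minimal producer-consumer sketch using ArrayBlockingQueue (the bounded capacity of 10 and the poison-pill value -1 are arbitrary choices for this example):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BlockingQueueDemo {
    public static void main(String[] args) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    queue.put(i);                       // blocks if the queue is full
                }
                queue.put(-1);                          // poison pill to stop the consumer
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    int item = queue.take();            // blocks if the queue is empty
                    if (item == -1) break;              // poison pill received
                    System.out.println("consumed " + item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
    }
}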

Read full article from Java BlockingQueue Example implementing Producer Consumer Problem


What is PriorityQueue or priority queue data structure in Java with Example - Tutorial



What is PriorityQueue or priority queue data structure in Java with Example - Tutorial
PriorityQueue is an unbounded Queue implementation in Java, which is based on a priority heap. PriorityQueue allows you to keep elements in a particular order, according to their natural order or a custom order defined by the Comparator interface in Java. The head of the priority queue will always contain the least element with respect to the specified ordering.

One thing to note about PriorityQueue in Java is that its Iterator doesn't guarantee any order; if you want to traverse in an ordered fashion, better to use Arrays.sort(pq.toArray()). PriorityQueue is also not synchronized, which means it cannot be shared safely between multiple threads; instead, its concurrent counterpart PriorityBlockingQueue is thread-safe and should be used in a multithreaded environment. A priority queue provides O(log(n)) time performance for common enqueuing and dequeuing methods e.g. offer(), poll() and add(), but constant time for retrieval methods e.g. peek() and element().


PriorityQueue keeps the least-value element at the head, where ordering is determined using the compareTo() method. It doesn't keep all elements in that order though; only the head being the least-value element is guaranteed. This is in fact the main difference between TreeSet and PriorityQueue in Java: the former keeps all elements in a particular sorted order, while a priority queue only keeps the head in sorted order. Another important point to note is that PriorityQueue doesn't permit null elements and trying to add null elements will result in NullPointerException.
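
A small sketch illustrating these points: the head is always the least element for the given ordering (natural or Comparator-based), while iteration order is not guaranteed.

import java.util.Comparator;
import java.util.PriorityQueue;

public class PriorityQueueDemo {
    public static void main(String[] args) {
        PriorityQueue<Integer> natural = new PriorityQueue<>();
        natural.add(42);
        natural.add(7);
        natural.add(19);
        System.out.println(natural.peek());   // 7 - least element by natural order

        // Custom order via Comparator: longest string first
        PriorityQueue<String> byLength =
                new PriorityQueue<>(Comparator.comparingInt(String::length).reversed());
        byLength.add("pear");
        byLength.add("fig");
        byLength.add("banana");
        System.out.println(byLength.poll());  // "banana" - longest string
        System.out.println(byLength.poll());  // "pear"
    }
}
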
Read full article from What is PriorityQueue or priority queue data structure in Java with Example - Tutorial

How to handle a job interview by video | LinkedIn



How to handle a job interview by video | LinkedIn

October 03, 2014 Q: Help! I've been called to a first-round interview – but it's a video interview. I have to sit in front of my webcam, press the button, and start answering questions that will be served up as written text on my screen. I'd really like to work for this company, but I can't get my head around this format at all. I'm not accustomed to talking to a screen like that, unless there's somebody talking back to me. Have you any tips? (LK, email). A: I believe this is going to become an even more common approach in the years ahead. For companies, video interviewing – or early-stage screening as some call it – offers all sorts of advantages, namely: It allows them to interview a huge number of applicants – just formulate the questions, create the video link, and send out to all applicants. The fact that they can do this without having to commit their own people to a few days of interviewing is a real bonus – in other words, it saves cost; At early-screening stage,

Read full article from How to handle a job interview by video | LinkedIn


10 Tips For Not Screwing Up Your Video Interview - Business Insider



10 Tips For Not Screwing Up Your Video Interview - Business Insider

If you haven't encountered the video interview yet, just wait. In order to better manage the time of Recruiters and Hiring Managers and to save travel expenses, many  companies  are turning to them. There are two basic types of  video interview . Live interviews, where you talk to the interviewer from your video device, were the first wave. While they are still used, their use is declining. Taped interviews, where you respond to prompts, either written or in an application, are becoming the norm. They allow recruiters and hiring managers to evaluate you at their leisure. In either case, people are just becoming truly competent in the use of video for this purpose. Over 50 years ago, John F. Kennedy handily defeated Richard Nixon in the first televised presidential debates. Kennedy's team knew that there were certain colors (a blue shirt) that presented well on TV. Nixon's crew thought that it was just another debate.

Read full article from 10 Tips For Not Screwing Up Your Video Interview - Business Insider


A simple machine learning app with Spark - Chapeau



A simple machine learning app with Spark - Chapeau

A simple machine learning app with Spark I'm currently on my way back from the first-ever Spark Summit , where I presented a talk on some of my work with the Fedora Big Data SIG to package Apache Spark and its infrastructure for Fedora. ( My slides are online, but they aren't particularly useful without the talk. I'll post a link to the video when it's available, though.) If you're interested in learning more about Spark, a great place to start is the guided exercises that the Spark team put together; simply follow their instructions to fire up an EC2 cluster with Spark installed and then work through the exercises. In one of the exercises, you'll have an opportunity to build up one of the classic Spark demos: distributed k-means clustering in about a page of code. Implementing k-means on resilient distributed datasets is an excellent introduction to key Spark concepts and idioms. With recent releases of Spark, though, machine learning can be simpler still:

Read full article from A simple machine learning app with Spark - Chapeau


Inc. 5000 list's Silicon Valley companies include Superfish, Crescendo, SmartZip - Silicon Valley Business Journal



Inc. 5000 list's Silicon Valley companies include Superfish, Crescendo, SmartZip - Silicon Valley Business Journal

No. 4, Superfish, $35.3 million, 26,043%, Palo Alto

No. 83, iCracked, $7.3 million, 4,033%, Redwood City

No. 110, Multicoreware, $7.7 million, 3,322%, Sunnyvale

No. 122, Smashwords, $22 million, 3,019%, Los Gatos

No. 172, Ensighten, $8.3 million, 2,416%, Cupertino

No. 185, Duda, $5.5 million, 2,300%, Palo Alto


Read full article from Inc. 5000 list's Silicon Valley companies include Superfish, Crescendo, SmartZip - Silicon Valley Business Journal


List of High-Tech Employers in Silicon Valley



List of High-Tech Employers in Silicon Valley

Welcome! Please bookmark this site and visit again, there are frequent changes. This page is a service to individuals seeking jobs with high-tech employers in Silicon Valley. Please use the information for this purpose. Information for this list has been gathered from a multitude of public sources. Note to users: I plan to split this growing list into multiple pages in the near future. The current URL will remain, but will become a table of contents page. The List 3dfx Interactive, San Jose: ABB, Santa Clara: Acta Technology, Palo Alto: Advise America, Newark: Affymetrix, Inc., Santa Clara: Airspeak, Morgan Hill: Alteon Websystems, San Jose: AMD, Sunnyvale: Anritsu Company, Morgan Hill: APCON, Santa Clara: Apple Computer, Inc., Cupertino: Aptix Corporation, San Jose: ArrayComm, Inc., San Jose: Arzoo.com, Fremont: Atmel, San Jose: Atmosphere Networks, Campbell: Badger Technology, Inc., Milpitas: beyond.com, Sunnyvale: Bizlink Technology, Fremont: Blaze Software, Mountain View:

Read full article from List of High-Tech Employers in Silicon Valley


Silicon Valley 100 2014: 1-100 - Business Insider



Silicon Valley 100 2014: 1-100 - Business Insider


Read full article from Silicon Valley 100 2014: 1-100 - Business Insider


Major Silicon Valley companies



Major Silicon Valley companies


Read full article from Major Silicon Valley companies


Category:Companies based in Silicon Valley - Wikipedia, the free encyclopedia



Category:Companies based in Silicon Valley - Wikipedia, the free encyclopedia


Read full article from Category:Companies based in Silicon Valley - Wikipedia, the free encyclopedia


Spark SQL: Parquet | InfoObjects



Spark SQL: Parquet | InfoObjects

Apache Parquet as a file format has garnered significant attention recently. Let's say you have a table with 100 columns; most of the time you are going to access 3-10 columns. In a row-oriented format all columns are scanned whether you need them or not. Apache Parquet saves data in a column-oriented fashion, so if you need 3 columns, only the data of those 3 columns gets loaded. Another benefit is that since all data in a given column is of the same datatype (obviously), compression quality is far superior. In this recipe we'll learn how to save a table in Parquet format and then how to load it back. Let's use the Person table we created in the other recipe:

first_name  last_name  gender
Barack      Obama      M
Bill        Clinton    M
Hillary     Clinton    F

Let's load it in Spark SQL:

scala> val hc = new org.apache.spark.sql.hive.HiveContext(sc)
scala> import hc._
scala> case class Person(firstName: String, lastName: String, gender: String)
scala> val person = sc.textFile("person").map(_.split("\t")).map(p => Person(p(0),p(1),

Read full article from Spark SQL: Parquet | InfoObjects


Correlation Coefficients: Find Pearson's Correlation Coefficient



Correlation Coefficients: Find Pearson's Correlation Coefficient

Sample question: Find the value of the correlation coefficient from the following table:

Subject   x    y
1         43   99
2         21   65
3         25   79
4         42   75
5         57   87
6         59   81

Step 1: Make a chart. Use the given data, and add three more columns: xy, x², and y².
Step 2: Multiply x and y together to fill the xy column. For example, row 1 would be 43 × 99 = 4,257.
Step 3: Take the square of the numbers in the x column, and put the result in the x² column.
Step 4: Take the square of the numbers in the y column, and put the result in the y² column. The chart now looks like this:

Subject   x    y    xy     x²     y²
1         43   99   4257   1849   9801
2         21   65   1365   441    4225
3         25   79   1975   625    6241
4         42   75   3150   1764   5625
5         57   87   4959   3249   7569
6         59   81   4779   3481   6561

Step 5:
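
For reference, the same computation in code: a small sketch using the standard summation formula r = (nΣxy − ΣxΣy) / sqrt((nΣx² − (Σx)²)(nΣy² − (Σy)²)); for the table above it comes out to roughly 0.53.

public class PearsonExample {
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sy += y[i];
            sxy += x[i] * y[i];
            sxx += x[i] * x[i];
            syy += y[i] * y[i];
        }
        return (n * sxy - sx * sy)
                / Math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy));
    }

    public static void main(String[] args) {
        double[] x = {43, 21, 25, 42, 57, 59};
        double[] y = {99, 65, 79, 75, 87, 81};
        System.out.println(pearson(x, y));   // ≈ 0.53 for this data
    }
}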

Read full article from Correlation Coefficients: Find Pearson's Correlation Coefficient


Pearson Correlation: Definition and Easy Steps for Use



Pearson Correlation: Definition and Easy Steps for Use

Watch the video on how to find Pearson's Correlation Coefficient, or read below for an explanation of what it is: What is Pearson Correlation? Correlation between sets of data is a measure of how well they are related. The most common measure of correlation in stats is the Pearson Correlation. The full name is the Pearson Product Moment Correlation or PPMC. It shows the linear relationship between two sets of data. In simple terms, it answers the question, Can I draw a line graph to represent the data? Two letters are used to represent the Pearson correlation: Greek letter rho (ρ) for a population and the letter "r" for a sample. The Pearson correlation coefficient can be calculated by hand or on a graphing calculator such as the TI-89. What are the Possible Values for the Pearson Correlation? The results will be between -1 and 1. You will very rarely see 0, -1 or 1. You'll get a number somewhere in between those values. The closer the value of r gets to zero,

Read full article from Pearson Correlation: Definition and Easy Steps for Use


Histogram inScala | Big Data Analytics with Spark



Histogram inScala | Big Data Analytics with Spark


Read full article from Histogram inScala | Big Data Analytics with Spark


Histogram in Spark (1) | Big Data Analytics with Spark



Histogram in Spark (1) | Big Data Analytics with Spark

Spark's DoubleRDDFunctions provide a histogram function for RDD[Double]. However there is no histogram function for RDD[String]. Here is a quick exercise for doing it. We will use an immutable Map in this exercise. Create a dummy RDD[String] and apply the aggregate method to calculate the histogram:

scala> val d = sc.parallelize((1 to 10).map(_ % 3).map("val" + _.toString))
scala> d.aggregate(Map[String,Int]())(
     |   (m,c) => m.updated(c, m.getOrElse(c,0) + 1),
     |   (m,n) => (m /: n){case (map,(k,v)) => map.updated(k, v + map.getOrElse(k,0))}
     | )

The 2nd function of the aggregate method is to merge 2 maps. We can actually define a Scala function:

scala> def mapadd[T](m:Map[T,Int], n:Map[T,Int]) = {
     |   (m /: n){case (map,(k,v)) => map.updated(k, v + map.getOrElse(k,0))}
     | }

It combines the histograms on the different partitions together:

scala> mapadd(Map("a"->1,"b"->2), Map("a"->2,"c"->1))
res3: scala.collection.mutable.Map[String,Int] = Map(b -> 2, a -> 3, c -> 1)

Using mapadd we can rewrite the aggregate step: scala> d.

Read full article from Histogram in Spark (1) | Big Data Analytics with Spark


Statistics With Spark



Statistics With Spark

Josh - 07 Mar 2014 Lately I've been writing a lot of Spark jobs that perform some statistical analysis on datasets. One of the things I didn't realize right away is that RDDs have built-in support for basic statistic functions like mean, variance, sample variance, and standard deviation. These operations are available on RDDs of Double:

import org.apache.spark.SparkContext._ // implicit conversions in here
val myRDD = newRDD().map { _.toDouble }
myRDD.mean
myRDD.sampleVariance // divides by n-1
myRDD.sampleStdDev // divides by n-1

Getting It All At Once: If you're interested in calling multiple stats functions at the same time, it's a better idea to get them all in a single pass. Spark provides the stats method in DoubleRDDFunctions for that; it also provides the total count of the RDD as well. Histograms: Mean and standard deviation are a decent starting point when you're looking at a new dataset;

Read full article from Statistics With Spark


Printing array in Scala - Stack Overflow



Printing array in Scala - Stack Overflow

mkString will convert collections (including Array) element-by-element to string representations.

println(a.mkString(" "))

is probably what you want.


Read full article from Printing array in Scala - Stack Overflow


How to convert a Scala array (sequence) to string with mkString | alvinalexander.com



How to convert a Scala array (sequence) to string with mkString | alvinalexander.com

By Alvin Alexander. Last updated: Jun 9, 2014 Scala collections FAQ: How can I convert a Scala array to a String? (Or, more, accurately, how do I convert any Scala sequence to a String.) A simple way to convert a Scala array to a String is with the mkString method of the Array class . (Although I've written "array", the same technique also works with any Scala sequence, including Array, List, Seq, ArrayBuffer, Vector, and other sequence types.) Here's a quick array to string example using the Scala REPL: scala> val args = Array("Hello", "world", "it's", "me") args: Array[java.lang.String] = Array(Hello, world, it's, me) scala> val string = args.mkString(" ") string: String = Hello world it's me In this first statement: val args = Array("Hello", "world", "it's", "me") I create a Scala array named args, and in this second statement: val string = args.mkString(" ") I create a new String variable named string, separating each String in the array with a space character,

Read full article from How to convert a Scala array (sequence) to string with mkString | alvinalexander.com


Scala - How to convert a String to an Int (Integer) | alvinalexander.com



Scala - How to convert a String to an Int (Integer) | alvinalexander.com

By Alvin Alexander. Last updated: Jun 10, 2014 Scala FAQ: How do I convert a String to Int in Scala? If you need to convert a String to Int in Scala , just use the toInt method which is available on String objects, like this: scala> val i = "1".toInt i: Int = 1 As you can see, I just cast a String (the string "1") to an Int object using the toInt method on the String. However, beware that this can fail just like it does in Java, with a NumberFormatException, like this: scala> val i = "foo".toInt java.lang.NumberFormatException: For input string: "foo" so you'll want to account for that in your code, such as with a try/catch statement. Scala String to Int conversion functions As an example, the following toInt functions account for the exceptions that can be thrown in the String to Int conversion process. This first example shows the "Java" way to write a String to Int conversion function: def toInt(s: String):Int = { try { s.toInt } catch { case e:

Read full article from Scala - How to convert a String to an Int (Integer) | alvinalexander.com


Estimating Financial Risk with Apache Spark | Cloudera Engineering Blog



Estimating Financial Risk with Apache Spark | Cloudera Engineering Blog

Learn how Spark facilitates the calculation of computationally-intensive statistics such as VaR via the Monte Carlo method. Under reasonable circumstances, how much money can you expect to lose? The financial statistic value at risk (VaR) seeks to answer this question. Since its development on Wall Street soon after the stock market crash of 1987, VaR has been widely adopted across the financial services industry. Some organizations report the statistic to satisfy regulations, some use it to better understand the risk characteristics of large portfolios, and others compute it before executing trades to help make informed and immediate decisions. For reasons that we will delve into later, reaching an accurate estimate of VaR can be a computationally expensive process. The most advanced approaches involve Monte Carlo simulations , a class of algorithms that seek to compute quantities through repeated random sampling.
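
To make the "repeated random sampling" idea concrete, here is a toy Monte Carlo VaR sketch (not the article's Spark implementation); the normal-return model and the portfolio/return parameters below are illustrative assumptions only.

import java.util.Arrays;
import java.util.Random;

public class MonteCarloVaR {
    public static void main(String[] args) {
        int trials = 1_000_000;
        double portfolioValue = 1_000_000.0;   // assumed portfolio size
        double mean = 0.0005, stdDev = 0.02;   // assumed daily return parameters
        Random rng = new Random(42);

        double[] losses = new double[trials];
        for (int i = 0; i < trials; i++) {
            double simulatedReturn = mean + stdDev * rng.nextGaussian();
            losses[i] = -portfolioValue * simulatedReturn;   // positive value = loss
        }
        Arrays.sort(losses);
        // 95% VaR: the loss exceeded in only 5% of the simulated trials
        double var95 = losses[(int) (0.95 * trials)];
        System.out.printf("1-day 95%% VaR ≈ %.0f%n", var95);
    }
}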

Read full article from Estimating Financial Risk with Apache Spark | Cloudera Engineering Blog


Spark Shell Examples - Altiscale Docs



Spark Shell Examples – Altiscale Docs

Copy Test Data to HDFS: The following will upload all of our example data to HDFS under your current login username. These include GraphX PageRank's datasets, MLlib decision tree, logistic regression, KMeans, linear regression, SVM, and naive bayes.

pushd `pwd`
cd /opt/spark/

Second, launch the spark-shell command again with the following command:

SPARK_SUBMIT_OPTS="-XX:MaxPermSize=256m" ./bin/spark-shell --master yarn --queue research --driver-class-path $(find /opt/hadoop/share/hadoop/mapreduce/lib/hadoop-lzo-* | head -n 1)

Run the following Scala statements in the Scala REPL shell: SVM, Logistic Regression, Naive Bayes, KMeans, GraphX PageRank, Decision Tree - Classification and Regression/Prediction.

// CLASSIFICATION
import org.apache.spark.SparkContext
import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.tree.configuration.Algo._
import org.apache.spark.mllib.tree.

Read full article from Spark Shell Examples – Altiscale Docs


Understanding Transaction Logs, Soft Commit and Commit in SolrCloud - Lucidworks



Understanding Transaction Logs, Soft Commit and Commit in SolrCloud - Lucidworks

Hard commits, soft commits and transaction logs: As of Solr 4.0, there is a new "soft commit" capability, and a new parameter for hard commits – openSearcher. Currently, there's quite a bit of confusion about the interplay between soft and hard commit actions, and especially what it all means for the transaction log. The stock solrconfig.xml file explains the options, but with the usual documentation-in-example limits; if there were a full explanation of everything, the example file would be about 10 MB and nobody would ever read through the whole thing. This article outlines the consequences of hard and soft commits and the new openSearcher option for hard commits. The release documentation can be found in the Solr Reference Guide; this post is a more leisurely overview of the topic. I persuaded a couple of the committers to give me some details. I'm sure I was told accurate information; any transcription errors are mine!

Read full article from Understanding Transaction Logs, Soft Commit and Commit in SolrCloud - Lucidworks


So, are you ready for 4K smartphone screens? Qualcomm believes you should be and here's why



So, are you ready for 4K smartphone screens? Qualcomm believes you should be and here's why

Posted: , by Paul.K Is 4K way too much for smartphone screens? Most would say so – a large part of the consumer base does not even deem QHD (2560 x 1440) as a necessity, many finding it hard to discern a noticeable image sharpness increase, while techies who love to spend more time on their mobile device often argue that they prefer the now standard 1080p resolution and longer battery life, instead of having their batteries be sucked up by a bucketload of extra pixels. Still, it seems that 4K is coming and it may be a mainstream thing, too, instead of an extravagant feature for a limited amount of phones. Sure, when we first heard about Sharp readying up a 4K phone display panel ,

Read full article from So, are you ready for 4K smartphone screens? Qualcomm believes you should be and here's why


How To Build a Naive Bayes Classifier



How To Build a Naive Bayes Classifier

Feb 09, 2012 Some use-cases for building a classifier: Spam detection, for example you could build your own Akismet API Automatic assignment of categories to a set of items Automatic detection of the primary language (e.g. Google Translate) Sentiment analysis, which in simple terms refers to discovering if an opinion is about love or hate about a certain topic In general you can do a lot better with more specialized techniques, however the Naive Bayes classifier is general-purpose, simple to implement and good-enough for most applications. And while other algorithms give better accuracy, in general I discovered that having better data in combination with an algorithm that you can tweak does give better results for less effort. In this article I'm describing the math behind it. Don't fear the math, as this is simple enough that a high-schooler understands. And even though there are a lot of libraries out there that already do this,
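
As a companion to the article's math, here is a hypothetical multinomial Naive Bayes sketch for text: it scores P(class|doc) proportional to P(class) times the product of P(word|class), using add-one (Laplace) smoothing and log-probabilities to avoid underflow. The tiny spam/ham training data in main is made up for the example.

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TinyNaiveBayes {
    private final Map<String, Integer> docsPerClass = new HashMap<>();
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> wordsPerClass = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    public void train(String label, List<String> words) {
        totalDocs++;
        docsPerClass.merge(label, 1, Integer::sum);
        Map<String, Integer> counts = wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
            wordsPerClass.merge(label, 1, Integer::sum);
            vocabulary.add(w);
        }
    }

    public String classify(List<String> words) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docsPerClass.keySet()) {
            // log prior + sum of log likelihoods with add-one smoothing
            double score = Math.log(docsPerClass.get(label) / (double) totalDocs);
            Map<String, Integer> counts = wordCounts.get(label);
            int total = wordsPerClass.getOrDefault(label, 0);
            for (String w : words)
                score += Math.log((counts.getOrDefault(w, 0) + 1.0) / (total + vocabulary.size()));
            if (score > bestScore) { bestScore = score; best = label; }
        }
        return best;
    }

    public static void main(String[] args) {
        TinyNaiveBayes nb = new TinyNaiveBayes();
        nb.train("spam", Arrays.asList("buy", "cheap", "pills"));
        nb.train("ham", Arrays.asList("meeting", "at", "noon"));
        System.out.println(nb.classify(Arrays.asList("cheap", "pills")));  // spam
    }
}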

Read full article from How To Build a Naive Bayes Classifier


A Programmer's Guide to Data Mining | The Ancient Art of the Numerati



A Programmer's Guide to Data Mining | The Ancient Art of the Numerati

Menu A guide to practical data mining, collective intelligence, and building recommendation systems by  Ron Zacharski . About This Book Before you is a tool for learning basic data mining techniques. Most data mining textbooks focus on providing a theoretical foundation for data mining, and as result, may seem notoriously difficult to understand. Don't get me wrong, the information in those books is extremely important. However, if you are a programmer interested in learning a bit about data mining you might be interested in a beginner's hands-on guide as a first step. That's what this book provides.   Table of Contents This book's contents are freely available as PDF files. When you click on a chapter title below, you will be taken to a webpage for that chapter. The page contains links for a PDF of that chapter and for any sample Python code and data that chapter requires. Please let me know if you see an error in the book, if some part of the book is confusing,

Read full article from A Programmer's Guide to Data Mining | The Ancient Art of the Numerati


K-Means++ | 愈宅屋



K-Means++ | 愈宅屋


Read full article from K-Means++ | 愈宅屋


Improved Seeding For Clustering With K-Means++ | The Data Science Lab



Improved Seeding For Clustering With K-Means++ | The Data Science Lab

Improved Seeding For Clustering With K-Means++ Clustering data into subsets is an important task for many data science applications. At The Data Science Lab we have illustrated how Lloyd's algorithm for k-means clustering works, including snapshots of python code to visualize the iterative clustering steps . One of the issues with the procedure is that this algorithm does not supply information as to which K for the k-means is optimal; that has to be found out by alternative methods, so that we went a step further and coded up the gap statistic to find the proper k for k-means clustering . In combination with the clustering algorithm, the gap statistic allows to estimate the best value for k among those in a given range. An additional problem with the standard k-means procedure still remains though, as shown by the image on the right, where a poor random initialization of the centroids leads to suboptimal clustering:
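
A sketch of the k-means++ seeding rule the post discusses: the first centroid is picked uniformly at random, and each subsequent centroid is drawn with probability proportional to its squared distance from the nearest centroid chosen so far (the example points in main are arbitrary).

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class KMeansPlusPlusSeeding {
    // Pick k initial centroids from points using the k-means++ rule.
    static double[][] chooseSeeds(double[][] points, int k, Random rng) {
        List<double[]> centroids = new ArrayList<>();
        centroids.add(points[rng.nextInt(points.length)]);     // first seed: uniform at random
        while (centroids.size() < k) {
            double[] distSq = new double[points.length];
            double total = 0;
            for (int i = 0; i < points.length; i++) {
                distSq[i] = nearestDistSq(points[i], centroids);
                total += distSq[i];
            }
            double r = rng.nextDouble() * total;                // sample proportional to D(x)^2
            int chosen = 0;
            for (double acc = distSq[0]; acc < r && chosen < points.length - 1; ) {
                chosen++;
                acc += distSq[chosen];
            }
            centroids.add(points[chosen]);
        }
        return centroids.toArray(new double[0][]);
    }

    static double nearestDistSq(double[] p, List<double[]> centroids) {
        double best = Double.MAX_VALUE;
        for (double[] c : centroids) {
            double d = 0;
            for (int j = 0; j < p.length; j++) d += (p[j] - c[j]) * (p[j] - c[j]);
            best = Math.min(best, d);
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 1}, {10, 10}, {10, 11}, {5, 5}};
        for (double[] seed : chooseSeeds(points, 2, new Random(7)))
            System.out.println(Arrays.toString(seed));
    }
}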

Read full article from Improved Seeding For Clustering With K-Means++ | The Data Science Lab


Clustering With K-Means in Python | The Data Science Lab



Clustering With K-Means in Python | The Data Science Lab

Clustering With K-Means in Python A very common task in data analysis is that of grouping a set of objects into subsets such that all elements within a group are more similar among them than they are to the others. The practical applications of such a procedure are many: given a medical image of a group of cells, a clustering algorithm could aid in identifying the centers of the cells ; looking at the GPS data of a user's mobile device, their more frequently visited locations within a certain radius can be revealed; for any set of unlabeled observations , clustering helps establish the existence of some sort of structure that might indicate that the data is separable. Mathematical background The k-means algorithm takes a dataset X of N points as input, together with a parameter K specifying how many clusters to create. The output is a set of K cluster centroids and a labeling of X that assigns each of the points in X to a unique cluster.

Read full article from Clustering With K-Means in Python | The Data Science Lab


K-means++ - Wikipedia, the free encyclopedia



K-means++ - Wikipedia, the free encyclopedia

In data mining , k-means++ [1] [2] is an algorithm for choosing the initial values (or "seeds") for the k-means clustering algorithm. It was proposed in 2007 by David Arthur and Sergei Vassilvitskii, as an approximation algorithm for the NP-hard k-means problem—a way of avoiding the sometimes poor clusterings found by the standard k-means algorithm. It is similar to the first of three seeding methods proposed, in independent work, in 2006 [3] by Rafail Ostrovsky, Yuval Rabani, Leonard Schulman and Chaitanya Swamy. (The distribution of the first seed is different.) However, the k-means algorithm has at least two major theoretic shortcomings: First, it has been shown that the worst case running time of the algorithm is super-polynomial in the input size. [5] Second, the approximation found can be arbitrarily bad with respect to the objective function compared to the optimal clustering.

Read full article from K-means++ - Wikipedia, the free encyclopedia


k-means clustering - Wikipedia, the free encyclopedia



k-means clustering - Wikipedia, the free encyclopedia

The problem is computationally difficult ( NP-hard ); however, there are efficient heuristic algorithms that are commonly employed and converge quickly to a local optimum. These are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions via an iterative refinement approach employed by both algorithms. Additionally, they both use cluster centers to model the data; however, k-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes. Given a set of observations (x_1, x_2, …, x_n), where each observation is a d-dimensional real vector, k-means clustering aims to partition the n observations into k (≤ n) sets S = {S_1, S_2, …, S_k} so as to minimize the within-cluster sum of squares (WCSS). In other words, its objective is to find arg min_S Σ_{i=1}^{k} Σ_{x∈S_i} ‖x − μ_i‖², where μ_i is the mean of the points in S_i. The term "k-means" was first used by James MacQueen in 1967, [1] though the idea goes back to Hugo Steinhaus in 1957.

Read full article from k-means clustering - Wikipedia, the free encyclopedia


Clustering - MLlib - Spark 1.1.0 Documentation



Clustering - MLlib - Spark 1.1.0 Documentation

MLlib supports k-means clustering, one of the most commonly used clustering algorithms, which clusters the data points into a predefined number of clusters. The MLlib implementation includes a parallelized variant of the k-means++ method called kmeans||. The implementation in MLlib has the following parameters:

k is the number of desired clusters.
maxIterations is the maximum number of iterations to run.
initializationMode specifies either random initialization or initialization via k-means||.
runs is the number of times to run the k-means algorithm (k-means is not guaranteed to find a globally optimal solution, and when run multiple times on a given dataset, the algorithm returns the best clustering result).
initializationSteps determines the number of steps in the k-means|| algorithm.
epsilon determines the distance threshold within which we consider k-means to have converged.

Examples (spark-shell): In the following example, after loading and parsing data, we use the import org.apache.

Read full article from Clustering - MLlib - Spark 1.1.0 Documentation


Support Vector Machines (SVM)



Support Vector Machines (SVM)

Support Vector Machines (SVM) Introductory Overview Support Vector Machines are based on the concept of decision planes that define decision boundaries. A decision plane is one that separates between a set of objects having different class memberships. A schematic example is shown in the illustration below. In this example, the objects belong either to class GREEN or RED. The separating line defines a boundary on the right side of which all objects are GREEN and to the left of which all objects are RED. Any new object (white circle) falling to the right is labeled, i.e., classified, as GREEN (or classified as RED should it fall to the left of the separating line). The above is a classic example of a linear classifier, i.e., a classifier that separates a set of objects into their respective groups (GREEN and RED in this case) with a line. Most classification tasks, however, are not that simple, and often more complex structures are needed in order to make an optimal separation, i.e.,

Read full article from Support Vector Machines (SVM)


(12) What is alternating least square method in recommendation system? - Quora



(12) What is alternating least square method in recommendation system? - Quora

In SGD you are repeatedly picking some subset of the loss function to minimize -- one or more cells in the rating matrix -- and adjusting the parameters to drive just those terms toward 0.

In ALS you're minimizing the entire loss function at once, but, only twiddling half the parameters. That's because the optimization has an easy algebraic solution -- if half your parameters are fixed. So you fix half, recompute the other half, and repeat. There is no gradient in the optimization step since each optimization problem is convex and doesn't need an approximate approach. But, each problem you're solving is not the "real" optimization problem -- you fixed half the parameters.

You initialize M with random unit vectors usually. It's in feature space so wouldn't quite make sense to have averages over ratings.

Read full article from (12) What is alternating least square method in recommendation system? - Quora


Apache Spark: Distributed Machine Learning using MLbase - Sigmoid Analytics



Apache Spark: Distributed Machine Learning using MLbase - Sigmoid Analytics

Apache Spark: Distributed Machine Learning using MLbase Implementing and consuming Machine Learning techniques at scale are difficult tasks for ML Developers and End Users. MLbase (www.mlbase.org) is an open-source platform under active development addressing the issues of both groups. MLbase consists of three components—MLlib, MLI and ML Optimizer. MLlib is a low-level distributed ML library written against the Spark, MLI is an API / platform for feature extraction and algorithm development that introduces high-level ML programming abstractions, and ML Optimizer is a layer aiming to simplify ML problems for End Users by automating the tasks of feature and model selection. In this talk we will describe the high-level functionality of each of these layers, and demonstrate its scalability and ease-of-use via real-world examples involving classification, regression, clustering and collaborative filtering. In the course of applying machine-learning against large data sets,

Read full article from Apache Spark: Distributed Machine Learning using MLbase - Sigmoid Analytics


Interop Between Java and Scala - Code Commit



Interop Between Java and Scala - Code Commit

9 Feb 2009 Sometimes, the simplest things are the most difficult to explain. Scala's interoperability with Java is completely unparalleled, even including languages like Groovy which tout their tight integration with the JVM's venerable standard-bearer. However, despite this fact, there is almost no documentation (aside from chapter 29 in Programming in Scala) which shows how this Scala/Java integration works and where it can be used. So while it may not be the most exciting or theoretically interesting topic, I have taken it upon myself to fill the gap.

Classes are Classes: the first piece of knowledge you need about Scala is that Scala classes are real JVM classes. Consider the following snippets, the first in Java:

public class Person { public String getName() { return "Daniel Spiewak"; } }

…and the second in Scala:

class Person { def getName() = "Daniel Spiewak" }

Despite the very different syntax, both of these snippets will produce almost identical bytecode when compiled.
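One way to see this from the Scala side (a sketch I have added, not from the article): plain JVM reflection on the compiled Scala class finds getName exactly as it would on the Java version.

class Person {
  def getName() = "Daniel Spiewak"
}

object InteropCheck {
  def main(args: Array[String]): Unit = {
    // The compiled Scala class is an ordinary JVM class, so java.lang.reflect sees it.
    val methodNames = classOf[Person].getDeclaredMethods.map(_.getName)
    println(methodNames.contains("getName"))   // true
    println(new Person().getName())            // Daniel Spiewak
  }
}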

Read full article from Interop Between Java and Scala - Code Commit


Integrating Scala components in a Java application | akquinet AG - Blog



Integrating Scala components in a Java application | akquinet AG – Blog

Scala is starting to be really popular, and there are many reasons why you might like to use it in your current projects. At akquinet we’re now using Scala inside Java applications to reduce the amount of written code and to benefit from Scala’s flexibility. However, integrating Java and Scala in the same application requires some tricks. Using Java classes in Scala is pretty straightforward; however, using Scala classes in Java is not. Scala has several language features which cannot be directly mapped to Java, for example function types and traits. Here we will describe how these language features are compiled to Java byte code and how to access them from Java afterwards.

Unit and Any: seen from Java, Scala’s Unit and Any correspond roughly to void and Object, respectively.

Scala special characters: Scala’s syntax is more relaxed than Java’s with regard to special characters. Consider for example the following code snippet, where Scala mimics properties: class ScalaPerson(var _name: String) { def name = _name def name_= (value:
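The snippet above is cut off in the excerpt; here is a sketch of the complete property pattern it is heading toward. The setter body and the JVM-level names in the comments are my assumptions (based on scalac's usual name encoding), not text from the post.

class ScalaPerson(var _name: String) {
  def name: String = _name                // compiled to a JVM method name()
  def name_=(value: String): Unit = {     // compiled to a JVM method name_$eq(String)
    _name = value
  }
}

object ScalaPersonDemo {
  def main(args: Array[String]): Unit = {
    val p = new ScalaPerson("Ada")
    p.name = "Grace"       // sugar for p.name_=("Grace")
    println(p.name)        // prints: Grace
  }
}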

Read full article from Integrating Scala components in a Java application | akquinet AG – Blog


typesafehub/sbteclipse



typesafehub/sbteclipse

For sbt 0.13 and up

  • Add sbteclipse to your plugin definition file. You can use either:
    • the global file (for version 0.13 and up) at ~/.sbt/0.13/plugins/plugins.sbt
    • the project-specific file at PROJECT_DIR/project/plugins.sbt
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.5.0")  
  • In sbt use the command eclipse to create Eclipse project files
> eclipse  
  • In Eclipse use the Import Wizard to import Existing Projects into Workspace

Read full article from typesafehub/sbteclipse


How to check sbt version? - Stack Overflow



How to check sbt version? - Stack Overflow

 sbt sbtVersion  

This prints the sbt version used in your current project or, if it is a multi-module project, the version for each module.


Read full article from How to check sbt version? - Stack Overflow


Creating Scala Project With SBT 0.12 | Knoldus



Creating Scala Project With SBT 0.12 | Knoldus

Installing SBT for Windows: create an sbt.bat containing

set SCRIPT_DIR=%~dp0
java -Xmx512M -jar "%SCRIPT_DIR%sbt-launch.jar" %*

Now, put the sbt-launch.jar file in the same directory where you put the sbt.bat file. Put sbt.bat on your path so that you can launch sbt in any directory by typing sbt at the command prompt. After installing SBT, if we type sbt at the command prompt in an empty directory, this is what we are likely to see:

D:\hello>sbt
[info] Set current project to default-30dbd1 (in build file:/D:/hello/)

In order to create the project, execute the following commands in the sbt session:

> set name :="hello"
[info] Reapplying settings...
[info] Set current project to hello (in build file:/D:/hello/)
> set scalaVersion :="2.9.2"
[info] Reapplying settings...
[info] Set current project to hello (in build file:/D:/hello/)
> set version :="1.0"
[info] Reapplying settings...
[info] Set current project to hello (in build file:/D:/hello/)
> session save
[info] Reapplying settings...
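The settings that session save persists are equivalent to keeping a build.sbt file at the project root; a minimal sketch with the article's values (blank lines between settings are required by sbt 0.12):

// build.sbt

name := "hello"

scalaVersion := "2.9.2"

version := "1.0"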

Read full article from Creating Scala Project With SBT 0.12 | Knoldus


Building Java projects with sbt - Xerial Blog



Building Java projects with sbt - Xerial Blog

Mar 24th, 2014 I have recently joined Treasure Data, and found many Java projects that are built with Maven. Maven is the de facto standard for building Java projects. I agree Maven is useful for managing library dependencies, but as a project becomes more and more complex, managing pom.xml turns into a headache for many developers; for example, one of the projects uses a script-generated pom.xml in order to avoid writing too many pom.xml files. To simplify such build configuration, I have started using sbt in Treasure Data. sbt (Simple Build Tool), usually pronounced as es-bee-tee, is a build tool for Scala projects. I found it is also useful to build pure-Java projects, but Java programmers are not generally familiar with sbt and Scala. So, in this post I'm going to introduce the basic structure of sbt, how to configure sbt and its standard usage. If you know Scala, you can enjoy rich features of sbt for customizing build processes. For simply using sbt, however,
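As a sketch of what such a configuration can look like for a pure-Java project (my assumptions, not the post's listing; autoScalaLibrary and crossPaths are the standard sbt keys for dropping the Scala library dependency and the Scala-version suffix on artifacts):

// build.sbt for a pure-Java project (a sketch)

name := "my-java-project"

version := "0.1.0"

// Do not add scala-library as a dependency, and do not append _2.x suffixes to artifact names.
autoScalaLibrary := false

crossPaths := false

// Library dependencies use Maven-style coordinates.
libraryDependencies += "junit" % "junit" % "4.11" % "test"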

Read full article from Building Java projects with sbt - Xerial Blog


Scala: Part 4: Classes & Objects | Rasesh Mori



Scala: Part 4: Classes & Objects | Rasesh Mori
Scala does not have static members but it has singleton objects. A singleton object definition looks similar to class but with the keyword “object”

When a singleton object shares the same name with a class, it is called that class’s companion object. You must define both the class and its companion object in the same source file. The class is called the companion class of the singleton object. A class and its companion object can access each other’s private members.

A singleton object behaves much like a static initializer block in Java: its body is executed the first time the object is accessed.

A singleton object that does not share the same name with a companion
class is called a standalone object. You can use standalone objects for many
purposes, including collecting related utility methods together.
To run a Scala program, you must supply the name of a standalone singleton object with a main method that takes one parameter, an Array[String], and has a result type of Unit. A sketch of both the companion-object and standalone-object patterns follows below.

Read full article from Scala: Part 4: Classes & Objects | Rasesh Mori
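A short sketch of the ideas above (my own example, not the post's): a class and its companion object reaching into each other's private members, plus a standalone object carrying main. Both Counter definitions must live in the same source file.

class Counter private (private val value: Int) {
  // The companion object's private field is visible from inside the companion class.
  def describe: String = Counter.label + ": " + value
}

object Counter {
  private val label = "count"
  // The class's private constructor is visible from inside its companion object.
  def apply(value: Int): Counter = new Counter(value)
}

// A standalone object (no class of the same name) used as the program entry point.
object CounterApp {
  def main(args: Array[String]): Unit = {
    println(Counter(42).describe)   // prints: count: 42
  }
}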

Scala difference between object and class - Stack Overflow



Scala difference between object and class - Stack Overflow

A class is a definition, a description. It defines a type in terms of methods and composition of other types.

An object is a singleton -- an instance of a class which is guaranteed to be unique. For every object in the code, an anonymous class is created, which inherits from whatever classes you declared object to implement. This class cannot be seen from Scala source code -- though you can get at it through reflection.

There is a relationship between object and class. An object is said to be the companion-object of a class if they share the same name. When this happens, each has access to methods of private visibility in the other. These methods are not automatically imported, though. You either have to import them explicitly, or prefix them with the class/object name.

For example:

class X {
  // Prefix to call
  def m(x: Int) = X.f(x)

  // Import and use
  import X._
  def n(x: Int) = f(x)
}

object X {
  def f(x: Int) = x * x
}

Read full article from Scala difference between object and class - Stack Overflow


eclipse - Run class in Scala IDE - Stack Overflow



eclipse - Run class in Scala IDE - Stack Overflow

You need to run an object, not a class, as noted by urban_racoons. So you can run either:

object MyApp {
  def main(args: Array[String]): Unit = {
    println("Hello World")
  }
}

or

object MyApp extends App {
  println("Hello World")
}

Scala cannot run a class because the main method needs to be static, and a static method can only be created behind the scenes by the compiler from a singleton object in Scala. Create the object, and "run as a Scala application" should appear in the "run" context sub-menu as long as you have the Scala perspective open.


Read full article from eclipse - Run class in Scala IDE - Stack Overflow


Learning Scala part three - Executing Scala code



Learning Scala part three – Executing Scala code
Learning Scala part three – Executing Scala code With Scala installed and some tools to work with it, it's about time to look at how we can execute Scala code. This is the third post in the Learning Scala series after all. Sorry to keep you waiting for so long before showing some code. Anyhow, there are three ways to execute Scala code: interactively, as a script, and as a compiled program. In this post we'll take a quick look at each of them and print out the mandatory Hello world message. Using the interactive interpreter: the easiest way to execute a single line of Scala code is to use the interactive interpreter, with which statements and definitions are evaluated on the fly. We start the interactive interpreter using the Scala command line tool "scala", which is located in the bin folder of the folder where Scala is installed. With the proper environment variable in place you can just type "scala" to start it.

In fact, the concept of static methods (or variables or classes) doesn't exist in Scala. In Scala everything is an object. However, Scala has an object construct with which we can declare singletons. In other words, our main method is an instance method on a singleton object that is automatically instantiated for us.
def main(args: Array[String]) : Unit = //Body
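A minimal sketch of the compiled-program route (my example, not the post's): put the object in Hello.scala, compile it with scalac, and run the resulting class with scala.

// Hello.scala
object Hello {
  def main(args: Array[String]): Unit = {
    println("Hello world")
  }
}

// Then, from a shell (assuming scalac and scala are on the PATH):
//   scalac Hello.scala
//   scala Hello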

Read full article from Learning Scala part three – Executing Scala code

Amazon pushing deals to Snapchat



Amazon pushing deals to Snapchat

Talk about a limited-time offer. To kick off its Black Friday specials, Amazon on Thursday announced that it will start posting deal codes to disappearing messaging app Snapchat, which can then be redeemed during checkout on its site. Also Thursday, the online retailer said that its Instagram is now shoppable—a move that competitors Target and Nordstrom made earlier this year. [Photo caption: A worker prepares packages for delivery at an Amazon.com warehouse in Brieselang, Germany.] "Our customers are redefining Black Friday shopping," said Steve Shure, Amazon's vice president of worldwide marketing. "They want to stay home with family, enjoy some turkey and football, and shop the hottest deals." Amazon also said that it will bring back its deals every 10 minutes campaign, which will kick off this Friday and last two more days than in 2013. The retailer will also host three "Deals of the Day" on Thanksgiving and Black Friday, compared with two last year.

Read full article from Amazon pushing deals to Snapchat


4 Minimal Scala Web Frameworks for Web Developers



4 Minimal Scala Web Frameworks for Web Developers

Scala to many is just another language, more or less useless, and never going to bother their path to awesome. To others, Scala is a lovely programming language that provides a scalable environment – hence: Scala – and allows you to have fun along the way. It's worth mentioning that companies like LinkedIn, The Guardian and even Sony use Scala to power their infrastructure. You can see a great example of how versatile Scala is by taking a look at this database migration library, published as an open-source project from Sony. The Scala Migrations library is written in Scala and makes use of the clean Scala language to write easy to understand migrations, which are also written in Scala. Scala Migrations provides a database abstraction layer that allows migrations to target any supported database vendor. Even though its syntax is fairly conventional, Scala is also a full-blown functional language. It has everything you would expect, including first-class functions,

Read full article from 4 Minimal Scala Web Frameworks for Web Developers


Free First-name and Last-name Databases (CSV and SQL) | The Quiet Affiliate



Free First-name and Last-name Databases (CSV and SQL) | The Quiet Affiliate

Matt over on the WF forums had recently posted a SQL database that he had created, composed of first and last names, a great resource to have handy when auto-generating identities, accounts, comment authors, etc. While it was a pretty large list, it still wasn't as complete as the one I've been using over the years (also, it was separated by gender and used one giant table instead of segmenting the first and last names into two). I figured that if I was going to complain about something free, I might as well provide something in return, so here's your chance to download a free first and last name database in both SQL and CSV formats (pick your poison). There are a total of 5494 first-names and 88799 last-names allowing for a never-ending source of randomly generated full names.

Free Name Database Download – SQL:
.sql File | .zip File

Free Name Database Download – CSV:
First Names (5494) – .csv File & Last Names (88799) – .csv File
I also compressed them both for easy download – .zip of both files


Read full article from Free First-name and Last-name Databases (CSV and SQL) | The Quiet Affiliate


frickjack: scala REPL classpath hack



frickjack: scala REPL classpath hack

Saturday, May 04, 2013 scala REPL classpath hack While writing code I often want to take a quick look at the behavior of some class or module, and might wind up writing a short test program just to verify that some method does what I expect or that I'm obeying some DSL's state machine or whatever. One of the great things about Scala (and groovy, javascript, clojure, ...) is its REPL , which allows me to test little blocks of code without the hassle of compiling a test file. Unfortunately, Scala's REPL launch scripts (scala.bat, scala.sh) do not provide command-line options for manipulating the REPL's classpath, so the REPL can't see the jars IVY assembles for my project ( javasimon , joda-time , guava , gson , guice , ... ). Anyway - it's easy to just copy %SCALA_HOME%/bin/scala.bat (or ${SCALA_HOME}/scala if not on Windows - whatever), and modify it, so the REPL runs with the same classpath as your project, and at the same time set the heap size, logging config-file system property,

Read full article from frickjack: scala REPL classpath hack


Why developers should get excited about Java 9 | InfoWorld



Why developers should get excited about Java 9 | InfoWorld

InfoWorld | Aug 22, 2014 With work moving forward on the next edition of standard Java, developers can start looking forward to what they will get with the planned upgrade. Several JEPs (JDK Enhancement Proposals) for Java Development Kit 9 were updated this week, offering the latest perspectives on what to expect with JDK 9, which has been targeted for release in early 2016 and is based on the Java Standard Edition 9 platform. Headlining the release at this juncture is a modular source code system. Oracle has planned a modular Java via Project Jigsaw, which had been planned for JDK 8 but was pushed back; the existing JEP is part of Project Jigsaw. Standard Edition Java becomes more scalable to smaller devices with this technology. "The module system should be powerful enough to modularize the JDK and other large legacy code bases, yet still be approachable by all developers," says Oracle's Mark Reinhold,

Read full article from Why developers should get excited about Java 9 | InfoWorld


Programmers could get REPL in official Java | JavaWorld



Programmers could get REPL in official Java | JavaWorld

Proponents of open source Java are investigating the possibility of formally adding a REPL (Read Evaluate Print Loop) tool to the language. Java advocates are considering REPL as part of Project Kulla, currently under discussion on the openjdk mailing list for open source Java. Featured in Lisp programming, REPL expressions replace entire compilation units; the REPL evaluates them and offers results. With REPL, the overhead of compilation is avoided for looping operations, says Forrester analyst John Rymer. "From a developer perspective, it's nice to be able to interact with the code while it's running in real time without having to recompile/redeploy," analyst Michael Facemire, also of Forrester, says. REPLs already are featured in most dynamic and functional languages, including Scala, says Scala founder Martin Odersky in an email. There have even been REPLs available for Java before, he says.

Read full article from Programmers could get REPL in official Java | JavaWorld

