GitHub repository: https://github.com/apache/spark
Python, Scala, and Java examples: http://spark.apache.org/examples.html
Spark examples written in Java: https://github.com/apache/spark/tree/master/examples/src/main/java/org/apache/spark/examples
For Java programmers, the official website makes things as convenient as possible; everything you need is there.
As the saying goes, no one knows a son better than his father: Apache Spark's features and version guides are all documented at http://spark.apache.org/documentation.html
To understand what Spark can do, keep the following characteristics in mind:
Speed
Workloads can run up to 100 times faster.
Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine.
Ease of use
Write applications quickly in Java, Scala, Python, R, and SQL.
Spark provides more than 80 high-level operators that make it easy to build parallel applications. You can use it interactively from the Scala, Python, R, and SQL shells.
For example, JSON files can be read with automatic schema inference.
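A minimal Java sketch of reading a JSON file with automatic schema inference, followed by one high-level operator (filter). The class name, the sample records, and the local[*] master are assumptions made for illustration, and the spark-sql dependency is assumed to be on the classpath.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class JsonSchemaInference {

    // Writes a small JSON Lines file (hypothetical sample data),
    // then lets Spark infer the column names and types while reading it.
    public static Dataset<Row> readPeople(SparkSession spark) throws Exception {
        Path file = Files.createTempFile("people", ".json");
        Files.write(file, List.of(
            "{\"name\": \"Alice\", \"age\": 30}",
            "{\"name\": \"Bob\", \"age\": 25}"));
        return spark.read().json(file.toString());
    }

    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
            .appName("JsonSchemaInference")
            .master("local[*]") // assumption: run locally, not on a cluster
            .getOrCreate();

        Dataset<Row> people = readPeople(spark);
        people.printSchema();             // shows the inferred schema
        people.filter("age > 26").show(); // a high-level operator on the result

        spark.stop();
    }
}
```

No schema is declared anywhere in the sketch; Spark derives it by scanning the records.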
Generality
Combine SQL, streaming, and complex analytics.
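Mixing SQL with the DataFrame API from Java might look like the following sketch. It covers only the SQL part of the claim, not streaming; the class name, the view name words, and the sample input are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SqlAndDataFrames {

    // Registers an in-memory Dataset as a temporary view and counts words with SQL.
    public static Dataset<Row> countWords(SparkSession spark, List<String> words) {
        Dataset<String> ds = spark.createDataset(words, Encoders.STRING());
        ds.toDF("word").createOrReplaceTempView("words");
        return spark.sql("SELECT word, COUNT(*) AS n FROM words GROUP BY word");
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("SqlAndDataFrames")
            .master("local[*]") // assumption: run locally, not on a cluster
            .getOrCreate();

        Dataset<Row> counts = countWords(spark, Arrays.asList("spark", "java", "spark"));
        // The SQL result is an ordinary DataFrame, so DataFrame operators still apply.
        counts.filter("n > 1").show();

        spark.stop();
    }
}
```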
Runs everywhere
Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud. It can access a variety of data sources.
You can run Spark in its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes. It can access data in HDFS, Alluxio, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.
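Where Spark runs is selected by the master URL. The sketch below, with placeholder hosts and ports, shows the URL forms for the cluster managers listed above; the class name and app name are illustrative only. In practice the master is usually passed to spark-submit via --master rather than hard-coded.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;

public class MasterUrls {

    // Placeholder master URLs; HOST and PORT must be replaced with real values.
    public static final List<String> EXAMPLES = Arrays.asList(
        "local[*]",                 // single JVM, one worker thread per core
        "spark://HOST:7077",        // Spark standalone cluster
        "yarn",                     // Hadoop YARN, located via the Hadoop configuration
        "mesos://HOST:5050",        // Apache Mesos
        "k8s://https://HOST:PORT"); // Kubernetes API server

    // Builds a configuration that targets the given master.
    public static SparkConf confFor(String master) {
        return new SparkConf().setAppName("MasterUrlDemo").setMaster(master);
    }

    public static void main(String[] args) {
        for (String master : EXAMPLES) {
            System.out.println(confFor(master).get("spark.master"));
        }
    }
}
```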
Remember: the official website is always the best teacher, and hearsay is not to be trusted!