Continuing the rapid innovation of the Apache Spark code base, the Spark Streaming API allows enterprises to leverage the full power of the Spark architecture to process real-time workloads.
Built on the foundation of Spark Core, Spark Streaming can consume data from common real-time pipelines such as Apache Kafka, Apache Flume, Amazon Kinesis, and TCP sockets, and run complex algorithms against that data (MLlib predictive models, GraphX algorithms). Results can then be displayed on real-time dashboards or stored in HDFS, as sketched in the example below.
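To make this concrete, the following is a minimal sketch of a Spark Streaming job in Scala that reads text from a TCP socket and counts words in 5-second micro-batches; the host, port, application name, and batch interval are illustrative assumptions, not values from the text.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingWordCount {
  def main(args: Array[String]): Unit = {
    // Two local threads: one to receive data, one to process it
    val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingWordCount")
    val ssc = new StreamingContext(conf, Seconds(5)) // 5-second micro-batches

    // DStream of lines read from a TCP socket (e.g. fed by `nc -lk 9999`)
    val lines = ssc.socketTextStream("localhost", 9999)

    // Split each line into words and count occurrences per batch
    val wordCounts = lines.flatMap(_.split(" "))
                          .map(word => (word, 1))
                          .reduceByKey(_ + _)

    // Print the first counts of each batch; results could instead be
    // written to HDFS with saveAsTextFiles or pushed to a dashboard
    wordCounts.print()

    ssc.start()             // start receiving and processing data
    ssc.awaitTermination()  // wait for the streaming job to terminate
  }
}
```

The same DStream operations apply unchanged when the source is Kafka, Flume, or Kinesis; only the input connector differs.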
Reference:
- Apache Spark Streaming Programming Guide: https://spark.apache.org/docs/2.1.0/streaming-programming-guide.html