
AWS Kinesis Stream Analytics

Stream processing facilitates the collection, transformation and analysis of real-time data, enabling the continuous generation of insights and quick reactions or alerts to emerging situations. The value of such analytical data diminishes over time, so the faster we process and react, the more value we can deliver to our stakeholders.

Amazon Kinesis Analytics is a service for processing real-time data streams. Using simple SQL queries, it can transform a stream on the fly and send the transformed output to downstream destinations such as Amazon QuickSight (visualization and dashboards), Amazon Elasticsearch (distributed document-based analytics engine), DynamoDB (NoSQL data store) and Lambda (further processing via non-SQL techniques and integration with other AWS services).

Architecture

The following diagram illustrates a typical Kinesis Analytics application architecture:

Tumbling

A tumbling window is used for periodic reports: it summarizes data over fixed, non-overlapping time intervals. For example, if we receive thousands of requests per second and want to know how many requests are ingested per minute, we can use a one-minute tumbling window in our SELECT SQL query.
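The per-minute count described here can be illustrated outside of Kinesis. The following is a minimal Python sketch of tumbling-window semantics only, not the Kinesis SQL syntax; the function name `tumbling_counts` and the sample timestamps are hypothetical.

```python
from collections import Counter


def tumbling_counts(timestamps, window_seconds=60):
    """Count events per fixed, non-overlapping window.

    Each timestamp (in seconds) falls into exactly one window,
    identified here by the window's start time.
    """
    counts = Counter()
    for ts in timestamps:
        # Truncate the timestamp down to the start of its window.
        window_start = int(ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical request arrival times, in seconds.
requests = [0, 12, 59, 60, 61, 119, 120, 185]
per_minute = tumbling_counts(requests)
print(per_minute)  # {0: 3, 60: 3, 120: 1, 180: 1}
```

Every request is counted exactly once, which is what makes this window type suitable for periodic reports.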

Sliding

In a sliding window, the window moves continuously with time, and Amazon Kinesis Data Analytics emits an output whenever new records appear on the stream. It produces this output by processing the rows currently in the window. Windows can overlap in this type of processing, so a record can be part of multiple windows and be processed with each of them.
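To make the overlap concrete, here is a small Python sketch of sliding-window behavior. It is a simplified model, assuming records arrive in timestamp order and that one output is emitted per arriving record; `sliding_counts` and the sample data are hypothetical, and this is not how the service is implemented internally.

```python
from collections import deque


def sliding_counts(timestamps, window_seconds=60):
    """Emit (timestamp, count) for every arriving record, where count is
    the number of records in the window_seconds ending at that record."""
    window = deque()
    outputs = []
    for ts in timestamps:  # assumed sorted by arrival time
        window.append(ts)
        # Evict records that have slid out of the window.
        while window[0] < ts - window_seconds:
            window.popleft()
        outputs.append((ts, len(window)))
    return outputs


# Hypothetical record arrival times, in seconds.
arrivals = [0, 20, 50, 70, 100, 200]
emitted = sliding_counts(arrivals)
print(emitted)
# [(0, 1), (20, 2), (50, 3), (70, 3), (100, 3), (200, 1)]
```

The record at second 50 contributes to the outputs emitted at 50, 70 and 100, showing how one record is processed with several overlapping windows.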

Stagger

Stagger windows are a windowing method suited to analyzing groups of data that arrive at inconsistent times. They work well for time-series analytics use cases such as a set of related sales or log records.
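The difference from a tumbling window can be sketched in Python. This simplified model assumes that a window opens for a key when the first record with that key arrives and closes a fixed time after it opened; the function `stagger_counts`, the ticker keys and the timestamps are all hypothetical, and the real service has additional rules not modeled here.

```python
def stagger_counts(events, window_seconds=60):
    """Group (timestamp, key) events into per-key windows.

    A window opens at the first event seen for a key and covers
    window_seconds from that moment. Events are assumed to be
    sorted by timestamp. Returns (key, window_start, count) tuples.
    """
    open_windows = {}  # key -> [window_start, count]
    results = []
    for ts, key in events:
        current = open_windows.get(key)
        if current is not None and ts >= current[0] + window_seconds:
            # The key's window has expired: emit it and start a new one.
            results.append((key, current[0], current[1]))
            current = None
        if current is None:
            current = [ts, 0]
            open_windows[key] = current
        current[1] += 1
    # Emit windows that are still open at the end of the input.
    for key, (start, count) in open_windows.items():
        results.append((key, start, count))
    return sorted(results, key=lambda r: (r[1], r[0]))


# Hypothetical related records arriving at inconsistent times.
events = [(55, "AMZN"), (58, "AMZN"), (62, "AMZN"),
          (70, "MSFT"), (118, "AMZN"), (125, "MSFT")]
windows = stagger_counts(events)
print(windows)
# [('AMZN', 55, 3), ('MSFT', 70, 2), ('AMZN', 118, 1)]
```

With one-minute tumbling windows aligned to the clock, the three AMZN records at seconds 55, 58 and 62 would be split across two windows; here they stay together because the window starts when the first of them arrives.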

Features

  • Amazon Kinesis Data Analytics delivers sub-second processing latencies so you can generate real-time alerts, dashboards, and actionable insights.
  • You get an interactive editor to build SQL queries using streaming data operations like sliding time-window averages. You can also view streaming results and errors using live data to debug or further refine your script interactively.
  • Amazon Kinesis Data Analytics provides an easy-to-use schema editor to discover and edit the structure of the input data. The wizard automatically recognizes standard data formats such as JSON and CSV. It infers the structure of the input data to create a baseline schema, which you can further refine using the schema editor.
  • The interactive SQL editor comes bundled with a collection of stream processing templates that provide baseline SQL code for the most common types of operations, such as aggregation, per-event transformation and filtering, and even machine learning algorithms such as Random_Cut_Forest() to detect anomalies in records.
  • It also supports stream processing in Java via the open source library Apache Flink if SQL is not preferred.