kinesis-sql | Kinesis Connector for Structured Streaming
Community Discussions
QUESTION
I am creating a DataFrame from a Kafka topic using Spark Streaming, and I want to write that DataFrame to a Kinesis stream. I understand there is no official API for this as of now. There are several libraries available on the internet, but unfortunately none of them worked for me. Spark version: 2.2, Scala: 2.11.
I tried building a jar from https://github.com/awslabs/kinesis-kafka-connector, but I am getting errors due to conflicting package names between that jar and the Spark API. Please help.
Here is the code for others: ...

ANSWER
Answered 2019-Jul-09 at 16:05

Kafka Connect is a service to which you can POST your connector specifications (Kinesis in this case), and it then takes care of running the connector. It also supports quite a few transformations while processing the records. Kafka Connect plugins are not intended to be used with Spark applications.
If your use case requires you to do some business logic while processing the records, then you could go with either Spark Streaming or Structured Streaming approach.
If you want to take a Spark-based approach, here are the two options I can think of:

Use Structured Streaming. You could use a Structured Streaming connector for Kinesis; you can find one here. There may be others too, but this is the only stable, open-source connector I am aware of. You can find an example of using Kinesis as a sink here.

Use the Kinesis Producer Library (KPL) or the aws-java-sdk-kinesis library to publish records from your Spark Streaming application. Using KPL is the preferred approach here. You could use mapPartitions, create a Kinesis client per partition, and publish the records using these libraries. There are plenty of examples in the AWS docs for both libraries.
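The second option can be sketched roughly as below. This is a minimal, untested sketch, not the answer's own code: it assumes the AWS SDK v1 (aws-java-sdk-kinesis) and a DStream of (key, value) string pairs from Kafka; the region and stream name are hypothetical placeholders. The answer mentions mapPartitions; since publishing is a pure side effect, foreachPartition is the equivalent choice here.

```scala
import java.nio.ByteBuffer
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder
import com.amazonaws.services.kinesis.model.PutRecordRequest

// Assumes `stream` is a DStream[(String, String)] of (key, value) pairs.
stream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // Create one client per partition, on the executor, so the
    // non-serializable client is never shipped from the driver.
    val client = AmazonKinesisClientBuilder.standard()
      .withRegion("us-east-1")            // hypothetical region
      .build()
    records.foreach { case (key, value) =>
      val request = new PutRecordRequest()
        .withStreamName("my-stream")      // hypothetical stream name
        .withPartitionKey(key)
        .withData(ByteBuffer.wrap(value.getBytes("UTF-8")))
      client.putRecord(request)
    }
    client.shutdown()
  }
}
```

Calling putRecord per record is the simplest form; for throughput you would batch with putRecords, or let KPL handle batching and retries for you.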
QUESTION
We have a Spark Streaming application. The architecture is as follows:

Kinesis to Spark to Kafka.

The Spark application uses qubole/kinesis-sql for Structured Streaming from Kinesis. The data is aggregated and then pushed to Kafka.

Our use case demands a delay of 4 minutes before pushing to Kafka. Windowing is done with a 2-minute window and a 4-minute watermark.
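The Kinesis-to-Spark leg of such a pipeline typically looks roughly like the following. This is a sketch only: the option names follow the qubole/kinesis-sql README as I recall them, and the stream name and endpoint are placeholders, not values from the question.

```scala
// Sketch: reading from Kinesis with the qubole/kinesis-sql connector.
val kinesisDF = spark.readStream
  .format("kinesis")
  .option("streamName", "my-stream")                                  // placeholder
  .option("endpointUrl", "https://kinesis.us-east-1.amazonaws.com")   // placeholder
  .option("startingposition", "TRIM_HORIZON")
  .load()
```

The loaded DataFrame carries the raw record payload as bytes, which is then decoded and aggregated before being written to Kafka.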
...

ANSWER
Answered 2019-Jun-06 at 08:28

Change your output mode from update to append (the default option). The update output mode writes all updated rows to the sink, so whether or not you use a watermark will not matter. However, with the append mode any writes will need to wait until the watermark is crossed, which is exactly what you want:

Append mode uses watermark to drop old aggregation state. But the output of a windowed aggregation is delayed the late threshold specified in withWatermark() as by the modes semantics, rows can be added to the Result Table only once after they are finalized (i.e. after watermark is crossed).
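Putting the advice together, the aggregation and sink stage might look like this sketch. The column names (eventTime, amount), the broker address, topic, and checkpoint path are hypothetical placeholders; the windows and watermark match the question. The Kafka sink expects a string or binary value column, hence the to_json step.

```scala
import org.apache.spark.sql.functions.{col, struct, sum, to_json, window}

// `parsed` is the decoded stream read from Kinesis.
val aggregated = parsed
  .withWatermark("eventTime", "4 minutes")          // late threshold: 4 minutes
  .groupBy(window(col("eventTime"), "2 minutes"))   // 2-minute windows
  .agg(sum(col("amount")).as("total"))

aggregated
  .select(to_json(struct(col("*"))).as("value"))    // Kafka sink needs a `value` column
  .writeStream
  .outputMode("append")   // rows are emitted only once the watermark passes the window end
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092") // placeholder
  .option("topic", "aggregates")                    // placeholder
  .option("checkpointLocation", "/tmp/checkpoints") // placeholder; required by the Kafka sink
  .start()
```

With append mode, a 2-minute window's result reaches Kafka only after the watermark (event time minus 4 minutes) moves past the window's end, which produces the delay the use case asks for.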
Community Discussions, Code Snippets contain sources that include Stack Exchange Network