flink-statefun | Apache Flink Stateful Functions | Serverless library
kandi X-RAY | flink-statefun Summary
Stateful Functions is an API that simplifies building distributed stateful applications with a runtime built for serverless architectures. It brings together the benefits of stateful stream processing - the processing of large datasets with low latency and bounded resource constraints - with a runtime for modeling stateful entities that supports location transparency, concurrency, scaling, and resiliency. It is designed to work with modern architectures, like cloud-native deployments and popular event-driven FaaS platforms such as AWS Lambda and Knative, and to provide out-of-the-box consistent state and messaging while preserving the serverless experience and elasticity of these platforms. Stateful Functions is developed under the umbrella of Apache Flink. This README is meant as a brief walkthrough of the core concepts and of how to set things up to get started with Stateful Functions. For fully detailed documentation, please visit the official docs.
Top functions reviewed by kandi - BETA
- Override this method to send incoming request
- Serialize protobuf message to byte buffers
- Zeroizes a message to a buffer
- Get the headers
- Deserialize bootstrap data
- Copy the contents of the data input to the output view
- Deserialize TaggedBootstrapData
- Creates a copy of the given bootstrap data
- Initialize the buffer
- Create the Reductions
- Starts the job
- Deserialize an AWS region
- Creates a stateful cluster instance
- Returns the properties at a given node as a map
- Entry point
- Drains all completed futures on the operator thread
- Copy object to targetClassLoader
- Gets the long properties at a given position
- Opens the Operator
- Deserialize the credentials
- Returns a hash code for this map
- Loads the services from the classpath
- Initialize state
- Called when a channel is created
- Returns the command line options
- Returns a string representation of the data type information
flink-statefun Key Features
flink-statefun Examples and Code Snippets
Community Discussions
Trending Discussions on flink-statefun
QUESTION
I understand that, in general, event time uses watermarks to make progress in time. In the case of Flink Statefun, which is based more on iteration, this may be a problem. So my question is: if I use delayed messages (https://nightlies.apache.org/flink/flink-statefun-docs-stable/docs/sdk/java/#sending-delayed-messages), does that mean we can only use the processing-time notion in Stateful Functions?
I would like to change to the event-time processing model but am not sure how it will work with Stateful Functions.
...ANSWER
Answered 2022-Feb-03 at 09:06
Stateful Functions (statefun) doesn't support watermarks or event-time processing. But you could implement your own triggering logic based on the timestamps in arriving events.
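For illustration, a minimal sketch of such hand-rolled triggering with the Java SDK (3.x) might look like this; the function name and the assumption that each message carries its event timestamp as a long are mine, not from the answer:

```java
import java.util.concurrent.CompletableFuture;
import org.apache.flink.statefun.sdk.java.Context;
import org.apache.flink.statefun.sdk.java.StatefulFunction;
import org.apache.flink.statefun.sdk.java.TypeName;
import org.apache.flink.statefun.sdk.java.ValueSpec;
import org.apache.flink.statefun.sdk.java.message.Message;

public class EventTimeTrackingFn implements StatefulFunction {

  static final TypeName TYPE = TypeName.typeNameFromString("com.example/event-time-tracking");

  // Highest event timestamp seen so far for this key: a crude per-key "watermark".
  static final ValueSpec<Long> MAX_TS = ValueSpec.named("maxTs").withLongType();

  @Override
  public CompletableFuture<Void> apply(Context context, Message message) {
    long eventTs = message.asLong(); // assumption: the payload is the event timestamp
    long maxTs = context.storage().get(MAX_TS).orElse(Long.MIN_VALUE);
    if (eventTs > maxTs) {
      // "Event time" has advanced: fire any logic that was waiting for time <= eventTs.
      context.storage().set(MAX_TS, eventTs);
    }
    // else: the event is late relative to what we have seen; handle or drop it explicitly.
    return context.done();
  }
}
```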
QUESTION
I have a working Flink job built on the Flink DataStream API. I want to REWRITE the entire job based on Flink Stateful Functions 3.1.
The functions of my current Flink job are:
- Read messages from Kafka
- Each message is a slice of a data packet, e.g. (s for slice):
- s-0, s-1 are for packet 0
- s-4, s-5, s-6 are for packet 1
- The job merges slices into several data packets and then sinks the packets to HBase
- Window functions are applied to deal with out-of-order slice arrival
Currently I already have a Flink Stateful Functions demo running on my k8s cluster. I want to rewrite my entire job on top of Stateful Functions and:
- Save data into MinIO instead of HBase
I have read the docs and got some ideas. My plans are:
- There's no need to deal with Kafka anymore; the Kafka Ingress (https://nightlies.apache.org/flink/flink-statefun-docs-release-3.0/docs/io-module/apache-kafka/) handles it
- Rewrite my job based on the Java SDK. Merging is straightforward, but how about window functions? Maybe I should use persisted state with TTL to mimic window-function behavior
- An egress for MinIO is not in the list of default Flink I/O Connectors, therefore I need to write a custom Flink I/O Connector for MinIO myself, according to https://nightlies.apache.org/flink/flink-statefun-docs-release-3.0/docs/io-module/flink-connectors/
- I want to avoid the Embedded module because it prevents scaling. Auto scaling is the key reason why I want to migrate to Flink Stateful Functions
I don't feel confident with my plan. Is there anything wrong with my understanding/plan?
Are there any best practices I should refer to?
Update: windows were used to assemble results:
- get a slice, inspect its metadata, and know it is the last one of the packet
- also know that the packet should contain 10 slices
- if there are already 10 slices, merge them
- if there are not enough slices yet, wait for some time (e.g. 10 minutes) and then either merge or record a packet error
I want to get rid of windows during the rewrite, but I don't know how.
...ANSWER
Answered 2022-Jan-10 at 19:11
Background: Use KeyedProcessFunctions Rather than Windows to Assemble Related Events
With the DataStream API, windows are not a good building block for assembling together related events. The problem is that windows begin and end at times that are aligned to the clock, rather than being aligned to the events. So even if two related events are only a few milliseconds apart they might be assigned to different windows.
In general, it's more straightforward to implement this sort of use case with keyed process functions, and use timers as needed to deal with missing or late events.
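As a rough sketch of that DataStream-side pattern (slice and packet types simplified to String, and the 10-minute timeout taken from the question's update):

```java
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class PacketAssemblingFn extends KeyedProcessFunction<String, String, String> {

  private transient ListState<String> slices;

  @Override
  public void open(Configuration parameters) {
    slices = getRuntimeContext()
        .getListState(new ListStateDescriptor<>("slices", String.class));
  }

  @Override
  public void processElement(String slice, Context ctx, Collector<String> out) throws Exception {
    slices.add(slice);
    // Arm a per-key timeout relative to this event rather than to window boundaries.
    ctx.timerService().registerProcessingTimeTimer(
        ctx.timerService().currentProcessingTime() + 10 * 60 * 1000L);
    // ... if all slices for this key are present: merge, emit via out.collect(...),
    // clear state, and delete the pending timer ...
  }

  @Override
  public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
    // Timeout fired before the packet completed: emit/record the partial packet.
    slices.clear();
  }
}
```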
Doing this with the Statefun API
You can use the same pattern mentioned above. The function id will play the same role as the key, and you can use a delayed message instead of a timer:
- as each slice arrives, add it to the packet that's being assembled
- if it is the first slice, send a delayed message that will act as a timeout
- when all the slices have arrived, merge them and send the packet
- if the delayed message arrives before the packet is complete, do whatever is appropriate (e.g., go ahead and send the partial packet)
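A compact sketch of this statefun-side pattern with the Java SDK (the 10-slice packet size comes from the question; the function name, String payloads, and the "timeout" marker message are illustrative assumptions):

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import org.apache.flink.statefun.sdk.java.Context;
import org.apache.flink.statefun.sdk.java.StatefulFunction;
import org.apache.flink.statefun.sdk.java.TypeName;
import org.apache.flink.statefun.sdk.java.ValueSpec;
import org.apache.flink.statefun.sdk.java.message.Message;
import org.apache.flink.statefun.sdk.java.message.MessageBuilder;

public class PacketAssemblerFn implements StatefulFunction {

  static final TypeName TYPE = TypeName.typeNameFromString("com.example/packet-assembler");
  static final int EXPECTED_SLICES = 10;

  // The function id acts as the packet id; we only count slices here and elide
  // buffering the slice payloads themselves (another ValueSpec would hold them).
  static final ValueSpec<Integer> SEEN = ValueSpec.named("seen").withIntType();

  @Override
  public CompletableFuture<Void> apply(Context context, Message message) {
    if (message.isUtf8String() && message.asUtf8String().equals("timeout")) {
      // The delayed message fired before the packet completed: flush the partial
      // packet (or record an error), then clear state. A real implementation
      // should also mark timeouts of already-completed packets as stale.
      context.storage().remove(SEEN);
      return context.done();
    }
    int seen = context.storage().get(SEEN).orElse(0) + 1;
    if (seen == 1) {
      // First slice: arm a timeout by sending ourselves a delayed message.
      context.sendAfter(
          Duration.ofMinutes(10),
          MessageBuilder.forAddress(context.self()).withValue("timeout").build());
    }
    if (seen == EXPECTED_SLICES) {
      // All slices arrived: merge them, send the packet onward, clear state.
      context.storage().remove(SEEN);
    } else {
      context.storage().set(SEEN, seen);
    }
    return context.done();
  }
}
```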
QUESTION
I have 2 questions related to the high availability of a StateFun application running on Kubernetes.
Here are details about my setup:
- Using StateFun v3.1.0
- Checkpoints are stored on HDFS (state.checkpoint-storage: filesystem)
- Checkpointing mode is EXACTLY_ONCE
- State backend is rocksdb and incremental checkpointing is enabled
1- I tried both Zookeeper and Kubernetes HA settings; the result is the same (the log below is from a Zookeeper HA env). When I kill the jobmanager pod, minikube starts another pod, and this new pod fails when it tries to load the last checkpoint:
...ANSWER
Answered 2021-Dec-15 at 16:51
In statefun <= 3.2, routers do not have manually specified UIDs. While Flink's internal UID generation is deterministic, the way statefun generates the underlying stream graph may not be in some cases. This is a bug. I've opened a PR to fix this in a backwards-compatible way [1].
QUESTION
Good day to all. I started working recently with Apache Flink Stateful Functions. We are using a Flink reporter to send metrics to InfluxDB (https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/metric_reporters/). Stateful Functions provides a "function" scope with several metrics out of the box (https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2/deployment-and-operations/metrics.html), but it's not enough, and there is a need to add custom metrics and measurements. All the source code seems to be closed to extension, and I'm not able to find the proper way to do this. Please share your experience if you have managed to complete this task.
...ANSWER
Answered 2021-Nov-06 at 07:56
The ability to add user-defined metrics was added to the main branch recently for the embedded-functions SDK. See the JIRA issue.
With that change, you can do something like this:
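The answer's original snippet was not preserved on this page. As a purely hypothetical sketch of the shape such code takes with user-defined metrics in the embedded SDK (the metrics() accessor and counter name here are assumptions; check the linked JIRA issue and your SDK version for the real API):

```java
import org.apache.flink.statefun.sdk.Context;
import org.apache.flink.statefun.sdk.StatefulFunction;

public class InstrumentedFn implements StatefulFunction {

  @Override
  public void invoke(Context context, Object input) {
    // Hypothetical: obtain a user-defined, function-scoped counter and bump it.
    // The exact accessor and signature may differ in your statefun version.
    context.metrics().counter("messages-seen").inc();

    // ... normal message handling ...
  }
}
```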
QUESTION
When deploying Flink Stateful Functions, one needs to specify what the endpoints for the functions are, i.e. what URL does Flink need to hit in order to trigger the execution of a remote function.
The docs state:
...The URL template name may contain template parameters that are filled in based on the function’s specific type. For example, a message sent to message type com.example/greeter will be sent to http://bar.foo.com/greeter.
ANSWER
Answered 2021-Nov-03 at 23:06
The only template value supported at the moment is the function name, i.e. the last value after the last forward slash (/). You can place it wherever you would like in the template, as long as it resolves to a legal URL at the end.
For example, this is also a valid template:
http://{function.name}.prod.svc.example.com
Then, a message addressed to com.example/greeter (in your example, with my new template) would resolve to:
http://greeter.prod.svc.example.com
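For reference, such a template is configured in the endpoint definition of your module.yaml; a sketch in the statefun 3.1 format (the namespace comes from the example above):

```yaml
kind: io.statefun.endpoints.v2/http
spec:
  functions: com.example/*
  urlPathTemplate: http://{function.name}.prod.svc.example.com
```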
If you are missing any other template parameters, feel free to connect with the Flink community over the user mailing list/JIRA. I'm sure they would be happy to learn about new use cases ;-)
QUESTION
I am trying to dive into the new Stateful Functions approach, and I have already tried to create a savepoint manually (https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.1/deployment-and-operations/state-bootstrap.html#creating-a-savepoint).
It works like a charm, but I can't find a way to do it automatically. For example, I have a couple million keys and I need to write them all to the savepoint.
...ANSWER
Answered 2020-Jul-26 at 11:14
Is your question about how to replace the env.fromElements in the example with something that reads from a file, or another data source? Flink's DataSet API, which is what's used here, can read from any HadoopInputFormat. See DataSet Connectors for details.
There are easy-to-use shortcuts for common cases. If you just want to read data from a file using a TextInputFormat, that would look like this:
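The answer's code snippet was not preserved on this page; a minimal sketch of that shortcut with the DataSet API (the file path is a placeholder) might be:

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class BootstrapKeysJob {
  public static void main(String[] args) throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // Reads the file line by line, using a TextInputFormat under the hood.
    DataSet<String> keys = env.readTextFile("hdfs:///path/to/keys.txt");
    // ... map each key to bootstrap data and feed it to the savepoint writer ...
    keys.print();
  }
}
```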
QUESTION
According to this page, we have the ability to set TTL for state when using Flink Statefun v2.1.0.
We also have the ability to bootstrap state, according to this page.
The first question is: the bootstrap documentation does not mention state expiration at all. What is the correct way to bootstrap states that have TTL? Can someone point me to an example?
The second question is: what happens if I set some state to expire 1 day after writing and then bootstrap that state using 6 months' worth of data?
Is the whole bootstrapped state going to expire after literally 1 day?
If so, what can I do to expire 1 day's worth of data after each day passes?
...ANSWER
Answered 2020-Jul-24 at 20:54
Yes, if that data hasn't been modified since it was loaded, it will all be deleted after one day.
To expire one day's worth of data every day: after bootstrapping the state, you could send yourself a delayed message, set to be delivered one day later. When it arrives, delete the oldest data and send another delayed message.
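A sketch of that delayed-message loop with the embedded SDK of that era (the Cleanup marker type is an illustrative assumption, and something must send the first Cleanup message after bootstrapping):

```java
import java.time.Duration;
import org.apache.flink.statefun.sdk.Context;
import org.apache.flink.statefun.sdk.StatefulFunction;

public class ExpiringStateFn implements StatefulFunction {

  /** Marker message this function sends to itself once a day. */
  public static final class Cleanup {}

  @Override
  public void invoke(Context context, Object input) {
    if (input instanceof Cleanup) {
      // Delete the oldest day's worth of data from state here, then re-arm
      // the loop so the next day's data expires one day from now.
      context.sendAfter(Duration.ofDays(1), context.self(), new Cleanup());
      return;
    }
    // ... normal message handling ...
  }
}
```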
QUESTION
I'm trying to egress to Confluent Kafka from Flink Statefun. In the Confluent git repo, in order to check the schema and put data onto a Kafka topic, all we need to do is use the Kafka client's ProducerRecord object with an Avro object.
But in Statefun we need to override the "ProducerRecord serialize" method for the Kafka egress. This causes the following error.
...ANSWER
Answered 2020-Jun-24 at 15:40
Schema registry is not directly supported in this version of Stateful Functions, but a few workarounds are possible:
- Connect to the schema registry yourself from the KafkaEgressSerializer class. In your linked example that would need to be happening here.
- Provide your own instance of a FlinkKafkaProducer that is based on (see AvroDeserializationSchema).
- Manage the schemas outside of Stateful Functions, but serialize your Avro record to bytes. Make sure to remove the schema registry from the properties that are being passed to the KafkaProducer.
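A sketch of the first workaround, connecting to the schema registry from inside a KafkaEgressSerializer (the topic name, registry URL, and GenericRecord payload are assumptions):

```java
import io.confluent.kafka.serializers.KafkaAvroSerializer;
import java.util.Map;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.statefun.sdk.kafka.KafkaEgressSerializer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroWithRegistrySerializer implements KafkaEgressSerializer<GenericRecord> {

  private static final String TOPIC = "my-topic";

  // Lazily created because the serializer instance is serialized and shipped to the cluster.
  private transient KafkaAvroSerializer avroSerializer;

  @Override
  public ProducerRecord<byte[], byte[]> serialize(GenericRecord out) {
    if (avroSerializer == null) {
      avroSerializer = new KafkaAvroSerializer();
      // isKey = false: we serialize record values, registering schemas as we go.
      avroSerializer.configure(
          Map.of("schema.registry.url", "http://schema-registry:8081"), false);
    }
    byte[] value = avroSerializer.serialize(TOPIC, out);
    return new ProducerRecord<>(TOPIC, value);
  }
}
```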
QUESTION
I have a properly working embedded job and I want to deploy additional co-located jobs. These newly added jobs will receive messages from the old job and send them to a Kafka topic.
Code as below:
...ANSWER
Answered 2020-May-04 at 18:12
Responses inline; and FYI, nothing you are asking is specific to co-location. These properties hold for remote modules and for jobs that contain mixed workloads of co-located and remote functions.
Do I have to define ingress for every co-located job? If not how can I make this work?
Yes, every job (remote or colocated) requires at least one ingress. An ingress is a channel that consumes messages from the outside world into a statefun application. Think Kafka or Kinesis. Without an ingress, the job would never do anything because there would be no initial messages to begin the processing.
To each ingress, you will bind 1 or more routers, which take each message from the ingress and forward it to 0 or more functions based on their function types [1].
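For example, a minimal router with the embedded SDK might look like this (the String message type and greeter function type are illustrative):

```java
import org.apache.flink.statefun.sdk.FunctionType;
import org.apache.flink.statefun.sdk.io.Router;

public class GreetRouter implements Router<String> {

  private static final FunctionType GREETER = new FunctionType("com.example", "greeter");

  @Override
  public void route(String message, Downstream<String> downstream) {
    // Forward each ingress record to the greeter instance keyed by the message itself.
    downstream.forward(GREETER, message, message);
  }
}
```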
How can I get co-located jobs to communicate? Is it enough to use the same FunctionType?
Yes, functions simply message each other using their function types.
Are co-located functions communicating over ingress/egress?
No, messages are passed between functions using the Apache Flink runtime which contains a highly optimized network stack. Once a message is pulled from an ingress, it never interacts with that ingress again. If interested, you can read about how Flink's network stack works in some blog posts that the community wrote, but this is not necessary to successfully use statefun in production[2].
[1] https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/io-module/index.html#router
[2] https://flink.apache.org/2019/06/05/flink-network-stack.html
QUESTION
Flink Stateful Functions 2.0 has the ability to make asynchronous calls, for example to an external API: https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/sdk/java.html#completing-async-requests
Function execution is then paused until the call completes with Success, Failure, or Unknown. Unknown is:
The stateful function was restarted, possibly on a different machine, before the CompletableFuture was completed, therefore it is unknown what is the status of the asynchronous operation.
What happens when there is a second call with the same ID to the paused/waiting function?
- Does the callee then wait on the called function's processing of its async result so that this second call executes with a clean, non-shared post-async state?
- Or does the second call execute on a normal schedule, and thus on top of the state that was current as of the async call, and then when the async call completes it continues processing using the state that was updated while the async call was pending?
- Or maybe the call counts as a "restart" of the called function - in which case, what is the order of execution: the "restart" runs and then the async returns with "restart" to execute from the now-updated state, or is this order reversed?
- Or something else?
ANSWER
Answered 2020-Apr-19 at 21:20
Function execution does not pause while an async request is completing. The instance for that id will continue to process messages until the request completes. This means the state can change while the future is running.
Think of your future as an ad-hoc function that you message and that then messages you back when it has a result. Functions can spawn multiple asynchronous requests without issue. Whichever future completes first will be processed first by the function instance, not necessarily the order in which they were spawned.
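To make the mechanics concrete, here is a sketch of registering an async operation and handling its completion with the embedded SDK (the external-call helper and the String types are assumptions):

```java
import java.util.concurrent.CompletableFuture;
import org.apache.flink.statefun.sdk.AsyncOperationResult;
import org.apache.flink.statefun.sdk.Context;
import org.apache.flink.statefun.sdk.StatefulFunction;

public class EnrichingFn implements StatefulFunction {

  @Override
  @SuppressWarnings("unchecked")
  public void invoke(Context context, Object input) {
    if (input instanceof AsyncOperationResult) {
      AsyncOperationResult<String, String> result = (AsyncOperationResult<String, String>) input;
      if (result.successful()) {
        // Delivered like any other message: state may have changed since the
        // request was issued, so re-read it here before acting on result.value().
      } else if (result.unknown()) {
        // A restart happened mid-flight; decide whether to retry result.metadata().
      }
      return;
    }
    // A regular message: kick off the async call without pausing this function.
    String request = (String) input;
    context.registerAsyncOperation(request, callExternalApi(request));
  }

  // Stand-in for a real non-blocking client call (an assumption for this sketch).
  private CompletableFuture<String> callExternalApi(String request) {
    return CompletableFuture.completedFuture("response:" + request);
  }
}
```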
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install flink-statefun
You can use flink-statefun like any standard Java library. Please include the jar files in your classpath. You can also use any IDE, and you can run and debug the flink-statefun component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, please refer to maven.apache.org. For Gradle installation, please refer to gradle.org.
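For example, with Maven you might declare the remote Java SDK as a dependency (the artifact and version shown are illustrative; pick the module and release you actually need):

```xml
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>statefun-sdk-java</artifactId>
  <version>3.1.0</version>
</dependency>
```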