flink-statefun | Apache Flink Stateful Functions | Serverless library

by apache | Java | Version: Current | License: Apache-2.0

kandi X-RAY | flink-statefun Summary

flink-statefun is a Java library typically used in Serverless applications. flink-statefun has no reported bugs or vulnerabilities, has a build file available, has a Permissive License, and has high support. You can download it from GitHub or Maven.

Stateful Functions is an API that simplifies the building of distributed stateful applications with a runtime built for serverless architectures. It brings together the benefits of stateful stream processing (the processing of large datasets with low latency and bounded resource constraints) and a runtime for modeling stateful entities that supports location transparency, concurrency, scaling, and resiliency. It is designed to work with modern architectures, like cloud-native deployments and popular event-driven FaaS platforms like AWS Lambda and KNative, and to provide out-of-the-box consistent state and messaging while preserving the serverless experience and elasticity of these platforms. Stateful Functions is developed under the umbrella of Apache Flink. This README is meant as a brief walkthrough of the core concepts and of how to set things up to get started with Stateful Functions. For fully detailed documentation, please visit the official docs.

            kandi-support Support

              flink-statefun has a highly active ecosystem.
              It has 332 star(s) with 138 fork(s). There are 42 watchers for this library.
              It had no major release in the last 6 months.
              flink-statefun has no issues reported. There are no pull requests.
              It has a positive sentiment in the developer community.
              The latest version of flink-statefun is current.

            kandi-Quality Quality

              flink-statefun has 0 bugs and 0 code smells.

            kandi-Security Security

              flink-statefun has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              flink-statefun code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              flink-statefun is licensed under the Apache-2.0 License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              flink-statefun releases are not available. You will need to build from source code and install.
              Deployable package is available in Maven.
              Build file is available. You can build the component from source.
              Installation instructions are not available. Examples and code snippets are available.
              flink-statefun saves you 16106 person hours of effort in developing the same functionality from scratch.
              It has 39795 lines of code, 3839 functions and 647 files.
              It has medium code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed flink-statefun and discovered the below as its top functions. This is intended to give you an instant insight into flink-statefun implemented functionality, and help decide if they suit your requirements.
            • Override this method to send incoming request
            • Serialize protobuf message to byte buffers
            • Zeroizes a message to a buffer
            • Get the headers
            • Deserialize bootstrap data
            • Copy the contents of the data input to the output view
            • Deserialize TaggedBootstrapData
            • Creates a copy of the given bootstrap data
            • Initialize the buffer
            • Create the Reductions
            • Starts the job
            • Deserialize an AWS region
            • Creates a stateful cluster instance
            • Returns the properties at a given node as a map
            • Entry point
            • Drains all completed futures on the operator thread
            • Copy object to targetClassLoader
            • Gets the long properties at a given position
            • Opens the Operator
            • Deserialize the credentials
            • Returns a hash code for this map
            • Loads the services from the classpath
            • Initialize state
            • Called when a channel is created
            • Returns the command line options
            • Returns a string representation of the data type information

            flink-statefun Key Features

            No Key Features are available at this moment for flink-statefun.

            flink-statefun Examples and Code Snippets

            No Code Snippets are available at this moment for flink-statefun.

            Community Discussions

            QUESTION

            Time characteristic in Stateful functions
            Asked 2022-Feb-03 at 09:06

            I understand in general that event time uses Watermarks to make progress in time. In the case of Flink Statefun which is more based on iteration it may be a problem. So my question is if I use the delayed message (https://nightlies.apache.org/flink/flink-statefun-docs-stable/docs/sdk/java/#sending-delayed-messages), then does it mean we can use only processing time notion in Stateful functions ?

            I would like to change to Event time processing model but not sure how it will work with Stateful functions.

            ...

            ANSWER

            Answered 2022-Feb-03 at 09:06

            Stateful Functions (statefun) doesn't support watermarks or event-time processing. But you could implement your own triggering logic based on the timestamps in arriving events.
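
One way to read "your own triggering logic" is a hand-rolled, per-function watermark. The sketch below is plain Java with no StateFun SDK dependency; `EventTimeTrigger`, `onEvent`, and `allowedLatenessMs` are hypothetical names, and in a real function the buffer would have to live in persisted state. It assumes events carry a millisecond timestamp and releases buffered events once a later event has moved the highest seen timestamp far enough ahead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical helper: derives a per-key "watermark" from event timestamps.
final class EventTimeTrigger {
    private final long allowedLatenessMs;
    private long maxTimestampSeen = Long.MIN_VALUE;
    // Buffered payloads, ordered by event timestamp.
    private final TreeMap<Long, List<String>> buffer = new TreeMap<>();

    EventTimeTrigger(long allowedLatenessMs) {
        this.allowedLatenessMs = allowedLatenessMs;
    }

    // Buffers the event, then returns every buffered payload whose timestamp is
    // at or below (highest timestamp seen - allowed lateness), in timestamp order.
    List<String> onEvent(long timestampMs, String payload) {
        buffer.computeIfAbsent(timestampMs, t -> new ArrayList<>()).add(payload);
        maxTimestampSeen = Math.max(maxTimestampSeen, timestampMs);
        long threshold = maxTimestampSeen - allowedLatenessMs;

        List<String> ready = new ArrayList<>();
        while (!buffer.isEmpty() && buffer.firstKey() <= threshold) {
            ready.addAll(buffer.pollFirstEntry().getValue());
        }
        return ready;
    }
}
```

Because progress depends only on arriving events, a key that stops receiving events never releases its buffer; a delayed message could serve as a processing-time fallback in that case.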

            Source https://stackoverflow.com/questions/70947087

            QUESTION

            Need advice on migrating from Flink DataStream Job to Flink Stateful Functions 3.1
            Asked 2022-Jan-10 at 19:11

            I have a working Flink job built on Flink Data Stream. I want to REWRITE the entire job based on the Flink stateful functions 3.1.

            The functions of my current Flink Job are:
            1. Read message from Kafka
            2. Each message is in format a slice of data packets, e.g.(s for slice):
              • s-0, s-1 are for packet 0
              • s-4, s-5, s-6 are for packet 1
            3. The job merges slices into several data packets and then sink packets to HBase
            4. Window functions are applied to deal with disorder of slice arrival
            My Objectives
            • Currently I already have a Flink Stateful Functions demo running on my k8s. I want to rewrite my entire job on top of Stateful Functions.
            • Save data into MinIO instead of HBase
            My current plan

            I have read the doc and got some ideas. My plans are:

            My Questions

            I don't feel confident with my plan. Is there anything wrong with my understandings/plan?

            Are there any best practices I should refer to?

            Update: windows were used to assemble results
            1. get a slice, inspect its metadata and know it is the last one of the packet
            2. it also knows the packet should contain 10 slices
            3. if there are already 10 slices, merge them
            4. if there are not enough slices yet, wait for some time (e.g. 10 minutes) and then either merge or record packet errors.

            I want to get rid of windows during the rewrite, but I don't know how

            ...

            ANSWER

            Answered 2022-Jan-10 at 19:11

            Background: Use KeyedProcessFunctions Rather than Windows to Assemble Related Events

            With the DataStream API, windows are not a good building block for assembling together related events. The problem is that windows begin and end at times that are aligned to the clock, rather than being aligned to the events. So even if two related events are only a few milliseconds apart they might be assigned to different windows.

            In general, it's more straightforward to implement this sort of use case with keyed process functions, and use timers as needed to deal with missing or late events.

            Doing this with the Statefun API

            You can use the same pattern mentioned above. The function id will play the same role as the key, and you can use a delayed message instead of a timer:

            • as each slice arrives, add it to the packet that's being assembled
            • if it is the first slice, send a delayed message that will act as a timeout
            • when all the slices have arrived, merge them and send the packet
            • if the delayed message arrives before the packet is complete, do whatever is appropriate (e.g., go ahead and send the partial packet)
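
The four steps above can be sketched in plain Java as a small state machine. This is an illustration only, not StateFun SDK code: `PacketAssembler` and its methods are hypothetical names, slices are modelled as strings, and in a real function the fields would be persisted state scoped to the function id (the packet id). The caller is assumed to send itself the delayed timeout message when `isFirstSlice()` returns true.

```java
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical per-packet state; one instance per function id (packet id).
final class PacketAssembler {
    private final int expectedSlices;
    // Slices ordered by index so the merge is deterministic.
    private final TreeMap<Integer, String> slices = new TreeMap<>();
    private boolean emitted = false;

    PacketAssembler(int expectedSlices) {
        this.expectedSlices = expectedSlices;
    }

    // True before any slice has been stored: the moment at which the caller
    // should send itself the delayed "timeout" message.
    boolean isFirstSlice() {
        return slices.isEmpty() && !emitted;
    }

    // Adds a slice; returns the merged packet once all slices have arrived.
    Optional<String> onSlice(int index, String data) {
        if (emitted) {
            return Optional.empty(); // late slice after the packet was sent
        }
        slices.put(index, data);
        return slices.size() == expectedSlices ? Optional.of(merge()) : Optional.empty();
    }

    // Handles the delayed message: emits the partial packet if still incomplete.
    Optional<String> onTimeout() {
        if (emitted || slices.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(merge());
    }

    private String merge() {
        emitted = true;
        String packet = String.join("", slices.values());
        slices.clear();
        return packet;
    }
}
```

Emitting the partial packet on timeout is only one of the options the answer mentions; recording a packet error instead would change only `onTimeout`.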

            Source https://stackoverflow.com/questions/70636370

            QUESTION

            Flink StateFun high availability exception: "java.lang.IllegalStateException: There is no operator for the state ....."
            Asked 2021-Dec-15 at 16:51

            I have 2 questions related to high availability of a StateFun application running on Kubernetes

            Here are details about my setup:

            • Using StateFun v3.1.0
            • Checkpoints are stored on HDFS (state.checkpoint-storage: filesystem)
            • Checkpointing mode is EXACTLY_ONCE
            • State backend is rocksdb and incremental checkpointing is enabled

            1- I tried both Zookeeper and Kubernetes HA settings, result is the same (log below is from a Zookeeper HA env). When I kill the jobmanager pod, minikube starts another pod and this new pod fails when it tries to load last checkpoint:

            ...

            ANSWER

            Answered 2021-Dec-15 at 16:51

            In statefun <= 3.2 routers do not have manually specified UIDs. While Flink's internal UID generation is deterministic, the way statefun generates the underlying stream graph may not be in some cases. This is a bug. I've opened a PR to fix this in a backwards compatible way[1].

            [1] https://github.com/apache/flink-statefun/pull/279

            Source https://stackoverflow.com/questions/70316498

            QUESTION

            Custom metrics in Stateful functions
            Asked 2021-Nov-06 at 07:56

            Good day to all. I recently started working with Apache Flink Stateful Functions. We are using a Flink reporter to put metrics into InfluxDB (https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/metric_reporters/). Stateful Functions provides a "function" scope with several metrics out of the box (https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2/deployment-and-operations/metrics.html), but that is not enough, and we need to add custom metrics and measurements. All the source code seems to be closed to extension, and I am not able to find the proper way to do this. Please share your experience if you have managed to complete this task.

            ...

            ANSWER

            Answered 2021-Nov-06 at 07:56

            The ability to add user defined metrics was added to the main branch recently for the embedded-functions SDK. See JIRA issue.

            With that change, you can do something like this:

            Source https://stackoverflow.com/questions/69845057

            QUESTION

            What templating parameters does Flink Stateful Functions URL Path Template support?
            Asked 2021-Nov-03 at 23:06

            When deploying Flink Stateful Functions, one needs to specify what the endpoints for the functions are, i.e. what URL does Flink need to hit in order to trigger the execution of a remote function.

            The docs state:

            The URL template name may contain template parameters that are filled in based on the function’s specific type. For example, a message sent to message type com.example/greeter will be sent to http://bar.foo.com/greeter.

            ...

            ANSWER

            Answered 2021-Nov-03 at 23:06

            The only template value supported at the moment is the function name, i.e. the value after the last forward slash (/). You can place it wherever you like in the template, as long as it resolves to a legal URL in the end.

            For example, this is also a valid template:

            http://{function.name}.prod.svc.example.com

            Then, a message addressed to com.example/greeter (in your example, with my new template) would resolve to:

            http://greeter.prod.svc.example.com
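
The substitution can be illustrated with a few lines of plain Java. This is a sketch of the behaviour described in the answer, not the StateFun implementation; `UrlTemplate` and `resolve` are hypothetical names.

```java
// Hypothetical helper mirroring the substitution described in the answer.
final class UrlTemplate {
    private static final String PLACEHOLDER = "{function.name}";

    // functionType has the form "<namespace>/<name>", e.g. "com.example/greeter".
    // The name is whatever follows the last forward slash.
    static String resolve(String template, String functionType) {
        String name = functionType.substring(functionType.lastIndexOf('/') + 1);
        return template.replace(PLACEHOLDER, name);
    }
}
```

For example, `UrlTemplate.resolve("http://{function.name}.prod.svc.example.com", "com.example/greeter")` yields `http://greeter.prod.svc.example.com`.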

            If you are missing any other template parameters, feel free to connect with the Flink community over the user mailing list/JIRA. I'm sure they would be happy to learn about new use cases ;-)

            Source https://stackoverflow.com/questions/69620151

            QUESTION

            How to make an automatic savepoint in Flink Stateful Functions application?
            Asked 2020-Sep-21 at 07:03

            I am trying to dive into the new Stateful Functions approach and I already tried to create a savepoint manually (https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.1/deployment-and-operations/state-bootstrap.html#creating-a-savepoint).

            It works like a charm, but I can't find a way to do it automatically. For example, I have a couple of million keys and I need to write them all to a savepoint.

            ...

            ANSWER

            Answered 2020-Jul-26 at 11:14

            Is your question about how to replace the env.fromElements in the example with something that reads from a file, or other data source? Flink's DataSet API, which is what's used here, can read from any HadoopInputFormat. See DataSet Connectors for details.

            There are easy-to-use shortcuts for common cases. If you just want to read data from a file using a TextInputFormat, that would look like this:

            Source https://stackoverflow.com/questions/63095295

            QUESTION

            Flink Statefun Bootstrap and State expiration
            Asked 2020-Jul-24 at 20:54

            According to this page we have the ability to set TTL for state when using Flink Statefun v2.1.0.

            We also have the ability to bootstrap state, according to this page.

            First question is, bootstrap documentation does not mention state expiration at all. What is the correct way to do bootstrapping on states that have TTL? Can someone point me to an example?

            The second question is, what happens if I set some state as expire after writing in 1 day and then bootstrap that state using 6 months worth data?

            Is the whole bootstrapped state going to expire after literally 1 day?

            If so, what can I do to have it expire 1 day worth of data after 1 day passes?

            ...

            ANSWER

            Answered 2020-Jul-24 at 20:54

            Yes, if that data hasn't been modified since it was loaded, it will all be deleted after one day.

            To expire one day's worth of data every day: After bootstrapping the state, you could send yourself a delayed message, set to be delivered one day later. When it arrives, delete the oldest data and send another delayed message.
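
A rough plain-Java sketch of that loop follows. It assumes the bootstrapped data is bucketed by day and that the function re-sends itself a delayed message while data remains; `DailyExpiringState` and its methods are hypothetical names, and the actual delayed-message call is left to the SDK.

```java
import java.time.LocalDate;
import java.util.TreeMap;

// Hypothetical state holder: one value per day, oldest day first.
final class DailyExpiringState {
    private final TreeMap<LocalDate, String> byDay = new TreeMap<>();

    void put(LocalDate day, String value) {
        byDay.put(day, value);
    }

    // Called when the delayed "expire" message arrives: drops the oldest day.
    // Returns true if data remains, i.e. the caller should send itself
    // another delayed message due one day later.
    boolean onExpiryMessage() {
        if (!byDay.isEmpty()) {
            byDay.pollFirstEntry();
        }
        return !byDay.isEmpty();
    }

    int size() {
        return byDay.size();
    }

    // Oldest day still held, or null when the state is empty.
    LocalDate oldestDay() {
        return byDay.isEmpty() ? null : byDay.firstKey();
    }
}
```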

            Source https://stackoverflow.com/questions/63078427

            QUESTION

            Flink statefun and confluent schema registry compatibility
            Asked 2020-Jun-27 at 10:50

            I'm trying to egress to Confluent Kafka from Flink StateFun. According to the Confluent git repo, all we need to do to schema-check and put data into a Kafka topic is use the Kafka client ProducerRecord object with an Avro object.

            But in statefun we need to override the "ProducerRecord serialize" method for the Kafka egress. This causes the following error.

            ...

            ANSWER

            Answered 2020-Jun-24 at 15:40

            Schema registry is not directly supported in this version of Stateful Functions, but a few workarounds are possible:

            1. Connect to the schema registry yourself from the KafkaEgressSerializer class. In your linked example that would need to be happening here.
            2. Provide your own instance of a FlinkKafkaProducer that is based on (see AvroDeserializationSchema)
            3. Manage the schemas outside of Stateful Functions, but serialize your Avro record to bytes. Make sure to remove the schema registry from the properties that are being passed to the KafkaProducer

            Source https://stackoverflow.com/questions/62514013

            QUESTION

            Flink statefun co-located functions communication
            Asked 2020-May-04 at 18:12

            I have a properly working embedded job and I want to deploy additional co-located jobs. These newly added jobs will receive messages from the old job and send it to kafka topic.

            code as below

            ...

            ANSWER

            Answered 2020-May-04 at 18:12

            Responses inline, and FYI nothing you are asking is co-located specific. These properties hold for remote modules and jobs that contain mixed workloads of co-located and remote.

            Do I have to define ingress for every co-located job? If not how can I make this work?

            Yes, every job (remote or colocated) requires at least one ingress. An ingress is a channel that consumes messages from the outside world into a statefun application. Think Kafka or Kinesis. Without an ingress, the job would never do anything because there would be no initial messages to begin the processing.

            To each ingress, you will bind 1 or more routers, which take each message from the ingress and forward them to 0 or more functions based on their function types[1].

            How can I get co-located jobs to communicate? Is it enough to use the same FunctionType?

            Yes, functions simply message each other using their function types.

            Are co-located functions communicating over ingress/egress?

            No, messages are passed between functions using the Apache Flink runtime which contains a highly optimized network stack. Once a message is pulled from an ingress, it never interacts with that ingress again. If interested, you can read about how Flink's network stack works in some blog posts that the community wrote, but this is not necessary to successfully use statefun in production[2].

            [1] https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/io-module/index.html#router

            [2]https://flink.apache.org/2019/06/05/flink-network-stack.html
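
The router role described in the answer can be pictured with a small plain-Java sketch. It does not use the StateFun SDK: `RouterSketch`, `Address`, the `userId:payload` message format, and the `com.example/audit` function type are all assumptions made for illustration. The point is that a router maps each ingress message to zero or more target addresses, each consisting of a function type and an id.

```java
import java.util.ArrayList;
import java.util.List;

final class RouterSketch {
    // Hypothetical stand-in for a function address: a function type plus an id.
    static final class Address {
        final String functionType;
        final String id;

        Address(String functionType, String id) {
            this.functionType = functionType;
            this.id = id;
        }
    }

    // Routes a "userId:payload" message to two function types, keyed by user id.
    // A malformed message is forwarded to zero functions.
    static List<Address> route(String message) {
        List<Address> targets = new ArrayList<>();
        int sep = message.indexOf(':');
        if (sep <= 0) {
            return targets;
        }
        String userId = message.substring(0, sep);
        targets.add(new Address("com.example/greeter", userId));
        targets.add(new Address("com.example/audit", userId));
        return targets;
    }
}
```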

            Source https://stackoverflow.com/questions/61578082

            QUESTION

            Flink Stateful Functions 2.0 Multiple Calls During Asynchronous Wait
            Asked 2020-Apr-30 at 13:13

            Flink Stateful Functions 2.0 has the ability to make asynchronous calls, for example to an external API: https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/sdk/java.html#completing-async-requests.

            Function execution is then paused until the call completed with Success, Failure, or Unknown. Unknown is:

            The stateful function was restarted, possibly on a different machine, before the CompletableFuture was completed, therefore it is unknown what is the status of the asynchronous operation.

            What happens when there is a second call with the same ID to the paused/waiting function?

            1. Does the callee then wait on the called function's processing of its async result so that this second call executes with a clean, non-shared post-async state?
            2. Or does the second call execute on a normal schedule, and thus on top of the state that was current as of the async call, and then when the async call completes it continues processing using the state that was updated while the async call was pending?
            3. Or maybe the call counts as a "restart" of the called function - in which case what is the order of execution: the "restart" runs and then the async returns with "restart" to execute from the now updated state, or this order is reversed?
            4. Or something else?
            ...

            ANSWER

            Answered 2020-Apr-19 at 21:20

            Function execution does not pause while an async request is completing. The instance for that id will continue to process messages until the request completes. This means the state can change while the future is running.

            Think of your future as an ad-hoc function that you message and that then messages you back when it has a result. Functions can spawn multiple asynchronous requests without issue. Whichever future completes first will be processed first by the function instance, not necessarily the order in which they were spawned.
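
The "future as an ad-hoc function" picture can be modelled with a mailbox in plain Java. This is a single-threaded toy, not how the StateFun runtime is implemented; `AsyncFunctionSketch` and its members are hypothetical names. Async results are enqueued as ordinary messages when their futures complete, so ordinary messages keep changing the state in the meantime, and results are handled in completion order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical single-threaded model of one function instance.
final class AsyncFunctionSketch {
    private final Queue<String> mailbox = new ArrayDeque<>();
    final List<String> processed = new ArrayList<>();
    int counter = 0; // the instance's state

    void send(String message) {
        mailbox.add(message);
    }

    // Registers an async request; its result re-enters the mailbox as a
    // message at the moment the future completes.
    void registerAsync(String tag, CompletableFuture<String> future) {
        future.thenAccept(result -> mailbox.add("async:" + tag + "=" + result));
    }

    // Processes every queued message; each one sees and updates the state.
    void drain() {
        String message;
        while ((message = mailbox.poll()) != null) {
            counter++;
            processed.add(message + "@" + counter);
        }
    }
}
```

If two futures are registered as `first` and `second`, and `second` completes before `first`, the instance handles `second`'s result first, and both results observe whatever the counter has become by then.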

            Source https://stackoverflow.com/questions/61311053

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install flink-statefun

            You can download it from GitHub or Maven.
            You can use flink-statefun like any standard Java library. Please include the jar files in your classpath. You can also use any IDE, and you can run and debug the flink-statefun component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, please refer to maven.apache.org. For Gradle installation, please refer to gradle.org.

            Support

            For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, search for or ask them on the Stack Overflow community page.
            CLONE
          • HTTPS

            https://github.com/apache/flink-statefun.git

          • CLI

            gh repo clone apache/flink-statefun

          • sshUrl

            git@github.com:apache/flink-statefun.git
