spring-cloud-dataflow | Microservices-based Streaming and Batch data processing | Stream Processing library

by spring-cloud | Java | Version: 2.11.0 | License: Apache-2.0

kandi X-RAY | spring-cloud-dataflow Summary

spring-cloud-dataflow is a Java library typically used in Data Processing, Stream Processing, Kafka applications. spring-cloud-dataflow has no bugs, it has no vulnerabilities, it has build file available, it has a Permissive License and it has high support. You can download it from GitHub, Maven.

Spring Cloud Data Flow is a microservices-based toolkit for building streaming and batch data processing pipelines in Cloud Foundry and Kubernetes. Data processing pipelines consist of Spring Boot apps, built using the Spring Cloud Stream or Spring Cloud Task microservice frameworks. This makes Spring Cloud Data Flow ideal for a range of data processing use cases, from import/export to event streaming and predictive analytics.
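As a hedged illustration of the kind of app SCDF composes into a pipeline (the class and function names below are invented for this sketch, not taken from this repository), a minimal Spring Cloud Stream processor is just a Spring Boot application exposing a java.util.function.Function bean:

    import java.util.function.Function;

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.context.annotation.Bean;

    @SpringBootApplication
    public class UppercaseProcessorApplication {

        public static void main(String[] args) {
            SpringApplication.run(UppercaseProcessorApplication.class, args);
        }

        // Spring Cloud Stream binds this Function to the stream's input and
        // output destinations (e.g. Kafka topics) when SCDF deploys the app.
        @Bean
        public Function<String, String> uppercase() {
            return String::toUpperCase;
        }
    }

Registered with SCDF, an app like this could sit in the middle of a stream definition such as "http | uppercase | log".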

            kandi-support Support

              spring-cloud-dataflow has a highly active ecosystem.
It has 992 stars, 554 forks, and 101 watchers.
It had no major release in the last 12 months.
There are 207 open issues and 3,030 closed issues; on average, issues are closed in 43 days. There are 4 open pull requests and 0 closed pull requests.
It has a positive sentiment in the developer community.
The latest version of spring-cloud-dataflow is 2.11.0.

            kandi-Quality Quality

              spring-cloud-dataflow has 0 bugs and 0 code smells.

            kandi-Security Security

              spring-cloud-dataflow has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              spring-cloud-dataflow code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              spring-cloud-dataflow is licensed under the Apache-2.0 License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              spring-cloud-dataflow releases are available to install and integrate.
              Deployable package is available in Maven.
              Build file is available. You can build the component from source.
              Installation instructions are not available. Examples and code snippets are available.

            Top functions reviewed by kandi - BETA

kandi has reviewed spring-cloud-dataflow and discovered the following top functions. This is intended to give you an instant insight into the functionality spring-cloud-dataflow implements, and to help you decide if it suits your requirements.
• Launch a task.
• Delete child task executions.
• Inline the given sequence into the main sequence.
• Build the StreamAppDefinition.
• Create the appDeployment requests for a stream deployment.
• Return a page of audit records.
• Retrieve the TaskExecution information for a given task.
• Process a set of links.
• Execute a task.
• Create a container configuration map.

            spring-cloud-dataflow Key Features

            No Key Features are available at this moment for spring-cloud-dataflow.

            spring-cloud-dataflow Examples and Code Snippets

            No Code Snippets are available at this moment for spring-cloud-dataflow.

            Community Discussions

            QUESTION

            Add SCDF (Spring Cloud Data Flow) Application to Bitnami chart generated cluster?
            Asked 2021-Sep-01 at 17:51

            I've used the Bitnami Helm chart to install SCDF into a k8s cluster generated by kOps in AWS.

I'm trying to add my development SCDF stream apps into the installation using a file URI and cannot figure out where or how the shared Skipper & Server mount point is set up. Exec'ing into either instance, there is no /home/cnb, and I'm not seeing anything in common via mount. As best I can tell, the Bitnami installation is using the MariaDB instance for shared "storage".

            Is there a recommended way of installing local/dev Stream apps into the cluster?

            ...

            ANSWER

            Answered 2021-Aug-23 at 09:03

There are a couple of parameters under the deployer section of the chart that allow you to mount volumes:

            Source https://stackoverflow.com/questions/68863139

            QUESTION

            Spring Cloud Data Flow Passing Parameters to Data Flow Server
            Asked 2021-Aug-06 at 16:56

What is the format for passing additional arguments or environment variables to the Data Flow server in SCDF running on Kubernetes? When running locally with Docker Compose, I can do something like the below, but I am not sure what the equivalent is when deploying to Kubernetes using the Helm chart.

            ...

            ANSWER

            Answered 2021-Aug-06 at 16:56

The properties you are looking for might be under Kafka Chart Parameters -> externalKafka.brokers.

            So in your case I would try

            Source https://stackoverflow.com/questions/68683332

            QUESTION

            Spring Cloud Dataflow - Parallel Tasks
            Asked 2021-Aug-04 at 20:54

I have about 16 tasks configured to run in parallel.

My intention is to only have 3 tasks running at one time. I don't mind which tasks run first as long as the order of the sequential tasks is maintained (BBB always runs after AAA, DDD after CCC, etc.).

As per the documentation here - https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#_configuration_options - I tried setting --split-thread-core-pool-size=3, but it gave me this error:

            Split thread core pool size 3 should be equal or greater than the depth of split flows 17. Try setting the composed task property splitThreadCorePoolSize

What do I do here?

            ...

            ANSWER

            Answered 2021-Aug-04 at 20:54

Spring Cloud Dataflow's Composed Task Runner uses Spring Batch under the covers, and the way Spring Batch deals with nested splits in flows is not quite optimal.

That's why nested splits should be avoided if tight control over concurrency limits is required.

In your case that should be possible, as the sketch below shows.
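A hedged sketch of that restructuring with the SCDF Java DSL (from the spring-cloud-dataflow-rest-client module; the server URL and task names are placeholders, AAA..FFF are assumed to be registered task apps, and the remaining pairs are elided): keep each sequential pair intact, but place all pairs in a single flat top-level split so there is only one split level to size the thread pool against.

    import java.net.URI;
    import java.util.List;
    import java.util.Map;

    import org.springframework.cloud.dataflow.rest.client.DataFlowTemplate;
    import org.springframework.cloud.dataflow.rest.client.dsl.task.Task;

    public class FlatSplitPipeline {
        public static void main(String[] args) {
            DataFlowTemplate dataFlow = new DataFlowTemplate(URI.create("http://localhost:9393"));

            // One flat split: each "&&" pair keeps its order (BBB after AAA, ...)
            // and there are no nested splits for the runner to account for.
            Task pipeline = Task.builder(dataFlow)
                    .name("parallel-pairs")
                    .definition("<AAA && BBB || CCC && DDD || EEE && FFF>")
                    .description("flat split instead of nested splits")
                    .build();

            // The same flag the question used, now validated against one split level.
            pipeline.launch(Map.of(), List.of("--split-thread-core-pool-size=3"));
        }
    }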

            Source https://stackoverflow.com/questions/68640451

            QUESTION

            Spring Cloud Data Flow: Error org.springframework.dao.InvalidDataAccessResourceUsageException
            Asked 2021-Jul-08 at 17:30

I am trying to run/configure Spring Cloud Data Flow (SCDF) to schedule a task for a Spring Batch job.

I am running in minikube, which connects to a local PostgreSQL instance (localhost:5432). Minikube runs in VirtualBox, where I assigned a virtual network through --cidr so that minikube can connect to the local Postgres.

            Here is the postgresql service yaml: https://github.com/msuzuki23/SpringCloudDataFlowDemo/blob/main/postgres-service.yaml

            Here is the SCDF config yaml: https://github.com/msuzuki23/SpringCloudDataFlowDemo/blob/main/server-config.yaml

            Here is the SCDF deployment yaml: https://github.com/msuzuki23/SpringCloudDataFlowDemo/blob/main/server-deployment.yaml

            Here is the SCDF server-svc.yaml: https://github.com/msuzuki23/SpringCloudDataFlowDemo/blob/main/server-svc.yaml

To launch the SCDF server in minikube, I run the following kubectl commands:

            ...

            ANSWER

            Answered 2021-Jul-08 at 16:31

            I did a search on

            Caused by: org.postgresql.util.PSQLException: ERROR: relation "hibernate_sequence" does not exist Position: 17

and found this Stack Overflow answer:

            Postgres error in batch insert : relation "hibernate_sequence" does not exist position 17

I went into Postgres and created the hibernate_sequence:
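A minimal sketch of that fix, assuming direct JDBC access to the database SCDF uses (the connection URL and credentials are placeholders, and the PostgreSQL JDBC driver must be on the classpath; the same statement can of course be run in psql instead):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class CreateHibernateSequence {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details; point this at SCDF's database.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/dataflow", "postgres", "secret");
                 Statement st = conn.createStatement()) {
                // Hibernate's sequence-based ID generator expects this to exist.
                st.execute("CREATE SEQUENCE IF NOT EXISTS hibernate_sequence START WITH 1");
            }
        }
    }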

            Source https://stackoverflow.com/questions/68304032

            QUESTION

Spring Cloud Data Flow: Unable to launch multiple instances of the same Task
            Asked 2021-May-13 at 09:21

            TL;DR

Spring Cloud Data Flow does not allow multiple executions of the same task, even though the documentation says this is the default behavior. How can we allow SCDF to run multiple instances of the same task at the same time when using the Java DSL to launch tasks? To make things more interesting, launching the same task multiple times works fine when hitting the REST endpoints directly, using curl for example.

Background:

I have a Spring Cloud Data Flow task that I have pre-registered in the Spring Cloud Data Flow UI dashboard.

            ...

            ANSWER

            Answered 2021-May-12 at 16:57

In this case it looks like you are trying to recreate the task definition. You should only need to create the task definition once; from that definition you can launch multiple times. For example:
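A hedged sketch of that pattern with the SCDF Java DSL (the server URL, task name, and arguments are illustrative):

    import java.net.URI;
    import java.util.List;
    import java.util.Map;

    import org.springframework.cloud.dataflow.rest.client.DataFlowTemplate;
    import org.springframework.cloud.dataflow.rest.client.dsl.task.Task;

    public class LaunchSameTaskTwice {
        public static void main(String[] args) {
            DataFlowTemplate dataFlow = new DataFlowTemplate(URI.create("http://localhost:9393"));

            // Create the definition once (build() registers it with the server)...
            Task task = Task.builder(dataFlow)
                    .name("my-task")
                    .definition("timestamp")
                    .description("created once, launched many times")
                    .build();

            // ...then launch as many executions as needed from that one definition.
            long first = task.launch(Map.of(), List.of("--run=1"));
            long second = task.launch(Map.of(), List.of("--run=2"));
            System.out.printf("launched executions %d and %d%n", first, second);
        }
    }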

            Source https://stackoverflow.com/questions/67506703

            QUESTION

Spring Cloud Data Flow: Programmatic Orchestration of Tasks
            Asked 2021-May-10 at 18:15

            Background

I have the Spring Cloud Data Flow server running in Kubernetes as a pod. I am able to launch tasks from the SCDF server UI dashboard. I am looking to develop a more complicated, real-world task pipeline use case.

Instead of using the SCDF UI dashboard, I want to launch a sequential list of tasks from a standard Java application. Consider the following task pipeline:

Task 1: Reads data from the database for the unique id received as task argument input and performs enrichments. The enriched record is written back to the database. One execution of a task instance is responsible for processing one unique id.

Task 2: Reads the enriched data written by Task 1 for the unique id received as task argument input and generates reports. One execution of a task instance is responsible for generating reports for one unique id.

It should be clear from the above that Task 1 and Task 2 are sequential steps. Assume the input database contains 50k unique ids. I want to develop an orchestrator Java program that launches Task 1 with a limit of 40 (i.e. only 40 pods can be running at any given time for Task 1; any requests to launch more pods for Task 1 should be put on wait). Only once all 50k unique ids have been processed through Task 1 instances should Task 2 pods be launched.

            What I found so far

Going through the documentation, I found something called the ComposedTaskRunner. However, the examples show commands triggered in a shell/cmd window. I want to do something similar, but instead of opening a Data Flow shell program, I want to pass arguments to a Java program that can internally launch tasks. This allows me to easily integrate my application with legacy code that knows how to integrate with Java code (either by launching an on-demand Java program that launches a set of tasks and waits for them to complete, or by calling a REST API).

            Question

1. How can I programmatically launch tasks on demand with Spring Cloud Data Flow using Java instead of the Data Flow shell? (A REST API for this, or a simple Java program running on a standalone server, would be fine too.)
2. How can I programmatically build a sequential pipeline with an upper limit on the number of pods launched per task, and with dependencies such that a task starts only once the previous task has finished processing all its inputs?
            ...

            ANSWER

            Answered 2021-May-10 at 18:15

Please review the Java DSL support for tasks.

You'd be able to compose the choreography of the tasks, with sequential/parallel execution, using this fluent-style API (example: .definition("a: timestamp && b: timestamp")).

With this defined as Java code, you'd be able to build, launch, or schedule the launching of these directed graphs. We see many customers following this approach for E2E acceptance testing and deployment automation.

            [ADD]

Furthermore, you can extend the programmatic task definitions for continuous deployments as well.
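A hedged sketch of that Java DSL flow for question 1 (method and enum names assume the spring-cloud-dataflow-rest-client DSL and mirror the polling pattern in the SCDF reference guide; the task name, URL, and argument are illustrative, and the 40-in-flight throttle from the question would wrap the launch call, e.g. with a semaphore):

    import java.net.URI;
    import java.util.List;
    import java.util.Map;

    import org.springframework.cloud.dataflow.rest.client.DataFlowTemplate;
    import org.springframework.cloud.dataflow.rest.client.dsl.task.Task;
    import org.springframework.cloud.dataflow.rest.client.dsl.task.TaskExecutionStatus;

    public class TaskOrchestrator {
        public static void main(String[] args) throws InterruptedException {
            DataFlowTemplate dataFlow = new DataFlowTemplate(URI.create("http://localhost:9393"));

            // "enrich" plays the role of Task 1, assumed registered on the server.
            Task enrich = Task.builder(dataFlow)
                    .name("enrich")
                    .definition("enrich-app")
                    .description("stage 1 of the pipeline")
                    .build();

            // One execution per unique id; launch returns the execution id.
            long executionId = enrich.launch(Map.of(), List.of("--uniqueId=42"));

            // Block until this execution completes before scheduling Task 2
            // (production code should also bail out on an ERROR status).
            while (enrich.executionStatus(executionId) != TaskExecutionStatus.COMPLETE) {
                Thread.sleep(5_000);
            }
        }
    }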

            Source https://stackoverflow.com/questions/67467973

            QUESTION

            Spring Data Flow Helm chart: Is there a way to declare creation of applications and tasks within the helm charts?
            Asked 2021-Mar-10 at 07:31

The only way I have found is using a curl command, as per the docs: https://docs.spring.io/spring-cloud-dataflow/docs/2.7.1/reference/htmlsingle/#resources-app-registry-post

This uses a curl command to hit the API. I could develop a script for that, but I would like to set this up within the Helm charts so that these tasks and applications are created when the chart is deployed. Any ideas?

            ...

            ANSWER

            Answered 2021-Mar-10 at 07:31

Please check Spring Cloud Data Flow, Helm Installation, Register prebuilt applications; it says:

Applications can be registered individually using the app register functionality or as a group using the app import functionality.

So I guess you always need to start the app using the Helm chart and only later register the applications using the app commands or the REST endpoint.
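A hedged sketch of doing that post-install registration from Java rather than curl (the appRegistryOperations().importFromResource call and the starters URL are assumptions based on the spring-cloud-dataflow-rest-client module; verify them against your SCDF version):

    import java.net.URI;

    import org.springframework.cloud.dataflow.rest.client.DataFlowTemplate;

    public class RegisterPrebuiltApps {
        public static void main(String[] args) {
            DataFlowTemplate dataFlow = new DataFlowTemplate(URI.create("http://localhost:9393"));

            // Bulk-import the prebuilt Kafka app starters, mirroring the
            // dashboard's "Import application starters" action; "true" forces
            // re-registration if the apps already exist.
            dataFlow.appRegistryOperations()
                    .importFromResource("https://dataflow.spring.io/kafka-maven-latest", true);
        }
    }

Packaged as a container, a program like this could also run from a Helm post-install hook Job, which gets close to the "declared in the chart" goal of the question.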

            Source https://stackoverflow.com/questions/66556895

            QUESTION

            Spring Cloud Data Flow Local Server Jar - java.lang.NoClassDefFoundError
            Asked 2020-Dec-22 at 16:50

Recently I updated my JDK from 8 (1.8_275) to 11 (openjdk version "11.0.9.1" 2020-11-04).

While I am trying to launch the SCDF local server using

            ...

            ANSWER

            Answered 2020-Dec-22 at 16:34

You're using an ancient, deprecated version of SCDF; the 1.x line has reached EOL/EOGS as well. In particular, the version you're using is more than two years old.

Please upgrade to the 2.x line. The latest GA release is 2.7.0.

            Check out the getting-started guide and the release blog for more details.

            Source https://stackoverflow.com/questions/65384949

            QUESTION

Keycloak OAuth2 integration is not working in spring cloud dataflow 2.3.0
            Asked 2020-Nov-01 at 14:26

I am currently trying to integrate Keycloak with Spring Cloud Data Flow 2.3.0, but the configuration shown in the documentation is not working for this version. I tried the same with Spring Cloud Data Flow 2.2.2 and the integration worked fine. This is the config I added in application.yaml for both versions:

            ...

            ANSWER

            Answered 2020-Nov-01 at 14:26

The configuration changed in version 2.3.0, which is not documented in the Data Flow documentation. I have added just the Keycloak-related configuration on GitHub: https://github.com/ChimbuChinnadurai/spring-cloud-dataflow-keycloak-integration

            Source https://stackoverflow.com/questions/64413965

            QUESTION

Spring Cloud Data Flow security configuration and integration with RedHat SSO
            Asked 2020-Aug-31 at 20:43

We are trying to turn on security for Spring Cloud Data Flow following the documentation (https://docs.spring.io/spring-cloud-dataflow/docs/current-SNAPSHOT/reference/htmlsingle/#configuration-security), but we have some knowledge gaps that we are unable to fill.

According to point 9.2, it is possible to configure authentication with OAuth 2.0 and integrate it with SSO. We use RedHat SSO, so we are trying to integrate the two, but we have not been able to make it work. Is this possible, or is there a limitation on the SSO provider used?

            Following the documentation, we have set these properties:

            • spring.security.oauth2.client.registration.uaa.client-id=xxxxxxx
            • spring.security.oauth2.client.registration.uaa.client-secret=xxxxxx
            • spring.security.oauth2.client.registration.uaa.redirect-uri='{baseUrl}/login/oauth2/code/{registrationId}'
            • spring.security.oauth2.client.registration.uaa.authorization-grant-type=authorization_code
            • spring.security.oauth2.client.registration.uaa.scope[0]=openid
            • spring.security.oauth2.client.provider.uaa.jwk-set-uri=../openid-connect/certs
            • spring.security.oauth2.client.provider.uaa.token-uri=../openid-connect/token
            • spring.security.oauth2.client.provider.uaa.user-info-uri=../openid-connect/userinfo
            • spring.security.oauth2.client.provider.uaa.user-name-attribute=user_name
            • spring.security.oauth2.client.provider.uaa.authorization-uri=../openid-connect/auth
            • spring.security.oauth2.resourceserver.opaquetoken.introspection-uri=../openid-connect/token/introspect
            • spring.security.oauth2.resourceserver.opaquetoken.client-id=xxxxxxx
            • spring.security.oauth2.resourceserver.opaquetoken.client-secret=xxxxxxx

            So we have some considerations:

• The resourceserver.opaquetoken properties are needed for token introspection, so we are fairly sure they are necessary for when we receive a REST request that must carry the Authorization header
• If we are not using UAA, should the properties still be named uaa?
• When we try to access the UI, it redirects to the authorization-uri because authorization-grant-type=authorization_code, so the login happens in the SSO, is that right?
• If we used the Password grant type, it would directly request a username/password for login; where would that be validated?
• The user-info URI is mandatory, but is it really used?
• What are the other URIs (jwk and token) used for?
• Why does the redirect URI have that format? Where do those variables point?

Finally, we have tested the configuration on an SCDF instance running in a Docker container, but it does "nothing":

            ...

            ANSWER

            Answered 2020-Jul-21 at 08:01

These are all plain Spring Security OAuth settings, and the concepts are better documented there. We're in the process of adding better docs for Keycloak, but in the meantime my old dataflow-keycloak test might get you started.

In recent versions we added a better way to use plain JWT keys, and we documented it for Azure AD. The plan is to add a similar section for Keycloak.

I believe using just issuer-uri and jwk-set-uri should give you a working setup (you still need to figure out the scope-to-role mappings), since Spring Security uses those to auto-configure the OAuth settings. All the other settings are somewhat legacy, dating back to when we weren't fully on the Spring Security 5.3 line.

For RH SSO, I'm not sure whether you're talking about some global shared instance or your own private setup.

            Source https://stackoverflow.com/questions/62999316

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install spring-cloud-dataflow

            You can download it from GitHub, Maven.
You can use spring-cloud-dataflow like any standard Java library. Please include the jar files in your classpath. You can also use any IDE to run and debug the spring-cloud-dataflow component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, please refer to maven.apache.org; for Gradle installation, please refer to gradle.org.

            Support

            We welcome contributions! Follow this link for more information on how to contribute.

Clone

• HTTPS: https://github.com/spring-cloud/spring-cloud-dataflow.git
• GitHub CLI: gh repo clone spring-cloud/spring-cloud-dataflow
• SSH: git@github.com:spring-cloud/spring-cloud-dataflow.git



Consider Popular Stream Processing Libraries

• gulp by gulpjs
• webtorrent by webtorrent
• aria2 by aria2
• ZeroNet by HelloZeroNet
• qBittorrent by qbittorrent

Try Top Libraries by spring-cloud

• spring-cloud-netflix (Java)
• spring-cloud-gateway (Java)
• spring-cloud-kubernetes (Java)
• spring-cloud-config (Java)
• spring-cloud-sleuth (Java)