
kettle-storm | Kettle Storm is an experimental execution environment | Pub Sub library

by pentaho | Java | Version: Current | License: No License

kandi X-RAY | kettle-storm Summary

kettle-storm is a Java library typically used in Messaging, Pub Sub, and Kafka applications. kettle-storm has no bugs, no reported vulnerabilities, and low support. However, the kettle-storm build file is not available. You can download it from GitHub.

Support

  • kettle-storm has a low active ecosystem.
  • It has 22 star(s) with 20 fork(s). There are 23 watchers for this library.
  • It had no major release in the last 12 months.
  • There is 1 open issue and 0 closed issues. On average, issues are closed in 2505 days. There is 1 open pull request and 0 closed requests.
  • It has a neutral sentiment in the developer community.
  • The latest version of kettle-storm is current.

Quality

  • kettle-storm has 0 bugs and 0 code smells.

Security

  • kettle-storm has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
  • kettle-storm code analysis shows 0 unresolved vulnerabilities.
  • There are 0 security hotspots that need review.

License

  • kettle-storm does not have a standard license declared.
  • Check the repository for any license declaration and review the terms closely.
  • Without a license, all rights are reserved, and you cannot use the library in your applications.

Reuse

  • kettle-storm releases are not available. You will need to build from source code and install.
  • kettle-storm has no build file. You will need to create the build yourself to build the component from source.
  • Installation instructions are not available. Examples and code snippets are available.
  • kettle-storm saves you 625 person hours of effort in developing the same functionality from scratch.
  • It has 1454 lines of code, 105 functions and 24 files.
  • It has medium code complexity. Code complexity directly impacts maintainability of the code.
Top functions reviewed by kandi - BETA

kandi has reviewed kettle-storm and discovered the below as its top functions. This is intended to give you an instant insight into kettle-storm implemented functionality, and help decide if they suit your requirements.

  • Creates a Storm topology
  • Starts the Storm execution engine
  • Processes the next row
  • Submits a topology
  • Sends a signal to the system
  • Handles a signal
  • Emits a tuple asynchronously
  • Creates a new SignalConnection
  • Returns a string representation of a signal
  • Loads the topology jar from configuration

kettle-storm Key Features

Steps that do not emit at least one message for every input. Because Kettle does not have a message ID to correlate Storm messages with, we cannot guarantee a message has been completely processed until we see a record emitted from a given step. Because of this, we also cannot determine which messages are produced for a given input if they are not immediately emitted as part of the same processRow() call. As such, we can only guarantee message processing when one input message produces at least one output message. These classes of input steps will not work until that is fixed:

  • Sampling
  • Aggregation
  • Sorting
  • Filtering
  • First-class Spoon support
  • Repository-based transformations
  • Error handling
  • Conditional hops
  • Sub-transformations
  • Metrics: Kettle timing, throughput, logging

The following commands will execute a transformation using a local in-memory test cluster.

### From a checkout
A Kettle transformation can be submitted as a topology using the included KettleStorm command-line application. To invoke it from Maven, use the exec goal with the Kettle transformation you wish to execute:
```
mvn package
mvn exec:java -Dexec.args=src/main/resources/test.ktr -Dkettle-storm-local-mode=true
```

### From a release
Extract the release and run:
```
java -Dkettle-storm-local-mode=true -jar kettle-engine-storm-${version}-assembly.jar path/to/my.ktr
```

Executing on a Storm cluster
---------------------------
The following instructions are meant to be executed using the artifacts packaged in a release.

To execute a transformation on a Storm cluster running on the same host simply run:
```
java -jar kettle-engine-storm-${version}-assembly.jar path/to/my.ktr
```

To execute the transformation against a Nimbus host running remotely, include the host and port via the ```storm.options``` System property:
```
java -Dstorm.options=nimbus.host=my-nimbus,nimbus.thrift.port=6628 -jar kettle-engine-storm-${version}-assembly.jar path/to/my.ktr
```

### Configuration via System Properties

If additional options are required, they can be provided as System Properties via the command line in the format: `-Dargument=value`.

They are all optional and will be translated into ```StormExecutionEnvironmentConfig``` properties:

* ```kettle-storm-local-mode```: Flag indicating if you wish to execute the transformation as a Storm topology on an in-memory "local cluster" or remotely to an external Storm cluster. Defaults to ```false```.
* ```kettle-storm-debug```: Flag indicating you wish to enable debug messaging from Storm for the submitted topology. Defaults to ```false```.
* ```kettle-storm-topology-jar```: The path to the jar file to submit with the Storm topology. This is only required if you have created a custom jar with additional classes you wish to make available to the Kettle transformation without having to manually install plugins or configure the environment of each Storm host.
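For example, the properties above can be combined on a single command line when launching the release assembly jar shown earlier (an illustrative invocation; the custom jar path is a placeholder):

```
java -Dkettle-storm-local-mode=true -Dkettle-storm-debug=true -Dkettle-storm-topology-jar=path/to/custom.jar -jar kettle-engine-storm-${version}-assembly.jar path/to/my.ktr
```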

#### Storm Configuration

By default, Kettle Storm will submit topologies to a nimbus host running on localhost with the default connection settings included with Storm. If you'd like to use a specific storm.yaml file declare a System property on the command line:
```
mvn exec:java -Dstorm.conf.file=/path/to/storm.yaml -Dexec.args=src/main/resources/test.ktr
```

Storm configuration properties can be overridden by specifying them on the command line in the format:
```
-Dstorm.options=nimbus.host=my-nimbus,nimbus.thrift.port=6628
```
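The ```storm.options``` value is a comma-separated list of ```key=value``` pairs. As an illustration only (```StormOptionsParser``` is a hypothetical class, not this project's actual parsing code), such a string can be split into a map like this:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StormOptionsParser {

    // Parse a comma-separated list of key=value pairs, e.g.
    // "nimbus.host=my-nimbus,nimbus.thrift.port=6628", into an ordered map.
    static Map<String, String> parse(String options) {
        Map<String, String> result = new LinkedHashMap<>();
        if (options == null || options.isEmpty()) {
            return result;
        }
        for (String pair : options.split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                result.put(pair.substring(0, eq).trim(),
                           pair.substring(eq + 1).trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> opts =
            parse("nimbus.host=my-nimbus,nimbus.thrift.port=6628");
        System.out.println(opts.get("nimbus.host"));        // my-nimbus
        System.out.println(opts.get("nimbus.thrift.port")); // 6628
    }
}
```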

Embedding
---------
The Kettle execution engine can submit topologies and can be embedded in a Java application using ```StormExecutionEngine``` and ```StormExecutionEngineConfig```.

```StormExecutionEngine``` provides convenience methods for integrating within multithreaded environments:

- ```StormExecutionEngine.isComplete```: Blocks for the provided duration and returns ```true``` if the topology has completed successfully.
- ```StormExecutionEngine.stop```: Kills the topology running the transformation if it is still executing.

### Example Code

```
import java.util.concurrent.TimeUnit;

StormExecutionEngineConfig config = new StormExecutionEngineConfig();
config.setTransformationFile("/path/to/my.ktr");
StormExecutionEngine engine = new StormExecutionEngine(config);
engine.init();
engine.execute();
// Block for up to 10 minutes while the topology executes.
boolean complete = engine.isComplete(10, TimeUnit.MINUTES);
if (!complete) {
  engine.stop(); // Kill the topology if it has not finished in time.
}
```

Building a release archive
--------------------------
Execute ```mvn clean package``` to produce the release artifacts. The jars will be stored in ```target/```.

Multiple artifacts are produced via the ```mvn package``` target:

```
kettle-engine-storm-0.0.1-SNAPSHOT-assembly.jar
kettle-engine-storm-0.0.1-SNAPSHOT-for-remote-topology.jar
kettle-engine-storm-0.0.1-SNAPSHOT.jar
```

The ```-assembly.jar``` is used to schedule execution of a transformation and contains all dependencies. The ```-for-remote-topology.jar``` contains code to be submitted to the cluster with the topology and all dependencies. The plain jar is this project's compilation without additional dependencies.

External References
===================
Kettle: http://kettle.pentaho.com
Storm: http://storm-project.net

Community Discussions

Trending Discussions on Pub Sub
  • Build JSON content in R according Google Cloud Pub Sub message format
  • BigQuery Table a Pub Sub Topic not working in Apache Beam Python SDK? Static source to Streaming Sink
  • Pub Sub Lite topics with Peak Capacity Throughput option
  • How do I add permissions to a NATS User to allow the User to query & create Jestream keyvalue stores?
  • MSK vs SQS + SNS
  • Dataflow resource usage
  • Run code on Python Flask AppEngine startup in GCP
  • Is there a way to listen for updates on multiple Google Classroom Courses using Pub Sub?
  • Flow.take(ITEM_COUNT) returning all the elements rather then specified amount of elements
  • Wrapping Pub-Sub Java API in Akka Streams Custom Graph Stage

QUESTION

Build JSON content in R according Google Cloud Pub Sub message format

Asked 2022-Apr-16 at 09:59

In R, I want to build JSON content according to this Google Cloud Pub Sub message format: https://cloud.google.com/pubsub/docs/reference/rest/v1/PubsubMessage

It has to respect:

{
  "data": string,
  "attributes": {
    string: string,
    ...
  },
  "messageId": string,
  "publishTime": string,
  "orderingKey": string
}

The message built will be read by this Python code:

def pubsub_read(data, context):
    '''This function is executed from a Cloud Pub/Sub'''
    message = base64.b64decode(data['data']).decode('utf-8')
    file_name = data['attributes']['file_name']

This following R code builds a R dataframe and converts it to json content:

library(jsonlite)
data="Hello World!"
df <- data.frame(data)
attributes <- data.frame(file_name=c('gfs_data_temp_FULL.csv'))
df$attributes <- attributes

msg <- df %>%
    toJSON(auto_unbox = TRUE, dataframe = 'columns', pretty = T) %>%
    # Pub/Sub expects a base64 encoded string
    googlePubsubR::msg_encode() %>%
    googlePubsubR::PubsubMessage()

It seems good, but when I visualise it with a JSON editor, indexes are added to the message content. I'm not sure it respects the Google Cloud Pub Sub message format...

ANSWER

Answered 2022-Apr-16 at 09:59

Not sure why, but replacing the dataframe by a list seems to work:

library(jsonlite)

df = list(data = "Hello World")
attributes <- list(file_name=c('toto.csv'))
df$attributes <- attributes

df %>%
  toJSON(auto_unbox = TRUE, simplifyVector=TRUE, dataframe = 'columns', pretty = T)

Output:

{
  "data": "Hello World",
  "attributes": {
    "file_name": "toto.csv"
  }
} 

Source https://stackoverflow.com/questions/71892778
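As a quick check of the round trip above, here is a small Python harness (hypothetical, mirroring the question's `pubsub_read` handler) that base64-encodes a JSON body the way the Pub/Sub transport expects and decodes it on the consumer side:

```python
import base64
import json

# Build a JSON body shaped like the R answer's output.
body = json.dumps({"data": "Hello World", "attributes": {"file_name": "toto.csv"}})

# Simulated Pub/Sub event: the body is base64-encoded into "data", and
# "attributes" are also carried at the event level, as the question's
# Cloud Function expects.
event = {
    "data": base64.b64encode(body.encode("utf-8")).decode("utf-8"),
    "attributes": {"file_name": "toto.csv"},
}

def pubsub_read(data, context):
    """Mirror of the question's handler (hypothetical harness)."""
    message = base64.b64decode(data["data"]).decode("utf-8")
    file_name = data["attributes"]["file_name"]
    return message, file_name

message, file_name = pubsub_read(event, None)
print(file_name)  # toto.csv
```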

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install kettle-storm

You can download it from GitHub.
You can use kettle-storm like any standard Java library. Please include the jar files in your classpath. You can also use any IDE to run and debug the kettle-storm component as you would with any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, please refer to maven.apache.org. For Gradle installation, please refer to gradle.org.

Support

For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.


  • © 2022 Open Weaver Inc.