mailing-list | Mailing List service powered by AWS Lambda | Function As A Service library
kandi X-RAY | mailing-list Summary
Mailing List service powered by AWS Lambda
Community Discussions
Trending Discussions on mailing-list
QUESTION
I have a Flink job that runs well locally but fails when I try to flink run the job on a cluster. It basically reads from Kafka, does some transformations, and writes to a sink. The error happens when trying to load data from Kafka via 'connector' = 'kafka'.
Here is my pom.xml; note that flink-connector-kafka is included.
ANSWER
Answered 2021-Mar-12 at 04:09
It turns out my pom.xml is configured incorrectly.
QUESTION
We have 4 collections in Solr Cloud (6.6.6). All of them have the same schema and hence the same type of content. When querying, we have to specify each collection name one by one in the query parameters. Is it possible to create an alias (or something like it) that can route traffic to all the underlying collections? Due to some technical reasons, we cannot use multiple shards.
It is more like a mailing list, where a mail is sent to many (mailing-list) users.
...ANSWER
Answered 2021-Jan-28 at 07:03
You can create an alias for multiple collections. Below is how you create an alias that spans several collections.
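The original snippet is not preserved above, so this is only a sketch: with the Solr Collections API, an alias covering several collections can be created and then queried like a single collection (host name, alias name and collection names below are placeholders):

http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=all-content&collections=collection1,collection2,collection3,collection4

A query sent to all-content (e.g. /solr/all-content/select?q=*:*) is then routed to every collection listed in the alias.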
QUESTION
ANSWER
Answered 2020-Sep-14 at 07:47
With the Table API you are just constructing an expression that will be executed later, so every expression simply returns another expression, building up an expression tree. Use org.apache.flink.table.api.Expressions#ifThenElse, or CASE WHEN ... END in SQL.
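Since the question's code is not shown above, the following is only a minimal sketch of how Expressions#ifThenElse can be used; the table (input) and the column names and literals are made up for illustration:

import org.apache.flink.table.api.Table;
import static org.apache.flink.table.api.Expressions.$;
import static org.apache.flink.table.api.Expressions.ifThenElse;
import static org.apache.flink.table.api.Expressions.lit;

// input is an existing Table with columns id and amount;
// label each row HIGH or LOW depending on the amount column
Table result = input.select(
        $("id"),
        ifThenElse($("amount").isGreater(lit(100)), lit("HIGH"), lit("LOW")).as("category"));

The SQL equivalent would be: SELECT id, CASE WHEN amount > 100 THEN 'HIGH' ELSE 'LOW' END AS category FROM input.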
QUESTION
I have the following task:
- Create a job with an SQL request to a Hive table;
- Run this job on a remote Flink cluster;
- Collect the result of this job in a file (HDFS is preferable).
Note
Because it is necessary to run this job on a remote Flink cluster, I cannot use TableEnvironment in a simple way. This problem is mentioned in this ticket: https://issues.apache.org/jira/browse/FLINK-18095. For the current solution I use the advice from http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Table-Environment-for-Remote-Execution-td35691.html.
Code
...ANSWER
Answered 2020-Sep-08 at 02:11
Table.execute().collect() returns the result of the view to your client side for interactive purposes. In your case, you can use the filesystem connector and use INSERT INTO for writing the view to a file. For example:
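The original example is not preserved here; a minimal sketch of the idea in Flink SQL, assuming a hypothetical view my_view with made-up columns, could look like this:

-- declare an HDFS-backed sink table using the filesystem connector
CREATE TABLE hdfs_sink (
  id BIGINT,
  name STRING
) WITH (
  'connector' = 'filesystem',
  'path' = 'hdfs:///tmp/flink-output',
  'format' = 'csv'
);

-- write the view's result to files instead of collecting it on the client
INSERT INTO hdfs_sink SELECT id, name FROM my_view;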
QUESTION
ANSWER
Answered 2020-Aug-08 at 02:21
To summarize the series of updates I did to fix my problem:
- Change the connection port within createRemoteEnvironment() to 8081
- Check the Scala version on the Apache Flink server, then downgrade your Scala version in the IDE to match
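For reference, a sketch of what the corrected remote environment call could look like (the host name and jar path are placeholders):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// point the client at the cluster's REST port (8081 by default) and ship the job jar
StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment(
        "flink-jobmanager-host", 8081, "target/my-flink-job.jar");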
QUESTION
I want to use PureConfig with Apache Flink.
How can I pass additional Java properties when starting the job?
I try to pass them via the -yD env.java.opts="-Dconfig.file='config/jobs/twitter-analysis.conf'" argument, but it is not accepted:
https://github.com/geoHeil/streaming-reference/blob/5-basic-flink-setup/Makefile#L21
...
ANSWER
Answered 2020-Jun-25 at 19:29
Copying the answer from the mailing list:
If you reuse the cluster for several jobs, they need to share the JVM_ARGS since it's the same process. [1] On Spark, new processes are spawned for each stage, afaik.
However, the current recommendation is to use only one ad-hoc cluster per job/application (which is closer to how Spark works). So if you use YARN, every job/application spawns a new cluster that has just the right size for it. Then you can supply new parameters with each new YARN submission.
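As a rough sketch of what such a per-job YARN submission could look like (the jar name is a placeholder, and the config file must also be readable from the cluster nodes):

flink run -m yarn-cluster \
  -yD env.java.opts="-Dconfig.file=config/jobs/twitter-analysis.conf" \
  target/streaming-job.jar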
QUESTION
I want to run a Flink job on Kubernetes, using a (persistent) state backend. It seems like crashing taskmanagers are no issue, as they can ask the jobmanager which checkpoint they need to recover from, if I understand correctly.
A crashing jobmanager seems to be a bit more difficult. On this FLIP-6 page I read that ZooKeeper is needed so that the jobmanager knows which checkpoint to use for recovery, and for leader election.
Seeing as Kubernetes will restart the jobmanager whenever it crashes, is there a way for the new jobmanager to resume the job without having to set up a ZooKeeper cluster?
The current solution we are looking at is: when Kubernetes wants to kill the jobmanager (because it wants to move it to another VM, for example), it first creates a savepoint, but this would only work for graceful shutdowns.
Edit: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-HA-with-Kubernetes-without-Zookeeper-td15033.html seems to be interesting but has no follow-up.
...ANSWER
Answered 2018-Aug-31 at 09:35
Out of the box, Flink requires a ZooKeeper cluster to recover from JobManager crashes. However, I think you can have a lightweight implementation of the HighAvailabilityServices, CompletedCheckpointStore, CheckpointIDCounter and SubmittedJobGraphStore which can bring you quite far.
Given that you have only one JobManager running at all times (not entirely sure whether K8s can guarantee this) and that you have a persistent storage location, you could implement a CompletedCheckpointStore which retrieves the completed checkpoints from the persistent storage system (e.g. by reading all stored checkpoint files). Additionally, you would have a file which contains the current checkpoint id counter for the CheckpointIDCounter and all the submitted job graphs for the SubmittedJobGraphStore. The basic idea is to store everything on a persistent volume which is accessible by the single JobManager.
QUESTION
Computer configuration: my computer runs macOS, with Ubuntu installed in a virtual machine. The host system has a working VPN; I did not manage to set up the VPN on Linux.
Environment configuration: I am installing PETSc in the virtual machine. I downloaded and unpacked the PETSc package, installed GCC and gfortran, and downloaded MPI and BLAS/LAPACK separately beforehand, but there was an error when installing MPI.
...ANSWER
Answered 2020-Apr-15 at 18:33
As Satish said in email, your machine likely ran out of memory; gfortran can take more than 2 GB of RAM when compiling. An alternative, if you do not need Fortran, is to configure using --with-fc=0.
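A minimal sketch of that configure invocation, assuming it is run from the extracted PETSc source directory (any other options depend on your local MPI/BLAS setup):

# skip the Fortran compiler entirely if Fortran is not needed
./configure --with-fc=0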
QUESTION
I'm researching Docker/Kubernetes deployment possibilities for Flink 1.9.1.
I have read/watched [1][2][3][4].
Currently we think we will go with the Job Cluster approach, although we would like to know what the community trend is here. We would rather not deploy more than one job per Flink cluster.
Anyway, I was wondering about a few things:
How can I change the number of task slots per task manager for a Job and a Session Cluster? In my case I'm running Docker on VirtualBox, where I have 4 CPUs assigned to the machine. However, each task manager is spawned with only one task slot for the Job Cluster. With the Session Cluster, on the same machine, each task manager is spawned with 4 task slots.
In both cases Flink's UI shows that each task manager has 4 CPUs.
How can I resubmit a job if I'm using a Job Cluster? I'm referring to this use case [5]. You may say that I have to start the job again but with different arguments. What is the procedure for this? I'm using checkpoints, by the way.
Should I kill all task manager containers and rerun them with different parameters?
How can I resubmit a job using a Session Cluster?
How can I provide a log config for a Job/Session Cluster? I have a case where I changed the log level and log format in log4j.properties, and this works fine in the local (IDE) environment. However, when I build the fat jar and run a Job Cluster based on this jar, it seems that my log4j properties are not passed to the cluster; I see the original format and the original (INFO) level.
Thanks,
[1] https://youtu.be/w721NI-mtAA
[2] https://youtu.be/WeHuTRwicSw
[3] https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/docker.html
[4] https://github.com/apache/flink/blob/release-1.9/flink-container/docker/README.md
...ANSWER
Answered 2020-Jan-15 at 15:48
Currently we think we will go with the Job Cluster approach, although we would like to know what the community trend is here. We would rather not deploy more than one job per Flink cluster.
This question is probably better suited to the user mailing list.
How can I change the number of task slots per task manager for a Job and a Session Cluster?
You can control this via the config option taskmanager.numberOfTaskSlots.
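For example, assuming you can edit (or mount) the image's conf/flink-conf.yaml, giving each task manager four slots could look like this:

# conf/flink-conf.yaml
taskmanager.numberOfTaskSlots: 4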
How can I resubmit a job using a Session Cluster?
This is described here. The bottom line is that you create a savepoint and resume your job from it. It is also possible to resume a job from retained checkpoints.
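A rough sketch of that workflow with the Flink CLI (job id, savepoint directory and jar name are placeholders):

# trigger a savepoint for the running job
flink savepoint <jobId> hdfs:///flink/savepoints

# stop the job (this can also be combined with taking a savepoint via flink cancel -s)
flink cancel <jobId>

# resubmit the job, resuming from the savepoint, possibly with new arguments
flink run -s hdfs:///flink/savepoints/savepoint-xxxx my-job.jar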
How can I resubmit a job if I'm using a Job Cluster?
Conceptually, this is no different from resuming a job from a savepoint in a session cluster. You can specify the path to the savepoint as a command line argument to the cluster entrypoint. The details are described here.
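For instance, with the standalone job cluster entrypoint shipped in the Flink distribution (bin/standalone-job.sh), a hedged sketch could look like the following; the class name and savepoint path are placeholders, and the exact flags may differ between Flink versions:

standalone-job.sh start-foreground \
  --job-classname com.example.MyStreamingJob \
  --fromSavepoint hdfs:///flink/savepoints/savepoint-xxxx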
How can I provide a log config for a Job/Session Cluster?
If you are using the scripts in the bin/ directory of the Flink binary distribution to start your cluster (e.g., bin/start-cluster.sh, bin/jobmanager.sh, bin/taskmanager.sh, etc.), you can change the log4j configuration by adapting conf/log4j.properties. The logging configuration is passed to the JobManager and TaskManager JVMs as a system variable (see bin/flink-daemon.sh). See also the chapter "How to use logging" in the Flink documentation.
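For example, a minimal conf/log4j.properties that changes the level and pattern might look like this (the appender setup is only an illustration of standard log4j 1.x configuration, not the exact file shipped with Flink):

log4j.rootLogger=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %c - %m%n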
QUESTION
I am simply typing:
...ANSWER
Answered 2019-Jul-19 at 19:13
I was able to get past the above error by using a fresh download of Zeppelin. I did run into another issue, however, when trying to use the Python interpreter: Apache Zeppelin 8.1 Win10 Python Interpreter Error.
The stack trace in the cmd window used to launch Zeppelin revealed the following error: Apache Zeppelin 8.1 Win10 Python Interpreter Error Stack Trace.
Basically, by referencing the code I was able to see that the Python interpreter is attempting to create a temporary file and put it into a folder named tmp: https://github.com/apache/zeppelin/blob/branch-0.8/python/src/main/java/org/apache/zeppelin/python/PythonInterpreter.java, line 102.
From this point I wasn't sure where it was trying to use this folder named tmp. On my PC the location ended up being in the root of C:, so C:\tmp.
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.