ThreadUtils | A collection of classes that help with multithreading | Reflection library
kandi X-RAY | ThreadUtils Summary
A collection of classes that help with multithreading.
ThreadUtils Key Features
ThreadUtils Examples and Code Snippets
Community Discussions
Trending Discussions on ThreadUtils
I'm using Spark 3.1 in Databricks (Databricks Runtime 8) with a very large cluster (25 workers with 112 GB of memory and 16 cores each) to replicate several SAP tables in Azure Data Lake Storage (ADLS Gen2). To do this, a tool writes the deltas of all these tables into an intermediate system (SQL Server) and then, if there is new data for a certain table, I execute a Databricks job to merge the new data with the existing data available in ADLS.
This process works fine for most of the tables, but some of them (the biggest ones) take a long time to merge (I merge the data using the PK of each table), and the biggest one started failing a week ago (when a big delta of the table was generated). This is the trace of the error that I can see in the job:
Py4JJavaError: An error occurred while calling o233.sql.
: org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:234)
    at com.databricks.sql.transaction.tahoe.files.TransactionalWriteEdge.$anonfun$writeFiles$5(TransactionalWriteEdge.scala:246)
    ...
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:428)
    at com.databricks.sql.transaction.tahoe.perf.DeltaOptimizedWriterExec.awaitShuffleMapStage$1(DeltaOptimizedWriterExec.scala:153)
    at com.databricks.sql.transaction.tahoe.perf.DeltaOptimizedWriterExec.getShuffleStats(DeltaOptimizedWriterExec.scala:158)
    at com.databricks.sql.transaction.tahoe.perf.DeltaOptimizedWriterExec.computeBins(DeltaOptimizedWriterExec.scala:106)
    at com.databricks.sql.transaction.tahoe.perf.DeltaOptimizedWriterExec.doExecute(DeltaOptimizedWriterExec.scala:174)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:196)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:240)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:236)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:192)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:180)
    ... 141 more
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: ShuffleMapStage 68 (execute at DeltaOptimizedWriterExec.scala:97) has failed the maximum allowable number of times: 4. Most recent failure reason: org.apache.spark.shuffle.FetchFailedException: Connection from /XXX.XX.XX.XX:4048 closed
    ...
Caused by: Connection from /XXX.XX.XX.XX:4048 closed
    ...
    at io.netty.handler.timeout.IdleStateHandler.channelInactive(
    ...
    at$HeadContext.channelInactive(
    ...
    at$AbstractUnsafe$
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(
    ...
    at io.netty.util.concurrent.SingleThreadEventExecutor$
    at io.netty.util.internal.ThreadExecutorMap$
    ... 1 more
As the error is not descriptive, I took a look at each executor log and saw the following message:
21/04/07 09:11:24 ERROR OneForOneBlockFetcher: Failed while starting block fetches Connection from /XXX.XX.XX.XX:4048 closed
And in the executor that seems to be unable to connect, I see the following error message:
21/04/06 09:30:46 ERROR SparkThreadLocalCapturingRunnable: Exception in thread Task reaper-7
org.apache.spark.SparkException: Killing executor JVM because killed task 5912 could not be stopped within 60000 ms.
    at org.apache.spark.executor.Executor$
    at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.$anonfun$run$1(SparkThreadLocalForwardingThreadPoolExecutor.scala:104)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$
    at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:68)
    at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured$(SparkThreadLocalForwardingThreadPoolExecutor.scala:54)
    at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:101)
    ...
    at java.util.concurrent.ThreadPoolExecutor.runWorker(
    at java.util.concurrent.ThreadPoolExecutor$
    ...
I have tried increasing the default shuffle parallelism (from 200 to 1200, as suggested here: Spark application kills executor), and the job seems to run for longer, but it fails again.
I have tried to monitor the Spark UI while the job is running:
But as you can see, the problem is the same: some stages are failing because an executor is unreachable because a task has failed more than X times.
The big delta that I mentioned above has more or less 4-5 billion rows and the big dump that I want to merge has, more or less, 100 million rows. The table is not partitioned (yet), so the process is very work-intensive. What is failing is the merge part, not the process that copies the data from SQL Server to ADLS, so the merge runs once the data to be merged is already in Parquet format.
Any idea what is happening or what I can do to finish this merge?
Thanks in advance.
Answered 2021-Apr-12 at 07:56
Finally, I reviewed the cluster and changed the spark.sql.shuffle.partitions property to 1600 in the code of the job that I wanted to execute with this configuration (instead of changing it directly on the cluster). My cluster has 400 cores, so I chose a multiple of that number (1600).
After that, the execution finished in two hours. I came to this conclusion because I observed a lot of disk spilling in my logs and in the Spark UI, so I thought that the partitions weren't fitting in the worker nodes.
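The sizing rule in this answer can be sketched as follows. This is a minimal illustration, not the poster's actual job code: the helper names shuffle_partitions and apply_shuffle_partitions and the factor parameter are hypothetical, and spark is assumed to be an existing SparkSession.

```python
def shuffle_partitions(workers, cores_per_worker, factor=4):
    """Return a shuffle partition count that is a multiple of the total core count.

    `factor` is an assumed tuning knob: 4 reproduces the 1600 used in the
    answer for a cluster of 25 workers with 16 cores each (400 cores).
    """
    return workers * cores_per_worker * factor


def apply_shuffle_partitions(spark, partitions):
    """Set the property for this job's session only, not cluster-wide.

    `spark` is assumed to be an existing SparkSession.
    """
    spark.conf.set("spark.sql.shuffle.partitions", str(partitions))


partitions = shuffle_partitions(workers=25, cores_per_worker=16)
print(partitions)
```

Setting the value through the session keeps the change local to the job that needs it, which matches the choice described above; the right factor depends on how much data each partition ends up holding.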
I am trying to run a simple spark job on a kubernetes cluster. I deployed a pod that starts a pyspark shell and in that shell I am changing the spark configuration as specified below:
Answered 2021-Feb-01 at 09:42
I don't have much experience with PySpark, but I once set up Java Spark to run on a Kubernetes cluster in client mode, like you are trying now, and I believe the configuration should mostly be the same.
First of all, you should check if the headless service is working as expected or not. First with a:
I am trying to run the Spark shell in pseudo-distributed mode on my Windows 10 PC with 8 GB of RAM. I am able to submit and run a MapReduce word count on YARN, but when I try to start a Spark shell or spark-submit any program with YARN as the master, it fails with a "failed to send RPC" error. The error is given below.
Below is my yarn-site.xml config
Answered 2020-Dec-26 at 20:08
Caused by: java.lang.NoSuchMethodError:
... 21 more
I am learning PySpark and came across this error. I have been stuck on it for a few hours now. I have seen many questions on StackOverflow, but most of the answers suggest increasing either the driver memory or the executor memory. I tried this too but couldn't get it working. If anybody here has experienced such an error, your help is much appreciated.
The same code is working if I have a smaller dataset, but when I use a large dataset this error comes up again.
My laptop configurations:
Answered 2020-Oct-13 at 14:04
This is your culprit: sorted_df.coalesce(1).write.json(output, mode='overwrite'). You have to understand the repercussions that arise when repartitioning or coalescing to a single partition. Now, all your data will have to be transferred to a single worker in order to write it to a single file. You could try it with repartition(1) instead of coalesce(1). Plus, not sure why you need print(
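A minimal sketch of the two write paths discussed in this answer. The function name write_json and the single_file flag are hypothetical, and df is assumed to be a PySpark DataFrame such as sorted_df from the question. By default the DataFrame is written with its existing partitioning (several part files), so no single worker has to hold all the data; the single-file path uses repartition(1) as the answer suggests.

```python
def write_json(df, output, single_file=False):
    """Write `df` as JSON to `output`, overwriting any existing data.

    `df` is assumed to be a PySpark DataFrame (hypothetical helper).
    """
    if single_file:
        # repartition(1) inserts a shuffle before the write. All rows still
        # end up in one partition, and a shuffle does not guarantee row order,
        # so re-sort within the partition if the order in the file matters.
        df = df.repartition(1)
    # With single_file=False the output directory holds one part file per partition.
    df.write.json(output, mode="overwrite")
```

For large datasets the default path is usually the safer choice, since the write stays distributed across the executors.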
I have a join between cleanDF and sentiment_df using array_contains that works fine (from solution 61687997). I need to include in the Result df a new column ('Year') from cleanDF.
This is the join:
Answered 2020-May-26 at 01:04
The Year column might have null values, and because of that it is failing with a Caused by: java.lang.NullPointerException exception. Filter all null values from Year
I am currently experimenting with Apache Spark. Everything seems to be working fine in that all the various components are up and running (i.e. HDFS, Spark, Yarn, etc). There do not appear to be any errors during the startup of any of these. I am running this in a Vagrant VM and Spark/HDFS/Yarn are dockerized.
tl;dr: Submitting a job via Yarn results in There are 1 datanode(s) running and 1 node(s) are excluded in this operation
Submitting my application with: $ spark-submit --master yarn --class org.apache.spark.examples.SparkPi --driver-memory 512m --executor-memory 512m --executor-cores 1 /Users/foobar/Downloads/spark-3.0.0-preview2-bin-hadoop3.2/examples/jars/spark-examples_2.12-3.0.0-preview2.jar 10
Which results in the following:
Answered 2020-May-05 at 05:14
Turns out it was a networking issue. If you look closely at what was originally posted in the question you will see the following error in the log, one that I originally missed:
I am quite new to the Spark world. Our application has a built-in Spark standalone cluster (version 2.4.3) that accepts jobs submitted by our main data engine loader application via spark-submit to the master URL.
We have 3 worker nodes on different VMs. Interestingly, because of some IOException (which I am posting in a very limited and cryptic format to avoid exposing system internals), the Master assumes it needs to resubmit the same job/application to the same worker over and over again (tens of thousands of times).
Worker App/Job Logs which is the same for every Job RE-Submission
Answered 2020-Apr-28 at 16:55
Try setting the config below
I got the error after adding the Firebase Firestore dependency to my app.
implementation ''
Compiling after adding this dependency gives the following error:
Answered 2019-Oct-31 at 14:51
Turns out the problem wasn't caused by the Firebase Firestore dependency. It worked after I increased the maximum heap size for the daemon in the global
file. I wasn't aware this had precedence over the project-level properties.
I am getting the following error only when running all the test cases using mvn test. This doesn't happen if I run each of the test classes independently from the IDE. I am using
Answered 2020-Apr-20 at 15:07
I checked your git repo
after entering in to this test case
I am having trouble starting the Spark shell against my locally running Spark standalone cluster. Any ideas? I'm running this on Spark 3.1.0-SNAPSHOT.
Starting the shell or a regular app works fine in local mode, but both fail with the command below.
Answered 2020-Apr-06 at 05:02
The problem was that the incorrect port was being used.
This line appeared in the standalone master log:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
No vulnerabilities reported
Install ThreadUtils