job-server

by zanphp | PHP | Version: Current | License: No License

kandi X-RAY | job-server Summary

job-server is a PHP library. It has no reported bugs or vulnerabilities, and it has low support. You can download it from GitHub.

Support

job-server has a low active ecosystem.

It has 4 star(s) with 1 fork(s). There are 7 watchers for this library.

It had no major release in the last 6 months.

job-server has no issues reported. There are no pull requests.

It has a neutral sentiment in the developer community.

The latest version of job-server is current.

Quality

job-server has no bugs reported.

Security

job-server has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

job-server does not have a standard license declared.

Check the repository for any license declaration and review the terms closely.

Without a license, all rights are reserved, and you cannot use the library in your applications.

Reuse

job-server releases are not available. You will need to build from source code and install.

Installation instructions are not available. Examples and code snippets are available.

Top functions reviewed by kandi - BETA

kandi has reviewed job-server and discovered the below as its top functions. This is intended to give you an instant insight into job-server implemented functionality, and help decide if they suit your requirements.

Parse field from string
Initialize the worker
Parse command line arguments
Decode a job
Load job data
Register the middleware
Parse header array
Get random port
Yield a job
Called when a job has finished

job-server Key Features

No Key Features are available at this moment for job-server.

job-server Examples and Code Snippets

No Code Snippets are available at this moment for job-server.

Community Discussions

Trending Discussions on job-server

Running a Golang apache Beam pipeline on Spark

It's possible to configure the Beam portable runner with the spark configurations?

java.lang.UnsatisfiedLinkError: no mesos in java.library.path

KafkaRecord cannot be cast to [B

Impala vs Spark performance for ad hoc queries

Scala Reflection exception during creation of DataSet in Spark

How to import spark.jobserver.SparkSessionJob

php gearman worker function no synchronized with doctrine

What is the SBT := operator in build.sbt?

Querying data produced by SPARK job via RESTful API

QUESTION

Running a Golang apache Beam pipeline on Spark

Asked 2021-Mar-10 at 02:46

I created a simple Go Apache Beam pipeline, and it works well with the DirectRunner. I tried to deploy it on a Spark cluster using the following command:

./bin/spark-submit --master=spark://vm:7077 main.go --runner=SparkRunner --job_endpoint=localhost:8099 --artifact_endpoint=localhost:8098 --environment_type=LOOPBACK --output=/tmp/output

Before submitting the application, I started the job endpoint using the following command:

./gradlew :runners:spark:job-server:runShadow -PsparkMasterUrl=spark://localhost:7077

The job fails on Spark with this error:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.spark.SparkException: Failed to get main class in JAR with error 'null'. Please specify one with --class.

It seems that I need to specify the class argument, but I do not understand what the error means. Can I get help?

...

ANSWER

Answered 2021-Mar-10 at 02:46

spark-submit is a Spark utility that accepts either a Java JAR or a Python script. It doesn't know how to run Go programs.

I updated the Beam Go quickstart guide with instructions for the Spark runner. Let me know if that works for you.

Source https://stackoverflow.com/questions/66497579

QUESTION

It's possible to configure the Beam portable runner with the spark configurations?

Asked 2021-Mar-04 at 19:36

TLDR;

Is it possible to configure the Beam portable runner with Spark configurations? More precisely, is it possible to configure spark.driver.host in the portable runner?

Motivation

Currently, we have Airflow running in a Kubernetes cluster, and since we aim to use TensorFlow Extended, we need to use Apache Beam. For our use case, Spark would be the appropriate runner, and as Airflow and TensorFlow are written in Python, we would need to use Apache Beam's portable runner (https://beam.apache.org/documentation/runners/spark/#portability).

The problem

The portable runner creates the Spark context inside its container and leaves no room for configuring the driver's DNS name, so the executors inside the worker pods cannot communicate with the driver (the job server).

Setup

Following the Beam documentation, the job server was deployed in the same pod as Airflow so that the two containers can use the local network. Job server config:

...

ANSWER

Answered 2021-Feb-23 at 22:28

I have three solutions to choose from depending on your deployment requirements. In order of difficulty:

Use the Spark "uber jar" job server. This starts an embedded job server inside the Spark master, instead of using a standalone job server in a container. This would simplify your deployment a lot, since you would not need to start the beam_spark_job_server container at all.

Source https://stackoverflow.com/questions/66320831

QUESTION

java.lang.UnsatisfiedLinkError: no mesos in java.library.path

Asked 2020-May-21 at 20:38

I am trying to configure Spark job-server for Mesos cluster deployment mode. I have set spark.master = "mesos://mesos-master:5050" in the jobserver config.

When I try to create a context on job-server, it fails with the following exception:

...

ANSWER

Answered 2017-May-29 at 08:58

I had set the MESOS_NATIVE_JAVA_LIBRARY environment variable for my user, but I was running job-server with sudo privileges, so the variable was not set in the environment job-server ran in.
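
The mismatch described in this answer can be checked from inside whatever process launches job-server. The following is a minimal sketch and not part of job-server or Mesos: the helper name `check_mesos_library` is hypothetical, and only the variable name MESOS_NATIVE_JAVA_LIBRARY comes from the answer.

```python
import os


def check_mesos_library(env=None):
    """Report whether MESOS_NATIVE_JAVA_LIBRARY is usable in an environment.

    `env` defaults to the current process environment; passing a plain dict
    makes the check easy to exercise without touching real variables.
    """
    env = os.environ if env is None else env
    path = env.get("MESOS_NATIVE_JAVA_LIBRARY")
    if not path:
        return "unset"      # the variable is not visible to this process
    if not os.path.isfile(path):
        return "missing"    # the variable is set, but no file exists there
    return "ok"


if __name__ == "__main__":
    print(check_mesos_library())
```

Running the script once as the regular user and once under sudo should show whether the variable reaches the privileged process.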

Source https://stackoverflow.com/questions/43492687

QUESTION

KafkaRecord cannot be cast to [B

Asked 2019-Dec-28 at 07:51

I'm trying to process the data streaming from Apache Kafka using the Python SDK for Apache Beam with the Flink runner. After running Kafka 2.4.0 and Flink 1.8.3, I follow these steps:

1) Compile and run Beam 2.16 with Flink 1.8 runner.

...

ANSWER

Answered 2019-Dec-28 at 07:51

Disclaimer: this is my first encounter with the Apache Beam project.

It seems that Kafka consumer support is quite a fresh thing in Beam (at least in the Python interface) according to this JIRA. Apparently, there is still a problem with FlinkRunner combined with this new API. Even though your code is technically correct, it will not run correctly on Flink. There is a patch available, which seems more like a quick fix than a final solution to me. It requires recompilation and thus is not something I would propose using in production. If you are just getting started with the technology and don't want to be blocked, then feel free to try it out.

Source https://stackoverflow.com/questions/59501461

QUESTION

Impala vs Spark performance for ad hoc queries

Asked 2019-Oct-31 at 12:08

I'm interested only in query performance and the architectural differences behind it. All the answers I've seen before were outdated or didn't provide me with enough context on WHY Impala is better for ad hoc queries.

Of the 3 considerations below, only the 2nd point explains why Impala is faster on bigger datasets. Could you please comment on the following statements?

1) Impala doesn't lose time on query pre-initialization, meaning impalad daemons are always running and ready. On the other hand, Spark Job Server provides a persistent context for the same purpose.
2) Impala is in-memory and can spill data to disk, with a performance penalty, when the data doesn't fit in RAM. The same is true for Spark. The main difference is that Spark is written in Scala and has JVM limitations, so workers bigger than 32 GB aren't recommended (because of GC). In turn, [wrong, see UPD] Impala is implemented in C++ and has high hardware requirements: 128-256+ GB of RAM recommended. This is very significant, but should benefit Impala only on datasets that require 32-64+ GB of RAM.
3) Impala is integrated with the Hadoop infrastructure. AFAIK the main reason to use Impala over other in-memory DWHs is the ability to run over Hadoop data formats without exporting data from Hadoop. This means Impala usually uses the same storage/data/partitioning/bucketing as Spark can use, and does not gain any extra benefit from data structure compared to Spark. Am I right?

P.S. Is Impala faster than Spark in 2019? Have you seen any performance benchmarks?

UPD:

Questions update:

I. Why does Impala recommend 128+ GB of RAM? What is the implementation language of each of Impala's components? The docs say that "Impala daemons run on every node in the cluster, and each daemon is capable of acting as the query planner, the query coordinator, and a query execution engine." If impalad is Java, then what parts are written in C++? Is there something between impalad and the columnar data? Are 256 GB of RAM required for impalad or for some other component?

II. Impala loses all in-memory performance benefits when it comes to cluster shuffles (JOINs), right? Does Impala have any mechanisms to boost JOIN performance compared to Spark?

III. Impala uses a Multi-Level Service Tree (something like the Dremel engine, see "Execution model" here) vs. Spark's Directed Acyclic Graph. What does MLST vs. DAG actually mean in terms of ad hoc query performance? Or is it a better fit for a multi-user environment?

...

ANSWER

Answered 2019-Oct-31 at 12:08

First off, I don't think comparison of a general purpose distributed computing framework and distributed DBMS (SQL engine) has much meaning. But if we would still like to compare a single query execution in single-user mode (?!), then the biggest difference IMO would be what you've already mentioned -- Impala query coordinators have everything (table metadata from Hive MetaStore + block locations from NameNode) cached in memory, while Spark will need time to extract this data in order to perform query planning.

The second big difference would probably be the shuffle implementation, with Spark writing temp files to disk at stage boundaries while Impala tries to keep everything in memory. This leads to a radical difference in resilience: while Spark can recover from losing an executor and move on by recomputing missing blocks, Impala will fail the entire query after a single impalad daemon crash.

Less significant performance-wise (since it typically takes much less time compared to everything else) but architecturally important is the work distribution mechanism -- compiled whole-stage codegen sent to the workers in Spark vs. declarative query fragments communicated to daemons in Impala.

As for specific query optimization techniques (query vectorization, dynamic partition pruning, cost-based optimization), they could be on par today or will be in the near future.

Source https://stackoverflow.com/questions/58598727

QUESTION

Scala Reflection exception during creation of DataSet in Spark

Asked 2019-Sep-16 at 15:20

I want to run a Spark job on Spark Jobserver. During execution, I got an exception:

stack:

java.lang.RuntimeException: scala.ScalaReflectionException: class com.some.example.instrument.data.SQLMapping in JavaMirror with org.apache.spark.util.MutableURLClassLoader@55b699ef of type class org.apache.spark.util.MutableURLClassLoader with classpath [file:/app/spark-job-server.jar] and parent being sun.misc.Launcher$AppClassLoader@2e817b38 of type class sun.misc.Launcher$AppClassLoader with classpath [.../classpath jars/] not found.

at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:123)
at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:22)
at com.some.example.instrument.DataRetriever$$anonfun$combineMappings$1$$typecreator15$1.apply(DataRetriever.scala:136)
at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:232)
at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:232)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:49)
at org.apache.spark.sql.Encoders$.product(Encoders.scala:275)
at org.apache.spark.sql.LowPrioritySQLImplicits$class.newProductEncoder(SQLImplicits.scala:233)
at org.apache.spark.sql.SQLImplicits.newProductEncoder(SQLImplicits.scala:33)
at com.some.example.instrument.DataRetriever$$anonfun$combineMappings$1.apply(DataRetriever.scala:136)
at com.some.example.instrument.DataRetriever$$anonfun$combineMappings$1.apply(DataRetriever.scala:135)
at scala.util.Success$$anonfun$map$1.apply(Try.scala:237)
at scala.util.Try$.apply(Try.scala:192)
at scala.util.Success.map(Try.scala:237)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

In DataRetriever, I convert a simple case class to a Dataset.

case class definition:

...

ANSWER

Answered 2018-Mar-28 at 10:34

Calling toDS() inside a Future was causing the ScalaReflectionException.

I decided to construct the Dataset outside future.map.

You can verify that a Dataset can't be constructed in future.map with this example job.

Source https://stackoverflow.com/questions/49509784

QUESTION

How to import spark.jobserver.SparkSessionJob

Asked 2019-Jun-18 at 16:14

I have added the job-server 0.9.0 dependencies in build.sbt by adding

...

ANSWER

Answered 2019-Jun-18 at 16:14

I figured it out: SparkSessionJob is located in job-server-extras, so I just have to add

Source https://stackoverflow.com/questions/56652844

QUESTION

php gearman worker function no synchronized with doctrine

Asked 2018-May-21 at 11:10

I observed a strange behavior of my Doctrine object. In my Symfony project, I'm using the Doctrine ORM to save my data in a MySQL database. This works normally in most situations. I'm also using Gearman in my project, a framework that allows applications to complete tasks in parallel. I have a Gearman job-server running on the same machine as my Apache server, and I have registered a Gearman worker on the same machine in a separate 'screen' session using the screen window manager. This way, I always have access to the standard console output of the function registered for the Gearman worker.

In the Gearman worker function I'm invoking, I have access to the Doctrine object via $doctrine = $this->getContainer()->get('doctrine'), and it works almost normally. But when I change some data in my database, Doctrine still uses the old data that was stored in the database before. I'm totally confused, because I expected that by calling:

$repo = $doctrine->getRepository("PackageManagerBundle:myRepo");
$dbElement = $repo->findOneById($Id);

I would always get the current data entries from my database. This looks like strange caching behavior, but I have no clue what I've done wrong.

I can solve this problem by registering the Gearman worker and function anew:

$worker = new \GearmanWorker();
$worker->addServer();
$worker->addFunction

After that, I get the current state of my database back, until I change something else. I'm observing this behavior only in my Gearman worker function. In the rest of the application, everything is synchronized with my database and works normally.

...

ANSWER

Answered 2018-May-21 at 11:10

This is what I think may be happening. Could be wrong though.

A gearman worker is going to be a long-running process that picks up jobs to do. The first job it gets will then cause doctrine to load the entity into its object map from the database. But, for the second job the worker receives, doctrine will not perform a database lookup; it will instead check its identity map, find it already has the object loaded, and simply return the one from memory. If something else, external to the worker process, has altered the database record then you'll end up with an object that is out of date.

You can tell doctrine to drop objects from its identity map, then it will perform a database lookup. To enforce loading objects from the database again instead of serving them from the identity map, you should use EntityManager#clear().
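
The stale-read behavior described in this answer can be reproduced with a toy model. This is a minimal sketch in Python, not Doctrine's actual implementation: `FakeDatabase` and this `EntityManager` class are hypothetical stand-ins, and only the idea of an identity map that is emptied by clear() comes from the answer.

```python
class FakeDatabase:
    """Stands in for the real database; rows are keyed by id."""

    def __init__(self):
        self.rows = {}


class EntityManager:
    """Toy entity manager with an identity map (hypothetical, not Doctrine's API)."""

    def __init__(self, db):
        self.db = db
        self.identity_map = {}

    def find(self, entity_id):
        # Only hit the database if the entity was never loaded before.
        if entity_id not in self.identity_map:
            self.identity_map[entity_id] = dict(self.db.rows[entity_id])
        return self.identity_map[entity_id]

    def clear(self):
        # Forget every loaded entity so the next find() reads the database.
        self.identity_map.clear()


db = FakeDatabase()
db.rows[1] = {"status": "new"}
em = EntityManager(db)

first = em.find(1)["status"]         # first job: loaded from the database
db.rows[1] = {"status": "changed"}   # another process updates the row
stale = em.find(1)["status"]         # second job: served from memory
em.clear()
fresh = em.find(1)["status"]         # after clear(): reloaded from the database
print(first, stale, fresh)           # prints: new new changed
```

In a long-running worker, the analogous step would be to clear the entity manager once per job, so that each job starts from the database state rather than from objects cached by earlier jobs.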

More info here: https://www.doctrine-project.org/projects/doctrine-orm/en/2.6/reference/working-with-objects.html#entities-and-the-identity-map

Source https://stackoverflow.com/questions/50446961

QUESTION

What is the SBT := operator in build.sbt?

Asked 2018-Feb-08 at 11:46

I'm new to Scala and SBT. I noticed an unfamiliar operator in the build.sbt of an open source project:

:=

Here are a couple examples of how it's used:

...

ANSWER

Answered 2018-Feb-08 at 11:46

The := has essentially nothing to do with the ordinary assignment operator =. It's not a built-in Scala operator, but rather a family of methods/macros called :=. These methods (or macros) are members of classes such as SettingKey[T] (similarly for TaskKey[T] and InputKey[T]). They consume the right hand side of the key := value expression, and return instances of type Def.Setting[T] (or similarly, Tasks), where T is the type of the value represented by the key. They are usually written in infix notation. Without syntactic sugar, the invocations of these methods/macros would look as follows:

Source https://stackoverflow.com/questions/48670674

QUESTION

Querying data produced by SPARK job via RESTful API

Asked 2017-Oct-11 at 18:46

I am pretty new to Spark. I have produced a file containing around 420 MB of data with a Spark job. I have a Java application which only needs to query data concurrently from that file based on certain conditions and return data in JSON format. So far I have found two RESTful APIs for Spark, but they are only for submitting Spark jobs remotely and managing Spark contexts,

...

ANSWER

Answered 2017-Oct-11 at 18:46

You can actually use Livy to get results back as friendly JSON in a RESTful way!
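
As a rough illustration of the Livy approach, the sketch below builds the three REST requests involved: create an interactive session, submit a code statement, and poll for its result. It assumes a Livy server at `http://localhost:8998` (Livy's default port); the helper names are hypothetical, and nothing is sent until `send` is called against a running server.

```python
import json
import urllib.request

LIVY_URL = "http://localhost:8998"  # assumption: local Livy on its default port


def livy_request(path, payload=None, base_url=LIVY_URL):
    """Build a JSON request for Livy: POST when a payload is given, else GET."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="GET" if payload is None else "POST",
    )


def create_session():
    # Ask Livy for an interactive Scala Spark session.
    return livy_request("/sessions", {"kind": "spark"})


def submit_statement(session_id, code):
    # Run a snippet of code inside an existing session.
    return livy_request(f"/sessions/{session_id}/statements", {"code": code})


def get_statement(session_id, statement_id):
    # Fetch the state and output of a previously submitted statement.
    return livy_request(f"/sessions/{session_id}/statements/{statement_id}")


def send(request):
    """Send a request to a running Livy server and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A client would typically call `send(create_session())`, wait for the session to become idle, submit a statement that reads and filters the file, and then poll `get_statement` until the reply carries an output.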

Source https://stackoverflow.com/questions/46688578

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install job-server

You can download it from GitHub.
PHP requires the Visual C runtime (CRT). The Microsoft Visual C++ Redistributable for Visual Studio 2019 is suitable for all these PHP versions, see visualstudio.microsoft.com. You MUST download the x86 CRT for PHP x86 builds and the x64 CRT for PHP x64 builds. The CRT installer supports the /quiet and /norestart command-line switches, so you can also script it.
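
Scripting the CRT install could look roughly like the sketch below. It is only an illustration under stated assumptions: the installer file names `vc_redist.x86.exe` and `vc_redist.x64.exe` and the helper names are assumptions, while the /quiet and /norestart switches and the x86/x64 matching rule come from the text above.

```python
import ntpath
import subprocess

# Assumption: file names of the downloaded redistributable installers.
CRT_INSTALLERS = {"x86": "vc_redist.x86.exe", "x64": "vc_redist.x64.exe"}


def crt_install_command(php_arch, installer_dir="."):
    """Return the silent-install command matching the PHP build architecture."""
    if php_arch not in CRT_INSTALLERS:
        raise ValueError(f"unsupported PHP architecture: {php_arch!r}")
    # ntpath gives Windows path joining regardless of the host OS.
    installer = ntpath.join(installer_dir, CRT_INSTALLERS[php_arch])
    return [installer, "/quiet", "/norestart"]


def install_crt(php_arch, installer_dir="."):
    """Run the installer (Windows only) and return its exit code."""
    return subprocess.run(crt_install_command(php_arch, installer_dir)).returncode
```

Keeping the command construction separate from the call that runs it makes it possible to inspect or log the exact command before anything is installed.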

Support

For any new features, suggestions, or bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
