spark-cassandra-connector | DataStax Spark Cassandra Connector
kandi X-RAY | spark-cassandra-connector Summary
DataStax Spark Cassandra Connector
Trending Discussions on spark-cassandra-connector
QUESTION
I'm trying to use directJoin with the partition keys, but when I run the engine it doesn't use directJoin. I would like to understand if I am doing something wrong. Here is the code I used:
Configuring the settings:
...ANSWER
Answered 2022-Mar-31 at 14:35

I've seen this behavior in some versions of Spark; unfortunately, changes in Spark's internals often break this functionality because it relies on internal details. So please provide more information on which versions of Spark and the Spark connector are used.
Regarding the second error, I suspect that the direct join may not use Spark SQL properties. Can you try to use spark.cassandra.connection.host, spark.cassandra.auth.password, and the other configuration parameters?
P.S. I have a long blog post on using DirectJoin, but it was tested on Spark 2.4.x (and maybe on 3.0, I don't remember).
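To illustrate the suggestion above, here is a minimal sketch in Scala; the host, credentials, keyspace, and table names ("ks", "tbl", "pk") are all hypothetical, and it assumes the connector jar is already on the classpath:

```scala
import org.apache.spark.sql.SparkSession

// Set the connector properties directly on the SparkSession (rather than as
// plain Spark SQL properties) and register the Catalyst extensions that
// enable the direct-join optimization.
val spark = SparkSession.builder()
  .appName("direct-join-check")
  .config("spark.cassandra.connection.host", "127.0.0.1")
  .config("spark.cassandra.auth.username", "cassandra")
  .config("spark.cassandra.auth.password", "cassandra")
  .config("spark.sql.extensions",
    "com.datastax.spark.connector.CassandraSparkExtensions")
  .getOrCreate()

val cassandraTable = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "ks", "table" -> "tbl"))
  .load()

// A join on the full partition key is the candidate for conversion to a
// direct join; inspect the physical plan to verify whether it was applied.
// smallDf.join(cassandraTable, Seq("pk")).explain()
```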
QUESTION
Is Datastax Cassandra community edition integration with Spark community edition using spark-cassandra-connector community edition node aware or is this feature reserved for Enterprise editions only?
By node awareness I mean whether Spark will send job execution to the nodes owning the data.
...ANSWER
Answered 2022-Feb-09 at 16:51

Yes, the Spark connector is node-aware and will function in that manner with both DSE and (open source) Apache Cassandra.
In fact, on a SELECT it knows how to hash the partition keys to a token, and it sends queries on specific token ranges only to the nodes responsible for that data. It can do this because (like the Cassandra Java driver) it has a window into node-to-node gossip and can see things like node status (up/down) and token range assignment.
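For the RDD API the connector exposes this locality explicitly; a minimal sketch, assuming a hypothetical keyspace "ks", table "tbl" keyed by an integer partition key, and a spark-shell sc in scope:

```scala
import com.datastax.spark.connector._

// repartitionByCassandraReplica moves each key to a Spark partition that is
// co-located with a replica owning it, so the subsequent
// joinWithCassandraTable lookups stay node-local.
val keys = sc.parallelize(1 to 100).map(i => Tuple1(i))

val joined = keys
  .repartitionByCassandraReplica("ks", "tbl")
  .joinWithCassandraTable("ks", "tbl")
```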
QUESTION
I am very new to Apache Spark, and I just have to fetch a table from a Cassandra database. Below I have appended the data to debug the situation. Please help; thanks in advance.
Cassandra node: 192.168.56.10
Spark node: 192.168.56.10
Cassandra table to be fetched: dev.device (keyspace.table_name)
Accessing pyspark with a connection to Cassandra:
...ANSWER
Answered 2021-Dec-27 at 11:08

You can't use a connector compiled for Scala 2.11 with Spark 3.2.0, which is compiled with Scala 2.12. You need to use the appropriate version; right now it's 3.1.0, with coordinates com.datastax.spark:spark-cassandra-connector_2.12:3.1.0.
P.S. Please note that although basic functionality will work, more advanced functionality won't work until SPARKC-670 is fixed (see this PR).
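A quick way to confirm from a running Spark session which Scala build you are on before picking the connector artifact; the printed versions in the comments are only examples:

```scala
// The connector artifact suffix (_2.11 vs _2.12) must match the Scala
// version Spark itself was built with.
println(spark.version)                       // e.g. 3.2.0
println(scala.util.Properties.versionString) // e.g. "version 2.12.15"
```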
QUESTION
I am a complete beginner to all this stuff, so pardon me if I'm missing some totally obvious step. I installed Spark 3.1.2 and Cassandra 3.11.11, and I'm trying to connect the two through a guide I found, where I made a fat jar for execution. In the guide, when they execute the spark-shell command with the jar file, there's a line which occurs at the start:
INFO SparkContext: Added JAR file:/home/chbatey/dev/tmp/spark-cassandra-connector/spark-cassandra-connector-java/target/scala-2.10/spark-cassandra-connector-java-assembly-1.2.0-SNAPSHOT.jar at http://192.168.0.34:51235/jars/spark-
15/01/26 16:16:10 INFO SparkILoop: Created spark context..
I followed all of the steps properly, but no line like that shows up in my shell. To confirm that the jar hasn't been added, I tried the sample program on that website, and it throws an error:
java.lang.NoClassDefFoundError: com/datastax/spark/connector/util/Logging
What should I do? I'm using spark-cassandra-connector 3.1.0.
...ANSWER
Answered 2021-Nov-30 at 07:28

You don't need to compile it yourself; just follow the official documentation and use --packages to automatically download all dependencies:
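The exact command was truncated above; a sketch of what such an invocation typically looks like (the version shown is an example, so pick the build matching your Spark and Scala versions):

```scala
// Launched from the command line, for example:
//   spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.12:3.1.0
// Inside the shell, this import resolves only if the connector actually
// made it onto the classpath.
import com.datastax.spark.connector._
```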
QUESTION
Although the latest spark-cassandra-connector from DataStax states that it supports reading/writing TTL and WRITETIME, I am still receiving a SQL undefined-function error.
I am using Databricks with the library com.datastax.spark:spark-cassandra-connector-assembly_2.12:3.1.0 and a Spark config for CassandraSparkExtensions on a 9.1 LTS ML (includes Apache Spark 3.1.2, Scala 2.12) cluster. CQL version 3.4.5.
...ANSWER
Answered 2021-Nov-13 at 13:22

If you just added the Spark Cassandra Connector via the Clusters UI, then it will not work. The reason is that libraries are installed into the cluster after Spark has already started, so the class specified in spark.sql.extensions isn't found.
To fix this you need to put the jar file onto the cluster nodes before Spark starts. You can do that with a cluster init script that downloads the jar directly, with something like this (but it will download multiple copies, one for each node):
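The init script itself is elided here. Once the jar is in place at startup, the functions the extensions contribute become available in SQL; a minimal sketch of the usage, where the catalog name "cass", keyspace "ks", and table "tbl" are hypothetical:

```scala
// Assumes the cluster was started with these Spark confs already set:
//   spark.sql.extensions    com.datastax.spark.connector.CassandraSparkExtensions
//   spark.sql.catalog.cass  com.datastax.spark.connector.datasource.CassandraCatalog
// ttl() and writetime() are the functions that were previously "undefined".
val df = spark.sql(
  """SELECT key, value, writetime(value) AS wt, ttl(value) AS expires_in
    |FROM cass.ks.tbl""".stripMargin)
df.show()
```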
QUESTION
A small question regarding IntelliJ and the Maven pom.xml, please.
In several online tutorials, I saw that when the user is working in IntelliJ with Maven and has a dependency whose version is outdated, they get a nice warning.
For instance, in this piece of pom:
...ANSWER
Answered 2021-Nov-02 at 00:53

As y.bedrov mentioned (all credit to him), it is under View -> Tool Windows -> Dependencies.
QUESTION
I am trying to get data from Cassandra as DataFrames by using the spark-cassandra-connector, but I am getting the exception below.
Note: the connection to Cassandra is successful.
Spark version: 2.4.1
spark-cassandra-connector version: 2.5.1
ANSWER
Answered 2021-Oct-01 at 09:03

The error you posted indicates that the embedded Java driver is not able to generate a query plan, that is, the list of Cassandra nodes to connect to as coordinators. There is possibly an issue with how you've defined the contact points.
You normally need to specify a contact point with the spark.cassandra.connection.host parameter. Here's an example of how you would start a Spark shell using the connector:
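The original example was truncated; a sketch of what the invocation and a subsequent read typically look like, with the contact point, keyspace, and table as placeholders and the connector version taken from the question:

```scala
// Shell started along these lines (command shown as a comment; Spark 2.4.x
// ships with Scala 2.11, hence the _2.11 artifact):
//   spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.11:2.5.1 \
//     --conf spark.cassandra.connection.host=10.0.0.1
// Then a DataFrame read against a placeholder table:
val df = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "ks", "table" -> "tbl"))
  .load()
df.show(10)
```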
QUESTION
I'm new to Cassandra and PySpark. Initially I installed Cassandra 3.11.1, OpenJDK 1.8, PySpark 3.x, and Scala 1.12. I was getting a lot of errors, as shown below, after running my Python server.
...ANSWER
Answered 2021-Sep-21 at 19:20

You are using the wrong version of the Cassandra connector. If you are using pyspark 3.x, then you need to get the corresponding version, 3.0 or 3.1. Your version is for old versions of Spark:
QUESTION
Create table -
...ANSWER
Answered 2021-Aug-11 at 10:16

It's an interesting change in 2.5.x that I wasn't aware of: you now need to have the correct row size even if keyColumns is specified. It worked without that before, so it looks like a bug to me.
You need to leave only the primary key when deleting the whole row. Change the delete to the following:
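The corrected call was truncated above; a minimal sketch of a whole-row delete, with the keyspace "ks", table "tbl", and partition key column "pk" as assumed names and a spark-shell sc in scope:

```scala
import com.datastax.spark.connector._

// Carry only the primary-key columns in the RDD and name just those columns
// in keyColumns; whole rows matching the keys are then deleted.
case class Key(pk: Int)

sc.parallelize(Seq(Key(1), Key(2)))
  .deleteFromCassandra("ks", "tbl", keyColumns = SomeColumns("pk"))
```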
QUESTION
I created Docker containers in which I installed Apache Spark 3.1.2 (Hadoop 3.2), hosting a ThriftServer that is configured to access Cassandra via the spark-cassandra-connector (3.1.0). Each of these services runs in its own container, so I have 5 containers up (1x Spark master, 2x Spark worker, 1x Spark ThriftServer, 1x Cassandra), configured to live in the same network via docker-compose.
I use the beeline client from Apache Hive (1.2.1) to query the database. Everything works fine, except for querying a field in Cassandra with the type timestamp.
ANSWER
Answered 2021-Aug-02 at 14:53

I found this JIRA, which mentions that there was a bug converting times that is not fixed in 3.1.2 (3.1.3 is not released yet) but is fixed in 3.0.3. I downgraded to Apache Spark 3.0.3 and spark-cassandra-connector 3.0.1, which seems to solve the problem for now.
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install spark-cassandra-connector
The default Scala version for Spark 3.0+ is 2.12; please choose the appropriate build. See the FAQ for more information.
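For an sbt build, the dependency line looks like the sketch below; the version is an example, and the %% operator picks the artifact matching your project's Scala version:

```scala
// build.sbt (sketch): %% appends the Scala suffix (e.g. _2.12) automatically.
libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "3.1.0"
```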