kdb-tree | in-memory kdb tree
kandi X-RAY | kdb-tree Summary
Community Discussions
Trending Discussions on kdb-tree
QUESTION
I am new to HDFS and Spark. I have input data for some simulations that is specific to regions (a region might be a country or part of a country) and is a function of time.
Let's assume I have the following tables:
...
ANSWER
Answered 2019-Jun-27 at 14:31
Sqoop (which is not Spark) is meant more for tables. It can use views, but it has been stated that for complex views the results may even be unreliable. So that avenue is closed.
You will need to use a spark.read JDBC connection against a view in MySQL that uses region_id as the distribution key for your parallelism, using the numPartitions approach defined on a "driving" table. The join with the other tables needs to rely on the MySQL engine.
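As a rough illustration of the partitioned JDBC read described above, the following Python sketch builds the option set that Spark's JDBC data source uses for parallel reads (`partitionColumn`, `lowerBound`, `upperBound`, `numPartitions`). The JDBC URL, the view name `region_input_view`, the credentials, and the helper names are all hypothetical; the original answer does not give them. It also assumes region_id is a numeric column, since Spark requires a numeric, date, or timestamp partition column.

```python
def jdbc_read_options(url, view, user, password,
                      min_region_id, max_region_id, num_partitions):
    """Build options for a partitioned Spark JDBC read keyed on region_id.

    All argument values are supplied by the caller; nothing here is taken
    from the original question, whose table definitions were elided.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if min_region_id > max_region_id:
        raise ValueError("min_region_id must not exceed max_region_id")
    return {
        "url": url,
        # The MySQL view that already joins the tables on the database side.
        "dbtable": view,
        "user": user,
        "password": password,
        # Spark splits the read into ranges of this column.
        "partitionColumn": "region_id",
        # The bounds only set the partition stride; they do not filter rows.
        "lowerBound": str(min_region_id),
        "upperBound": str(max_region_id),
        "numPartitions": str(num_partitions),
    }


def read_regions(spark, options):
    """Run the read on an existing SparkSession (needs a MySQL JDBC driver)."""
    return spark.read.format("jdbc").options(**options).load()


# Hypothetical connection details, for illustration only.
options = jdbc_read_options(
    url="jdbc:mysql://db-host:3306/simulations",
    view="region_input_view",
    user="spark_reader",
    password="change-me",
    min_region_id=1,
    max_region_id=200,
    num_partitions=8,
)
```

Calling `read_regions(spark, options)` would then issue one query per partition, each covering a range of region_id values. Because ranges are cut by stride rather than per distinct value, a single partition may hold several regions, which is consistent with the caveat about one-to-one mapping.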
I am not privy to your processing, but it seems hard to enforce a one-to-one mapping of region_id to partition. Moreover, more than one partition may exist on the same node, with each partition processed independently.
You could read all the tables independently and then JOIN them, but there would be shuffling, as there is no way to guarantee that the results of the individual reads end up on the same node.
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported