MMEd | Micro Machines v3 level editor | Game Engine library
kandi X-RAY | MMEd Summary
Micro Machines v3 level editor
Community Discussions
Trending Discussions on MMEd
QUESTION
I am facing an issue when sorting a huge dataset (1.2 TB) based on 4 columns. Right after the sort, I also need to partition this dataset when writing the final dataset to HDFS, based on one of the columns used in the sort.
Here is a Stack Overflow post I published a few days ago describing another issue I had with the same code, with regard to joining two datasets:
I used the answer to that post to improve my code. Now the join works fine.
I tested the code without the sort and it works fine. In order to perform the sort, I thought about partitioning the data based on the four columns.
The size of one partition is 500 MB, so I end up with 1.2 TB / 500 MB = 2600 partitions.
When executing the Spark job, I get a shuffle.RetryingBlockFetcher error (see the error logs below).
My questions are:
- What is the best way to sort data in Spark so as to avoid shuffles, or at least reduce them?
- Could I correct or improve my code in order to perform the sort?
- Do I really have to sort this way? Can't I use other techniques, like a group by?
ANSWER
Answered 2019-May-13 at 08:40
Here are some suggestions for your case (a combined sketch follows the list of changes):
- Change 1: repartition based on the larger generated dataset (1.2 TB). I also removed the repartition(col("NO_NUM"), col("UHDIN"), col("HOURMV")) at this point, since it will be overwritten by the next repartition("NO_NUM") and is therefore redundant.
- Change 2: use persist to save the data we just partitioned, in order to avoid repartitioning the same dataframe over and over again (please check the links from the previous post on how this works).
- Change 3: removed uh_flag_comment.repartition(1300, col("NO_NUM")) since it seems redundant. It would be useful only if TransactionType().transform(uh) causes reshuffling, for instance by internally doing a join or groupBy; such an operation would modify the partition key we set in the previous step with repartition(2600, col("NO_NUM")).
- Change 4: repartition with col("NO_NUM"), col("UHDIN"), col("HOURMV"), since this will be the partition key used by the orderBy; the two should be identical.
- Change 5: orderBy with col("NO_NUM"), col("UHDIN"), col("HOURMV").
- Change 6: increase the executor count to 40.
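For illustration, here is a minimal sketch of the pipeline with these changes applied. It assumes the dataframe name uh and the transformer TransactionType from the question; the HDFS paths are placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().appName("sort-and-partition").getOrCreate()

// "uh" is the dataframe name used in the question; the path is a placeholder.
val uh = spark.read.parquet("hdfs:///path/to/input")

val partitioned = uh
  .repartition(2600, col("NO_NUM")) // change 1: ~1.2 TB / 500 MB = 2600 partitions
  .persist()                        // change 2: avoid repartitioning the same data twice

// Change 3: no extra repartition around the transformation
// (TransactionType comes from the asker's own code).
val transformed = TransactionType().transform(partitioned)

val sorted = transformed
  .repartition(2600, col("NO_NUM"), col("UHDIN"), col("HOURMV")) // change 4
  .orderBy(col("NO_NUM"), col("UHDIN"), col("HOURMV"))           // change 5

// Write the final dataset to HDFS, partitioned on one of the sort columns.
sorted.write
  .partitionBy("NO_NUM")
  .parquet("hdfs:///path/to/output") // placeholder output path
```

Change 6 is a submit-time setting rather than code, e.g. spark-submit --num-executors 40.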
QUESTION
I am currently facing issues when trying to join (inner) a huge dataset (654 GB) with a smaller one (535 MB) using the Spark DataFrame API.
I am broadcasting the smaller dataset to the worker nodes using the broadcast() function.
I am unable to do the join between those two datasets. Here is a sample of the errors I got:
…
ANSWER
Answered 2019-May-02 at 14:29
Here are some improvements regarding your code:
- Add a repartition based on the KEY column that you join with uh; the number of partitions should be approximately 650 GB / 500 MB ≈ 1300.
- Apply filtering on your datasets before joining them; in your case, just execute the where clauses before the join statement.
- Optionally cache the small dataset.
- Make sure that the small dataset really will be broadcast, i.e. you can try saving it and checking its size. Then adjust the value of spark.broadcast.blockSize accordingly, probably by increasing it.
Here is how your code could look with these changes:
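The original snippet was not preserved on this page, so what follows is only a minimal sketch of the suggested shape. The names are assumptions for illustration: uh for the 654 GB dataset, small for the 535 MB one, KEY for the join column; the filter predicates and paths are placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{broadcast, col}

val spark = SparkSession.builder().appName("broadcast-join").getOrCreate()

// Placeholder paths; "uh" is the large (~654 GB) dataset, "small" the ~535 MB one.
val uh = spark.read.parquet("hdfs:///path/to/large")
val small = spark.read.parquet("hdfs:///path/to/small")

// Filter both sides before the join so less data is moved around.
val uhFiltered = uh.where(col("STATUS") === "OK")       // placeholder predicate
val smallFiltered = small.where(col("STATUS") === "OK") // placeholder predicate
  .cache()                                              // optionally cache the small side

val joined = uhFiltered
  .repartition(1300, col("KEY"))              // ~650 GB / 500 MB = 1300 partitions
  .join(broadcast(smallFiltered), Seq("KEY")) // hint Spark to broadcast the small side
```

If the broadcast itself struggles, spark.broadcast.blockSize can be raised at submit time, e.g. --conf spark.broadcast.blockSize=8m.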
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install MMEd
Check out the source code
Get a copy of the "PSX" emulator from [http://www.emulator-zone.com/doc.php/psx/psx_em.html](http://www.emulator-zone.com/doc.php/psx/psx_em.html)
Get a copy of the files from the CD onto your PC
Run MMEd (needs .NET 2.0); optionally you can do this from within Visual Studio 2005 - this is a bit slower but helps if it crashes!
Click File > New (CTRL+N). Browse for your copy of the CD image, and select "BREAKY1 - CHEESEY JUMPS" from the course dropdown.
Expand "MMv3 Level" and then the "SHET" chunk in the tree view, and select "201 bkftable" (which is the "Flat" with the ommer on it).
Select the "Flat" viewer pane. It will show you details of the selected chunk, including a list of weapons.
Change "Ommer" to "Mines" in the dropdown in the middle of the page.
Click "Commit" to save your changes to the file (in memory).
Click File > Save (CTRL+S). Save your file somewhere - make sure you save it as an MMEd Save File (.mmv file) for best results.
Observe that a new entry appears in the tree: "Version <Current Date/Time>". Click it to see a summary of what you changed; you can revert to this version in future if necessary.
Click File > Publish (CTRL+P). Browse for a suitable location to save your level in binary format. Check the "Update CD Image" box, browse for your copy of the CD image, make sure the correct course is selected in the dropdown, and rename the course if you wish. Click Publish.
Run the PS emulator using the modified CD image (select the BIN, not the CUE)
Navigate to Cheesy Jumps in multiplayer mode, stopping just before you click OK on the course.
Click Quick Save > Quick Save 1 (F6).
Play.
Notice that the weapon by the first corner, which was an ommer, is now mines.
…
Profit!
Although the MMEd Save File format is great because it gives you a version history, it’s always possible something will go horribly wrong. For this reason, make sure you keep a few backups of the raw MMs level binary files - these are the ones produced by the Publish option (and conveniently a few automatic backups are kept in a "Backup" folder, as per the option on the Publish screen).
Changes to course names don’t show up if you just use Quick Load - you need to reboot your virtual PS1 to get them to show up.
Use the "XML" viewer pane to do your editing by hand