eden | dotty version of paradise
kandi X-RAY | eden Summary
kandi X-RAY | eden Summary
The dotty version of paradise to interface with scala.meta.
Support
Quality
Security
License
Reuse
Top functions reviewed by kandi - BETA
Currently covering the most popular Java, JavaScript and Python libraries. See a Sample of eden
eden Key Features
eden Examples and Code Snippets
Community Discussions
Trending Discussions on eden
QUESTION
I'm trying to make sure gcc vectorizes my loops. It turns out, that by using -march=znver1
(or -march=native
) gcc skips some loops even though they can be vectorized. Why does this happen?
In this code, the second loop, which multiplies each element by a scalar is not vectorised:
...ANSWER
Answered 2022-Apr-10 at 02:47The default -mtune=generic
has -mprefer-vector-width=256
, and -mavx2
doesn't change that.
znver1 implies -mprefer-vector-width=128
, because that's all the native width of the HW. An instruction using 32-byte YMM vectors decodes to at least 2 uops, more if it's a lane-crossing shuffle. For simple vertical SIMD like this, 32-byte vectors would be ok; the pipeline handles 2-uop instructions efficiently. (And I think is 6 uops wide but only 5 instructions wide, so max front-end throughput isn't available using only 1-uop instructions). But when vectorization would require shuffling, e.g. with arrays of different element widths, GCC code-gen can get messier with 256-bit or wider.
And vmovdqa ymm0, ymm1
mov-elimination only works on the low 128-bit half on Zen1. Also, normally using 256-bit vectors would imply one should use vzeroupper
afterwards, to avoid performance problems on other CPUs (but not Zen1).
I don't know how Zen1 handles misaligned 32-byte loads/stores where each 16-byte half is aligned but in separate cache lines. If that performs well, GCC might want to consider increasing the znver1 -mprefer-vector-width
to 256. But wider vectors means more cleanup code if the size isn't known to be a multiple of the vector width.
Ideally GCC would be able to detect easy cases like this and use 256-bit vectors there. (Pure vertical, no mixing of element widths, constant size that's am multiple of 32 bytes.) At least on CPUs where that's fine: znver1, but not bdver2 for example where 256-bit stores are always slow due to a CPU design bug.
You can see the result of this choice in the way it vectorizes your first loop, the memset-like loop, with a vmovdqu [rdx], xmm0
. https://godbolt.org/z/E5Tq7Gfzc
So given that GCC has decided to only use 128-bit vectors, which can only hold two uint64_t
elements, it (rightly or wrongly) decides it wouldn't be worth using vpsllq
/ vpaddd
to implement qword *5
as (v<<2) + v
, vs. doing it with integer in one LEA instruction.
Almost certainly wrongly in this case, since it still requires a separate load and store for every element or pair of elements. (And loop overhead since GCC's default is not to unroll except with PGO, -fprofile-use
. SIMD is like loop unrolling, especially on a CPU that handles 256-bit vectors as 2 separate uops.)
I'm not sure exactly what GCC means by "not vectorized: unsupported data-type". x86 doesn't have a SIMD uint64_t
multiply instruction until AVX-512, so perhaps GCC assigns it a cost based on the general case of having to emulate it with multiple 32x32 => 64-bit pmuludq
instructions and a bunch of shuffles. And it's only after it gets over that hump that it realizes that it's actually quite cheap for a constant like 5
with only 2 set bits?
That would explain GCC's decision-making process here, but I'm not sure it's exactly the right explanation. Still, these kinds of factors are what happen in a complex piece of machinery like a compiler. A skilled human can easily make smarter choices, but compilers just do sequences of optimization passes that don't always consider the big picture and all the details at the same time.
-mprefer-vector-width=256
doesn't help:
Not vectorizing uint64_t *= 5
seems to be a GCC9 regression
(The benchmarks in the question confirm that an actual Zen1 CPU gets a nearly 2x speedup, as expected from doing 2x uint64 in 6 uops vs. 1x in 5 uops with scalar. Or 4x uint64_t in 10 uops with 256-bit vectors, including two 128-bit stores which will be the throughput bottleneck along with the front-end.)
Even with -march=znver1 -O3 -mprefer-vector-width=256
, we don't get the *= 5
loop vectorized with GCC9, 10, or 11, or current trunk. As you say, we do with -march=znver2
. https://godbolt.org/z/dMTh7Wxcq
We do get vectorization with those options for uint32_t
(even leaving the vector width at 128-bit). Scalar would cost 4 operations per vector uop (not instruction), regardless of 128 or 256-bit vectorization on Zen1, so this doesn't tell us whether *=
is what makes the cost-model decide not to vectorize, or just the 2 vs. 4 elements per 128-bit internal uop.
With uint64_t
, changing to arr[i] += arr[i]<<2;
still doesn't vectorize, but arr[i] <<= 1;
does. (https://godbolt.org/z/6PMn93Y5G). Even arr[i] <<= 2;
and arr[i] += 123
in the same loop vectorize, to the same instructions that GCC thinks aren't worth it for vectorizing *= 5
, just different operands, constant instead of the original vector again. (Scalar could still use one LEA). So clearly the cost-model isn't looking as far as final x86 asm machine instructions, but I don't know why arr[i] += arr[i]
would be considered more expensive than arr[i] <<= 1;
which is exactly the same thing.
GCC8 does vectorize your loop, even with 128-bit vector width: https://godbolt.org/z/5o6qjc7f6
QUESTION
from bs4 import BeautifulSoup
import requests
cont = requests.get("https://ichi.pro/tr/veri-biliminde-uzaklik-olculeri-159983401462266").content
soup = BeautifulSoup(cont,"html.parser")
metin = soup.text
import re
sonuç1 = re.search("1. Öklid Mesafesi",metin)
sonuç2 = re.search('Okuduğunuz için teşekkürler!',metin)
metin = metin[sonuç1.start():sonuç2.start()].split("\n")
from gensim.models import Word2Vec
model = Word2Vec(metin,size=200,window=15,min_count=5,sg=5)
model.wv["Jaccard mesafesi"]
...ANSWER
Answered 2022-Mar-21 at 15:27There are multiple problems:
If you want a Turkish model, you can try to find a pretrained Word2Vec model for Turkish (e.g. check out this repository) or train a model for Turkish yourself. The way you use it now you seem to train a model but only from a single website, which will barely do anything because the model needs a large amount of sentences to learn anything (like at least 10.000, better much more). Also you set
min_count=5
anyway, so any word appearing less than 5 times is ignored generally. Try something like training it on the Turkish Wikipedia, see the linked repository.Word2Vec by default is a unigram model, so the input is a single word. If you hand it a bigram consisting of two words like
"Jaccard mesafesi"
it will not find anything. Also you should catch the case that the word is not in vocabulary, otherwise each unknown word will cause an error and your program to cancel. Search for the unigram representation of each token, then combine the two, e.g. by using the statistical mean of the vectors:
QUESTION
I'm monitoring a WildFly 10.1.0.Final Java process via SNMP port (configuring the com.sun.management.snmp
properties) in a Linux box.
The problem is the reported max values for Eden and Survivor Spaces are zero.
...ANSWER
Answered 2022-Mar-11 at 04:46The problem here is that jvmMemPoolMaxSize
(OID .1.3.6.1.4.1.42.2.145.3.163.1.1.2.110.1.13
) is defined in JVM-MANAGEMENT-MIB as type JvmUnsigned64TC
, which seems kind of strange because Java specifically forbids the use of unsigned integer types.
The description for jvmMemPoolMaxSize
implies that it's meant to represent the value returned by java.lang.management.MemoryPoolMXBean.getUsage().getMax()
. The documentation for that method says "This method returns -1 if the maximum memory size is undefined."
The description of jvmMgmMIB
addresses the issue with this explanation:
Where the Java programming language API uses long, or int, the MIB often uses the corresponding unsigned quantity - which is closer to the object semantics.
In those cases, it often happens that the -1 value that might be used by the API to indicate an unknown/unimplemented value cannot be used. Instead the MIB uses the value 0, which stricly speaking cannot be distinguished from a valid value. In many cases however, a running system will have non-zero values, so using 0 instead of -1 to indicate an unknown quantity does not lose any functionality.
I think it's safe to say this is one of the cases where a zero is not valid, so you should take it to mean that there is no defined maximum.
QUESTION
I am trying to write a script to pull NFTs by candy machine id, but it is either failing or returning an empty array each time.
I am using the genesysgo mainnet rpc.
Here is the relevant code.
...ANSWER
Answered 2022-Mar-07 at 08:30You are probably using the wrong CMID. Find one of your NFTs on solscan and use the first verified creator ID in the metadata as your CMID. (as seen below).
This would explain why the ID also returns an empty array on the magic eden and pentacles sites.
This address is not the same as the CMID in your .cache file.
QUESTION
I get Nfts on magic eden with this third party api. http://api-mainnet.magiceden.io/rpc/getGlobalActivitiesByQuery?q=%7B%22%24match%22%3A%7B%22txType%22%3A%22initializeEscrow%22%2C%22blockTime%22%3A%7B%22%24gt%22%3A1643468983%7D%7D%2C%22%24sort%22%3A%7B%22blockTime%22%3A-1%7D%7D
It responses with results on postman and browser but causes 403 error with axios in node.js.
How can I get data in node.js?
...ANSWER
Answered 2022-Feb-14 at 07:02It can be because of CORS error. You can use Cors proxy to fix it. Try it,please.
QUESTION
I'm trying to scrappe Magic Eden, specifically the page Collections where I want to get all the collections that exist in the page. I think I'm halfway but I can't figure out the two questions below.
...ANSWER
Answered 2022-Jan-24 at 00:07If the page has an ending and loads all at once instead of using a while loop and scrolling down the page I would just use a
QUESTION
I change GC for application server. Now I use G1 GC. I have 30 GB RAM. For initial testing I set only Xms and Xmx values to be the same 23040 mb.
Settings I use:
...ANSWER
Answered 2022-Jan-20 at 19:38The allocated size listed in that table includes Metaspace. Metaspace is memory pool separate from the java object heap. Therefore the sum of heap and metaspace can exceed the maximum heap size.
QUESTION
How can I get the value of the allocated and free memory of the following YoungGen (Eden, Survivor0,Survivor1), OldGen areas from Java?
I see that the tomcat page displays this information, how can I get it from the java code?
About maxMemory()
, totalMemory()
, freeMemory()
, I know, But it is not entirely clear how to get the value of exactly memory areas, as Tomcat does, for example.
ANSWER
Answered 2022-Jan-17 at 16:37You can use the Management API, most notably the MemoryMXBean
.
For example
QUESTION
As part of the question in Java 11 GC logging I am struggling to understand what the numbers actually mean.
For example:
...ANSWER
Answered 2022-Jan-15 at 13:06what is the unit here
A region. Region size varies based on heap size or an explicit setting.
it also looks like this happens almost entirely within the Eden regions - so the objects already went out of scope before even moving to the survivor space?
Most of them, a small amount might still trickle into later generations but on the other hand those regions may also contain now-dead objects that can be collected so it's mostly in equilibrium with only a very small flow towards the old generation. This kind of behavior is what makes generational collectors so efficient.
QUESTION
I'm running my java application on apline linux system in docker container, and I wanna find out the value of MaxHeapSize, so I use several command : java -XX:+PrintFlagsFinal, jinfo -flag MaxHeapSize , jmap -heap, but the output made me feel confused. The output of jinfo -flag MaxHeapSize , jmap -heap are consistent. However, The output of java -XX:+PrintFlagsFinal is different.so why did this happen?
The default container memory Limit setting is 4096MiB.
The output of java commonds is shown below.(I marked some important parts in the picture)
...ANSWER
Answered 2022-Jan-14 at 05:10These are not comparing the same thing.
When running jmap
or jstack
, these attach to the existing process with PID 9, as listed in the first jps
command.
When running java -XX:+PrintFlagsFinal -version
, this creates a new JVM process, and prints the information for that new process. Note that the original PID 9 process has a number of additional flags that can affect the calculated heap size.
For a more accurate comparison, you could add the -XX:+PrintFlagsFinal
flags to the main command run when the container starts. I would expect this to match the values returned by jinfo
and jmap
.
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install eden
Support
Reuse Trending Solutions
Find, review, and download reusable Libraries, Code Snippets, Cloud APIs from over 650 million Knowledge Items
Find more librariesStay Updated
Subscribe to our newsletter for trending solutions and developer bootcamps
Share this Page