Epyc | A versatile software for modeling epidemics on networks
kandi X-RAY | Epyc Summary
This project simulates various compartmental epidemic models, including but not limited to SIR, SIR-SIR, SIS-SIR, and the newly introduced SIS-SID, on a range of static and dynamic networks. The main simulation engine is written in C++, and the visualization scripts are written in Python.
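Epyc's actual engine is C++, but the model class it implements is easy to illustrate. Below is a minimal, hypothetical discrete-time SIR simulation on a network in Python; the function name and adjacency-list representation are mine for illustration, not Epyc's API. Each infected node infects each susceptible neighbor with probability beta per step and recovers with probability gamma.

```python
import random

def simulate_sir(adj, beta, gamma, seed_node, steps, seed=0):
    """Discrete-time SIR on a network given as an adjacency list {node: [neighbors]}."""
    rng = random.Random(seed)
    state = {n: "S" for n in adj}  # everyone starts Susceptible...
    state[seed_node] = "I"         # ...except one seed Infected node
    infected_counts = []
    for _ in range(steps):
        new_state = dict(state)
        for n, s in state.items():
            if s != "I":
                continue
            # Each infected node infects susceptible neighbors with probability beta...
            for m in adj[n]:
                if state[m] == "S" and rng.random() < beta:
                    new_state[m] = "I"
            # ...and recovers with probability gamma.
            if rng.random() < gamma:
                new_state[n] = "R"
        state = new_state
        infected_counts.append(sum(1 for v in state.values() if v == "I"))
    return infected_counts
```

With beta = gamma = 1 on a 3-node chain seeded at one end, the infection sweeps the chain deterministically and dies out after two steps.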
Community Discussions
Trending Discussions on Epyc
QUESTION
I have a dataset of transactions (40k rows, 45 cols, 11 MB RAM) with an ID tracking the whole transaction process (sending, withdrawal, refund), and I need to restructure the Dataframe based on the values in the 'Reference number' column, so that all data tied to one transaction ends up in a single row.
What I do: load the source Dataframe, sort it, iterate row by row to filter the occurrences of one ID at a time, iterate over the final Dataframe columns, and assign the corresponding value to a local Dictionary. At the end, concat the Dicts and create a new Dataframe (approx. 1/2 × 40k rows, 69 cols).
Based on my testing, 85% of the required time is consumed by the for loop creating the Dict keys (for x, y in dictOfCols.items()). Is there a better way of doing this? Currently this script takes more than 30 minutes on a single core of an AMD Epyc server with 8 GB RAM.
dictOfCols contains pairs of keys (final DF column name) and values (lists of [source DF column names]) for the 3 possible transaction types (sending, withdrawal, refund). Shortened example:
...ANSWER
Answered 2021-May-07 at 16:35
Turns out that multiprocessing sped the job up very much; I just had to split the dataframe into 8 chunks on an 8-core CPU.
I also found out that the script takes 30 minutes only in debug mode, not in an actual run, where it takes only about 5 minutes.
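A hedged sketch of that chunking approach, assuming the grouping key is the 'Reference number' column and substituting a simple groupby for the answerer's per-row dictionary logic (the function names here are illustrative, not from the original script):

```python
import numpy as np
import pandas as pd
from multiprocessing import Pool

def restructure_chunk(chunk: pd.DataFrame) -> pd.DataFrame:
    # Stand-in for the per-chunk restructuring logic described above:
    # collapse all rows sharing one 'Reference number' into a single row.
    return chunk.groupby("Reference number", as_index=False).first()

def restructure_parallel(df: pd.DataFrame, n_workers: int = 8) -> pd.DataFrame:
    # Split on distinct IDs, not raw row ranges, so one transaction
    # never straddles two chunks.
    ids = df["Reference number"].unique()
    id_chunks = np.array_split(ids, n_workers)
    chunks = [df[df["Reference number"].isin(c)] for c in id_chunks]
    with Pool(n_workers) as pool:
        parts = pool.map(restructure_chunk, chunks)
    return pd.concat(parts, ignore_index=True)
```

Splitting on distinct IDs guarantees that every row belonging to one transaction lands in the same worker's chunk, which is what makes the parallel split safe.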
QUESTION
I am writing after many attempts on a CPU cluster structured as follows:
- 144 standard compute nodes
- 2× AMD EPYC 7742, 2× 64 cores, 2.25 GHz
- 256 (16× 16) GB DDR4, 3200 MHz
- InfiniBand HDR100 (Connect-X6)
- local disk for the operating system (1× 240 GB SSD)
- 1 TB NVMe
Now, since my core-h are here limited, I want to maximize performance as much as I can. I am doing some benchmarking with the following submission script:
...ANSWER
Answered 2021-Apr-30 at 14:53
In your case, the 256 tasks have no constraint to run in the same rack or location, so Slurm has no clue how to schedule the job efficiently on your cluster: it could schedule 1 task on each of 256 different nodes, which is not efficient at all.
To be sure that everything is scheduled correctly, you should probably force the tasks to be located on specific nodes.
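One way to force that locality in the submission script is with standard Slurm directives; the node and task counts below are illustrative for a 256-task job on 128-core nodes, not taken from the question:

```
#SBATCH --nodes=2              # pack the job onto exactly 2 nodes
#SBATCH --ntasks-per-node=128  # 128 tasks per node, 256 in total
#SBATCH --exclusive            # do not share the nodes with other jobs
```

Pinning the node count prevents Slurm from scattering the tasks across the cluster and keeps communication on as few Infinity Fabric / InfiniBand hops as possible.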
QUESTION
I have a Microsoft Azure VM instance running on which I have a Minecraft Paper server installed. Today I'm unable to start the server due to a Java error raised while running the server command via SSH (PuTTY).
Server OS: Ubuntu 18.04.5 LTS
Minecraft Server Run Commands:
...ANSWER
Answered 2021-Jan-22 at 10:05
Native memory allocation (mmap) failed to map 31138512896 bytes
That is about 31.14 GB in decimal units, which is exactly 29 GiB, i.e. your 29G limit: the JVM could not reserve the full heap because the machine does not have that much free memory.
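The two figures in the error message are the same quantity expressed in different units, which a quick arithmetic check confirms (pure arithmetic, no assumptions):

```python
failed_bytes = 31_138_512_896  # the size the JVM's mmap failed to map

# In decimal gigabytes this is the "31.14G" quoted in the answer...
assert round(failed_bytes / 10**9, 2) == 31.14
# ...and in binary gibibytes it is exactly the 29G heap limit (-Xmx uses GiB).
assert failed_bytes / 2**30 == 29.0
```

This suggests the failed mapping is the heap reservation itself rather than an overshoot of the limit.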
QUESTION
For memory-bound programs it is not always fastest to use as many threads as there are cores, since threads may compete for memory channels. On a two-socket machine, fewer threads are usually better, but we need an affinity policy that distributes the threads across sockets to maximize memory bandwidth.
Intel OpenMP claims that KMP_AFFINITY=scatter achieves this, while the opposite value, "compact", places threads as close together as possible. I used ICC to build the STREAM benchmark, and this claim is easily validated on Intel machines. However, if KMP_AFFINITY is set, the native OpenMP environment variables such as OMP_PLACES and OMP_PROC_BIND are ignored. You will get such a warning:
...ANSWER
Answered 2020-Oct-20 at 03:49
TL;DR: Do not use KMP_AFFINITY. It is not portable. Prefer OMP_PROC_BIND (the two cannot be used at the same time). You can combine it with OMP_PLACES to bind threads to cores manually. Moreover, numactl should be used to control memory channel binding and, more generally, NUMA effects.
Long answer:
Thread binding: OMP_PLACES can be used to bind each thread to a specific core (reducing context switches and NUMA issues). OMP_PROC_BIND and KMP_AFFINITY should theoretically do that correctly, but in practice they fail to do so on some systems. Note that OMP_PROC_BIND and KMP_AFFINITY are mutually exclusive options and should not be used together (OMP_PROC_BIND is the newer, portable replacement for the older KMP_AFFINITY environment variable). As the core topology changes from one machine to another, you can use the hwloc tools to get the list of PU ids required by OMP_PLACES: hwloc-calc to compute the list and hwloc-ls to inspect the CPU topology. All threads should be bound separately so that no migration is possible. You can check the binding of the threads with hwloc-ps.
NUMA effects: AMD processors are built by assembling multiple CCXs connected together by a high-bandwidth interconnect (AMD Infinity Fabric). Because of that, AMD processors are NUMA systems. If not taken into account, NUMA effects can result in a significant drop in performance. The numactl tool is designed to control/mitigate NUMA effects: processes can be bound to memory channels using the --membind option, and the memory allocation policy can be set with --interleave (or --localalloc if the process is NUMA-aware). Ideally, processes/threads should only work on data allocated and first-touched on their local memory channels. If you want to test a configuration on a given CCX, you can play with --physcpubind and --cpunodebind.
My guess is that the Intel/Clang runtime does not perform good thread binding when KMP_AFFINITY=scatter is set because of a bad PU mapping (which could come from an OS bug, a runtime bug, or bad user/admin settings), probably related to the CCX layout (mainstream processors containing multiple NUMA nodes used to be quite rare). On AMD processors, threads accessing the memory of another CCX usually pay a significant additional cost, because the data moves through the (quite slow) Infinity Fabric interconnect and can saturate it as well as the memory channels. I advise you not to trust the OpenMP runtime's automatic thread binding (set OMP_PROC_BIND=TRUE), to perform the thread/memory bindings manually instead, and then to report bugs if needed.
Here is an example of a resulting command line for running your application (the environment variables must precede numactl, which only accepts its own options before the program name):
OMP_PROC_BIND=TRUE OMP_PLACES="{0},{1},{2},{3},{4},{5},{6},{7}" numactl --localalloc ./app
PS: be careful about PU/core IDs versus logical/physical IDs.
QUESTION
A little bit of background about my situation:
I'm currently using an OVH Advance 5 server (AMD Epyc 7451, 24 cores / 48 threads, 128 GB memory, 2× 450 GB SSD) and I was wondering what settings I should be using for PostgreSQL.
I'll be running 24 Python scripts in parallel with 24 different pools (using asyncpg to connect), and I usually use up about 40 GB of RAM or so, which leaves around 88 GB to work with.
I've never really touched any of the Postgres settings before; what kind of values should I be using for:
Shared Memory / Max Connections / Random Page Cost?
Reading up on it, some sources recommend that shared memory take about 25% of the free RAM, but others say 2-4 GB is generally the sweet spot, so any insight would be greatly appreciated.
...ANSWER
Answered 2020-Aug-17 at 06:41
shared_buffers: start with 25% of the available RAM or 8GB, whichever is lower. You can run performance tests to see if other settings work better in your case.
max_connections: leave the default of 100. If you think you need more than 50 connections, use the connection pooler pgBouncer.
random_page_cost: if your storage handles random I/O as fast as sequential I/O, use a setting of 1.1. Otherwise, stick with the default of 4.
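Applied to the machine in the question (128 GB total, roughly 88 GB free), those recommendations translate into a postgresql.conf fragment like this; it is a starting point under the answer's assumptions, not a tuned configuration:

```
shared_buffers = 8GB        # min(25% of free RAM ~ 22GB, 8GB) = 8GB
max_connections = 100       # the default; add pgBouncer past ~50 clients
random_page_cost = 1.1      # assuming the 2x 450 GB SSDs handle random I/O well
```

With 24 scripts each holding a pool, keeping connections well under 100 (or fronting them with pgBouncer) avoids per-connection memory overhead eating into the 88 GB headroom.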
QUESTION
There is a weird problem, as the title says, when using DPDK:
When I call rte_pktmbuf_alloc(struct rte_mempool *) after verifying that the return value of rte_pktmbuf_pool_create() is not NULL, the process receives a segmentation fault.
...ANSWER
Answered 2020-Jun-15 at 06:24
I am able to run this without any error by modifying your code to:
- include the necessary headers
- remove unused variables
- add a check for whether the value returned by alloc is NULL
Test on:
QUESTION
Does Epyc 2 (Rome) (specifically a 7402P in my case) have anything equivalent to Intel's DDIO, where PCIe devices doing DMA can transfer directly to last-level cache? Or do DMA accesses all go directly to DRAM? What happens if it's a DMA write and the data is already present in some caches in the system?
...ANSWER
Answered 2020-May-21 at 17:34
AMD lacks support for DDIO.
For this reason AMD is not vulnerable to NetCAT: https://www.techpowerup.com/259096/new-netcat-vulnerability-exploits-ddio-on-intel-xeon-processors-to-steal-data
QUESTION
I'm learning OS development and trying to write a kernel.
I googled "AMD x2APIC" and found some information suggesting that the EPYC 7002 series supports it.
But I cannot find the relevant documentation.
So I would like to ask whether recent AMD processors support it, and if so, where I can find the documentation.
...ANSWER
Answered 2020-Apr-05 at 16:11
Yes, the same as on Intel's CPUs. You use cpuid (CPUID_Fn00000001_ECX) to check for it and the Core::X86::Msr::APIC_BAR MSR to enable it, just like you would on Intel's CPUs.
The x2APIC specification is here.
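The check the answer describes boils down to testing one feature bit. Here is a sketch of the mask logic in Python; bit 21 of ECX from CPUID leaf 0x00000001 is the standard x2APIC flag on both vendors, but actually executing CPUID requires assembly or a helper library, which is not shown:

```python
X2APIC_BIT = 21  # CPUID leaf 0x00000001, ECX bit 21 flags x2APIC support

def has_x2apic(ecx: int) -> bool:
    # Test the x2APIC feature flag in a raw ECX value from CPUID leaf 1.
    return bool((ecx >> X2APIC_BIT) & 1)

# e.g. has_x2apic(0x00200000) is True; has_x2apic(0) is False
```

If the bit is set, the kernel can then switch the local APIC to x2APIC mode via the APIC base MSR, as the answer notes.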
QUESTION
What is an equivalent way to write this sed program in Python using the re library? The sed pattern completes the search in one pass and is efficient. I am trying to extract the model number of the CPU. Please see my Python code attempt at the bottom.
Sample input:
...ANSWER
Answered 2020-Mar-04 at 23:08
Having a fully featured language like Python and such well-structured data, I wouldn't try to parse everything with regular expressions. Instead, I'd just write code that does the job, using a regex only at the very end. This way, instead of an enormous regex, I have short, easy-to-read code with very simple regular expressions.
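Since the sample input and sed pattern are not shown above, here is a hypothetical sketch of that "structure first, regex last" approach, assuming a /proc/cpuinfo-style "model name" line:

```python
import re

# Hypothetical input line; the question's actual sample was not shown.
sample = "model name\t: AMD EPYC 7402P 24-Core Processor"

# Split on the field separator first, then apply a tiny regex to the value,
# instead of one enormous pattern over the whole line.
_, _, value = sample.partition(":")
match = re.search(r"\b(\d{4}[A-Z]?)\b", value)
model = match.group(1) if match else None  # -> "7402P"
```

The structural step (partition on the colon) does most of the work, leaving the regex to match only the short model token.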
QUESTION
On Intel processors, x87 trigonometric instructions such as FSIN have limited accuracy due to the use of a 66-bit approximation of pi even though the computation itself is otherwise accurate to the full 64-bit mantissa of an 80-bit extended-precision floating-point value. (Full accuracy for all valid inputs requires a 128-bit approximation of pi.) The omission in Intel's documentation was corrected after the issue was brought to their attention.
However, I cannot find similarly detailed information about the accuracy of AMD's implementation of x87 trigonometric instructions beyond this mention in the AMD64 Architecture Programmer's Manual, Volume 1:
6.4.5.1 Accuracy of Transcendental Results
x87 computations are carried out in double-extended-precision format, so that the transcendental functions provide results accurate to within one unit in the last place (ulp) for each of the floating-point data types.
Is AMD's implementation of x87 trigonometric instructions actually fully accurate to within one ULP in extended-precision format for all valid inputs, including a 128-bit or better approximation of pi? An answer that pertains to the Zen and Zen 2 architectures (Ryzen and EPYC) would be ideal.
...ANSWER
Answered 2020-Feb-17 at 20:14
I found a program located at http://notabs.org/fpuaccuracy/ (direct download link; GPLv3) designed to test the accuracy of x87 trigonometric instructions. The reference output for fpuaccuracy examples supplied with the program, generated on an Intel Core i7-2600 (Sandy Bridge), is as follows:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Install Epyc
You can use Epyc like any standard Python library. You will need a development environment consisting of a Python distribution (including header files), a compiler, pip, and git. Make sure that your pip, setuptools, and wheel are up to date. When using pip, it is generally recommended to install packages in a virtual environment to avoid making changes to the system.