lscpu | lscpu for BSDs | Monitoring library
kandi X-RAY | lscpu Summary
lscpu for BSDs. This program is mainly useful on the x86 architecture, since it leverages CPUID instructions. On other architectures it shows only very limited information.
lscpu Examples and Code Snippets
$ ./lscpu
Architecture: i386
Byte Order: Little Endian
Active CPU(s): 2
Total CPU(s): 2
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
Vendor: GenuineIntel
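On x86, output like the above is derived mostly from the CPUID instruction. As a minimal sketch of how the vendor string can be read, assuming GCC/Clang's cpuid.h helper (this is an illustration, not code from this repository):

#include <stdio.h>
#include <string.h>
#include <cpuid.h>  /* GCC/Clang helper for the x86 CPUID instruction */

int main(void) {
    unsigned int eax, ebx, ecx, edx;
    char vendor[13];

    /* Leaf 0: highest supported leaf in EAX, vendor string in EBX, EDX, ECX */
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 1;

    memcpy(vendor + 0, &ebx, 4);
    memcpy(vendor + 4, &edx, 4);
    memcpy(vendor + 8, &ecx, 4);
    vendor[12] = '\0';

    printf("Vendor: %s\n", vendor);  /* e.g. "GenuineIntel" */
    return 0;
}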
Community Discussions
Trending Discussions on lscpu
QUESTION
BACKGROUND:
I am trying to write a DPDK app which is supposed to handle packets coming from inside a Virtual Machine Monitor.
Basically the VMM gets the packets from its guest and sends them to DPDK, and DPDK then sends them out on the NIC.
Virtual Machine -> Virtual Machine Manager -> DPDK -> NIC
The architecture above is supposed to replace and outperform the original one. In the original architecture, the VMM puts the packets on a TAP interface.
Original:
Virtual Machine -> Virtual Machine Manager -> TAP interface -> NIC
Problem:
I have implemented the new architecture, and the throughput is much worse than with the TAP interface (TAP: 300 MB/s in either direction; DPDK: 50 MB/s with the VM sending, 5 MB/s with the VM receiving).
I suspect that I am not configuring my DPDK application properly. Could you give an opinion on my configuration?
Environment:
I have done all the testing inside a QEMU virtual machine, so both architectures described above were run inside this virtual machine:
3 logical CPUs (out of 8 on host)
4096 MB memory
OS: Ubuntu 20.04
2 NICs, one for SSH and one for DPDK
What I did so far:
Set up 2 GB of hugepages
Isolated the CPU which DPDK is using.
Here is the code: https://github.com/mihaidogaru2537/DpdkPlayground/blob/Strategy_1/primary_dpdk_firecracker/server.c
All functional logic is in "lcore_main", everything else is just configuration.
All the advice I could find about increasing performance involves hardware rather than configuration parameters, and I don't know if the values I am using for things such as:
...ANSWER
Answered 2021-May-29 at 09:39
[This answer is based on the live debug and the configuration changes made to improve performance.]
The factors that were hurting performance for both the kernel and DPDK interfaces were:
- the host system was not using the CPUs that had been isolated for the VM
- the KVM/QEMU vCPU threads were not pinned
- QEMU was not using hugepage-backed memory
- QEMU's emulator and I/O threads were not pinned
- inside the VM, the kernel boot parameter was set to 1 GB hugepages, which was causing TLB misses on the host
Corrected configuration:
- set up the host with 4 * 1 GB hugepages
- edit the QEMU XML to pin the vCPU, iothread, and emulator threads to the desired host CPUs
- edit QEMU to use the host's 4 * 1 GB pages
- edit the VM's GRUB configuration to isolate CPUs and use 2 MB pages
- run the DPDK application on an isolated core in the VM (a thread-pinning sketch follows this answer)
- taskset the Firecracker thread on the Ubuntu host
We were able to achieve around a 3.5x to 4x performance improvement with the current DPDK code.
Note: there is also a lot of room to improve code performance in the DPDK primary and secondary applications.
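For illustration, the kind of thread pinning described above can also be done from inside a program. A minimal sketch, assuming Linux/glibc and a hypothetical isolated core id of 2 (this is not code from the linked repository; DPDK itself normally takes core assignments via its EAL -l option):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Pin the calling thread to a single CPU core. */
static int pin_self_to_core(int core_id) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);
    /* Returns 0 on success, an errno value otherwise (GNU extension). */
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

int main(void) {
    int rc = pin_self_to_core(2);  /* 2 = the isolated core, an assumption */
    if (rc != 0) {
        fprintf(stderr, "pthread_setaffinity_np failed: %d\n", rc);
        return 1;
    }
    printf("pinned to core 2\n");
    return 0;
}

(Build with cc -pthread; pinning only helps if the target core is actually kept free of other work, e.g. via isolcpus.)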
QUESTION
I have followed the steps below to install and run pktgen-dpdk, but I am getting an "Illegal instruction" error and the application stops.
System information (CentOS 8)
...ANSWER
Answered 2021-May-21 at 12:25
The Intel Xeon E5-2620 is a Sandy Bridge CPU, which officially supports AVX but not AVX2. The DPDK 20.11 meson build (ninja -C build) will generate code with AVX instructions, not AVX2. But (based on the live debug) pktgen forces the compiler to emit AVX2 instructions, thus causing the illegal instruction.
Solution: edit meson.build at line 22 from
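A related defensive technique, shown here as my own illustration rather than anything pktgen does: probe at runtime whether the CPU actually supports AVX2 before executing code compiled for it, using GCC/Clang's __builtin_cpu_supports:

#include <stdio.h>

int main(void) {
    /* Initializes the CPU feature probe; only strictly required in code
     * that runs before constructors, but harmless here. */
    __builtin_cpu_init();

    if (__builtin_cpu_supports("avx2"))
        printf("AVX2 available\n");
    else
        printf("AVX only (e.g. Sandy Bridge): AVX2 code would SIGILL here\n");
    return 0;
}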
QUESTION
I want to use the | (pipe) symbol in a command, but it throws an error whenever I use it or any other special symbol like > or >> in commands.
...ANSWER
Answered 2021-May-21 at 14:15
The pipe feature comes from the shell you are using. I thought it would work with runInShell, but I could not get it to work. Instead, you can do the following, which will explicitly run your commands using /bin/sh:
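The answer's snippet is elided in this excerpt, but the same principle is easy to show in C (my sketch, not the original answer's code): popen() hands the whole command string to /bin/sh -c, so pipes and redirections are interpreted by the shell:

#include <stdio.h>

int main(void) {
    /* popen() runs the string via "/bin/sh -c", so the pipe works */
    FILE *p = popen("lscpu | grep 'Model name'", "r");
    if (p == NULL) {
        perror("popen");
        return 1;
    }
    char line[256];
    while (fgets(line, sizeof line, p) != NULL)
        fputs(line, stdout);
    return pclose(p) == -1 ? 1 : 0;
}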
QUESTION
I have a somewhat large code base that uses the libraries numpy, scipy, sklearn, and matplotlib. I need to limit the CPU usage to stop it from consuming all the available processing power in my computational cluster. Following this answer, I implemented the following block of code, which is executed as soon as the script is run:
ANSWER
Answered 2021-May-14 at 14:00
(This might be better as a comment; feel free to remove it if a better answer comes up, as it's based on my experience using the libraries.)
I had a similar issue when multiprocessing parts of my code. The numpy/scipy libraries appear to spin up extra threads when you do vectorised operations if you compiled the libraries with BLAS or MKL (or if the conda repo you pulled them from also included a BLAS/MKL library), to accelerate certain calculations.
This is fine when running your script in a single process, since it will spawn threads up to the number specified by OPENBLAS_NUM_THREADS or MKL_NUM_THREADS (depending on whether you have a BLAS or an MKL library; you can identify which by calling numpy.__config__.show()). But if you are explicitly using a multiprocessing.Pool, then you likely want to control the number of processes through multiprocessing instead. In that case it makes sense to set n=1 (before importing numpy and scipy), or some other small number, to make sure you are not oversubscribing:
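The code block itself is elided in this excerpt. The key point is that the thread limits must be in the environment before the BLAS/MKL runtime initializes. As a C-side illustration of the same idea (my sketch; my_script.py is a hypothetical name), a tiny launcher can set the variables and then start the script so it inherits them:

#include <stdlib.h>

int main(void) {
    /* Must be in the environment before numpy/scipy load their BLAS/MKL backend */
    setenv("OPENBLAS_NUM_THREADS", "1", 1);
    setenv("MKL_NUM_THREADS", "1", 1);
    setenv("OMP_NUM_THREADS", "1", 1);  /* some builds read this instead */

    /* Launch the workload so it inherits the limits */
    return system("python3 my_script.py");  /* hypothetical script name */
}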
QUESTION
I'm trying to build this project: https://github.com/utelle/SQLite3MultipleCiphers
Specifically the amalgamation files found at: https://github.com/utelle/SQLite3MultipleCiphers/releases/tag/v1.2.5
I'm getting this error from the _mm_aesimc_si128 function:
ANSWER
Answered 2021-Apr-20 at 19:20
Since the amalgamation file is a C file, it should be:
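The actual one-line fix is elided in this excerpt, so I won't guess at it. For context, _mm_aesimc_si128 is the AES-NI InvMixColumns intrinsic declared in wmmintrin.h; a minimal standalone C program exercising it (my sketch; compile with gcc -maes) looks like:

#include <stdio.h>
#include <wmmintrin.h>  /* AES-NI intrinsics; requires -maes */

int main(void) {
    /* In C, the intrinsics operate on __m128i values directly */
    __m128i key = _mm_set_epi32(0x03020100, 0x07060504,
                                0x0b0a0908, 0x0f0e0d0c);
    __m128i inv = _mm_aesimc_si128(key);  /* InvMixColumns transform */

    unsigned char out[16];
    _mm_storeu_si128((__m128i *)out, inv);
    for (int i = 0; i < 16; i++)
        printf("%02x", out[i]);
    printf("\n");
    return 0;
}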
QUESTION
When I let my application output the available memory and number of cores on a Google Cloud Run instance, using Linux commands like "free -h", "lscpu" and "top", I always get the information that there are 2 GB of available memory and 2 cores, although I specified other capacities in my deployment. No matter whether I set 1 GB, 2 GB or 4 GB of memory and 1, 2 or 4 CPUs, the mentioned Linux tools always show the same capacity.
Am I misunderstanding these tools or the Google Cloud Run concept, or is there something not working like it should?
...ANSWER
Answered 2021-Apr-19 at 19:49
Cloud Run services run containers on a non-standard runtime environment (named Borg internally at Google). It's possible that the low-level info values are not relevant.
In addition, Cloud Run services run in a sandbox (gVisor), and system calls can also be filtered that way.
What did you look at with these tests?
I performed tests to validate the multi-CPU capacity of Cloud Run and wrote an article about it. The multi-CPU capacity is real! Have a look at it.
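One way to see why such tools can disagree (my illustration, assuming a Linux-like environment; a sandbox like gVisor may virtualize either value): the number of online CPUs and the CPUs actually schedulable for your process are reported through different interfaces:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Roughly what lscpu/top-style tools reflect */
    long online = sysconf(_SC_NPROCESSORS_ONLN);

    /* What the scheduler will actually let this process use */
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_getaffinity");
        return 1;
    }
    printf("online CPUs: %ld\n", online);
    printf("usable CPUs: %d\n", CPU_COUNT(&set));
    return 0;
}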
QUESTION
I'm writing a C++ program that solves PDEs and algebraic equations on networks. The Eigen library shoulders the biggest part of the work by solving many sparse linear systems with LU decomposition.
As performance is always nice, I played around with the build options for that. I'm using
...ANSWER
Answered 2021-Mar-04 at 21:58
There are plenty of reasons why code can be slower with -march=native, although this is quite exceptional.
That being said, in your specific context, one possible scenario is the use of slower SIMD instructions, or more precisely, different SIMD instructions that end up making the program slower. Indeed, GCC vectorizes loops with -O3 using the SSE instruction set on x86 processors such as yours (for backward compatibility). With -march=native, GCC will likely vectorize loops using the more advanced and more recent AVX instruction set (supported by your processor, but not by many older x86 processors). While the use of AVX instructions should speed your program up, it does not always do so in a few pathological cases (less efficient code generated by compiler heuristics, loops too small to benefit from AVX, SSE instructions whose AVX counterparts are missing or slower, alignment issues, transition penalties, energy/frequency impact, etc.).
My guess is that your program is memory bound, and thus AVX instructions do not make it faster.
You can confirm this hypothesis by enabling AVX manually using -mavx -mavx2 rather than -march=native and checking whether the performance issue is still there. I advise you to carefully benchmark your application using a tool like perf.
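One way to test the memory-bound hypothesis concretely (my sketch, not from the question or answer): compile the same streaming loop once with -O3 alone and once with -O3 -mavx -mavx2, and compare the timings. If they barely differ, memory bandwidth rather than vector width is the limiting factor:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (16 * 1024 * 1024)  /* large enough to stream from RAM, not cache */

int main(void) {
    /* build twice: cc -O3 stream.c   vs   cc -O3 -mavx -mavx2 stream.c */
    float *x = malloc(N * sizeof *x), *y = malloc(N * sizeof *y);
    if (!x || !y) return 1;
    for (size_t i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < N; i++)   /* axpy-style loop GCC can vectorize */
        y[i] += 3.0f * x[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ms = (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
    printf("loop time: %.1f ms (y[0]=%f)\n", ms, y[0]);
    free(x); free(y);
    return 0;
}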
QUESTION
After many attempts, and after trying many solutions that I could find on Stack Overflow or elsewhere on the internet, I was still not able to run the emulator on my computer. This happens on this computer under both the Windows and Linux boots. I am able to start the emulator, but it then remains stuck on a full black screen. Here is some information regarding the software: Linux Ubuntu 20.04 LTS, and the Android Studio version I am working with is 4.1.2. About my hardware:
...ANSWER
Answered 2021-Feb-23 at 14:18
I finally found the solution. Because I am using an old AMD processor, I needed to use an ARM-based image for the emulator and not an x86 image. I was then able to run it. Unfortunately, my processor was not powerful enough and lagged badly when launching the Android emulator. I had to buy a new computer.
QUESTION
I am new to multithreading. I read some code like the below:
...ANSWER
Answered 2021-Feb-22 at 13:50
How many threads can I call in one program?
You didn't specify any programming language or operating system, but in most cases there's no hard limit. The most important limit is how much memory is available for the thread stacks. It's harder to quantify how many threads it takes before they all spend more time competing with each other for resources than they spend doing any actual work.
The command lscpu returns: Thread(s) per core: 2, Core(s) per socket: 6, Socket(s): 1
That information tells you how many threads the system can execute simultaneously. If your program has more than twelve threads that are ready to run, then at most twelve of them will actually be running at any given point in time, while the rest await their turn.
But note: in order for twelve threads to be "ready to run," they must all be doing tasks that do not interfere with each other. That mostly means doing computations on values that are already in memory. In your example program, all of the threads want to write to the same output stream. Assuming that the output stream is "thread safe," writing to it is something only one thread can do at a time. It doesn't matter how many cores your computer has.
how many threads can I call to make sure other programs run well?
That's hard to answer unless you know what all of the other programs need to do. And what does "run well" mean, anyway? If you want to be able to use office tools or programming tools to get work done while a large, compute-intensive program runs "in the background," then you'll pretty much have to figure out for yourself just how much work the background program can do while still allowing the "foreground" tools to respond to your typing.
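To make the twelve-hardware-threads point concrete, here is a minimal sketch (my illustration, assuming POSIX threads on Linux; the question did not name a language): query the number of online CPUs and spawn exactly that many workers. Any threads beyond that count simply take turns on the same cores:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void *worker(void *arg) {
    /* A stand-in for real, independent computation */
    printf("worker %ld running\n", (long)arg);
    return NULL;
}

int main(void) {
    /* e.g. 12 on a 6-core, 2-threads-per-core machine */
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    pthread_t *t = malloc(n * sizeof *t);
    if (!t) return 1;

    for (long i = 0; i < n; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (long i = 0; i < n; i++)
        pthread_join(t[i], NULL);

    printf("ran %ld threads, one per hardware thread\n", n);
    free(t);
    return 0;
}

(Build with cc -pthread.)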
QUESTION
I'd like to run a command similar to:
...ANSWER
Answered 2021-Feb-11 at 01:55
What about this:
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported