Vftrace | Vftrace (visual ftrace) is a performance profiling library | Performance Testing library

by SX-Aurora C Version: v2.0.0 License: GPL-3.0

kandi X-RAY | Vftrace Summary

Vftrace is a C library typically used in Testing, Performance Testing applications. Vftrace has no bugs, it has no vulnerabilities, it has a Strong Copyleft License and it has low support. You can download it from GitHub.

Vftrace (visual ftrace) is a performance profiling library with a focus on applications in high-performance computing (HPC). It is compatible with C and Fortran. It supports NEC's SX-Aurora vector architecture and x86 architectures. On top of that, Vftrace can be linked against the PAPI library to measure hardware counters on x86 processors. Hardware counter access for SX-Aurora architectures is already built into Vftrace and does not require additional libraries. These hardware counters can be transformed into user-defined performance observables, such as GFLOPS or vector length, using JSON configuration files. Vftrace produces an overview of the function calls appearing during an application's runtime and registers the number of calls and the time spent in each part of the code. If hardware observables are defined, their values are also reported. The generated profile can be visualized and analyzed with the Vfview tool.
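
To illustrate what a call profile of this kind records (the number of calls and the time spent per function), here is a minimal sketch in Python. It is not Vftrace's API or output format: the `traced` decorator, the `profile` table, and the `dot` function are hypothetical names used only to show the idea.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical profile table: function name -> [number of calls, accumulated seconds]
profile = defaultdict(lambda: [0, 0.0])

def traced(func):
    """Record how often func is called and how long it runs in total."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            entry = profile[func.__name__]
            entry[0] += 1                              # call count
            entry[1] += time.perf_counter() - start    # time spent

    return wrapper

@traced
def dot(a, b):
    # Small stand-in workload to be profiled
    return sum(x * y for x, y in zip(a, b))

for _ in range(3):
    dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])

# Print one line per profiled function
for name, (calls, seconds) in profile.items():
    print(f"{name}: {calls} calls, {seconds:.6f} s")
```

A real HPC profiler such as Vftrace instruments compiled C or Fortran code instead of wrapping functions at runtime, so this only conveys the shape of the collected data.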

Support

Vftrace has a low active ecosystem.

It has 6 star(s) with 2 fork(s). There is 1 watcher for this library.

It had no major release in the last 12 months.

There are 2 open issues and 41 have been closed. On average, issues are closed in 131 days. There is 1 open pull request and 0 closed pull requests.

It has a neutral sentiment in the developer community.

The latest version of Vftrace is v2.0.0.

Quality

Vftrace has no bugs reported.

Security

Vftrace has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

Vftrace is licensed under the GPL-3.0 License. This license is Strong Copyleft.

Strong Copyleft licenses enforce sharing, and you can use them when creating open source projects.

Reuse

Vftrace releases are available to install and integrate.

Installation instructions, examples and code snippets are available.

Vftrace Key Features

No Key Features are available at this moment for Vftrace.

Vftrace Examples and Code Snippets

No Code Snippets are available at this moment for Vftrace.

Community Discussions

Trending Discussions on Performance Testing

Karate-Gatling: Not able to use object fields inside Karate features

Faulty benchmark, puzzling assembly

What difference does it make if I add think time to my virtual users as opposed to letting them execute requests in a loop as fast as they can?

Jmeter - bzm Streaming Sampler Content Protection

How to wait first post issue and use while loop in k6 load test scripts?

Measuring OpenMP Fork/Join latency

Unable to capture Client transaction ID in Jmeter

Difference between stress test and breakpoint test

MySQL queries performance

k6 how to restart testing service between scenarios

QUESTION

Karate-Gatling: Not able to use object fields inside Karate features

Asked 2022-Apr-11 at 17:08

For the following Gatling simulation

...

ANSWER

Answered 2022-Apr-11 at 17:08

Right now we've tested only with primitive values passed into the Gatling session. It may work if you convert the data into a java.util.Map. So maybe your best bet is to write some toMap() function on your data-object. Or if you manage to emit a JSON string, there is a karate.fromString() helper that can be useful.

So please read the docs here and figure out what works: https://github.com/karatelabs/karate/tree/master/karate-gatling#gatling-session

You are most welcome to contribute code to improve the state of things.

Source https://stackoverflow.com/questions/71830035

QUESTION

Faulty benchmark, puzzling assembly

Asked 2022-Mar-28 at 07:40

Assembly novice here. I've written a benchmark to measure the floating-point performance of a machine in computing a transposed matrix-tensor product.

Given my machine with 32GiB RAM (bandwidth ~37GiB/s) and Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz (Turbo 4.0GHz) processor, I estimate the maximum performance (with pipelining and data in registers) to be 6 cores x 4.0GHz = 24GFLOP/s. However, when I run my benchmark, I am measuring 127GFLOP/s, which is obviously a wrong measurement.

Note: in order to measure the FP performance, I am measuring the op-count: n*n*n*n*6 (n^3 for matrix-matrix multiplication, performed on n slices of complex data-points i.e. assuming 6 FLOPs for 1 complex-complex multiplication) and dividing it by the average time taken for each run.

Code snippet in main function:

...

ANSWER

Answered 2022-Mar-25 at 19:33

1 FP operation per core clock cycle would be pathetic for a modern superscalar CPU. Your Skylake-derived CPU can actually do 2x 4-wide SIMD double-precision FMA operations per core per clock, and each FMA counts as two FLOPs, so the theoretical max is 16 double-precision FLOPs per core clock. With 6 cores x 4.0 GHz = 24 G core-cycles per second, that gives 24 * 16 = 384 GFLOP/s. (This uses vectors of 4 doubles, i.e. 256-bit wide AVX.) See "FLOPS per cycle for sandy-bridge and haswell SSE2/AVX/AVX2".
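
The peak-performance arithmetic in this answer can be written out as a short calculation. The sketch below uses only the figures quoted in the question and the answer; the variable names are chosen here for illustration.

```python
# Theoretical peak for the CPU described in the question (Intel Core i5-8400).
cores = 6                 # physical cores (from the question)
clock_ghz = 4.0           # turbo clock in GHz (from the question)
fma_per_core_cycle = 2    # two FMA operations per core per clock (from the answer)
simd_lanes = 4            # four doubles per 256-bit AVX vector
flops_per_fma = 2         # one multiply plus one add

flops_per_cycle = fma_per_core_cycle * simd_lanes * flops_per_fma
peak_gflops = cores * clock_ghz * flops_per_cycle
measured_gflops = 127.0   # value reported in the question

print(f"FLOPs per core per cycle: {flops_per_cycle}")
print(f"Theoretical peak: {peak_gflops:.0f} GFLOP/s")
print(f"Measured fraction of peak: {measured_gflops / peak_gflops:.0%}")
```

Seen this way, the measured 127 GFLOP/s is below the theoretical peak, so the number alone does not prove the benchmark is broken.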

There is a function call inside the timed region, callq 403c0b <_Z12do_timed_runRKmRd+0x1eb> (as well as the __kmpc_end_serialized_parallel stuff).

There's no symbol associated with that call target, so I guess you didn't compile with debug info enabled. (That's separate from optimization level, e.g. gcc -g -O3 -march=native -fopenmp should run the same asm, just have more debug metadata.) Even a function invented by OpenMP should have a symbol name associated at some point.

As far as benchmark validity goes, a good litmus test is whether it scales reasonably with problem size. Unless a smaller or larger problem crosses the L3 cache size boundary, the time should change in some reasonable way. If not, then you'd worry about it optimizing away, or clock speed warm-up effects (see "Idiomatic way of performance evaluation?" for that and more, like page faults).

Why are there non-conditional jumps in code (at 403ad3, 403b53, 403d78 and 403d8f)?

Once you're already in an if block, you unconditionally know the else block should not run, so you jmp over it instead of jcc (even if FLAGS were still set so you didn't have to test the condition again). Or you put one or the other block out-of-line (like at the end of the function, or before the entry point) and jcc to it, then it jmps back to after the other side. That allows the fast path to be contiguous with no taken branches.

Why are there 3 retq instances in the same function with only one return path (at 403c0a, 403ca4 and 403d26)?

Duplicate ret comes from "tail duplication" optimization, where multiple paths of execution that all return can just get their own ret instead of jumping to a ret. (And copies of any cleanup necessary, like restoring regs and stack pointer.)

Source https://stackoverflow.com/questions/71618068

QUESTION

What difference does it make if I add think time to my virtual users as opposed to letting them execute requests in a loop as fast as they can?

Asked 2022-Mar-16 at 20:38

I have a requirement to test that a Public Website can serve a defined peak number of 400 page loads per second.

From what I read online, when testing web pages performance, virtual users (threads) should be configured to pause and "think" on each page they visit, in order to simulate the behavior of a real live user before sending a new page load request.

I must use some remote load generator machines to generate this necessary load, and I have a limit on how many virtual users I can use per each load generator. This means that if I make each virtual user pause and "think" for x seconds on each page, that user will not generate a lot of load compared to how much it would if it was executing as fast as it could with no configured think time - and this would cause me to need more users and implicitly need more load generator machines to achieve my desired "page loads per second" and this would be more costly in the end.

If my only request is to prove that a server can serve 400 page loads per second, I would like to know what difference does it really make if I add think times (and therefore use more virtual users) or not.

Why is "think time" generally considered something that should be added when testing web page performance?

...

ANSWER

Answered 2022-Mar-16 at 20:38

A virtual user that is "idle" (doing nothing) has a minimal resource footprint (mainly thread stack size), so I don't think you will need more machines.
A well-behaved load test must represent real-life usage of the application with 100% accuracy. If you're testing a website, each JMeter thread (virtual user) must mimic a real user using a real browser with all related features, like:
- handling embedded resources (images, scripts, styles, fonts, sounds, etc.)
- using caching properly
- getting and sending back cookies
- sending appropriate headers
- processing AJAX requests like a browser does
The most straightforward example of the difference between 400 users without think times and 4000 users with think times is that 4000 users will open 4000 connections and keep them open, while 400 users will open only 400 connections.
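
The relation between think time and the number of virtual users follows from Little's Law (concurrent users = throughput x time per iteration). The sketch below is only an illustration: the 1 s response time and 9 s think time are assumptions chosen to reproduce the 400 and 4000 user figures, and `required_virtual_users` is a hypothetical helper, not a JMeter function.

```python
import math

def required_virtual_users(target_rate_per_s, response_time_s, think_time_s=0.0):
    """Little's Law: concurrent users = throughput * time per iteration."""
    return math.ceil(target_rate_per_s * (response_time_s + think_time_s))

target = 400  # page loads per second (from the question)

# Assumed timings, chosen only to reproduce the 400 vs. 4000 figures.
no_think = required_virtual_users(target, response_time_s=1.0)
with_think = required_virtual_users(target, response_time_s=1.0, think_time_s=9.0)

print(no_think, with_think)
```

Both configurations produce the same request rate, but the second one holds ten times as many sessions and connections open on the server, which is the difference the answer points at.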

Source https://stackoverflow.com/questions/71502603

QUESTION

Jmeter - bzm Streaming Sampler Content Protection

Asked 2022-Mar-14 at 22:21

We use Jmeter with the BZM - Streaming Sampler to load test a streaming service. With this we are requesting a dash main.mpd file. That url would look like: https://url.com/5bf9c52c17e072d89e6527d45587d03826512bfa3b53a30bb90ecd7ed1bb7a77/dash/Main.mpd

Within the schema we have defined ContentProtection with value="cenc" as such:

...

ANSWER

Answered 2022-Mar-14 at 18:51

It is possible to:

- Download the playlist using the HTTP Request sampler and the Save Responses to a file listener so that it is saved to your local drive. See the "Performance Testing: Upload and Download Scenarios with Apache JMeter" article for more comprehensive instructions if needed.
- Amend the playlist as needed using the JSR223 Sampler or the OS Process Sampler.
- In the bzm - Streaming Sampler, use a local URL via the file URI scheme, i.e.

Source https://stackoverflow.com/questions/71472249

QUESTION

How to wait first post issue and use while loop in k6 load test scripts?

Asked 2022-Feb-19 at 11:38

I have two POST requests. They should run until the response is "createdIsCompleted" == false. I'm taking the createdIsCompleted value from the response of the second POST request. So how can I run the two requests in a while loop? By the way, I have to wait for the first POST request to finish before the second one runs... I know there is no await operator in k6, but I want to learn alternative ways. This while loop is not working as I want. The response still returns "createdIsCompleted" == true

...

ANSWER

Answered 2022-Feb-19 at 11:38

By the way, I have to wait first post issue before the second post issue should be run...I know there is no await operator in k6

K6 currently has only blocking calls so each post will finish fully before the next one starts.

On the loop question you have two (three) problems:

- createdISCompleted is uninitialized, so the while loop will never run, as it's not false.
- You have a capital S in the declaration but a lowercase s in the while loop.
- You have a break at the end of the loop, which means it will always exit after the first iteration.
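
The three fixes can be sketched as a corrected polling loop. The original k6 script is JavaScript and is not shown here, so this is a language-neutral illustration written in Python. It assumes the loop should repeat while the flag is false, and `first_post`, `second_post`, and `max_attempts` are hypothetical stand-ins, not k6 functions.

```python
def poll_until_completed(first_post, second_post, max_attempts=10):
    """Repeat two blocking requests until the second reports completion."""
    created_is_completed = False   # initialised, so the loop body can run
    attempts = 0
    while created_is_completed is False and attempts < max_attempts:
        first_post()               # blocking: finishes before the next line runs
        response = second_post()   # hypothetical: returns a dict-like response
        # Same spelling as in the declaration above
        created_is_completed = response["createdIsCompleted"]
        attempts += 1
        # No unconditional break: the loop condition decides when to stop
    return created_is_completed, attempts


# Stand-in requests: the second one reports completion on its third call.
calls = {"first": 0, "second": 0}

def fake_first_post():
    calls["first"] += 1

def fake_second_post():
    calls["second"] += 1
    return {"createdIsCompleted": calls["second"] >= 3}

done, attempts = poll_until_completed(fake_first_post, fake_second_post)
print(done, attempts)
```

The `max_attempts` guard is an addition for safety so that a flag that never changes cannot keep a virtual user looping forever.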

Source https://stackoverflow.com/questions/71183857

QUESTION

Measuring OpenMP Fork/Join latency

Asked 2022-Feb-14 at 14:47

Since MPI-3 comes with functionality for shared memory parallelism, and it seems to be perfectly matched for my application, I'm critically considering rewriting my hybrid OpenMP-MPI code into a pure MPI implementation.

In order to drive the last nail into the coffin, I decided to run a small program to test the latency of the OpenMP fork/join mechanism. Here's the code (written for Intel compiler):

...

ANSWER

Answered 2022-Feb-14 at 14:47

Here is my attempt at measuring fork-join overhead:

Source https://stackoverflow.com/questions/71077917

QUESTION

Unable to capture Client transaction ID in Jmeter

Asked 2022-Jan-30 at 13:23

I am currently working on an insurance creation application. I have been facing a challenge in capturing the transaction ID. Below is a recording, for example:
Sample Start:2022-01-05 19:42:39 IST {"clientTransactionId":"2022010519423991400003554512008008822698"}
Sample Start:2022-01-05 19:37:10 IST {"applicationTransactionId":"220105193709901533"}

The above recording shows the clientTransactionId and applicationTransactionId having the first 14 digits as a timestamp and the rest as random numbers. I am looking for a function to capture these transaction IDs, as I have never faced such a challenge before (a combination of timestamp and random numbers). Please help.

...

ANSWER

Answered 2022-Jan-30 at 13:23

Just add JSON JMESPath Extractor as a child of the request which returns the above response and configure it like:

Names of created variables: anything meaningful, i.e. clientTransactionId
JMESPath Expressions: clientTransactionId
Match No: 1

Once done, you will be able to refer to the extracted value as the ${clientTransactionId} JMeter Variable where required.

applicationTransactionId can be handled in exactly the same manner
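
Outside JMeter, the same extraction can be illustrated with a few lines of Python. This is only a sketch of what the extractor does with the response body from the question. Reading the first 14 digits as a yyyyMMddHHmmss timestamp is an assumption based on the matching "Sample Start" time.

```python
import json
from datetime import datetime

# Response body copied from the recording in the question.
body = '{"clientTransactionId":"2022010519423991400003554512008008822698"}'

# Equivalent of the JMESPath expression `clientTransactionId`
client_transaction_id = json.loads(body)["clientTransactionId"]

# Assumption: first 14 digits are a yyyyMMddHHmmss timestamp, the rest is random.
timestamp = datetime.strptime(client_transaction_id[:14], "%Y%m%d%H%M%S")
random_part = client_transaction_id[14:]

print(client_transaction_id)
print(timestamp.isoformat())
print(random_part)
```

Because the server generates the ID, the test script only needs to capture and reuse it; splitting it into timestamp and random part is shown here purely to explain its structure.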

More information:

Source https://stackoverflow.com/questions/70914010

QUESTION

Difference between stress test and breakpoint test

Asked 2022-Jan-13 at 05:05

I was looking for verbal explanations of the different performance testing types and saw a new one called "breakpoint test". Its explanation seemed very similar to stress testing to me. So what is the difference, or is there any difference?

Stress Test: A verification on the system performance during extremely high load which is way above the peak load

Breakpoint Test: This test determines the point of system failure by gradually increasing the number of simulated concurrent users.

As far as I know, we increase the load gradually while performing a stress test too. So what is the difference between these two types?

...

ANSWER

Answered 2021-Oct-26 at 12:12

From the workload point of view the approach is exactly the same, my understanding is:

Stress test is about finding the first bottleneck, it's normally applied before deployment or even at early stages of development (see shift-left concept)
Breakpoint (sometimes also called Capacity) test is about checking how much load the overall integrated environment can handle without issues and what is the slowest component which is a subject for scaling up/optimization.

More information:

Source https://stackoverflow.com/questions/69722534

QUESTION

MySQL queries performance

Asked 2022-Jan-09 at 20:02

I have database catalogs with 14000 records, 100 columns and just 2 columns with type longtext. This query was really slow - more than 40 seconds

...

ANSWER

Answered 2022-Jan-09 at 20:02

LONGTEXT columns are stored separately from the rest of the columns. Extra disk fetches are used to load the value.

When you separated the LONGTEXT columns out, did you then fetch the value? And that was slow, anyway?

Do you have INDEX(shop_id)?

Did Laravel do something dumb like preload the entire table?

What will you do with the PDF? If you will only be writing them to a web page, it would be more efficient in multiple ways to store it as a file, then have HTML reference it. This would probably be done via .

Source https://stackoverflow.com/questions/70641751

QUESTION

k6 how to restart testing service between scenarios

Asked 2021-Dec-21 at 19:09

I am running a load test with k6, which tests my service with 6 scenarios. I am running my service with docker-compose and I want to restart my service between each scenario. I couldn't find a built-in method for this, so I added a function to restart the service and added some code to call that function at the start of each scenario (I declared a counter for each scenario with initial value 0 and call the restart function only when the counter is 1). But the function is getting called per VU, not as I expected. Is there any solution for this?

Thanks in advance

...

ANSWER

Answered 2021-Dec-21 at 19:09

It sounds like you are not executing the scenarios in parallel (as I would expect from k6 scenarios), but rather in sequence.

There isn't anything builtin in k6, but why not have a simple shell script which performs the following steps in order:

Source https://stackoverflow.com/questions/70430947

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install Vftrace

You can clone the current version of Vftrace from GitHub. The third-party tools are included in the git repository as submodules, for your convenience. If you already cloned the repository without the --recursive flag, you can get the submodules with.
Vftrace is written in C. For the Fortran interface, there is also some Fortran code. Vftrace is built using the standard autotools toolchain. For more information, consult the INSTALL file included in this repository.
GNU
Intel
NEC

Support

You are free to clone or fork this repository as you like. If you wish to make a contribution, please open a pull request. Consult the CODEOWNERS file for more information about contact persons for specific parts of the code. If you find a bug in Vftrace or have an idea for an improvement, please submit an issue on GitHub.

Find more information at: