valgrind | Version Control System library
Community Discussions
Trending Discussions on valgrind
QUESTION
I have implemented a Convolutional Neural Network in C and have been studying what parts of it have the longest latency.
Based on my research, the massive amount of matrix multiplication required by CNNs makes running them on CPUs and even GPUs very inefficient. However, when I actually profiled my code (on an unoptimized build) I found that something other than the multiplication itself was the bottleneck of the implementation.
After turning on optimization (-O3 -march=native -ffast-math, gcc cross compiler), the Gprof result was the following:
Clearly, the convolution2D function takes the largest amount of time to run, followed by the batch normalization and depthwise convolution functions.
The convolution function in question looks like this:
...ANSWER
Answered 2022-Mar-10 at 13:57: Looking at the Cachegrind output, it doesn't look like memory is your bottleneck. The NN has to be stored in memory anyway, and if it were so large that your program had a lot of L1 cache misses it would be worth trying to minimize them, but a 1.7% L1 (data) miss rate is not a problem.
So you're trying to make this run fast anyway. Looking at your code, what happens in the innermost loop is very simple (load -> multiply -> add -> store), and it has no side effect other than the final store. This kind of code is easily parallelizable, for example by multithreading or vectorizing. I think you'll know how to make this run in multiple threads, given that you can already write code of some complexity, and you asked in the comments how to manually vectorize the code.
I will explain that part, but one thing to bear in mind is that once you choose to manually vectorize the code, it will often be tied to certain CPU architectures. Let's not consider non-AMD64 compatible CPUs like ARM. Still, you have the option of MMX, SSE, AVX, and AVX512 to choose as an extension for vectorized computation, and each extension has multiple versions. If you want maximum portability, SSE2 is a reasonable choice. SSE2 appeared with Pentium 4, and it supports 128-bit vectors. For this post I'll use AVX2, which supports 128-bit and 256-bit vectors. It runs fine on your CPU, and has reasonable portability these days, supported from Haswell (2013) and Excavator (2015).
The pattern you're using in the inner loop is called FMA (fused multiply and add). AVX2 has an instruction for this. Have a look at this function and the compiled output.
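For illustration, here is a hedged sketch (hypothetical function and parameter names, not the answer's original example) of that inner-loop pattern written with AVX2/FMA intrinsics, processing 8 floats per iteration:

```cpp
#include <immintrin.h>
#include <cstddef>

// out[i] += a[i] * b[i] using 256-bit FMA; compile with -mavx2 -mfma.
void fma_accumulate(float* out, const float* a, const float* b, std::size_t n) {
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va   = _mm256_loadu_ps(a + i);    // load 8 floats from a
        __m256 vb   = _mm256_loadu_ps(b + i);    // load 8 floats from b
        __m256 vout = _mm256_loadu_ps(out + i);  // load the current accumulator
        vout = _mm256_fmadd_ps(va, vb, vout);    // vout = va * vb + vout in one instruction
        _mm256_storeu_ps(out + i, vout);         // store the result back
    }
    for (; i < n; ++i)                           // scalar tail for the leftover elements
        out[i] += a[i] * b[i];
}
```

The fused multiply-add itself belongs to the FMA3 extension that ships alongside AVX2 on Haswell and Excavator, hence the -mfma flag.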
QUESTION
I'm new to C and still don't really know how to work with Valgrind. I'm doing a project where I need to create a function that returns a line of text from a file descriptor each time it's called, using just one static variable.
Repeated calls (e.g., using a loop) to your get_next_line() function should let you read the text file pointed to by the file descriptor, one line at a time.
I have come up with this but I can't find where the memory leak is:
...ANSWER
Answered 2022-Feb-24 at 02:07: Well, I ran your code like this:
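Separately, as a hedged sketch of the contract described in the question (hypothetical code, not the code the asker or answerer posted), one common shape uses a single static buffer to carry bytes read past the newline over to the next call:

```cpp
#include <cstddef>
#include <string>
#include <unistd.h>

// Returns the next line from fd (including the trailing '\n' when present),
// or an empty string at end of file. The one static variable holds whatever
// was read past the newline so the next call can continue from there.
std::string get_next_line(int fd) {
    static std::string leftover;                 // the single static variable
    char buf[1024];

    // Read until the buffered data contains a newline or we hit EOF/error.
    while (leftover.find('\n') == std::string::npos) {
        ssize_t n = ::read(fd, buf, sizeof buf);
        if (n <= 0)
            break;
        leftover.append(buf, static_cast<std::size_t>(n));
    }

    std::string::size_type pos = leftover.find('\n');
    std::string line = (pos == std::string::npos)
                           ? leftover                  // last line, no trailing '\n'
                           : leftover.substr(0, pos + 1);
    leftover.erase(0, line.size());
    return line;
}
```

In the C version of this exercise, leaks usually come from the leftover buffer (or an intermediate joined string) not being freed on EOF or on a read error; std::string sidesteps that here.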
QUESTION
This code is missing a constructor initializer list:
...ANSWER
Answered 2022-Feb-10 at 13:48: You could add -Weffc++ to catch it (inspired by Scott Meyers' book "Effective C++"). Strangely enough, it does not refer to any other -W option (and neither does clang++).
The option is, however, considered by some to be a bit outdated by now, but in this case it finds a real problem.
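As a small hypothetical illustration (not the code from the question), this is the kind of constructor -Weffc++ complains about:

```cpp
// Compile with: g++ -Weffc++ -c widget.cpp
// g++ warns: 'Widget::m_name' should be initialized in the member
// initialization list [-Weffc++]
#include <string>

class Widget {
public:
    explicit Widget(int id) : m_id(id) {}  // m_name is missing from the initializer list
private:
    int m_id;
    std::string m_name;
};
```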
QUESTION
I've been trying to fix a problem in my code for several days but I'm still stuck on it. I want to insert a value into an array (tab) using realloc, but I have a memory leak (or something else) and I don't know why.
Here's my code:
...ANSWER
Answered 2022-Feb-05 at 09:26: Instead of
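As a general illustration of the usual realloc pitfall and its fix (hypothetical code, not the snippet from this question or answer):

```cpp
#include <cstdio>
#include <cstdlib>

int main() {
    std::size_t count = 4;
    int* tab = static_cast<int*>(std::malloc(count * sizeof *tab));
    if (!tab)
        return 1;

    // Risky pattern: tab = realloc(tab, ...). If realloc fails it returns NULL,
    // and overwriting 'tab' with that NULL leaks the original block.

    // Safer: keep the old pointer until the new one is known to be valid.
    int* grown = static_cast<int*>(std::realloc(tab, (count + 1) * sizeof *tab));
    if (!grown) {
        std::free(tab);      // the original block is still valid and must be freed
        return 1;
    }
    tab = grown;
    tab[count] = 42;         // the newly inserted value
    ++count;

    std::free(tab);
    return 0;
}
```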
QUESTION
The following code compiles with both GNU gfortran and Intel ifort, but only the gfortran-compiled version will run successfully.
...ANSWER
Answered 2022-Jan-31 at 17:50: The error is issued because the compiler claims that the pointer being deallocated was not allocated by an allocate statement.
The rules are (F2018):
9.7.3.3 Deallocation of pointer targets
1 If a pointer appears in a DEALLOCATE statement, its association status shall be defined. Deallocating a pointer that is disassociated or whose target was not created by an ALLOCATE statement causes an error condition in the DEALLOCATE statement. If a pointer is associated with an allocatable entity, the pointer shall not be deallocated. A pointer shall not be deallocated if its target or any subobject thereof is argument associated with a dummy argument or construct associated with an associate name.
Your pointer b was associated using the c_f_pointer subroutine. The error condition mentioned is the
QUESTION
My first question is whether the object A(v) that was added to the map should be deleted automatically when it goes out of scope.
My second question is what happens to the object added to the map when the program exits. I believe that when I do a_[name] = A(v);, a copy is stored in the map. Also, do I need to provide a copy constructor?
...ANSWER
Answered 2022-Jan-08 at 22:54: "still reachable" does not strictly mean a memory leak. I believe it is because you called exit(0) instead of just returning 0: exit() terminates the program without unwinding main's stack, so the map's destructor never runs and its memory is still referenced at process exit.
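A minimal hypothetical sketch of that effect (the class, member, and sizes here are made up, loosely following the a_[name] = A(v) line from the question):

```cpp
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

struct A {
    A() = default;
    explicit A(std::vector<int> v) : data(std::move(v)) {}
    std::vector<int> data;                       // heap allocation owned by the map entry
};

int main() {
    std::map<std::string, A> a;
    a["first"] = A(std::vector<int>(1000, 7));   // a copy/move of the object is stored in the map

    // exit() terminates the process without destroying main's local variables,
    // so the map (and the vector buffers it owns) shows up in Valgrind as
    // "still reachable" rather than as a definite leak.
    std::exit(0);
    // return 0;  // returning normally destroys 'a' and Valgrind reports nothing
}
```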
QUESTION
Many posts say that using a small object like a lambda expression can avoid heap allocation when using std::function, but my experiment does not show that. This is my experiment code, very simple:
...ANSWER
Answered 2022-Jan-05 at 09:28: Older versions of libstdc++, like the one shipped with gcc 4.8.5, seem to optimise only function pointers so that they do not allocate (as seen here). Since that std::function implementation does not have the small-object optimisation that you want, you will have to use an alternative implementation: either upgrade your compiler or use boost::function, which is essentially the same as std::function.
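One hedged way to see what a given standard library does (hypothetical test code) is to count heap allocations while the std::function is constructed; a reasonably recent libstdc++ or libc++ should typically report zero for a capture this small:

```cpp
#include <cstdio>
#include <cstdlib>
#include <functional>
#include <new>

static int g_heap_allocs = 0;

// Replace global operator new/delete so every heap allocation is counted.
void* operator new(std::size_t n) {
    ++g_heap_allocs;
    if (void* p = std::malloc(n))
        return p;
    throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

int main() {
    int x = 42;
    int before = g_heap_allocs;
    std::function<int()> f = [x] { return x; };   // capture is a single int
    std::printf("heap allocations: %d\n", g_heap_allocs - before);
    return f() == 42 ? 0 : 1;
}
```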
QUESTION
Edited to include MWE (removing example-lite) and added details about compilation and Valgrind output.
I am using the mutable keyword to achieve the result of lazy evaluation and caching a result. This works fine for a single object, but doesn't seem to work as expected for a collection.
My case is more complex, but let's say I have a triangle class that can calculate the area of a triangle and cache the result. I use pointers in my case because the thing being lazily evaluated is a more complex class (it is actually another instance of the same class, but I'm trying to simplify this example).
I have another class that is essentially a collection of triangles. It has a way to calculate the total area of all the contained triangles.
Logically, tri::Area() is const -- and mesh::Area() is const. When implemented as above, Valgrind shows a memory leak (m_Area).
I believe since I am using a const_iterator, the call to tri::Area() is acting on a copy of the triangle. Area() is called on that copy, which does the new, calculates the area, and returns the result. At that point, the copy is lost and the memory is leaked.
In addition, I believe this means the area is not actually cached. The next time I call Area(), it leaks more memory and does the calculation again. Obviously, this is non-ideal.
One solution would be to make mesh::Area() non-const. This isn't great because it needs to be called from other const methods.
I think this might work (mark m_Triangles as mutable and use a regular iterator):
However, I don't love marking m_Triangles as mutable -- I'd prefer to keep the compiler's ability to protect the constness of m_Triangles in other unrelated methods. So, I'm tempted to use const_cast to localize the ugly to just the method that needs it. Something like this (mistakes likely):
Not sure how to implement with const_cast -- should I be casting m_Triangles or this? If I cast this, is m_Triangles visible (since it is private)?
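For concreteness, a hedged sketch of that const_cast variant (hypothetical code; member names borrowed from the question, the cache simplified to a plain double rather than a pointer, and tri::Area() assumed non-const so it can fill the cache):

```cpp
#include <vector>

struct tri {
    double base = 0.0, height = 0.0;
    double m_Area = -1.0;                    // negative means "not yet computed"
    double Area() {                          // non-const: fills the cache on first use
        if (m_Area < 0.0)
            m_Area = 0.5 * base * height;
        return m_Area;
    }
};

class mesh {
public:
    double Area() const {
        // Cast away const on the one member that needs to change; the rest of
        // the method still sees 'this' (and every other member) as const.
        auto& tris = const_cast<std::vector<tri>&>(m_Triangles);
        double total = 0.0;
        for (auto& t : tris)
            total += t.Area();               // may compute and store each cached m_Area
        return total;
    }
private:
    std::vector<tri> m_Triangles;
};
```

Casting this instead (const_cast<mesh*>(this)->m_Triangles) also compiles, since the cast happens inside a member function where m_Triangles is accessible; casting just the member keeps the opt-out narrower.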
Is there some other way that I'm missing?
The effect I want is to keep mesh::Area() marked const, but have calling it cause all the tris calculate and cache their m_Area. While we're at it -- no memory leaks and Valgrind is happy.
I've found plenty of examples of using mutable in an object -- but nothing about using that object in a collection from another object. Links to a blog post or tutorial article on this would be great.
Thanks for any help.
Update: From this MWE, it looks like I was wrong about the point of the leak.
The code below is Valgrind-clean if the call to SplitIndx() is removed.
In addition, I added a simple test to confirm that the cached value is getting stored and updated in the container-stored objects.
It now appears that the call m_Triangles[indx] = t1; is where the leak occurs. How should I plug this leak?
ANSWER
Answered 2021-Dec-24 at 01:18: One way to avoid making it mutable is to make it always point at the data cache, which could be a std::optional. You'd then create and store a std::unique_ptr to that std::optional, which you keep for the tri object's lifetime.
Example:
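A minimal sketch along those lines (hypothetical member names; the cached value is assumed to be a double):

```cpp
#include <memory>
#include <optional>

class tri {
public:
    tri(double base, double height)
        : m_Base(base), m_Height(height),
          m_Area(std::make_unique<std::optional<double>>()) {}

    double Area() const {
        // The unique_ptr member is const here, but the optional it points to
        // is not, so a const method can still fill in the cache.
        if (!m_Area->has_value())
            *m_Area = 0.5 * m_Base * m_Height;
        return **m_Area;
    }

private:
    double m_Base;
    double m_Height;
    // Created once at construction and owned for the object's lifetime;
    // unique_ptr frees it automatically, which keeps Valgrind happy.
    std::unique_ptr<std::optional<double>> m_Area;
};
```

Note that the unique_ptr member makes tri move-only; storing it in a container that copies elements (or assigning m_Triangles[indx] = t1;) would then need either a user-written copy constructor that clones the cache or a switch to moves.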
QUESTION
I'm learning about std::allocator. I tried to allocate and then used deallocate incorrectly; I saw that it didn't seem to use the size argument, and I'm confused about the method. Could you please explain it to me? Thanks.
- testcase1 "test": I didn't deallocate; Valgrind detected the leak (correct).
- testcase2 "test_deallocate": I deallocate with a size (0) less than the actual size (400); Valgrind and -fsanitize=address can't detect a leak.
- testcase3 "test_deallocate2": I deallocate with a size (10000) greater than the actual size (400); the compiler didn't warn, and g++ with -fsanitize=address also can't detect this.
ANSWER
Answered 2021-Dec-28 at 10:42
- testcase2 "test_deallocate": I deallocate with a size (0) less than the actual size (400); Valgrind and -fsanitize=address can't detect a leak.
When you deallocate with the wrong size, the behaviour of the program is undefined. When the behaviour is undefined, there is no guarantee that memory will be leaked.
- testcase3 "test_deallocate2": I deallocate with a size (10000) greater than the actual size (400); the compiler didn't warn, and g++ with -fsanitize=address also can't detect this.
Ditto.
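For reference, a hedged sketch of matched allocate/deallocate usage (hypothetical code): the size passed to deallocate must be the same value that was passed to allocate, and nothing is required to diagnose a mismatch.

```cpp
#include <cstddef>
#include <memory>

int main() {
    std::allocator<int> alloc;
    constexpr std::size_t n = 100;               // 100 ints, i.e. 400 bytes here

    int* p = alloc.allocate(n);                  // raw, uninitialized storage

    for (std::size_t i = 0; i < n; ++i)          // construct the objects in place
        std::allocator_traits<std::allocator<int>>::construct(alloc, p + i, int(i));

    for (std::size_t i = 0; i < n; ++i)          // destroy them before releasing storage
        std::allocator_traits<std::allocator<int>>::destroy(alloc, p + i);

    // Must be the same n as in allocate(); passing 0 or 10000 instead is
    // undefined behaviour, which is why neither Valgrind nor ASan is
    // obliged to report anything in those cases.
    alloc.deallocate(p, n);
}
```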
QUESTION
I have two structs, Holder and Held. Holder holds a reference to Held. Held holds an i32:
ANSWER
Answered 2021-Dec-21 at 03:58: Copying from my reply above:
The basic thing here is that once you modify heldvals again, holders is completely invalidated. So if you populate heldvals completely, and then iterate through it to populate holders, then you're OK. But once you change heldvals again, holders is invalidated.
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.