Popular New Releases in Performance Testing
- lighthouse
- locust 2.8.5
- vegeta v12.8.4
- fasthttp v1.35.0
- hyperfine v1.13.0
Popular Libraries in Performance Testing
- lighthouse by GoogleChrome (JavaScript, 24437 stars, Apache-2.0): Automated auditing, performance metrics, and best practices for the web.
- locust by locustio (Python, 18579 stars, MIT): Scalable user load testing tool written in Python.
- vegeta by tsenart (Go, 18195 stars, MIT): HTTP load testing tool and library. It's over 9000!
- fasthttp by valyala (Go, 17467 stars, MIT): Fast HTTP package for Go. Tuned for high performance. Zero memory allocations in hot paths. Up to 10x faster than net/http.
- hyperfine by sharkdp (Rust, 10759 stars, NOASSERTION): A command-line benchmarking tool.
- quicklink by GoogleChromeLabs (HTML, 8923 stars, Apache-2.0): ⚡️ Faster subsequent page-loads by prefetching in-viewport links during idle time.
- ants by panjf2000 (Go, 7676 stars, MIT): 🐜🐜🐜 ants is a high-performance and low-cost goroutine pool in Go, inspired by fasthttp.
- puma by puma (Ruby, 7029 stars, BSD-3-Clause): A Ruby/Rack web server built for parallelism.
- benchmark by google (C++, 6338 stars, Apache-2.0): A microbenchmark support library.
Trending New libraries in Performance Testing
- tracy by wolfpld (C++, 3691 stars, NOASSERTION): C++ frame profiler.
- high-performance-go by geektutu (Go, 2203 stars, MIT): High-performance coding with Golang (Go high-performance programming; Go pitfalls, gotchas, and traps).
- column by kelindar (Go, 855 stars, MIT): High-performance, columnar, in-memory store with bitmap indexing in Go.
- fuzzbench by google (Python, 778 stars, Apache-2.0): FuzzBench - fuzzer benchmarking as a service.
- PhotorealismEnhancement by intel-isl (HTML, 739 stars): Code & data for Enhancing Photorealism Enhancement.
- swift-benchmark by google (Swift, 705 stars, Apache-2.0): A Swift library to benchmark code snippets.
- pond by alitto (Go, 486 stars, MIT): 🔘 Minimalistic and high-performance goroutine worker pool written in Go.
- goose by tag1consulting (Rust, 332 stars, Apache-2.0): Load testing framework, inspired by Locust.
- by metersphere (JavaScript, 320 stars, GPL-3.0): MeterSphere browser plugin that records network requests in the browser and exports them as JMeter or JSON format files.
Trending Kits in Performance Testing
Performance testing, or load testing as it is sometimes called, evaluates a system's behavior under high load. Load testing is the process of determining how an application behaves when subjected to loads ranging from low levels up to high-level stress tests. The main goal of performance testing is to find and fix issues before they happen in production, preventing performance problems and possible downtime. For C# applications, there are various performance testing libraries that can verify the application's quality under heavy load and validate its overall stability. This list gives the best of them: xunit-performance - provides extensions over xUnit to author performance tests; WebApiBenchmark - web API management and performance testing tools; SqlProxyAndReplay - performance testing for services that depend upon a SQL database.
PHP is a widely used general-purpose scripting language that is especially suited for web development and can be embedded into HTML. Performance testing is an important and integral part of any software development project: the application is tested under real-life conditions that imitate the production environment, to determine how it behaves when subjected to different loads. The main goal of this type of testing is to find and fix issues before they happen in production, preventing performance problems and possible downtime. If you are looking for PHP performance testing open-source libraries, this list will show you how to get started with some of the most popular ones: Phoronix Test Suite - open-source, cross-platform automated testing; doctrine-test-bundle - Symfony bundle to isolate your app's Doctrine database tests; daytona - application-agnostic framework for automated performance testing.
Testing applications for performance issues, known as performance testing, is essential to project success. A performance test replicates real-life conditions by simulating production loads on the application to see how it behaves and responds. Performance testing reveals potential issues that could cause performance problems or downtime in production environments, so these issues can be addressed before they impact end users. There are many tools you can use for benchmarking in C++. Some are open-source and available for free, while others are proprietary tools that must be purchased to access their full functionality. Here is a list of the six best C++ performance testing libraries in 2022, including sltbench - a C++ benchmark tool; StronglyUniversalStringHashing - a benchmark showing that we can randomly hash strings quickly.
Performance testing, a type of software testing, is essential to project success. A performance test replicates real-life conditions by simulating production loads on the application to see how it behaves and responds. Performance testing reveals potential issues that could cause performance problems or downtime in production environments, so these issues can be addressed before they impact end users. Performance testing libraries are a big part of the Python ecosystem. The following are some of the best open-source Python performance testing libraries: multi-mechanize - performance test framework in Python; pyperform - an easy and convenient way to performance-test Python code; mobileperf - Android performance testing.
Benchmarking a piece of code is no easy task: you need to make sure that your code runs efficiently and that there are no bottlenecks in the process. Performance testing simulates the expected behavior of applications under real-life production conditions, to identify and fix issues before they affect users. It is the process of replicating production usage in a controlled environment and monitoring the application's behavior and performance under a variety of conditions. In the following kit we cover the 8 best Ruby performance testing libraries for 2022, like em-proxy - EventMachine proxy DSL for writing high-performance transparent proxies; rspec-benchmark - performance testing matchers for RSpec; logstash-performance-testing - Logstash performance testing tool.
Performance testing is a critical component of any software development process. It is used to evaluate the performance of an application under real-world conditions, mimicking how it will operate when deployed in production. Its primary goal is to identify problems that may occur in live applications, thereby preventing issues and downtime. It can be difficult to accurately test performance across multiple devices and users. You can use these Go performance testing open-source libraries to help with this process: ddosify - high-performance load testing tool, written in Golang; goose - Goose database migration tool; Fortio - load testing library and command line tool.
Performance testing is a key part of any software development project. It checks how the application performs under heavy load in real-life conditions, emulating the production environment to determine how the application behaves under different loads. The goal is to identify potential issues before they occur in production, preventing downtime and performance problems during normal operations. Check out the best JavaScript performance testing libraries in this kit: boomerang - end-user-oriented web performance testing and beaconing; grunt-ava - run AVA tests; karma-ava - Karma plugin for AVA.
Performance testing is a vital part of any software development project. It is a process in which the application is tested under real-life conditions that imitate the production environment, to see how it behaves when subjected to different loads. The main goal of this type of testing is to find and fix issues before they happen in production, ensuring smooth operation and preventing possible downtime. You can use the following 10 best Java performance testing open-source libraries to help with this process: MyPerf4J - high-performance Java APM; java-json-benchmark - performance testing of serialization and deserialization with Java JSON libraries; quickperf - testing library for Java to quickly evaluate performance-related properties.
Trending Discussions on Performance Testing
What's the fastest way to select all rows in one pandas dataframe that do not exist in another?
Crashing when calling QTcpSocket::setSocketDescriptor()
How to do performance testing with Keycloak authentication in JMeter
How to "PERFORM" a CTE query returning multiple rows/columns?
Difference between stress test and breakpoint test
Performance testing for Svelte applications
Adding keyStore and trustStore to Gatling requests
Buildspec for Spring boot application
How to find max thread active for scatter-gather in mule
Automation vs performance testing
QUESTION
What's the fastest way to select all rows in one pandas dataframe that do not exist in another?
Asked 2022-Mar-21 at 23:14
Beginning with two pandas dataframes of different shapes, what is the fastest way to select all rows in one dataframe that do not exist in the other (or drop all rows in one dataframe that already exist in the other)? And are the fastest methods different for string-valued columns vs. numeric columns? The operation should be roughly equivalent to the code below:
import pandas as pd

string_df1 = pd.DataFrame({'latin':['a', 'b', 'c'],
                           'greek':['alpha', 'beta', 'gamma']})
string_df2 = pd.DataFrame({'latin':['z', 'c'],
                           'greek':['omega', 'gamma']})

numeric_df1 = pd.DataFrame({'A':[1, 2, 3],
                            'B':[1.01, 2.02, 3.03]})
numeric_df2 = pd.DataFrame({'A':[3, 9],
                            'B':[3.03, 9.09]})

def index_matching_rows(df1, df2, cols_to_match=None):
    '''
    return index of subset of rows of df1 that are equal to at least one row in df2
    '''
    if cols_to_match is None:
        cols_to_match = df1.columns

    df1 = df1.reset_index()
    m = df1.merge(df2, on=cols_to_match[0], suffixes=('1','2'))

    query = '&'.join(['{0}1 == {0}2'.format(str(c)) for c in cols_to_match[1:]])

    m = m.query(query)

    return m['index']

print(string_df2.drop(index_matching_rows(string_df2, string_df1)))
print(numeric_df2.drop(index_matching_rows(numeric_df2, numeric_df1)))
Output:
  latin  greek
0     z  omega

   A     B
1  9  9.09
Some naive performance testing:
copies = 10
big_sdf1 = pd.concat([string_df1, string_df1]*copies)
big_sdf2 = pd.concat([string_df2, string_df2]*copies)
big_ndf1 = pd.concat([numeric_df1, numeric_df1]*copies)
big_ndf2 = pd.concat([numeric_df2, numeric_df2]*copies)

%%timeit
big_sdf2.drop(index_matching_rows(big_sdf2, big_sdf1))
# copies = 10: 2.61 ms ± 27.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# copies = 20: 4.44 ms ± 43.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# copies = 30: 18.4 ms ± 132 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# copies = 40: 74.6 ms ± 453 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# copies = 100: 19.2 s ± 112 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
big_ndf2.drop(index_matching_rows(big_ndf2, big_ndf1))
# copies = 10: 2.56 ms ± 29.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# copies = 20: 4.38 ms ± 75.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# copies = 30: 18.3 ms ± 194 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# copies = 40: 76.5 ms ± 1.76 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
This code runs about as quickly for strings as for numeric data, and I think it's exponential in the length of the dataframe (the curve above is 1.6*exp(0.094x), fit to the string data). I'm working with dataframes on the order of 1e5 rows, so this is not a solution for me.
Here's the same performance check for Raymond Kwok's (accepted) answer below in case anyone can beat it later. It's O(n).
%%timeit
big_sdf1_tuples = big_sdf1.apply(tuple, axis=1)
big_sdf2_tuples = big_sdf2.apply(tuple, axis=1)
big_sdf2_tuples.isin(big_sdf1_tuples)
# copies = 100: 4.82 ms ± 22 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# copies = 1000: 44.6 ms ± 386 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# copies = 1e4: 450 ms ± 9.44 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# copies = 1e5: 4.42 s ± 27.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
big_ndf1_tuples = big_ndf1.apply(tuple, axis=1)
big_ndf2_tuples = big_ndf2.apply(tuple, axis=1)
big_ndf2_tuples.isin(big_ndf1_tuples)
# copies = 100: 4.98 ms ± 28.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# copies = 1000: 47 ms ± 288 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# copies = 1e4: 461 ms ± 4.41 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# copies = 1e5: 4.58 s ± 30.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Indexing into the longest dataframe with
big_sdf2_tuples.loc[~big_sdf2_tuples.isin(big_sdf1_tuples)]
to recover the equivalent of the output in my code above adds about 10 ms.
ANSWER
Answered 2022-Feb-26 at 13:15
Beginning with two dataframes:
df1 = pd.DataFrame({'Runner': ['A', 'A', 'A', 'A'],
                    'Day': ['1', '3', '8', '9'],
                    'Miles': ['3', '4', '4', '2']})

df2 = df1.copy().drop([1, 3])
where the second has two fewer rows.
We can hash the rows:
df1_hashed = df1.apply(tuple, axis=1).apply(hash)
df2_hashed = df2.apply(tuple, axis=1).apply(hash)
and believe, as many people will, that two different rows are very unlikely to get the same hash value, and then get the rows from df1 that do not exist in df2:
df1[~df1_hashed.isin(df2_hashed)]

  Runner Day Miles
1      A   3     4
3      A   9     2
As for the speed difference between strings and integers, I am sure you can test it with your real data.
Note 1: you may actually remove .apply(hash) from both lines.
Note 2: check out the answer to this question for more on isin and the use of hash.
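Putting the accepted approach together as one self-contained snippet (a consolidation of the code above, with Note 1 applied so the hashing is left to isin itself):

import pandas as pd

df1 = pd.DataFrame({'Runner': ['A', 'A', 'A', 'A'],
                    'Day': ['1', '3', '8', '9'],
                    'Miles': ['3', '4', '4', '2']})
df2 = df1.copy().drop([1, 3])

# Collapse each row to a hashable tuple; isin() then does a set-style
# membership test per row, which is what makes this approach O(n)
df1_tuples = df1.apply(tuple, axis=1)
df2_tuples = df2.apply(tuple, axis=1)

# Anti-join: keep only the rows of df1 whose tuple does not appear in df2
print(df1[~df1_tuples.isin(df2_tuples)])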
QUESTION
Crashing when calling QTcpSocket::setSocketDescriptor()
Asked 2022-Mar-09 at 08:58
My project uses QTcpSocket and the function setSocketDescriptor(). The code is very ordinary:
QTcpSocket *socket = new QTcpSocket();
socket->setSocketDescriptor(this->m_socketDescriptor);
This code worked fine most of the time, until I ran performance testing on Windows Server 2016 and the crash occurred. Debugging with the crash dump, here is the log:
0000004f`ad1ff4e0 : ucrtbase!abort+0x4e
00000000`6ed19790 : Qt5Core!qt_logging_to_console+0x15a
000001b7`79015508 : Qt5Core!QMessageLogger::fatal+0x6d
0000004f`ad1ff0f0 : Qt5Core!QEventDispatcherWin32::installMessageHook+0xc0
00000000`00000000 : Qt5Core!QEventDispatcherWin32::createInternalHwnd+0xf3
000001b7`785b0000 : Qt5Core!QEventDispatcherWin32::registerSocketNotifier+0x13e
000001b7`7ad57580 : Qt5Core!QSocketNotifier::QSocketNotifier+0xf9
00000000`00000001 : Qt5Network!QLocalSocket::socketDescriptor+0x4cf7
00000000`00000000 : Qt5Network!QAbstractSocket::setSocketDescriptor+0x256
In the stderr log, I see these messages:
CreateWindow() for QEventDispatcherWin32 internal window failed (Not enough storage is available to process this command.)
Qt: INTERNAL ERROR: failed to install GetMessage hook: 8, Not enough storage is available to process this command.
Here is the function in the Qt codebase where execution stopped:
void QEventDispatcherWin32::installMessageHook()
{
    Q_D(QEventDispatcherWin32);

    if (d->getMessageHook)
        return;

    // setup GetMessage hook needed to drive our posted events
    d->getMessageHook = SetWindowsHookEx(WH_GETMESSAGE, (HOOKPROC) qt_GetMessageHook, NULL, GetCurrentThreadId());
    if (Q_UNLIKELY(!d->getMessageHook)) {
        int errorCode = GetLastError();
        qFatal("Qt: INTERNAL ERROR: failed to install GetMessage hook: %d, %s",
               errorCode, qPrintable(qt_error_string(errorCode)));
    }
}
I did some research; the error "Not enough storage is available to process this command." may mean that the OS (Windows) did not have enough resources to process this function (SetWindowsHookEx) and failed to create the hook, after which Qt fires a fatal signal and my app is killed.
I tested this on Windows Server 2019 and the app works fine; no crashes appear.
I just want to know more about the meaning of the error message (stderr), because I don't really know what "Not enough storage" refers to. Is it perhaps a limit or bug of Windows Server 2016? If yes, is there any way to overcome this issue on Windows Server 2016?
ANSWER
Answered 2022-Mar-07 at 09:23
The error "Not enough storage is available to process this command" usually occurs on Windows servers when a registry value is set incorrectly, or when the configuration has not been set correctly after a recent reset or reinstallation.
Below is a verified procedure for this issue:
- Click on Start > Run > type regedit and press Enter
- Find the key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\LanmanServer\Parameters
- Locate IRPStackSize
- If this value does not exist, right-click on the Parameters key and click New > DWORD Value, then type in IRPStackSize as the name.
- The name of the value must be exactly the same (the same combination of uppercase and lowercase letters) as above.
- Right-click on IRPStackSize and click Modify
- Select Decimal, enter a value higher than 15 (the maximum value is 50 decimal), and click OK
- Close the registry editor and restart your computer.
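If you prefer to script that registry change rather than click through regedit, a minimal sketch with Python's standard winreg module could look like this (run as Administrator; the value 32 is an arbitrary choice within the 16-50 range mentioned above, and a restart is still required afterwards):

import winreg

# LanmanServer parameters key from the steps above
KEY_PATH = r"SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters"

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                    winreg.KEY_SET_VALUE) as key:
    # IRPStackSize must be a REG_DWORD; 32 is an arbitrary value in the 16-50 range
    winreg.SetValueEx(key, "IRPStackSize", 0, winreg.REG_DWORD, 32)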
QUESTION
How to do performance testing with Keycloak authentication in JMeter
Asked 2022-Feb-09 at 10:57
I am trying to do performance testing, but the particular site has Keycloak authentication enabled. Any suggestions?
ANSWER
Answered 2022-Feb-09 at 10:57
It's just a matter of correlation. The exact steps will be different for OpenID Connect and SAML, so you need to determine which client protocol is being used in that particular Keycloak instance.
In order to get help with correlating the individual parameters, you will need to provide the previous response and the format of the next request, so we can come up with the relevant instructions for setting up JMeter Post-Processors.
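If the instance turns out to use OpenID Connect, it can help to reproduce the token request outside JMeter first, so you can see which parameters need correlating. Here is a hedged sketch using Python's requests library; the host, realm, client and credentials are all hypothetical placeholders (and older Keycloak versions prefix the path with /auth):

import requests

# Hypothetical Keycloak details; substitute your own realm, client and user
TOKEN_URL = "https://keycloak.example.com/realms/myrealm/protocol/openid-connect/token"

resp = requests.post(TOKEN_URL, data={
    "grant_type": "password",           # Resource Owner Password Credentials flow
    "client_id": "my-load-test-client",
    "username": "testuser",
    "password": "testpass",
})
resp.raise_for_status()
token = resp.json()["access_token"]

# Subsequent requests carry the token in an Authorization header
headers = {"Authorization": f"Bearer {token}"}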
QUESTION
How to "PERFORM" a CTE query returning multiple rows/columns?
Asked 2022-Feb-08 at 21:52
perform (with test_as_cte as(select * from myTable) select * from test_as_cte);
But I get the following error:
SQL Error [42601]: ERROR: subquery must return only one column
Where: PL/pgSQL function inline_code_block line 9 at PERFORM
If I replace * with myCol in the above code there is no error. However, I need to do realistic performance testing with the CTE and return multiple columns.
ANSWER
Answered 2022-Feb-08 at 21:52
The WITH query enclosed in parentheses is treated like a sub-select. It works fine the way you have it as long as it returns a single value (one column of one row). Otherwise you must treat it as a subquery and call it like this (inside a PL/pgSQL code block!):
PERFORM * FROM (with test_as_cte as (select * from b2) select * from test_as_cte t) sub;
Or just:
PERFORM FROM (<any SELECT query>) sub;
Quoting the documentation:
PERFORM query;
This executes query and discards the result. Write the query the same way you would write an SQL SELECT command, but replace the initial keyword SELECT with PERFORM. For WITH queries, use PERFORM and then place the query in parentheses. (In this case, the query can only return one row.)
I think this could be clearer. I'll suggest a fix for the documentation.
QUESTION
Difference between stress test and breakpoint test
Asked 2022-Jan-13 at 05:05
I was looking for the verbal explanations of different performance testing types and saw a new one called "breakpoint test". Its explanation seemed very similar to stress testing to me. So what is the difference, or is there any difference?
Stress Test: a verification of system performance during extremely high load, way above the peak load
Breakpoint Test: This test determines the point of system failure by gradually increasing the number of simulated concurrent users.
As far as I know, we increase the load gradually while performing a stress test too. So what is the difference between these two types?
ANSWER
Answered 2021-Oct-26 at 12:12
From the workload point of view the approach is exactly the same; my understanding is:
- Stress test is about finding the first bottleneck, it's normally applied before deployment or even at early stages of development (see shift-left concept)
- Breakpoint (sometimes also called capacity) test is about checking how much load the overall integrated environment can handle without issues, and which is the slowest component that becomes the subject for scaling up or optimization (see the sketch after this list)
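As an illustration of the "gradually increasing" part, here is a minimal sketch of a breakpoint-style ramp using Locust's LoadTestShape (Locust appears in the library list at the top of this page; the endpoint and step sizes are arbitrary assumptions):

from locust import HttpUser, LoadTestShape, task

class WebsiteUser(HttpUser):
    @task
    def index(self):
        # Arbitrary endpoint under test
        self.client.get("/")

class BreakpointShape(LoadTestShape):
    # Ramp the user count up step by step until the system fails
    def tick(self):
        run_time = self.get_run_time()
        # Start at 10 users and add 10 more every 30 seconds
        users = 10 + int(run_time // 30) * 10
        return (users, 10)  # (target user count, spawn rate per second)

Run it with locust --host pointing at the system under test, and stop once the failure point is reached.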
QUESTION
Performance testing for Svelte applications
Asked 2021-Dec-08 at 10:43
How can I do client-side performance testing for Svelte applications?
Is there anything similar in JMeter or WebLoad to the TruClient protocol in LoadRunner for testing Svelte applications?
ANSWER
Answered 2021-Dec-08 at 10:43
The TruClient protocol is just a real browser (maybe headless), so it is not really a "protocol".
- In order to test client-side performance you need a real browser
- The most popular browser automation tool is Selenium (see the sketch after this list)
- JMeter can be integrated with Selenium using the WebDriver Sampler
- The WebDriver Sampler can be installed using the JMeter Plugins Manager
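As a minimal illustration of driving a real browser to get client-side timing, here is a hedged sketch using Selenium's Python bindings directly (the URL is a placeholder; the load time is read from the browser's Navigation Timing data via JavaScript, and inside JMeter the same idea would live in a WebDriver Sampler script instead):

from selenium import webdriver

driver = webdriver.Chrome()  # any Selenium-supported browser driver works
try:
    driver.get("https://your-svelte-app.example.com")  # hypothetical app URL
    # Time from navigation start to the load event, in milliseconds
    load_ms = driver.execute_script(
        "return performance.timing.loadEventEnd - performance.timing.navigationStart;"
    )
    print(f"Page load time: {load_ms} ms")
finally:
    driver.quit()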
QUESTION
Adding keyStore and trustStore to Gatling requests
Asked 2021-Dec-08 at 10:07
I've been doing performance testing using Gatling and now need to find a way to add a keyStore and trustStore to my requests. I was wondering if that is something I can add in the HttpProtocolBuilder, or whether I have to take a slightly different approach.
Basically, is there a Gatling equivalent to the following RestAssured commands?
given()
    .keyStore("src/test/resources/fooKeyStore.jks", "fooPassword")
    .trustStore("src/test/resources/fooTrustStore.jks", "fooPassword")
ANSWER
Answered 2021-Dec-08 at 10:07
"Was wondering if that is something I can add in the HttpProtocolBuilder"
There, you can only use perUserKeyManagerFactory to create a distinct KeyManager for each virtual user. This should be the way to go only if you want to provide different keystores for each virtual user.
You can define the default keystore in gatling.conf.
I recommend you keep the default trusty TrustManager: you want to run a load test, not secure an application.
QUESTION
Buildspec for Spring boot application
Asked 2021-Dec-01 at 11:16
So I am stuck on one thing. I have an application written in Spring Boot and located on Fargate behind an Application Load Balancer. I need to run load/performance testing on that app and create a buildspec file. I made one, but I am completely new to building pipelines for that service, so I don't know how to write the buildspec file. Could someone please help me and point me in the right direction for writing the buildspec for that service?
QUESTION
How to find max thread active for scatter-gather in mule
Asked 2021-Nov-30 at 14:35I am using mule version 3.9.
My application uses an HTTP listener component as a message source and uses Scatter-Gather to collect data from a database.
During performance testing, "Threadpoolexecutor did not accept within 30000 MILLISECONDS (java.util.concurrent.RejectedExecutionException)" is thrown. I think this is due to there not being enough threads available for the Scatter-Gather component.
As per the Mule docs, maxThreadsActive for Scatter-Gather is the number of routes in the Scatter-Gather * maxThreadsActive for the flow. Now, how do I find maxThreadsActive for the flow? No processing strategy is defined for the flow. Please help.
ANSWER
Answered 2021-Nov-30 at 14:35
Read the default threading profile documentation:
maxThreadsActive: the maximum number of threads to use.
Type: integer
Required: no
Default: 16
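With that default and no processing strategy defined, the formula from the question gives, for example, a Scatter-Gather with three routes up to 3 × 16 = 48 active threads.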
Having said that, I recommend that instead of increasing the number of threads, you first analyze a thread dump to understand what the existing threads are doing when they are all busy. Probably you have some operation that is taking more time than expected (the database query?). Increasing the number of threads should be evaluated with performance tests to understand the improvement and the cost: threads are not free, they use memory and CPU.
QUESTION
Automation vs performance testing
Asked 2021-Nov-22 at 06:18
I am a beginner in performance testing and I would like to ask: is it possible to transform automation testing into performance testing?
For example, I have code automating the login scenario for X users. Would it be good practice to use the statistics from the code run to represent it as a performance diagram?
ANSWER
Answered 2021-Nov-22 at 06:18
To a certain extent, yes: you will get response times and maybe some relationship between the number of users and response time. However, there are some constraints as well:
- Most probably you won't get all the metrics and KPIs you can get with protocol-level tools
- Browsers are very resource-intensive; for example, Firefox 94 system requirements are at least 1 CPU core and 2 GB of RAM per browser instance
So rather than re-using existing automation tests for checking performance, I would think about converting them into a performance test script to get better results and a smaller resource footprint (see the sketch below).
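For example, a login scenario like the one described could be rewritten as a Locust script roughly like this (a hedged sketch; the /login endpoint, credentials and file name are hypothetical, and Locust is one of the tools listed at the top of this page):

from locust import HttpUser, between, task

class LoginUser(HttpUser):
    wait_time = between(1, 3)  # think time between iterations

    @task
    def login(self):
        # Hypothetical endpoint and payload; adapt to the real application
        self.client.post("/login", json={
            "username": "testuser",
            "password": "testpass",
        })

Running it with, say, locust -f login_test.py --headless --users 100 --spawn-rate 10 then yields response-time statistics suitable for a performance diagram.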
Community Discussions contain sources that include Stack Exchange Network