benchmarks | Scripts for running various benchmarks on Isambard | DevOps library

 by   UoB-HPC Shell Version: CCPE-CUG-2018 License: No License

kandi X-RAY | benchmarks Summary

kandi X-RAY | benchmarks Summary

benchmarks is a Shell library typically used in Devops applications. benchmarks has no bugs, it has no vulnerabilities and it has low support. You can download it from GitHub.

This repository contains scripts for running various benchmarks in a reproducible manner. This is primarily for benchmarking ThunderX2 in Isambard, and other systems that we typically compare against.
Support
    Quality
      Security
        License
          Reuse

            kandi-support Support

              benchmarks has a low active ecosystem.
              It has 24 star(s) with 4 fork(s). There are 13 watchers for this library.
              OutlinedDot
              It had no major release in the last 12 months.
              There are 3 open issues and 0 have been closed. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of benchmarks is CCPE-CUG-2018

            kandi-Quality Quality

              benchmarks has no bugs reported.

            kandi-Security Security

              benchmarks has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

            kandi-License License

              benchmarks does not have a standard license declared.
              Check the repository for any license declaration and review the terms closely.
              OutlinedDot
              Without a license, all rights are reserved, and you cannot use the library in your applications.

            kandi-Reuse Reuse

              benchmarks releases are available to install and integrate.
              Installation instructions are not available. Examples and code snippets are available.

            Top functions reviewed by kandi - BETA

            kandi's functional review helps you automatically verify the functionalities of the libraries and avoid rework.
            Currently covering the most popular Java, JavaScript and Python libraries. See a Sample of benchmarks
            Get all kandi verified functions for this library.

            benchmarks Key Features

            No Key Features are available at this moment for benchmarks.

            benchmarks Examples and Code Snippets

            UoB HPC Benchmarks,Usage
            Shelldot img1Lines of Code : 5dot img1no licencesLicense : No License
            copy iconCopy
            mkdir benchmarks && cd benchmarks
            
            # Example for CloverLeaf on TX2
            $BENCH/cloverleaf/tx2-isambard/benchmark.sh build
            
            # Example for CloverLeaf on TX2, running on 64 nodes (assuming you have previously run 'build')
            $BENCH/cloverleaf/tx2-isamba  
            UoB HPC Benchmarks,Usage,Using custom settings
            Shelldot img2Lines of Code : 4dot img2no licencesLicense : No License
            copy iconCopy
            # Example for GROMACS on TX2
            $BENCH/gromacs/tx2-isambard/benchmark.sh
            
            $BENCH/gromacs/tx2-isambard/benchmark.sh build arm-19.0 armpl-19.0
            $BENCH/gromacs/tx2-isambard/benchmark.sh run scale-64 arm-19.0 armpl-19.0
              

            Community Discussions

            QUESTION

            How to improve divide-and-conquer runtimes?
            Asked 2021-Jun-15 at 17:36

            When a divide-and-conquer recursive function doesn't yield runtimes low enough, which other improvements could be done?

            Let's say, for example, this power function taken from here:

            ...

            ANSWER

            Answered 2021-Jun-15 at 17:36

            The primary optimization you should use here is common subexpression elimination. Consider your first piece of code:

            Source https://stackoverflow.com/questions/67987701

            QUESTION

            Polybase External Tables vs. OPENROWSET serverless sql pool architecture
            Asked 2021-Jun-09 at 09:33

            I am in search of performance benchmarks for querying parquet ADLS files with the standard dedicated sql pool using external tables with polybase vs. serverless sql pool and OPENROWSET views. From my base queries on a 1.5 billion record table, it does appears OPENROWSET in serverless sql pool is around 30% more performant given time for the same query, but what are the architecture that power that? Are there any readily available performance benchmarks?

            ...

            ANSWER

            Answered 2021-Jun-09 at 09:33

            The architecture behind Azure Synapse SQL Serverless Pools and how it achieves such a strong performance is described in this paper, it is called "Polaris".

            http://www.vldb.org/pvldb/vol13/p3204-saborit.pdf

            Performance benchmarks have been published on multiple blogs. Be aware that this can only be a snapshot in time as those features are being improved constantly.

            Source https://stackoverflow.com/questions/67896757

            QUESTION

            BenchmarkDotNet Unable to find Tests when it faces weird solution structure
            Asked 2021-Jun-03 at 00:42

            I have problem with BenchmarkDotNet which I struggle to solve

            Here's my project structure:

            ...

            ANSWER

            Answered 2021-Jun-03 at 00:42

            The short answer is you cannot run benchmark with the structure you created and it is intentional.

            For the BenchmarkDotNet (and it is a generally good practice) it's required for solution to have following structure

            Source https://stackoverflow.com/questions/67766289

            QUESTION

            Why does text appear in browsers but not appear in image viewers?
            Asked 2021-May-28 at 08:39

            I am trying to render a chart but encounter a problem: The elements appear in browsers (Chrome, Firefox) but not in traditional image viewers (Eyes of GNOME, GIMP, Inkscape).

            Code

            At first, I thought that it was because image viewers are incapable of rendering fonts, until I came across an asciinema's thumbnail, which is displayed perfectly by Eyes of GNOME:

            Question: Why does this happen and how to fix this?

            ...

            ANSWER

            Answered 2021-May-28 at 07:32

            The reason is in nested SVGs:

            Source https://stackoverflow.com/questions/67733561

            QUESTION

            PGBouncer IDLE Connections not Closing on Postgres
            Asked 2021-May-27 at 16:31

            We have a setup where we are running 6 PgBouncer processes and our performance benchmarks degrade linearly with time. The longer PgBouncer has been running, the longer the connections to Postgres exist results in slower response times for the benchmark. We have a multi-tenant schema separated database with 2000+ relations. We are configured for Transaction Mode pooling right now. Over time, we see the memory footprint of each Postgres process climb and climb and climb, and again, this results in poorer performance.

            We have tried to be more aggressive in cleaning up idle connections with the following settings:

            ...

            ANSWER

            Answered 2021-May-27 at 16:31

            The issue is resolved.

            The application was extremely chatty and even with server_idle_timeout set as low as 5 seconds, the connections were not getting recycled on the Postgres side.

            The issue we had was that server_lifetime was accidentally commented when we thought it was active and once we changed that, we could clearly see that Postgres connections were getting recycled every 2 minutes (based on our settings).

            The increased memory of each connection over time especially for long-lived connections was only taking into consideration private memory and not shared memory. What we observed was the longer the connection was alive, the more memory it consumed. We tried setting things like DISCARD ALL for reset_query and it had no impact on memory consumption. Based on my research online, we were not the only to ones to face this challenge with pooling connections.

            Thanks for the comments and the help. Our solution in the end was to leverage server_lifetime in pgBouncer to control the number of long-lived connections on Postgres.

            -Mayan

            Source https://stackoverflow.com/questions/67664415

            QUESTION

            Create a new slice given a previous one, without a given value
            Asked 2021-May-23 at 13:50

            I have a slice of strings. What I need to accomplish is to remove one value from the slice, without knowing the index. I thought this would be the easiest way to do it:

            ...

            ANSWER

            Answered 2021-May-23 at 13:49

            Allocate a big return slice in one step (estimated by the input slice), and don't use append() but assign to individual elements:

            Source https://stackoverflow.com/questions/67660392

            QUESTION

            Writing a vector sum function with SIMD (System.Numerics) and making it faster than a for loop
            Asked 2021-May-21 at 18:27

            I wrote a function to add up all the elements of a double[] array using SIMD (System.Numerics.Vector) and the performance is worse than the naïve method.

            On my computer Vector.Count is 4 which means I could create an accumulator of 4 values and run through the array adding up the elements by groups.

            For example a 10 element array, with a 4 element accumulator and 2 remaining elements I would get

            ...

            ANSWER

            Answered 2021-May-19 at 18:28

            I would suggest you take a look at this article exploring SIMD performance in .Net.

            The overall algorithm looks identical for summing using regular vectorization. One difference is that the multiplication can be avoided when slicing the array:

            Source https://stackoverflow.com/questions/67605744

            QUESTION

            Why does JMH report such strange times for a simple Quicksort --obviously disproportionate to N * log(N)?
            Asked 2021-May-19 at 10:44

            Having an intent to study a sort algorithm (of my own), I decided to compare its performance with the classical quicksort and to my great surprise I've discovered that the time taken by my implementation of quicksort is far not proportional to N log(N). I thoroughly tried to find an error in my quicksort but unsuccessfully. It is a simple version of the sort algorithm working with arrays of Integer of different sizes, filled with random numbers, and I have no idea, where the error can sneak in. I have even counted all the comparisons and swaps executed by my code, and their number was rather fairly proportional to N log(N). I am completely confused and can't understand the reality I observe. Here are the benchkmark results for sorting arrays of 1,000, 2,000, 4,000, 8,000 and 16,000 random values (measured with JMH):

            ...

            ANSWER

            Answered 2021-May-18 at 21:03

            Three points work together against your implementation:

            In the very early versions of quicksort, the leftmost element of the partition would often be chosen as the pivot element. Unfortunately, this causes worst-case behavior on already sorted arrays

            • Your algorithm sorts the arrays in place, meaning that after the first pass the "random" array is sorted. (To calculate average times JMH does several passes over the data).

            To fix this, you could change your benchmark methods. For example, you could change sortArray01000() to

            Source https://stackoverflow.com/questions/67571268

            QUESTION

            To use Task.WhenAll, or not to use Task.WhenAll
            Asked 2021-May-14 at 11:58

            I am reviewing some code and trying to come up with a technical reason why you should or should not use Task.WhenAll(Tasks[]) for essentially making Http calls in parallel. The Http calls call a different microservice and I guess one of the calls may or may not take some time to execute... (I guess I am not really interested in that). I'm using BenchmarkDotNet to give me an idea of there is any more memory consumed, or if execution time is wildly different. Here is an over-simplified example of the Benchmarks:

            ...

            ANSWER

            Answered 2021-May-12 at 15:51

            My question really is, is there a technical reason why you should or not use Task.WhenAll()?

            The behavior is just slightly different in the case of exceptions when both calls fail. If they're awaited one at a time, the second failure will never be observed; the exception from the first failure is propagated immediately. If using Task.WhenAll, both failures are observed; one exception is propagated after both tasks fail.

            Is it just a preference?

            It's mostly just preference. I tend to prefer WhenAll because the code is more explicit, but I don't have a problem with awaiting one at a time.

            Source https://stackoverflow.com/questions/67506865

            QUESTION

            Vectorized hashing/ranking of integer combinations of fixed size via operations on 32-bit integers in MATLAB
            Asked 2021-May-14 at 06:09

            I have huge dynamically created tables/matrices in MATLAB of varying first dimension, whose rows represent (sorted) combinations of integers in the range 1-50 of order 6.

            I would like to assign to each combination a unique value (hash, ranking), so that I can check if the same combinations appear in different tables. Different combinations are not allowed to have same value assigned, i.e. no collisions. I have to make a lot of such comparisons between a lot of such tables. So, for performance reasons, I would like to accomplish this by vectorization of uint32 operations to make it suitable for GPU acceleration in MATLAB.

            Things I have thought of so far:

            1. Lexicographic ranking: no idea how to vectorize the standard fast recursive algorithms well, and the only option seems to be to parfor it through the rows, which is slower than other options. IIRC, the direct explicit formula, though vectorizable, requires computation of binomials, which in turn requires log Gamma function in order to avoid huge factorials + double type to avoid collision if I am not mistaken, i.e. is slower because it's 'very numerical'.
            2. Cantor pairing function: one can successively apply Cantor's pairing, which is nice because it's a polynomial expression, but it produces huge numbers well beyond uint32 and is definitely slower than other options.
            3. Base 51 (no pun intended) integers: sends a combination/row vector (x_1,...,x_6) to x_1 + x_2 * 51 + ... + x_6 * 51^5. This is the fastest I currently have. It's easily vectorizable, but unfortunately still requires uint64 or double for rank-6 combinations of 50 elements, which is slower than uint32 or single type operations would be.

            So, I guess, I am looking for a 'clever' injective function on these combinations that computes within the uint32 range and is also well vectorizable (in MATLAB).

            Any help would be much appreciated!

            EDIT: Here is a routine that benchmarks both ranking and searching in uint32, single, and double. I have used MATLAB's gputimeit to produce accurate results.

            ...

            ANSWER

            Answered 2021-May-10 at 12:41

            You've almost got enough bits for your last idea, so you just need to squeeze a few bits out due to the ordering to get it over the bar. Since the whole sequence is sorted, every pair is also ordered. So use a 50-by-50 look-up table to map the sorted (1st,2nd), (3rd,4th), (5th,6th) pairs into numbers from 0-1274.

            Or if you don't want a table, there are fairly simple explicit functions for mapping a pair (i,j) with j>=i to a linear index. Look up upper- or lower-triangular matrix indexing for details on those. (It'll be something along the lines of n*(n+1)/2 - (n-i)*(n-i-1)/2 + j with some +/-1's thrown in depending on base-0 or base-1 indexing, and n=50 in your case, but I'm sure I'll get it wrong writing it off-the-cuff.)

            Anyway, once you've got three numbers 0-1274, the base-1275 idea will fit in uint32.

            Source https://stackoverflow.com/questions/67455774

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install benchmarks

            You can download it from GitHub.

            Support

            For any new features, suggestions and bugs create an issue on GitHub. If you have any questions check and ask questions on community page Stack Overflow .
            Find more information at:

            Find, review, and download reusable Libraries, Code Snippets, Cloud APIs from over 650 million Knowledge Items

            Find more libraries
            CLONE
          • HTTPS

            https://github.com/UoB-HPC/benchmarks.git

          • CLI

            gh repo clone UoB-HPC/benchmarks

          • sshUrl

            git@github.com:UoB-HPC/benchmarks.git

          • Stay Updated

            Subscribe to our newsletter for trending solutions and developer bootcamps

            Agree to Sign up and Terms & Conditions

            Share this Page

            share link

            Explore Related Topics

            Consider Popular DevOps Libraries

            ansible

            by ansible

            devops-exercises

            by bregman-arie

            core

            by dotnet

            semantic-release

            by semantic-release

            Carthage

            by Carthage

            Try Top Libraries by UoB-HPC

            BabelStream

            by UoB-HPCC++

            openmp-tutorial

            by UoB-HPCC

            SimEng

            by UoB-HPCC++

            TSVC_2

            by UoB-HPCC