thrust | The C++ parallel algorithms library | GPU library

by NVIDIA | C++ | Version: 2.1.0 | License: Non-SPDX

kandi X-RAY | thrust Summary

thrust is a C++ library typically used in hardware and GPU applications. thrust has no reported bugs or vulnerabilities, and it has medium support. However, thrust has a Non-SPDX license. You can download it from GitHub.

Thrust is the C++ parallel algorithms library that inspired the introduction of parallel algorithms to the C++ Standard Library. Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. It builds on top of established parallel programming frameworks (such as CUDA, TBB, and OpenMP). It also provides a number of general-purpose facilities similar to those found in the C++ Standard Library. Thrust is an open source project; it is available on GitHub and included in the NVIDIA HPC SDK and CUDA Toolkit. If you have one of those SDKs installed, no additional installation or compiler flags are needed to use Thrust.

            Support

              thrust has a medium active ecosystem.
              It has 4597 star(s) with 754 fork(s). There are 204 watchers for this library.
              It had no major release in the last 12 months.
              There are 168 open issues and 772 have been closed. On average issues are closed in 1243 days. There are 12 open pull requests and 0 closed requests.
              It has a neutral sentiment in the developer community.
              The latest version of thrust is 2.1.0

            Quality

              thrust has 0 bugs and 0 code smells.

            Security

              thrust has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              thrust code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            License

              thrust has a Non-SPDX License.
              Non-SPDX licenses can be open source with a non SPDX compliant license, or non open source licenses, and you need to review them closely before use.

            Reuse

              thrust releases are available to install and integrate.
              Installation instructions are not available. Examples and code snippets are available.

            thrust Key Features

            No Key Features are available at this moment for thrust.

            thrust Examples and Code Snippets

            No Code Snippets are available at this moment for thrust.

            Community Discussions

            QUESTION

            constrain the initial and final values of a GEKKO ```Var``` to a data-based curve
            Asked 2022-Apr-03 at 21:17

            I am trying to solve a low thrust optimal control problem for Earth orbits, i.e. going from one orbit to another. The formulation of the problem includes six states (r_1, r_2, r_3, v_1, v_2, v_3) and 3 controls (u_1, u_2, u_3) with a simplified point model of gravity. When I specify the full initial state and half of the final state, the solver converges and yields a good solution. When I try the full final state, the problem is over constrained.

            My thought on how to remedy this is to allow the trajectory to depart the initial orbit at any point along the orbital curve and join the final orbit at any point along the final orbital curve, giving it more degrees of freedom. Is there a way to constrain the initial and final values of all 6 states to a cspline curve? This is what I have tried so far:

            ...

            ANSWER

            Answered 2022-Apr-03 at 21:17

            It is generally much harder for an optimizer to exactly reach a fixed endpoint, especially when it depends on a complex sequence of moves. This often leads to infeasible solutions. An alternative is to create a soft constraint (objective minimization) to penalize deviations from the final trajectory. Here is an example that is similar:

            Source https://stackoverflow.com/questions/71622083

            QUESTION

            Thrust is very slow for array reduction
            Asked 2022-Mar-19 at 19:23

            I am trying to use thrust to reduce an array of 1M elements to a single value. My code is as follows:

            ...

            ANSWER

            Answered 2022-Mar-19 at 19:23

            Am I doing something wrong here?

            I would not say you are doing anything wrong here. However that might be a matter of opinion. Let's unpack it a bit, using a profiler. I'm not using the exact same setup as you (I'm using a different GPU - Tesla V100, on Linux, CUDA 11.4). In my case the measurement spit out by the code is ~0.5ms, not 2ms.

            • The profiler tells me that the thrust::reduce is accomplished under the hood via a call to cub::DeviceReduceKernel followed by cub::DeviceReduceSingleTileKernel. This two-kernel approach should make sense to you if you have studied Mark Harris' reduction material. The profiler tells me that together, these two calls account for ~40us of the ~500us overall time. This is the time that would be most comparable to the measurement you made of your implementation of Mark Harris' reduction code, assuming you are timing the kernels only. If we multiply by 4 to account for the overall perf ratio, it is pretty close to your 150us measurement of that.
            • The profiler tells me that the big contributors to the ~500us reported time in my case are a call to cudaMalloc (~200us) and a call to cudaFree (~200us). This isn't surprising because if you study the cub::DeviceReduce methodology that is evidently being used by thrust, it requires an initial call to do a temporary allocation. Since thrust provides a self-contained call for thrust::reduce, it has to perform that call, as well as do a cudaMalloc and cudaFree operation for the indicated temporary storage.

            So is there anything that can be done?

            The thrust designers were aware of this situation. To get a (closer to) apples-to-apples comparison between measuring just the kernel duration(s) of a CUDA C++ implementation and using thrust to do the same thing, you could use a profiler to compare measurements, or else take control of the temporary allocations yourself.

            One way to do this would be to switch from thrust to cub.

            The thrust way to do it is to use a thrust custom allocator.

            There may be a few other detail differences in methodology that are impacting your measurement. For example, the thrust call intrinsically copies the reduction result back to host memory. You may or may not be timing that step in your other approach which you haven't shown. But according to my profiler measurement, that only accounts for a few microseconds.

            Source https://stackoverflow.com/questions/71523157

            QUESTION

            What is the canonical way to compare memory ranges in the CPU and in the GPU
            Asked 2022-Mar-18 at 19:35

            I have two contiguous ranges (pointer + size), one in the GPU and one in the CPU, and I want to check whether they are equal.

            What is the canonical way to compare these ranges for equality?

            ...

            ANSWER

            Answered 2022-Mar-17 at 01:32

            You can't do it the way you are imagining in the general case with thrust. Thrust does not execute algorithms in a mixed backend. You must either use the device backend, in which case all data needs to be on the device (or accessible from device code, see below), or else the host backend in which case all data needs to be on the host.

            Therefore you will be forced to copy the data from one side to the other. The cost should be similar (copy host array to device, or device array to host) so we prefer to copy to the device, since the device comparison can be faster.

            If you have the luxury of having the host array be in a pinned buffer, then it will be possible to do something like what you are suggesting.

            For the general case, something like this should work:

            Source https://stackoverflow.com/questions/71505957

            QUESTION

            Which one is faster? raw pointers vs thrust vectors
            Asked 2022-Mar-12 at 11:45

            I am a beginner in Cuda, and I just wanted to ask a simple question that I could not find any clear answer for.

            I know that we can define our array in Device memory using a raw pointer:

            ...

            ANSWER

            Answered 2022-Mar-12 at 11:45

            The data in thrust::device_vector is ordinary global memory, there is no difference in access speed.

            Note however that the two alternatives you present are not equivalent. cudaMalloc returns uninitialized memory. Memory in thrust::device_vector will be initialized. After allocation it launches a kernel for the initialization of its elements, followed by cudaDeviceSynchronize. This could slow down the code. You need to benchmark your code.

            Source https://stackoverflow.com/questions/71449293

            QUESTION

            CUDA: Stack vector in different thread to a 1d vector
            Asked 2022-Mar-07 at 23:33

            I have a thrust vector for each thread in CUDA, and I want to stack the vectors in order (vector in thread 0, vector in thread 1, ..., vector in thread n) to create a 1d vector and send it back to the CPU. Is there a good way to do this? Any help is appreciated. Thank you.

            ...

            ANSWER

            Answered 2022-Mar-07 at 23:33

            The most performant way to store items from several threads into a single vector will be thread-interleaved. Suppose each of 4 threads (t0-t3) has 4 elements to store (e0-e3). The final storage pattern which will be most efficient will be:
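            One reading of "thread-interleaved" can be sketched in host-side C++. This is an illustration under assumptions, not the layout from the original answer verbatim: it assumes every thread stores the same number of elements, and the helper names `interleaved_index` and `deinterleave` are hypothetical. Element e of thread t goes to slot e * num_threads + t, so neighbouring threads write to neighbouring addresses; a second pass then regroups the data into the thread order the question asks for.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Slot for element `e` of thread `t` in a thread-interleaved buffer.
// Assumption: every thread stores the same number of elements.
inline std::size_t interleaved_index(std::size_t t, std::size_t e,
                                     std::size_t num_threads) {
    assert(t < num_threads);
    return e * num_threads + t;
}

// Regroup an interleaved buffer into thread order:
// all elements of thread 0, then all elements of thread 1, and so on.
inline std::vector<int> deinterleave(const std::vector<int>& interleaved,
                                     std::size_t num_threads,
                                     std::size_t elems_per_thread) {
    assert(interleaved.size() == num_threads * elems_per_thread);
    std::vector<int> stacked(interleaved.size());
    for (std::size_t t = 0; t < num_threads; ++t) {
        for (std::size_t e = 0; e < elems_per_thread; ++e) {
            stacked[t * elems_per_thread + e] =
                interleaved[interleaved_index(t, e, num_threads)];
        }
    }
    return stacked;
}
```

            With 4 threads and 4 elements each, this formula places the data in the order t0e0, t1e0, t2e0, t3e0, t0e1, t1e1, and so on.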

            Source https://stackoverflow.com/questions/71386737

            QUESTION

            How to do a reduction over one dimension of 2D data in Thrust
            Asked 2022-Feb-28 at 23:25

            I'm new to CUDA and the thrust library. I'm learning and trying to implement a function that will have a for loop doing a thrust function. Is there a way to convert this loop into another thrust function? Or should I use a CUDA kernel to achieve this?

            I have come up with code like this

            ...

            ANSWER

            Answered 2022-Feb-28 at 21:01
            Solution using Thrust

            Here is an implementation using thrust::reduce_by_key in conjunction with multiple smart iterators.

            I also took the freedom to sprinkle in some const, auto and lambdas for elegance and readability. Due to the lambdas, you will need to use the -extended-lambda flag for nvcc.

            thrust::distance is the canonical way of subtracting Thrust iterators.
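            The idea behind a reduce-by-key formulation can be sketched on the host: each element's key is derived from its linear index, and consecutive elements that share a key are folded into one output value. The sketch below is plain C++ rather than Thrust, assumes row-major storage and a sum over each row, and uses the hypothetical name `row_sums`. In Thrust, keys of this form can be generated on the fly by wrapping a counting iterator in a transform iterator instead of materialising them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum each row of a row-major matrix with `num_cols` columns.
// Mimics reduce-by-key semantics: every run of consecutive equal keys
// is folded into a single output value.
inline std::vector<double> row_sums(const std::vector<double>& data,
                                    std::size_t num_cols) {
    assert(num_cols > 0 && data.size() % num_cols == 0);
    std::vector<double> sums;
    std::size_t current_key = 0;
    for (std::size_t idx = 0; idx < data.size(); ++idx) {
        const std::size_t key = idx / num_cols;  // the row index acts as the key
        if (sums.empty() || key != current_key) {
            sums.push_back(0.0);  // a new run of keys starts a new output value
            current_key = key;
        }
        sums.back() += data[idx];
    }
    return sums;
}
```

            Reducing along the other dimension would need a different key, or a transposed access pattern, because reduce-by-key only merges keys that are adjacent in the input.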

            Source https://stackoverflow.com/questions/71263160

            QUESTION

            Android build failed. showing "Resource compilation failed. Check logs for details."
            Asked 2022-Feb-28 at 05:46
            
            [Question body: long text descriptions of eleven sports (baseball, badminton, basketball, bowling, cycling, golf, running, association football, swimming, table tennis, and tennis), many of which contain apostrophes such as "men's singles".]

            ...

            ANSWER

            Answered 2022-Feb-28 at 05:46

            Cheers everyone, I just found it. The solution is to remove the single quotation mark, this one: '

            And if you want to use this mark, then use it like this:

            Source https://stackoverflow.com/questions/71290646

            QUESTION

            cub::DeviceRadixSort fails when specifying end bit
            Asked 2022-Feb-27 at 15:17

            I am using the GPU radix sort algorithm of the CUB library to sort N 32-bit unsigned integers whose values all utilize only k of their 32 bits, starting from the least significant bit.

            Thus, I specify the bit subrange [begin_bit, end_bit) when calling cub::DeviceRadixSort::SortKeys in hopes of improving the sorting performance. I am using the latest release of CUB (1.16.0).

            However, SortKeys crashes (not deterministically, but almost always) and reports an illegal memory access error when trying to sort 1 billion keys with certain specified bit ranges of [begin_bit=0, end_bit=k), and k = {20,19,18}, e.g. ./cub_sort_test 1000000000 0 20

            I tested this on a Volta and an Ampere NVIDIA GPU with CUDA versions 11.4 and 11.2 respectively. Has anyone encountered this previously, and/or know a fix? Here is the minimal, reproducible example code:

            ...

            ANSWER

            Answered 2022-Feb-27 at 15:17

            The problem with your code is that you do not use SortKeys correctly. SortKeys does not work in-place. You need to provide a separate output buffer for the sorted data.
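To illustrate why a radix sort pass needs a separate destination, here is a minimal host-only sketch in plain C++. It is not CUB's implementation and does not call CUB; `sort_keys_bit_range` is a hypothetical helper that sorts on the bit subrange [begin_bit, end_bit) and scatters each pass from one buffer into another. Scattering into the buffer being read would overwrite keys that have not been visited yet, which is the hazard the answer describes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a CUB API): stable LSD radix sort that only
// examines bits [begin_bit, end_bit) of each key. The input is never
// written; every pass reads from `src` and scatters into `dst`.
std::vector<std::uint32_t> sort_keys_bit_range(
    const std::vector<std::uint32_t>& keys_in, int begin_bit, int end_bit) {
    assert(0 <= begin_bit && begin_bit <= end_bit && end_bit <= 32);
    constexpr int kRadixBits = 8;  // digit width per pass (an arbitrary choice)

    std::vector<std::uint32_t> src(keys_in);         // working copy of the input
    std::vector<std::uint32_t> dst(keys_in.size());  // separate output buffer

    for (int bit = begin_bit; bit < end_bit; bit += kRadixBits) {
        const int width = std::min(kRadixBits, end_bit - bit);
        const std::uint32_t mask = (1u << width) - 1u;

        // Histogram of digit values, shifted by one slot so that the
        // prefix sum below yields the first output position per digit.
        std::vector<std::size_t> offset((std::size_t{1} << width) + 1, 0);
        for (std::uint32_t k : src) ++offset[((k >> bit) & mask) + 1];
        for (std::size_t d = 1; d < offset.size(); ++d) offset[d] += offset[d - 1];

        // Stable scatter: read from src, write to dst, never the same buffer.
        for (std::uint32_t k : src) dst[offset[(k >> bit) & mask]++] = k;

        src.swap(dst);  // the freshly written buffer becomes the next input
    }
    return src;  // holds the result of the last pass
}
```

Calling `sort_keys_bit_range(keys, 0, 20)` on keys that fit in 20 bits returns them in ascending order and leaves `keys` unchanged. With CUB, the analogous step is to allocate a second device array and pass it as the output of `cub::DeviceRadixSort::SortKeys`.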

            Source https://stackoverflow.com/questions/71285448

            QUESTION

            NVidia thrust arbitrary transform with three-dimensional grid
            Asked 2022-Feb-25 at 08:57

            I want to parallelize the following nested for loop on the GPU using NVidia thrust.

            ...

            ANSWER

            Answered 2022-Feb-25 at 08:57

            You can simply collapse the nested loop into a single loop and use for_each with a counting iterator. In the functor, you need to calculate the three indices from the single loop variable.
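The index arithmetic can be sketched in plain C++ so that it runs without CUDA. This is a sketch under assumptions, not the code from the answer: `Index3`, `unflatten`, and `for_each_cell` are hypothetical names, and the layout assumes `k` is the fastest-varying index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: one cell of a 3D grid.
struct Index3 {
    std::size_t i, j, k;
};

// Recover (i, j, k) from a flat index, assuming k varies fastest:
// idx = (i * ny + j) * nz + k.
inline Index3 unflatten(std::size_t idx, std::size_t ny, std::size_t nz) {
    return Index3{idx / (ny * nz), (idx / nz) % ny, idx % nz};
}

// Visit every cell of an nx * ny * nz grid with one flat loop instead of
// three nested loops. The loop body is what a functor would do per element.
template <class F>
void for_each_cell(std::size_t nx, std::size_t ny, std::size_t nz, F f) {
    const std::size_t total = nx * ny * nz;
    for (std::size_t idx = 0; idx < total; ++idx) {
        const Index3 c = unflatten(idx, ny, nz);
        f(c.i, c.j, c.k);
    }
}
```

For example, `unflatten(23, 3, 4)` yields `i = 1, j = 2, k = 3` on a grid with `ny = 3` and `nz = 4`. With Thrust, the flat loop would be replaced by `thrust::for_each` over a `thrust::counting_iterator` range from 0 to `nx * ny * nz`, and the body of the loop would move into the functor's `operator()`.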

            Source https://stackoverflow.com/questions/71262525

            QUESTION

            CMAKE_CXX_SOURCE_FILE_EXTENSIONS not working with thrust/cuda
            Asked 2022-Feb-08 at 22:27

            Thrust allows one to specify different backends at CMake configure time via the THRUST_DEVICE_SYSTEM flag. My problem is that I have a bunch of .cu files that I want to be compiled as regular C++ files when a user runs cmake with -DTHRUST_DEVICE_SYSTEM=OMP (for example). If I change the extension of the .cu files to .cpp they compile fine (indicating that I just need to tell CMake to use the C++ compiler on the .cu files). But if I add .cu to CMAKE_CXX_SOURCE_FILE_EXTENSIONS then I get a CMake Error: Cannot determine link language for target "cuda_kernels". Here's a minimal CMake example:

            ...

            ANSWER

            Answered 2022-Feb-08 at 22:27

            Why is cmake not respecting my CMAKE_CXX_SOURCE_FILE_EXTENSIONS changes?

            The extension-to-language mapping for <LANG> is set as soon as <LANG> is enabled, by inspecting the value of the CMAKE_<LANG>_SOURCE_FILE_EXTENSIONS variable when the language detection module exits.

            Unfortunately, there is no blessed way to override this list for CXX as it is hard-coded in Modules/CMakeCXXCompiler.cmake.in.

            Perhaps the best way of working around the actual error would be to use the LANGUAGE source file property to tell CMake how to compile the individual CUDA files, like so:

            Source https://stackoverflow.com/questions/71041077

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install thrust

            You can download it from GitHub.

            Support

            For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, check the community page on Stack Overflow and ask there.