perf_counter | A dedicated performance counter for Cortex-M SysTick | Monitoring library
kandi X-RAY | perf_counter Summary
A dedicated performance counter for Cortex-M SysTick. It shares the SysTick timer with the user's original SysTick function without interfering with it. The library adds new functionality such as a performance counter, delay_us(), and the clock() service defined in time.h.
Community Discussions
Trending Discussions on perf_counter
QUESTION
I have built an A* pathfinding algorithm that finds the best route from point A to point B. A timer starts before the algorithm executes and stops after the path is drawn, and the elapsed time is stored in a global variable so it is accessible when I run the algorithm more than once (to compute an average time).
The global variable gets appended to a list, except when I run the algorithm 5 times, only 4 values get added (I can see 5 runs being recorded, as the algorithm prints the time after each completion). When displaying the list, it always misses the first time and only has times 2, 3, 4, 5 if I run the algorithm 5 times. Here is main.py:
ANSWER
Answered 2022-Apr-08 at 19:15
EDIT: the mysterious 5th print is coming from this line, of course
QUESTION
I just got started with asynchronous programming, and I have one question regarding CPU-bound tasks with multiprocessing. In short, why did multiprocessing produce far worse time performance than the synchronous approach? Did I do anything wrong with my code in the asynchronous version? Any suggestions are welcome!
1: Task description
I want to use one of Google's Ngram datasets as input and create a huge dictionary that maps each word to its corresponding word count.
Each record in the dataset looks like the following:
"corpus\tyear\tWord_Count\tNumber_of_Book_Corpus_Showup"
Example:
"A'Aang_NOUN\t1879\t45\t5\n"
2: Hardware Information: Intel Core i5-5300U CPU @ 2.30 GHz, 8 GB RAM
3: Synchronous Version - Time Spent 170.6280147 sec
...ANSWER
Answered 2022-Apr-01 at 00:56
There's quite a bit I don't understand in your code. So instead I'll just give you code that works ;-)
- I'm baffled by how your code can run at all. A `.gz` file is compressed binary data (gzip compression). You need to open it with Python's `gzip.open()`. As is, I expect it to die with an encoding exception, as it does when I try it.
- `temp[2]` is not an integer. It's a string. You're not adding integers here, you're catenating strings with `+`. `int()` needs to be applied first.
- I don't believe I've ever seen `asyncio` mixed with `concurrent.futures` before. There's no need for it. `asyncio` is aimed at fine-grained pseudo-concurrency in a single thread; `concurrent.futures` is aimed at coarse-grained genuine concurrency across processes. You want the latter here. The code is easier, simpler, and faster without `asyncio`.
- While `concurrent.futures` is fine, I'm old enough that I invested a whole lot into learning the older `multiprocessing` first, and so I'm using that here.
- These ngram files are big enough that I'm "chunking" the reads regardless of whether running the serial or parallel version.
- `collections.Counter` is much better suited to your task than a plain dict.
- While I'm on a faster machine than you, some of the changes alluded to above have a lot to do with my faster times.
I do get a speedup using 3 worker processes, but, really, all 3 were hardly ever being utilized. There's very little computation being done per line of input, and I expect that it's more memory-bound than CPU-bound. All the processes are fighting for cache space too, and cache misses are expensive. An "ideal" candidate for coarse-grained parallelism does a whole lot of computation per byte that needs to be transferred between processes, and doesn't need much inter-process communication at all. Neither is true of this problem.
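The answer's full code is not reproduced here, but the following is a minimal sketch of the approach it describes (the file name and chunk size are assumptions): open the `.gz` file with gzip, read it in chunks, count words in worker processes, and merge `collections.Counter` results.

```python
# Sketch only: NGRAM_FILE and CHUNK_LINES are hypothetical values.
import gzip
import multiprocessing as mp
from collections import Counter

NGRAM_FILE = "googlebooks-eng-all-1gram-20120701-a.gz"  # hypothetical name
CHUNK_LINES = 100_000

def count_chunk(lines):
    """Tally word counts for one chunk of tab-separated ngram records."""
    counts = Counter()
    for line in lines:
        word, _year, word_count, _books = line.split("\t")
        counts[word] += int(word_count)  # int() must be applied first
    return counts

def read_chunks(path, chunk_lines=CHUNK_LINES):
    """Yield lists of decoded lines from a gzip-compressed text file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        chunk = []
        for line in f:
            chunk.append(line)
            if len(chunk) >= chunk_lines:
                yield chunk
                chunk = []
        if chunk:
            yield chunk

if __name__ == "__main__":
    total = Counter()
    with mp.Pool(3) as pool:  # 3 worker processes, as in the answer
        for part in pool.imap_unordered(count_chunk, read_chunks(NGRAM_FILE)):
            total.update(part)
    print(total.most_common(10))
```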
QUESTION
I'm trying to train a custom COCO-format dataset with Detectron2 on PyTorch. My datasets are JSON files in the aforementioned COCO format, with each item in the "annotations" section looking like this:
The code for setting up Detectron2 and registering the training & validation datasets is as follows:
...ANSWER
Answered 2022-Mar-29 at 11:17
It's difficult to give a concrete answer without looking at the full annotation file, but a `KeyError` exception is raised when trying to access a key that is not in a dictionary. From the error message you've posted, this key seems to be `'segmentation'`.
This is not in your code snippet, but before even getting into network training, have you done any exploration/inspection of the registered datasets? Doing some basic exploration or inspections would expose any problems with your dataset so you can fix them early in your development process (as opposed to letting the trainer catch them, in which case the error messages can get long and confounding).
In any case, for your specific issue, you can take the registered training dataset and check whether all annotations have the `'segmentation'` field. A simple code snippet to do this is below.
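The snippet itself is not reproduced on this page; the following is a minimal sketch of such a check, assuming the training set was registered under the hypothetical name "my_dataset_train".

```python
# Sketch only: the registered dataset name is an assumption.
from detectron2.data import DatasetCatalog

dataset_dicts = DatasetCatalog.get("my_dataset_train")
for record in dataset_dicts:
    for ann in record.get("annotations", []):
        if "segmentation" not in ann:
            print(f"{record['file_name']}: annotation id {ann.get('id')} "
                  "has no 'segmentation' field")
```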
QUESTION
I'm trying to understand how different chunking schemas can speed up or slow down my computation using xarray and dask.
I have read the dask and xarray guides but I might have missed something.
I have 2 stores with the same content but chunked differently. Both contain a data variable `tasmax` and the necessary coordinate variables and metadata for it to be opened with xarray.
`tasmax`'s shape is
The first store is a zarr store, `zarr_init`, which I made from netCDF files, 1 file per year, 10 .nc files. When opening it with xarray I get a chunking schema of `chunksize=(366, 256, 512)`, thus 1 year per chunk, the same as the initial netCDF storage. Each chunk is around 191 MB.
The second store, `zarr_time_opti`, is also a zarr store, but there is no chunking on the time dimension. When I open it with xarray and inspect `tasmax`, its chunking schema is `chunksize=(3660, 114, 115)`. Each chunk is around 191 MB as well.
Naively, I would expect spatially independent computations to run much faster and to generate far fewer tasks on `zarr_time_opti` than on `zarr_init`. However, I observe the complete opposite: when computing the same calculation based on `groupby("time.month")`, I get 2370 tasks with `zarr_time_opti` and only 570 tasks with `zarr_init`. As you can see from the MRE below, this has nothing to do with zarr itself, as I can reproduce the issue with only xarray and dask.
So my questions are:
- What is the mechanism in xarray or dask that creates that many tasks?
- What would be the strategy for finding the best chunking schema?
ANSWER
Answered 2022-Mar-24 at 12:05
What is the mechanism in xarray or dask that creates that many tasks?
In the case of `da_optimized`, you seem to be chunking along both the `lat` and `lon` dimensions, and in `da_init`, you're chunking along only the `time` dimension.
When you do a compute, in the beginning each task will correspond to one chunk.
Sidenote about your specific example: `da_optimized` starts with 15 chunks and `da_init` with 10; this adds up to fewer overall tasks in `da_init`. So, to balance them, I've modified it to be:
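The modified code is not reproduced on this page. As a rough illustration of the two chunkings being compared, here is an MRE-style sketch in which the array shape, dates, and chunk sizes are assumptions reconstructed from the question:

```python
# Sketch only: the shape (3660, 256, 512) and the dates are hypothetical.
import dask.array as da
import pandas as pd
import xarray as xr

time = pd.date_range("2030-01-01", periods=3660, freq="D")

def make_tasmax(chunks):
    """Build a lazy tasmax DataArray with the given dask chunking."""
    data = da.random.random((3660, 256, 512), chunks=chunks)
    return xr.DataArray(
        data, dims=("time", "lat", "lon"), coords={"time": time}, name="tasmax"
    )

da_init = make_tasmax((366, 256, 512))        # chunked along time only
da_optimized = make_tasmax((3660, 114, 115))  # no time chunking, spatial chunks

# groupby("time.month") has to carve month groups out of every chunk, so
# counting the tasks in each graph shows where the extra tasks come from.
for name, arr in [("da_init", da_init), ("da_optimized", da_optimized)]:
    graph = arr.groupby("time.month").mean("time").__dask_graph__()
    print(name, len(graph), "tasks")
```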
QUESTION
I am running a program that makes the same API call for each row in a dataframe. Since it is taking quite a long time, I decided to try to learn and implement an async version with asyncio.
I split the dataframe into "N" (3 in this case) smaller dataframes, and for each one of those I create a coroutine that is gathered and awaited by the main routine.
Here's what I've tried:
...ANSWER
Answered 2022-Jan-12 at 13:33
So, apparently I needed to execute the blocking function `get_next_funding(df)` as a SYNC function inside `loop.run_in_executor()`, since the blocking function was not an async type.
Thanks @gold_cy for the answer!
Here's the modified code:
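The modified code is not reproduced on this page; the following is a minimal sketch of the fix described, with the dataframe and the blocking `get_next_funding` function as stand-ins: the synchronous function runs in the default thread pool via `loop.run_in_executor`, one call per chunk.

```python
# Sketch only: get_next_funding below is a stand-in for the real API calls.
import asyncio
import time

import numpy as np
import pandas as pd

def get_next_funding(df: pd.DataFrame) -> pd.DataFrame:
    """Blocking stand-in for the per-row API calls."""
    time.sleep(1)  # simulate one slow API round-trip per chunk
    return df

async def main(df: pd.DataFrame, n_chunks: int = 3) -> pd.DataFrame:
    loop = asyncio.get_running_loop()
    chunks = np.array_split(df, n_chunks)
    # None -> default ThreadPoolExecutor; the blocking calls overlap in threads.
    futures = [loop.run_in_executor(None, get_next_funding, c) for c in chunks]
    return pd.concat(await asyncio.gather(*futures))

if __name__ == "__main__":
    frame = pd.DataFrame({"row": range(9)})
    print(asyncio.run(main(frame)))  # ~1 s total instead of ~3 s sequentially
```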
QUESTION
The Wallis formula can also be used to calculate pi. Why are the running times of the two methods so different?
The link to the formula is http://en.wikipedia.org/wiki/Wallis_product
Thank you so much! Here is the code:
...ANSWER
Answered 2022-Mar-08 at 13:46
The second version only uses integers, which are implemented as bignums, and is hence much, much slower. This is total overkill, unless you hope for huge accuracy (but that is completely hopeless given the very slow convergence of the Wallis product).
By adding this in the loop:
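To see the effect the answer describes, here is a minimal sketch (the loop count and the timing harness are my assumptions) comparing a float-only Wallis product with an exact-integer one: floats keep every operation at constant cost, while the integer numerator and denominator grow without bound, so each multiplication costs more than the last.

```python
# Sketch only: N is an arbitrary choice; the point is the relative timings.
import time

N = 20_000

start = time.perf_counter()
prod = 1.0
for n in range(1, N + 1):
    prod *= (4.0 * n * n) / (4.0 * n * n - 1.0)   # constant-cost float ops
print("float :", 2 * prod, f"in {time.perf_counter() - start:.3f}s")

start = time.perf_counter()
num, den = 1, 1
for n in range(1, N + 1):
    num *= 4 * n * n        # the integers keep growing...
    den *= 4 * n * n - 1    # ...so each multiply gets progressively slower
print("bignum:", 2 * num / den, f"in {time.perf_counter() - start:.3f}s")
```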
QUESTION
I need to send requests in parallel using asyncio. The requests are sent via the function `server_time` from a library that I can't change. It's a plain function, not a coroutine, so I can't await it, which is the challenge here.
I've been trying with the code below, but it obviously doesn't work, for the same reason you can parallelize `await asyncio.sleep(1)` but not `time.sleep(1)`.
How do I use asyncio to parallelize this? I.e., how can I parallelize something like `time.sleep(1)` using asyncio?
ANSWER
Answered 2022-Mar-06 at 16:07
You can use `run_in_executor` to run the synchronous, I/O-bound `session.server_time` function in parallel:
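The answer's code is not reproduced on this page; the following is a minimal sketch, assuming `session` is the third-party library object exposing the blocking `server_time()` function: each call is handed to the default thread pool via `run_in_executor`, so the event loop can overlap them.

```python
# Sketch only: `session` comes from the library mentioned in the question.
import asyncio

async def fetch_server_times(session, n_requests: int = 10):
    loop = asyncio.get_running_loop()
    futures = [
        loop.run_in_executor(None, session.server_time)  # pass the callable, no args
        for _ in range(n_requests)
    ]
    return await asyncio.gather(*futures)

# Usage:
# results = asyncio.run(fetch_server_times(session))
```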
QUESTION
I'm running into a strange problem that I cannot figure out. I recently used this blog post to learn how to create a logging decorator that takes a logger object as a parameter. When testing my decorator, it works fine within an ipython environment. However, in my actual project, I saved the decorator in a file, decorator.py. When I import the decorator (`from decorator import log_func`) and apply it to a function within a module (`@log_func(some-logger-object)`), I get an error. I can't figure out why, especially considering that the function hasn't actually been called yet. Do any of you folks have any ideas what's going on here?
See details below
decorator.py
...ANSWER
Answered 2022-Mar-03 at 14:00
The problem is caused by passing in an object where a function was expected. You're passing in the logger as the function, rather than as the logger. I think you want:
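The corrected code is not reproduced on this page; here is a minimal sketch of the fix, in which `log_func`'s internals are an assumption based on the decorator-factory pattern the blog post describes: the factory must be called with the logger so that it returns the actual decorator.

```python
# Sketch only: the body of log_func is an assumed decorator-factory shape.
import functools
import logging

def log_func(logger: logging.Logger):
    """Decorator factory: returns a decorator bound to `logger`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            logger.info("Calling %s", func.__name__)
            return func(*args, **kwargs)
        return wrapper
    return decorator

logger = logging.getLogger(__name__)

@log_func(logger)   # correct: log_func(logger) returns the decorator
def do_work():
    return 42

# @log_func         # wrong: do_work itself would be passed in as `logger`
```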
QUESTION
I want to download/scrape 50 million log records from a site. Instead of downloading 50 million in one go, I was trying to download them in parts, like 10 million at a time, using the following code, but it only handles 20,000 at a time (more than that throws an error), so it becomes time-consuming to download that much data. Currently, it takes 3-4 minutes to download 20,000 records at a speed of `100%|██████████| 20000/20000 [03:48<00:00, 87.41it/s]`,
so how can I speed it up?
ANSWER
Answered 2022-Feb-27 at 14:37
If it's not the bandwidth that limits you (but I cannot check this), there is a solution less complicated than Celery and RabbitMQ. It is not as scalable as Celery and RabbitMQ, since it is limited by your number of CPUs, but instead of splitting calls across Celery workers, you split them across multiple processes.
I modified the `fetch` function like this:
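The modified `fetch` is not reproduced on this page. As a rough sketch of the multi-process split being described (the endpoint, parameters, and request body are all assumptions, since the real fetch logic comes from the question's code), each process downloads its own 20,000-record batch:

```python
# Sketch only: API_URL and the offset/limit parameters are hypothetical.
import multiprocessing as mp

import requests

API_URL = "https://example.com/logs"  # hypothetical endpoint
BATCH = 20_000                        # per-call limit mentioned in the question

def fetch(offset: int) -> list:
    """Fetch one batch of records starting at `offset`."""
    resp = requests.get(API_URL, params={"offset": offset, "limit": BATCH})
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    offsets = range(0, 10_000_000, BATCH)   # one 10-million-record part
    with mp.Pool(mp.cpu_count()) as pool:
        for records in pool.imap_unordered(fetch, offsets):
            ...  # append each batch to disk here
```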
QUESTION
I am trying to get timestamps that are accurate down to the microsecond on Windows and macOS in Python 3.10+.
On Windows, I have noticed that Python's built-in `time.time()` (paired with `datetime.fromtimestamp()`) and `datetime.datetime.now()` seem to have a slower clock: they don't have enough resolution to differentiate microsecond-level events. The good news is that `time` functions like `time.perf_counter()` and `time.time_ns()` do seem to use a clock that is fast enough to measure microsecond-level events.
Sadly, I can't figure out how to get them into `datetime` objects. How can I get the output of `time.perf_counter()` or PEP 564's nanosecond-resolution time functions into a `datetime` object?
Note: I don't need nanosecond-level stuff, so it's okay to throw away precision below 1 μs.
Current Solution
This is my current (hacky) solution, which actually works fine, but I am wondering if there's a cleaner way:
...ANSWER
Answered 2022-Feb-26 at 12:56
That's almost as good as it gets, since the C module, if available, overrides all classes defined in the pure-Python implementation of the `datetime` module with the fast C implementation, and there are no hooks.
Reference: python/cpython@cf86e36
Note that:
- There's an intrinsic sub-microsecond error in the accuracy, equal to the time that elapses between obtaining the system time in `datetime.now()` and obtaining the performance counter time.
- There's a sub-microsecond performance cost to adding a `datetime` and a `timedelta`.
Depending on your specific use case, if calling this multiple times, that may or may not matter.
A slight improvement would be:
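The improved code is not reproduced on this page; the following is a minimal sketch of the pattern under discussion (the names are mine): pair one `datetime.now()` reading with one `time.perf_counter()` reading, then derive later timestamps from the performance counter alone.

```python
# Sketch only: the calibration is done once at import time.
import time
from datetime import datetime, timedelta

_EPOCH_DT = datetime.now()          # one-time wall-clock anchor
_EPOCH_PC = time.perf_counter()     # matching performance-counter reading

def high_res_now() -> datetime:
    """Wall-clock datetime with perf_counter resolution."""
    return _EPOCH_DT + timedelta(seconds=time.perf_counter() - _EPOCH_PC)

# Successive calls can now differ at the microsecond level:
print(high_res_now(), high_res_now())
```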
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.