algorithms | : books : Compilation of all algos & data structs
kandi X-RAY | algorithms Summary
kandi X-RAY | algorithms Summary
An organised compilation of all the algorithms I’ve learnt in different languages.
Support
Quality
Security
License
Reuse
Top functions reviewed by kandi - BETA
Currently covering the most popular Java, JavaScript and Python libraries. See a Sample of algorithms
algorithms Key Features
algorithms Examples and Code Snippets
Community Discussions
Trending Discussions on algorithms
QUESTION
The Question
How do I best execute memory-intensive pipelines in Apache Beam?
Background
I've written a pipeline that takes the Naemura Bird dataset and converts the images and annotations to TF Records with TF Examples of the required format for the TF object detection API.
I tested the pipeline using DirectRunner with a small subset of images (4 or 5) and it worked fine.
The Problem
When running the pipeline with a bigger data set (day 1 of 3, ~21GB) it crashes after a while with a non-descriptive SIGKILL
.
I do see a memory peak before the crash and assume that the process is killed because of a too high memory load.
I ran the pipeline through strace
. These are the last lines in the trace:
ANSWER
Answered 2021-Jun-15 at 13:51Multiple things could cause this behaviour, because the pipeline runs fine with less Data, analysing what has changed could lead us to a resolution.
Option 1 : clean your input dataThe third line of the logs you provide might indicate that you're processing unclean data in your bigger pipeline mmap(NULL,
could mean that | "Get Content" >> beam.Map(lambda x: x.read_utf8())
is trying to read a null value.
Is there an empty file somewhere ? Are your files utf8 encoded ?
Option 2 : use smaller files as inputI'm guessing using the fileio.ReadMatches()
will try to load into memory the whole file, if your file is bigger than your memory, this could lead to errors. Can you split your data into smaller files ?
If files are too big for your current machine with a DirectRunner
you could try to use an on-demand infrastructure using another runner on the Cloud such as DataflowRunner
QUESTION
I have this kind of input as below. It is a list of strings, every odd string is a number starting with MR and every even string is some mixed text. I need to convert this list of strings to a pandas data-frame which strictly has two columns, but because some of the MR numbers are present several times paired with different mixed text counter parts I am getting extra columns everywhere where an MR is repeated, as I am demonstrating below:
...ANSWER
Answered 2021-Jun-15 at 11:48Try this
QUESTION
I'm kinda new at algorithms and I'm afraid that my solutions are not correct, help me fix them.
...ANSWER
Answered 2021-Jun-15 at 07:53- the first one is correct
- so as you said valuation instructions are how many time we are assigning values with the operator then I can see almost 10 assign variables(considering that initializing of i and j as well as i++ (i=i+1) and j++)
- for the third question as you are using an array of size n your space complexity is O(n)
- for the 4th question, as you are using two nestedfor loops and you are iterating like n+(n-1)+(n-2)+... = n*(n+1)/2 = (n^2+n)/2 = O(n^2) (Time complexity)
QUESTION
I have a collection where the documents are uniquely identified by a date, and I want to get the n most recent documents. My first thought was to use the date as a document ID, and then my query would sort by ID in descending order. Something like .orderBy(FieldPath.documentId, descending: true).limit(n)
. This does not work, because it requires an index, which can't be created because __name__ only indexes are not supported
.
My next attempt was to use .limitToLast(n)
with the default sort, which is documented here.
By default, Cloud Firestore retrieves all documents that satisfy the query in ascending order by document ID
According to that snippet from the docs, .limitToLast(n)
should work. However, because I didn't specify a sort, it says I can't limit the results. To fix this, I tried .orderBy(FieldPath.documentId).limitToLast(n)
, which should be equivalent. This, for some reason, gives me an error saying I need an index. I can't create it for the same reason I couldn't create the previous one, but I don't think I should need to because they must already have an index like that in order to implement the default ordering.
Should I just give up and copy the document ID into the document as a field, so I can sort that way? I know it should be easy from an algorithms perspective to do what I'm trying to do, but I haven't been able to figure out how to do it using the API. Am I missing something?
Edit: I didn't realize this was important, but I'm using the flutterfire firestore library.
...ANSWER
Answered 2021-Jun-15 at 00:56From what I've found:
FieldPath.documentId does not match on the documentId, but on the refPath (which it gets automatically if passed a document reference).
As such, since the documents are to be sorted by timestamp, it would be more ideal to create a timestamp fieldvalue
for createdAt
rather than a human-readable string which is prone to string length sorting over the value of the string.
From there, you can simply sort by date and limit to last. You can keep the document ID's as you intend.
QUESTION
I am trying encrypting in JS front end and decrypt in python backend using AES GCM cryptographic algorithm. I am using Web cryptography api for JS front end and python cryptography library for python backend as cryptographic library. I have fixed the IV for now in both side. I have implemented encryption-decryption code in both side, they work on each side. But I think the padding is done differently, can't seem to figure out how the padding is done in web cryptography api. Here is the encryption and decryption for the python backend:
...ANSWER
Answered 2021-Jun-14 at 18:01GCM is a stream cipher mode and therefore does not require padding. During encryption, an authentication tag is implicitly generated, which is used for authentication during decryption. Also, an IV/nonce of 12 bytes is recommended for GCM.
The posted Python code unnecessarily pads and doesn't take the authentication tag into account, unlike the JavaScript code, which may be the main reason for the different ciphertexts. Whether this is the only reason and whether the JavaScript code implements GCM correctly, is difficult to say, since the getMessageEncoding()
method was not posted, so testing this was not possible.
Also, both codes apply a 16 bytes IV/nonce instead of the recommended 12 bytes IV/nonce.
Cryptography offers two possible implementations for GCM. One implementation uses the architecture of the non-authenticating modes like CBC. The posted Python code applies this design, but does not take authentication into account and therefore implements GCM incompletely. A correct example for this design can be found here.
Cryptography generally recommends the other approach for GCM (s. the Danger note), namely the AESGCM
class, which performs implicit authentication so that this cannot be accidentally forgotten or incorrectly implemented.
The following implementation uses the AESGCM
class (and also takes into account the optional additional authenticated data):
QUESTION
We develop an application with VuejS in front and an api Nodejs(Restify) in back. We use a third party for give us authentification (Identity provider with OpenId Connect protocole).
So with VueJs we can authenticate, get an access_token and id_token and we pass it in each nodejs request header with bearer.
Now we need to verify,in back, if this token is valid and if the user can access this routes.
Our Identity provider give us an endpoint (jwks_uri) with a keys like:
...ANSWER
Answered 2021-Jun-04 at 17:54I believe the optimal way for small to medium sized application is just to make jwt verification work as a middleware. Something like:
QUESTION
I'm pretty new to algorithms and am trying to solve a problem that involves generating a list of 5,000 numbers in random order each time it is run. Each number in the list must be unique and be between 1 and 5,000 (inclusive).
...ANSWER
Answered 2021-Jun-13 at 18:23QUESTION
I want implement a elliptic curve diffie hellman using HKDF as key derivation function. I am using a python backend and (vanilla) javascript in frontend. I am using python cryptography library in backend and Web Crypto api in frontend as cryptographic library. I created ECDH key pair in both side and exchanged the pbulic keys. Now I am trying to create the AES shared key with the exchanged public key and private key along with HKDF algorithm. I am able to do it in the python backend (I followed this example for the python code):
...ANSWER
Answered 2021-Jun-13 at 11:02The referenced Python code uses P-384 (aka secp384r1) as elliptic curve. This is compatible with the WebCrypto API, which supports three curves P-256 (aka secp256r1), P-384 and P-521 (aka secp521r1), see EcKeyImportParams
.
The following WebCrypto code generates a shared secret using ECDH and derives an AES key from the shared secret using HKDF. In detail the following happens:
- To allow comparison of the derived key with that of the referenced Python code, predefined EC keys are applied. The private key is imported as PKCS#8, the public key as X.509/SPKI. Note that due to a Firefox bug concerning the import of EC keys, the script below cannot be run in the Firefox browser.
- After the import the shared secret is created with ECDH using
deriveBits()
(and notderiveKey()
). - The shared secret is imported with
importKey()
and then the AES key is derived using HKDF, again withderiveBits()
.
QUESTION
I'm trying to train some ML algorithms on some data that I collected, but I received an error for input variables with inconsistent numbers of samples. I'm not really sure what variables needs to be changed or not. I've posted my code below to give you a better understanding of what I'm trying to accomplish:
...ANSWER
Answered 2021-Jun-12 at 12:14The file has to be opened in binary mode.
open(DATA_FILE, 'rb')
QUESTION
I'm trying to get the Python package OSMnx running on my Windows10 machine. I'm still new to python so struggling with the basics. I've followed the instructions here https://osmnx.readthedocs.io/en/stable/ and have successfully created a new conda environment for it to run in. The installation seems to have gone ok. However, as soon as I try and import it, I get the following error
...ANSWER
Answered 2021-Apr-28 at 10:07The module fractions
is part of the Python standard library. There used to be a function gcd
, which, as the linked documentation says, is:
Deprecated since version 3.5: Use
math.gcd()
instead.
Since the function gcd
was removed from the module fractions
in Python 3.9, it seems that the question uses Python 3.9, not Python 3.7.6 as the question notes, because that Python version still had fractions.gcd
.
The error is raised by networkx
. Upgrading to the latest version of networkx
is expected to avoid this issue:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install algorithms
Support
Reuse Trending Solutions
Find, review, and download reusable Libraries, Code Snippets, Cloud APIs from over 650 million Knowledge Items
Find more librariesStay Updated
Subscribe to our newsletter for trending solutions and developer bootcamps
Share this Page