Popular New Releases in Hashing
xxHash
v0.8.1
hashids
4.1.0
hashids.js
2.2.10
range-v3
Thanks, ISO
jsSHA
Release version 3.2.0
Popular Libraries in Hashing
by Cyan4973 c
5761 NOASSERTION
Extremely fast non-cryptographic hash algorithm
by vinkla php
4508 MIT
A small PHP library to generate YouTube-like ids from numbers. Use it when you don't want to expose your database ids to the user.
by niieani typescript
3695 MIT
A small JavaScript library to generate YouTube-like ids from numbers.
by P-H-C c
3544 NOASSERTION
The password hash Argon2, winner of PHC
by gnemoug python
3225
A distributed web crawler implemented with Scrapy, Redis, MongoDB and Graphite: the storage layer is a MongoDB cluster, distribution is implemented with Redis, and crawler status is displayed with Graphite.
by ericniebler c++
3140 NOASSERTION
Range library for C++14/17/20, basis for C++20's std::ranges
by in3rsha ruby
2986 MIT
Animation of the SHA-256 hash function in your terminal.
by troydhanson c
2525 NOASSERTION
C macros for hash tables and more
by ullmark csharp
2072 MIT
A small .NET package to generate YouTube-like hashes from one or many numbers. Use hashids when you do not want to expose your database ids to the user.
Trending New libraries in Hashing
by in3rsha ruby
2986 MIT
Animation of the SHA-256 hash function in your terminal.
by HashPals python
962 GPL-3.0
🔎Searches Hash APIs to crack your hash quickly🔎 If hash is not found, automatically pipes into HashCat⚡
by ankane c
450 NOASSERTION
Open-source vector similarity search for Postgres
by symfony php
259 MIT
The PasswordHasher component provides password hashing utilities.
by Password4j java
185 Apache-2.0
Password4j is a user-friendly cryptographic library that supports Argon2, Bcrypt, Scrypt, PBKDF2 and various cryptographic hash functions.
by tezc c
182 BSD-3-Clause
Common libraries and data structures for C.
by linvon go
150 MIT
A Go implementation of the Cuckoo Filter, an improvement over the Bloom Filter, with configurable filter parameters and space optimization.
by lukeed javascript
128 MIT
A tiny (190B) and extremely fast utility to generate random IDs of fixed length
by backtrace-labs python
107 MIT
UMASH: a fast enough hash and fingerprint with collision bounds
Top Authors in Hashing
1
12 Libraries
1896
2
9 Libraries
486
3
7 Libraries
1907
4
7 Libraries
606
5
6 Libraries
284
6
6 Libraries
541
7
6 Libraries
129
8
6 Libraries
31
9
5 Libraries
794
10
5 Libraries
113
Trending Kits in Hashing
A hashing library implements various hashing algorithms. It helps in securing data, verifying integrity, and protecting sensitive information. Hash functions convert data of arbitrary size into a fixed-size hash value.
There are different types of hashing algorithms available in hashing libraries. Collision-resistant algorithms like SHA-256 and SHA-3 produce hash values that are very unlikely to collide, which ensures data integrity and security. Older algorithms like MD5 can still generate hash values but are less secure, because practical collisions have been demonstrated.
Hashing libraries have many use cases. We use them for password hashing, converting the original password into a hash value; this makes it harder for attackers to recover the original password. Hashing libraries support integrity verification, ensuring files have not been tampered with or corrupted. We also use them for generating digital signatures, verifying authenticity, and implementing secure protocols.
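As a concrete illustration of the password-hashing use case above, here is a minimal sketch using Python's standard-library hashlib.pbkdf2_hmac; the iteration count and salt size are illustrative choices, not production recommendations:

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor; tune for your hardware

def hash_password(password, salt=None):
    # A unique random salt per password defeats precomputed (rainbow-table) attacks.
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password, salt, expected):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(candidate, expected)

salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))                   # False
```

Only the salt and digest are stored; the password itself never needs to be kept.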
Consider factors like algorithm support, performance, and security when choosing a hashing library. Look for a library that supports many hash algorithms, including collision-resistant options. Performance is crucial, especially for large datasets, so it's important to choose a library optimized for speed. Additionally, prioritize vetted libraries for security, and choose libraries that have a large user base and community support.
Hashing libraries can be standalone tools that let developers compute hash values for specific data, or they can be integrated into security solutions such as user authentication or file verification. They provide APIs and functions that simplify hashing data and handling hash operations.
In summary, hashing libraries are vital to data security and integrity. They offer various hashing algorithms and support use cases like password hashing and file integrity verification, providing essential tools for securing sensitive information. Choosing the right library involves considering its algorithm support, performance, security, and integration capabilities, to make sure you select the right fit for your needs.
hashids:
- This library generates unique, reversible hash-like identifiers for encoding values such as database IDs or URL parameters without exposing the raw values.
- It supports custom alphabets, salt values, and minimum hash lengths. It provides flexibility and security.
forge:
- This library offers comprehensive cryptographic functionality. The functionality includes hashing algorithms like MD5, SHA-1, SHA-256, and more.
- It helps secure data, generate digital signatures, and implement cryptographic protocols.
bcrypt.js:
- This library supports the bcrypt hashing algorithm used for password hashing.
- It helps store passwords securely by applying a deliberately expensive hashing function, making it tough for attackers to reverse-engineer the original passwords.
js-spark-md5:
- This library focuses on efficient and fast hashing of large datasets, especially for scenarios such as client-side file hashing and chunked data hashing.
- It helps generate hash values for file integrity verification, data synchronization, and deduplication.
CryptoJS:
- This library helps implement various hashing algorithms, including MD5, SHA-1, SHA-256, and more.
- It helps secure sensitive data by generating unique hash values, for uses such as password hashing, digital signatures, and data integrity checks.
js-sha256:
- This library provides a lightweight and efficient implementation of the SHA-256 hashing algorithm.
- It helps generate cryptographic hash values for data integrity checks, blockchain applications, and cryptographic protocols.
blake3:
- This library implements the BLAKE3 hashing algorithm. It is well-known for its performance and security guarantees.
- It helps in computing hash values for data integrity checks and cryptographic applications. It also helps with file deduplication.
hash-anything:
- This library allows developers to hash any JavaScript object. The objects include complex data structures, arrays, and nested objects.
- It helps generate hash values for data comparison, memoization, and efficient caching.
FAQ
1. What is a cryptographic hash, and what does it do?
A cryptographic hash is a mathematical function that takes an input of any size and produces a fixed-size hash value. It is designed as a one-way function, meaning it is computationally infeasible to reverse the process and recover the original input from the hash value. Cryptographic hashes serve various purposes, such as data integrity verification, password storage, digital signatures, and message authentication.
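The fixed-size and avalanche properties described above are easy to observe with Python's standard hashlib; a quick sketch, not tied to any particular library in this list:

```python
import hashlib

# Inputs of different lengths all produce 32-byte (64 hex character) digests,
# and flipping a single character changes the digest completely.
h1 = hashlib.sha256(b"hello world").hexdigest()
h2 = hashlib.sha256(b"hello worle").hexdigest()
h3 = hashlib.sha256(b"a much longer input " * 100).hexdigest()

print(h1)
print(h2)
print(len(h1), len(h2), len(h3))  # 64 64 64
```

Note there is no function that maps h1 back to "hello world"; recovering an input requires guessing.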
2. How does one encrypt data using a javascript hashing library?
You must follow best practices when hashing data with a JavaScript hashing library (strictly speaking, hashing is not encryption: it is one-way and not reversible). First, use a secure and reliable hashing algorithm, such as SHA-256 or bcrypt. Then, apply proper salting and an adequate iteration count to the hashing process. Salting involves adding a unique random value to each piece of data before hashing, which protects against precomputed attacks. The iteration count is the number of times the hashing algorithm is applied to the data; increasing it increases the time attackers need for brute-force attacks.
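To see why salting defeats precomputed tables, note that the same password hashed with two different salts yields two unrelated digests. A sketch with Python's hashlib.pbkdf2_hmac (the iteration count is illustrative):

```python
import hashlib
import os

password = b"hunter2"
digests = []
for _ in range(2):
    salt = os.urandom(16)  # fresh random salt each time
    digests.append(hashlib.pbkdf2_hmac("sha256", password, salt, 100_000))

# Same password, different salts: the digests do not match, so a
# precomputed table of unsalted hashes is useless to an attacker.
print(digests[0] != digests[1])  # True (with overwhelming probability)
```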
3. What are the benefits of hashing binary data instead of plain text?
Hashing binary data instead of plain text provides several benefits. First, it allows you to hash any data, including files, images, or other binary formats. Hashing the raw bytes ensures the integrity of the entire file: any change to the content results in a different hash value. Hashing binary data is also efficient, as it operates on the binary representation rather than a textual one.
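A common way to hash binary data is to stream the file in chunks so large files never load fully into memory; a minimal sketch (the chunk size is an arbitrary choice):

```python
import hashlib
import os
import tempfile

def file_sha256(path, chunk_size=65536):
    # Read the file's raw bytes in fixed-size chunks and feed them to the hash.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Demonstration with a throwaway binary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"binary \x00 payload")
digest = file_sha256(tmp.name)
print(digest == hashlib.sha256(b"binary \x00 payload").hexdigest())  # True
os.unlink(tmp.name)
```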
4. How is key derivation used in javascript hashing libraries?
Key derivation in JavaScript hashing libraries derives encryption keys from a given secret, such as a password. It involves applying a hashing function many times, usually with extra processing steps such as salting and an iteration count. Key derivation functions are designed to make derivation deliberately slow and expensive, which helps prevent brute-force attacks and enhances security.
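The "deliberately slow and expensive" idea can be illustrated with Python's standard-library hashlib.scrypt; the cost parameters here are illustrative values, not tuned recommendations:

```python
import hashlib
import os

salt = os.urandom(16)
# n (CPU/memory cost), r (block size), and p (parallelism) make the
# derivation deliberately expensive, slowing down brute-force attacks.
key = hashlib.scrypt(b"my passphrase", salt=salt, n=2**14, r=8, p=1,
                     maxmem=64 * 1024 * 1024, dklen=32)
print(len(key))  # 32 bytes: a 256-bit key derived from the passphrase
```

Re-running with the same passphrase and salt reproduces the same key, so only the salt and parameters need to be stored.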
5. What advantages does SHA-1 have over other algorithms for cryptography?
SHA-1 is an older cryptographic hash algorithm deprecated due to security vulnerabilities. We don't recommend it, as it is susceptible to collision attacks. Other algorithms like SHA-256 and SHA-3 provide stronger security and are more robust. So, SHA-1 has no advantages over other cryptography algorithms, and we should avoid it.
6. Are there any differences between server and client-side hashing libraries?
Server-side and client-side hashing libraries serve the same purpose: to compute hash values. But there might be differences in implementation and available features. Server-side hashing libraries run in back-end environments and often provide extra functionality such as secure password storage, key derivation, and handling of sensitive data. Client-side hashing focuses on data integrity and user-authentication scenarios.
7. How can I use a javascript hashing library when building web applications?
JavaScript hashing libraries provide APIs and functions that let developers integrate hash calculations into their web applications. When building web applications, they can enhance security: verifying data integrity during transmission or storage, generating digital signatures, or implementing secure protocols. We can also employ them for tasks like hashing passwords before storing them in a database.
8. What should we include in the developer roadmap for creating secure code with hashes?
The developer roadmap for creating secure code with hashes includes the following:
- understanding different hashing algorithms and their specific use cases,
- implementing proper salting and iteration count for password hashing,
- storing hashed passwords,
- validating and verifying data integrity using hash values,
- protecting sensitive data during transmission by using hash-based message authentication codes (HMAC),
- keeping up with best practices and security updates of hashing and cryptography.
Additionally, developers should be familiar with common attack vectors. They should know about collision and precomputation attacks and apply appropriate countermeasures. They should do regular code reviews, stay updated, and follow secure coding guidelines. This helps in creating robust and secure applications using hashes.
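The HMAC item in the roadmap above can be sketched with Python's standard hmac module; the key and message here are placeholders:

```python
import hashlib
import hmac

key = b"shared-secret-key"  # placeholder; use a strong random key in practice
message = b'{"amount": 100, "to": "alice"}'

# The sender computes a tag over the message with the shared key.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key, message, tag):
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # compare_digest prevents timing attacks on the comparison.
    return hmac.compare_digest(expected, tag)

print(verify(key, message, tag))                              # True
print(verify(key, b'{"amount": 999, "to": "mallory"}', tag))  # False: tampering detected
```

Unlike a plain hash, the tag cannot be recomputed by an attacker who does not hold the key.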
Trending Discussions on Hashing
Find near duplicate and faked images
Is there a need for transitivity in Python __eq__?
Unhashing a hashed (MD5) email address
Channel hangs, probably not closing at the right place
How can I join two lists in less than O(N*M)?
How reproducible / deterministic is Parquet format?
Angular 12 app still being cached with output-hashing=all
Where to store access token and how to keep track of user (using JWT token in Http only cookie)
Flutter Web Page Routing Issue
Ionic + Fastlane | Android "error: package android.support.v4.content does not exist"
QUESTION
Find near duplicate and faked images
Asked 2022-Mar-24 at 01:32
I am using the perceptual hashing technique to find near-duplicate and exact-duplicate images. The code works perfectly for finding exact duplicates. However, finding near-duplicate and slightly modified images seems to be difficult, as the difference score between their hashes is generally similar to the hash difference of completely different random images.
To tackle this, I tried reducing the near-duplicate images to 50x50 pixels and converting them to black/white, but I still don't get what I need (a small difference score).
This is a sample of a near duplicate image pair:
Image 1 (a1.jpg):
Image 2 (b1.jpg):
The hash difference score of these images is: 24
When pixelated (50x50 pixels), they look like this:
rs_a1.jpg
rs_b1.jpg
The hash difference score of the pixelated images is even bigger: 26
Below are two more examples of near-duplicate image pairs, as requested by @ann zen:
Pair 1
Pair 2
The code I use to reduce the image size is this :
from PIL import Image

with Image.open(image_path) as image:
    reduced_image = image.resize((50, 50)).convert('RGB').convert("1")
And the code for comparing two image hashes:

from PIL import Image
import imagehash

with Image.open(image1_path) as img1:
    hashing1 = imagehash.phash(img1)
with Image.open(image2_path) as img2:
    hashing2 = imagehash.phash(img2)
print('difference : ', hashing1 - hashing2)
ANSWER
Answered 2022-Mar-22 at 12:48Rather than using pixelisation to process the images before finding the difference/similarity between them, simply give them some blur using the cv2.GaussianBlur()
method, and then use the cv2.matchTemplate()
method to find the similarity between them:
import cv2
import numpy as np

def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return cv2.GaussianBlur(img_gray, (43, 43), 21)

def confidence(img1, img2):
    res = cv2.matchTemplate(process(img1), process(img2), cv2.TM_CCOEFF_NORMED)
    return res.max()

img1s = list(map(cv2.imread, ["img1_1.jpg", "img1_2.jpg", "img1_3.jpg"]))
img2s = list(map(cv2.imread, ["img2_1.jpg", "img2_2.jpg", "img2_3.jpg"]))

for img1, img2 in zip(img1s, img2s):
    conf = confidence(img1, img2)
    print(f"Confidence: {round(conf * 100, 2)}%")
Output:
Confidence: 83.6%
Confidence: 84.62%
Confidence: 87.24%
Here are the images used for the program above:
img1_1.jpg
& img2_1.jpg
:
img1_2.jpg
& img2_2.jpg
:
img1_3.jpg
& img2_3.jpg
:
To prove that the blur doesn't produce really off false-positives, I ran this program:
import cv2
import numpy as np

def process(img):
    h, w, _ = img.shape
    img = cv2.resize(img, (350, h * w // 350))
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return cv2.GaussianBlur(img_gray, (43, 43), 21)

def confidence(img1, img2):
    res = cv2.matchTemplate(process(img1), process(img2), cv2.TM_CCOEFF_NORMED)
    return res.max()

img1s = list(map(cv2.imread, ["img1_1.jpg", "img1_2.jpg", "img1_3.jpg"]))
img2s = list(map(cv2.imread, ["img2_1.jpg", "img2_2.jpg", "img2_3.jpg"]))

for i, img1 in enumerate(img1s, 1):
    for j, img2 in enumerate(img2s, 1):
        conf = confidence(img1, img2)
        print(f"img1_{i} img2_{j} Confidence: {round(conf * 100, 2)}%")
Output:
img1_1 img2_1 Confidence: 84.2% # Corresponding images
img1_1 img2_2 Confidence: -10.86%
img1_1 img2_3 Confidence: 16.11%
img1_2 img2_1 Confidence: -2.5%
img1_2 img2_2 Confidence: 84.61% # Corresponding images
img1_2 img2_3 Confidence: 43.91%
img1_3 img2_1 Confidence: 14.49%
img1_3 img2_2 Confidence: 59.15%
img1_3 img2_3 Confidence: 87.25% # Corresponding images
Notice how only when matching the images with their corresponding images does the program output high confidence levels (84+%).
For comparison, here are the results without blurring the images:
import cv2
import numpy as np

def process(img):
    h, w, _ = img.shape
    img = cv2.resize(img, (350, h * w // 350))
    return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

def confidence(img1, img2):
    res = cv2.matchTemplate(process(img1), process(img2), cv2.TM_CCOEFF_NORMED)
    return res.max()

img1s = list(map(cv2.imread, ["img1_1.jpg", "img1_2.jpg", "img1_3.jpg"]))
img2s = list(map(cv2.imread, ["img2_1.jpg", "img2_2.jpg", "img2_3.jpg"]))

for i, img1 in enumerate(img1s, 1):
    for j, img2 in enumerate(img2s, 1):
        conf = confidence(img1, img2)
        print(f"img1_{i} img2_{j} Confidence: {round(conf * 100, 2)}%")
Output:
img1_1 img2_1 Confidence: 66.73%
img1_1 img2_2 Confidence: -6.97%
img1_1 img2_3 Confidence: 11.01%
img1_2 img2_1 Confidence: 0.31%
img1_2 img2_2 Confidence: 65.33%
img1_2 img2_3 Confidence: 31.8%
img1_3 img2_1 Confidence: 9.57%
img1_3 img2_2 Confidence: 39.74%
img1_3 img2_3 Confidence: 61.16%
QUESTION
Is there a need for transitivity in Python __eq__?
Asked 2022-Mar-15 at 07:46
I'm implementing my own class, with a custom __eq__, and I'd like to return True for things that are not "equal" in a mathematical sense, but "match" in a fuzzy way.
An issue with this, however, is that it leads to loss of transitivity in a mathematical sense, i.e. a == b && b == c, while a may not be equal to c.
Question: does Python depend on __eq__ being transitive? Will what I'm trying to do break things, or is it possible to do this as long as I'm careful myself not to assume transitivity?
I want to match telephone numbers with one another, while those may be either formatted internationally, or just for domestic use (without a country code specified). If there's no country code specified, I'd like a number to be equal to a number with one, but if it is specified, it should only be equal to numbers with the same country-code, or without one.
So:
- Of course, +31 6 12345678 should equal +31 6 12345678, and 06 12345678 should equal 06 12345678
- +31 6 12345678 should equal 06 12345678 (and v.v.)
- +49 6 12345678 should equal 06 12345678 (and v.v.)
- But +31 6 12345678 should not be equal to +49 6 12345678
Edit: I don't have a need for hashing (and so won't implement it), so that at least makes life easier.
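The fuzzy, non-transitive matching described above can be sketched like this; the parsing is deliberately naive and assumes numbers formatted like the examples ("+CC rest" or a bare domestic number):

```python
class PhoneNumber:
    def __init__(self, raw):
        if raw.startswith("+"):
            cc, _, rest = raw[1:].partition(" ")
            self.country = cc
            self.local = rest.replace(" ", "").lstrip("0")
        else:
            self.country = None
            self.local = raw.replace(" ", "").lstrip("0")

    def __eq__(self, other):
        if not isinstance(other, PhoneNumber):
            return NotImplemented
        if self.local != other.local:
            return False
        # A missing country code matches any country code.
        return (self.country is None or other.country is None
                or self.country == other.country)

a = PhoneNumber("+31 6 12345678")
b = PhoneNumber("06 12345678")
c = PhoneNumber("+49 6 12345678")
print(a == b, b == c, a == c)  # True True False: equality is not transitive
```

Note that defining __eq__ implicitly sets __hash__ to None, which is fine here since the question states hashing is not needed.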
ANSWER
Answered 2022-Mar-14 at 18:06
There is no MUST but a SHOULD relation for comparisons being consistent with the commonly understood relations. Python expressly does not enforce this, and float is a built-in type with different behaviour due to float("nan").
Expressions: Value comparisons[…]
User-defined classes that customize their comparison behavior should follow some consistency rules, if possible:
- […]
- Comparison should be symmetric. In other words, the following expressions should have the same result: x == y and y == x; x != y and y != x; x < y and y > x; x <= y and y >= x
- Comparison should be transitive. The following (non-exhaustive) examples illustrate that:
- x > y and y > z implies x > z
- x < y and y <= z implies x < z
Python does not enforce these consistency rules. In fact, the not-a-number values are an example for not following these rules.
Still, keep in mind that exceptions are incredibly rare and subject to being ignored: most people would treat float as having total order, for example. Using uncommon comparison relations can seriously increase maintenance effort.
Canonical ways to model "fuzzy matching" via operators are as subset, subsequence or containment using unsymmetric operators.
- The set and frozenset types support >, >= and so on to indicate that one set encompasses all values of another.
>>> a, b = {1, 5, 6, 8}, {5, 6}
>>> a >= a, a >= b, b >= a
(True, True, False)
- str and bytes support in to indicate that subsequences are covered.

>>> a, b = "+31 6 12345678", "6 12345678"
>>> a in b, b in a
(False, True)
- range and ipaddress networks support in to indicate that specific items are covered.

>>> from ipaddress import IPv4Address, IPv4Network
>>> IPv4Address('192.0.2.6') in IPv4Network('192.0.2.0/28')
True
Notably, while these operators may be transitive, they are not symmetric. For example, a >= b and c >= b does not imply b >= c, and thus not a >= c, or vice versa.
Practically, one could model "number without country code" as the superset of "number with country code" for the same number. This means that 06 12345678 >= +31 6 12345678 and 06 12345678 >= +49 6 12345678, but not vice versa. In order to do a symmetric comparison, one would use a >= b or b >= a instead of a == b.
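The superset modelling can be sketched as follows; the Phone class and its pre-parsed fields are hypothetical simplifications for illustration:

```python
class Phone:
    """a >= b means "a matches at least every concrete number that b matches"."""
    def __init__(self, country, local):
        self.country, self.local = country, local

    def __ge__(self, other):
        if self.local != other.local:
            return False
        # A number without a country code is the broader (superset) form.
        return self.country is None or self.country == other.country

nl = Phone("31", "612345678")    # +31 6 12345678
de = Phone("49", "612345678")    # +49 6 12345678
dom = Phone(None, "612345678")   # 06 12345678

print(dom >= nl, dom >= de)  # True True: the domestic form covers both
print(nl >= de)              # False: different country codes
# Symmetric fuzzy match, as suggested above: a >= b or b >= a
print((nl >= dom) or (dom >= nl))  # True
```

This keeps >= transitive while the symmetric "or" combination gives the fuzzy match without pretending it is equality.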
QUESTION
Unhashing a hashed (MD5) email address
Asked 2022-Feb-15 at 15:55
I know that in hashing you, by definition, lose information. However, email addresses are constrained: with the information available I would know a potential domain of the email, and it must contain an @. Do these constraints change anything about the problem? Or is the best way simply to make a guess and see if the hash is the same? Also, MD5 is no longer as secure as it once was.
Thanks
ANSWER
Answered 2022-Feb-15 at 15:55
That is the point of MD5 hashing: even a minute change in the string changes the hash completely. So these constraints change nothing about the problem itself.
However, since you said that it's an email and that you know the potential domain, you can try this technique:
- Generate a list of potential local parts from the 26 letters, say of maximum length 10.
- Then generate an MD5 hash for each possibility and check whether it equals the one you have.
import hashlib
from itertools import combinations
import time

start = time.time()
your_md5_hash = 'your_md5_hash'
letters = 'abcdefghijklmnopqrstuvwxyz'
possible_words = []
for r in range(1, 10):  # change 10 to the maximum size of your email
    for combo in combinations(list(letters), r=r):
        res = ''.join(combo)
        possible_words.append(res)

possible_words = [''.join(x) + '@domain.com' for x in possible_words]
print(len(possible_words))
for x in possible_words:
    res = hashlib.md5(x.encode()).hexdigest()  # compare hex digests, not hash objects
    if res == your_md5_hash:
        print(res)
        print(x)
        print("RESULT_FOUND")
        exit(0)

print(time.time() - start)
This is a brute-force approach, and if you know the size of your email it could work. (Note that combinations never repeats a letter and only yields letters in alphabetical order, so it misses many addresses; itertools.product would enumerate all strings at far greater cost.) Secondly, note that if you do not know the size, the number of possibilities increases exponentially.
For instance, the number of combinations as of now is 5,658,536, and it took my basic laptop 6 seconds to process.
QUESTION
Channel hangs, probably not closing at the right place
Asked 2022-Jan-29 at 19:46
I'm trying to learn Go while writing a small program. The program should parse a PATH recursively, as efficiently and fast as possible, and output the full filename (with the path included) and the SHA-256 file hash of the file.
If the file hashing fails, I want to keep the error and add it to the output string (at the hash position).
The result should return a string on the console like: fileXYZ||hash
Unfortunately, the program hangs at some point. I guess some of my channels are not closing properly and wait indefinitely for input. I've been trying for quite some time to fix the problem, but without success.
Does anyone have an idea why the output hangs? Many many thx in advance, any input/advice for a Go newcomer is welcome too ;-).
(I wrote separate functions as I wanna add additional features after having fixed this issue.)
Thanks a lot! Didier
Here is the code:
package main

import (
    "crypto/sha256"
    "encoding/hex"
    "flag"
    "fmt"
    "io"
    "log"
    "os"
    "path/filepath"
    "time"
)

func main() {
    pathParam := flag.String("path", ".", "Enter Filesystem Path to list folders")
    flag.Parse()
    start := time.Now()
    run(*pathParam)
    elapsed := time.Since(start)
    log.Printf("Time elapsed: %v", elapsed)
}

func run(path string) {
    chashes := make(chan string, 50)
    cfiles := make(chan string)

    go func() {
        readfs(path, cfiles)
        defer close(cfiles)
    }()
    go func() {
        generateHash(cfiles, chashes)
    }()
    defer close(chashes)
    for hash := range chashes {
        fmt.Println(hash)
    }
}

func readfs(path string, cfiles chan string) {
    files, err := os.ReadDir(path)
    if err != nil {
        log.Fatalln(err)
    }
    for _, file := range files {
        filename := filepath.Join(path, file.Name())
        if file.IsDir() {
            readfs(filename, cfiles)
            continue
        } else {
            cfiles <- filename
        }
    }
}

func generateHash(cfiles chan string, chashes chan string) {
    for filename := range cfiles {
        go func(filename string) {
            var checksum string
            var oError bool = false
            file, err := os.Open(filename)
            if err != nil {
                oError = true
                errorMsg := "ERROR: " + err.Error()
                log.Println(errorMsg)
                checksum = errorMsg
            }
            defer file.Close()

            if !oError {
                hash := sha256.New()
                if _, err := io.Copy(hash, file); err != nil {
                    errorMsg := "ERROR: " + err.Error()
                    log.Println(errorMsg)
                    checksum = errorMsg
                }
                if len(checksum) == 0 {
                    checksum = hex.EncodeToString(hash.Sum(nil))
                }
            }
            chashes <- filename + "||" + checksum
        }(filename)
    } //for files
}
ANSWER
Answered 2022-Jan-29 at 19:46 The following loop hangs because chashes is never closed:
for hash := range chashes {
    fmt.Println(hash)
}
Fix by closing chashes
after all the hashers are completed. Use a sync.WaitGroup to wait for the hashers to complete.
func generateHash(cfiles chan string, chashes chan string) {
    var wg sync.WaitGroup
    for filename := range cfiles {
        wg.Add(1)
        go func(filename string) {
            defer wg.Done()
            var checksum string
            var oError bool = false
            file, err := os.Open(filename)
            if err != nil {
                oError = true
                errorMsg := "ERROR: " + err.Error()
                log.Println(errorMsg)
                checksum = errorMsg
            }
            defer file.Close()

            if !oError {
                hash := sha256.New()
                if _, err := io.Copy(hash, file); err != nil {
                    errorMsg := "ERROR: " + err.Error()
                    log.Println(errorMsg)
                    checksum = errorMsg
                }
                if len(checksum) == 0 {
                    checksum = hex.EncodeToString(hash.Sum(nil))
                }
            }
            chashes <- filename + "||" + checksum
        }(filename)
    } //for files

    // Wait for the hashers to complete.
    wg.Wait()

    // Close the channel to cause main() to break
    // out of for range on chashes.
    close(chashes)
}
Remove defer close(chashes) from run().
QUESTION
How can I join two lists in less than O(N*M)?
Asked 2021-Dec-25 at 00:43 Assume we have two tables (think SQL tables), where the primary key in one of them is the foreign key in the other. I'm supposed to write a simple algorithm that imitates joining these two tables. I thought about iterating over each element in the primary-key column of the first table, with a second loop that checks whether the foreign key matches, then storing matches in an external array or list. However, this takes O(N*M), and I need something better. There is a hint in the textbook that it involves hashing; however, I'm not sure how hashing could be implemented here or how it would help.
Editing to add an example:
Table Employees
|ID (Primary Key)| Name | Department|
|:---------------|:----:|----------:|
| 1 | Jack | IT |
| 2 | Amy | Finance |

Table Transactions
|Sold Product| Sold By(Foreign Key)| Sold To|
|:-----------|:-------------------:|-------:|
| TV | 1 | Mary |
| Radio | 1 | Bob |
| Mobile | 2 | Lisa |
What I want to do is write an algorithm that joins these two tables using hashing - nothing complex, just a simple idea of how to do that.
ANSWER
Answered 2021-Dec-24 at 22:18 Read the child table's primary and foreign keys into a map where the keys are the foreign keys and the values are the primary keys. Keep in mind that one foreign key can map to multiple primary keys if this is a one-to-many relationship.
Now iterate over the primary keys of the parent table and, for each primary key, check whether it exists in the map. If so, add a tuple of the primary keys of the related rows to the result array (or however you want to store it).
The time complexity is O(n + m): we iterate over the rows of each table once, and since lookup in the map is constant time, it adds nothing asymptotically.
Space complexity is O(m), where m is the number of rows in the child table. This is additional space compared to the naive solution, traded for the improved time complexity.
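The map-based join described above can be sketched in Python; the tables follow the example from the question, and the dict-based row format is only for illustration:

```python
from collections import defaultdict

def hash_join(parent_rows, child_rows, parent_key, foreign_key):
    """Join two lists of dicts in O(n + m) using a hash map.

    Build phase: index child rows by their foreign key.
    Probe phase: for each parent row, look up matching children in O(1).
    """
    index = defaultdict(list)  # foreign key -> list of child rows (one-to-many)
    for child in child_rows:
        index[child[foreign_key]].append(child)

    joined = []
    for parent in parent_rows:
        for child in index.get(parent[parent_key], []):
            joined.append({**parent, **child})
    return joined

employees = [
    {"ID": 1, "Name": "Jack", "Department": "IT"},
    {"ID": 2, "Name": "Amy", "Department": "Finance"},
]
transactions = [
    {"Sold Product": "TV", "Sold By": 1, "Sold To": "Mary"},
    {"Sold Product": "Radio", "Sold By": 1, "Sold To": "Bob"},
    {"Sold Product": "Mobile", "Sold By": 2, "Sold To": "Lisa"},
]

result = hash_join(employees, transactions, "ID", "Sold By")
```

Each parent row is merged with every matching child row, so Jack (who sold two products) appears twice in the output, exactly as an SQL inner join would produce.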
QUESTION
How reproducible / deterministic is Parquet format?
Asked 2021-Dec-09 at 03:55 I'm seeking advice from people deeply familiar with the binary layout of Apache Parquet:
Having a data transformation F(a) = b, where F is fully deterministic and the same exact versions of the entire software stack (framework, arrow & parquet libraries) are used - how likely am I to get an identical binary representation of dataframe b on different hosts every time b is saved into Parquet?
In other words, how reproducible is Parquet at the binary level? When data is logically the same, what can cause binary differences?
- Can there be uninitialized memory between values due to alignment?
- Assuming all serialization settings (compression, chunking, use of dictionaries, etc.) are the same, can the result still drift?
I'm working on a system for fully reproducible and deterministic data processing and computing dataset hashes to assert these guarantees.
My key goal has been to ensure that dataset b contains an identical set of records as dataset b' - this is of course very different from hashing a binary representation of Arrow/Parquet. Not wanting to deal with the reproducibility of storage formats, I've been computing logical data hashes in memory. This is slow but flexible; e.g. my hash stays the same even if records are re-ordered (which I consider an equivalent dataset).
But when thinking about integrating with IPFS and other content-addressable storages that rely on file hashes, it would simplify the design a lot to have just one hash (physical) instead of two (logical + physical) - but this means I have to guarantee that Parquet files are reproducible.
Update
I decided to continue using logical hashing for now.
I've created a new Rust crate arrow-digest that implements the stable hashing for Arrow arrays and record batches and tries hard to hide the encoding-related differences. The crate's README describes the hashing algorithm if someone finds it useful and wants to implement it in another language.
I'll continue to expand the set of supported types as I'm integrating it into the decentralized data processing tool I'm working on.
In the long term, I'm not sure logical hashing is the best way forward - a subset of Parquet that makes some efficiency sacrifices just to make file layout deterministic might be a better choice for content-addressability.
ANSWER
Answered 2021-Dec-05 at 04:30 At least in Arrow's implementation, I would expect (but haven't verified) that the exact same input (including identical metadata), in the same order and with the same configuration, yields deterministic output (we try not to leave uninitialized values, for security reasons), assuming the chosen compression algorithm also makes that guarantee. It is possible there is some hash-map iteration for metadata or elsewhere that might break this assumption.
As @Pace pointed out, I would not rely on this and recommend against it. There is nothing in the spec that guarantees it, and since the writer version is persisted when writing a file, you are guaranteed a breakage if you ever decide to upgrade. Things will also break if additional metadata is added or removed (I believe in the past there have been some bug fixes for round-tripping data sets that would have caused non-determinism).
So in summary: this might or might not work today, but even if it does, I would expect it to be very brittle.
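For contrast, the logical-versus-physical distinction from the question can be illustrated with a toy sketch: an order-independent hash of a record set versus a hash of the exact file bytes. The JSON-based canonicalization here is only illustrative and is not the arrow-digest algorithm:

```python
import hashlib
import json

def physical_hash(path):
    """Hash of the exact bytes on disk; any layout or encoding difference
    in the file changes this digest, even if the data is logically equal."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def logical_hash(records):
    """Order-independent hash of a record set: serialize each record to
    JSON with sorted keys, sort the serialized rows, then hash the result."""
    canon = sorted(json.dumps(r, sort_keys=True) for r in records)
    return hashlib.sha256("\n".join(canon).encode()).hexdigest()
```

With this scheme, re-ordered records hash identically (matching the "equivalent dataset" notion above), while any change to the data itself changes the digest.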
QUESTION
Angular 12 app still being cached with output-hashing=all
Asked 2021-Dec-03 at 14:26 I have an Angular 12 application that has different build environments (dev/staging/prod), and I have configured these with output hashing on in angular.json:
"configurations": {
  "production": {
    "fileReplacements": [
      {
        "replace": "src/environments/environment.ts",
        "with": "src/environments/environment.prod.ts"
      }
    ],
    "optimization": true,
    "outputHashing": "all" <-----
The output files do include a hash, but I am still seeing the old version unless I do a hard refresh in the browser.
main.07258fce7f84f8b6013e.js
polyfills.4cc5b416b55531e38cd8.js
runtime.734c276f75a5d81bc54b.js
The landing page is just a login form, but the version is shown beneath the form. It shows the old version until a hard refresh in the browser, at which point it correctly shows the new version.
Is this due to index.html being cached along with all of its old JS references? If so, how do I cache-bust it?
ANSWER
Answered 2021-Nov-25 at 08:51 In case you're using a service worker (e.g. @angular/pwa, which installs @angular/service-worker along with it), your entire Angular app is being cached by the browser. This includes index.html + all JavaScript files + all stylesheets.
To have a new version of your application pushed to your users, you have to do 2 things:
Update your ngsw-config.json on each new release:
{
  "version": 1, // Or ascending
  ...
}
Call SwUpdate:
constructor(private swUpdate: SwUpdate) {
  ...

  //#region Check for updates
  if (this.swUpdate.isEnabled) {
    this.swUpdate.activated.subscribe((upd) => {
      window.location.reload();
    });
    this.swUpdate.available.subscribe((upd) => {
      this.swUpdate.activateUpdate();
    }, (error) => {
      console.error(error);
    });
    this.swUpdate.checkForUpdate().then(() => {
    }).catch((error) => {
      console.error('Could not check for app updates', error);
    });
  }
  //#endregion
}
QUESTION
Where to store access token and how to keep track of user (using JWT token in Http only cookie)
Asked 2021-Nov-16 at 08:54 Trying to understand how to get and then save the user on the client (using a JWT token in an HTTP-only cookie), so that I can do conditional rendering. What I'm having difficulty with is how to continuously know whether the user is logged in or not, without having to send a request to the server each time the user changes/refreshes a page. (Note: the problem is not how to get the token into the HTTP-only cookie; I know that this is done through withCredentials: true.)
So my problem is how to get/store the access token so that the client does not have to make a request to the server every time the user does something on the website. For example, the Navbar should do conditional rendering depending on whether the user is logged in, and I don't want to do "ask the server if the user has an access token; if not, check if the user has a refresh token, then return a new access token if valid, else redirect to the login page" every single time the user switches pages.
Client:
UserContext.js
import { createContext } from "react";
export const UserContext = createContext(null);
App.js
const App = () => {
  const [context, setContext] = useState(null);

  return (
    <div className="App">
      <BrowserRouter>
        <UserContext.Provider value={{ context, setContext }}>
          <Navbar />
          <Route path="/" exact component={LandingPage} />
          <Route path="/sign-in" exact component={SignIn} />
          <Route path="/sign-up" exact component={SignUp} />
          <Route path="/profile" exact component={Profile} />
        </UserContext.Provider>
      </BrowserRouter>
    </div>
  );
};

export default App;
Profile.js
import { GetUser } from "../api/AuthenticateUser";

const Profile = () => {
  const { context, setContext } = useContext(UserContext);

  return (
    <div>
      {context}
      <button onClick={() => GetUser()}>Change context</button>
    </div>
  );
};

export default Profile;
AuthenticateUser.js
import axios from "axios";

export const GetUser = () => {
  try {
    axios
      .get("http://localhost:4000/get-user", {
        withCredentials: true,
      })
      .then((response) => {
        console.log(response);
      });
  } catch (e) {
    console.log(`Axios request failed: ${e}`);
  }
};
Server:
AuthenticateUser.js
const express = require("express");
const app = express();
require("dotenv").config();
const cors = require("cors");
const mysql = require("mysql");
const jwt = require("jsonwebtoken");
const cookieParser = require("cookie-parser");
// hashing algorithm
const bcrypt = require("bcrypt");
const salt = 10;

// app objects instantiated on creation of the express server
app.use(
  cors({
    origin: ["http://localhost:3000"],
    methods: ["GET", "POST"],
    credentials: true,
  })
);
app.use(express.json());
app.use(express.urlencoded({ extended: true }));
app.use(cookieParser());

const db = mysql.createPool({
  host: "localhost",
  user: "root",
  password: "password",
  database: "mysql_db",
});

//create access token
const createAccessToken = (user) => {
  // create new JWT access token
  const accessToken = jwt.sign(
    { id: user.id, email: user.email },
    process.env.ACCESS_TOKEN_SECRET,
    {
      expiresIn: "1h",
    }
  );
  return accessToken;
};

//create refresh token
const createRefreshToken = (user) => {
  // create new JWT refresh token
  const refreshToken = jwt.sign(
    { id: user.id, email: user.email },
    process.env.REFRESH_TOKEN_SECRET,
    {
      expiresIn: "1m",
    }
  );
  return refreshToken;
};

// verify if user has a valid token, when user wants to access resources
const authenticateAccessToken = (req, res, next) => {
  //check if user has access token
  const accessToken = req.cookies["access-token"];

  // if access token does not exist
  if (!accessToken) {
    return res.sendStatus(401);
  }

  // check if access token is valid
  // use verify function to check if token is valid
  jwt.verify(accessToken, process.env.ACCESS_TOKEN_SECRET, (err, user) => {
    if (err) return res.sendStatus(403);
    req.user = user;
    return next();
  });
};

app.post("/token", (req, res) => {
  const refreshToken = req.cookies["refresh-token"];
  // check if refresh token exist
  if (!refreshToken) return res.sendStatus(401);

  // verify refresh token
  jwt.verify(refreshToken, process.env.REFRESH_TOKEN_SECRET, (err, user) => {
    if (err) return res.sendStatus(401);

    // check for refresh token in database and identify potential user
    sqlFindUser = "SELECT * FROM user_db WHERE refresh_token = ?";
    db.query(sqlFindUser, [refreshToken], (err, user) => {
      // if no user found
      if (user.length === 0) return res.sendStatus(401);

      const accessToken = createAccessToken(user[0]);
      res.cookie("access-token", accessToken, {
        maxAge: 10000 * 60, //1h
        httpOnly: true,
      });
      res.send(user[0]);
    });
  });
});

/**
 * Log out functionality which deletes all cookies containing tokens and deletes refresh token from database
 */
app.delete("/logout", (req, res) => {
  const refreshToken = req.cookies["refresh-token"];
  // delete refresh token from database
  const sqlRemoveRefreshToken =
    "UPDATE user_db SET refresh_token = NULL WHERE refresh_token = ?";
  db.query(sqlRemoveRefreshToken, [refreshToken], (err, result) => {
    if (err) return res.sendStatus(401);

    // delete all cookies
    res.clearCookie("access-token");
    res.clearCookie("refresh-token");
    res.end();
  });
});

// handle user sign up
app.post("/sign-up", (req, res) => {
  //request information from frontend
  const { first_name, last_name, email, password } = req.body;

  // hash using bcrypt
  bcrypt.hash(password, salt, (err, hash) => {
    if (err) {
      res.send({ err: err });
    }

    // insert into backend with hashed password
    const sqlInsert =
      "INSERT INTO user_db (first_name, last_name, email, password) VALUES (?,?,?,?)";
    db.query(sqlInsert, [first_name, last_name, email, hash], (err, result) => {
      res.send(err);
    });
  });
});

/*
 * Handle user login
 */
app.post("/sign-in", (req, res) => {
  const { email, password } = req.body;

  sqlSelectAllUsers = "SELECT * FROM user_db WHERE email = ?";
  db.query(sqlSelectAllUsers, [email], (err, user) => {
    if (err) {
      res.send({ err: err });
    }

    if (user && user.length > 0) {
      // given the email check if the password is correct
      bcrypt.compare(password, user[0].password, (err, compareUser) => {
        if (compareUser) {
          //req.session.email = user;
          // create access token
          const accessToken = createAccessToken(user[0]);
          const refreshToken = createRefreshToken(user[0]);
          // create cookie and store it in users browser
          res.cookie("access-token", accessToken, {
            maxAge: 10000 * 60, //1h
            httpOnly: true,
          });
          res.cookie("refresh-token", refreshToken, {
            maxAge: 2.63e9, // approx 1 month
            httpOnly: true,
          });

          // update refresh token in database
          const sqlUpdateToken =
            "UPDATE user_db SET refresh_token = ? WHERE email = ?";
          db.query(
            sqlUpdateToken,
            [refreshToken, user[0].email],
            (err, result) => {
              if (err) {
                res.send(err);
              }
              res.sendStatus(200);
            }
          );
        } else {
          res.send({ message: "Wrong email or password" });
        }
      });
    } else {
      res.send({ message: "Wrong email or password" });
    }
  });
});

app.get("/get-user", (req, res) => {
  const accessToken = req.cookies["access-token"];
  const refreshToken = req.cookies["refresh-token"];
  //if (!accessToken && !refreshToken) res.sendStatus(401);

  // get user from database using refresh token
  // check for refresh token in database and identify potential user
  sqlFindUser = "SELECT * FROM user_db WHERE refresh_token = ?";
  db.query(sqlFindUser, [refreshToken], (err, user) => {
    console.log(user);
    return res.json(user);
  });
});

app.listen(4000, () => {
  console.log("running on port 4000");
});
I began experimenting with useContext, as you can see in the client code above. My initial idea was to use useEffect in the App component, where I make a call to the function GetUser(), which makes a request to "/get-user"; that endpoint uses the refreshToken to find the user (I don't know if it is bad practice to use the refreshToken to find the user in the DB; maybe I should store the access token in the DB as well and use that instead?) and then saves things like id, first name, last name and email so that they can be displayed in the navbar or any other component if necessary.
However, I don't know if this is the right thing to do, as I have heard a lot about localStorage, memory or sessionStorage being better for keeping the JWT access token in, while you should keep the refresh token on the server and save it in the MySQL database I have created, only to be used once the user has lost their access token. How should I get access to my access token, and how do I keep track of the logged-in user? Do I really need to make a request to the server each time the user switches or refreshes a page?
Also, I have a question about when I should be calling "/token" on the server to create new access tokens. Should I always try to use the access token for things that require authentication, and if it at some point returns null, make a request to "/token" and then repeat what the user was trying to do?
ANSWER
Answered 2021-Nov-16 at 08:54 Do I really need to make a request to the server each time the user switches or refreshes a page?
That is the safest way. If you want to keep with the current security best practices for SPAs, then using http-only, secure, same-site cookies is the best option. Refreshes won't happen that often on your page, so it shouldn't be a problem.
My initial idea was to use useEffect in the App component where I make a call to the function GetUser() which makes a request to "/get-user" which will use the refreshToken to find the user
What I would do is first verify the access token; if it's valid, take the userId out of the access token (if you don't have it there, you can easily add it, as you're creating the tokens manually) and read the user data from the database. If the access token is invalid, then return an error to the website and let the user use the refresh token to get a new access token. So I wouldn't mix responsibilities here - I wouldn't use the refresh token to get information about the logged-in user.
Also, I have a question about when I should be calling "/token" on the server to create new access tokens. Should I always try to use the access token to do things that require authentication, and if it at some point returns null, make a request to "/token" and then repeat what the user was trying to do?
Yes, that's how it usually is implemented. You make a call with the access token to a protected endpoint. It would be best if the endpoint returned 401 response if the token is expired or invalid. Then your app knows that it should use the refresh token to get a new access token. Once you have a new access token you try to make the call to the protected endpoint again. If you don't manage to get a new access token (e.g. because the refresh token has expired), then you ask the user to log in again.
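The retry flow described in this answer can be sketched as follows; the three callables are hypothetical stand-ins for real HTTP calls, not an actual client API:

```python
def call_with_refresh(do_request, refresh_access_token, on_login_required):
    """Call a protected endpoint, refreshing the access token once on 401.

    do_request() -> (status, body), issued with the current access token.
    refresh_access_token() -> True if a new access token was obtained.
    on_login_required() is invoked when the refresh token is also invalid.
    """
    status, body = do_request()
    if status != 401:
        return status, body  # token was valid, nothing else to do
    if not refresh_access_token():
        on_login_required()  # refresh token expired: ask the user to log in
        return 401, None
    return do_request()  # retry once with the fresh access token
```

The key point mirrored from the answer: the protected endpoint signals expiry with 401, the client refreshes and retries exactly once, and only a failed refresh sends the user back to the login page.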
QUESTION
Flutter Web Page Routing Issue
Asked 2021-Oct-22 at 07:31 I need a web app with the base URL
https://example.com/mojjo-test/index.html#/
I updated my base url as documented in flutter website: https://flutter.dev/docs/development/ui/navigation/url-strategies
<base href="/mojjo-test/">
in index.html.
As per the Flutter documentation, the hash URL strategy is already enabled.
Observations:
1. The Flutter web app loads, showing this URL while loading:
http://example.com/mojjo-test/index.html
2. After a few seconds it redirects to:
http://example.com/mojjo-test/#/
3. Now, if we do a refresh, the browser will try to load http://example.com/mojjo-test/#/ and shows an access-denied error.
I am assuming that instead of redirecting to http://example.com/mojjo-test/#/, it should redirect to http://example.com/mojjo-test/index.html#/.
Please provide suggestions on how to achieve this. I would like to keep the initial path as https://example.com/mojjo-test/index.html#/ (with index.html).
ANSWER
Answered 2021-Oct-22 at 07:31 I'd advise commenting out href in 'web/index.html' (the platform project automatically generated when adding Web support). That's how I did it:
https://github.com/maxim-saplin/flutter_web_spa_sample/blob/main/web/index.html
And here's the example of this app working under virtual directory: https://maxim-saplin.github.io/flutter_web_spa_sample/html/#/
Flutter Web somehow has these silly issues in its scaffolding for the web project (href in index.html, wrong paths for the service worker, etc.) - I discovered this while playing with GitHub Pages.
QUESTION
Ionic + Fastlane | Android "error: package android.support.v4.content does not exist"
Asked 2021-Sep-19 at 15:32 I have an Ionic project I'm working with that is having trouble building for Android. I inherited this project, so I'm not 100% familiar with Fastlane and how it builds the Java files. Additionally, I'm on WSL2, using sdkmanager with the following installed packages:
Installed packages:
Path | Version | Description | Location
------- | ------- | ------- | -------
build-tools;29.0.2 | 29.0.2 | Android SDK Build-Tools 29.0.2 | build-tools/29.0.2
emulator | 30.8.4 | Android Emulator | emulator
patcher;v4 | 1 | SDK Patch Applier v4 | patcher/v4
platform-tools | 31.0.3 | Android SDK Platform-Tools | platform-tools
platforms;android-29 | 5 | Android SDK Platform 29 | platforms/android-29
When I run bundle exec fastlane android build, it does a whole lot of magic but, in the end, results in the following error:
> Task :app:compileReleaseJavaWithJavac FAILED
/home/zonyx/git/gitlab/studio/platforms/android/app/src/main/java/org/apache/cordova/camera/CameraLauncher.java:42: error: package android.support.v4.content does not exist
import android.support.v4.content.FileProvider;
                                 ^
/home/zonyx/git/gitlab/studio/platforms/android/app/src/main/java/org/apache/cordova/camera/FileProvider.java:21: error: package android.support.v4.content does not exist
public class FileProvider extends android.support.v4.content.FileProvider {}
                                                            ^
/home/zonyx/git/gitlab/studio/platforms/android/app/src/main/java/org/apache/cordova/camera/CameraLauncher.java:297: error: cannot find symbol
        this.imageUri = FileProvider.getUriForFile(cordova.getActivity(),
                                    ^
  symbol:   method getUriForFile(Activity,String,File)
  location: class FileProvider
/home/zonyx/git/gitlab/studio/platforms/android/app/src/main/java/org/apache/cordova/camera/CameraLauncher.java:824: error: cannot find symbol
            Uri tmpFile = FileProvider.getUriForFile(cordova.getActivity(),
                                      ^
  symbol:   method getUriForFile(Activity,String,File)
  location: class FileProvider
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
4 errors
I've seen some suggestions that newer SDK versions use androidx.core.content.FileProvider instead of android.support.v4.content.FileProvider. Since the entire Android portion is built/generated automatically, I obviously can't change the java files by hand because they will just get overwritten.
Here's the relevant lane from the Fastfile that may be helpful:
desc 'Compile a new build for Android'
lane :build do |options|
  Dir.chdir('..') do
    before_build(options)
    ionic_build
    sh("ionic cordova build android --device --release --aot false --environment prod --output-hashing all \
        --sourcemaps false --extract-css true --named-chunks false --build-optimizer true --minifyjs=true \
        --minifycss=true --optimizejs=true")
    deeplinks(action: 'uninstall')
  end
end
ANSWER
Answered 2021-Sep-19 at 15:32
cordova-plugin-androidx-adapter will automatically migrate older plugin code from the Android Support Libraries to AndroidX. I believe this is needed when you target Android 10 or higher, which is around when the switch was made. Once all of your plugins support AndroidX natively, you can remove the adapter plugin.
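In practice, applying the fix the answer describes is a single plugin install (the plugin name is real; how the dependency gets recorded - config.xml or package.json - depends on how this project manages plugins, so treat the command as a sketch):

```shell
# Adds the adapter, which rewrites android.support.* references to
# androidx.* in plugin source at build time. The project must also have
# AndroidX enabled (e.g. via the AndroidXEnabled preference or the
# cordova-plugin-androidx plugin on older cordova-android versions).
cordova plugin add cordova-plugin-androidx-adapter
```

After installing it, re-running bundle exec fastlane android build should let the generated CameraLauncher.java and FileProvider.java compile, since the legacy android.support.v4.content imports are rewritten before compilation.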
Community Discussions contain sources that include Stack Exchange Network