DHash | Comparing images with dHash algorithm | Computer Vision library
kandi X-RAY | DHash Summary
Image comparison with the dHash algorithm. Uses dHash to compare images and find similar or duplicate images.
Top functions reviewed by kandi - BETA
- Compute the Hamming distance between two images
- Return the difference between two images
- Compute the Hamming distance between two hashes
- Calculate the hash of an image
DHash Key Features
DHash Examples and Code Snippets
Community Discussions
Trending Discussions on DHash
QUESTION
I saw this syntax in the Python implementation of Bitcoin over here.
https://github.com/samrushing/caesure/blob/master/caesure/bitcoin.py
I have never seen this syntax before, can someone explain it to me or show me somewhere in the documentation where I can understand it?
...ANSWER
Answered 2021-Jun-04 at 08:38 In Python you can assign functions to variables. fout.write is a function, so in this example D is assigned to that function:
D = fout.write
In the line
D('hash: %s\n' % (hexify(dhash(self.render())),))
you are calling the function D, that is, fout.write. It would be the same as:
fout.write('hash: %s\n' % (hexify(dhash(self.render())),))
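As a minimal runnable sketch of the same pattern, with io.StringIO standing in for the file object fout from the question:

```python
import io

# StringIO stands in for the file object `fout` from the question.
fout = io.StringIO()

D = fout.write           # D now refers to the bound method fout.write
D("hash: %s\n" % "abc")  # identical to fout.write("hash: %s\n" % "abc")

print(fout.getvalue())   # -> hash: abc
```

Assigning a function (or bound method) to a variable does not call it; the call only happens when parentheses are applied to the variable.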
QUESTION
I am trying to perform difference hashing with the python ImageHash library and keep getting a numpy error.
The error:
File "/Users/testuser/Desktop/test_folder/test_env/lib/python3.8/site-packages/imagehash.py", line 252, in dhash image = image.convert("L").resize((hash_size + 1, hash_size), Image.ANTIALIAS) AttributeError: 'numpy.ndarray' object has no attribute 'convert'
The code:
...ANSWER
Answered 2021-Apr-27 at 04:42 As mentioned in the imagehash library's documentation, the image argument must be a PIL Image instance, so you can't pass a NumPy array to the dhash function. If you want to do some preprocessing with OpenCV first, convert the NumPy array into a PIL Image before passing it to dhash, like this:
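A minimal sketch of that conversion, using a synthetic NumPy array as a stand-in for cv2.imread output (the imagehash.dhash call is left commented out so the snippet stays self-contained):

```python
import numpy as np
from PIL import Image

# Synthetic stand-in for an OpenCV image: cv2.imread returns a BGR uint8 array.
bgr = np.zeros((64, 64, 3), dtype=np.uint8)
bgr[:, 32:] = (255, 0, 0)  # right half blue, in BGR channel order

# OpenCV stores channels as BGR while PIL expects RGB, so reverse the channel
# axis, then wrap the array in a PIL Image before handing it to imagehash.
rgb = np.ascontiguousarray(bgr[:, :, ::-1])
pil_image = Image.fromarray(rgb)

# hash_value = imagehash.dhash(pil_image)  # now a PIL Image, not an ndarray
print(pil_image.mode, pil_image.size)  # -> RGB (64, 64)
```

The channel reversal matters: skipping it still produces a valid PIL Image, but the red/blue swap changes the pixel values and therefore the resulting hash.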
QUESTION
I am new to Python and am writing an application to identify matching images. I am using the dHash algorithm to compare the hashes of images. I have seen in a tutorial the following lines of code:
...ANSWER
Answered 2020-Nov-22 at 18:36 To break it down, the first line pixel_difference = image_resized[0:, 1:] > image_resized[0:, :-1] compares each pixel with the pixel to its left: the slice image_resized[0:, 1:] drops the first column and image_resized[0:, :-1] drops the last, so the element-wise comparison yields a boolean array that is True wherever a pixel is brighter than its left neighbour.
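The comparison can be demonstrated on a toy array (dHash normally resizes the image to 9x8 first; the tiny array here is only for illustration):

```python
import numpy as np

# Toy 2x4 "resized" grayscale image.
image_resized = np.array([[10, 20, 15, 30],
                          [ 5,  5, 40, 10]])

# Compare each pixel to its left neighbour: True where intensity increases.
pixel_difference = image_resized[:, 1:] > image_resized[:, :-1]
print(pixel_difference)

# The boolean matrix is then flattened into the bits of the hash.
bits = pixel_difference.flatten()
dhash_value = sum(int(b) << i for i, b in enumerate(bits))
print(dhash_value)  # -> 21
```

Note that `image_resized[0:, ...]` in the question is just `image_resized[:, ...]` written with an explicit start index.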
QUESTION
Each item in my collection has a 64-bit number, which represents the dHash of the image. I want to run a query on this field that returns all items whose Hamming distance is above or below some parameter.
In MySQL I would use the BIT_COUNT function. Is there any built-in analog of it in CosmosDB? If not, what should my HAMMING_DISTANCE UDF look like, given that JS doesn't support bitwise operations on 64-bit numbers?
...ANSWER
Answered 2020-Oct-26 at 16:55 To solve this I took code from long.js and ImageHash for use in a CosmosDB UDF. All kudos to their authors.
See the gist here: https://gist.github.com/okolobaxa/55cc08a0d67bc60505bfe3ab4f8bc33c
Usage:
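For reference, the BIT_COUNT-style computation itself is trivial in Python, where integers are arbitrary precision; the JS UDF in the gist needs long.js only because JavaScript numbers cannot represent 64-bit integers exactly:

```python
# XOR the two 64-bit hashes, then count the set bits of the result: each
# differing bit contributes exactly one 1 to the XOR.
def hamming_distance(hash_a, hash_b):
    return bin(hash_a ^ hash_b).count("1")

print(hamming_distance(0b1011, 0b0010))  # -> 2
```

Two maximally different 64-bit hashes give a distance of 64, which is the upper bound for a standard 8x8 dHash.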
QUESTION
I am trying to create some hashes on a CUDA device and print them on the host, but at the printf on the host I am getting a read error at position 0x000000000100002F.
The relevant lines look like this:
...ANSWER
Answered 2020-Oct-24 at 01:22 The memory allocation you are using to hold the hashes is incorrect. To have an array of pointers to the memory for each hash, you need memory allocated both for the array of pointers and for the hashes themselves, so something like:
QUESTION
I am not sure at all what changed. There were a couple of app upgrades that I ran which might or might not have caused the issue. I believe this may be an issue with the path but I am really not sure. That is the reason I am posting here. Thanks for your help in advance.
This is what I receive when I attempt to run any NPM command:
...ANSWER
Answered 2020-Sep-09 at 17:59 I was able to fix this problem. First: I uninstalled Node JS from my machine (I am not sure this was needed, but I did it).
Second: I copied all of the directories from the (Node JS install)\node_modules\npm\node_modules directory to the c:\Users\(user name)\AppData\Roaming\npm\node_modules\npm\node_modules directory.
Now it appears that all the npm stuff works. When I run the two commands to get the current versions it returns the correct information.
D:\>node -v
v12.18.3
D:\>npm -v
6.14.7
I am not sure how things got confused, but it appears that at some point over the last couple of years the AppData location had stopped being updated. When I ran an update, the path was set back to AppData, and that data was very old. Copying the node_modules from the new install over to the AppData location appears to have brought everything up to date.
I hope this helps someone else in the future.
QUESTION
I have 2 image folders containing 10k and 35k images. Each image is approximately (2k, 2k) in size.
I want to remove the images which are exact duplicates.
The variation in different images are just a change in some pixels.
I have tried dHashing, pHashing, and aHashing, but since they are lossy image-hashing techniques, they give the same hash for non-duplicate images too.
I also tried writing code in Python that simply subtracts images; only pairs whose resulting array is zero everywhere are flagged as duplicates of each other.
But a single comparison takes 0.29 seconds, and for all 350 million combinations that is really huge.
Is there a way to do this faster without also flagging non-duplicate images?
I am open to doing it in any language(C,C++), any approach(distributed computing, multithreading) which can solve my problem accurately.
Apologies if I added some of the irrelevant approaches as I am not from computer science background.
Below is the code I used for python approach -
...ANSWER
Answered 2020-Jul-31 at 16:45 You should look for answers on how to delete duplicate files (not only images). Then you can use, for example, fdupes, or find an alternative tool: https://alternativeto.net/software/fdupes/
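If "exact duplicates" means byte-identical files (which is what fdupes detects), the same idea can be sketched in a few lines of Python: hash every file once, which is linear in the number of files, instead of comparing all 350 million pairs. Note that pixel-identical images saved with different encodings would need the decoded pixel buffer hashed instead of the raw bytes.

```python
import hashlib
import os
from collections import defaultdict

def find_exact_duplicates(folder):
    """Group files in `folder` that are byte-for-byte identical."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        # A cryptographic hash of the raw bytes is not lossy, so
        # non-duplicate files are never flagged as duplicates.
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        groups[digest].append(path)
    # Only groups with more than one file are duplicate sets.
    return [paths for paths in groups.values() if len(paths) > 1]
```

For 45k images this is 45k hash computations rather than 350 million pairwise subtractions.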
QUESTION
I've cobbled together/wrote some code (Thanks stackoverflow users!) that checks for similarities in images using imagehash, but now I am having issues checking thousands of images (roughly 16,000). Is there anything that I could improve with the code (or a different route entirely) that can more accurately find matches and/or decrease time required? Thanks!
I first changed my list that is created to an itertools combination, so it only compares unique combinations of images.
...ANSWER
Answered 2019-Jul-18 at 18:35 Try this: instead of hashing each image at comparison time (127,992,000 hash computations), hash ahead of time and compare the stored hashes, since those are not going to change (16,000 hash computations).
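The precompute-then-compare pattern from the answer can be sketched as follows; hash_func here is a hypothetical stand-in for something like `lambda p: imagehash.dhash(Image.open(p))`, i.e. anything whose results support subtraction as a distance:

```python
from itertools import combinations

def find_similar(paths, hash_func, threshold=5):
    # Hash each image exactly once (16,000 hashes for 16,000 images)...
    hashes = {path: hash_func(path) for path in paths}
    # ...then compare only the cached hashes over unique pairs.
    return [(a, b)
            for a, b in combinations(paths, 2)
            if abs(hashes[a] - hashes[b]) <= threshold]
```

The pairwise loop is still O(n^2), but each iteration is now a cheap integer comparison instead of two image decodes and hashes.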
QUESTION
Assume that one picture from one host has more than one copy at different resolutions.
At the metasearcher stage, I want to check whether two pictures have the same name, excluding trivial names (such as image.jpg, photo.jpg, ...). In that case, I want to include only the picture with the higher resolution.
Example: search for "city"
https://znews-photo.zadn.vn/w480/Uploaded/lerl/2017_10_07/DJI_005701_zing.jpeg
https://znews-photo.zadn.vn/Uploaded/lerl/2017_10_07/DJI_005701_zing.jpeg
The first one should not be returned.
This is a job assignment from a web search team, therefore I care a lot about performance.
My current approach:
*) To avoid trivial names: iterate through the test queries for image search, tokenize the URLs by "/", count the number of appearances of each token across different URLs, and manually pick the most frequent tokens similar to "photo", "picture", "background", etc. In the end, I will have a set of trivial names.
*) For pictures with the same name: for each picture I get its dHash and its resolution; for every pair of pictures with a dHash difference less than a certain threshold, I discard the picture with the smaller resolution.
Edit: After consulting with my manager, I realized that I had misunderstood the requirements. I should work purely on the URLs, without accessing the actual images (which would be too expensive). With the example above, I should be able to discard the first image based on the difference between the two URLs. Also, as a result, the expected accuracy isn't high; anything > 85% would be decent.
I greatly appreciate any ideas/insights on improving my current approach.
...ANSWER
Answered 2019-May-24 at 06:43 You won't be able to implement a robust solution to this problem without accessing the image contents. However, if you still want to work directly with the URLs, here are some observations:
- Original images often contain "orig" or "original" keywords in their URLs, while thumbnails contain "thumb" or "thumbnail" keywords
- URLs for thumbnails often contain width and height numbers (e.g. 640, 768, 1024)
- Generally, a longer URL (from the same host) means a thumbnail, because when a thumbnail is generated, width/height numbers are usually appended to its name
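These observations can be turned into a rough scoring heuristic. The keyword lists and width patterns below are illustrative guesses to be tuned on real data, not a definitive rule:

```python
import re

# "thumb"/"thumbnail" keywords, a letter w glued to a 3-4 digit width
# (e.g. "w480"), or a bare common thumbnail width.
THUMB_HINTS = re.compile(r"(thumb|thumbnail|w\d{3,4}|\b(640|768|1024)\b)", re.I)
ORIG_HINTS = re.compile(r"(orig|original)", re.I)

def looks_like_thumbnail(url):
    # An explicit "original" marker outweighs any thumbnail hints.
    if ORIG_HINTS.search(url):
        return False
    return bool(THUMB_HINTS.search(url))

print(looks_like_thumbnail(
    "https://znews-photo.zadn.vn/w480/Uploaded/lerl/2017_10_07/DJI_005701_zing.jpeg"))
```

On the example pair from the question, the /w480/ path segment flags the first URL as the thumbnail while the second URL has no hints.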
An entirely different approach is to retrieve image byte sizes by issuing HTTP HEAD requests. In 99% of cases the server will return a Content-Length header. HTTP HEAD doesn't download the contents, only the HTTP headers, so it is not as expensive as downloading the entire images.
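A minimal sketch of the HEAD-request approach with the standard library, assuming the server actually reports Content-Length (which, as noted, it does in the vast majority of cases):

```python
from urllib.request import Request, urlopen

# Ask the server only for the headers and read Content-Length, so the
# image bytes themselves are never downloaded.
def content_length(url, timeout=10.0):
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        value = response.headers.get("Content-Length")
        return int(value) if value is not None else None
```

Among same-named copies, the URL with the larger reported byte size is then the candidate to keep.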
QUESTION
I'd like to compare ORB, SIFT, BRISK, AKAZE, etc. to find which works best for my specific image set. I'm interested in the final alignment of images.
Is there a standard way to do it?
I'm considering this solution: take each algorithm, extract the features, compute the homography and transform the image.
Now I need to check which transformed image is closer to the target template.
Maybe I can repeat the process with the target template and the transformed image and look for the homography matrix closest to the identity, but I'm not sure how to compute this closeness exactly, nor which algorithm I should use for this check (presumably a fixed one).
Or I could do some pixel-level comparison between the images using a perceptual difference hash (dHash). But I suspect the resulting Hamming distance may not be very informative for images that are nearly identical.
I could blur them and do a simple subtraction, but that sounds quite weak.
Thanks for any suggestions.
EDIT: I have thousands of images to test. These are real-world pictures of documents of different kinds, some with a lot of graphics, others mostly geometrical. I have about 30 different templates. I suspect different templates work best with different algorithms (I know the template in advance, so I could pick the best one).
Right now I use cv2.matchTemplate to find some reference patches in the transformed images and I compare their locations to the reference ones. It works but I'd like to improve over this.
...ANSWER
Answered 2019-Apr-14 at 22:29From your question, it seems like the task is not to compare the feature extractors themselves, but rather to find which type of feature extractor leads to the best alignment.
For this, you need two things:
- a way to perform the alignment using the features from different extractors
- a way to check the accuracy of the alignment
The algorithm you suggested is a good approach for doing the alignment. To check its accuracy, you need to know what a good alignment is.
You may start with an alignment you already know. The easiest way to know the alignment between two images is to perform the inverse operation yourself: starting with one image, rotate it by some amount, translate/crop/scale it, or combine all these operations. Knowing how you obtained the image, you know the ideal alignment (the one that undoes your operations).
Then, having the ideal alignment and the alignment generated by your algorithm, you can use a metric to evaluate its accuracy, depending on your definition of "good alignment".
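One possible way to make "closeness" concrete, assuming 3x3 homography matrices: rather than comparing matrix entries directly, measure how differently the ideal and estimated homographies move a set of reference points (corner reprojection error), which is scale-aware, unlike a raw matrix norm. A sketch:

```python
import numpy as np

def reprojection_error(h_ideal, h_est, points):
    """Mean distance between where each homography maps the given points."""
    def apply(h, pts):
        # Lift 2D points to homogeneous coordinates, apply h, project back.
        homog = np.hstack([pts, np.ones((len(pts), 1))])
        mapped = homog @ h.T
        return mapped[:, :2] / mapped[:, 2:3]
    diff = apply(h_ideal, points) - apply(h_est, points)
    return float(np.mean(np.linalg.norm(diff, axis=1)))

# Identical transforms -> zero error; image corners are a natural point set.
identity = np.eye(3)
corners = np.array([[0, 0], [640, 0], [0, 480], [640, 480]], dtype=float)
print(reprojection_error(identity, identity, corners))  # -> 0.0
```

Evaluated at the image corners, this is the average corner displacement in pixels, which is easy to threshold per feature extractor.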
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install DHash
You can use DHash like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.