DHash | Comparing images with dHash algorithm | Computer Vision library
kandi X-RAY | DHash Summary
Image comparison with the dHash algorithm. Uses dHash to compare images and find similar or duplicate images.
Top functions reviewed by kandi - BETA
- Compute the Hamming distance between two images
- Return the difference between two images
- Compute the Hamming distance between two hashes
- Calculate the hash of an image
DHash Key Features
DHash Examples and Code Snippets
Community Discussions
Trending Discussions on DHash
QUESTION
I saw this syntax in the Python implementation of Bitcoin over here.
https://github.com/samrushing/caesure/blob/master/caesure/bitcoin.py
I have never seen this syntax before, can someone explain it to me or show me somewhere in the documentation where I can understand it?
...ANSWER
Answered 2021-Jun-04 at 08:38 In Python you can assign functions to variables. fout.write is a function, so in this example D is assigned to that function:
D = fout.write
In the line
D('hash: %s\n' % (hexify(dhash(self.render())),))
you are calling the function D, that is, fout.write. It would be the same as:
fout.write('hash: %s\n' % (hexify(dhash(self.render())),))
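As a minimal runnable sketch of the same pattern, with io.StringIO standing in for the file object fout from the question:

```python
import io

# StringIO stands in for the file object `fout` from the question.
fout = io.StringIO()

D = fout.write           # D now refers to the bound method fout.write
D("hash: %s\n" % "abc")  # identical to fout.write("hash: %s\n" % "abc")

print(fout.getvalue())   # -> hash: abc
```

Assigning a function (or bound method) to a variable does not call it; the call only happens when parentheses are applied to the variable.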
QUESTION
I am trying to perform difference hashing with the python ImageHash library and keep getting a numpy error.
The error:
File "/Users/testuser/Desktop/test_folder/test_env/lib/python3.8/site-packages/imagehash.py", line 252, in dhash image = image.convert("L").resize((hash_size + 1, hash_size), Image.ANTIALIAS) AttributeError: 'numpy.ndarray' object has no attribute 'convert'
The code:
...ANSWER
Answered 2021-Apr-27 at 04:42 As mentioned in the imagehash library's documentation, the image argument must be a PIL Image instance, so you can't pass a NumPy array to the dhash function. If you want to do some preprocessing with OpenCV first, convert the NumPy array into a PIL Image before passing it to dhash, like this:
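A minimal sketch of that conversion, using a synthetic NumPy array as a stand-in for cv2.imread output (the imagehash.dhash call is left commented out so the snippet stays self-contained):

```python
import numpy as np
from PIL import Image

# Synthetic stand-in for an OpenCV image: cv2.imread returns a BGR uint8 array.
bgr = np.zeros((64, 64, 3), dtype=np.uint8)
bgr[:, 32:] = (255, 0, 0)  # right half blue, in BGR channel order

# OpenCV stores channels as BGR while PIL expects RGB, so reverse the channel
# axis, then wrap the array in a PIL Image before handing it to imagehash.
rgb = np.ascontiguousarray(bgr[:, :, ::-1])
pil_image = Image.fromarray(rgb)

# hash_value = imagehash.dhash(pil_image)  # now a PIL Image, not an ndarray
print(pil_image.mode, pil_image.size)  # -> RGB (64, 64)
```

The channel reversal matters: skipping it still produces a valid PIL Image, but the red/blue swap changes the pixel values and therefore the resulting hash.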
QUESTION
I am new to Python and am writing an application to identify matching images. I am using the dHash algorithm to compare the hashes of images. I have seen in a tutorial the following lines of code:
...ANSWER
Answered 2020-Nov-22 at 18:36 To break it down, the first line pixel_difference = image_resized[0:, 1:] > image_resized[0:, :-1] compares each pixel with the pixel to its left: the slice image_resized[0:, 1:] drops the first column and image_resized[0:, :-1] drops the last, so the element-wise comparison yields a boolean array that is True wherever a pixel is brighter than its left neighbour.
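The comparison can be demonstrated on a toy array (dHash normally resizes the image to 9x8 first; the tiny array here is only for illustration):

```python
import numpy as np

# Toy 2x4 "resized" grayscale image.
image_resized = np.array([[10, 20, 15, 30],
                          [ 5,  5, 40, 10]])

# Compare each pixel to its left neighbour: True where intensity increases.
pixel_difference = image_resized[:, 1:] > image_resized[:, :-1]
print(pixel_difference)

# The boolean matrix is then flattened into the bits of the hash.
bits = pixel_difference.flatten()
dhash_value = sum(int(b) << i for i, b in enumerate(bits))
print(dhash_value)  # -> 21
```

Note that `image_resized[0:, ...]` in the question is just `image_resized[:, ...]` written with an explicit start index.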
QUESTION
Each item in my collection has a 64-bit number, which represents the dHash of the image. I want to run a query on this field that returns all items whose Hamming distance is above or below some parameter.
In MySQL I would use the BIT_COUNT function. Is there any built-in analog of it in CosmosDB? If not, what should my HAMMING_DISTANCE UDF look like, given that JS doesn't support bitwise operations on 64-bit numbers?
...ANSWER
Answered 2020-Oct-26 at 16:55 To solve this I took code from long.js and ImageHash for use in a CosmosDB UDF. All kudos to their authors.
See the gist here: https://gist.github.com/okolobaxa/55cc08a0d67bc60505bfe3ab4f8bc33c
Usage:
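For reference, the BIT_COUNT-style computation itself is trivial in Python, where integers are arbitrary precision; the JS UDF in the gist needs long.js only because JavaScript numbers cannot represent 64-bit integers exactly:

```python
# XOR the two 64-bit hashes, then count the set bits of the result: each
# differing bit contributes exactly one 1 to the XOR.
def hamming_distance(hash_a, hash_b):
    return bin(hash_a ^ hash_b).count("1")

print(hamming_distance(0b1011, 0b0010))  # -> 2
```

Two maximally different 64-bit hashes give a distance of 64, which is the upper bound for a standard 8x8 dHash.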
QUESTION
I am trying to create some hashes on a CUDA device and print them on the host, but at the printf on the host I am getting a read error at position 0x000000000100002F.
The relevant lines look like this:
...ANSWER
Answered 2020-Oct-24 at 01:22 The memory allocation you are using to hold the hashes is incorrect. To have an array of pointers to the memory for each hash, you need memory allocated both for the array of pointers and for the hashes themselves, so something like:
QUESTION
I am not sure at all what changed. There were a couple of app upgrades that I ran which might or might not have caused the issue. I believe this may be an issue with the path but I am really not sure. That is the reason I am posting here. Thanks for your help in advance.
This is what I receive when I attempt to run any NPM command:
...ANSWER
Answered 2020-Sep-09 at 17:59 I was able to fix this problem. First: I uninstalled Node JS from my machine (I am not sure this was needed, but I did it).
Second: I copied all of the directories from the (Node JS install)\node_modules\npm\node_modules directory to the c:\Users\(user name)\AppData\Roaming\npm\node_modules\npm\node_modules directory.
Now it appears that all the npm stuff works. When I run the two commands to get the current versions it returns the correct information.
D:\>node -v
v12.18.3
D:\>npm -v
6.14.7
I am not sure how things got confused, but it appears that at some point over the last couple of years the AppData location had stopped being updated. When I ran an update, the path was set back to AppData, and that data was very old. Copying the node_modules from the new install over to the AppData location appears to have brought everything up to date.
I hope this helps someone else in the future.
QUESTION
I have 2 image folders containing 10k and 35k images. Each image is approximately (2k, 2k) in size.
I want to remove the images which are exact duplicates.
The variation in different images are just a change in some pixels.
I have tried dHashing, pHashing, and aHashing, but since they are lossy image-hashing techniques, they give the same hash for non-duplicate images too.
I also tried writing code in Python that simply subtracts images; only pairs whose resulting array is zero everywhere are flagged as duplicates of each other.
But a single comparison takes 0.29 seconds, and for all 350 million combinations that is really huge.
Is there a way to do this faster without also flagging non-duplicate images?
I am open to doing it in any language(C,C++), any approach(distributed computing, multithreading) which can solve my problem accurately.
Apologies if I added some of the irrelevant approaches as I am not from computer science background.
Below is the code I used for python approach -
...ANSWER
Answered 2020-Jul-31 at 16:45 You should look for answers on how to delete duplicate files (not only images). Then you can use, for example, fdupes, or find an alternative tool: https://alternativeto.net/software/fdupes/
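If "exact duplicates" means byte-identical files (which is what fdupes detects), the same idea can be sketched in a few lines of Python: hash every file once, which is linear in the number of files, instead of comparing all 350 million pairs. Note that pixel-identical images saved with different encodings would need the decoded pixel buffer hashed instead of the raw bytes.

```python
import hashlib
import os
from collections import defaultdict

def find_exact_duplicates(folder):
    """Group files in `folder` that are byte-for-byte identical."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        # A cryptographic hash of the raw bytes is not lossy, so
        # non-duplicate files are never flagged as duplicates.
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        groups[digest].append(path)
    # Only groups with more than one file are duplicate sets.
    return [paths for paths in groups.values() if len(paths) > 1]
```

For 45k images this is 45k hash computations rather than 350 million pairwise subtractions.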
QUESTION
I've cobbled together/wrote some code (Thanks stackoverflow users!) that checks for similarities in images using imagehash, but now I am having issues checking thousands of images (roughly 16,000). Is there anything that I could improve with the code (or a different route entirely) that can more accurately find matches and/or decrease time required? Thanks!
I first changed my list that is created to an itertools combination, so it only compares unique combinations of images.
...ANSWER
Answered 2019-Jul-18 at 18:35 Try this: instead of hashing each image at comparison time (127,992,000 hash computations), hash ahead of time and compare the stored hashes, since those are not going to change (16,000 hash computations).
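The precompute-then-compare pattern from the answer can be sketched as follows; hash_func here is a hypothetical stand-in for something like `lambda p: imagehash.dhash(Image.open(p))`, i.e. anything whose results support subtraction as a distance:

```python
from itertools import combinations

def find_similar(paths, hash_func, threshold=5):
    # Hash each image exactly once (16,000 hashes for 16,000 images)...
    hashes = {path: hash_func(path) for path in paths}
    # ...then compare only the cached hashes over unique pairs.
    return [(a, b)
            for a, b in combinations(paths, 2)
            if abs(hashes[a] - hashes[b]) <= threshold]
```

The pairwise loop is still O(n^2), but each iteration is now a cheap integer comparison instead of two image decodes and hashes.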
QUESTION
Assume that one picture from one host has more than one copy at different resolutions.
At the metasearcher stage, I want to check whether two pictures have the same name, excluding trivial names (such as image.jpg, photo.jpg, ...). In that case, I want to include only the picture with the higher resolution.
Example: search for "city"
https://znews-photo.zadn.vn/w480/Uploaded/lerl/2017_10_07/DJI_005701_zing.jpeg
https://znews-photo.zadn.vn/Uploaded/lerl/2017_10_07/DJI_005701_zing.jpeg
The first one should not be returned.
This is a job assignment from a web search team, therefore I care a lot about performance.
My current approach:
*) To avoid trivial names: iterate through the test queries for image search, tokenize the URLs by "/", count the number of appearances of each token across different URLs, and manually pick the most frequent tokens similar to "photo", "picture", "background", etc. In the end, I will have a set of trivial names.
*) For pictures with the same name: for each picture I get its dHash and its resolution; for every pair of pictures with a dHash difference less than a certain threshold, I discard the picture with the smaller resolution.
Edit: After consulting with my manager, I realized that I had misunderstood the requirements. I should work purely on the URLs, without accessing the actual images (which would be too expensive). With the example above, I should be able to discard the first image based on the difference between the two URLs. Also, as a result, the expected accuracy isn't high; anything > 85% would be decent.
I greatly appreciate any ideas/insights on improving my current approach.
...ANSWER
Answered 2019-May-24 at 06:43 You won't be able to implement a robust solution to this problem without accessing the image contents. However, if you still want to work directly with the URLs, here are some observations:
- Original images often contain "orig" or "original" keywords in their URLs, while thumbnails contain "thumb" or "thumbnail" keywords
- URLs for thumbnails often contain width and height numbers (e.g. 640, 768, 1024)
- Generally, a longer URL (from the same host) means a thumbnail, because when a thumbnail is generated, width/height numbers are usually appended to its name
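These observations can be turned into a rough scoring heuristic. The keyword lists and width patterns below are illustrative guesses to be tuned on real data, not a definitive rule:

```python
import re

# "thumb"/"thumbnail" keywords, a letter w glued to a 3-4 digit width
# (e.g. "w480"), or a bare common thumbnail width.
THUMB_HINTS = re.compile(r"(thumb|thumbnail|w\d{3,4}|\b(640|768|1024)\b)", re.I)
ORIG_HINTS = re.compile(r"(orig|original)", re.I)

def looks_like_thumbnail(url):
    # An explicit "original" marker outweighs any thumbnail hints.
    if ORIG_HINTS.search(url):
        return False
    return bool(THUMB_HINTS.search(url))

print(looks_like_thumbnail(
    "https://znews-photo.zadn.vn/w480/Uploaded/lerl/2017_10_07/DJI_005701_zing.jpeg"))
```

On the example pair from the question, the /w480/ path segment flags the first URL as the thumbnail while the second URL has no hints.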
An entirely different approach is to retrieve image byte sizes by issuing HTTP HEAD requests. In 99% of cases the server will return a Content-Length header. HTTP HEAD doesn't download the contents, only the HTTP headers, so it is not as expensive as downloading the entire images.
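A minimal sketch of the HEAD-request approach with the standard library, assuming the server actually reports Content-Length (which, as noted, it does in the vast majority of cases):

```python
from urllib.request import Request, urlopen

# Ask the server only for the headers and read Content-Length, so the
# image bytes themselves are never downloaded.
def content_length(url, timeout=10.0):
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        value = response.headers.get("Content-Length")
        return int(value) if value is not None else None
```

Among same-named copies, the URL with the larger reported byte size is then the candidate to keep.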
QUESTION
I'd like to compare ORB, SIFT, BRISK, AKAZE, etc. to find which works best for my specific image set. I'm interested in the final alignment of images.
Is there a standard way to do it?
I'm considering this solution: take each algorithm, extract the features, compute the homography and transform the image.
Now I need to check which transformed image is closer to the target template.
Maybe I can repeat the process with the target template and the transformed image and look for the homography matrix closest to the identity, but I'm not sure how to compute this closeness exactly, nor which algorithm I should use for this check (presumably a fixed one).
Or I could do some pixel-level comparison between the images using a perceptual difference hash (dHash). But I suspect the resulting Hamming distance may not be very informative for images that are nearly identical.
I could blur them and do a simple subtraction, but that sounds quite weak.
Thanks for any suggestions.
EDIT: I have thousands of images to test. These are real-world pictures of documents of different kinds, some with a lot of graphics, others mostly geometrical. I have about 30 different templates. I suspect different templates work best with different algorithms (I know the template in advance, so I could pick the best one).
Right now I use cv2.matchTemplate to find some reference patches in the transformed images and I compare their locations to the reference ones. It works but I'd like to improve over this.
...ANSWER
Answered 2019-Apr-14 at 22:29From your question, it seems like the task is not to compare the feature extractors themselves, but rather to find which type of feature extractor leads to the best alignment.
For this, you need two things:
- a way to perform the alignment using the features from different extractors
- a way to check the accuracy of the alignment
The algorithm you suggested is a good approach for doing the alignment. To check its accuracy, you need to know what a good alignment is.
You may start with an alignment you already know. The easiest way to know the alignment between two images is to perform the inverse operation yourself: starting with one image, rotate it by some amount, translate/crop/scale it, or combine all these operations. Knowing how you obtained the image, you know the ideal alignment (the one that undoes your operations).
Then, having the ideal alignment and the alignment generated by your algorithm, you can use a metric to evaluate its accuracy, depending on your definition of "good alignment".
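One possible way to make "closeness" concrete, assuming 3x3 homography matrices: rather than comparing matrix entries directly, measure how differently the ideal and estimated homographies move a set of reference points (corner reprojection error), which is scale-aware, unlike a raw matrix norm. A sketch:

```python
import numpy as np

def reprojection_error(h_ideal, h_est, points):
    """Mean distance between where each homography maps the given points."""
    def apply(h, pts):
        # Lift 2D points to homogeneous coordinates, apply h, project back.
        homog = np.hstack([pts, np.ones((len(pts), 1))])
        mapped = homog @ h.T
        return mapped[:, :2] / mapped[:, 2:3]
    diff = apply(h_ideal, points) - apply(h_est, points)
    return float(np.mean(np.linalg.norm(diff, axis=1)))

# Identical transforms -> zero error; image corners are a natural point set.
identity = np.eye(3)
corners = np.array([[0, 0], [640, 0], [0, 480], [640, 480]], dtype=float)
print(reprojection_error(identity, identity, corners))  # -> 0.0
```

Evaluated at the image corners, this is the average corner displacement in pixels, which is easy to threshold per feature extractor.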
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install DHash
You can use DHash like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.