Category : hash

Jaccard similarity is used to estimate the similarity between two sets. However, if we want to find pairs of most similar documents, it would take us O(n^2). If using minhashing, it can be done a lot faster (http://infolab.stanford.edu/~ullman/mmds/ch3n.pdf, https://www.fatalerrors.org/a/text-similarity-calculation-minhash-and-lsh-algorithm.html). I am wondering how to implement minhashing to estimate the similarity between two sets, say s1={1, ..

Read more

I want to know how to compare two hash values not Hamming distance. Is there a way? The final goal is to determine key of python dictionary that similar images can have in common. for example. import imagehash # img1, img2, img3 are same images img1_hash = imagehash.average_hash(Image.open(‘data/image1.jpg’)) img2_hash = imagehash.average_hash(Image.open(‘data/image2.jpg’)) img3_hash = imagehash.average_hash(Image.open(‘data/image3.jpg’)) img4_hash ..

Read more

I change images to hash values and try to classify images with similar hash values into the same group. so for example. import imagehash # img1, img2, img3 are same images img1_hash = imagehash.average_hash(Image.open(‘data/image1.jpg’)) img2_hash = imagehash.average_hash(Image.open(‘data/image2.jpg’)) img3_hash = imagehash.average_hash(Image.open(‘data/image3.jpg’)) img4_hash = imagehash.average_hash(Image.open(‘data/image4.jpg’)) print(img1_has, img2_hash, img3_hash, img4_hash) >>> 81c38181bf8781ff, 81838181bf8781ff, 81838181bf8781ff, ff0000ff3f00e7ff hash_lst = [[‘img1’, ..

Read more

md5 to receive CTF for classwork import hashlib flag = 0 counter = 0 pass_hash = input("Enter md5 hash: ") wordlist = input("filename: ") try: pass_file = open(wordlist, "r") except: print("No file found") quit() for word in pass_file: enc_wrd = word.encode(‘utf-8’) digest = hashlib.md5(enc_wrd).hexdigest() if digest == pass_hash: print("Password has been found!") print("Password:" + word) ..

Read more

I have an application running in Python 3.9.4 where I store class objects in sets (along with many other kinds of objects). I’m getting non-deterministic behavior even when PYTHONHASHSEED=0 because class objects get non-deterministic hash codes. I assume that’s because class objects’ hash codes come from their addresses in memory. For example, here are two ..

Read more