Category : similarity

In Python, I have a dictionary of 1,000 products like this: products={"p1":"apples","p2":"oranges",…,"p1000":"bananas"}. I now have 20,000 old shopping orders (a dictionary) that look like this: orders={ "order_1":{"p1":100,"p7":30,…,"p560":126}, "order_2":{"p6":1300,"p7":51,…,"p423":3000}, …, "order_20000":{"p1":700,"p4":5,…,"p942":178} }. Each order has a different number of unique products (100-200 products). For each order, I have the time it took to gather all products: time={"order_1":15days,"order_2":34days,…,"order_20000":7days}. When ..

As part of a past interview task, I'm working with a sports streaming dataset that looks like this: pd.DataFrame({'away_contestant_country': {0: 'Japan', 1: 'Canada'}, 'competition_name': {0: 'NPB', 1: 'FIBA AmeriCup (W)'}, 'customer_country': {0: 'Japan', 1: 'Canada'}, 'device_category': {0: 'Web', 1: 'Unknown'}, 'home_contestant_country': {0: 'Japan', 1: 'Brazil'}, 'live_or_on_demand': {0: 'Live', 1: 'Live'}, 'match_date': {0: '2021-06-11T08:45:00.000Z', 1: '2021-06-13T19:10:00.000Z'}, ..

I have the following function, DataFrame, and vector. Why am I getting an error? import pandas as pd import numpy as np def vanilla_vec_similarity(x, y): x.drop('request_id', axis=1, inplace=True).values.flatten().tolist() y.drop('request_id', axis=1, inplace=True).values.flatten().tolist() res = (np.array(x) == np.array(y)).astype(int) return res.mean() test_df = pd.DataFrame({'request_id': [55, 42, 13], 'a': ['x','y','z'], 'b':[1,2,3], 'c': [1.0, -1.8, 19.113]}) test_vec = pd.DataFrame([[123,'x',1.1, -1.8]], ..
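
Judging only from the visible part of this question, one likely cause of the error is that drop(..., inplace=True) modifies the DataFrame in place and returns None, so the chained .values call raises an AttributeError. The chained results are also never assigned to anything. Below is a minimal sketch of one possible fix, not necessarily what the asker intended. It assumes both inputs share the same columns and that a single row is compared against the vector. The sample data and the column names given to the vector are hypothetical, because the original snippet is truncated.

```python
import numpy as np
import pandas as pd


def vanilla_vec_similarity(x: pd.DataFrame, y: pd.DataFrame) -> float:
    """Fraction of positions where x and y hold equal values, ignoring request_id."""
    # Without inplace=True, drop() returns a new DataFrame, so chaining works
    # and the caller's DataFrames are left unmodified.
    x_vals = x.drop(columns="request_id").to_numpy().ravel()
    y_vals = y.drop(columns="request_id").to_numpy().ravel()
    # Element-wise comparison gives a boolean array; its mean is the match rate.
    return float((x_vals == y_vals).mean())


# Hypothetical single-row inputs (assumed column names, not from the question).
row = pd.DataFrame({"request_id": [42], "a": ["y"], "b": [2], "c": [-1.8]})
vec = pd.DataFrame([[123, "y", 1.1, -1.8]], columns=["request_id", "a", "b", "c"])

score = vanilla_vec_similarity(row, vec)
print(score)
```

In this sketch two of the three compared values match ('y' and -1.8), so the score is two thirds. Comparing a multi-row DataFrame against a one-row vector would need an explicit row selection or broadcasting step first.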
