Category : information-retrieval

I want to rank documents for given queries. The ranking has to be implemented in two steps: 1) First-pass retrieval: use Anserini SimpleSearcher to rank all documents for a given query. 2) Second-pass retrieval: re-rank the top 100 documents from the first pass using a retrieval model. from pyserini.search import SimpleSearcher results = [] searcher = ..
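The two-step scheme described in this question can be sketched roughly as follows. This is only a minimal illustration, not the asker's code: the helper name rerank_top_k, the (docid, score) tuple layout, and the toy scores are all assumptions made for the example, and the second-pass scorer is a stand-in for whatever retrieval model is actually used.

```python
from typing import Callable, List, Tuple

Hit = Tuple[str, float]  # (docid, score); an assumed layout for this sketch


def rerank_top_k(
    first_pass: List[Hit],
    second_pass_score: Callable[[str], float],
    k: int = 100,
) -> List[Hit]:
    """Keep the k best first-pass hits and re-order them by a second scorer."""
    # Step 1: take the top-k candidates by first-pass score (highest first).
    candidates = sorted(first_pass, key=lambda hit: hit[1], reverse=True)[:k]
    # Step 2: re-score only those candidates with the second-pass model.
    rescored = [(docid, second_pass_score(docid)) for docid, _ in candidates]
    # Final ranking is by second-pass score (highest first).
    return sorted(rescored, key=lambda hit: hit[1], reverse=True)


# Toy data standing in for real first-pass results (hypothetical values).
# With Pyserini, the list could presumably be built from search hits, e.g.
# [(hit.docid, hit.score) for hit in searcher.search(query, k=100)].
first_pass = [("d1", 3.0), ("d2", 2.5), ("d3", 1.0)]
toy_second_pass = {"d1": 0.2, "d2": 0.9, "d3": 0.99}

reranked = rerank_top_k(first_pass, lambda docid: toy_second_pass[docid], k=2)
print(reranked)
```

Note that d3 never reaches the second pass here because it falls outside the top-k cut, which is the usual trade-off of a two-pass design: the re-ranker can only reorder what the first pass retrieved.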

I have been exploring the problem of tagging or clustering websites by business domain using only their URL. For example: amazon.com => e-commerce, bbc.co.uk => news, Adidas.com => sports apparel, or, let's say, Amazon.com/xbox => gadget. I have read through some research papers which try to cluster using different ..

I have a dataset of millions of arrays like the following: sentences=[ [ 'query_foo bar', 'split_query_foo', 'split_query_bar', 'sku_qwre', 'brand_A B C', 'split_brand_A', 'split_brand_B', 'split_brand_C', 'color_black', 'category_C1', 'product_group_clothing', 'silhouette_t_shirt_top', ], […] ] where each array contains a query, a SKU that was acquired by the user who made the query, and a few attributes of that SKU. My idea ..