Category : wikipedia

I need to extract company information from Wikipedia. I have the company names and use the following code to fetch their summaries:

    import wikipedia

    Def_wiki = []    # summaries that were found
    non_wiki = []    # names with no usable page
    title_wiki = []  # names that returned a summary

    for i in company_uni:
        try:
            Def_wiki.append(wikipedia.summary(i))
            title_wiki.append(i)
        except Exception:
            non_wiki.append(i)

The problem is that there are some false positives: I get information for a title, but it is not for ..
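One possible way to flag such false positives, sketched here under assumptions rather than taken from the original post, is to compare the name that was queried with the title of the page that actually came back, and to reject results whose titles differ too much. The helper names `normalize` and `looks_like_match` and the 0.8 threshold are hypothetical choices for illustration. The returned title would have to come from something like `wikipedia.page(i).title`, and passing `auto_suggest=False` to `wikipedia.summary` may also reduce mismatches, although whether that is the cause here is not stated in the excerpt.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def looks_like_match(query, page_title, threshold=0.8):
    """Return True if page_title plausibly refers to the queried name.

    Hypothetical heuristic: accept when one normalized name is a prefix
    of the other, or when their similarity ratio reaches the threshold.
    """
    a, b = normalize(query), normalize(page_title)
    if not a or not b:
        return False
    if a.startswith(b) or b.startswith(a):
        return True
    return SequenceMatcher(None, a, b).ratio() >= threshold
```

With this in place, a name would only be appended to `title_wiki` when `looks_like_match(i, returned_title)` is true, and would go to `non_wiki` otherwise. The threshold would need tuning against real company names.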


Decoding ordinary URL-escaped characters is a fairly easy task in Python. If you want to decode something like:

    Wikivoyage:%E5%88%A0%E9%99%A4%E8%A1%A8%E5%86%B3

all you need is:

    import urllib.parse

    urllib.parse.unquote('Wikivoyage:%E5%88%A0%E9%99%A4%E8%A1%A8%E5%86%B3')

and you get:

    'Wikivoyage:删除表决'

However, I have identified some strings that this does not work with, namely escapes with four characters after the %. For example: %25D8 This apparently ..
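Since `%25` is itself the escape for a literal `%`, a string such as `%25D8` may simply be `%D8` that was escaped a second time. Assuming that is what is happening (the excerpt is cut off before it says so), a minimal sketch is to call `unquote` repeatedly until the string stops changing. The function name `unquote_fully` and the `max_rounds` cap are hypothetical.

```python
from urllib.parse import unquote


def unquote_fully(text, max_rounds=5):
    """Apply unquote repeatedly until the result stops changing.

    Assumes the input was percent-encoded one or more times. The cap
    guards against unusual inputs that keep changing on every pass.
    """
    for _ in range(max_rounds):
        decoded = unquote(text)
        if decoded == text:
            break
        text = decoded
    return text


# A doubly escaped version of 'Wikivoyage:%E5%88%A0'
print(unquote_fully('Wikivoyage:%25E5%2588%25A0'))
```

Note that this would also decode a string that legitimately contains a literal percent escape after the first pass, so it is only appropriate when the input is known to be over-encoded.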
