Commit b768fe9411ceed100a37d70ed710138e98bf00ed

Authored by Romain Deveaud
1 parent ca96fb31f8
Exists in master

bugfix in document, more stopwords

Showing 2 changed files with 3 additions and 2 deletions

lib/mirimiri/document.rb
... ... @@ -142,7 +142,7 @@
142 142 end
143 143  
144 144 def self.get_url(name)
145   - raise ArgumentError, "Bad encoding", name unless name.isutf9
  145 + raise ArgumentError, "Bad encoding", name unless name.isutf8
146 146  
147 147 atts = REXML::Document.new(Net::HTTP.get( URI.parse "http://en.wikipedia.org/w/api.php?action=query&titles=#{URI.escape name}&inprop=url&prop=info&format=xml" ).unaccent.toutf8).elements['api/query/pages/page'].attributes
148 148  
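A note on the guard fixed above: `isutf9` is presumably not defined on `String`, so the old line would fail with `NoMethodError` before the intended `ArgumentError` could be raised. `isutf8` appears to come from Ruby's `kconv` extension. The sketch below shows an equivalent check using only core `String` methods. The helper name `ensure_utf8!` is hypothetical and not part of mirimiri, and the check is an assumption about what the guard is meant to enforce.

```ruby
# Hypothetical helper, not part of mirimiri: validates that the bytes of
# `name` form well-formed UTF-8, using core String methods only.
def ensure_utf8!(name)
  ok = name.is_a?(String) &&
       name.dup.force_encoding(Encoding::UTF_8).valid_encoding?
  # Put the offending value in the message; the third positional argument
  # of `raise` is interpreted as a backtrace, not as extra detail.
  raise ArgumentError, "Bad encoding: #{name.inspect}" unless ok
  name
end

ensure_utf8!("Ruby (programming language)") # passes, returns the string

begin
  ensure_utf8!("caf\xE9") # lone 0xE9 byte is not valid UTF-8
rescue ArgumentError => e
  warn e.message
end
```

Interpolating the value into the message keeps the real backtrace intact, which the three-argument `raise` form in the diff would overwrite.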
lib/mirimiri/string.rb
... ... @@ -66,7 +66,8 @@
66 66 "whew","which","whichever","whichsoever","while","whilst","whither","who","whoa",
67 67 "whoever","whole","whom","whomever","whomsoever","whose","whosoever","why","will",
68 68 "wilt","with","within","without","worse","worst","would","wow","ye","yet","year",
69   -"yippee","you","your","yours","yourself","yourselves"
  69 +"yippee","you","your","yours","yourself","yourselves",
  70 + "edit", "new", "page", "article", "http", "www", "com", "org", "wikipedia", "en"
70 71 ]
71 72  
72 73 Transmap = {
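The stopwords added in this commit ("edit", "http", "wikipedia", and so on) look like residue from Wikipedia pages and URLs rather than ordinary function words. A minimal sketch of how such a list might be applied follows. The constant `STOPLIST_SAMPLE` and the method `content_terms` are hypothetical names, the list is only a small subset of the one in `lib/mirimiri/string.rb`, and the tokenization is an assumption, not the library's own.

```ruby
require 'set'

# Hypothetical subset of the stoplist shown in the diff; the real list is
# much longer.
STOPLIST_SAMPLE = Set.new(%w[
  you your yours yourself yourselves
  edit new page article http www com org wikipedia en
])

# Lowercase the text, split it into alphanumeric tokens, and drop any
# token found in the stoplist.
def content_terms(text)
  text.downcase.scan(/[a-z0-9]+/).reject { |t| STOPLIST_SAMPLE.include?(t) }
end

p content_terms("New article: http://en.wikipedia.org/wiki/Stop_words")
# prints ["wiki", "stop", "words"]
```

Using a `Set` keeps each membership test constant-time, which matters once the list grows to several hundred entries.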