Thành viên:Hide on Rosé/Bộ lọc sai phạm/w/en/12


Conditions Details
user_editcount < 30 & (
    (
        new_size < 300 &
        old_size > 300
    ) |
    edit_delta < -5000
) & (
    bad_words := "\b(?:(?:P+H+|F)+U+C+K+[\w\d]*?|AN+U+S|CR+A+P|SH+I+T|[KC]+U+N+T|NI+GG+(?:E+R|A)|(?:ASS+|BU+(?:T+|M+))[ \-]*(?:H+O+L+E|S+E+X|R+A+P+E)|(?:CO|KA+W+)[CK]|LOO+[SZ]+E+R|BI+T+C+H|PE+N+I+S|WA+N+K+(?:E+R)?|SP+U+R+T|MA+S+T+[UE]+R+B+A+I?T+(?:I+O+N|E+R?)?|HA+GG+E+R|HERMAPHRO|JESKE ?COURLANO|(?:PH+|F)A+G+O+T|KINGPINIE|ORLY)\b";

    ccnorm(added_lines) rlike bad_words &
    !ccnorm(removed_lines) rlike bad_words &
    !added_lines irlike "^#Redirect\s*\[\[" &
    !contains_any ( page_title, "Sockpuppet investigations", user_name, "Sandbox" )
)
Enabled Yes
Updated 13:10:47, 18-02-2024
Attribution w:en:AF/12 (history)
Notes None
Actions
Throttle | Warn | Disallow | Revoke rights | Block user | Tag
No | No | Message: w:en:AFD | No | No | No
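
The size checks and page exemptions in the conditions above can be read as a plain boolean function. The following is a minimal Python sketch, not the filter itself: it assumes sizes are in bytes and that edit_delta equals new_size minus old_size, the function names are hypothetical, and the bad_words and redirect checks are left out.

```python
def is_mass_removal(old_size: int, new_size: int) -> bool:
    """True if the page drops below 300 bytes from above 300, or loses over 5000 bytes."""
    edit_delta = new_size - old_size  # assumption: delta is new minus old
    return (new_size < 300 and old_size > 300) or edit_delta < -5000


def is_in_scope(user_editcount: int, old_size: int, new_size: int,
                page_title: str, user_name: str) -> bool:
    """Everything in the conditions except the bad_words and redirect checks."""
    # contains_any(page_title, ...) is modelled here as a substring test of
    # each needle against the page title. As written in the filter, user_name
    # is one of the needles, so pages named after the editor are exempt.
    needles = ("Sockpuppet investigations", user_name, "Sandbox")
    exempt_page = any(n != "" and n in page_title for n in needles)
    return (user_editcount < 30
            and is_mass_removal(old_size, new_size)
            and not exempt_page)
```

For example, `is_in_scope(5, 1000, 0, "Example", "Alice")` is true, while the same edit on "User talk:Alice" or on "Wikipedia:Sandbox" is out of scope.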

Filter notes

  • Just familiarizing myself with the syntax for now; if it's horribly, horribly wrong don't hesitate to nuke it. —Nihiltres
  • "new_wikitext" now replaced with more epic "added_lines" :) —Nihiltres
  • Enabled warning - Tiptoety
  • Optimised -- Andrew
  • AntiAbuseBot hit something with only "nigger" in it earlier, adding that too. - Hersfold
  • Made sure it has to be mass-removal --Andrew
  • I've been adding some more variations —Nihiltres
  • Enable "prevent user from doing the action" This filter is good. -Prodego
  • too many false positives, four of the first fifteen hits were good edits; dropping to just logging for now - east
  • Fixed, disallow has been set, applying to mainspace only (after hitting a sig of a user with poop in their name) -Prodego
  • Adding not-sysop line, hopefully that will help optimize some, this filter is hitting the safeguard level. - hersfold
  • Added one. Does it matter whether you use single/double quotes? - It Is Me Here
  • Removed addition by It Is Me Here: since we're using ccnorm() on added_lines, "SHIT" is already covered by "5H1T", and "SHIT" will never turn up. Filters need to be kept short for performance reasons. As far as I know, single/double quotes don't matter. —Nihiltres
  • Added HAGGER (duh) -- NawlinWiki
  • Changed to use contains_any :) --Andrew
  • Add "Wikipedia is Communism" (boy, there's an oldie) -- NW
  • Added ' WANKER' (note the initial space); if it turns out any false positives, feel free to nuke it. Would it make this filter more efficient to order the most common words earlier in the contains_any()? A study of which words turn up would be interesting. —Nihiltres
    • It would only make it a tiny bit more efficient, assuming that bad words edits are rare. RF.
  • Excluding articles turned into redirects, which causes false positives. --Conti
  • Refined the redirect check; we don't want "I REDIRECT THIS PAGE TO YOUR ANUS" to be an easy workaround for the filter. —Nihiltres
  • Added variant "A55 H01E" of "A55H01E". Remove if problematic. —Nihiltres
  • Obama "Epic fail" vandal. --NW 4/13
  • Added variant "F U C K"; saw it in an article history and confirmed that it was being used to bypass the filter ( http://en.wikipedia.org/wiki/Special:AbuseLog?title=Special%3AAbuseLog&wpSearchUser=125.237.148.153&wpSearchFilter=12 ). I wonder how computationally expensive contains_any() is; it might be useful to use a regex system if it isn't significantly cheaper. If we get too many variants I'd be tempted to change ccnorm to norm, though that's more likely to produce false positives. —Nihiltres
  • Removing restriction of article pages only and adding a phrase for Joker vandalism; this type of vandalism is not appropriate on user talk pages either. Tested at length. --NW 5/20
    • More Joker garbage, tested. --NW 5/20
      • More from tonight, tested. --NW 5/21
        • + 1 more, also tested. --NW 5/21
  • More, tested. --NW 5/22
  • Shouldn't the additions of the last few days be moved to a separate filter? None of them are actual obscenities. --Conti
  • Makes sense, unless two filters eat up more time than one. Or we could just change the name of this one to "replacing a page with vandalism". --NW
    • I think we can live with the 5 additional ms. I'd rather not rename this filter, tho. Adding "☺" to pages is not vandalism, it's the MO of a specific vandal, and therefore should deserve its own filter. --Conti
      • Done -- the non-obscenities are now in filter 13. --NW
  • Fixed the "shit" filter (oops) and added a variant. —Nihiltres
  • Re-add one not covered elsewhere anymore. --NW
    • Modify to deal with 4chan vandalism. --NW 11/24
      • Revert self, that's not gonna work. --NW 11/24
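
The redirect exemption discussed in the notes above (Conti's exclusion and Nihiltres's refinement) relies on anchoring the pattern to the start of the added text, so that merely mentioning the word "redirect" does not bypass the filter. A minimal Python sketch of that idea, using the pattern from the conditions with Python's `re` module standing in for the filter's `irlike` operator; the function name is hypothetical:

```python
import re

# Same pattern as in the conditions; IGNORECASE stands in for irlike.
REDIRECT_RE = re.compile(r"^#Redirect\s*\[\[", re.IGNORECASE)


def looks_like_redirect(added_lines: str) -> bool:
    """True only if the added text starts with a redirect directive and link."""
    return REDIRECT_RE.search(added_lines) is not None
```

With this check, `looks_like_redirect("#REDIRECT [[Target]]")` is true, while `looks_like_redirect("I REDIRECT THIS PAGE TO YOUR ANUS")` is false, so the latter edit would still be tested against the word list.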

Add "Hermaphrodite" per recent attacks -TS 1/3

Allow users to blank their own talkpages -- Soap 1-21

Exception for bots due to FP; more intelligent solution would be nice. Also added space before crap to let "skyscraper" through (can we not use \b with ccnorm? that would be more ideal) -- Soap 1-23

Simplified. - Ruslik

I made a sudden change, tested on test wiki first, to correct the false positives. This required a complete redesign, but I took many of the ideas from the old filter. This should reduce false positives. Log only currently to see how it does. - Shirik 27 Jan 2010

+1, blame SGF -- Shirik 5 Mar 2010

+2 from ongoing attacks -- Shirik 7 Mar 2010

Added exception for SPI due to recent false positives until I can find a better way of doing it. -- Shirik 29 Mar 2010

Rm namespace for SPI exemption, since article_text does not have namespace in it. -Tim Song 31 Mar 2010

Added replacement with "ORLY" due to an ongoing attack - Shirik 20 Apr 2010

My first edit filter change. Added + signs after each letter, to match things like "FFFUUUCCCKKK". Also added matches for butt hole and bum hole. Changed to log only per request of shirik. Tim1357 April 26

Turned disallow back on, added LOOSER Tim1357 April 26

Optimize (Move "Sockpuppet investigation" exclusion to before the regex). --Tim

Rm false positive "cook" (oops) -- Tim 5/5

Add a rule that it is only if the page is reduced in size, even though I know that it was made this way on purpose. See Redrose64's filter logs -- Soap

Add line to exclude users with more than 2,000 edits. -- Tim 4/28

Reorder to optimize and add some. -- Tim 6/26

I'm pretty sure a vandal won't survive more than 100 edits. - KoH

Exclude "Sandbox" in page title, there is already a bot to clean that page. -Sole Soul

Use irlike and simplify redirect regex. RF 2014-02-17
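
The notes above mention two regex techniques: `+` after each letter to catch stretched spellings, and `\b` word boundaries to let words such as "skyscraper" through. A minimal Python sketch of both, using only three of the milder alternatives from bad_words. The `toy_norm` function and its mapping are hypothetical stand-ins for `ccnorm()`, whose real behaviour depends on MediaWiki's equivalence table and is not reproduced here.

```python
import re

# Fragment of bad_words from the conditions, limited to three alternatives.
BAD_WORDS_FRAGMENT = re.compile(r"\b(?:CR+A+P|LOO+[SZ]+E+R|ORLY)\b")

# Hypothetical stand-in for ccnorm(): uppercase, then map a few digits to
# look-alike letters. The real function uses a much larger table.
TOY_EQUIV = str.maketrans({"0": "O", "1": "I", "3": "E", "4": "A", "5": "S"})


def toy_norm(text: str) -> str:
    """Rough, illustrative normalisation; not the real ccnorm()."""
    return text.upper().translate(TOY_EQUIV)


def hits(text: str) -> bool:
    """True if the normalised text contains one of the fragment's words."""
    return BAD_WORDS_FRAGMENT.search(toy_norm(text)) is not None
```

Under these assumptions, `hits("CRRRAAAP")` and `hits("l00ser")` are true, while `hits("skyscraper")` and `hits("loser")` are false: the word boundary rejects a match inside a longer word, and `LOO+` requires at least two O's.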

Clean layout and reduce condition count. -DF

Simplify regex. RF 2015-07-03

Enhanced redirect regex, renamed bad_words, reduced condition count. Possible that more optimisation could be done using statistics. (E.g. are SPI edits very much rarer than bad_words edits?) RF 20150806

Updated for both old and new version of ccnorm. RF 20160812

https://phabricator.wikimedia.org/T29987 fully deployed and confirmed working, removing old code ~MA 2016.08.18

A couple regex fixes --Kaldari 2016-08-19

Tweak. RF 20160911

Public per Special:Permalink/784131724#Privacy of general vandalism filters ~MA

Update deprecated, make more readable/optimize. -G 2018.02.21

Optimized regex code. --Oshwah 3/18/2022

Converted OR condition to non-capturing group. --Oshwah 9/29/2022
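
The note above refers to wrapping the alternation in `(?:...)` rather than `(...)`. A short Python sketch of the difference, using two alternatives from bad_words; the variable names are hypothetical, and Python's `re` module is used here only as an illustration of the general regex behaviour. Both patterns match the same text, but the non-capturing form does not record a submatch.

```python
import re

capturing = re.compile(r"\b(LOO+[SZ]+E+R|ORLY)\b")
non_capturing = re.compile(r"\b(?:LOO+[SZ]+E+R|ORLY)\b")

# Number of capture groups each pattern defines.
capturing_groups = capturing.groups          # one group
non_capturing_groups = non_capturing.groups  # no groups

# Both forms accept and reject the same input.
same_result = all(
    bool(capturing.search(s)) == bool(non_capturing.search(s))
    for s in ("ORLY", "LOOOSER", "WORLD", "LOSER")
)
```

Since the filter only needs a yes-or-no answer from `rlike`, nothing is lost by dropping the capture.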

Reorder to put unlikely conditions last. --Suffusion of Yellow 02:56 23 Oct 2022

Restore disallow; that was meant to be a quick check. Useless as a log-only filter. That said, this isn't all that useful as a disallowing filter, either; nearly everything is stopped by other filters. No objections to disabling completely. --Suffusion of Yellow 23:10 24 Mar 2023

Some notable filter logs