Support forums : Feature Requests

PERISHABLE PRESS 4G BLACKLIST

Now that you've read stuff first and introduced yourself, the next thing you'll probably want to do is say what you want Quam Plures to have and do. So here ya go: A feature request forum!

Moderator: Dracones


Postby Kimberly » Tue Oct 11, 2011 4:13 am

This is not really a "How do I" question, since I know how to do it; I just did not know where else to ask.

I am having some problems with bots crawling around on my server. Suddenly it seems that China's search engines are interested in indexing my sites, as are ones from France and elsewhere, plus spam bots and the like. I think one of the bots is activating JavaScript as it crawls my sites, which is not good. Therefore, I am thinking of adding the PERISHABLE PRESS 4G BLACKLIST code to the .htaccess file at the root (or in my case, virtual root) of the server. Has anyone used this before, and are there problems with QP? Are there exceptions/removals that need to be made to the PERISHABLE PRESS 4G BLACKLIST so that it does not prevent QP from serving up my site?
Kimberly
Dracone
Posts: 842
Joined: Mon Jul 19, 2010 4:44 pm

Re: PERISHABLE PRESS 4G BLACKLIST

Postby EdB » Tue Oct 11, 2011 5:15 am

Interesting. I haven't used it, but just the other day I ran across a site that discussed the performance problems associated with a huge .htaccess file, so that is something to think about. Seriously: those lists are way too big.

I've been thinking about doing a .htaccess plugin. Kinda scary, given how hosts do stuff via .htaccess and may or may not consider that other entities are also editing that file. Anyway, the PP site seems like a damned good resource, but personally I wouldn't be too inclined to upload such a huge list.
EdB
Dracone
Posts: 2072
Joined: Sun Nov 22, 2009 7:20 am
Location: Maricopa Arizona

Re: PERISHABLE PRESS 4G BLACKLIST

Postby Kimberly » Tue Oct 11, 2011 6:38 am

The new one, version 5 (currently in beta), does not have a huge list. I will post it here, since I am keeping the author's info. It does not seem to be affecting QP's performance at all. I will keep an eye on the stats over time to see whether the spammer referrals and bots diminish. Actually, it seemed to me that the site load time decreased.
Code:
### PERISHABLE PRESS 4G BLACKLIST ###

# ESSENTIALS
RewriteEngine on
ServerSignature Off
Options All -Indexes
Options +FollowSymLinks

# FILTER REQUEST METHODS
<IfModule mod_rewrite.c>
RewriteCond %{REQUEST_METHOD} ^(TRACE|DELETE|TRACK) [NC]
RewriteRule ^(.*)$ - [F,L]
</IfModule>

# BLACKLIST CANDIDATES
<Limit GET POST PUT>
Order Allow,Deny
Allow from all
Deny from 75.126.85.215   "# blacklist candidate 2008-01-02 = admin-ajax.php attack "
Deny from 128.111.48.138  "# blacklist candidate 2008-02-10 = cryptic character strings "
Deny from 87.248.163.54   "# blacklist candidate 2008-03-09 = block administrative attacks "
Deny from 84.122.143.99   "# blacklist candidate 2008-04-27 = block clam store loser "
Deny from 210.210.119.145 "# blacklist candidate 2008-05-31 = block _vpi.xml attacks "
Deny from 66.74.199.125   "# blacklist candidate 2008-10-19 = block mindless spider running "
Deny from 203.55.231.100  "# 1048 attacks in 60 minutes"
Deny from 24.19.202.10    "# 1629 attacks in 90 minutes"
</Limit>

# QUERY STRING EXPLOITS
<IfModule mod_rewrite.c>
RewriteCond %{QUERY_STRING} \.\.\/    [NC,OR]
RewriteCond %{QUERY_STRING} boot\.ini [NC,OR]
RewriteCond %{QUERY_STRING} tag\=     [NC,OR]
RewriteCond %{QUERY_STRING} ftp\:     [NC,OR]
RewriteCond %{QUERY_STRING} http\:    [NC,OR]
RewriteCond %{QUERY_STRING} https\:   [NC,OR]
RewriteCond %{QUERY_STRING} mosConfig [NC,OR]
RewriteCond %{QUERY_STRING} ^.*(\[|\]|\(|\)|<|>|'|"|;|\?|\*).* [NC,OR]
RewriteCond %{QUERY_STRING} ^.*(%22|%27|%3C|%3E|%5C|%7B|%7C).* [NC,OR]
RewriteCond %{QUERY_STRING} ^.*(%0|%A|%B|%C|%D|%E|%F|127\.0).* [NC,OR]
RewriteCond %{QUERY_STRING} ^.*(globals|encode|config|localhost|loopback).* [NC,OR]
RewriteCond %{QUERY_STRING} ^.*(request|select|insert|union|declare|drop).* [NC]
RewriteRule ^(.*)$ - [F,L]
</IfModule>

# CHARACTER STRINGS
<IfModule mod_alias.c>
# BASIC CHARACTERS
RedirectMatch 403 \,
RedirectMatch 403 \:
RedirectMatch 403 \;
RedirectMatch 403 \=
RedirectMatch 403 \@
RedirectMatch 403 \[
RedirectMatch 403 \]
RedirectMatch 403 \^
RedirectMatch 403 \`
RedirectMatch 403 \{
RedirectMatch 403 \}
RedirectMatch 403 \~
RedirectMatch 403 \"
RedirectMatch 403 \$
RedirectMatch 403 \<
RedirectMatch 403 \>
RedirectMatch 403 \|
RedirectMatch 403 \.\.
RedirectMatch 403 \/\/
RedirectMatch 403 \%0
RedirectMatch 403 \%A
RedirectMatch 403 \%B
RedirectMatch 403 \%C
RedirectMatch 403 \%D
RedirectMatch 403 \%E
RedirectMatch 403 \%F
RedirectMatch 403 \%22
RedirectMatch 403 \%27
RedirectMatch 403 \%28
RedirectMatch 403 \%29
RedirectMatch 403 \%3C
RedirectMatch 403 \%3E
RedirectMatch 403 \%3F
RedirectMatch 403 \%5B
RedirectMatch 403 \%5C
RedirectMatch 403 \%5D
RedirectMatch 403 \%7B
RedirectMatch 403 \%7C
RedirectMatch 403 \%7D
# COMMON PATTERNS
Redirectmatch 403 \_vpi
RedirectMatch 403 \.inc
Redirectmatch 403 xAou6
Redirectmatch 403 db\_name
Redirectmatch 403 select\(
Redirectmatch 403 convert\(
Redirectmatch 403 \/query\/
RedirectMatch 403 ImpEvData
Redirectmatch 403 \.XMLHTTP
Redirectmatch 403 proxydeny
RedirectMatch 403 function\.
Redirectmatch 403 remoteFile
Redirectmatch 403 servername
Redirectmatch 403 \&rptmode\=
Redirectmatch 403 sys\_cpanel
RedirectMatch 403 db\_connect
RedirectMatch 403 doeditconfig
RedirectMatch 403 check\_proxy
Redirectmatch 403 system\_user
Redirectmatch 403 \/\(null\)\/
Redirectmatch 403 clientrequest
Redirectmatch 403 option\_value
RedirectMatch 403 ref\.outcontrol
# SPECIFIC EXPLOITS
RedirectMatch 403 errors\.
RedirectMatch 403 config\.
RedirectMatch 403 include\.
RedirectMatch 403 display\.
RedirectMatch 403 register\.
Redirectmatch 403 password\.
RedirectMatch 403 maincore\.
RedirectMatch 403 authorize\.
Redirectmatch 403 macromates\.
RedirectMatch 403 head\_auth\.
RedirectMatch 403 submit\_links\.
RedirectMatch 403 change\_action\.
Redirectmatch 403 com\_facileforms\/
RedirectMatch 403 admin\_db\_utilities\.
RedirectMatch 403 admin\.webring\.docs\.
Redirectmatch 403 Table\/Latest\/index\.
</IfModule>

Re: PERISHABLE PRESS 4G BLACKLIST

Postby Kimberly » Tue Oct 11, 2011 6:48 am

EdB wrote:I've been thinking about doing a .htaccess plugin. Kinda scary given how hosts do stuff via .htaccess and they may or may not consider that other entities are also editing that file. Anyway the PP site seems like a damned good resource, but personally I wouldn't be too inclined to upload such a huge list.


Most hosts will just include one at the root. You could work on a plugin that would place the .htaccess in the QP folder. One problem that often comes up is the memory limit being set too low, which can affect the site's performance; another is file size upload limits.

In my case I wanted one .htaccess to cover all the sites and domains on the server, so it needed to be at the root.
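On the memory and upload limits mentioned above: when PHP runs as mod_php, those can sometimes be raised per-folder from .htaccess. A minimal sketch follows; the values are purely illustrative, and `php_value` is only honored under mod_php (on CGI/FastCGI hosts these settings belong in php.ini instead):

```apacheconf
# Illustrative per-folder PHP overrides; only honored when PHP runs as mod_php.
# Without mod_php loaded these directives cause a 500 error, hence the guard.
<IfModule mod_php5.c>
php_value memory_limit 128M
php_value upload_max_filesize 16M
php_value post_max_size 20M
</IfModule>
```

Whether a host permits these overrides at all depends on its AllowOverride settings.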

Re: PERISHABLE PRESS 4G BLACKLIST

Postby Kimberly » Tue Oct 11, 2011 5:58 pm

The problem with bad bots is that they don't follow the rules, which is why it is nice to be able to just keep them out in the first place. I have a client who got banned by Google AdSense because of fraudulent clicks (AdSense is really a waste of time anyway; unless you have millions of visitors you don't make any money with it). While it was not a problem with QP, users may be interested in how to prevent such things from happening if they are hosting clients' blogs. My client is not going to leave (and he is the only paying one I have), but he was irritated. Somewhere here on the forums I talked about click bots before. What I am seeing now may be bots that crawl the site and, by executing the JavaScript, register a false click that is reported back to Google but never redirects to the advertiser's website.

In looking at the Perishable Press 4G Blacklist, I went ahead and added a redirect for User-Agents. However, in researching this I came across a statement that we should not create huge blacklists. The idea should be along the lines of blocking websites from leaving cookies: in my browser settings I deny all sites from setting cookies and keep an exceptions list for good sites. The same idea should apply to blocking bad bots. The problem is that the User-Agent string can be changed, and that makes robots.txt or .htaccess rules based on it ineffective. Therefore, a better approach is to deny all bots and keep an exceptions list for the ones you do want to allow.
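That deny-all-with-exceptions idea can be sketched in .htaccess along the same lines as the blacklist above. This is only an illustration: the patterns and bot names are examples, not a tested list, and a bot that forges its User-Agent will still get through.

```apacheconf
# Sketch: deny anything that identifies itself as a bot, then re-allow
# known good crawlers. With "Order Deny,Allow", an Allow match overrides
# a Deny match, and requests matching neither are allowed by default.
SetEnvIfNoCase User-Agent "bot|crawl|spider|scan" deny_ua
SetEnvIfNoCase User-Agent "Googlebot|bingbot|Yahoo! Slurp" allow_ua
<Limit GET POST>
Order Deny,Allow
Deny from env=deny_ua
Allow from env=allow_ua
</Limit>
```

Ordinary browsers match neither variable, so they pass through untouched; only self-identified crawlers outside the exceptions list are refused.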

Re: PERISHABLE PRESS 4G BLACKLIST

Postby EdB » Tue Oct 11, 2011 6:49 pm

My .htaccess is fairly stable across all installations. I have ~170 lines of User-Agent bad-bot entries that I got from the web and have added to based on personal experience. Then I do a very simple Referer block:
Code:
SetEnvIfNoCase Referer ".*(more-poker).*" spam_bot
SetEnvIfNoCase Referer ".*(ooo-casino|hotelgaydays).*" spam_bot
SetEnvIfNoCase Referer ".*(sexdragsandrocknroll).*" spam_bot
SetEnvIfNoCase Referer ".*(ambien|-anal|anal-|betting).*" spam_bot
SetEnvIfNoCase Referer ".*(casino|-cialis|cialis-|credit-).*" spam_bot
SetEnvIfNoCase Referer ".*(-deal|-drugs|deal-|drugs-).*" spam_bot
SetEnvIfNoCase Referer ".*(finance|fioricet|forex).*" spam_bot
SetEnvIfNoCase Referer ".*(gambling).*" spam_bot
SetEnvIfNoCase Referer ".*(hardcore|hold-em|holdem|hoodia|-hotel|hotel-).*" spam_bot
SetEnvIfNoCase Referer ".*(insurance|jintropin|lenarcic).*" spam_bot
SetEnvIfNoCase Referer ".*(levitra|-loan|loan-).*" spam_bot
SetEnvIfNoCase Referer ".*(mature|meridia|mortgage).*" spam_bot
SetEnvIfNoCase Referer ".*(pharmacy|phentermine|pills-|-pills|poker-|-poker).*" spam_bot
SetEnvIfNoCase Referer ".*(shemale|tramadol).*" spam_bot
SetEnvIfNoCase Referer ".*(volny|viagra|xanax).*" spam_bot

<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=spam_bot
</Limit>
That might block some valid referer situations, but I don't care. I built it based on my days managing that other app's antispam central system: why bother with 10 or 12 individual domain names when a single keyword will kill them all?

I guess I should look at the PP4G thing and see how it stacks up. I am also looking at that other app's current antispam list and trimming it down based on things like not banning whole sentences (which they do). Unfortunately that still leaves a list of ~2700 URLs for the blacklist, so I need to trim it further against my Referer blockade above.

For QP5 I'm probably going to include checkboxes at installation (defaulting to unchecked) for "create robots.txt" and "create .htaccess" that basically use whatever I've got canned for each specific domain. Only available at the root level, though: to me, if you install it in a folder, then no deal.
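A canned robots.txt along the deny-all-with-exceptions lines discussed earlier in the thread could be as simple as the sketch below. The bot names are examples only; keep in mind robots.txt is advisory, so well-behaved crawlers honor it while bad bots ignore it, which is why the .htaccess rules still matter.

```text
# Illustrative robots.txt: block everything, then whitelist specific crawlers.
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow:
```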

