IP Tables


glennzo
29th August 2008, 05:19 PM
Has IPTables ever been discussed here :rolleyes: ? I have a few bots checking out my wiki site every day, or a few times a day. They're from MSN, Yahoo, etc. Do I need to bother blocking these? If so, can anyone explain how to add rules to iptables based on the IP addresses of these bots? Yes, I read man iptables and searched Fedora Forum prior to posting. :rolleyes: By the way, I had always had the firewall, iptables and ip6tables turned off. Now the firewall is on, as is iptables. ip6tables is still off.

Dies
29th August 2008, 05:43 PM
Are they just ignoring your "robots.txt" ?

glennzo
29th August 2008, 05:50 PM
Ha! What's robots.txt? I'm lucky I got the wiki working :p Guess I should go see what that's all about. Anyhow, here's a sample entry from access.log. 65.55.230.232 - - [29/Aug/2008:11:29:57 -0400] "GET /robots.txt HTTP/1.1" 404 298 "-" "msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)"

Edit: I see, from a simple Google search, that robots.txt can be a very useful file if placed in the proper folder. It's used to tell these bots to go away. Just what I want, or do I? I don't really care if my little spot on the web is indexed. After all, how else would one avail themselves of all the useful info that I have written, and occasionally plagiarized :p
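
For anyone landing here later: a minimal sketch of a robots.txt that turns away particular crawlers while leaving the site open to everyone else might look like this. The user-agent tokens here (msnbot, Slurp) are the published names of the MSN and Yahoo! crawlers that show up elsewhere in this thread; adjust to taste.

# refuse the MSN crawler
User-agent: msnbot
Disallow: /

# refuse Yahoo! Slurp
User-agent: Slurp
Disallow: /

# everyone else may index the whole site (empty Disallow = allow all)
User-agent: *
Disallow:

Compliant bots read this file from the web root before crawling; it's a request, not an enforcement mechanism, which is where iptables would come in for truly rude bots.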

vallimar
29th August 2008, 05:58 PM
Raw and simple iptables I believe would be something like:

iptables -A INPUT -s <ip-to-block> -j DROP
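
iptables also accepts CIDR ranges, so a whole netblock can be dropped with one rule. Purely as an illustration, assuming (hypothetically) the bot's addresses all fall inside 65.55.0.0/16:

iptables -A INPUT -s 65.55.0.0/16 -j DROP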

glennzo
29th August 2008, 06:01 PM
Simple is what I like. Now where do I put that little tidbit? Is there an iptables.conf somewhere? I see a likely candidate at /etc/sysconfig/iptables-config. My guess is that this is my victim.

vallimar
29th August 2008, 06:17 PM
That's the config file for the iptables service.
I'm not sure how your config is set up. If you use the iptables service with save/restore,
you could just run the rule at the command line and then do "service iptables save". If you
aren't using an actual customizable firewall setup and aren't saving the rules, I would
recommend adding any custom blacklist rules to /etc/rc.d/rc.local so they get loaded at the
end of the bootup phase.
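
A minimal sketch of both approaches, using the msnbot address from the earlier access.log entry as a stand-in for whatever you want to blacklist:

# Approach 1: iptables service with save/restore
iptables -A INPUT -s 65.55.230.232 -j DROP
service iptables save   # writes the running rules to /etc/sysconfig/iptables

# Approach 2: no saved ruleset; re-add the rule at each boot
echo 'iptables -A INPUT -s 65.55.230.232 -j DROP' >> /etc/rc.d/rc.local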

Dies
29th August 2008, 06:46 PM
Simple is what I like.

Then, unless you have some really rude bots, just keep IPTables out of this. ;)

http://www.robotstxt.org/faq/prevent.html

glennzo
29th August 2008, 08:26 PM
I like the idea of the blacklist, but I like the robots.txt approach Dies suggested better. I copied robots.txt to /var/www/moin and added this
User-agent: *
Disallow: /
to it, per Dies' URL. Let's see what happens... I also like the idea of not using iptables. At the same time, I appreciate the help you're both providing. Thank you.

glennzo
29th August 2008, 10:28 PM
Hmmmm. I moved robots.txt to /var/www/moin since, in my infinite wisdom, that's where I think it belongs for the moin wiki. No good. Actually, the reverse is true: Yahoo! Slurp is grabbing every page on the wiki. Stopping httpd stops Yahoo! Slurp. Restart httpd and Yahoo! Slurp is at it again, indexing every page. Apparently, before I started fixing this unbroken issue, things were acting as I wanted: Slurp looked, saw that it wasn't welcome and went away. I've also edited the robots.txt file that was already in /usr/share/moin/htdocs, but that didn't help. Yes, I restarted httpd. So I renamed /var/www/moin/robots.txt and it looks like things are back to normal. Moral of the story: if it ain't broke, don't fix it.