Website Stats Mystification


(Butternut Squash) #1

I am really curious about a certain phenomenon that I see frequently in my stats. These brief visits all originate in China, they almost always go to a top-level page that doesn’t really exist (e.g. CARDS), and then they are gone in a nanosecond. I am attaching a screenshot taken a few days ago.

Are these trolls? Are they looking for email addresses, or is something even more diabolical happening?

I am wondering if anyone has a clue.

Thanks.


(Michael M.) #2

It was probably a bot. Can you identify the IP address or the hostname?

You should differentiate between good bots and bad bots. You can block bad bots with a rule in the .htaccess file.
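For readers wondering how the IP might be identified: a minimal sketch in Python, assuming your host gives you access to the raw Apache access log in the common "combined" format. The sample line, the address 203.0.113.7, the agent name ExampleBot and the function name parse_log_line are all made up for illustration.

```python
import re

# Apache "combined" log format:
# IP, identity, user, [time], "request", status, size, "referer", "user agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# Hypothetical sample line, not taken from any real log
SAMPLE = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /CARDS HTTP/1.1" 404 512 "-" "ExampleBot/1.0"'
)

if __name__ == "__main__":
    entry = parse_log_line(SAMPLE)
    print(entry["ip"], entry["status"], entry["agent"])
```

The IP and the user agent string are the two pieces most blocklists are keyed on. If your stats tool only shows a country, the raw log (if your host provides one) is usually where these details are found.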


(Butternut Squash) #3

Well, if I was mystified before, I am doubly mystified now.

What’s a good bot and what’s a bad bot? How does one tell the difference?

I cannot identify the IP or hostname, nor do I know what or where the .htaccess is, and therefore I don’t know how to block them.

But I am curious what they are looking for, or trying to do.


(Michael M.) #4

Good bots are the crawlers of the major search engines. They need the information they crawl to evaluate and classify the content they find on your website. They do their job according to international standards, and they would normally never crawl a directory you do not want crawled.

Bad bots misuse or ignore your robots instructions. They look for mail addresses, images, and any files they are not allowed to use. When you look at your tracking data, you will sometimes see the number of visitors to your website increase enormously on a single day or at a single moment; most often the reason is that a bad bot has crawled your website. If you search on Google you will find many lists of bad bots. I recommend blocking these bad bots from crawling your website.
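The "robots instructions" mentioned above normally live in a file called robots.txt at the root of the site. As a rough illustration of how a well-behaved crawler reads them, here is a small Python sketch using the standard library's urllib.robotparser. The robots.txt content, the /private/ directory and the bot name ExampleBot are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is asked to stay out of /private/
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    print(allowed("ExampleBot", "/index.html"))
    print(allowed("ExampleBot", "/private/notes.html"))
```

Note that robots.txt is only a request. A good bot performs a check like this before fetching a page; a bad bot simply skips it, which is why blocking on the server side is needed for those.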

A few days ago I wrote a post on my blog about blocking bad bots. You don’t need to understand German; you can simply take the code and put it into a text file called .htaccess:
Zu viele Seitenbesucher…? (“Too many site visitors…?”)
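The code from the linked post is not reproduced in this thread. As a rough idea of what such blocking rules can look like, here is a Python sketch that builds a small .htaccess fragment from a list of user agent names. It is not the code from the blog post. It assumes an Apache server with mod_rewrite enabled, and the names BadBotOne and BadBotTwo, like the function name build_htaccess_rules, are placeholders.

```python
import re

# Only plain letters, digits and underscores, so no name can break the rule syntax
SAFE_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def build_htaccess_rules(bot_names):
    """Return .htaccess text that answers 403 Forbidden to the given user agents."""
    if not bot_names:
        raise ValueError("need at least one bot name")
    for name in bot_names:
        if not SAFE_NAME.match(name):
            raise ValueError(f"unsafe bot name: {name!r}")
    pattern = "|".join(bot_names)
    lines = [
        "RewriteEngine On",
        # [NC] makes the user agent match case-insensitive
        f"RewriteCond %{{HTTP_USER_AGENT}} ({pattern}) [NC]",
        # "-" means no rewrite; [F] sends 403 Forbidden, [L] stops further rules
        "RewriteRule .* - [F,L]",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder names; real lists of bad bots are much longer
    print(build_htaccess_rules(["BadBotOne", "BadBotTwo"]))
```

The printed text is what would go into the .htaccess file. Matching on the user agent only stops bots that identify themselves honestly, so treat it as a first filter rather than a complete defense.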


(Butternut Squash) #5

My dearest Mr. Applesauce,

Thank you so much for your response. You are correct, I do not speak German.

So to clarify, I take the entirety of the code highlighted in grey on your website, and I copy it into a text file saved as .htaccess. Is that correct?

And then what do I do with it? Where do I put it?

Thanks again.


(Butternut Squash) #6

OK, I’ve got the scoop. Apparently you create the file, and upload it via FTP to the root of your server.

I hope that’s right.


(Michael M.) #7

You create a text file and save it as “anyname.txt”. You copy the highlighted code into the document and save it. Then you use an FTP program to upload the text file to the root level of your website. The root level is the place where you find the index.html of your homepage. After uploading, you rename the file from “anyname.txt” to “.htaccess” (with the dot at the beginning).

Note: This is for Apache servers and does not work on Windows servers.


(Butternut Squash) #8

So I followed the instructions, uploaded the htaccess file, and Eureka…it worked! No more bots. I was ecstatic.

My website is still a work in progress. I uploaded a newer version last night. The server (LittleOak) went berserk, and gave me an internal server error.

After much gnashing of teeth, re-saving the file, re-exporting the file, re-uploading the file, it turns out that the culprit was the htaccess. When that file was deleted from the server, the website was once again accessible.

Have no idea why that would happen.


(Michael M.) #9

Hmmm…

When an internal server error appears after you add a rule to the .htaccess, there might be an issue with the syntax of a rule. An .htaccess file is very “capricious” - one dot in the wrong place can cause a server error. But it worked before…

I have no idea why re-uploading your project caused this issue, because normally changing anything in the website documents does not affect the .htaccess.
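One syntax problem that is easy to miss is a character that only looks right, such as a curly quote or other non-ASCII character that a word processor or rich-text editor slipped in. The following Python sketch flags such characters. It is only a heuristic, not an Apache syntax checker, and it is not known whether this was the cause in this thread. The function name find_suspicious_characters is made up for illustration.

```python
def find_suspicious_characters(text):
    """Return (line_number, character) pairs for every non-ASCII character in text."""
    problems = []
    for line_number, line in enumerate(text.split("\n"), start=1):
        for char in line:
            # Curly quotes, byte-order marks and similar characters are not plain ASCII
            if ord(char) > 127:
                problems.append((line_number, char))
    return problems


if __name__ == "__main__":
    # Hypothetical rule with curly quotes around the bot name
    sample = "RewriteEngine On\nRewriteCond %{HTTP_USER_AGENT} \u201cBadBot\u201d [NC]\n"
    for line_number, char in find_suspicious_characters(sample):
        print(f"line {line_number}: unexpected character {char!r}")
```

An empty result does not prove the file is valid. The server's error log, if your host lets you see it, normally names the exact line Apache could not parse.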