Block user access to internals of a site using HTTP_REFERER

Block user access to internals of a site using HTTP_REFERER

"I have control over the HttpServer but not over the ApplicationServer or the Java Applications sitting there but I need to block direct access to certain pages on those applications. Precisely, I don't want users automating access to forms issuing direct GET/POST HTTP requests to the appropriate servlet.

So, I decided to block users based on the value of HTTP_REFERER. After all, if the user is navigating inside the site, it will have an appropriate HTTP_REFERER. Well, that was what I thought.

I implemented a rewrite rule in the .htaccess file that says:

RewriteEngine on

# Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} !^http://mywebaddress(.cl)?/.* [NC]
RewriteRule (servlet1|servlet2)/.+\?.+ - [F]

I expected to forbid access to users that didn't navigate the site but issue direct GET requests to the ""servlet1"" or ""servlet2"" servlets using querystrings. But my expectations ended abruptly because the regular expression (servlet1|servlet2)/.+\?.+ didn't worked at all.

I was really disappointed when I changed that expression to (servlet1|servlet2)/.+ and it worked so well that my users were blocked no matter if they navigated the site or not.

So, my question is: How do I can accomplish this thing of not allowing ""robots"" with direct access to certain pages if I have no access/privileges/time to modify the application?"

Asked by: Guest | Views: 328

Total answers/comments: 4

Comments display order:

Guest [Entry]

"I'm not sure if I can solve this in one go, but we can go back and forth as necessary.

First, I want to repeat what I think you are saying and make sure I'm clear. You want to disallow requests to servlet1 and servlet2 is the request doesn't have the proper referer and it does have a query string? I'm not sure I understand (servlet1|servlet2)/.+\?.+ because it looks like you are requiring a file under servlet1 and 2. I think maybe you are combining PATH_INFO (before the ""?"") with a GET query string (after the ""?""). It appears that the PATH_INFO part will work but the GET query test will not. I made a quick test on my server using script1.cgi and script2.cgi and the following rules worked to accomplish what you are asking for. They are obviously edited a little to match my environment:

RewriteCond %{HTTP_REFERER} !^http://(www.)?example.(com|org) [NC]
RewriteCond %{QUERY_STRING} ^.+$
RewriteRule ^(script1|script2)\.cgi - [F]

The above caught all wrong-referer requests to script1.cgi and script2.cgi that tried to submit data using a query string. However, you can also submit data using a path_info and by posting data. I used this form to protect against any of the three methods being used with incorrect referer:

RewriteCond %{HTTP_REFERER} !^http://(www.)?example.(com|org) [NC]
RewriteCond %{QUERY_STRING} ^.+$ [OR]
RewriteCond %{REQUEST_METHOD} ^POST$ [OR]
RewriteCond %{PATH_INFO} ^.+$
RewriteRule ^(script1|script2)\.cgi - [F]

Based on the example you were trying to get working, I think this is what you want:

RewriteCond %{HTTP_REFERER} !^http://mywebaddress(.cl)?/.* [NC]
RewriteCond %{QUERY_STRING} ^.+$ [OR]
RewriteCond %{REQUEST_METHOD} ^POST$ [OR]
RewriteCond %{PATH_INFO} ^.+$
RewriteRule (servlet1|servlet2)\b - [F]

Hopefully this at least gets you closer to your goal. Please let us know how it works, I'm interested in your problem.

(BTW, I agree that referer blocking is poor security, but I also understand that relaity forces imperfect and partial solutions sometimes, which you seem to already acknowledge.)"

Guest [Entry]

I don't have a solution, but I'm betting that relying on the referrer will never work because user-agents are free to not send it at all or spoof it to something that will let them in.

Guest [Entry]

You can't tell apart users and malicious scripts by their http request. But you can analyze which users are requesting too many pages in too short a time, and block their ip-addresses.

Guest [Entry]

"Using a referrer is very unreliable as a method of verification. As other people have mentioned, it is easily spoofed. Your best solution is to modify the application (if you can)

You could use a CAPTCHA, or set some sort of cookie or session cookie that keeps track of what page the user last visited (a session would be harder to spoof) and keep track of page view history, and only allow users who have browsed the pages required to get to the page you want to block.

This obviously requires you to have access to the application in question, however it is the most foolproof way (not completely, but ""good enough"" in my opinion.)"