LinuxQuestions.org


deesto 06-26-2009 03:19 PM

Apache trailing slash problem for multiple rewrites
 
I'm trying to proxy multiple instances of a web application (Nagios) on multiple remote machines via one Apache proxy machine.

In a virtual host definition, I've created separate Location entries for each instance, since each has a different authentication mechanism. And for each Location, I've created a set of RewriteRule directives, like the following:
Code:

RewriteRule ^/instance1$ /instance1/ [R]
RewriteRule ^/instance1/(.*)$ https://remote.url/and/path$1 [L,P]

The second rule works, but the first rule (to fix a missing trailing slash) does not, and the user lands on a blank page. I've read the mod_rewrite docs, the regular and advanced rewrite guides, and many 'trailing slash' posts here. 'UseCanonicalName' is on, and I've also tried these methods, with the same results:

Code:

#Reverse proxy:
ProxyPass /instance1 https://remote.url/and/path
ProxyPassReverse /instance1 https://remote.url/and/path

#Rewrite within Location directive:
<Location instance1>
...
RewriteBase /
RewriteRule ^$ / [R]
</Location>

None of these seemed to address the problem with the missing slash. When I went back to the first set of RewriteRule directives above, the rewrite log showed the engine init for the request, then 'applying pattern' for all the possible definitions (instance1, instance2, etc.), including the one for the slash fix, but only returned a 'rewrite' entry for rewriting the URL without the slash to the remote server. This means that while the remote rewrite was a match, the slash rewrite didn't match and was skipped, but I'm not sure why.

Guttorm 06-29-2009 06:15 PM

Hi

I'm no expert on this, but why not just make 2 rules. One without a slash and one with:
Code:

RewriteRule ^/instance1$ https://remote.url/and/path [L,P]
RewriteRule ^/instance1/(.*)$ https://remote.url/and/path$1 [L,P]


deesto 06-30-2009 09:00 AM

Quote:

Originally Posted by Guttorm (Post 3590722)
I'm no expert on this, but why not just make 2 rules. One without a slash and one with:
Code:

RewriteRule ^/instance1$ https://remote.url/and/path [L,P]
RewriteRule ^/instance1/(.*)$ https://remote.url/and/path$1 [L,P]


Thanks Guttorm. Seems sensible enough. Unfortunately, this ended up with the same problem as my original solution: requests without the trailing slash (/instance) stay without a trailing slash and land on a blank page. From the rewrite log output, it looks like the rule matches, but then the engine continues to check and apply later rules, even though I've specified the 'L' flag (for "last", meaning "don't process any further rules if this matches", or at least that's what I thought it meant).

Maybe someone can make some sense of what is happening; the request here is for the URL '/instance' (no trailing slash):
Code:

(2) init rewrite engine with requested uri /instance
(3) applying pattern '.*' to uri '/instance'
(3) applying pattern '/instance$' to uri '/instance'
(2) rewrite '/instance' -> 'https://remote.url/and/path/'
(2) forcing proxy-throughput with https://remote.url/and/path/
(1) go-ahead with proxy request proxy:https://remote.url/and/path/ [OK]
(2) init rewrite engine with requested uri /favicon.ico
(3) applying pattern '.*' to uri '/favicon.ico'
(3) applying pattern '/instance$' to uri '/favicon.ico'
(3) applying pattern '/instance/(.*)$' to uri '/favicon.ico'
(3) applying pattern '/instance2$' to uri '/favicon.ico'
(3) applying pattern '/instance2/(.*)$' to uri '/favicon.ico'
(3) applying pattern '^/(nagios_config_graph.pl)(.*)$' to uri '/favicon.ico'
(3) applying pattern '^/cgi-bin/nagios_config_graph.pl$' to uri '/favicon.ico'
(3) applying pattern '^/instance3$' to uri '/favicon.ico'
(3) applying pattern '^/instance3/(.*)$' to uri '/favicon.ico'
(3) applying pattern '/' to uri '/favicon.ico'
(2) rewrite '/favicon.ico' -> 'https://remote.url/and/path/'
(2) forcing proxy-throughput with https://remote.url/and/path/
(1) go-ahead with proxy request proxy:https://remote.url/and/path/ [OK]

The first pattern seems to match, but the trailing slash is not being appended.

Viewing the resulting page source confirms that the trailing slash was not added, and that the URL remains unchanged. Also, the resultant page, though blank, is actually a "no frames" page (which includes a hidden comment that says "This page requires a web browser which supports frames"). And the Apache access log shows GETs for /instance instead of /instance/ (and for /favicon.ico too, which should be /instance/favicon.ico).
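
A possible explanation for the blank frames page, offered as an assumption rather than something confirmed in this thread: a browser resolves relative links against the URL it actually requested, so a frameset served at /instance looks for its frames directly under /. The short Python sketch below shows that resolution rule with the standard urljoin function; the host proxy.example and the file name side.html are placeholders, not names taken from the real setup.

```python
from urllib.parse import urljoin

# Hypothetical proxy host and frame document, for illustration only.
base_without_slash = "https://proxy.example/instance"
base_with_slash = "https://proxy.example/instance/"

# Without the trailing slash, "instance" is treated as a file name and is
# replaced by the relative link; with it, "instance/" is kept as a directory.
resolved_without = urljoin(base_without_slash, "side.html")
resolved_with = urljoin(base_with_slash, "side.html")

print(resolved_without)
print(resolved_with)
```

If that is what is happening here, the rewritten request has to reach the browser as a visible redirect to /instance/; an internal rewrite or proxy pass alone leaves the browser's base URL at /instance.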

deesto 08-05-2009 12:52 PM

I could still use some help with this: nothing I can think of trying seems to work.

acksys 01-15-2011 03:54 PM

I had to do the exact same thing. I had some trouble, but got it to work.

I know this is an old thread, but if anyone comes across it while trying to do this and wants me to post my configuration I'd be happy to.

grim76 01-15-2011 08:44 PM

Quote:

Originally Posted by acksys (Post 4225508)
I had to do the exact same thing. I had some trouble, but got it to work.

I know this is an old thread, but if anyone comes across it while trying to do this and wants me to post my configuration I'd be happy to.

Why not post your config rather than make people re-visit a thread to ask for it?

Nominal Animal 01-16-2011 01:01 AM

Quote:

I've created a set of RewriteRule directives, like the following:
Code:

RewriteRule ^/instance1$      /instance1/                  [R]
RewriteRule ^/instance1/(.*)$ https://remote.url/and/path$1 [L,P]


Note that Guttorm's example lacked the trailing slash in the target of the first rule; otherwise it would be fine.

Either use duplicate rules like
Code:

RewriteRule ^/instance1$      https://remote.url/and/path/  [L,P]
RewriteRule ^/instance1/(.*)$ https://remote.url/and/path$1 [L,P]

Or, restart the rule matching if a corrective rule matches:
Code:

RewriteRule ^/instance1$      /instance1/                  [N]
RewriteRule ^/instance1/(.*)$ https://remote.url/and/path$1 [L,P]

The [N] means that the entire rule chain is restarted immediately with the result, at the very first RewriteRule. If you create an infinite loop, Apache will get stuck, so I normally recommend the duplicate rules instead.

If the corrective rules are as simple as above, or you are very careful, the restart feature is immensely useful.

Note that corrective rules that restart the chain are best used first, to make it as efficient as possible. If you are careful like me, you can even kill some path trickery at the same time. Consider using these as your very first rules:
Code:

RewriteEngine on
RewriteRule ^([^/].*)$      /$1  [E=redirect:y,N]
RewriteRule ^//+(.*)$      /$1  [E=redirect:y,N]
RewriteRule ^(.*)\.\.+(.*)$ $1.$2 [E=redirect:y,N]
RewriteRule ^(.*)//+(.*)$  $1/$2 [E=redirect:y,N]
RewriteRule ^(.*)\./(.*)$  $1/$2 [E=redirect:y,N]
RewriteRule ^(.*)/\.(.*)$  $1/$2 [E=redirect:y,N]
RewriteRule ^(.*/[^/.]+)$  $1/  [E=redirect:y,N]
RewriteCond %{ENV:redirect} y
RewriteRule ^(.*)$          $1    [R,L]

  1. If the request URL does not start with a slash (/), prepend one.
  2. If the request URL starts with more than one slash (/), keep only the first one.
  3. If there is more than one dot (.) in succession in the URL, replace them with just one.
  4. If there is more than one slash (/) in succession in the URL, replace them with just one.
  5. Replace any ./ in the URL with just /
  6. Replace any /. in the URL with just /
  7. If the last component in the URL is non-empty but does not contain a dot, append a slash /
  8. Whenever one of the above rules applies, it sets the environment variable redirect to y and restarts the entire chain. By the time processing reaches this point, every possible replacement has already been made.
    The environment variable contains y if and only if at least one substitution was made. If so, apply the following RewriteRule.
  9. Use a single visible redirect to send the browser to the cleaned-up URL right away.

Note that the [E=redirect:y] flag will set the environment variable redirect to value y, and the [N] flag will restart the entire RewriteRule chain immediately from the beginning (using the result, of course) if the rule matches.
Since browsers only allow a small number of redirects before giving up on a query, it's best to apply all known fixes first, and then do a single redirect. If these rules are the very first ones, this is extremely efficient -- even a very high load site would not be able to measure the impact (other than the single redirect done).

The end result is simple and safe for your other redirects; you won't need any duplicate rules et cetera. These also increase script security, since scripts never get paths with potentially nasty path walking components (/../ for example).
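
To see what the restart behaviour does to a given path, the chain can be imitated outside Apache. The following is only a sketch in Python, assuming that Python's re module treats these particular patterns the same way Apache's regex engine does. The names CLEANUP_RULES, clean_path and max_restarts are made up for illustration and are not part of mod_rewrite. It models only the [N] restart and the redirect flag, nothing else Apache does with a request.

```python
import re

# (pattern, substitution) pairs transcribed from the seven cleanup rules;
# Apache's $1/$2 are written as Python backreferences \1/\2.
CLEANUP_RULES = [
    (r"^([^/].*)$", r"/\1"),          # 1. prepend a missing leading slash
    (r"^//+(.*)$", r"/\1"),           # 2. collapse leading slashes
    (r"^(.*)\.\.+(.*)$", r"\1.\2"),   # 3. collapse runs of dots
    (r"^(.*)//+(.*)$", r"\1/\2"),     # 4. collapse runs of slashes
    (r"^(.*)\./(.*)$", r"\1/\2"),     # 5. "./" becomes "/"
    (r"^(.*)/\.(.*)$", r"\1/\2"),     # 6. "/." becomes "/"
    (r"^(.*/[^/.]+)$", r"\1/"),       # 7. append a trailing slash
]


def clean_path(path, max_restarts=100):
    """Return (cleaned_path, needs_redirect) for a request path.

    max_restarts is a hypothetical safety cap for this sketch, so that a
    badly written rule set raises an error instead of looping forever.
    """
    needs_redirect = False  # stands in for the "redirect" environment variable
    for _ in range(max_restarts):
        for pattern, substitution in CLEANUP_RULES:
            match = re.match(pattern, path)
            if match:
                path = match.expand(substitution)
                needs_redirect = True
                break  # [N]: restart the chain from the first rule
        else:
            # No rule matched on this pass, so the path has settled.
            return path, needs_redirect
    raise RuntimeError("rule chain did not settle")


print(clean_path("/instance1"))
print(clean_path("/instance1/"))
print(clean_path("//instance1"))
```

One thing the sketch makes visible: clean_path("/a//b/../c.html") comes out as /a/b/c.html with a redirect. The .. is collapsed and dropped rather than resolved to the parent directory, which is what the rules as written do.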

Hope you find this useful,
Nominal Animal

