LinuxQuestions.org
Old 11-09-2011, 06:38 AM   #1
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Servers: Debian Squeeze and Wheezy. Desktop: Slackware64 14.0. Netbook: Slackware 13.37
Posts: 8,512
Blog Entries: 27

Apache RedirectMatch: prevent insertion of match in URL?


Hello

The following server-level configuration generates http://<server>/omega/cgi-bin/omega?DB=docoll/docoll/FMT=docoll on GETting http://<server>/docoll. Can the /docoll/ between the two parameters (shown in red in the original post) be suppressed and the & retained?

Code:
ScriptAlias /omega/cgi-bin/ /usr/lib/cgi-bin/omega/

<Directory /var/www/docoll>
    DirectorySlash  On
    Options         All
    Order           Deny,Allow
    Allow           From all
    RewriteRule     ^ - [env=OMEGA_CONFIG_FILE:/etc/opt/docoll/search-0.2/omega.conf]
    RedirectMatch   ^/docoll/$ "/omega/cgi-bin/omega?DB=docoll&FMT=docoll"
</Directory>
Best

Charles
 
Old 11-10-2011, 03:25 PM   #2
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,344

I don't see why you care in this instance. If a static URL is returned, you don't need to mess with it: match it exactly and return exactly whatever you want. You can trivially do what you're asking for, as it's a regex, so "RedirectMatch ^(.+)/docoll/(.+) $1/$2" should work, but that's overkill if the URL is only ever one thing; just use a normal Redirect instead.
 
Old 11-10-2011, 04:54 PM   #3
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Quote:
Originally Posted by catkin View Post
The following server-level configuration generates http://<server>/omega/cgi-bin/omega?DB=docoll/docoll/FMT=docoll on GETting http://<server>/docoll. Can the /docoll/ between the two parameters (shown in red in the original post) be suppressed and the & retained?
You can use a number of RewriteCond directives to capture the GET parameters, then use the RewriteRule (with R=307 flag) to do what the RedirectMatch does. Because each RewriteCond needs to have the same number of subpatterns with the same logical contents, it gets pretty complicated:
Code:
ScriptAlias /omega/cgi-bin/ /usr/lib/cgi-bin/omega/

<Directory /var/www/docoll>
    DirectorySlash  On
    Options         All
    Order           Deny,Allow
    Allow           From all
    RewriteEngine   On
    RewriteBase     /docoll
    RewriteCond     %{QUERY_STRING}   ^()&*DB=[^&]*(.*)&+FMT=[^&]*(.*)$ [OR]
    RewriteCond     %{QUERY_STRING} ^(.*)&+DB=[^&]*(.*)&+FMT=[^&]*(.*)$ [OR]
    RewriteCond     %{QUERY_STRING}   ^()&*FMT=[^&]*(.*)&+DB=[^&]*(.*)$ [OR]
    RewriteCond     %{QUERY_STRING} ^(.*)&+FMT=[^&]*(.*)&+DB=[^&]*(.*)$ [OR]
    RewriteCond     %{QUERY_STRING}   ^()&*DB=[^&]*(.*)()$  [OR]
    RewriteCond     %{QUERY_STRING} ^(.*)&+DB=[^&]*(.*)()$  [OR]
    RewriteCond     %{QUERY_STRING}   ^()&*FMT=[^&]*(.*)()$ [OR]
    RewriteCond     %{QUERY_STRING} ^(.*)&+FMT=[^&]*(.*)()$ [OR]
    RewriteCond     %{QUERY_STRING} ^(.*)()()$
    RewriteRule     ^$ /omega/cgi-bin/omega?DB=docoll&FMT=docoll&%1%2%3 [R=307,env=OMEGA_CONFIG_FILE:/etc/opt/docoll/search-0.2/omega.conf]
</Directory>
The above is untested, but it should work.

The patterns come in pairs: the odd-numbered conditions (starting with the first) match a query string that begins with one of the parameters to be dropped (DB or FMT), and the even-numbered ones match a query string that begins with some other GET parameter. The first four patterns cover the cases where both DB and FMT were specified. The next four cover the cases where only one of them was specified. The ninth pattern matches everything else.
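To see what the nine conditions capture, here is a rough Python sketch. It is not part of the configuration above. It assumes that mod_rewrite stops at the first matching condition in an [OR] chain, so that %1, %2 and %3 come from that condition; rewritten_query is a made-up helper name, and Python's re module stands in for Apache's PCRE.

```python
import re

# The nine RewriteCond patterns from the configuration above, in order.
PATTERNS = [
    r"^()&*DB=[^&]*(.*)&+FMT=[^&]*(.*)$",
    r"^(.*)&+DB=[^&]*(.*)&+FMT=[^&]*(.*)$",
    r"^()&*FMT=[^&]*(.*)&+DB=[^&]*(.*)$",
    r"^(.*)&+FMT=[^&]*(.*)&+DB=[^&]*(.*)$",
    r"^()&*DB=[^&]*(.*)()$",
    r"^(.*)&+DB=[^&]*(.*)()$",
    r"^()&*FMT=[^&]*(.*)()$",
    r"^(.*)&+FMT=[^&]*(.*)()$",
    r"^(.*)()()$",
]


def rewritten_query(query_string):
    """Return the query string the RewriteRule would build (illustrative only)."""
    for pattern in PATTERNS:
        match = re.match(pattern, query_string)
        if match:
            # Assumption: %1%2%3 are the groups of the first matching condition.
            return "DB=docoll&FMT=docoll&" + "".join(match.groups())
    return None


if __name__ == "__main__":
    for sample in ["", "x=1", "DB=foo&x=1", "a=1&DB=foo&FMT=bar&b=2"]:
        print(repr(sample), "->", rewritten_query(sample))
```

Under those assumptions, a request that carries its own DB or FMT has that value dropped, and every other parameter is appended after the fixed DB=docoll&FMT=docoll pair.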

If you omit the R=307 flag, omega will be run exactly the same way, but the users will not see the redirect.

To debug such complex rewrite rules, I recommend using an environment-dumping CGI script. I use this CGI program I wrote, env-dump.c:
Code:
#include <stdio.h>

extern char **environ;

static inline void fputcsafe(const int c, FILE *const out)
{
	if (c == '<')
		fputs("&lt;", out);
	else
	if (c == '>')
		fputs("&gt;", out);
	else
	if (c == '&')
		fputs("&amp;", out);
	else
	if (c == '"')
		fputs("&quot;", out);	/* HTML entity for a double quote */
	else
		fputc(c, out);
}

void ftablerow(const char *s, FILE *const out)
{
	if (!s || !*s)
		return;

	fputs(  "    <tr>\n"
		"     <td class=\"name\" align=\"right\" valign=\"top\">"
	      , out);

	while (*s && *s != '=')
		fputcsafe(*(s++), out);

	if (*s == '=')
		s++;

	fputs(  "</td>\n"
		"     <td class=\"value\" align=\"left\" valign=\"top\">"
	      , out);

	while (*s)
		fputcsafe(*(s++), out);

	fputs(	"</td>\n"
		"    </tr>\n"
	      , out);
	return;
}


int main(void)
{
	int i;

	fputs(	"Content-Type: text/html; charset=UTF-8\n"
		"\n"
		"<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\""
		" \"http://www.w3.org/TR/html4/loose.dtd\">\n"
		"<html>\n"
		" <head>\n"
		"  <title> CGI Environment Variables </title>\n"
		"  <meta http-equiv=\"content-type\" content=\"text/html; charset=UTF-8\">\n"
		"  <style type=\"text/css\">\n"
		"   html, body {\n"
		"    background: #ffffff;\n"
		"    color: #000000;\n"
		"   }\n"
		"   table {\n"
		"    width: 100%;\n"
		"    border-color: #666666;\n"
		"    border-width: 3px;\n"
		"    border-style: double;\n"
		"   }\n"
	      , stdout);
	fputs(	"   td {\n"
		"    font-weight: normal;\n"
		"    font-style: normal;\n"
		"    border-color: #cccccc;\n"
		"    border-width: 1px;\n"
		"    border-style: solid;\n"
		"    padding: 0.5em 1.0em 0.5em 1.0em;\n"
		"    margin: 0 0 0 0;\n"
		"   }\n"
		"   td.name {\n"
		"    width: 20em;\n"
		"    vertical-align: top;\n"
		"    text-align: right;\n"
		"   }\n"
		"   td.value {\n"
		"    vertical-align: top;\n"
		"    text-align: left;\n"
		"   }\n"
		"  </style>\n"
		" </head>\n"
		" <body>\n"
		"  <table frame=\"box\" rules=\"all\">\n"
		"   <tbody>\n"
	      , stdout);
	for (i = 0; environ[i]; i++)
		ftablerow(environ[i], stdout);
	fputs(	"   </tbody>\n"
		"  </table>\n"
		" </body>\n"
		"</html>\n"
	      , stdout);

	return 0;
}
which you can compile and install at /usr/lib/cgi-bin/omega/env-dump using
Code:
gcc -Wall -O3 -o env-dump env-dump.c
sudo install -m 0755 env-dump /usr/lib/cgi-bin/omega/env-dump
If you then replace the /omega/cgi-bin/omega in the Apache config with /omega/cgi-bin/env-dump you can see all of the environment variables, including OMEGA_CONFIG_FILE and QUERY_STRING.

After debugging, I recommend removing the environment dump CGI program, because it may help possible attackers in tailoring their attack attempts for your system.

One could use a Bash/sed/awk script instead for dumping the environment, maybe something as simple as
Code:
#!/bin/bash
echo "Content-type: text/plain; charset=UTF-8"
echo ""
env
but I prefer the nice table my C program outputs.
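For anyone who wants the table without compiling anything, here is a minimal Python sketch of the same idea. It is only an illustration: render_env_table and cgi_response are made-up names, and it relies on the standard html.escape to do the escaping that fputcsafe does in the C program.

```python
import html
import os


def render_env_table(environ):
    """Build an HTML table of name/value pairs with all values escaped."""
    rows = []
    for name in sorted(environ):
        rows.append(
            "<tr><td>{}</td><td>{}</td></tr>".format(
                html.escape(name), html.escape(environ[name])
            )
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"


def cgi_response(environ):
    """Prefix the table with the CGI header block."""
    return "Content-Type: text/html; charset=UTF-8\n\n" + render_env_table(environ)


if __name__ == "__main__":
    print(cgi_response(os.environ))
```

As with the other dumpers, this should be removed from the server once debugging is done.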

Last edited by Nominal Animal; 11-12-2011 at 11:46 AM. Reason: Added the missing 'RewriteEngine On' from the Directory directive.
 
Old 11-12-2011, 03:18 AM   #4
catkin
Quote:
Originally Posted by acid_kewpie View Post
I don't see why you care in this instance. If a static URL is returned, you don't need to mess with it: match it exactly and return exactly whatever you want. You can trivially do what you're asking for, as it's a regex, so "RedirectMatch ^(.+)/docoll/(.+) $1/$2" should work, but that's overkill if the URL is only ever one thing; just use a normal Redirect instead.
Thanks Chris

Yes, a simple Redirect could do it but I actually want to do something more complex which does need RedirectMatch. When that didn't work I simplified it to the RedirectMatch in the OP.

In this simplified RedirectMatch, I don't want any of the matched regex in the URL. That's using the terms in the Apache documentation for RedirectMatch:
Code:
RedirectMatch [status] regex URL
 
Old 11-12-2011, 03:22 AM   #5
catkin
Thanks Nominal Animal

I am not able to get even the most basic mod_rewrite functionality to work, despite re-installing and re-initialising Apache as described in this Webmaster World post.
 
Old 11-12-2011, 11:56 AM   #6
Nominal Animal
Quote:
Originally Posted by catkin View Post
Thanks Nominal Animal

I am not able to get even the most basic mod_rewrite functionality to work, despite re-installing and re-initialising Apache as described in this Webmaster World post.
Because of the way VirtualHosts are configured for Apache on Debian, RewriteRule only works within a Directory directive or .htaccess file (but only if SymlinksIfOwnerMatch or FollowSymlinks option is enabled for that directory). You also need to remember to enable RewriteEngine there, as well as specify the RewriteBase.


Let me outline a procedure I use for temporarily "baselining" Apache. This is nondestructive, and requires no package reinstalls.

I save the current configuration, and replace it with the original configuration provided by the apache2.2-common package. First, obtain a copy of the desired apache2.2-common Debian package. Extract the contents into a temporary directory:
Code:
sudo mkdir -m 0755 /tmp/apache2-common
sudo dpkg-deb -x apache2.2-common_2.2.16-6+squeeze4_i386.deb /tmp/apache2-common
Then, save the current Apache configuration (to /etc/apache2.old) and replace it with the pristine configuration from the package. You'll also need to create an empty httpd.conf, which I think is generated by the post-installation scripts. Note that this will not include any configuration files provided by other packages (like libapache2-mod-fastcgi or libapache2-mod-fcgid):
Code:
sudo mv /etc/apache2 /etc/apache2.old
sudo mv /tmp/apache2-common/etc/apache2 /etc/apache2
sudo touch /etc/apache2/httpd.conf
sudo chmod 0664 /etc/apache2/httpd.conf
sudo rm -rf /tmp/apache2-common
Other packages you have installed may provide other configuration files. To obtain the list of those, run
Code:
dpkg-query -S /etc/apache2 |
    sed -e 's|,[\t ]*|\n|g; s|:.*$||g' |
    grep -ve '^apache2\.2-common$' |
    while read PKG ; do
        echo "$PKG:"
        dpkg-query -L "$PKG" | grep -e '^/etc/apache2/'
        echo
    done
While you certainly can extract those files from their original packages, fully reinstall those packages, or copy the configuration files from /etc/apache2.old (if you have not modified those files), you do not need to do any of that for testing. If you intend to use those packages, then obviously you do need their configuration files, too.

After the above, you need to enable whatever modules you use, and the default site, using
Code:
sudo a2enmod alias authz_host cgi dir env rewrite setenvif
sudo a2ensite default
Now you have a pristine, basic, working Apache2 configuration.


Here is a simple test and debugging environment you can use to check that everything, including URL Rewriting, is working correctly after the above. I use something like the following to create and debug the most complex RewriteRule systems.

Create a simple CGI shell script /usr/lib/cgi-bin/env to emit your environment variables:
Code:
cat >/tmp/env <<'END'
#!/bin/sh
echo 'Content-Type: text/plain; charset=UTF-8'
echo ''
env
END
sudo install -m 0755 /tmp/env /usr/lib/cgi-bin/env
rm -f /tmp/env
Actually, I prefer to use the env-dump I listed earlier, but this trivial shell script provides the same information, just in plain text form.

Create a test Apache configuration file, for example /etc/apache2/conf.d/rewrite-test:
Code:
cat >/tmp/rewrite-test <<'END'
<Directory /var/www>
    Options         FollowSymlinks
    RewriteEngine   On
    RewriteBase     /
    RewriteRule     ^(rewrite[/-]?.*)$ /cgi-bin/env/$1 [L,E=EXAMPLE_VAR:/$1]
</Directory>

<Directory /usr/lib/cgi-bin>
    <Files env>
        SetEnvIf REDIRECT_EXAMPLE_VAR ^(.*)$ EXAMPLE_VAR=$1
    </Files>
</Directory>

END
sudo install -m 0644 /tmp/rewrite-test /etc/apache2/conf.d/rewrite-test
rm -f /tmp/rewrite-test
Note that you need both to enable the RewriteEngine and to set a RewriteBase to get consistent results when using RewriteRule inside a Directory directive. The RewriteRule also sets a custom environment variable, EXAMPLE_VAR, to the original URL path. However, since this is an internal redirection, Apache ends up renaming it to REDIRECT_EXAMPLE_VAR; to get the desired environment variable name you need to use SetEnvIf in the cgi directory, like I did above. In a Directory context the directory prefix, including its trailing slash, is stripped before matching, so RewriteRules never see a leading slash. To get an absolute URL path into EXAMPLE_VAR, one must prepend the RewriteBase string to the captured subpattern; above, this is just /.

Finally, stop Apache if it is running, and start it. Reloading is normally sufficient, but we want to make sure it is started.
Code:
sudo /etc/init.d/apache2 stop
sudo /etc/init.d/apache2 start
The CGI script will now dump the environment variables using URL http://127.0.0.1/cgi-bin/env.

Any URL path beginning with rewrite will be rewritten (internally redirected, without showing the user the actual redirected URL) to the environment dumper too; the [/-]? in the pattern is optional, so rewrite-, rewrite/ and even a bare rewrite prefix all match. For example, try http://127.0.0.1/rewrite-this and http://127.0.0.1/rewrite/something/completely/different.


Could you please verify the above works for you? It should, for everyone using a Debian-based distribution.


After testing, it is important to remove the /usr/lib/cgi-bin/env script, since it includes information that may assist attackers in tailoring an attack to your system:
Code:
sudo rm -f /usr/lib/cgi-bin/env
The key difference between my method and just reinstalling the packages is that you can still revert to your earlier configuration, simply by swapping the names of /etc/apache2 (the current "testing" configuration) and /etc/apache2.old (the old configuration, before any of this). Just remember to run
Code:
sudo /etc/init.d/apache2 reload
afterwards to make Apache reload its configuration.

Last edited by Nominal Animal; 11-12-2011 at 02:40 PM. Reason: Fixed Symlinks -> SymlinksIfOwnerMatch
 
Old 11-12-2011, 01:05 PM   #7
catkin
Quote:
Originally Posted by Nominal Animal View Post
Because of the way VirtualHosts are configured for Apache on Debian, RewriteRule only works within a Directory directive or .htaccess file (but only if Symlinks or FollowSymlinks option is enabled for that directory).
Great!

And thanks for bearing with me on this saga when I am so independent-minded, not taking advice verbatim and always questioning it.

That got rewriting working. I should probably have realised that from earlier advice you have given.

Experimentation, using a Directory section not a .htaccess file, established that FollowSymlinks (and only FollowSymlinks) is required (I could not find plain Symlinks in the Options documentation -- was it a typo for SymLinksIfOwnerMatch?). Why should Apache designers have required FollowSymlinks? No symlinks are involved!

Last edited by catkin; 11-12-2011 at 01:07 PM.
 
Old 11-12-2011, 04:27 PM   #8
Nominal Animal
Quote:
Originally Posted by catkin View Post
And thanks for bearing with me on this saga when I am so independent-minded, not taking advice verbatim and always questioning it.
Not at all. In fact, I like that attitude a lot. Critical examination is the only way you can find bugs and weaknesses, and develop something even better.

Quote:
Originally Posted by catkin View Post
Experimentation, using a Directory section not a .htaccess file, established that FollowSymlinks (and only FollowSymlinks) is required (I could not find plain Symlinks in the Options documentation -- was it a typo for SymLinksIfOwnerMatch?).
Yes, exactly: it was a typo. I think I was interrupted when typing that.

Quote:
Originally Posted by catkin View Post
Why should Apache designers have required FollowSymlinks? No symlinks are involved!
It would be quite complicated to extend the core Options directive from within the mod_rewrite module; the core Options code would have to be modified to implement a new option.

On the other hand, it is very simple and easy to just reuse FollowSymlinks/SymlinksIfOwnerMatch for mod_rewrite; something like one additional condition test to certain if clauses within mod_rewrite code.

It turns out that reusing FollowSymlinks for this works for just about every conceivable case. Nobody can think of a real world scenario where FollowSymlinks/SymlinksIfOwnerMatch should be disabled but RewriteEngine enabled, or vice versa. The logical disparity is just not important enough. There is a lot of written documentation and user knowledge "invested" here, and the code would touch very security-sensitive core code. (Options directive is at the heart of Apache security features; if it bugs out, Apache is likely to publish content the administrator did not want it to.) The new code would have to be carefully examined for bugs, documentation updated, and the code maintained for the foreseeable future. Without a real world use case it simply is not worth the change.

If you happen to encounter such a real world scenario, though, the balance may change. The patch itself (to add say ChangeURL option to Options) is easily written, but getting it vetted and accepted upstream, with relevant documentation changes, is quite a bit of work.


I also tested my earlier multi-RewriteCond suggestion, and it seems to work fine for me. It does emit two consecutive ampersands in some cases, but that is perfectly okay (the RFCs specify "at least one" separator between name=value pairs).

That can be avoided by splitting the RewriteRule. Here is the test configuration (that redirects to /cgi-bin/env) that you can try, if you want "perfect" URLs:
Code:
    RewriteEngine   On
    RewriteBase     /docoll

    RewriteCond     %{QUERY_STRING} ^&*(.+)&+DB=[^&]*(.*)&+FMT=[^&]*(.*)$ [OR]
    RewriteCond     %{QUERY_STRING} ^&*(.+)&+FMT=[^&]*(.*)&+DB=[^&]*(.*)$
    RewriteRule     ^$ /cgi-bin/env?DB=docoll&FMT=docoll&%1%2%3 [L,env=OMEGA_CONFIG_FILE:/etc/opt/docoll/search-0.2/omega.conf]

    RewriteCond     %{QUERY_STRING} ^&*DB=[^&]*(.*)&+FMT=[^&]*&*(.*)$ [OR]
    RewriteCond     %{QUERY_STRING} ^&*FMT=[^&]*(.*)&+DB=[^&]*&*(.*)$ [OR]
    RewriteCond     %{QUERY_STRING} ^&*(.+)&+DB=[^&]*(.*)$            [OR]
    RewriteCond     %{QUERY_STRING} ^&*(.+)&+FMT=[^&]*(.*)$
    RewriteRule     ^$ /cgi-bin/env?DB=docoll&FMT=docoll&%1%2 [L,env=OMEGA_CONFIG_FILE:/etc/opt/docoll/search-0.2/omega.conf]

    RewriteCond     %{QUERY_STRING} ^&*DB=[^&]*&*(.*)$  [OR]
    RewriteCond     %{QUERY_STRING} ^&*FMT=[^&]*&*(.*)$ [OR]
    RewriteCond     %{QUERY_STRING} ^(.*)$
    RewriteRule     ^$ /cgi-bin/env?DB=docoll&FMT=docoll&%1 [L,env=OMEGA_CONFIG_FILE:/etc/opt/docoll/search-0.2/omega.conf]
The reason I brought this up is that the xapian-omega GET parsing seems buggy to me. It stops parsing query parameters if it encounters consecutive ampersands, which it absolutely should not do; it does not follow the RFCs at all!

The code on lines 182..186 in xapian-omega-1.2.7/cgiparam.cc:decode_get() should be
Code:
	    if (ch == '\0' || ch == '&') {
		if (!name.empty())
			add_param(name, val);
		if (ch == '\0')
			return;
		break;
	    }
The function also has a trivial buffer overrun problem (given a URL ending with %, the function will parse some other environment variable as a query parameter). The code on lines 190..197 in xapian-omega-1.2.7/cgiparam.cc:decode_get() should be
Code:
	    else if (ch == '%') {
		int c = 0;
		do {
		    if (q_str[0] >= '0' && q_str[0] <= '9')
			c = q_str[0] - '0';
		    else
		    if (q_str[0] >= 'A' && q_str[0] <= 'F')
			c = q_str[0] - 'A' + 10;
		    else
		    if (q_str[0] >= 'a' && q_str[0] <= 'f')
			c = q_str[0] - 'a' + 10;
		    else
			break;

		    if (q_str[1] >= '0' && q_str[1] <= '9')
			c = 16 * c + q_str[1] - '0';
		    else
		    if (q_str[1] >= 'A' && q_str[1] <= 'F')
			c = 16 * c + q_str[1] - 'A' + 10;
		    else
		    if (q_str[1] >= 'a' && q_str[1] <= 'f')
			c = 16 * c + q_str[1] - 'a' + 10;
		    else
			break;

		    if (c) {
			ch = c;
			q_str += 2;
		    }
		} while (0);
	    }
which will only consume correct, non-NUL percent escape sequences, and will handle things like trailing percent signs correctly.
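To make the intended behaviour concrete, here is a small Python sketch of a parser with the same rules: skip empty segments between ampersands, turn + into a space, and decode only well-formed, non-NUL percent escapes. It is an illustration, not omega's code. decode_component and decode_get are names chosen for this sketch, and it maps each escape to a single character rather than decoding multi-byte UTF-8.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def decode_component(text):
    """Decode '+' and well-formed, non-NUL %XX escapes; leave the rest alone."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        i += 1
        if ch == "+":
            ch = " "
        elif ch == "%":
            pair = text[i:i + 2]
            if len(pair) == 2 and all(c in HEX_DIGITS for c in pair):
                code = int(pair, 16)
                if code:  # keep "%00" literal rather than emit a NUL
                    ch = chr(code)
                    i += 2
        out.append(ch)
    return "".join(out)


def decode_get(query_string):
    """Return a list of (name, value) pairs from a raw query string."""
    params = []
    for segment in query_string.split("&"):
        raw_name, _, raw_value = segment.partition("=")
        name = decode_component(raw_name)
        if name:  # skip empty segments such as the gap in "a=1&&b=2"
            params.append((name, decode_component(raw_value)))
    return params


if __name__ == "__main__":
    print(decode_get("DB=docoll&&FMT=docoll&q=100%"))
```

A trailing % or a malformed escape such as %zz is passed through unchanged in this sketch, which is the behaviour the fixed C++ code above aims for.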
 
Old 11-12-2011, 11:04 PM   #9
catkin
Would you like me to post the fixed code to the mailing list, include it in a bug report or do you want to do it?
 
Old 11-13-2011, 12:22 AM   #10
Nominal Animal
Feel free to push it upstream. I'd rather not get involved in a discussion whether the problems are big enough to fix or not.

Here are the changes in a suggested patch form; you may have better luck in pushing it upstream in this form.
Code:
This patch fixes omega GET query parsing.

Without this patch:
 - if the QUERY_STRING ends with a '%', the parsing routine
   will overrun the QUERY_STRING (if it is followed in memory
   by only a single NUL byte).
 - omega stops parsing GET query parameters if there are
   two consecutive ampersands in QUERY_STRING.
 - omega tries to parse anything following a percent sign.
   Other implementations do not try to parse malformed escape
   sequences.

diff -Naur xapian-omega-1.2.7-original/cgiparam.cc xapian-omega-1.2.7/cgiparam.cc
--- xapian-omega-1.2.7-original/cgiparam.cc	2011-08-10 09:49:12.000000000 +0300
+++ xapian-omega-1.2.7/cgiparam.cc	2011-11-13 08:10:15.021566998 +0200
@@ -180,20 +180,28 @@
 	while (1) {
 	    ch = *q_str++;
 	    if (ch == '\0' || ch == '&') {
-		if (name.empty()) return; // end on blank line
-		add_param(name, val);
+		if (!name.empty()) add_param(name, val);
+		if (ch == '\0')
+			return;
 		break;
 	    }
 	    char orig_ch = ch;
 	    if (ch == '+')
 		ch = ' ';
-	    else if (ch == '%') {
-		int c = *q_str++;
-		ch = (c & 0xf) + ((c & 64) ? 9 : 0);
-		if (c) c = *q_str++;
-		ch = ch << 4;
-		ch |= (c & 0xf) + ((c & 64) ? 9 : 0);
-		if (!c) return; // unfinished % code
+	    else if (ch == '%' &&
+		     ((q_str[0] >= '0' && q_str[0] <= '9') ||
+		      (q_str[0] >= 'A' && q_str[0] <= 'F') ||
+		      (q_str[0] >= 'a' && q_str[0] <= 'f')) &&
+		     ((q_str[1] >= '0' && q_str[1] <= '9') ||
+		      (q_str[1] >= 'A' && q_str[1] <= 'F') ||
+		      (q_str[1] >= 'a' && q_str[1] <= 'f'))) {
+		const int c1 = q_str[0], c2 = q_str[1];
+		const int c = (((c1 & 0xf) + ((c1 & 64) ? 9 : 0)) << 4)
+			    | ((c2 & 0xf) + ((c2 & 64) ? 9 : 0));
+		q_str += 2;
+		if (!c)
+		    continue;
+		ch = c;
 	    }
 	    if (had_equals) {
 		val += char(ch);
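The expression (c & 0xf) + ((c & 64) ? 9 : 0) used in the patch turns one ASCII hex digit into its value. The Python sketch below mirrors it for illustration; hex_nibble and decode_hex_pair are made-up names, and the trick is only valid for characters already known to be hex digits. One thing to watch when combining the two nibbles: in both C and Python, + binds tighter than <<, so the shifted high nibble needs its own parentheses.

```python
def hex_nibble(char):
    """Value of one ASCII hex digit, using the bit trick from the patch."""
    code = ord(char)
    # Digits '0'..'9' have bit 6 clear; letters 'A'..'F' and 'a'..'f' have it set,
    # and their low four bits are 1..6, so adding 9 yields 10..15.
    return (code & 0xF) + (9 if code & 64 else 0)


def decode_hex_pair(high, low):
    """Combine two hex digits into one byte value."""
    return (hex_nibble(high) << 4) | hex_nibble(low)


if __name__ == "__main__":
    print(decode_hex_pair("2", "0"))  # value of the escape %20
    print(1 << 4 + 2)                 # parsed as 1 << (4 + 2)
```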
 
Old 11-13-2011, 11:46 PM   #11
catkin
Quote:
Originally Posted by Nominal Animal View Post
Feel free to push it upstream.
Done (the patch as mangled in the report looks sane in the notification email)
 
Old 11-14-2011, 04:23 AM   #12
catkin
env-dump.c

Quote:
Originally Posted by Nominal Animal View Post
To debug such complex rewrite rules, I recommend using an environment-dumping CGI script. I use this CGI program I wrote, env-dump.c:
Thanks for env-dump.c but two issues:
  1. After changing """ to '"' on line 17 ...
  2. Code:
    gcc -Wall -O3 -o env-dump env-dump.c
    env-dump.c: In function 'fputcsafe':
    env-dump.c:17: warning: passing argument 1 of 'fputs' makes pointer from integer without a cast
    /usr/include/stdio.h:662: note: expected 'const char * __restrict__' but argument is of type 'int'
 
Old 11-14-2011, 04:53 AM   #13
catkin
Quote:
Originally Posted by Nominal Animal View Post
Could you please verify the above works for you? It should, for everyone using a Debian-based distribution.
Yes thanks, the simple CGI shell script /usr/lib/cgi-bin/env and the example /etc/apache2/conf.d/rewrite-test result in informative pages on browsing http://127.0.0.1/cgi-bin/env, http://127.0.0.1/rewrite-this and http://127.0.0.1/rewrite/something/completely/different except the server is headless so 127.0.0.1 was changed to suit.
 
Old 11-14-2011, 07:35 AM   #14
catkin
Thanks to Nominal Animal's "Because of the way VirtualHosts are configured for Apache on Debian, RewriteRule only works within a Directory directive ...", the OP requirement was solved by using a RewriteRule instead of a RedirectMatch. This minimal Directory section does it, and also shows the more complex case of redirecting /var/www/docoll/<instance name> to a CGI call with <instance name> substituted into the query string:
Code:
<Directory /var/www/docoll>
    Options       FollowSymlinks
    RewriteEngine On
    RewriteRule  ^$ /cgi-bin/env?DB=docoll&FMT=docoll [QSA,L]
    RewriteRule  ^([^/]*)/*$ /cgi-bin/env?DB=$1&FMT=$1 [QSA,L]
</Directory>
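A quick way to reason about these two rules is to model them in a few lines of Python. This is only a sketch: rewrite is a made-up helper, the path is the part after /docoll/, and QSA is modelled as appending the original query string after an ampersand.

```python
import re

# (pattern, query template) pairs mirroring the two RewriteRules above.
RULES = [
    (r"^$", r"DB=docoll&FMT=docoll"),
    (r"^([^/]*)/*$", r"DB=\1&FMT=\1"),
]


def rewrite(relative_path, original_query=""):
    """Return the rewritten target for a path relative to /docoll/, or None."""
    for pattern, template in RULES:
        match = re.match(pattern, relative_path)
        if match:
            query = match.expand(template)
            if original_query:
                # QSA: keep the caller's query string after the new one.
                query += "&" + original_query
            # L: the first matching rule ends processing.
            return "/cgi-bin/env?" + query
    return None


if __name__ == "__main__":
    print(rewrite(""))
    print(rewrite("manuals/", "P=apache"))
    print(rewrite("a/b"))
```

In this model a path with more than one component, such as a/b, matches neither rule and is left alone.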
 
  

