Old 04-27-2009, 01:57 AM   #1
Registered: Sep 2004
Posts: 86

Rep: Reputation: 17
http relay - tracking http requests

I need to track call center agents' interaction with a website. Specifically, when an agent enters information into a certain page, I need to write that information to a database.

My solution has been to have the agents connect to an internal website that forwards all communication to the external website.

The simplest solution would be to write a custom relay application that inspects the communication between the agent's browser and the web server and looks for a POST containing the information that needs to be saved. However, there are two problems:

1. The external website uses https, so all communication is encrypted. It would be OK for the agent to connect to the relay over http, since it is behind the corporate firewall, and have the relay connect to the external website over https. However, I think that adding http -> https translation to the custom relay application is a non-trivial piece of work.

2. The pages returned by the server contain absolute URLs and redirections back to that website. Thus the user may connect to the relay but then be redirected back to the real website, or links in the returned content might point back to the website, so by the time the user gets to the page of interest they are no longer connected through the relay.
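
To make problem 2 concrete, here is a rough Python sketch of the rewriting a relay would need to do on both redirect targets and page content. The host names are hypothetical, and the body rewrite is a naive substring replacement rather than real HTML parsing, so treat it as an illustration only.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical origins, for illustration only.
EXTERNAL_ORIGIN = "https://external.example.com"
RELAY_ORIGIN = "http://relay.internal.example"


def rewrite_location(location: str) -> str:
    """Point a redirect target at the relay if it targets the external site."""
    parts = urlsplit(location)
    external = urlsplit(EXTERNAL_ORIGIN)
    if parts.netloc.lower() != external.netloc:
        return location  # relative URL or some other host: leave untouched
    relay = urlsplit(RELAY_ORIGIN)
    return urlunsplit(
        (relay.scheme, relay.netloc, parts.path, parts.query, parts.fragment)
    )


def rewrite_body(html: str) -> str:
    """Replace absolute links to the external site with links to the relay."""
    # Naive substring replacement; also catches the http:// form of the origin.
    http_variant = "http://" + urlsplit(EXTERNAL_ORIGIN).netloc
    return html.replace(EXTERNAL_ORIGIN, RELAY_ORIGIN).replace(
        http_variant, RELAY_ORIGIN
    )
```

The redirect case has to be handled on the response headers (Location), not only on the body, which is why two separate functions are shown.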

To solve #1 I decided to use Apache as the basis for my relay and have it deal with the http -> https translation.

Initially I tried to accomplish the relay functionality with Apache's proxy and filter modules. You can set up Apache as a reverse proxy and have it translate from http to https, and with an output filter you can inspect all returned content and translate URLs so that they point to the relay. However, it turned out that redirects from the target server escape the filter and are passed back by the Apache proxy to the client.

So next I wrote a CGI that plugs into Apache and uses wget to fetch the content. The user accesses pages from the local server, which translates URLs in both the outgoing and incoming content. The CGI uses wget over https to access the external website. I also take advantage of wget's ability to cache pages on the local web server.

It works! The only problem is that it is slow. I think that is because wget closes the connection after each request (since it is invoked once per request) and takes a long time to negotiate the SSL (https) connection each time. In contrast, when the browser connects directly to the external website over https, it keeps the connection open.
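
For comparison, this is roughly what keeping the connection open looks like in code: a minimal Python sketch using the standard http.client module. The ConnectionPool class and its method names are made up for illustration, and it only helps if it lives in a long-running process, which a plain CGI started once per request is not.

```python
import http.client


class ConnectionPool:
    """Keep one HTTPS connection per host so the SSL handshake is paid once."""

    def __init__(self):
        self._connections = {}

    def get(self, host: str) -> http.client.HTTPSConnection:
        # Constructing the object does not open a socket; the connection
        # is made lazily on the first request().
        if host not in self._connections:
            self._connections[host] = http.client.HTTPSConnection(host, timeout=30)
        return self._connections[host]

    def fetch(self, host: str, path: str) -> bytes:
        conn = self.get(host)
        conn.request("GET", path, headers={"Connection": "keep-alive"})
        response = conn.getresponse()
        # The body must be read completely before the connection is reused.
        return response.read()

    def close(self):
        for conn in self._connections.values():
            conn.close()
        self._connections.clear()
```

Repeated calls to fetch() for the same host go over the same connection as long as the server keeps it open, so only the first one pays for the handshake.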

So I'm looking for a better solution, short of implementing a complete http -> https relay.
Old 04-28-2009, 08:52 AM   #2
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1976
If you have issues with the hostnames that the box contacts, how about modifying your DNS so that it tells your clients that the hostname in the URL is actually your local box? It's not horribly nice in the first instance, but not that uncommon, albeit usually in different architectures. Quite what you do with the request once it hits your box can still be an issue, but your existing server-side SSL on Apache may still sort that out.

You could do something hopefully simpler without Apache too. You could just forward requests to the remote server via an SSL connection handled by a widget like stunnel. Here, you'd only need to get the clients (via DNS entries) to hit the port on the stunnel box where it listens for plaintext connections, and then squirt the traffic through. There's no actual HTTP inspection in this setup, but you could use something like ngrep to recognise various bits of data straight off the wire without terminating the connection or interfering with it in any other way.
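
As a rough idea of what recognising the data off the wire involves on the plaintext side, here is a small Python sketch that pulls form fields out of a captured POST. The function name and the watched path are hypothetical, and it assumes the whole request arrives in one buffer with a URL-encoded form body, which real traffic won't always do.

```python
from urllib.parse import parse_qs


def extract_form_fields(raw_request: bytes, watched_path: str) -> dict:
    """Return form fields from a raw POST to watched_path, or {} otherwise."""
    head, separator, body = raw_request.partition(b"\r\n\r\n")
    if not separator:
        return {}  # headers are incomplete
    request_line = head.split(b"\r\n", 1)[0].decode("latin-1")
    parts = request_line.split(" ")
    if len(parts) != 3:
        return {}
    method, target, _version = parts
    if method != "POST" or target.split("?", 1)[0] != watched_path:
        return {}
    fields = parse_qs(body.decode("utf-8", errors="replace"))
    # parse_qs returns a list per key; keep the first value of each.
    return {name: values[0] for name, values in fields.items()}
```

The returned dictionary is what would then be written to the database.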

More formally again, Squid really should be able to do server-side SSL too, but a brief Google search hasn't shown anything. It was from there that I drifted towards stunnel...

Last edited by acid_kewpie; 04-28-2009 at 08:58 AM.
Old 04-28-2009, 07:05 PM   #3
Registered: May 2001
Posts: 29,361
Blog Entries: 55

Rep: Reputation: 3547
...another way to bridge HTTP to HTTPS could be DeleGate (a multi-purpose application-level gateway or proxy server); see the example: A universal TLS gateway by DeleGate.

