LinuxQuestions.org
Linux - Networking This forum is for any issue related to networks or networking.
Routing, network cards, OSI, etc. Anything is fair game.

Old 11-16-2008, 06:42 AM   #1
shreks
Member
 
Registered: Aug 2004
Posts: 79

Rep: Reputation: 15
Web Page Access and TCP Session


Hi All,

I have a question about how an HTTP web server partitions the data of a web page across multiple TCP sessions. Specifically, is there any identification in the TCP session data that web browsers use to reconstruct the original web page? Obviously the data from those TCP sessions must be put in order, and none of it can be left out, for a complete reconstruction. Where would such identification be located: in the TCP packet header, in a TCP session header, in the HTTP data, or elsewhere? Thanks a lot for any comments!
 
Old 11-16-2008, 10:02 AM   #2
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 39,835

Rep: Reputation: 1118
If I understand the question, I think you've misunderstood a few things about what TCP does. TCP has nothing to do with HTTP, other than HTTP being an application protocol carried within TCP. A web page will be multiple HTTP requests over multiple TCP connections. Often a single connection can be used for multiple HTTP GETs, but at the same time multiple connections are used for speed. TCP implicitly ensures that each piece of data passed is correctly rebuilt as a single entity, but as a web page is usually about 30 different things, it's down to the HTML engine to reassemble each piece of HTML, GIF, Flash into a page, irrespective of the network transport.
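To make the split between the two layers concrete, here is a minimal Python sketch, not how any particular browser or server does it. It assumes TCP has already handed over one ordered byte stream from a single reused connection, and that every response carries a Content-Length header (no chunked encoding). The names split_responses and make_response and the sample bodies are made up for illustration. The point is that the boundaries and the type of each piece come from the HTTP headers, not from anything in the TCP headers.

```python
from email.parser import BytesParser


def split_responses(stream: bytes):
    """Split an already reassembled TCP byte stream into HTTP responses.

    Assumption: every response has a Content-Length header.
    Returns a list of (status_line, headers, body) tuples.
    """
    responses = []
    pos = 0
    while pos < len(stream):
        # The header block ends at the first blank line.
        head_end = stream.index(b"\r\n\r\n", pos)
        status_line, _, raw_headers = stream[pos:head_end].partition(b"\r\n")
        # HTTP headers use the same "Name: value" syntax as mail headers.
        headers = BytesParser().parsebytes(raw_headers + b"\r\n\r\n", headersonly=True)
        length = int(headers["Content-Length"])
        body_start = head_end + 4
        body = stream[body_start:body_start + length]
        responses.append((status_line.decode("ascii"), headers, body))
        # The next response starts right after this body.
        pos = body_start + length
    return responses


def make_response(content_type: str, body: bytes) -> bytes:
    """Build a toy HTTP/1.1 response (hypothetical helper for the demo)."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


# Two responses back to back, as they might arrive on one reused connection.
# The GIF body is placeholder bytes, not a real image.
stream = make_response("text/html", b'<img src="logo.gif" />') + make_response(
    "image/gif", b"GIF89a..."
)

for status, headers, body in split_responses(stream):
    print(status, headers.get_content_type(), len(body))
```

Each tuple is one complete HTTP object. Which object belongs where on the page is decided later by the HTML itself, not by the transport.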
 
Old 11-16-2008, 07:56 PM   #3
shreks
Member
 
Registered: Aug 2004
Posts: 79

Original Poster
Rep: Reputation: 15
Hi Chris,

Thank you for your valuable inputs!

Yes, you understood the question correctly! That is what I had imagined. So my ultimate goal is to find out the relationships among the received TCP sessions. There must be some sort of identifier that web clients can use to reassemble these TCP sessions into a complete image of a web page. Maybe I should look at the design of the web server software to find the tag that the server puts into the TCP session data.

Give me some hints if anyone happens to know. Many thanks!
 
Old 11-17-2008, 03:49 AM   #4
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 39,835

Rep: Reputation: 1118
Web clients and TCP sessions never meet up. The TCP/IP stack returns to the app a number of pieces of HTTP data containing HTML, PNG, etc. The browser has no business caring about TCP itself once the data is back in the realm of the browser.

So the browser wants index.html. It asks its network stack for an HTTP GET of index.html, and the network stack goes away, opens a new TCP socket to the server, requests the data and passes it back to the browser side of the app. The browser then reads the HTML, sees an <img /> tag and requests that GIF file from the network stack. The network stack will then open another socket, or reuse the existing one, to request it, and passes the result back again, and so it goes on. Please note the demarcation of it all: there are clear lines of responsibility within the app that keep different things very separate.
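As a rough sketch of that flow in Python (an illustration under assumptions, not how a real browser is built): the "browser side" below only parses HTML and decides what to ask for next, while everything network related hides behind a fetch callable. ImageCollector, load_page, fake_fetch and the example.com URLs are all made up for the demo, and the fake server is just a dictionary so the sketch runs without a network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def load_page(url, fetch):
    """Fetch a page, then fetch every image it references.

    `fetch` stands in for the network stack: it takes a URL and returns
    bytes. This function never sees sockets or TCP.
    """
    html = fetch(url)
    collector = ImageCollector()
    collector.feed(html.decode("utf-8", errors="replace"))
    resources = {}
    for src in collector.sources:
        absolute = urljoin(url, src)  # resolve relative to the page URL
        resources[absolute] = fetch(absolute)
    return html, resources


# Stand-in for the server and network stack (placeholder content).
FAKE_SERVER = {
    "http://example.com/index.html": b'<html><body><img src="logo.gif" /></body></html>',
    "http://example.com/logo.gif": b"GIF89a...",
}


def fake_fetch(url):
    return FAKE_SERVER[url]


html, resources = load_page("http://example.com/index.html", fake_fetch)
print(sorted(resources))
```

Swapping fake_fetch for something like `lambda u: urllib.request.urlopen(u).read()` would go over a real network. Whether that opens a new socket or reuses one is then the library's business, and load_page would not change.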

Last edited by acid_kewpie; 11-17-2008 at 03:53 AM.
 
Old 11-17-2008, 09:58 AM   #5
shreks
Member
 
Registered: Aug 2004
Posts: 79

Original Poster
Rep: Reputation: 15
Thank you very much, Chris! Very clear and to the point! Let me study the protocols and relevant techniques a bit more to get a better understanding of how data is transferred and organized between web servers and client browsers. All in all, the way data is transferred back and forth depends on the dynamic communication between them.

BTW, I am wondering whether there are static HTTP protocol/header parsers that can extract data blocks of a given MIME type from a raw dump of HTTP communication data. I mean development tools and libraries. Many thanks again!
 
  


All times are GMT -5.
