LinuxQuestions.org
Linux - Networking This forum is for any issue related to networks or networking.
Routing, network cards, OSI, etc. Anything is fair game.

Old 08-01-2021, 09:22 AM   #1
dedec0
Senior Member
 
Registered: May 2007
Posts: 1,372

Rep: Reputation: 51
Expand/transform an URL which is inside a credential accessed site


Where I study, there is a site that all students use for things related to the institution. Within this site there is also access to a subsystem that runs Moodle, which hosts separate areas for each discipline at any given time.

Inside that Moodle, teachers sometimes share a URL with students, but the real URL is hidden. That is, what we see is a URL within the school domain:

https:// virtual. school. br/2021/mod/url/view. php?id=13531

But if we visit it, we are redirected to a (possibly external) site. And there is the problem: sometimes teachers share bad URLs which I *do not* want to visit with the browser profile I use to access the school and other "serious" things.

Pages that expand short/redirected URLs do not work here because they are not logged in, so they will always give, as the result, the URL

https:// systems. school. br/idp/login. jsp

I searched for a browser extension (for Firefox 52, in this situation) that could expand this URL using my existing session cookies to get the correct result. I did not find one, neither for newer Firefox nor for other browsers.

It may also be possible to solve this with a bit of simple JavaScript programming, or with the browser's developer tools. I even thought about opening this thread in the programming or general forums. Well, this one is a good bet, I think.

Ideas? Solutions?

Last edited by dedec0; 08-01-2021 at 09:27 AM.
 
Old 08-01-2021, 10:40 AM   #2
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,312
Blog Entries: 3

Rep: Reputation: 3722
Can you watch the response codes using wget or a home made perl or python script?

Code:
wget -S -O /dev/null https:// virtual. school. br/2021/mod/url/view. php?id=13531
You'll have to add in the authentication options as well, since you are logging in to Moodle.

It should show some response codes in the 300 range. If so, you have something you can work with when you write your own browser plug-in.
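
A rough Python counterpart of the wget call above, offered as a sketch rather than a finished tool. `http.client` never follows redirects by itself, so one request shows exactly one hop. The names `split_url` and `peek` and the optional `cookie` argument are assumptions made for illustration; the cookie string would be whatever session cookie the logged-in browser holds.

```python
from http.client import HTTPConnection, HTTPSConnection
from urllib.parse import urlsplit


def split_url(url):
    """Return (scheme, host[:port], request target) for an absolute URL."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return parts.scheme, parts.netloc, target


def peek(url, cookie=None, timeout=10):
    """Send one GET request and return (status, reason, Location header).

    Nothing is followed: a 3xx answer is simply reported to the caller,
    and the response body is never read.
    """
    scheme, host, target = split_url(url)
    conn_class = HTTPSConnection if scheme == "https" else HTTPConnection
    conn = conn_class(host, timeout=timeout)
    headers = {"Cookie": cookie} if cookie else {}
    try:
        conn.request("GET", target, headers=headers)
        resp = conn.getresponse()
        return resp.status, resp.reason, resp.getheader("Location")
    finally:
        conn.close()
```

Calling `peek(url)` without a cookie should report the same first 3xx hop that wget shows; with a valid session cookie the `Location` value may instead be the real target.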
 
Old 08-01-2021, 02:26 PM   #3
dedec0
Senior Member
 
Registered: May 2007
Posts: 1,372

Original Poster
Rep: Reputation: 51
Thumbs up

Quote:
Originally Posted by Turbocapitalist View Post
Can you watch the response codes using wget or a home made perl or python script?
Sure. I even have a good Python setup now, and Perl should already be OK too: I use at least one command that is written purely in Perl (maybe there are more; I have not examined everything).

Quote:
Originally Posted by Turbocapitalist View Post
Code:
wget -S -O /dev/null https:// virtual. school. br/2021/mod/url/view. php?id=13531
I ran the above command and formatted its output, editing all the possibly sensitive data I saw. The output is below, but I forgot to run it with the POSIX locale, which would have made all the messages understandable to you. I think it is acceptable in this case, but ask me if you want me to redo it.

Code:
$ wget -S -O /dev/null https:// virtual. school.
br/2021/mod/url/view. php?id=13531

--2021-08-01 15:30:07--  https://virtual. school
.br/2021/mod/url/view. php?id=13531

Resolvendo virtual. school. br (virtual. school. br)...
123.456.789.11

Conectando-se a virtual. school. br (virtual. school.
br)|123.456.789.11|:443... conectado.

A requisição HTTP foi enviada, aguardando resposta... 

  HTTP/1.1 303 See Other

  Server: nginx/1.10.1

  Date: Sun, 01 Aug 2021 18:30:07 GMT

  Content-Type: text/html; charset=utf-8

  Transfer-Encoding: chunked

  Connection: keep-alive

  Set-Cookie: MoodleSession2021=and8ejbnayejinvalidkwqwert;
  path=/2021/; domain=school. br; secure; HttpOnly

  Expires: Thu, 19 Nov 1981 08:76:54 GMT

  Cache-Control: no-store, no-cache, must-revalidate

  Pragma: no-cache

  X-Redirect-By: Moodle

  Location: https://virtual. school. br/2021/login/index. php

  Content-Language: pt-br

  Strict-Transport-Security: max-age=13579000; includeSubDomains;
  preload;

  X-Content-Type-Options: nosniff

  X-XSS-Protection: 1; mode=block

  X-Robots-Tag: none

  X-Download-Options: noopen

  X-Permitted-Cross-Domain-Policies: none

  X-Frame-Options: SAMEORIGIN

  Referrer-Policy: no-referrer

Localização: https://virtual. school. br/2021/login/index. php
[redirecionando]

--2021-08-01 15:30:07--  https://virtual. school.
br/2021/login/index. php

Reaproveitando a conexão existente para virtual. school. br:443.

A requisição HTTP foi enviada, aguardando resposta... 

  HTTP/1.1 303 See Other

  Server: nginx/1.10.1

  Date: Sun, 01 Aug 2021 18:30:07 GMT

  Content-Type: text/html; charset=utf-8

  Transfer-Encoding: chunked

  Connection: keep-alive

  Expires: Thu, 19 Nov 1981 08:52:00 GMT

  Cache-Control: no-store, no-cache, must-revalidate

  Pragma: no-cache

  X-Redirect-By: Moodle

  Location: https://virtual. school. br/2021/auth/shibboleth/

  Content-Language: pt-br

  Strict-Transport-Security: max-age=15768000; includeSubDomains;
  preload;

  X-Content-Type-Options: nosniff

  X-XSS-Protection: 1; mode=block

  X-Robots-Tag: none

  X-Download-Options: noopen

  X-Permitted-Cross-Domain-Policies: none

  X-Frame-Options: SAMEORIGIN

  Referrer-Policy: no-referrer

Localização: https://virtual. school. br/2021/auth/shibboleth/
[redirecionando]

--2021-08-01 15:30:07--  https://virtual. school.
br/2021/auth/shibboleth/

Reaproveitando a conexão existente para virtual. school. br:443.

A requisição HTTP foi enviada, aguardando resposta... 

  HTTP/1.1 302 Moved Temporarily

  Server: nginx/1.10.1

  Date: Sun, 01 Aug 2021 18:30:07 GMT

  Content-Type: text/html

  Transfer-Encoding: chunked

  Connection: keep-alive

  Location: https://systems. school.
  br/idp/profile/SAML2/Redirect/SSO?SAMLRequest={value edited:
  besides a lot of uninteligible parts with [a-zA-Z0-9], it
  contained several ASCII encoded things like %2F %2B %3A %3D } 

  Expires: Wed, 01 Jan 1987 12:34:56 GMT

  Cache-Control: private,no-store,no-cache,max-age=0

  Set-Cookie: _opensaml_req_ss%3A{ [a-zA-Z0-9%]\+ }; path=/;
  domain=.school. br; secure; HttpOnly;; SameSite=None

  Strict-Transport-Security: max-age=12345678; includeSubDomains;
  preload;

  X-Content-Type-Options: nosniff

  X-XSS-Protection: 1; mode=block

  X-Robots-Tag: none

  X-Download-Options: noopen

  X-Permitted-Cross-Domain-Policies: none

  X-Frame-Options: SAMEORIGIN

  Referrer-Policy: no-referrer

Erro de sintaxe em Set-Cookie: _opensaml_req_ss%3A{
    [a-zA-Z0-9%]\+ }; path=/; domain=.school. br; secure;
    HttpOnly;; SameSite=None na posição 167.

Localização: https://systems. school.
br/idp/profile/SAML2/Redirect/SSO?SAMLRequest={ [a-zA-Z0-9%]\+
# with "RelayState" inside } [redirecionando]

--2021-08-01 15:30:07--  https://systems. school.
br/idp/profile/SAML2/Redirect/SSO?SAMLRequest={ [a-zA-Z0-9%]\+
# with "RelayState" inside }

Resolvendo systems. school. br (systems. school. br)...
123.456.789.5

Conectando-se a systems. school. br (systems. school.
br)|123.456.789.5|:443... conectado.

A requisição HTTP foi enviada, aguardando resposta... 

  HTTP/1.1 302 Found

  Date: Sun, 01 Aug 2021 18:30:07 GMT

  Content-Length: 0

  Expires: Thu, 01 Dec 1994 12:34:00 GMT

  Cache-Control: no-cache="set-cookie, set-cookie2"

  Strict-Transport-Security: max-age=15552000

  Location: https://systems. school. br/idp/login.jsp

  Set-Cookie:
  WASReqURL=https:///idp/profile/SAML2/Redirect/SSO?SAMLRequest={
      [a-zA-Z0-9%]\+ # with "RelayState" inside }; path=/;
      secure; HttpOnly

  Keep-Alive: timeout=30, max=250

  Connection: Keep-Alive

  Content-Language: en-US

Localização: https://systems. school. br/idp/login.jsp
[redirecionando]

--2021-08-01 15:30:07--  https://systems. school.
br/idp/login.jsp

Reaproveitando a conexão existente para systems. school. br:443.

A requisição HTTP foi enviada, aguardando resposta... 

  HTTP/1.1 200 OK

  Date: Sun, 01 Aug 2021 18:30:07 GMT

  Expires: 0

  Cache-Control: no-cache, no-store, must-revalidate, max-age=0

  Pragma: no-cache

  Strict-Transport-Security: max-age=15552000

  Set-Cookie: JSESSIONID_CL01={ [a-zA-Z0-9%]\+ }:{ [a-zA-Z0-9%]\+
  }; Path=/; Secure

  Set-Cookie:
  WASReqURL=https:///idp/profile/SAML2/Redirect/SSO?SAMLRequest={
      [a-zA-Z0-9%]+ # with "RelayState" inside }; Path=/

  Set-Cookie:
  WASReqURL=https:///idp/profile/SAML2/Redirect/SSO?SAMLRequest={
      [a-zA-Z0-9%]+ # with "RelayState" inside }; Expires=Thu,
      01-Dec-94 12:34:56 GMT; Path=/; Domain=.school. br

  Keep-Alive: timeout=30, max=249

  Connection: Keep-Alive

  Transfer-Encoding: chunked

  Content-Type: text/html;charset=ISO-8859-1

  Content-Language: en-US

Tamanho: não especificada [text/html]

Salvando em: “/dev/null”


/dev/null               [ <=>                ]   2,87K  --.-KB/s
in 0,03s   


2021-08-01 15:30:07 (85,2 KB/s) - “/dev/null” salvo [2941]
Quote:
Originally Posted by Turbocapitalist View Post

You'll have to add in the authentication options as well, since you are logging in to Moodle.

It should show some response codes in the 300 range. If so, you have something you can work with when you write your own browser plug-in.
Indeed, there are 303 responses. But the problem is that the cookies exist and are valid in the browser, not in a script. There is also the way the system works: Moodle lives in one subdomain, but we do not log in to it specifically. We log in through yet another subdomain (like "my. school. br") and are redirected all around. Is this a reason to conclude that it is complicated? Or not necessarily?

Would sniffing what my browser sends to and receives from the network help at all? Getting the cookie names, values and properties from the browser itself is very easy, but...
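
Since the cookies are scoped to different subdomains and paths, a script has to decide which of the copied cookies go with which request. Below is a minimal sketch of that selection, assuming the cookies were copied from the browser as (name, value, domain, path) tuples. The function names and the `school.example` hosts are made up for illustration, and the matching is deliberately simpler than what RFC 6265 asks of real browsers.

```python
from urllib.parse import urlsplit


def cookie_applies(url, domain, path):
    """Simplified check of whether a cookie with this domain/path fits url.

    Real browsers are stricter (Secure flag, host-only cookies, expiry);
    those details are ignored in this sketch.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    domain = domain.lstrip(".").lower()
    host_ok = host == domain or host.endswith("." + domain)
    url_path = parts.path or "/"
    path_ok = (
        url_path == path
        or (url_path.startswith(path) and path.endswith("/"))
        or url_path.startswith(path + "/")
    )
    return host_ok and path_ok


def cookie_header(url, cookies):
    """Build a Cookie header value from (name, value, domain, path) tuples."""
    return "; ".join(
        f"{name}={value}"
        for name, value, domain, path in cookies
        if cookie_applies(url, domain, path)
    )
```

The returned string can be sent as the `Cookie` request header by any HTTP library.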

Last edited by dedec0; 08-01-2021 at 05:55 PM.
 
Old 08-01-2021, 03:40 PM   #4
dedec0
Senior Member
 
Registered: May 2007
Posts: 1,372

Original Poster
Rep: Reputation: 51
Question

Quote:
Originally Posted by Turbocapitalist View Post
Can you watch the response codes using wget or a home made perl or python script?

Code:
wget -S -O /dev/null https:// virtual. school. br/2021/mod/url/view. php?id=13531
Another thought: I do not want to save the whole response content, as this command does (although it throws it away). It shows the network "path" that was taken to get the content, fine. But what I really want is to check the URL before each step is taken. There are URLs I will not even visit before editing them (like those with visit or share IDs, and similar).
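
The editing step mentioned above could look something like this sketch, which drops unwanted query parameters from a URL before it is visited. The parameter names in `UNWANTED` are only hypothetical examples, not a list taken from any real site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters to drop; adjust it to the sites involved.
UNWANTED = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "share_id"}


def strip_params(url, unwanted=UNWANTED):
    """Return url without the query parameters named in unwanted."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in unwanted
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Note that re-encoding the query may normalize how some characters are escaped, so the result should still be checked by eye before visiting it.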

For this new thought, I ask: can browser add-ons deal with such low-level details of network operations?
 
Old 08-01-2021, 10:31 PM   #5
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,312
Blog Entries: 3

Rep: Reputation: 3722
You'd have to write your own script in perl or python or similar to make the request to the web server and check what it returns each time and then allow you to choose whether to follow the next stage in the request or not. The above wget is not a solution. It will, however, show you all the stages in the request along with their response codes. That will tell you what kind of possibilities you have for writing your script.
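
One possible shape for such a script, as a sketch only: a loop that asks for approval before each new request. The names `walk`, `fetch`, `approve` and `ask_user` are assumptions. `fetch` stands for any function that performs a single request without following redirects (for example one built on `http.client`) and returns the status code and the `Location` header.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def walk(url, fetch, approve, max_hops=10):
    """Follow redirects one hop at a time, asking before each new request.

    fetch(url) must return (status, location) without following anything.
    approve(next_url) returns True to continue or False to stop.
    The function returns the list of URLs seen, ending with the last one
    that was reported, whether or not it was visited.
    """
    seen = [url]
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
        seen.append(url)
        if not approve(url):
            break
    return seen


def ask_user(next_url):
    """Interactive approve callback: show the URL and wait for y/n."""
    return input(f"Next hop: {next_url}\nFollow it? [y/N] ").lower() == "y"
```

With `approve=ask_user`, the last URL in the returned list is the one that was refused, so it can be inspected or edited without ever being requested.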
 
Old 08-02-2021, 07:06 AM   #6
dedec0
Senior Member
 
Registered: May 2007
Posts: 1,372

Original Poster
Rep: Reputation: 51
Question

Quote:
Originally Posted by Turbocapitalist View Post
You'd have to write your own script in perl or python or similar to make the request to the web server and check what it returns each time and then allow you to choose whether to follow the next stage in the request or not. The above wget is not a solution. It will, however, show you all the stages in the request along with their response codes. That will tell you what kind of possibilities you have for writing your script.
So, the core of what I need is an HTTP library, right? For example, in Python:

https://thumbs2.imgbox.com/43/be/R3899Bxf_t.png

Or would an http request/response parser be what i need? Look:

https://thumbs2.imgbox.com/34/38/YXtHpiQn_t.png

I am not sure about the difference between these two results. What do you think?
 
Old 08-02-2021, 07:08 AM   #7
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,312
Blog Entries: 3

Rep: Reputation: 3722
Can you please post the text here inside [code] [/code] tags?
 
Old 08-02-2021, 07:19 AM   #8
dedec0
Senior Member
 
Registered: May 2007
Posts: 1,372

Original Poster
Rep: Reputation: 51
Quote:
Originally Posted by Turbocapitalist View Post
Can you please post the text here inside [code] [/code] tags?
I did not write any code, Turbo. The images I sent just show the names and descriptions of the libraries I am unsure about: I do not know which one I need. They are screenshots of the Synaptic window.
 
Old 08-02-2021, 07:24 AM   #9
teckk
LQ Guru
 
Registered: Oct 2004
Distribution: Arch
Posts: 5,138
Blog Entries: 6

Rep: Reputation: 1827
I'm not sure what you are wanting. Why are there spaces in your urls like that?

Code:
#!/usr/bin/python

from http.client import HTTPSConnection
from time import sleep

# Example paths: some valid, some deliberately missing the leading slash.
paths = ('/questions/linux-newbie-8/', 'questions/linux-newbie-8/',
         '/questions/linux-software-2/', 'questions/linux-software-2/')

host = 'linuxquestions.org'

for path in paths:
    conn = HTTPSConnection(host)
    conn.request('GET', path)
    resp = conn.getresponse()
    print('\n', path)
    # http.client does not follow redirects, so a 3xx status shows up as-is.
    print(resp.status, resp.reason)
    data = resp.read().decode('utf-8', errors='ignore')
    conn.close()
    sleep(2)  # pause between requests
 
Old 08-02-2021, 07:43 AM   #10
dedec0
Senior Member
 
Registered: May 2007
Posts: 1,372

Original Poster
Rep: Reputation: 51
Quote:
Originally Posted by teckk View Post
I'm not sure what you are wanting. Why are there spaces in your urls like that?

I wrote spaces in the URLs just to avoid LinuxQuestions automatic URL transforming, which destroyed the output in some parts. So, instead of disabling it (impossible now), I added spaces in a way that still lets us read things easily. After that, I separated each line with an empty line and broke the long ones to fit a medium screen width, so anyone reading my whole output can simply scroll down without losing anything. And the URL domains, subdomains and other unique details were replaced with something that represents what I see in them, but in a way that does not reveal their real values, for privacy reasons.

Thank you for the Python example, teckk. But I think I need to deal with more complex details of a request. In post #3 (https://www.linuxquestions.org/quest...4/#post6271483), I show the requests and responses that happened. I do not want to make all of them, but I will need to make requests that are pretty similar (I guess). The variable values (cookies) I can copy from the browser; otherwise I will have to send requests starting from the first login page of my school (and probably do a robot browser's job?).
 
Old 08-02-2021, 07:52 AM   #11
teckk
LQ Guru
 
Registered: Oct 2004
Distribution: Arch
Posts: 5,138
Blog Entries: 6

Rep: Reputation: 1827
https://docs.python.org/3/library/http.client.html

Edit:
https://docs.python.org/3/library/urllib.html
https://docs.python-requests.org/en/master/index.html
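
Of the libraries linked above, `urllib.request` follows redirects by default, which is the opposite of what is wanted here. A minimal sketch of switching that off follows; the names `NoRedirect` and `first_hop` and the `cookie` argument are assumptions for illustration.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to build a follow-up request,
        # so the 3xx response surfaces as an HTTPError instead.
        return None


def first_hop(url, cookie=None, timeout=10):
    """Return (status, Location header) for a single request to url."""
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url)
    if cookie:
        request.add_header("Cookie", cookie)
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status, response.headers.get("Location")
    except urllib.error.HTTPError as error:
        return error.code, error.headers.get("Location")
```

The `Location` value returned for a 3xx status is the next URL in the chain, which can then be inspected before deciding whether to request it.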

Last edited by teckk; 08-02-2021 at 07:54 AM.
 
Old 08-02-2021, 07:52 AM   #12
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,312
Blog Entries: 3

Rep: Reputation: 3722
Quote:
Originally Posted by dedec0 View Post
They are screenshots of the Synaptic window.
Then please embed them here so they can be viewed in the context of your question both now and in the future after imgbox goes to the great bit bucket in the sky. Few will click on links to dodgy sites but embedding them here means that they are vetted to a substantial extent.
 
Old 08-02-2021, 09:02 AM   #13
dedec0
Senior Member
 
Registered: May 2007
Posts: 1,372

Original Poster
Rep: Reputation: 51
Thumbs down

Quote:
Originally Posted by Turbocapitalist View Post
Then please embed them here so they can be viewed in the context of your question both now and in the future after imgbox goes to the great bit bucket in the sky. Few will click on links to dodgy sites but embedding them here means that they are vetted to a substantial extent.
imgbox is not a dodgy site. It is pretty safe and is funded by ads, yet it provides a great service for its users, whether registered (for free) or not. It is also very flexible, giving me options for how to share each image. In this thread, I used the BB code it gives (yes, LQ is not BB, but since it matches in a few other tags, why not in this KEY one?).

Embed? How?? I just read the hints of all the post editing buttons, and none is about images. And if you are talking about message attachments, we have a limited quota of them in LQ (unless this changed and nobody told me). In imgbox, there is no limit on how many images or galleries I can have (I keep things well organized there, which is great), and there is no bandwidth limit (except for abuse, of course).

Last edited by dedec0; 08-02-2021 at 09:03 AM.
 
Old 08-02-2021, 10:09 AM   #14
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,603

Rep: Reputation: 2546
Quote:
Originally Posted by dedec0 View Post
I wrote spaces in the URLs just to avoid LinuxQuestions automatic URL transforming, which destroyed the output in some parts. So, instead of disabling it (impossible now)
Uh? Untick "Automatically parse links in text" under the message box. This can be done both at post-time and edit-time.


Quote:
if you are talking about message attachments, we have a limited quota of them in LQ
The quota is 35MB. The total size of the two images you linked is ~6KB - you could attach several thousand such images without exceeding the quota.

 
Old 08-02-2021, 10:30 AM   #15
dedec0
Senior Member
 
Registered: May 2007
Posts: 1,372

Original Poster
Rep: Reputation: 51
Quote:
Originally Posted by boughtonp View Post
Uh? Untick "Automatically parse links in text" under the message box. This can be done both at post-time and edit-time.
Kkkk... indeed. I did not see that, although I checked that part of the compose page; for some mysterious reason, I did not recognize it. I would like an option to never parse URLs automatically, always off, and another to always disable smilies in text, an option I hate. I also do not like that, when we quote messages, the tags in the quoted text are not on separate lines. This makes it harder to quote separate paragraphs using Linux's select-to-copy, which I do a lot. There is also a bad "feature" of code tags: they *always* add an empty line below the code, even if we write a single line of code and leave both tags on the same line. I reported (suggested?) this problem long ago and found out that I was not the only person who had noticed it, but nothing ever changed.


Quote:
Originally Posted by boughtonp View Post
The quota is 35MB. The total size of the two images you linked is ~6KB - you could attach several thousand such images without exceeding the quota.
These 2 images are small, but they are an exception. The quota here would fill quickly if I started using it. I prefer not to worry about size and to worry only about showing the right parts.
 
  



Tags
mod url, moodle


