Expand/transform a URL that is inside a credential-protected site
Where I study, there is a site that all students use for things related to the institution. This site also gives access to a subsystem running Moodle, which provides a specific area for each discipline at a given time.
Inside that Moodle, teachers sometimes share a URL with students. But this URL is hidden; that is, what we see is a URL within the school domain:
If we visit it, we are redirected to (possibly) any external site. And there is the problem: sometimes teachers share bad URLs which I *do not* want to visit with the browser profile I use to access the school and other "serious" things.
Pages that expand short/redirected URLs do not work here, because the result they give is always the URL
https:// systems. school. br/idp/login. jsp
I searched for a browser extension (for Firefox 52, in this case) that could expand this URL using my existing session cookies to get the correct result. I did not find one (neither for newer Firefox nor for other browsers).
This may also be solvable with a bit of simple JavaScript programming, or with the browser's developer tools. I even thought about opening this thread in the programming or general forums. Well, this one is a good bet, I think.
Can you watch the response codes using wget or a home-made Perl or Python script?
Sure. I even have a good Python setup now, and Perl should already be OK too: I use at least one command that is written purely in Perl (maybe there are more; I have not examined everything).
I ran the above command and formatted the output, editing all the possibly sensitive data I saw. The output is below, but I forgot to run it with the POSIX locale so that you could understand all the messages. I think it is acceptable in this case, but if you want, ask me to redo it.
Code:
$ wget -S -O /dev/null https:// virtual. school.
br/2021/mod/url/view. php?id=13531
--2021-08-01 15:30:07-- https://virtual. school
.br/2021/mod/url/view. php?id=13531
Resolvendo virtual. school. br (virtual. school. br)...
123.456.789.11
Conectando-se a virtual. school. br (virtual. school.
br)|123.456.789.11|:443... conectado.
A requisição HTTP foi enviada, aguardando resposta...
HTTP/1.1 303 See Other
Server: nginx/1.10.1
Date: Sun, 01 Aug 2021 18:30:07 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: MoodleSession2021=and8ejbnayejinvalidkwqwert;
path=/2021/; domain=school. br; secure; HttpOnly
Expires: Thu, 19 Nov 1981 08:76:54 GMT
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
X-Redirect-By: Moodle
Location: https://virtual. school. br/2021/login/index. php
Content-Language: pt-br
Strict-Transport-Security: max-age=13579000; includeSubDomains;
preload;
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
X-Robots-Tag: none
X-Download-Options: noopen
X-Permitted-Cross-Domain-Policies: none
X-Frame-Options: SAMEORIGIN
Referrer-Policy: no-referrer
Localização: https://virtual. school. br/2021/login/index. php
[redirecionando]
--2021-08-01 15:30:07-- https://virtual. school.
br/2021/login/index. php
Reaproveitando a conexão existente para virtual. school. br:443.
A requisição HTTP foi enviada, aguardando resposta...
HTTP/1.1 303 See Other
Server: nginx/1.10.1
Date: Sun, 01 Aug 2021 18:30:07 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
X-Redirect-By: Moodle
Location: https://virtual. school. br/2021/auth/shibboleth/
Content-Language: pt-br
Strict-Transport-Security: max-age=15768000; includeSubDomains;
preload;
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
X-Robots-Tag: none
X-Download-Options: noopen
X-Permitted-Cross-Domain-Policies: none
X-Frame-Options: SAMEORIGIN
Referrer-Policy: no-referrer
Localização: https://virtual. school. br/2021/auth/shibboleth/
[redirecionando]
--2021-08-01 15:30:07-- https://virtual. school.
br/2021/auth/shibboleth/
Reaproveitando a conexão existente para virtual. school. br:443.
A requisição HTTP foi enviada, aguardando resposta...
HTTP/1.1 302 Moved Temporarily
Server: nginx/1.10.1
Date: Sun, 01 Aug 2021 18:30:07 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Location: https://systems. school.
br/idp/profile/SAML2/Redirect/SSO?SAMLRequest={value edited:
besides a lot of unintelligible parts with [a-zA-Z0-9], it
contained several ASCII encoded things like %2F %2B %3A %3D }
Expires: Wed, 01 Jan 1987 12:34:56 GMT
Cache-Control: private,no-store,no-cache,max-age=0
Set-Cookie: _opensaml_req_ss%3A{ [a-zA-Z0-9%]\+ }; path=/;
domain=.school. br; secure; HttpOnly;; SameSite=None
Strict-Transport-Security: max-age=12345678; includeSubDomains;
preload;
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
X-Robots-Tag: none
X-Download-Options: noopen
X-Permitted-Cross-Domain-Policies: none
X-Frame-Options: SAMEORIGIN
Referrer-Policy: no-referrer
Erro de sintaxe em Set-Cookie: _opensaml_req_ss%3A{
[a-zA-Z0-9%]\+ }; path=/; domain=.school. br; secure;
HttpOnly;; SameSite=None na posição 167.
Localização: https://systems. school.
br/idp/profile/SAML2/Redirect/SSO?SAMLRequest={ [a-zA-Z0-9%]\+
# with "RelayState" inside } [redirecionando]
--2021-08-01 15:30:07-- https://systems. school.
br/idp/profile/SAML2/Redirect/SSO?SAMLRequest={ [a-zA-Z0-9%]\+
# with "RelayState" inside }
Resolvendo systems. school. br (systems. school. br)...
123.456.789.5
Conectando-se a systems. school. br (systems. school.
br)|123.456.789.5|:443... conectado.
A requisição HTTP foi enviada, aguardando resposta...
HTTP/1.1 302 Found
Date: Sun, 01 Aug 2021 18:30:07 GMT
Content-Length: 0
Expires: Thu, 01 Dec 1994 12:34:00 GMT
Cache-Control: no-cache="set-cookie, set-cookie2"
Strict-Transport-Security: max-age=15552000
Location: https://systems. school. br/idp/login.jsp
Set-Cookie:
WASReqURL=https:///idp/profile/SAML2/Redirect/SSO?SAMLRequest={
[a-zA-Z0-9%]\+ # with "RelayState" inside }; path=/;
secure; HttpOnly
Keep-Alive: timeout=30, max=250
Connection: Keep-Alive
Content-Language: en-US
Localização: https://systems. school. br/idp/login.jsp
[redirecionando]
--2021-08-01 15:30:07-- https://systems. school.
br/idp/login.jsp
Reaproveitando a conexão existente para systems. school. br:443.
A requisição HTTP foi enviada, aguardando resposta...
HTTP/1.1 200 OK
Date: Sun, 01 Aug 2021 18:30:07 GMT
Expires: 0
Cache-Control: no-cache, no-store, must-revalidate, max-age=0
Pragma: no-cache
Strict-Transport-Security: max-age=15552000
Set-Cookie: JSESSIONID_CL01={ [a-zA-Z0-9%]\+ }:{ [a-zA-Z0-9%]\+
}; Path=/; Secure
Set-Cookie:
WASReqURL=https:///idp/profile/SAML2/Redirect/SSO?SAMLRequest={
[a-zA-Z0-9%]+ # with "RelayState" inside }; Path=/
Set-Cookie:
WASReqURL=https:///idp/profile/SAML2/Redirect/SSO?SAMLRequest={
[a-zA-Z0-9%]+ # with "RelayState" inside }; Expires=Thu,
01-Dec-94 12:34:56 GMT; Path=/; Domain=.school. br
Keep-Alive: timeout=30, max=249
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html;charset=ISO-8859-1
Content-Language: en-US
Tamanho: não especificada [text/html]
Salvando em: “/dev/null”
/dev/null [ <=> ] 2,87K --.-KB/s
in 0,03s
2021-08-01 15:30:07 (85,2 KB/s) - “/dev/null” salvo [2941]
Quote:
Originally Posted by Turbocapitalist
You'll have to add in the authentication options as well, since you are logging in to Moodle.
It should show some response codes in the 300 range. If so, you have something you can work with when you write your own browser plug-in.
Indeed, there are 303 responses. But the problem is the cookies, which exist and are valid in the browser but not in a script. There is also the way the system works: Moodle lives in a subdomain, but we do not log in to it specifically. We log in through yet another subdomain (like "my. school. br") and are redirected all around. Is this a reason to conclude it is complicated? Or not necessarily?
Would it help to sniff what my browser sends to and receives from the network? Getting the cookie names, values and properties from the browser itself is very easy, but...
Another thought: I do not want to save the whole response content, as this command does (although it throws it away). It shows the network "path" taken to get the content, fine. But what I really want is to check the URL before each step is taken. There are URLs I will not even visit before editing them (like those with visit or share IDs, and similar).
For this new thought, I ask: do browser add-ons deal with such basic details of network operations?
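As an illustration of the "edit the URL before visiting it" idea above, here is a minimal Python sketch that drops unwanted query parameters from a URL. The parameter names in the two lists are only assumptions chosen for the example (they do not come from this thread), and `strip_params` is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of visit/share/tracking parameters; adjust to taste.
UNWANTED_NAMES = {"fbclid", "gclid", "share_id"}
UNWANTED_PREFIXES = ("utm_",)


def strip_params(url):
    """Return `url` with the unwanted query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in UNWANTED_NAMES and not name.startswith(UNWANTED_PREFIXES)
    ]
    # urlencode() re-encodes the kept values, so their escaping may be normalized.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_params("https://example.org/watch?v=abc&utm_source=moodle&fbclid=123"))
```

This only rewrites the text of the URL; nothing is requested from the network until you decide to open the cleaned result.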
You'd have to write your own script in perl or python or similar to make the request to the web server and check what it returns each time and then allow you to choose whether to follow the next stage in the request or not. The above wget is not a solution. It will, however, show you all the stages in the request along with their response codes. That will tell you what kind of possibilities you have for writing your script.
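The approach described above could be sketched roughly as follows in Python, using only the standard library. This is a sketch under assumptions, not a tested solution for this particular school setup: all function names are hypothetical, it handles only https URLs, it only sees redirects announced through the Location header (not ones done by JavaScript or HTML forms), and it assumes that a session cookie copied from the browser is enough for the server to accept the request.

```python
from http.client import HTTPSConnection
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def next_location(current_url, status, location):
    """Return the absolute URL of the next hop, or None if not a redirect."""
    if status in REDIRECT_CODES and location:
        return urljoin(current_url, location)
    return None


def cookie_allowed(host, cookie_domain):
    """Send the copied cookies only to the given domain or its subdomains."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


def fetch_hop(url, cookie_header="", cookie_domain=""):
    """Make one GET request, without following it; return (status, next_url)."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    headers = {}
    host = parts.hostname or ""
    if cookie_header and cookie_domain and cookie_allowed(host, cookie_domain):
        headers["Cookie"] = cookie_header
    conn = HTTPSConnection(parts.netloc, timeout=15)
    try:
        conn.request("GET", target, headers=headers)
        resp = conn.getresponse()  # http.client does not follow redirects
        location = resp.getheader("Location")
        return resp.status, next_location(url, resp.status, location)
    finally:
        conn.close()  # the response body is never read or saved


def walk(url, fetch=fetch_hop, confirm=input, max_hops=10):
    """Show each hop and ask before requesting the next; return URLs seen."""
    seen = [url]
    for _ in range(max_hops):
        status, nxt = fetch(url)
        print(status, url)
        if nxt is None:
            break
        seen.append(nxt)  # the target is recorded before it is ever visited
        if confirm("Next: %s\nFollow? [y/N] " % nxt).strip().lower() != "y":
            break
        url = nxt
    return seen
```

A possible way to call it, with a placeholder cookie value: `walk(start_url, fetch=lambda u: fetch_hop(u, "MoodleSession2021=...", "school.br"))`. Restricting the cookie to one domain is a deliberate choice here, so that the copied session is never sent to whatever external site the chain ends at.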
So, the core of what I need is an HTTP library, right? For example, in Python:
Can you please post the text here inside [code] [/code] tags?
I did not write any code, Turbo. The images I sent just show the descriptions and names of the libraries I am unsure which to choose from. They are screenshots of the Synaptic window.
I'm not sure what you are wanting. Why are there spaces in your urls like that?
Code:
#!/usr/bin/python3
from http.client import HTTPSConnection
from time import sleep

# Example list, some good, some bad, on purpose.
u = ('/questions/linux-newbie-8/', 'questions/linux-newbie-8/',
     '/questions/linux-software-2/', 'questions/linux-software-2/')

url = 'linuxquestions.org'

for i in u:
    a = HTTPSConnection(url)
    a.request('GET', i)
    b = a.getresponse()
    print('\n', i)
    # Status code and reason phrase of this single response
    print(b.status, b.reason)
    data = b.read().decode('utf-8', errors='ignore')
    a.close()
    sleep(2)
I put spaces in the URLs just to avoid LinuxQuestions' automatic URL transforming, which destroyed the output in some parts. So, instead of disabling it (impossible now), I added spaces in a way that still lets us read things easily. After that, I separated each line with an empty line and broke the long ones to fit a medium screen width, so anyone reading my whole output can simply scroll down without losing anything. The URL domains, subdomains and other unique details were replaced with something that represents what I see in them, but in a way that does not reveal their real values, for privacy reasons.
Thank you for the Python example, teckk. But I think I need to deal with more complex details of a request. In post #3 (https://www.linuxquestions.org/quest...4/#post6271483), I show the requests and responses that happened in the browser. I do not want to make all of them, but I will need to make requests that are pretty similar (I guess). The variable values (cookies) I can copy from the browser; otherwise I will have to send requests starting from the first login page of my school (and probably do a robot-browser job?).
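For the "copy the cookies from the browser" option mentioned above, a small Python sketch of how the copied values could be attached to a request might look like this. It is only an illustration: `build_cookie_header` is a hypothetical helper, the cookie value is a placeholder, and whether the server accepts a session reused this way is an assumption that would have to be checked.

```python
from urllib.request import Request


def build_cookie_header(cookies):
    """Join name/value pairs into the value of a Cookie request header."""
    for name, value in cookies.items():
        if any(ch in name + value for ch in ";\r\n"):
            raise ValueError("unexpected character in cookie %r" % name)
    return "; ".join("%s=%s" % (name, value) for name, value in cookies.items())


# Placeholder value: paste the real one from the browser's developer tools.
copied = {"MoodleSession2021": "PASTE-VALUE-HERE"}

# Building the Request object does not send anything over the network yet.
req = Request(
    "https://virtual.school.br/2021/mod/url/view.php?id=13531",
    headers={"Cookie": build_cookie_header(copied)},
)
print(req.get_header("Cookie"))
```

The request is only constructed here, not sent, so the header can be inspected before anything leaves the machine.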
Then please embed them here so they can be viewed in the context of your question both now and in the future after imgbox goes to the great bit bucket in the sky. Few will click on links to dodgy sites but embedding them here means that they are vetted to a substantial extent.
imgbox is not a dodgy site. It is pretty safe and is funded by ads, but it provides a great service for its users, registered (for free) or not. It is also very flexible, giving me options for how to share each image. In this thread, I used the BB code it gives (yes, LQ is not BB, but since it is the same for a few other tags, why not for this KEY one?).
Embed? How?? I just read the tooltips of all the post-editing buttons, and none is about images. And if you are talking about message attachments, we have a limited quota for them in LQ (or this changed and nobody told me). In imgbox, there is no limit on how many images or galleries I can have (I keep things very organized there, which is great), and there is no bandwidth limitation (except for abuse, of course).
Quote:
I wrote spaces in the URLs just to avoid LinuxQuestions automatic URL transforming, which destroyed the output in some parts. So, instead of disabling it (impossible now)
Uh? Untick "Automatically parse links in text" under the message box. This can be done both at post-time and edit-time.
Quote:
if you are talking about message attachments, we have a limited quota of them in LQ
The quota is 35MB. The total size of the two images you linked is ~6KB - you could attach several thousand such images without exceeding the quota.
Quote:
Uh? Untick "Automatically parse links in text" under the message box. This can be done both at post-time and edit-time.
Hahaha... indeed. I did not see that, although I checked that part of the compose page. For some mysterious reason, I did not recognize it. I would like an option to never parse URLs automatically, to have it always off. And one to always disable smilies in text; I hate that option. I also do not like that the tags around quoted text are not on separate lines when we quote messages. This makes it harder to quote separate paragraphs using Linux's selected-text copy, which I do a lot. There is also a bad "feature" of code tags: they *always* add an empty line below our code, even if we write a single line of code and leave both tags on the same line. I reported (suggested?) this problem and found out that I was not the only person who had noticed it. That was long ago, and it never changed or led to anything.
Quote:
Originally Posted by boughtonp
The quota is 35MB. The total size of the two images you linked is ~6KB - you could attach several thousand such images without exceeding the quota.
These 2 images are small, but they are an exception. The quota here would fill quickly if I started using it. I prefer not to worry about size and to worry only about showing the right parts.