LinuxQuestions.org
01-13-2016, 11:42 AM   #1
robertjinx
Member
 
Registered: Oct 2007
Location: Prague, CZ
Distribution: RedHat / CentOS / Ubuntu / SUSE / Debian
Posts: 749

Rep: Reputation: 73
PHP curl - how to get the content of a page without processing the page


Hello, I'm trying to fetch a page (url) using curl, but the page contains a script that redirects the user (or curl, in this case) to another website. I would like to get the script as a string and run 'preg_match' on it, but when I do this in the browser I get redirected and can't process the variable:

PHP Code:
function get_data($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$url = "http://webhost4christ.com/7744a.php?=88"; // this is spam, really
$data = get_data($url);

echo $data; // this will redirect me to the final page

I'm building a function that checks pages for certain spam content. The content looks like this:

Quote:
<script type="text/javascript">
var host = 'kncsllue.ru';
var query = window.location.search.substring(1);
var vars = query.split('&');

if(vars.length>0){
var pair = vars[0].split('=');
var way;

if (pair[1]){

way = '/?cid='+pair[1];

if (pair[1]=='Unsubscribe'){
way = '/report/spam/';
}
}
else
{
way = '/'
}
}

if(!way){way = '/'}

way = 'http://' + host + way;
top.window.location.href = way;
</script>
This is a spam page, and it's a script. Using curl, I want to fetch this and use preg_match to find, for example, '/report/spam/' or something similar.

Any help would be welcome. Thanks.
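
For illustration, here is a minimal sketch of the preg_match step described above, assuming the page source is already in a string. The helper names find_redirect_host() and has_report_spam_path() are hypothetical (they are not from the thread), and $sample is a shortened copy of the quoted script rather than a live fetch.

```php
<?php
// Hypothetical helper: returns the value assigned to "var host" in the
// page source, or null when no such assignment is present.
function find_redirect_host(string $html): ?string
{
    // Matches e.g.  var host = 'kncsllue.ru';  with single or double quotes.
    if (preg_match('/var\s+host\s*=\s*[\'"]([^\'"]+)[\'"]/', $html, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Hypothetical helper: true when the source contains the '/report/spam/' path.
function has_report_spam_path(string $html): bool
{
    return preg_match('#/report/spam/#', $html) === 1;
}

// Shortened sample based on the script quoted in the post (an assumption,
// used here so the sketch runs without any network access).
$sample = "<script type=\"text/javascript\">\n"
        . "var host = 'kncsllue.ru';\n"
        . "way = '/report/spam/';\n"
        . "</script>";

var_dump(find_redirect_host($sample));
var_dump(has_report_spam_path($sample));
```

In real use the string would come from get_data($url) instead of $sample. The pattern is deliberately narrow, so a script that names its variable differently would not match.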
 
01-13-2016, 12:15 PM   #2
thesnow
Member
 
Registered: Nov 2010
Location: Minneapolis, MN
Distribution: Ubuntu, Red Hat, Mint
Posts: 172

Rep: Reputation: 56
Have you tried file_get_contents instead of curl ?
 
01-13-2016, 12:16 PM   #3
robertjinx
Original Poster
Yes, but it requires allow_url_fopen to be enabled, and I find that a bit of a security risk.

Quote:
PHP Warning: file_get_contents(): http:// wrapper is disabled in the server configuration by allow_url_fopen=0
 
01-13-2016, 12:52 PM   #4
thesnow
If you skip echoing $data, you should be able to find the string:

Code:
$results = strpos($data,'/report/spam');
echo $results;

Or use highlight_string($data);, which should print the contents of the script instead of running it.
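
A note on why this works: curl only downloads the page source and never runs JavaScript, so the redirect happens only when the browser executes the echoed output. The sketch below assumes $data holds the fetched source. It is hard-coded here with a made-up example.invalid URL so it runs offline, and it escapes the markup with htmlspecialchars() as one alternative way of displaying it safely.

```php
<?php
// Stand-in for the string returned by get_data($url); the URL is a made-up
// placeholder, not taken from the thread.
$data = "<script type=\"text/javascript\">"
      . "top.window.location.href = 'http://example.invalid/report/spam/';"
      . "</script>";

// strpos() returns the match offset (which may be 0) or false,
// so the result is compared against false explicitly.
$found = strpos($data, '/report/spam') !== false;

// Escaping the markup makes a browser display the source as text
// instead of executing the script.
$safe = htmlspecialchars($data, ENT_QUOTES, 'UTF-8');

echo $found ? "marker found\n" : "marker not found\n";
echo $safe, "\n";
```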
 
01-13-2016, 01:43 PM   #5
robertjinx
Original Poster
Quote:
Originally Posted by thesnow
If you skip echoing $data, you should be able to find the string:

Code:
$results = strpos($data,'/report/spam');
echo $results;

Or use highlight_string($data);, which should print the contents of the script instead of running it.
This actually sounds good, thanks man!

Not sure how exact it will be, but I'm keeping my fingers crossed.
 
01-13-2016, 02:08 PM   #6
robertjinx
Original Poster
One more question: what could I use in PHP to check for multiple strings? Something like this, but simplified?

Quote:
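// strpos() returns 0 for a match at the very start, so test against false
$results = strpos($data, 'var host =');
if ($results !== false) {
    $results = strpos($data, '/?cid=');
    if ($results !== false) {
        $results = strpos($data, '/report/spam/');
        if ($results !== false)
            echo "Spam!";
    }
}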
 
01-13-2016, 03:54 PM   #7
thesnow
Sure, or something like this:

Code:
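// strpos() returns 0 for a match at offset 0, so compare against false explicitly
$s1 = strpos($data, 'var host =');
$s2 = strpos($data, '/?cid=');
$s3 = strpos($data, '/report/spam/');

if ( $s1 !== false && $s2 !== false && $s3 !== false ) {
  echo "SPAM!";
}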
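
If the list of strings grows, the same check can be written as a loop. This is only a sketch: contains_all() is a hypothetical helper name, and $page is a made-up one-line stand-in for the fetched source.

```php
<?php
// Hypothetical helper: true only if every needle occurs somewhere in $haystack.
function contains_all(string $haystack, array $needles): bool
{
    foreach ($needles as $needle) {
        // Strict comparison: a match at offset 0 must still count as found.
        if (strpos($haystack, $needle) === false) {
            return false;
        }
    }
    return true;
}

// The three markers discussed in the thread.
$markers = ['var host =', '/?cid=', '/report/spam/'];

// Made-up stand-in for the page source, so the sketch runs offline.
$page = "var host = 'kncsllue.ru'; way = '/?cid='+pair[1]; way = '/report/spam/';";

echo contains_all($page, $markers) ? "SPAM!\n" : "clean\n";
```

Adding or removing a marker then only means editing the $markers array.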
 
01-14-2016, 01:04 AM   #8
robertjinx
Original Poster
Quote:
Originally Posted by thesnow
Sure, or something like this:

Code:
// strpos() returns 0 for a match at offset 0, so compare against false explicitly
$s1 = strpos($data, 'var host =');
$s2 = strpos($data, '/?cid=');
$s3 = strpos($data, '/report/spam/');

if ( $s1 !== false && $s2 !== false && $s3 !== false ) {
  echo "SPAM!";
}
Well, that's kind of the same as mine, just from a different angle.
 
01-14-2016, 01:57 AM   #9
robertjinx
Original Poster
BTW, since I implemented that little bit of code, I've caught 57 baddies (bad/spam URLs); not even Google Safe Browsing works this fast. So it somewhat works!
 