LinuxQuestions.org
Old 07-06-2020, 11:04 AM   #1
TBotNik
Member
 
Registered: May 2016
Location: Greenville, TX
Distribution: Kubuntu 18.04
Posts: 796

Rep: Reputation: Disabled
PHP Help


All,

OK, I have this PHP CLI script to scrape tagged values from all .html files in a directory. If all the tags being sought were on one line of the file, then I think my current code would work, but usually the tags are spread over different lines, so I cannot assign the array:

Code:
  $array[] = array('num'=>$arraylinenumber,'nam'=>$r_nam,'add'=>$r_add,'cty'=>$r_cty,'stt'=>$r_stt,'zip'=>$r_zip,'phn'=>$r_phn,'abt'=>$r_abt);
until I know all the elements are non-blank, and then clear them after the array assignment. So I need to put the array assignment outside the "foreach lines" loop, but I am having trouble working out how best to record the non-blank array elements and preserve them until the array row is written.

I'm still thinking about the best way to do this, and here is my current code:

Code:
	$src_lst	=	scandir ( $src_dir );
	foreach ( $src_lst as $file ) {
		// skip anything that is not an .html file
		$sps = strpos ( $file, ".html" );
		if ( $sps === false ) { continue; }
		// scandir() returns bare file names, so prepend the directory
		$ret_ray[] = get_info ( "$src_dir/$file" );
	}	// end foreach $src_lst

	function get_info ( $inf ) {
		$lines	=	file ( $inf );
		$o_ray	=	array();
		foreach ( $lines as $l_num => $line ) {
			$r_nam	=	get_name   ( $line );
			$r_add	=	get_street ( $line );
			$r_cty	=	get_city   ( $line );
			$r_stt	=	get_state  ( $line );
			$r_zip	=	get_zip    ( $line );
			$r_phn	=	get_phone  ( $line );
			$r_abt	=	get_about  ( $line );
			// one row is appended per line, so most elements end up blank
			$o_ray[] = array ( 'nam'=>$r_nam, 'add'=>$r_add, 'cty'=>$r_cty,
				'stt'=>$r_stt, 'zip'=>$r_zip, 'phn'=>$r_phn,
				'abt'=>$r_abt );
		}	// end foreach $lines
		return $o_ray;
	}		// end function get_info


	function get_name ( $line ) {
		if ( strpos ( $line, 'itemprop="url">' ) !== false ) {
			$nps	=	strpos ( $line, 'itemprop="url">' ) + 15;
			$nln	=	substr ( $line, $nps );
			$eps	=	strpos ( $nln, '</a>' );
			$nam	=	substr ( $nln, 0, $eps );
			return $nam;
		}	// end if strpos
		return '';	// tag not on this line
	}		// end function get_name

	function get_street ( $line ) {
		if ( strpos ( $line, 'streetAddress">' ) !== false ) {
			$nps	=	strpos ( $line, 'streetAddress">' ) + 15;
			$nln	=	substr ( $line, $nps );
			$eps	=	strpos ( $nln, '</span>' );
			$nam	=	substr ( $nln, 0, $eps );
			return $nam;
		}	// end if strpos
		return '';	// tag not on this line
	}		// end function get_street

	function get_city ( $line ) {
		if ( strpos ( $line, 'addressLocality">' ) !== false ) {
			$nps	=	strpos ( $line, 'addressLocality">' ) + 17;
			$nln	=	substr ( $line, $nps );
			$eps	=	strpos ( $nln, '</span>' );
			$nam	=	substr ( $nln, 0, $eps );
			return $nam;
		}	// end if strpos
		return '';	// tag not on this line
	}		// end function get_city

	function get_state ( $line ) {
		if ( strpos ( $line, 'addressRegion">' ) !== false ) {
			$nps	=	strpos ( $line, 'addressRegion">' ) + 15;
			$nln	=	substr ( $line, $nps );
			$eps	=	strpos ( $nln, '</span>' );
			$nam	=	substr ( $nln, 0, $eps );
			return $nam;
		}	// end if strpos
		return '';	// tag not on this line
	}		// end function get_state

	function get_zip ( $line ) {
		if ( strpos ( $line, 'postalCode">' ) !== false ) {
			$nps	=	strpos ( $line, 'postalCode">' ) + 12;
			$nln	=	substr ( $line, $nps );
			$eps	=	strpos ( $nln, '</span>' );
			$nam	=	substr ( $nln, 0, $eps );
			return $nam;
		}	// end if strpos
		return '';	// tag not on this line
	}		// end function get_zip

	function get_phone ( $line ) {
		if ( strpos ( $line, 'telephone">' ) !== false ) {
			$nps	=	strpos ( $line, 'telephone">' ) + 11;
			$nln	=	substr ( $line, $nps );
			$eps	=	strpos ( $nln, '</span>' );
			$nam	=	substr ( $nln, 0, $eps );
			return $nam;
		}	// end if strpos
		return '';	// tag not on this line
	}		// end function get_phone

	function get_about ( $line ) {
		if ( strpos ( $line, 'itemprop="about">' ) !== false ) {
			$nps	=	strpos ( $line, 'itemprop="about">' ) + 17;
			$nln	=	substr ( $line, $nps );
			$eps	=	strpos ( $nln, '</p>' );
			$nam	=	substr ( $nln, 0, $eps );
			return $nam;
		}	// end if strpos
		return '';	// tag not on this line
	}		// end function get_about
Oh, the final goal is to build the "INSERT ... VALUES" rows into a .sql file for import into MySQL!
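
A minimal sketch of that final step, assuming one associative array per record keyed like the array above. The table name `contacts` and the helper `sql_insert_row()` are made up for illustration, and `addslashes()` is only a rough stand-in for proper escaping when writing to a file without a database connection:

```php
<?php
// Sketch only: the table name `contacts` and sql_insert_row() are assumptions,
// not part of the original script.
function sql_insert_row(string $table, array $row): string {
    // Column names come from the array keys, wrapped in backticks.
    $cols = implode(', ', array_map(function ($c) { return "`$c`"; }, array_keys($row)));
    // Values are single-quoted; addslashes() escapes quotes and backslashes.
    $vals = implode(', ', array_map(function ($v) { return "'" . addslashes($v) . "'"; }, $row));
    return "INSERT INTO `$table` ($cols) VALUES ($vals);";
}

$row = array('nam' => "Joe's Diner", 'zip' => '75401');
echo sql_insert_row('contacts', $row), PHP_EOL;
// prints: INSERT INTO `contacts` (`nam`, `zip`) VALUES ('Joe\'s Diner', '75401');
```

Each returned string could then be appended to the .sql file, for example with file_put_contents() and the FILE_APPEND flag.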

Open to your suggestions here!

Cheers!

TBNK

Last edited by TBotNik; 07-06-2020 at 11:16 AM.
 
Old 07-06-2020, 12:41 PM   #2
shruggy
Senior Member
 
Registered: Mar 2020
Posts: 3,670

Rep: Reputation: Disabled
Just so you know: Document Object Model, particularly, DOMDocument and DOMXPath.

If that's not enough, there are also Symfony DomCrawler and PHP Simple HTML DOM Parser.
 
Old 07-06-2020, 12:51 PM   #3
TBotNik
Member
 
Registered: May 2016
Location: Greenville, TX
Distribution: Kubuntu 18.04
Posts: 796

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by shruggy View Post
Just so you know: Document Object Model, particularly, DOMDocument and DOMXPath.

If that's not enough, there are also Symfony DomCrawler and PHP Simple HTML DOM Parser.
shruggy,

Thanks! Looking at those! However, to complete my array, since the lines in the .html files do not serially contain the tags, how would you suggest I work through them until all are filled in before assigning to my output array? Will these DOM apps read the file as a stream instead of line by line?

Cheers!

TBNK

PS
I go to the trouble of manually finding what I want online and then saving the page with info I want into my "/Web_searches" directory, so I can run this process on them. Depending on the online source the "tags" needed for processing vary, so will have to adapt my script for each online source I search!

Last edited by TBotNik; 07-06-2020 at 12:57 PM.
 
Old 07-06-2020, 12:58 PM   #4
shruggy
Senior Member
 
Registered: Mar 2020
Posts: 3,670

Rep: Reputation: Disabled
Quote:
Originally Posted by TBotNik View Post
Will these DOM apps read the file as a stream instead of line by line?
They will see the file as a tree of nodes. That's the point of using them. See this question.
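
A minimal sketch of what that looks like with DOMDocument and DOMXPath. It assumes the values sit in elements carrying itemprop attributes (guessed from the strings searched for in post #1); the sample markup and the helper name extract_fields() are made up for illustration:

```php
<?php
// Sketch only: extract_fields() and the sample markup are assumptions;
// the itemprop names are inferred from the search strings in post #1.
function extract_fields(string $html): array {
    $doc = new DOMDocument();
    // Collect parser warnings quietly; saved web pages are rarely valid HTML.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $props = array(
        'nam' => 'url',
        'add' => 'streetAddress',
        'cty' => 'addressLocality',
        'stt' => 'addressRegion',
        'zip' => 'postalCode',
        'phn' => 'telephone',
        'abt' => 'about',
    );
    $row = array();
    foreach ($props as $key => $prop) {
        // First element anywhere in the tree with this itemprop value.
        $nodes = $xpath->query("//*[@itemprop='$prop']");
        $row[$key] = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : '';
    }
    return $row;
}

// The value may span several lines; the parser does not care.
$html = <<<HTML
<div>
  <a href="#" itemprop="url">Acme Diner</a>
  <span itemprop="streetAddress">
     123 Main St
  </span>
  <span itemprop="addressLocality">Greenville</span>
</div>
HTML;

print_r(extract_fields($html));
```

Because the lookup runs against the whole tree, it makes no difference which line a tag sits on; fields that are not found come back as empty strings.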

Last edited by shruggy; 07-06-2020 at 04:07 PM.
 
Old 07-15-2020, 11:04 AM   #5
TBotNik
Member
 
Registered: May 2016
Location: Greenville, TX
Distribution: Kubuntu 18.04
Posts: 796

Original Poster
Rep: Reputation: Disabled
All,

I looked at the DOM suggestions and it will take longer to learn them than to fix my array problem.

Thanks, but need an answer on the original issue!

TBNK
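
On the original issue, one possible approach is to keep a "current record" array outside the per-line work, fill in whichever field a line happens to contain, and only append it to the output once every element is non-blank. A minimal sketch under those assumptions follows; collect_records() and $markers are illustrative names, and the start and end strings are copied from the get_*() functions in post #1:

```php
<?php
// Sketch only: collect_records() and $markers are illustrative names.
function collect_records(array $lines): array {
    // field key => array(start marker, end tag)
    $markers = array(
        'nam' => array('itemprop="url">',   '</a>'),
        'add' => array('streetAddress">',   '</span>'),
        'cty' => array('addressLocality">', '</span>'),
        'stt' => array('addressRegion">',   '</span>'),
        'zip' => array('postalCode">',      '</span>'),
        'phn' => array('telephone">',       '</span>'),
        'abt' => array('itemprop="about">', '</p>'),
    );
    $blank   = array_fill_keys(array_keys($markers), '');
    $current = $blank;   // survives from one line to the next
    $records = array();

    foreach ($lines as $line) {
        foreach ($markers as $key => list($start, $end)) {
            $pos = strpos($line, $start);
            if ($pos === false) { continue; }
            // Text after the marker, cut at the end tag if it is on this line.
            $rest = (string) substr($line, $pos + strlen($start));
            $stop = strpos($rest, $end);
            $current[$key] = trim($stop === false ? $rest : substr($rest, 0, $stop));
        }
        // Flush only once every field is non-blank, then start a fresh record.
        if (!in_array('', $current, true)) {
            $records[] = $current;
            $current   = $blank;
        }
    }
    return $records;
}

// Made-up sample: one record spread over seven lines.
$lines = array(
    '<a href="#" itemprop="url">Acme Diner</a>',
    '<span itemprop="streetAddress">123 Main St</span>',
    '<span itemprop="addressLocality">Greenville</span>',
    '<span itemprop="addressRegion">TX</span>',
    '<span itemprop="postalCode">75401</span>',
    '<span itemprop="telephone">555-0100</span>',
    '<p itemprop="about">Open late.</p>',
);
print_r(collect_records($lines));
```

For a real file the input would be something like file($path, FILE_IGNORE_NEW_LINES). Two limits of this sketch: a record that is missing a field (or has an empty one) never flushes and will blend into the next record, and a value whose text continues onto the following line is cut at the end of the first line.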
 
  

