LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   sql: put "beginning of line" and "end of line" within a charset... (https://www.linuxquestions.org/questions/programming-9/sql-put-beginning-of-line-and-end-of-line-within-a-charset-4175426542/)

masavini 09-10-2012 10:43 AM

sql: put "beginning of line" and "end of line" within a charset...
 
hi,
i have this code:
Code:

$pattern = '/([A-Z])([0-9])/';
$replacement = '\\1 \\2';
$string = preg_replace($pattern, $replacement, $string);
       
$pattern = '/([0-9])([A-Z])/';
$replacement = '\\1 \\2';
$string = preg_replace($pattern, $replacement, $string);

$splitted = split('[ -. /\_]+',$string);
       
if (preg_match("/^[A-Z]+$/", $splitted[0])) {
        $qstring = $searchfield." REGEXP '[^A-Z]$splitted[0][^A-Z]'";
}
else {
        $qstring = $searchfield." REGEXP '[^0-9]$splitted[0][^0-9]'";
}

$i = 0;
while (++$i < count($splitted)) {
        if (!empty($splitted[$i])) {
                if (preg_match("/^[A-Z]+$/", $splitted[i])) {
                        $qstring .= "AND ".$searchfield." REGEXP '[^A-Z]$splitted[i][^A-Z]'";
                }
                else {
                        $qstring .= "AND ".$searchfield." REGEXP '[^0-9]$splitted[i][^0-9]'";
                }
        }
}

what happens is that if $splitted[i] is found at the beginning or at the end of my record, the query gives no results...

any idea?
i could add a space at the benigging and in the end all of my records... it would work, but it would be quite an horrible solution...

thanks!

Snark1994 09-11-2012 08:09 AM

What're the contents of 'string' and your database records likely to be? Just trying to get my head around what you're trying to do...

Also, surely '$splitted[i]' should be '$splitted[$i]'? (and would be easier to understand if you didn't put it in the SQL string directly) I would also change your while loop construct to a foreach loop, but that's more personal preference - though again it would make it easier to read.

masavini 09-11-2012 09:12 AM

Quote:

Originally Posted by Snark1994 (Post 4777663)
What're the contents of 'string' and your database records likely to be? Just trying to get my head around what you're trying to do...

Code:

$string = "BUBU23GH"
$pattern = '/([A-Z])([0-9])/';
$replacement = '\\1 \\2';
$string = preg_replace($pattern, $replacement, $string);
       
$pattern = '/([0-9])([A-Z])/';
$replacement = '\\1 \\2';
$string = preg_replace($pattern, $replacement, $string);

now $string is "BUBU 23 GH"

Code:

$splitted = split('[ -. /\_]+',$string);
$splitted[0] is "BUBU"
$splitted[1] is "23"
$splitted[2] is "GH"

Code:

if (preg_match("/^[A-Z]+$/", $splitted[0])) {
        $qstring = $searchfield." REGEXP '[^A-Z]$splitted[0][^A-Z]'";
}
else {
        $qstring = $searchfield." REGEXP '[^0-9]$splitted[0][^0-9]'";
}

$qstring is "searchfield REGEXP '[^A-Z]BUBU[^A-Z]'"

Code:

$i = 0;
while (++$i < count($splitted)) {
        if (!empty($splitted[$i])) {
                if (preg_match("/^[A-Z]+$/", $splitted[i])) {
                        $qstring .= "AND ".$searchfield." REGEXP '[^A-Z]$splitted[$i][^A-Z]'";  <- yes, i missed a "$"
                }
                else {
                        $qstring .= "AND ".$searchfield." REGEXP '[^0-9]$splitted[$i][^0-9]'";  <- yes, i missed a "$"
                }
        }
}

now $qstring is "searchfield REGEXP '[^A-Z]BUBU[^A-Z]' AND searchfield REGEXP '[^0-9]23[^0-9]' AND searchfield REGEXP '[^A-Z]GH[^A-Z]'"

let's say i have a table with one field only (searchfield):
Code:

searchfield
BUBU23GH
_BUBU23GH_
_ABUBU23GH_
_BUBU233GH_

if i run the query i only get "_BUBU23GH_", since the record "BUBU23GH" does not match the expressions '[^A-Z]BUBU[^A-Z]' and '[^A-Z]GH[^A-Z]'
how to get both "_BUBU23GH_" and "BUBU23GH" with the results?

i have 2 ways:
1) modify the table records adding a particular char (like _) in the first and last position, then trim them away when getting the results (VERY uncomfortable)
2) modify the regular expressions so that [^A-Z] and [^0-9] include the special chars ^ and $ (begginning and end of line)... is it possible?

i hope it's a bit clearer, now...

danielbmartin 09-11-2012 09:25 AM

Quote:

Originally Posted by masavini (Post 4777711)
i hope it's a bit clearer, now...

Not really.

It would help if you provide a sample input file and the desired output file.

Daniel B. Martin

masavini 09-11-2012 09:59 AM

ok, i try to...

sql to create a test table:
Code:

CREATE TABLE IF NOT EXISTS `test` (
  `searchfield` text
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO `test` (`searchfield`) VALUES
('BUBU23GH'),
('_BUBU23GH_'),
('_ABUBU23GH_'),
('_BUBU233GH_');

php code to test the regular expressions:
Code:

$string = "BUBU23GH";

$pattern = '/([A-Z])([0-9])/';
$replacement = '\\1 \\2';
$string = preg_replace($pattern, $replacement, $string);

$pattern = '/([0-9])([A-Z])/';
$replacement = '\\1 \\2';
$string = preg_replace($pattern, $replacement, $string);

$splitted = split('[ -. /\_]+',$string);

if (preg_match("/^[A-Z]+$/", $splitted[0])) {
        $qstring = "searchfield REGEXP '[^A-Z]$splitted[0][^A-Z]'";
}
else {
        $qstring = "searchfield REGEXP '[^0-9]$splitted[0][^0-9]'";
}

$i = 0;
while (++$i < count($splitted)) {
        if (!empty($splitted[$i])) {
                if (preg_match("/^[A-Z]+$/", $splitted[$i])) {
                        $qstring .= " AND searchfield REGEXP '[^A-Z]$splitted[$i][^A-Z]'";
                }
                else {
                        $qstring .= " AND searchfield REGEXP '[^0-9]$splitted[$i][^0-9]'";
                }
        }
}

$query = "SELECT searchfield FROM test WHERE ".$qstring;
$mysqli = new mysqli($dbip, $user, $pwd, $dbname);
$result = $mysqli->query($query);
$out = $result->fetch_row();
foreach ($out as $k) {
      printf("$k");
}

current output is:
_BUBU23GH_

desired output is:
BUBU23GH
_BUBU23GH_

YankeePride13 09-11-2012 12:51 PM

Here's your query:

SELECT searchfield FROM test WHERE
searchfield REGEXP '[^A-Z]BUBU[^A-Z]'
AND
searchfield REGEXP '[^0-9]23[^0-9]'
AND
searchfield REGEXP '[^A-Z]GH[^A-Z]'

The reason you do not get BUBU23GH in the output is because of the first condition. There, you are saying to select all searchField where the string does not start and does not end with a letter, which obviously BUBU23GH does.

masavini 09-11-2012 03:11 PM

Quote:

Originally Posted by YankeePride13 (Post 4777857)
The reason you do not get BUBU23GH in the output is because of the first condition. There, you are saying to select all searchField where the string does not start and does not end with a letter, which obviously BUBU23GH does.

sure, i know this!
is there a way to fix it?

i can't believe nobody ever had the need to specify a charset with beginning and end of lines... i mean something like [^A-Z$^]...

masavini 09-19-2012 08:03 AM

solved with:
Code:

        $unwanted = "[^0-9]";
        if (preg_match("/^[A-Z]+$/", $splitted[0])) {
                $unwanted = "[^A-Z]";
        }
        $qstring0 = "(".$searchfield1." REGEXP '$unwanted$splitted[0]$unwanted' OR ".$searchfield1." REGEXP '^$splitted[0]$unwanted' OR ".$searchfield1." REGEXP '$unwanted$splitted[0]$')";

        $i = 0;
        while (++$i < count($splitted)) {
                if (!empty($splitted[$i])) {
                        $unwanted = "[^0-9]";
                        if (preg_match("/^[A-Z]+$/", $splitted[$i])) {
                                $unwanted = "[^A-Z]";
                        }
                        $qstring0 .= " AND (".$searchfield1." REGEXP '$unwanted$splitted[$i]$unwanted' OR ".$searchfield1." REGEXP '^$splitted[$i]$unwanted' OR ".$searchfield1." REGEXP '$unwanted$splitted[$i]$')";
                }
        }

thanks for your support!


All times are GMT -5. The time now is 07:29 PM.