LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Server (https://www.linuxquestions.org/questions/linux-server-73/)
-   -   Database Query Help (https://www.linuxquestions.org/questions/linux-server-73/database-query-help-4175669617/)

xray55 02-14-2020 03:50 PM

Database Query Help
 
Hi guys,

I will be as brief as possible, I converted a 17 yr old vbulletin forum into Xenforo. The importer worked fine but the redirect script didnt. That means any links in the posts that point to other places on forum are now bad links.

Here is an example of previous vBulletin thread link

Code:

domain.com/forums/showthread.php?69985-daily-news

and it needs to change to look like this

Code:

domain.com/forums/index.php?threads/.69985

I cant touch 69985 because every thread or post will have a different ID which is fine.

So I need to replace showthread.php? with index.php?threads/.



which I can and haven't done so yet

db table: xf_post
db index: message

Code:

UPDATE xf_post SET message = REPLACE(message, 'showthread.php?', 'index.php?threads/.');

If I do that, I end up with

Code:

domain.com/forums/index.php?threads/.69985-daily-news
which is almost correct but I need to remove the label for it to work

-daily-news



So every thread will have a different ID which I cant remove but I need to remove that thread label.

I need a clever way to mass remove all the different labels from different threads.

Thanks in advance for any help

boughtonp 02-14-2020 05:21 PM

You haven't said which database software you're using, so I'm assuming MariaDB.

You can use regex replace to capture the numeric portion, then append it like so:

Code:

REGEX_REPLACE( message , 'showthread\\.php\\?([0-9]+)' , 'index.php?threads/.\\1' )
At present, that would give you the same result as your current replace.

There's two options for handling the label.

One option is to put an ampersand on the end of the replacement:
Code:

'index.php?threads/.\\1&'
That maintains the label information for users, but I don't know if it works with Xenforo (given that it's using "/." instead of "=", it may well claim the entire QS.)

The other option is to change the pattern to include the label:
Code:

'showthread.php\\?([0-9]+)-[\w-]+'
Depending on how the label is generated, it might need other characters added to the character class.


Another option is to not touch the database at all, and solve this using URL rewriting in your web server - you'd use similar regex patterns and replacements (but with only single backslashes instead of the doubles that MariaDB requires).


xray55 02-16-2020 02:38 AM

Hi,

Sorry for the late response. Im using mysql57 under centos 7.

It looks like there is no double slant escapes used

heres one copied from the xenforo forum post table viewed in text editor, this another example post from vbulletin that now sits in xenforo post table and needs converted over, the label does end with a backslash but i think that can be ignored since its trailing.


Code:

[URL=\"https://domain/forums/showthread.php?66750-CURRENT-VIP-PAYMENT-PAGE\"]

I planned on removing the label entirely cause I thought it would be easier to do to look like this

Code:

/forums/index.php?threads/.66750/



but xenforo does allow label as option but it has to be moved before the thread ID number like this and if u dont included it, it redirects to it based on what the current thread title is.

Code:

/forums/index.php?threads/current-vip-payment-page.66750/

and viewing the table you are right, there are a lot of threads in table ending in t=thread#
Code:

forums/showthread.php?t=66750
I think these are the older vb3 links that also need addressed. vb4 doesnt use the = sign

xray55 02-16-2020 03:43 PM

1 Attachment(s)
So I tried this query

Code:

REGEX_REPLACE( message , 'showthread.php\\?([0-9]+)-[\w-]+' , 'index.php?threads/.\\1&' )

regex drops to a tree, what other info am i suppose to enter there

xray55 02-17-2020 02:24 PM

any help with this would very much appreciated.

boughtonp 02-17-2020 04:46 PM

Quote:

Originally Posted by xray55 (Post 6090863)
So I tried this query

Code:

REGEX_REPLACE( message , 'showthread.php\\?([0-9]+)-[\w-]+' , 'index.php?threads/.\\1&' )

regex drops to a tree, what other info am i suppose to enter there

That's not a query, it's a single function call / expression that takes the place of the second part of your set statement in your original query.

i.e:
Code:

UPDATE xf_post SET message = REPLACE(...)
changes to
Code:

UPDATE xf_post SET message = REGEX_REPLACE(...)
But before you do that, it's often a good idea to test these types of things with by previewing the result in a select statement, e.g:
Code:

SELECT message , message2=REGEX_REPLACE(...) FROM xf_post WHERE message LIKE '%showthread.php%' LIMIT 0,10
Then you can check that the replace works as expected.

Another version of that is to duplicate the existing column first, then you can modify the duplicate but still have an easy escape route if things doesn't work.


boughtonp 02-17-2020 04:58 PM

To put the title before the id number, you could use something like:
Code:

REGEX_REPLACE( message , 'showthread\\.php\\?([0-9]+)-([\\w-]+)' , 'index.php?threads/\\2.\\1' )
Although I'd also recommend explicitly checking for the site URL - you don't want to change external links...
Code:

REGEX_REPLACE( message , '(https://domain/forums/)showthread\\.php\\?([0-9]+)-([\\w-]+)' , '\\1index.php?threads/\\3.\\2' )
Though if you've got both absolute and relative links might need something a bit more than that.

xray55 02-17-2020 05:04 PM

but I notice you're just using the field name message while the table name is xf_post?

are u for hire?

boughtonp 02-17-2020 05:26 PM

Quote:

Originally Posted by xray55 (Post 6091206)
but I notice you're just using the field name message while the table name is xf_post?

I'm using the same names as in your first post, but trying to keep the code snippets simple to help with understanding.

Quote:

are u for hire?
Potentially - feel free to send a PM to discuss, (though it might take me a couple of days to respond).


xray55 02-17-2020 05:45 PM

I dont see PM button in your profile, I added as friend to see if it allows after. U try Pming me or u can also contact me by email or skype jaystevens5555

gmail

scasey 02-17-2020 07:29 PM

Quote:

Originally Posted by xray55 (Post 6091223)
I dont see PM button in your profile, I added as friend to see if it allows after. U try Pming me or u can also contact me by email or skype

gmail

I'll leave it to a mod to explain why such things (private solutions to questions) are frowned on at LQ, but I will point out that PM is not available in either direction until one has a certain number of posts (I believe its 100, but you can find that in the FAQs yourself).

It's your data, the documentation for mysql is excellent. You've been given some excellent suggestions.
Make a copy of the data and experiment.
Read up on regular expressions.

You is spelled y-o-u.

xray55 02-17-2020 07:44 PM

Hello Scasey,

Ya understood and long time no see. You saved me previously on another DB query that didnt require regex.

Well I'm doing something wrong cause pasting in the Regex command will drops down to a tree, I guess I'm not understanding.



After searching through the DB, it also appears that I need to be specific with the domain as there's other domain entries not needed for replacement.




I need figure out 3 Regex commands and I was perfectly clear.

This is exactly what I need to do.



Code:

FIND
domain1.com/forums/showthread.php?t=

REPLACE
domain1.com/forums/index.php?threads/.



Code:

FIND
domain1.com/forums/showthread.php?69-email-addy    (leave thread ID# but get rid of the label and these will differ from thread to thread)

REPLACE
domain1.com/forums/index.php?threads/.



Code:

FIND
domain1.com/forums/showthread.php?

REPLACE
domain1.com/forums/index.php?threads/.


scasey 02-17-2020 08:27 PM

The interactive sql command is “dropping to a tree” because it’s waiting for the semicolon to terminate the statement.

xray55 02-17-2020 08:38 PM

Quote:

Originally Posted by scasey (Post 6091261)
The interactive sql command is “dropping to a tree” because it’s waiting for the semicolon to terminate the statement.


Ok, understood but;


Code:

TO FIND
domain1.com/forums/showthread.php?69-email-addy    (leave thread ID# but get rid of the label and these will differ from thread to thread)

FIND REPLACE
domain1.com/forums/index.php?threads/.


Use

Code:

REGEX_REPLACE( message , '(https://domain1.com/forums/)showthread\\.php\\?([0-9]+)-([\\w-]+)' , '\\1index.php?threads/\\3.\\2' )


db table: xf_post
db index: message


Shouldn't the table name be included in the Regex command?

scasey 02-17-2020 08:45 PM

Quote:

Originally Posted by xray55 (Post 6091264)
Shouldn't the table name be included in the Regex command?

Id have to look it up to answer that question, but you can do that, too. Maybe bookmark that page for future reference.
Those documents are better than any answer you’ll get here...but we’re here to help if you don’t understand something you find there.

That said, #6 is a very complete response. Read it again. Carefully.


All times are GMT -5. The time now is 02:28 AM.