LinuxQuestions.org
Help answer threads with 0 replies.
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > General
User Name
Password
General This forum is for non-technical general discussion which can include both Linux and non-Linux topics. Have fun!

Notices


Reply
  Search this Thread
Old 01-12-2018, 06:19 PM   #1
mwx
Member
 
Registered: Jan 2009
Location: Virginia, USA
Distribution: CentOS - Ubuntu
Posts: 61

Rep: Reputation: 0
MYSQL Database learning project. I could use some help.


Ok. So let's forget for a moment that this concerns Wordpress - I don't think it has anything to do with what I have going on.

I built my dad a blog several years ago and after a brief amount of instruction, set him free to upload pictures and write to his hearts content. Several times he sort of crashed the site by trying to upload files with 140+ character file names - multiple dots (.) - slashes - extensions - you name it...

He also uploaded about 2000 pictures and WP being WP - it made 3 additional copies of each photo - so now we have 8000 photos on the server and no real way of telling what's what. What has been used and what hasn't. What is actively embedded in an article and what's not, etc.

There are a few utilities out there which were supposed to help with this exact problem - most of which have been abandoned by their developers. The only one that remotely works - only does about 1/4/ to 1/3 the job you wished it would... So I'm going to try to learn how to come at it from a different angle. Hopefully someone will be able to give me a few ideas and make this a "teach a man to fish" moment.

What I want to do - or at least think I want to do - is be able to query the DB tables of all active pages (roughly 240) and get a return of all the photos listed in the published articles. If I'm really lucky - I'll also get a list of what articles each one is inserted into.

Then I'd like to query the DB and find out which photos it has a record of - that are not in use - as I don't think WP scrubs those records. I think it keeps a record of every photo ever uploaded which still resides on the server and hasn't been deleted through its media manager.

Then - with those two data sets culled - I can begin to formulate a plan of attack as to how I'm going to go about cleaning up the DB and removing the old records - as well as removing the older, no longer in use photos. The site is bulky - and lags more than it should - and I'd like to shrink the size of the DB and make it a bit more portable and streamline it for speed and efficiency...

Here's the thing... I don't know the first thing about how to get started. How to formulate the search strings. How to isolate the data I'm looking for. How to manipulate it once I find it...

It's a bit of a project - so I thought I'd ask for a hand in getting started... Sound too crazy? I just need some logical breadcrumbs that will lead me in the right direction. I've never tried to tackle anything like this. I'm in my first year of Linux Sys Admin classes and thought that this might be a neat project to work through so that I can hopefully learn something and help someone else in the future...
 
Old 01-12-2018, 11:01 PM   #2
Sefyir
Member
 
Registered: Mar 2015
Distribution: Linux Mint
Posts: 578

Rep: Reputation: 266Reputation: 266Reputation: 266
I don't want to discourage learning sql.. but
I think with removing duplicate files there's a easier solution (which I just tested)
As always, backup before doing this and check afterwards.

From here

Quote:
Deleting Unused Media Files in WordPress

Removing unused media files in WordPress is easy. All you have to do is to select the “Unattached” category from the drop-down in the menu bar found in the media library:
unattached media files

It will now display all “unattached” media files; that is, it will display all media files that aren’t being used anywhere on your site. You can delete them all, but it’s recommended that you create a backup of your wp-content/uploads directory first so you can restore them in case something goes wrong.
Edit: If the intention is to learn sql, you can use the above method to perhaps compare the two.
To start with basics though, you can try Sololearns course on SQL
https://www.sololearn.com/Course/SQL/

Last edited by Sefyir; 01-12-2018 at 11:06 PM.
 
Old 01-13-2018, 07:55 AM   #3
mwx
Member
 
Registered: Jan 2009
Location: Virginia, USA
Distribution: CentOS - Ubuntu
Posts: 61

Original Poster
Rep: Reputation: 0
If only it were that easy. Selecting that option returns zero results.

Now I'm starting to think that I'll have to poll the DB and get listings of active or attached photos - then find or write a script which will traverse the directory structure and list the contents of all of the media folders - compare the two results and then manually delete the additional unused photos.
 
Old 01-13-2018, 10:02 AM   #4
Habitual
LQ 5k Club
 
Registered: Jan 2011
Location: Yawnstown, Ohio
Distribution: High Sierra
Posts: 8,958
Blog Entries: 36

Rep: Reputation: Disabled
Quote:
Originally Posted by mwx View Post
If only it were that easy. Selecting that option returns zero results.
they may be buried in the wp_posts table.
I don't always use a CMS,
but when I do, it's Wordpress.

And I never leave home w\out https://codex.wordpress.org
 
Old 01-14-2018, 12:59 PM   #5
mwx
Member
 
Registered: Jan 2009
Location: Virginia, USA
Distribution: CentOS - Ubuntu
Posts: 61

Original Poster
Rep: Reputation: 0
So - using (find) I was able to pipe a list of all the files in the media directories out to a text file... Turns out my problem is considerably worse than I had first suspected. I thought there were 8000 photos? Nope... More like 24000 individual images scattered across multiple directories...

At least I have them all in a file which can be sorted. Next step is to get into the DB and see if I can get a listing of what's in use and what's not..

We shall see. I'll report back what I find... One lesson of the day is that I've been able to use basic Linux thus far to answer most of the questions. Have yet to learn any SQL... That'll be next I guess.
 
Old 01-14-2018, 05:42 PM   #6
Habitual
LQ 5k Club
 
Registered: Jan 2011
Location: Yawnstown, Ohio
Distribution: High Sierra
Posts: 8,958
Blog Entries: 36

Rep: Reputation: Disabled
Section 9.3.1
I'd scan all such unaccounted for content with clamscan because "as it turns out" media attachments are notoriously abused.
Upload one file in an incorrectly perm'd owner:group directory, and you could be in a world of hurt. I hope not ;)

Hardening Wordpress properly adhered to will be a good start on Securing your Wordpress.

Use Images says that some posts have attachments and those may account for the discrepancy in file count, however...
Use the codex. - It **is** the Official Source


Good Luck on your learning objectives. :)

Peace.
 
Old 01-14-2018, 10:16 PM   #7
Sefyir
Member
 
Registered: Mar 2015
Distribution: Linux Mint
Posts: 578

Rep: Reputation: 266Reputation: 266Reputation: 266
Quote:
I thought there were 8000 photos? Nope... More like 24000 individual images scattered across multiple directories...
Wordpress will by default create multiple duplicate images of varying sizes (for convenience?)
It is likely many of those images are duplicates, but with different resolution.
 
  


Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
i have small project of php script and mysql database, want to configure in .deb file bhanu055 Linux - Software 1 07-25-2011 10:25 PM
How can i copt the MYSQL Database to a other server (Database) halvorls Linux - Server 3 07-27-2008 09:19 PM
Linux Distro for Database Learning Andriy Linux - General 7 09-18-2007 10:47 PM
Senior Project - Learning Linux certainly lisa Linux - General 8 08-13-2007 10:38 PM
Learning a Database lambmt Linux - Software 2 03-03-2004 04:11 PM

LinuxQuestions.org > Forums > Non-*NIX Forums > General

All times are GMT -5. The time now is 08:50 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration