SQL - Get the correct data
I admit it; my SQL is way too rusty for these queries...
The situation is simple, and actually something I recently thought up and started to implement. We have a bunch of (apache) proxy servers, and from those I load the logs into my database with a simple perl script that's gonna replace me :-) I took the configuration from the conf.d/*.conf files and extracted the <Location /bla/bla/bla> and stored those in my table 'configured_locations' (configured_locations.location is varchar(255) ) I also have a table logs that's actually a bit more complex in setup it references a bunch of other tables and to make that "human readable" I designed this view: Code:
View "public.showlog" Now I want to query how much each location as defined in the configured_locations.location appears "like" in the logs in a specified period (using showlog.date) I thought this query would tell me that, but the outcome must be wrong as I have about 8M records in my table and the sum of the results was 420M :-) Code:
select configured_locations.location,count(0) Probably not really necessary to tell, but we're talking a PostgreSQL 8.4 database. |
I would put the showlog matching in a sub-select thus:
Quote:
|
Thanks for the quick reply; Had to alter it a bit, as the query wanted a group-by... But this query does not give what I expected; Maybe I ought to clarify a bit better:
Given these configured_locations.location: /site1/service1 /site2/service1 /site3/service2 And these showlog.locations (where envir is not that relevant) /site1/service1?status /site1/service1/page1.html /site2/service1/service /site3/service2/service4 /site3/service2/service /site3/service2?status I'd like to do the query that will give these results: /site1/service1 2 /site2/service1 1 /site3/service2 3 So how often (within a given period) each configured_location "appears" as a substring of the location string in showlog The query you gave does some sort of counting of how often a location appears in configured_locations where they appear in the log; but the count is on the amount of hits in the configured_locations table. |
I think I may have the correct query; not sure if this is "it" though:
Code:
Select configured_locations.location,count(0) |
Got close; have yet to determine a way to fix a few things
The query I came up with is now as follows: Code:
Select configured_locations.location,count(0) /site1/service1 /site1/service12 would both match the like 'site1/service1%', besides: having N proxies in the given environment, multiplies the outcome with factor N; at least: that's what the data looks like right now. Gonna think about how to overcome these things ;-) |
All times are GMT -5. The time now is 01:52 PM. |