Hi all,
I am looking for suggestions on optimizing this SQL statement:
Code:
SELECT ip,sum(packets)
FROM(
(SELECT dst_ip AS ip,sum(dst_packets) AS packets
FROM flows
WHERE interval='2005-02-01 00:00:00'
GROUP BY dst_ip)
UNION ALL
(SELECT src_ip AS ip,sum(src_packets) AS packets
FROM flows
WHERE interval='2005-02-01 00:00:00'
GROUP BY src_ip) )
AS stats
GROUP BY stats.ip
HAVING sum(packets)>0
ORDER BY sum(packets) DESC
There must be some way to combine the two subqueries in to a single query.
Here is the basic gist of what is happening...
I have flow level records, where I am interested in finding the number of packets sent by IP addresses, and the format is:
Code:
src_ip dst_ip src_packets dst_packets
500 600 1 2
700 500 1 2
Such that src_ip sent src_packets and dst_ip sent dst_packets. Therefore, the result of my current query is:
The first subquery adds adds up the packet counts for the destination IPs, and the second subquery adds up the packet counts for the source IPs. Then the top level query sums the union.
Theres got to be a way to optimize this to perform it in one pass through the data, since it is hundreds of thousands of rows it does make a difference.
I'd greatly appreciate any suggestions
Thanks!
George