06-29-2005, 08:03 PM
|
#1
|
LQ Newbie
Registered: Dec 2004
Location: Great Falls, Montana
Distribution: Mandrake 10.1, Fedora C3,4. Red Hat 8, LTSP
Posts: 16
Rep:
|
wget a random file on phone server
Howdy all. I've never posted here, but I will admit I use LinuxQuestions as my bible. I've learned a lot and gotten a lot of my answers here, so thanks in advance for your help.
Here's my situation:
I have a server filled with a bunch of audio files. I would like to be able to pull several of the files off at one time, but in a random order. The server is a phone server, and it records all the phone calls that come into the call center. I have to do call evals on each person once a week, so instead of getting 30 files by hand, I would like to have a script that would wget one audio file for each agent for that week. The files are named on the server in a very complicated way; here's an example: agent-10-1117354150-4709.gsm. I've tried to randomize that last set of numbers, but I haven't been able to get it working yet.
Is there any command that will go to the server and download, say, agent-10-111735*?
Any help is greatly appreciated.
Thanks.
|
|
|
06-29-2005, 10:38 PM
|
#2
|
Senior Member
Registered: Dec 2002
Location: Atlantic City, NJ
Distribution: Ubuntu & Arch
Posts: 3,503
Rep:
|
Okay, so the first two sets of digits (e.g. 10-1117354150) show which employee the recording belongs to, right? You want to get 30 random recordings but only one per employee? Also, this is all going to be done remotely, correct? You don't have direct access to the server itself?
I'm just trying to get this straight here. I think what I've deduced is correct but I want to double check.
|
|
|
06-30-2005, 04:22 AM
|
#3
|
LQ Newbie
Registered: Dec 2004
Location: Great Falls, Montana
Distribution: Mandrake 10.1, Fedora C3,4. Red Hat 8, LTSP
Posts: 16
Original Poster
Rep:
|
Correct, the first two digits are the agent's ID and the second set is a random number created by the server to store the files. I haven't been able to find a pattern in the way it creates that number, except for the first few digits (1117), so when I wget, somewhere in the command line there is going to have to be a wildcard (wget agent-10-*). You're also correct that I'm not going to have direct access to the server, although I may be able to gain access if it would help.
|
|
|
06-30-2005, 06:41 AM
|
#4
|
Senior Member
Registered: Dec 2002
Location: Atlantic City, NJ
Distribution: Ubuntu & Arch
Posts: 3,503
Rep:
|
Well, if you were able to get a listing of all the possible files to download you could easily process them the way you want. I'm just not sure how to get a listing of remote files to download.
How do you download the files? FTP or HTTP?
One last thing. Reading your post above, I'm wondering why you would wget agent-10-*. Wouldn't that get you all the files having to do with one employee? I thought you didn't want more than one file from each employee?
Also, could you possibly post a data sample of the audio files? I'm sure it would be huge, but I think it would help to see what you're dealing with.
Last edited by Crashed_Again; 06-30-2005 at 07:47 AM.
|
|
|
06-30-2005, 11:23 AM
|
#5
|
LQ Newbie
Registered: Dec 2004
Location: Great Falls, Montana
Distribution: Mandrake 10.1, Fedora C3,4. Red Hat 8, LTSP
Posts: 16
Original Poster
Rep:
|
Yeah, I've been trying to think of how to pull a list of the files that are on it, but I haven't been having any luck. All the files have to be downloaded over HTTP.
As for wget agent-10-*: since I'm doing call evals, it has to be a random call to be "fair" for everyone, just so none of my supervisors think that I'm favoring anyone. I was hoping that I could pull one file off the server for one agent, stop there, and move on to the next agent.
Also, as far as a data sample, what kind would you like? Really all I can give you is a sample list of the audio files. I'll post that next. If you would like anything else, let me know. It's great to have someone helping me; I've been giving myself a headache over this for around a week now.
|
|
|
06-30-2005, 11:46 AM
|
#6
|
LQ Newbie
Registered: Dec 2004
Location: Great Falls, Montana
Distribution: Mandrake 10.1, Fedora C3,4. Red Hat 8, LTSP
Posts: 16
Original Poster
Rep:
|
Here's a list of the audio files so you can see the names of the files I've been working with:
agent-82-1117350474-4707.gsm
agent-10-1117354150-4709.gsm
agent-10-1117354173-4711.gsm
agent-10-1117357683-4713.gsm
agent-10-1117358274-4715.gsm
agent-82-1117359132-4720.gsm
agent-38-1117370905-4729.gsm
agent-38-1117370987-4734.gsm
agent-38-1117371411-4736.gsm
agent-38-1117371537-4739.gsm
agent-46-1117371692-4744.gsm
agent-38-1117372138-4749.gsm
agent-46-1117372206-4756.gsm
agent-46-1117372802-4764.gsm
agent-46-1117372972-4769.gsm
agent-46-1117373758-4774.gsm
agent-78-1117376160-4780.gsm
agent-78-1117378364-4788.gsm
agent-46-1117378444-4793.gsm
agent-38-1117379386-4795.gsm
agent-46-1117379472-4800.gsm
agent-78-1117379535-4805.gsm
agent-46-1117379770-4815.gsm
agent-46-1117380137-4821.gsm
agent-78-1117380338-4828.gsm
agent-26-1117380475-4831.gsm
agent-26-1117380596-4836.gsm
agent-78-1117380693-4838.gsm
agent-78-1117380836-4840.gsm
agent-78-1117380917-4842.gsm
agent-26-1117380921-4844.gsm
agent-46-1117381024-4846.gsm
agent-78-1117381066-4848.gsm
agent-78-1117381139-4850.gsm
agent-78-1117381237-4855.gsm
agent-46-1117381259-4857.gsm
agent-26-1117381343-4859.gsm
agent-46-1117381390-4861.gsm
agent-46-1117381483-4874.gsm
agent-46-1117381802-4876.gsm
agent-26-1117381888-4881.gsm
OK, that should be enough copying and pasting. This is the call order from yesterday over about a one-hour period. What I'd like is to wget agent-26-11173*, then move on to the next agent, agent-27-11173*, and so on until I have one call from each agent. I've already tried a randomized set of numbers, but that never actually hit a valid file in the wget because the server doesn't save the files in any particular order.
I hope this helps.
|
|
|
06-30-2005, 01:12 PM
|
#7
|
Senior Member
Registered: Dec 2002
Location: Atlantic City, NJ
Distribution: Ubuntu & Arch
Posts: 3,503
Rep:
|
Is this a public server that we could access? I need to see an example of the HTML page that is generated to download the files. I think you would have to write some code that would parse the HTML and pull out each file name. I know how to do this, but I need some HTML to work from.
Can I look at the page you are downloading from, or can you give me another example?
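A minimal sketch of that parsing step, assuming the server returns a plain HTML index page whose links point at the .gsm files. The sample markup is invented for illustration, since the real page was never posted and may look quite different; `GsmLinkParser` and `extract_filenames` are hypothetical names.

```python
from html.parser import HTMLParser


class GsmLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a .gsm file."""

    def __init__(self):
        super().__init__()
        self.filenames = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.endswith(".gsm"):
                # Keep only the last path component, i.e. the file name.
                self.filenames.append(value.rsplit("/", 1)[-1])


def extract_filenames(html_text):
    """Return the .gsm file names linked from an HTML page, in page order."""
    parser = GsmLinkParser()
    parser.feed(html_text)
    return parser.filenames


# HYPOTHETICAL index page; the real server's markup is unknown.
SAMPLE_HTML = """
<html><body>
<a href="../">Parent Directory</a>
<a href="agent-10-1117354150-4709.gsm">agent-10-1117354150-4709.gsm</a>
<a href="/calls/agent-82-1117350474-4707.gsm">agent-82-1117350474-4707.gsm</a>
</body></html>
"""

if __name__ == "__main__":
    for name in extract_filenames(SAMPLE_HTML):
        print(name)
```

If the page builds its links some other way (a form, JavaScript, a table of plain text), the parser would need to look at different tags, which is why a real sample of the page matters.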
|
|
|
06-30-2005, 01:48 PM
|
#8
|
LQ Newbie
Registered: Dec 2004
Location: Great Falls, Montana
Distribution: Mandrake 10.1, Fedora C3,4. Red Hat 8, LTSP
Posts: 16
Original Poster
Rep:
|
Sorry, it's not a public server. What are you looking for in the HTML? I may be able to give you some of that.
|
|
|
06-30-2005, 04:38 PM
|
#9
|
Member
Registered: Feb 2005
Location: Sunnyvale, CA
Distribution: Ubuntu
Posts: 205
Rep:
|
Hi,
If you are familiar with coding web pages in PHP, why not create a simple page that algorithmically generates the file names to be downloaded and then sends the specific files to you one at a time through a browser? I am sure that by using loops and the PHP glob() and rand() functions you can generate a set of file names that is random but still gives you complete coverage. Once you have a file name, you can send a "Location" header with PHP's header() function to redirect the browser to the file, which then saves it like a normal web download.
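The glob-and-random idea does not depend on PHP. Here is a rough Python sketch of the same selection step, assuming you can run a script on the server itself and that all recordings sit in one directory. `pick_one_per_agent` and the directory path are made-up names for illustration, not anything the phone server provides.

```python
import glob
import os
import random


def pick_one_per_agent(directory, agent_ids):
    """Return {agent_id: path} with one randomly chosen recording per agent.

    Agents with no matching recording are left out of the result.
    """
    picks = {}
    for agent_id in agent_ids:
        # The trailing '-' keeps agent 1 from matching agent 10's files.
        pattern = os.path.join(directory, "agent-%s-*.gsm" % agent_id)
        matches = glob.glob(pattern)
        if matches:
            picks[agent_id] = random.choice(matches)
    return picks


if __name__ == "__main__":
    # HYPOTHETICAL directory; replace it with wherever the server stores calls.
    chosen = pick_one_per_agent("/var/spool/calls", ["10", "26", "82"])
    for agent, path in sorted(chosen.items()):
        print(agent, path)
```

The chosen paths could then be copied to a download area or served through a web page, whichever is easier on that server.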
|
|
|
06-30-2005, 05:08 PM
|
#10
|
LQ Newbie
Registered: Dec 2004
Location: Great Falls, Montana
Distribution: Mandrake 10.1, Fedora C3,4. Red Hat 8, LTSP
Posts: 16
Original Poster
Rep:
|
Sadly, I don't know PHP. I'm trying to get all this done in a shell script. Since all of this is giving me a headache, I may just look into PHP and see what I can get done. Got any examples to get me started?
|
|
|
06-30-2005, 07:40 PM
|
#11
|
Member
Registered: Aug 2004
Location: Todd Mission Texas
Distribution: Linspire
Posts: 215
Rep:
|
It should give you a headache. If it was easy your girlfriend would do it for you.
Looks like you know what you want to happen but not how to get there. I would suggest that now is the time to put the computer aside and get out the old-fashioned pencil and paper. Something you can scratch through and draw long arrows on to move things down the page; it only needs to make sense to you, not to anyone else at present.
What you need is a flow chart. Use your pencil and pad to develop it.
A flow chart should give you several things: a check that your logic is correct; the program flow, with the result you expect from each function and where the program goes if the expected result fails; and, by using long function names, a record of what each function does.
Hint: When you name a function, start it with a number. example:
3 Extreeeeeeeeeeeeemllllllllllllly long descriptive function name
Later on, if you need to loop back to this function, it is easier to write 'goto 3' than the long name.
When you have the outline finished, post it here. We can continue with writing the functions that are giving you trouble.
HTH
Dave
|
|
|
06-30-2005, 08:13 PM
|
#12
|
LQ Newbie
Registered: Dec 2004
Location: Great Falls, Montana
Distribution: Mandrake 10.1, Fedora C3,4. Red Hat 8, LTSP
Posts: 16
Original Poster
Rep:
|
OK, I know what I want, and I know about 80% of how to get there.
Today I was googling and found a better command to use instead of wget: curl. But as I was playing around with curl, I came across a small problem. It will write an output file even if the file doesn't exist on the server. Is there any way to make curl write an output file only if it finds the file on the server? Say:
if [ -f http://server/calls/agent-81-1777(00000-99999)-(0000-9999).gsm ]
then
curl http://server/calls/agent-81-1777(whatever numbers it found to be valid).gsm -O
If I do curl http://server/calls/agent-81-1777[0-9999] -O it writes an output file for every number in the range, including the one valid file. So basically, how would I get it to verify that the file is valid before it copies it to my computer? I hope I'm making sense; it's the end of the day and my brain has had it. I'm gonna go home and have a beer. Thanks again, all.
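One way to avoid the junk output files is to let the HTTP status decide whether anything gets written. A minimal Python sketch, assuming the server answers with an error status such as 404 for names that do not exist. `fetch_if_exists` is a hypothetical helper, and its `opener` parameter exists only so the function can be tried without a live server.

```python
import urllib.error
import urllib.request


def fetch_if_exists(url, dest, opener=urllib.request.urlopen):
    """Save url to dest only if the server returns it; report success as a bool."""
    try:
        with opener(url) as response:
            data = response.read()
    except urllib.error.URLError:
        # Covers HTTP error statuses (HTTPError is a subclass of URLError,
        # e.g. 404) as well as an unreachable server. Nothing is written.
        return False
    with open(dest, "wb") as out:
        out.write(data)
    return True
```

Calling `fetch_if_exists("http://server/calls/" + name, name)` in a loop over candidate names would leave only the files that really exist on disk. Guessing names this way still means one request per candidate, so a real listing of the files remains the better route if one can be had.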
|
|
|
06-30-2005, 08:39 PM
|
#13
|
Senior Member
Registered: Dec 2002
Location: Atlantic City, NJ
Distribution: Ubuntu & Arch
Posts: 3,503
Rep:
|
Well here is my idea and the reason I asked for the sample html page.
The first step is to get a list of all the files available to download. This is why you would need to parse the HTML to get the file names. The next step is to randomly select one file from every employee. From that data set you could randomly select 30 files and download them one at a time.
I think I could implement this in Python, but I would need a sample of the HTML from the page that serves the files.
|
|
|
06-30-2005, 10:42 PM
|
#14
|
Member
Registered: Aug 2004
Location: Todd Mission Texas
Distribution: Linspire
Posts: 215
Rep:
|
Last edited by Dave Kelly; 06-30-2005 at 10:53 PM.
|
|
|
07-01-2005, 11:00 AM
|
#15
|
Senior Member
Registered: Dec 2002
Location: Atlantic City, NJ
Distribution: Ubuntu & Arch
Posts: 3,503
Rep:
|
Okay check this out:
Code:
#!/usr/bin/env python3
import random
import subprocess

# SET THIS TO THE URL WHERE YOU DOWNLOAD THE FILES FROM.
# DON'T FORGET THE / AT THE END OF THE URL.
url = 'http://www.siteyourdownloadingfrom.com/directory/files/'

files = ['agent-10-1117354150-4709.gsm', 'agent-10-1817354150-4709.gsm', 'agent-10-1114554150-4709.gsm',
         'agent-11-4567354150-4709.gsm', 'agent-11-1165454150-4709.gsm', 'agent-11-1458654150-4709.gsm',
         'agent-12-1117534550-4709.gsm', 'agent-12-1345674150-4709.gsm', 'agent-12-1116785450-4709.gsm',
         'agent-13-1132454350-4709.gsm', 'agent-13-1454324450-4709.gsm', 'agent-13-3454363150-4709.gsm',
         'agent-14-1134534150-4709.gsm', 'agent-14-3453354150-4709.gsm', 'agent-14-1113454550-4709.gsm',
         'agent-15-1767654150-4709.gsm', 'agent-15-1456456150-4709.gsm', 'agent-15-1456454150-4709.gsm',
         'agent-16-4621354150-4709.gsm', 'agent-16-1113453330-4709.gsm', 'agent-16-1114564450-4709.gsm',
         'agent-17-1523454150-4709.gsm', 'agent-17-1115454544-4709.gsm', 'agent-17-1117345640-4709.gsm',
         'agent-18-1153244440-4709.gsm', 'agent-18-1114545320-4709.gsm', 'agent-18-1135343150-4709.gsm',
         'agent-19-6324554335-4709.gsm', 'agent-19-1164545150-4709.gsm', 'agent-19-1163463350-4709.gsm',
         'agent-20-1115434555-4709.gsm', 'agent-20-1115443330-4709.gsm', 'agent-20-1766777450-4709.gsm',
         'agent-21-1115345454-4709.gsm', 'agent-21-1117354545-4709.gsm', 'agent-21-1114564550-4709.gsm',
         'agent-22-4534544150-4709.gsm', 'agent-22-1453354150-4709.gsm', 'agent-22-1117456350-4709.gsm',
         'agent-23-1134545150-4709.gsm', 'agent-23-4534454150-4709.gsm', 'agent-23-1134545150-4709.gsm',
         'agent-24-1455223330-4709.gsm', 'agent-24-1345454150-4709.gsm', 'agent-24-1116556560-4709.gsm',
         'agent-25-5634545440-4709.gsm', 'agent-25-3454224150-4709.gsm', 'agent-25-1117375656-4709.gsm',
         'agent-26-1145454634-4709.gsm', 'agent-26-1154545450-4709.gsm', 'agent-26-1254556345-4709.gsm',
         'agent-27-1116346666-4709.gsm', 'agent-27-1113456633-4709.gsm', 'agent-27-1113456777-4709.gsm',
         'agent-28-5845865485-4709.gsm', 'agent-28-1117345577-4709.gsm', 'agent-28-3434534322-4709.gsm',
         'agent-29-1485958757-4709.gsm', 'agent-29-1534518676-4709.gsm', 'agent-29-1463346740-4709.gsm',
         'agent-30-1958403048-4709.gsm', 'agent-30-1176574565-4709.gsm', 'agent-30-1444454150-4709.gsm',
         'agent-31-1587639490-4709.gsm', 'agent-31-1175745670-4709.gsm', 'agent-31-1116734550-4709.gsm',
         'agent-32-9847364857-4709.gsm', 'agent-32-1117674566-4709.gsm', 'agent-32-1543234421-4709.gsm',
         'agent-33-9564345857-4709.gsm', 'agent-33-1456456546-4709.gsm', 'agent-33-1654645221-4709.gsm',
         'agent-34-9847564565-4709.gsm', 'agent-34-1117776567-4709.gsm', 'agent-34-1144456354-4709.gsm',
         'agent-35-9456456456-4709.gsm', 'agent-35-7567765464-4709.gsm', 'agent-35-1656435421-4709.gsm',
         'agent-36-4345344857-4709.gsm', 'agent-36-1767347546-4709.gsm', 'agent-36-1654643321-4709.gsm',
         'agent-37-9645432237-4709.gsm', 'agent-37-1117856807-4709.gsm', 'agent-37-1145454644-4709.gsm',
         'agent-38-9865748940-4709.gsm', 'agent-38-0594736455-4709.gsm', 'agent-38-1564563255-4709.gsm',
         'agent-39-9847564645-4709.gsm', 'agent-39-1456456565-4709.gsm', 'agent-39-1165678896-4709.gsm',
         'agent-40-9834444857-4709.gsm', 'agent-40-1565345577-4709.gsm', 'agent-40-6563345781-4709.gsm',
         'agent-41-2424677347-4709.gsm', 'agent-41-6456734554-4709.gsm', 'agent-41-1145534524-4709.gsm',
         'agent-42-9845456367-4709.gsm', 'agent-42-1115434523-4709.gsm', 'agent-42-7674545671-4709.gsm',
         'agent-43-9345345345-4709.gsm', 'agent-43-1656564324-4709.gsm', 'agent-43-1146456345-4709.gsm']

dic = {}
for entry in files:
    # GET THE FIRST NUMBER AFTER 'agent-'
    number = entry.split('-')[1]
    # IF THE DICTIONARY ALREADY HAS THE 'number' IN IT
    # APPEND THE 'entry' TO THE 'number' KEY IN 'dic'
    if number in dic:
        dic[number].append(entry)
    # OTHERWISE CREATE THE DICTIONARY KEY
    # USING THE 'number' AND ADD THE ENTRY
    else:
        dic[number] = [entry]

tmp_list = []
# RANDOMLY CHOOSE ONE FILE FROM EACH EMPLOYEE
for vals in dic.values():
    tmp_list.append(random.choice(vals))

final_list = []
# NEVER ASK FOR MORE FILES THAN THERE ARE EMPLOYEES,
# OTHERWISE THE LOOP BELOW WOULD NEVER END
wanted = min(30, len(tmp_list))
counter = 1
while counter <= wanted:
    # RANDOMLY CHOOSE ONE OF THE FILES
    fn = random.choice(tmp_list)
    # IF THE FILE IS ALREADY IN THE LIST
    # START THE LOOP OVER AGAIN
    if fn in final_list:
        continue
    # OTHERWISE REMEMBER THE FILE AND DOWNLOAD IT
    final_list.append(fn)
    print('Downloading %s...' % fn)
    # DOWNLOAD THE FILE USING WGET.
    # A NONZERO EXIT STATUS MEANS THE DOWNLOAD FAILED.
    try:
        status = subprocess.call(['wget', url + fn])
    except OSError:
        # WGET ITSELF COULD NOT BE STARTED
        status = -1
    if status != 0:
        print("Could not download %s. Server down? File doesn't exist?" % fn)
    else:
        print('%s finished downloading.\n' % fn)
    # INCREASE THE COUNTER
    counter += 1

print('\n\nDownloading complete. Have a nice day!')
I tried to comment as much as I could so you can follow what's going on. This should do the trick for you. The only problem is that, as you can see, I provided the data set of file names myself. To write something that parses the HTML and extracts the file names available for download, I'd need to see the HTML of the page that you download from.
Last edited by Crashed_Again; 07-01-2005 at 12:07 PM.
|
|
|