Linux - Software: This forum is for Software issues.
Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.
Funny, I was a professional sound engineer for 7+ years. It dawns on me that what I see as impossible may not be that apparent to people with less experience in this field.
don't get me - us - wrong: wav-to-midi software exists, but it's pretty useless if you have more than one instrument, and even with only one, the results are questionable.
i'm sure there's a difference between the crappy windows freeware i tried and some professional version, and i'm sure the software will get better with time, much better, but to be able to separate everything, just like "unmixing" a master track, that's stuff for sci-fi novels.
another analogy:
there are programs able to convert JPG images to SVG.
now imagine you have an SVG of a filled circle on a plain background. my guess: the SVG file contains about 10 lines.
convert to JPG - no problem.
convert back to SVG - how much code will the SVG file contain now?
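To make the JPG/SVG analogy concrete, here is a toy sketch (plain Python; the minimal SVG markup is my own example, not from the thread) showing just how small the hand-authored vector file is:

```python
# A filled circle on a plain background: the hand-written SVG really
# is only a handful of lines.
minimal_svg = """\
<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <rect width="100" height="100" fill="white"/>
  <circle cx="50" cy="50" r="30" fill="black"/>
</svg>"""

print(len(minimal_svg.splitlines()), "lines,", len(minimal_svg), "bytes")
```

An auto-tracer working backward from the rasterised JPG has no idea a circle was ever there; it can only approximate the (now noisy) edge with many path segments, so the recovered SVG balloons - the same kind of information loss the thread describes for unmixing audio.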
My argument however is that musical instruments, and voices, are such rich sources that it *will* be possible to separate them out.
On the simplest level, take the number 11. You are told that it consists of 2 numbers added together and asked to find those numbers. In the absence of any more data, it's impossible. However in music you have both far more complex sources, generating multiple data, and also the ability to look over the entire timeline of the music in order to determine what frequencies and volumes are being generated by the various instruments over the piece. You will also have templates - the usual frequencies generated by a trumpet, say, with minimum and maximum bounds and probabilities.
If we humans can do it (I have for example a very talented friend who can listen to a piece of music and then transcribe the music for any instrument from it), computers will be able to do it. It may take some time however. :-)
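The number-11 point above can be made runnable. A toy sketch (my own illustration): unconstrained, the decomposition is ambiguous, but add a "template" bounding one part - as suggested for a trumpet's plausible frequency range - and the candidates collapse:

```python
# All non-negative integer pairs (a, b) with a + b == 11: ambiguous.
unconstrained = [(a, 11 - a) for a in range(12)]

# Add a "template": suppose we know a lies between 3 and 5 (like knowing
# a trumpet's usual frequency range) - far fewer candidates remain.
constrained = [(a, b) for (a, b) in unconstrained if 3 <= a <= 5]

print(len(unconstrained), len(constrained))  # 12 candidates shrink to 3
```

Real separation stacks many such constraints (spectral templates, continuity over the timeline), which is exactly why the extra data makes the problem less hopeless than "11 = a + b".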
Distribution: Debian Sid AMD64, Raspbian Wheezy, various VMs
Posts: 7,680
Rep:
Quote:
Originally Posted by hydrurga
My argument however is that musical instruments, and voices, are such rich sources that it *will* be possible to separate them out.
On the simplest level, take the number 11. You are told that it consists of 2 numbers added together and asked to find those numbers. In the absence of any more data, it's impossible. However in music you have both far more complex sources, generating multiple data, and also the ability to look over the entire timeline of the music in order to determine what frequencies and volumes are being generated by the various instruments over the piece. You will also have templates - the usual frequencies generated by a trumpet, say, with minimum and maximum bounds and probabilities.
If we humans can do it (I have for example a very talented friend who can listen to a piece of music and then transcribe the music for any instrument from it), computers will be able to do it. It may take some time however. :-)
So, what is the mathematical difference between a Mini Moog and a 2600 playing the same thing?
That it may be possible to create a file containing virtual instruments which sound, to the human ear, identical to the original is not in dispute -- it's called MP3.
Quote:
Originally Posted by jamison20000e
.ogg
Well, yes - perhaps a better version of the whole "how many perfect synths would it take to make this?" question.
My point being that MIDI is sample-based, so it's just a list of samples which is never, ever going to sound like the original, and it's going to be almost - or actually - impossible to tell which "real" instruments were used to produce a given piece of music.
By the way, people cannot tell either; they just know that there's a pool of instruments that could be pulled from, and they work it out from that.
Last edited by 273; 07-14-2016 at 01:38 PM.
Reason: "from" not "form" and "pool" not "poll"
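For reference (my own aside, not from the posts above): a MIDI stream stores only note events; timbre comes entirely from whatever sample library or synth renders them, which is why renditions of the same file can sound so different. A minimal sketch of the raw bytes:

```python
# A MIDI "note on" channel message: status byte 0x90 (note on, channel 1),
# then note number (middle C = 60) and velocity (0-127).
note_on  = bytes([0x90, 60, 100])  # start middle C at velocity 100
note_off = bytes([0x80, 60, 0])    # release it

# A MIDI track is essentially a timed list of such 3-byte events;
# no waveform, no timbre - the renderer supplies the actual sound.
print(note_on.hex(), note_off.hex())
```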
Quote:
Originally Posted by hydrurga
My argument however is that musical instruments, and voices, are such rich sources that it *will* be possible to separate them out.
On the simplest level, take the number 11. You are told that it consists of 2 numbers added together and asked to find those numbers. In the absence of any more data, it's impossible. However in music you have both far more complex sources, generating multiple data, and also the ability to look over the entire timeline of the music in order to determine what frequencies and volumes are being generated by the various instruments over the piece. You will also have templates - the usual frequencies generated by a trumpet, say, with minimum and maximum bounds and probabilities.
If we humans can do it (I have for example a very talented friend who can listen to a piece of music and then transcribe the music for any instrument from it), computers will be able to do it. It may take some time however. :-)
Yes, we can be sure a computer will someday do it. The fact that a human can do it means that the information is there. It's just a matter of extracting and sorting it out.
I recently stumbled across a book about human hearing and how it works. It's really complicated. I have also been developing some artificial hearing software but still have much to do and much more to learn.
Quote:
Originally Posted by Beryllos
Yes, we can be sure a computer will someday do it. The fact that a human can do it means that the information is there. It's just a matter of extracting and sorting it out.
I recently stumbled across a book about human hearing and how it works. It's really complicated. I have also been developing some artificial hearing software but still have much to do and much more to learn.
As I mentioned, a human can't do it -- humans just compare what they hear to a database they've built up over the years, as a computer would. I guarantee that there are instruments that 90% of people haven't heard, and even the more common ones can be confused with one another. Again, I'll mention Tom Morello's guitar and go on to say that there are synth-written guitar parts which can be indistinguishable from "the real thing".
Then there's MIDI being simply a load of samples from a very finite library.
So, I suppose, yes, you could have a program which turns any sound into a very poor MIDI rendition using built-in software synthesizers, but you would find that the majority don't sound close to the original at all.
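The frequencies-to-notes half of such a converter is at least well defined. A sketch, assuming equal temperament with A4 = 440 Hz (the standard MIDI tuning):

```python
import math

def freq_to_midi(freq_hz: float) -> int:
    """Map a frequency to the nearest MIDI note number
    (equal temperament, A4 = 440 Hz = note 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))

print(freq_to_midi(440.0))   # 69 (A4)
print(freq_to_midi(261.63))  # 60 (middle C)
```

Everything else - deciding how many notes are sounding at once, and which instrument each belongs to - is the hard, unsolved part the thread is arguing about.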
Quote:
Originally Posted by 273
As I mentioned, a human can't do it -- humans just compare what they hear to a database they've built up over the years, as a computer would. I guarantee that there are instruments that 90% of people haven't heard, and even the more common ones can be confused with one another. Again, I'll mention Tom Morello's guitar and go on to say that there are synth-written guitar parts which can be indistinguishable from "the real thing".
Then there's MIDI being simply a load of samples from a very finite library.
So, I suppose, yes, you could have a program which turns any sound into a very poor MIDI rendition using built-in software synthesizers, but you would find that the majority don't sound close to the original at all.
With all due respect to the OP, which was about MIDI, I wasn't talking about MIDI, and I'm aware of its limitations. Rather, hydrurga and I were speculating about whether a machine can, in principle, perform as well as a human (or better) at analyzing music and describing it in terms that would allow instruments to be separated, accurately reproduced, edited, retuned, remixed, transcribed, or whatever the user requires. At the moment, it's science fiction... kind of like cell phones used to be...
BIG FAT WARNING: THIS IS OT; I'M BECOMING PHILOSOPHICAL!
Quote:
Originally Posted by hydrurga
My argument however is that musical instruments, and voices, are such rich sources that it *will* be possible to separate them out.
On the simplest level, take the number 11. You are told that it consists of 2 numbers added together and asked to find those numbers. In the absence of any more data, it's impossible. However in music you have both far more complex sources, generating multiple data, and also the ability to look over the entire timeline of the music in order to determine what frequencies and volumes are being generated by the various instruments over the piece. You will also have templates - the usual frequencies generated by a trumpet, say, with minimum and maximum bounds and probabilities.
If we humans can do it (I have for example a very talented friend who can listen to a piece of music and then transcribe the music for any instrument from it), computers will be able to do it. It may take some time however. :-)
this and most other posts seem to assume a certain type of music: studio recordings, mostly, with a finite number of tracks (=instruments=voices?) and known instruments.
my understanding is a little different; what if you have a campfire song where you can hardly distinguish guitar & singer from other sounds, like crackling fire, crickets, clinking beer bottles, giggling girls - a human being can still distinguish the song, and be able to hear & play it after that.
actually that's an example where i can imagine a computer being somewhat successful in separating out guitar and human voice, but it leads me to another point:
what if all the noise is actually intended to be part of the music?
or, noise and music kind of flow into each other, and become indistinguishable?
or, it becomes an important part of the message of the piece whether you use real strings or synth strings?
or, what about the sometimes uncanny capability of people to recognize the slightest accent in the speech of others?
i think computers & software are still very, very far away from dealing with something like that.
Sorry, yes: computers can, to a degree, distinguish between known instruments, and will be able to do so even better than people. Human pattern recognition sometimes gets side-tracked, and the human grasp of probability is shaky enough that the two conspire to make human recognition of edge cases more problematic.
If all you need is a conversion of frequencies (.wav) to musical events (.mid), you can do that now. You'll end up with a piano score of sorts (not to imply it's playable by human hands on a piano), but every instrument will be converted to a single MIDI instrument/track. With human intervention you can break the notes out into parts for multiple instruments.

It's not exactly an original score, as the key signatures and meters might be omitted or vary drastically from the original work, but the notes will for the most part be there - including wrong notes generated during the performance, and any incidental sounds (harmonics) created when more than one instrument plays in tune. It's not an exact science, but you can extract the "gist" of it programmatically.

Now, if you have a studio master and each instrument was mic'd individually, the accuracy improves greatly. But it is by no means an automated process, and most of the conversion tools in Linux still suck for the most part. Although some of the ones that convert images to music are interesting.
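The single-track frequency-to-event conversion described above can be sketched for the easiest possible case, a pure monophonic tone. This is a toy illustration of my own (one FFT frame, NumPy, no real pitch tracker), and it breaks down exactly where the post says real tools struggle: polyphony, noise, and harmonics:

```python
import numpy as np

SR = 44100  # sample rate in Hz

def dominant_midi_note(samples: np.ndarray, sr: int = SR) -> int:
    """Pick the loudest frequency bin in one FFT frame and map it
    to the nearest MIDI note (equal temperament, A4 = 440 Hz = note 69)."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sr)
    peak_hz = freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC bin
    return int(round(69 + 12 * np.log2(peak_hz / 440.0)))

# "Record" one second of a pure 440 Hz sine - the trivial case.
t = np.arange(SR) / SR
tone = np.sin(2 * np.pi * 440.0 * t)

print(dominant_midi_note(tone))  # 69 (A4)
```

A real converter would slide this over short windows to get note timings, and would immediately face the problems discussed above: several instruments sharing bins, harmonics masquerading as notes, and noise with no note at all.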