Programming
This forum is for all programming questions. The question does not have to be directly related to Linux, and any language is fair game.
01-15-2011, 10:44 PM

#1

Member
Registered: Aug 2010
Distribution: Gentoo
Posts: 62
Rep:

Optimizing the spacing of a mesh containing a given set of points
I tried to summarize this as best as possible in the title. I am writing an initial value problem solver in the most general way possible. I start with an arbitrary number of initial values at arbitrary locations (inside a boundary). The first part of my program creates a mesh/grid (I am not sure which is the correct nuance), with N points total, that contains all the initial values. My goal is to optimize the mesh so that the spacing is as uniform as possible. My solver itself seems to work half decently (it needs some more obscure debugging that is not relevant here).
I am starting with one dimension. I intend to generalize the algorithm to an arbitrary number of dimensions once I get it working consistently. I am writing my code in Fortran, but feel free to reply with pseudocode or the language of your choice.
Allow me to elaborate with an example:
Say I am working on a closed interval [1,10]
xmin=1
xmax=10
Say I have 3 initial points: xmin, 5 and xmax
num_ivc=3
known(num_ivc)=[xmin,5,xmax]  ! my arrays start at 1; assume "known" starts sorted
I store my mesh/grid points in an array called coord. Say I want 10 points total in my mesh/grid.
N=10
coord(10)
Remember, all this is arbitrary (except the variable names, of course). The algorithm should set coord to {1,2,3,4,5,6,7,8,9,10}.
Now for a less trivial example:
num_ivc=3
known(num_ivc)=[xmin,5.5,xmax]
or just
num_ivc=1
known(num_ivc)=[5.5]
Now, would you have 5 evenly spaced points on the interval [1, 5.5] and 5 evenly spaced points on the interval (5.5, 10]? But there is more space between 1 and 5.5 than between 5.5 and 10, so perhaps you should have 6 points on [1, 5.5] followed by 4 on (5.5, 10]. The key is to minimize the difference in spacing.
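For what it's worth, the "minimize the spacing difference" reasoning above can be sketched as proportional allocation: give each gap between consecutive known points a share of the free points proportional to its length, then fill each gap uniformly. The original code is Fortran, but since any language was invited, here is a Python sketch; the names allocate_mesh, known, and n are mine, not from the original program, and it assumes the known list is sorted and includes both endpoints:

```python
def allocate_mesh(known, n):
    """Distribute n mesh points over [known[0], known[-1]] so that every
    known point is in the mesh and the spacing is as uniform as possible.
    Assumes known is sorted, includes both endpoints, and n >= len(known)."""
    total = known[-1] - known[0]
    free = n - len(known)          # interior points left to distribute
    remaining = free
    coord = [known[0]]
    for i in range(len(known) - 1):
        length = known[i + 1] - known[i]
        if i == len(known) - 2:
            k = remaining          # last gap gets whatever is left over
        else:
            k = min(remaining, round(free * length / total))  # proportional share
        remaining -= k
        step = length / (k + 1)    # uniform spacing inside this gap
        coord.extend(known[i] + step * j for j in range(1, k + 1))
        coord.append(known[i + 1])
    return coord
```

For known=[1, 5, 10] and n=10 this produces exactly {1,...,10}; for known=[1, 5.5, 10] it puts 6 points on [1, 5.5] and 4 on (5.5, 10], matching the reasoning above. Proportional rounding is a reasonable first cut, though it is not guaranteed to be the global optimum in every configuration.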
I have been working on this for 2 days straight and I can assure you it is a lot trickier than it sounds. I have written code that
- only works if N is large
- only works if N is small
- only works if the known points are close together
- only works if the known points are far apart
- only works if at least one of the known points is near a boundary
- only works if none of the known points are near a boundary
So as you can see, I have coded the gamut of almost-solutions. I cannot figure out a way to get it to perform equally well in all possible scenarios (that is, create the optimum spacing).



01-16-2011, 03:56 AM

#2

Senior Member
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723

Well, to really space you out, here's an alternative approach.
Think about the volumes (let's call them cells) defined by your sampling points. Assume each point in the space belongs to the cell owned by the closest sampling point. You can use the squared Euclidean distance, sum((a(i)-b(i))**2, i=1..n), in the distance comparison, for any number of dimensions.
Then, you can easily use an iterative method to move the sampling points, in small increments, into a configuration where the volumes are uniform.
In a single dimension, this will always result in a more or less even distribution from minimum to maximum; the interesting stuff happens in higher-dimensional situations.
For example, you could do it like this:
- Compute the volume of each cell.
  You can do this by simply randomly sampling the entire n-volume, and noting which cell is closest to each random point. The volume of each cell is then approximately the fraction of hits landing in that cell, times the total volume.
- Compute which cells are connected.
  You can do this in many ways, for example by noting the contenders while volume sampling, or by doing an n-dimensional Delaunay triangulation on the sampling points.
- For each pair of connected cells, compute a weight factor based on the ratio of the cell volumes (smaller to larger, thus 0..1).
- Use slice sampling to pick a random pair of cells, weighted by the weight factors.
  You do this by generating a random number between zero and the total sum of weight factors (not including the upper limit), and then walking the pair list until the cumulative sum exceeds the random number. Where that happens, you pick that pair.
- Move one or both of the sampling points (along the axis between the two), in such a way that the weight factor should increase.
  Note that the average cell volume is the total volume divided by the number of cells; that helps to make the right decision.
- Restart from the beginning, until all weight factors are above a preset smoothness limit.
This method can get stuck in a local minimum, so you could use e.g. a genetic algorithm, evolving multiple systems in parallel, and killing off those that do not evolve.
This algorithm should adapt well to different starting conditions. Random sampling is the most efficient way to estimate volumes in five or more dimensions, but it's not terrible for the two- to four-dimensional cases either. You must find a good strategy for moving the sampling points in a way that never worsens the situation, but that shouldn't be too difficult.
If you have a very large system, you don't need to recompute the entire system at every step; the affected region is enough. If you know the longest distance R between the sampling points in a cell pair, you can use that as a safe yardstick of the maximum cell size: spheres of radius 2R centered at the modified sampling points (three or four spheres, covering both the old and the moved sampling points) should contain all affected cells completely, I believe; the pair list will tell you which cells are connected to the modified cells.
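The "volume by random sampling" step in the list above can be written out concretely. A minimal Python sketch (the function name estimate_cell_volumes and its arguments are my own, hypothetical names; the thread itself is language-agnostic):

```python
import random

def estimate_cell_volumes(points, bounds, samples=20000):
    """Monte Carlo estimate of the volume fraction owned by each sampling
    point: every random point in the bounding box is assigned to the
    nearest sampling point (squared Euclidean distance), and the hit
    fractions approximate the relative cell volumes."""
    dim = len(bounds)
    hits = [0] * len(points)
    for _ in range(samples):
        # Draw one uniform random point inside the bounding box.
        x = [random.uniform(lo, hi) for lo, hi in bounds]
        # Assign it to the nearest sampling point (its Voronoi cell).
        nearest = min(
            range(len(points)),
            key=lambda i: sum((x[d] - points[i][d]) ** 2 for d in range(dim)),
        )
        hits[nearest] += 1
    return [h / samples for h in hits]
```

For two points at 0.25 and 0.75 on [0, 1], both estimated fractions come out close to 0.5, as the boundary between the two cells sits at the midpoint.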
Does any of this make any sense to you?
Nominal Animal
Last edited by Nominal Animal; 03-21-2011 at 03:55 AM.



01-16-2011, 06:03 PM

#3

Member
Registered: Aug 2010
Distribution: Gentoo
Posts: 62
Original Poster
Rep:

Ok, although I have always wanted to code up a neural network (it's been on my todo list for a while), this seems like a rather large undertaking. It just seems like there should be an algebraic solution rather than having to resort to an adaptive algorithm. I had tried to come up with an algorithm that uses the ratio of the length (or volume, if you want to jump to n dimensions) of an interval between two given points to the length of the total interval (xmax-xmin) to determine the number of points that should be put in between, but I could not get it to work consistently.
P.S. I lost you at the fourth bullet. I have never actually coded a neural network because I never fully understood how it works. I remember it having to do with adaptively adjusting weight factors.



01-17-2011, 01:11 AM

#4

Senior Member
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723

It's not a neural network, just adaptive iteration, closely related to Monte Carlo methods.
The idea is that you modify the boundary between two sampling regions in such a way that the difference in their volumes decreases. The difference in volumes behaves like entropy, and your modifications always decrease it (increasing order). If no such modification is possible, the method is stuck, and you have a result.
All the rest of the algorithm is just there to make that efficient. Picking a random boundary (a pair of regions) is important, so that you cover all boundaries. Picking a boundary with a large difference in the volumes gives you a high probability of decreasing the entropy.
Each region is of course the volume which is covered by a single sampling point; a single sample cell.
If you just used this algorithm without modifications, you'd end up with a set of sampling points about evenly distributed in your sampling volume.
The trick is that you can add boundary conditions; for example, that each of the volumes must contain a specific point. This will decrease the efficiency of the algorithm a bit, since you'll have to discard those changes that violate the condition. On the other hand, you can use that as a yardstick on the changes you can make: at least some changes should violate the condition, or you're not doing drastic enough changes.
Even better, you can apply weighting to the sampling volumes. If you measure the volume using random sampling, you can use any continuous positive spatial distribution function to weight the volumes, giving you a sort of importance sampling.
As to the fourth bullet, consider this: say you have four things, A, B, C, and D, that have probabilities a, b, c, and d. Let L=a+b+c+d, the total sum of probabilities. Generate a random number i, 0 <= i < L. Then, if i < a, pick A; if i < a+b, pick B; if i < a+b+c, pick C; otherwise pick D, since i < a+b+c+d = L. This is slice sampling. The fourth bullet just means that this is a good way to choose the boundary: A, B, C, and D are the boundaries (or sampling cell pairs), and a, b, c, and d are their weights, the weight being the ratio of the smaller volume to the larger volume in the pair. If you sample enough times, you'll sample all of them, but you'll sample the high-weight ones more.
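That cumulative-sum walk can be written out in a few lines. A Python sketch (pick_weighted is a hypothetical name of my own), assuming the weights are non-negative:

```python
import random

def pick_weighted(pairs, weights):
    """Pick one item from pairs, with probability proportional to its
    weight: draw a uniform random number in [0, sum(weights)) and walk
    the list until the running sum exceeds it."""
    total = sum(weights)
    r = random.uniform(0.0, total)
    running = 0.0
    for pair, w in zip(pairs, weights):
        running += w
        if r < running:
            return pair
    return pairs[-1]  # guard against floating-point round-off at the top end
```

With items A, B, C, D and weights 1, 2, 3, 4, repeated calls will return D about four times as often as A, which is exactly the "sample the high-weight ones more" behavior described above.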
As you might have gathered, I do like bouncing off ideas such as these. I hope I didn't waste too much of your time.
Nominal Animal
Last edited by Nominal Animal; 03-21-2011 at 07:32 AM.


