LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - General (https://www.linuxquestions.org/questions/linux-general-1/)
-   -   rsyncing 150GB of data from USA to Europe takes a lot of time (https://www.linuxquestions.org/questions/linux-general-1/rsyncing-150gb-of-data-from-usa-to-europe-takes-a-lot-of-time-4175582088/)

AdultFoundry 06-12-2016 10:08 AM

rsyncing 150GB of data from USA to Europe takes a lot of time
 
rsyncing 150 GB of data from the USA to Europe takes a lot of time, close to 6-8 hours of waiting. I am wondering what people do if they have to rsync something like 150 TB, or a big movie / tube site, which would be even bigger. People probably don't wait around three years for this to complete? It would be A LOT easier to physically ship the disks in a package or something like that.

I am doing something like this for the first time, and I am wondering how this normally works. Are there any special services for it (satellite, maybe), or what is the situation with that?

The same goes for data centers: all of this is big and consumes a lot of electricity. It is all new and modern, but it looks like a lot of things could still be improved at this point - smaller data centers (how small can they be?), faster connections (say 100,000 times faster for situations like this, or more). It may be needed as time goes by, I think.

But the original question is the rsync speed. People don't wait three years, so how does it work?

Emerson 06-12-2016 10:15 AM

They do not wait 3 years, they use faster connections than you.

AdultFoundry 06-12-2016 10:26 AM

Quote:

Originally Posted by Emerson (Post 5559767)
They do not wait 3 years, they use faster connections than you.

It does not have anything to do with my connection speed; it is the connection speed between hosting company A and B, and these are reputable companies - the target one is one of the biggest in the world (OVH). For this reason, I imagine it would not get much faster anywhere else, although I am not sure.

273 06-12-2016 10:43 AM

I haven't done this exact thing, but I have downloaded a 120 GB file over BitTorrent and it completed in a few days. Rather than using rsync, would it be possible to simply use wget or similar for the initial backup, then rsync for the incrementals? I think rsync can also be piped through gzip, but I haven't done so myself, so I couldn't say how it's done or whether it's worth it.
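For what it's worth, rsync has compression built in (`-z`), so no gzip pipe is needed. A minimal sketch of the "bulk copy first, incrementals after" idea - hostnames and paths here are placeholders, not the OP's actual servers:

```shell
# Initial bulk copy. -z compresses in transit (skip it for data that is
# already compressed, such as video); --partial lets an interrupted
# transfer resume instead of restarting the file from scratch.
rsync -avz --partial --progress /srv/data/ user@eu-host.example.com:/srv/data/

# Later runs only send files whose size or mtime changed:
rsync -av --partial /srv/data/ user@eu-host.example.com:/srv/data/
```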

Turbocapitalist 06-12-2016 11:17 AM

Video is already compressed so trying to add compression with rsync (or ssh which is what rsync is traveling over) would only slow things down. If your data set is big enough, just put it on an encrypted, removable set of hard drives and ship the drives to the data center. If I recall correctly Netflix sends whole hardware units containing servers with populated drives to various geographic locations rather than transferring over the net. I cannot remember the buzzword they use to refer to the hardware units, but the disk set up in them is reportedly JBOD with FFS.

You have measured the transfer rate over the net. You also have the postal rates for several shipping options, including the time needed to get the drives to their destination. The math is fairly simple to see which one comes out ahead. For the initial population of the remote site, it will probably be the parcel service.
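The arithmetic is easy to sketch. The rate below (~6 MB/s, i.e. 150 GB in roughly 7 hours) is taken from the first post, and the 2-day courier is an assumed figure; substitute your own measurements:

```python
# Back-of-envelope comparison: network transfer vs. shipping drives.

def transfer_hours(size_bytes, rate_bytes_per_s):
    """Hours to move size_bytes at a sustained rate."""
    return size_bytes / rate_bytes_per_s / 3600

GB = 10**9
TB = 10**12
rate = 6 * 10**6  # ~6 MB/s, the rate the OP is seeing

print(f"150 GB over the net: {transfer_hours(150 * GB, rate):.1f} h")
print(f"150 TB over the net: {transfer_hours(150 * TB, rate) / 24:.0f} days")

# A courier delivering the drives in 2 days is an effective bandwidth of:
effective = 150 * TB / (2 * 24 * 3600)  # bytes per second
print(f"150 TB shipped in 2 days ~= {effective * 8 / 10**9:.1f} Gb/s")
```

At 150 GB the net is tolerable; at 150 TB the parcel service wins by two orders of magnitude.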

There's an informative and possibly relevant backstory to the quote, "Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway."

AdultFoundry 06-12-2016 01:11 PM

Good point with Netflix :)

Turbocapitalist 06-12-2016 01:37 PM

Quote:

Originally Posted by AdultFoundry (Post 5559838)
Good point with Netflix :)

It is the only concrete example I can recall. There are rumors of other big transfers, but nothing concrete.

Some general information is here:

http://queue.acm.org/detail.cfm?id=2933408

Looking around it seems that some of the data centers have import services for just such an occasion. The ones you are working with might have that available.

AdultFoundry 06-12-2016 01:45 PM

I am uploading a 333 MB database from home to the hosting now, and it will take roughly 40+ minutes. My Internet plan is basic, so this would be better, and it would not be that bad. I am feeling like I'm in the Stone Age today :)

I don't want to sound like I am going too deep into this, but it shows some important things, on a global scale. The cables are not fast enough for what is needed. I wasted about 8 hours today; how much time gets wasted if we factor in all the people, including the ones who physically ship the files instead of just sending them over the Net (at 10,000 or 100,000 or 1,000,000 times the current speed)? These speeds are not impossible at all; they will come up with something, I am sure. Or there is something like this already - I've seen Google Fiber on google.com, and that looks better.

Come to think of it, how many PCs / computers get produced, and how much electricity gets used? I think electricity use doubles every four years or so (I am guessing here, but something like this - probably less than 8-10), and all the computers / mobile phones (all these devices) consume more energy than all the planes combined (the airlines). This is probably still growing...

I am not familiar with these topics, but these are all good points - like how much of the economy has moved to the IT sector now, and all the jobs (you look for a job and there are 1,000 IT offers and maybe 100 of something else). Things have changed...

TB0ne 06-12-2016 04:52 PM

Quote:

Originally Posted by AdultFoundry (Post 5559860)
I am uploading a 333 MB database from home to the hosting now, and it will take roughly 40+ minutes. My Internet plan is basic, so this would be better, and it would not be that bad. I am feeling like I'm in the Stone Age today :)

I don't want to sound like I am going too deep into this, but it shows some important things, on a global scale. The cables are not fast enough for what is needed. I wasted about 8 hours today; how much time gets wasted if we factor in all the people, including the ones who physically ship the files instead of just sending them over the Net (at 10,000 or 100,000 or 1,000,000 times the current speed)? These speeds are not impossible at all; they will come up with something, I am sure. Or there is something like this already - I've seen Google Fiber on google.com, and that looks better.

Come to think of it, how many PCs / computers get produced, and how much electricity gets used? I think electricity use doubles every four years or so (I am guessing here, but something like this - probably less than 8-10), and all the computers / mobile phones (all these devices) consume more energy than all the planes combined (the airlines). This is probably still growing...

I am not familiar with these topics, but these are all good points - like how much of the economy has moved to the IT sector now, and all the jobs (you look for a job and there are 1,000 IT offers and maybe 100 of something else). Things have changed...

Or you can look at what an OC/12 or OC/48 line is like. Sorry, but your home connection is cheap, as is mine. Businesses pay big dollars for hugely fast connections, often multiple links which work as one. Further, they also spend big $$$ on SAN resources that do lots of magic behind the scenes to get data and BCV snapshots from one site to another quickly.

An OC/48 is 2.488 Gbps....and if you have one, NO ONE BUT YOU uses it. Do the math there.
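Doing that math with the standard SONET line rates (real throughput will be somewhat lower after protocol overhead):

```python
# Time to push 150 GB down a dedicated optical carrier line.
OC3 = 155.52 * 10**6    # bits per second
OC12 = 622.08 * 10**6
OC48 = 2488.32 * 10**6

size_bits = 150 * 10**9 * 8  # 150 GB expressed in bits

for name, rate in [("OC3", OC3), ("OC12", OC12), ("OC48", OC48)]:
    print(f"{name}: {size_bits / rate / 60:.1f} minutes")
```

So the OP's 6-8 hour transfer shrinks to roughly 8 minutes on a dedicated OC/48.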

Smokey_justme 06-12-2016 06:09 PM

@Adult... For the minimum package of any kind of ISP, waiting 40 minutes to upload 333 MB is not bad at all. Most ISPs are now on fibre-optic cables, or at least have a fibre-optic backbone, but for 10 bucks they (rightfully) limit your upload speed. As for download speed, while higher, that is also usually limited by the ISP based on your package, not on the cables. In Romania we have 1 Gbps (it would take, in theory, about 3 seconds to download 333 MB if the uploader can push it and your hardware can handle it) for just a bit more than 15 euros - and this is for HOME use...
But even so, if you have to rsync large quantities of data daily as a business, then be sure you are going to use a business option or even a partially dedicated line...

If you look, you will see that solutions to your problems already exist; they were already invented (and possibly re-invented)... They just don't come at 10 or 15 euros / dollars.

Habitual 06-13-2016 03:11 AM

Bandwidth costs money. Spend some.
Failing that, buy a 1 TB USB drive and mail it to Europe.
When did common sense take a back seat to computing?

Such is the life of a pornography host?

sundialsvcs 06-13-2016 06:47 AM

Also, don't forget that this transfer might be going by satellite!

There's this pesky thing called "The Atlantic Ocean" in the way . . .

jefro 06-13-2016 04:44 PM

That is funny: "There's this pesky thing called 'The Atlantic Ocean' in the way . . ."


Might have to get your old AOL floppies and save it to physical media to send large amounts.


Guess it can't hurt to look at modern compression for your type of data. If it is a database and you know typical record sizes, then you may be able to choose a compression program and options for the best results.

suicidaleggroll 06-13-2016 06:09 PM

Quote:

Originally Posted by AdultFoundry (Post 5559773)
It does not have anything to do with my connection speed; it is the connection speed between hosting company A and B, and these are reputable companies - the target one is one of the biggest in the world (OVH). For this reason, I imagine it would not get much faster anywhere else, although I am not sure.

You're being throttled, that's why. THEY can transfer to each other much faster than that; YOU are being limited because you're a small user on their network, and that's the bandwidth they've allotted for you.

Quote:

Originally Posted by AdultFoundry (Post 5559860)
I am uploading a 333 MB database from home to the hosting now, and it will take roughly 40+ minutes.

...

I don't want to sound like I am going too deep into this, but it shows some important things, on a global scale. The cables are not fast enough for what is needed.

Yes they are, you just need to spend more. 333 MB in 40 minutes is about 140 kB/s; businesses routinely run OC3 or faster lines, which give them roughly 140x that or more.
https://en.wikipedia.org/wiki/Optica...smission_rates

Regular users usually don't upload that much data, so the typical 50 Mb down, 5 Mb up connections you get from many ISPs are more than enough.
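Checking those figures against the numbers in the thread:

```python
# 333 MB uploaded in 40 minutes, as reported above.
size = 333 * 10**6  # bytes
secs = 40 * 60

rate_kBs = size / secs / 1000
rate_Mbps = size * 8 / secs / 10**6
print(f"{rate_kBs:.0f} kB/s ~= {rate_Mbps:.2f} Mb/s")

# An OC3 (155.52 Mb/s) is roughly this many times faster:
print(f"OC3 is ~{155.52 / rate_Mbps:.0f}x faster")
```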

TenTenths 06-14-2016 03:08 AM

Quote:

Originally Posted by AdultFoundry (Post 5559762)
rsyncing 150GB of data from USA to Europe takes a lot of time, close to 6-8 hours of waiting.

A lot depends on what is being transferred. There is per-file overhead involved, and if it is a LOT of small files, then that overhead can slow down a transfer considerably compared to transferring a single large file.
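One common workaround for the small-file case is to bundle the tree with tar for the first copy, so each file doesn't pay rsync's per-file negotiation. A sketch - the host and paths are placeholders:

```shell
# Stream a tar archive of the whole tree over ssh and unpack it on the
# far side; switch to rsync afterwards for the incremental updates.
tar -cf - -C /srv/data . | ssh user@eu-host.example.com 'tar -xf - -C /srv/data'
```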

Between hosting companies, as has been said, your speed may depend on a number of factors, such as what contracted bandwidth you have. If it is "speeds up to X Mb/s", then that is a maximum, not a minimum, and may be burstable on a 95th-percentile basis. You may find that if you try to transfer a large amount at high speed, you exceed your burst limits and get throttled accordingly.

There may be other bottlenecks besides the pure connectivity element. If you're on shared hosting, then you're (surprise, surprise) sharing the connection with all the others on the same box. If you're on a VPS, then you're also sharing resources and connectivity with the other machines on the box. If the underlying storage is something like RAID5 SATA, then that will be a bottleneck compared to something like RAID1+0 SAS.

All of these things are factors.

You mention you're transferring to OVH - are you transferring to their Canadian DC or one of the ones in France? I recently migrated around 500 GB of data in 60,000 files to them from another hosting company, and the speed was very acceptable.
