LibreOffice Calc performance vs MS Excel with a large number of data rows
I'm working on a research project with a fairly large amount of data. I'm not a programmer, and when it comes to preparing the data for analysis my knowledge is limited to Excel and LibreOffice Calc.
My problem is with LibreOffice Calc, where my data has ~500,000 rows. MS Excel just crunches it with no sweat (insert a new column, do some VLOOKUPs). LibreOffice Calc, however, stalls when I ask it to insert a new column or even sort the data, for anywhere from 10 to 60 seconds or more. I went to Tools > Options > Memory and increased some of the numbers there, but it had no positive effect.
My Linux machine, on which I work, has no shortage of resources: 8 GB RAM, an Intel Core i7 CPU, and an SSD.
My Windows machine, where Excel has no problems with that amount of data, has an even weaker CPU (an i5).
I have also noticed that during these stalls only one CPU core reaches 100 percent; all the others have no load on my Linux machine.
I guess something could be tweaked. Can anyone suggest anything?
It seems there are quite a lot of similar bug reports about poor Calc performance with large data sets, so this does not appear to be a configuration issue, or at least not one I can solve myself. I'll wait for the next LibreOffice version.
I tried to use Calc instead of updating MS Excel, but the speed of some large calculations made buying the MS product cost-effective. Excel rates well for many tasks. Strictly as a database problem, there are other programs that may be faster, but you'll have to learn them. IBM DB2 has a host of accelerator products for massive databases.
Yeah... a common problem. It seems the best solution is to use the right tool, and neither Calc nor Excel is the right tool for a large amount of data. I'll have to learn to use another tool. Are there any open-source tools for that? I guess IBM DB2 is not open source. Or is it?
IBM DB2 isn't (it runs on S/390 mainframes; the version of DB2 for Unix would be IBM UDB, which is also proprietary), but MySQL is.
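To make the "put it in a database" suggestion concrete, here is a minimal Python sketch. It uses SQLite from the standard library purely as a self-contained stand-in for MySQL (no server needed); the table names, column names, and sample rows are all made up for illustration:

```python
import sqlite3

# In-memory database for the sketch; a file path would persist it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE votes (member TEXT, vote TEXT)")
conn.execute("CREATE TABLE factions (member TEXT, faction TEXT)")

# Hypothetical sample rows standing in for the real CSV data.
conn.executemany("INSERT INTO votes VALUES (?, ?)",
                 [("A. Name", "for"), ("B. Name", "against")])
conn.executemany("INSERT INTO factions VALUES (?, ?)",
                 [("A. Name", "LSDP"), ("B. Name", "TS-LKD")])

# A JOIN does the work of a spreadsheet VLOOKUP in a single pass,
# and an index on factions.member keeps it fast at 500k rows.
rows = conn.execute("""
    SELECT v.member, f.faction, v.vote
    FROM votes v JOIN factions f ON f.member = v.member
    ORDER BY v.member
""").fetchall()
print(rows)
```

The same JOIN runs unchanged on MySQL once the CSV files are loaded into tables.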
It boils down to the type of data and the way you want to access it and crunch it really.
You'd have to decide what tool is best.
Official differences: https://wiki.openoffice.org/wiki/Doc...Calc_and_Excel One can generally export/import data from these, or link them to a standard database to gain speed, size, or scope, or to do joins. You can use different back ends with either Calc or Excel, I think.
Then you get into similar things like FileMaker Pro and ways to do the same in Linux.
If one had a huge amount of data to crunch and needed it fast, IBM DB2 and maybe Oracle would be my first two choices. My "huge" and your "huge" may be two different things.
Thanks for the links. I'll try to figure out what these are.
I'm trying to analyze it using Gephi. I render the results into coalitions: https://flic.kr/p/TB8BZY . It seems like fun to me, and it helps me understand Lithuanian politics better, and much faster than I otherwise would.
The functions I use in Excel or Calc are CONCATENATE, VLOOKUP, and filtering.
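Those three spreadsheet operations map directly onto a few lines of plain Python, which stays fast at 500k rows because the lookup table is a hash map. A minimal sketch with made-up names and factions (nothing here comes from the real data):

```python
# Hypothetical rows standing in for the spreadsheet data.
members = [("Jonas", "Jonaitis", "LSDP"),
           ("Petras", "Petraitis", "TS-LKD"),
           ("Ona", "Onaite", "LSDP")]
faction_coalition = {"LSDP": "ruling", "TS-LKD": "opposition"}

result = []
for first, last, faction in members:
    full_name = f"{first} {last}"            # CONCATENATE
    coalition = faction_coalition[faction]   # VLOOKUP via dict, O(1) per row
    result.append((full_name, faction, coalition))

# Filter: keep only ruling-coalition members.
ruling = [r for r in result if r[2] == "ruling"]
print(ruling)
```

The dict lookup is the important part: a spreadsheet VLOOKUP rescans the lookup range for every row, while a dict does each lookup in constant time.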
I'm curious how you are parsing those files. The header in "klausimai.csv" shows six fields, but the number of commas in a record varies. For example, the following line has 40 commas.
Code:
2017-05-02,11:56:24,rytinis,; pateikimas,"Darbo kodekso patvirtinimo, įsigaliojimo ir įgyvendinimo įstatymo Nr. XII-2603 1 straipsniu patvirtinto Lietuvos Respublikos darbo kodekso 21, 23, 31, 32, 40, 43, 48, 52, 53, 57, 63, 65, 71, 79, 112, 114, 115, 117, 120, 127, 144, 147, 169, 171, 179, 181, 185, 195, 197, 204, 209, 221, 237, 240, 241 ir 242 straipsnių pakeitimo įstatymo projektas (Nr. XIIIP-587)",http://www.lrs.lt/sip/portal.show?p_r=15275&p_k=1&p_a=sale_bals&p_bals_id=-26038
Kind of hard to import as a csv file.
EDIT: Never mind. I was trying to import some of the data into MySQL, and it was ignoring the quotes around the "klausimai.pavadinimas" field. I figured it out by importing into LibreOffice Calc.
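For anyone hitting the same snag: a proper CSV parser treats the double-quoted title as a single field, no matter how many commas it contains. A quick Python check, using a shortened hypothetical line in the same shape as the one quoted above:

```python
import csv
from io import StringIO

# Shortened stand-in for the real record: the commas inside the
# double-quoted title must not split the field.
line = ('2017-05-02,11:56:24,rytinis,; pateikimas,'
        '"darbo kodekso 21, 23, 31 straipsniu pakeitimo projektas",'
        'http://www.lrs.lt/example')

fields = next(csv.reader(StringIO(line)))
print(len(fields))  # the quoted title counts as one field
```

With MySQL, the equivalent is telling the loader about the quoting (e.g. the `FIELDS ENCLOSED BY '"'` clause of LOAD DATA INFILE); otherwise it splits on every comma, which matches the behavior described in the edit above.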
All politicians have their factions and then coalitions, and some of them change factions during the ruling period, so I needed to put each faction next to the name. I then colored the factions so I could see how they render into coalitions.