LinuxQuestions.org
10-15-2019, 09:55 AM   #1
datafanatik
LQ Newbie
Registered: Oct 2019
Posts: 2

gawk in Cygwin32 - script hangs - switched to Cygwin64 script runs. Memory issues?


Hi Everyone,

I'm a newbie and I'm excited to join this forum to learn and share where I can.

My platform is Windows10 Enterprise.
I have 24 GB RAM.
System type: 64-bit Operating System, x64-based processor.
Processor is i7-9800X CPU @ 3.8GHz

I just started using Cygwin32 at work. I am merging text files from various sources into a pipe-delimited file. I chose gawk since I've used it more than other languages and I need to complete this project in a short time frame. I started the job 1 month ago and I need to prove that I was the right candidate for the position!!

I'm reading an 800k+ line file and 2 smaller files using gawk.

gawk -v p1="file1.txt" -v p2="file2.txt" -v p3="file800lines.txt" -f awkscript.txt afile.txt > gawkout.txt


In the awkscript, I read file1 and file2 into arrays.
I then read the 800k-line file (file800lines.txt) using the getline function.
I'm able to process all 800k lines and populate array values based on the associative values from file1 and file2.
When I attempt to print out the 800k lines using a for loop, the script just hangs. I then put print statements inside the for loop and found that it hangs at around line 250k. There are no error messages. I just press Ctrl-C to get back to the prompt.
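
For illustration, here is a minimal sketch of the lookup structure described above. It is not the actual awkscript.txt: the file layouts, the sample data, and the array names `name` and `region` are made up. It also differs from the description in one deliberate way: the large file is passed as the main input and each record is printed as soon as it is read, rather than being stored in an array and printed later in a for loop.

```shell
#!/bin/sh
# Hypothetical sample data; the real file layouts are not shown in this thread.
dir=$(mktemp -d)
printf 'A1|Alpha\nB2|Beta\n'      > "$dir/file1.txt"   # key|name lookup
printf 'A1|North\nB2|South\n'     > "$dir/file2.txt"   # key|region lookup
printf 'A1|100\nB2|200\nC3|300\n' > "$dir/big.txt"     # stand-in for the 800k-line file

# Prefer gawk, fall back to whatever awk is installed.
AWK=$(command -v gawk || command -v awk)

"$AWK" -F'|' -v OFS='|' -v p1="$dir/file1.txt" -v p2="$dir/file2.txt" '
BEGIN {
    # Load both lookup files into associative arrays with getline.
    while ((getline line < p1) > 0) { split(line, f, "|"); name[f[1]] = f[2] }
    close(p1)
    while ((getline line < p2) > 0) { split(line, f, "|"); region[f[1]] = f[2] }
    close(p2)
}
{
    # Emit each record immediately instead of buffering all of them.
    n = ($1 in name)   ? name[$1]   : "UNKNOWN"
    r = ($1 in region) ? region[$1] : "UNKNOWN"
    print $1, n, r, $2
}' "$dir/big.txt" > "$dir/gawkout.txt"

cat "$dir/gawkout.txt"
```

With this shape only the two small lookup tables stay in memory, so the size of the large file should matter much less. Whether that would avoid the hang seen under Cygwin32 is untested.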

Thankfully, I had Cygwin64 installed along with Cygwin32.
I started the Cygwin64 window and reran the script and it worked!!
It appears that Cygwin64 allocates a larger portion of memory by default, which would explain why my script was able to complete (I think).

I checked the web and found this link:
https://cygwin.com/cygwin-ug-net/setup-maxmem.html

After reading it, this may be a solution to the problem when using Cygwin32, but I defer this to people smarter than me.

If this is the right path for a solution, can someone provide some direction on using the peflags utility? I'm lost as to how to implement the command. There's also a link, 4-Gigabyte Tuning, that goes to: https://docs.microsoft.com/en-us/win...ectedfrom=MSDN
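
As a rough sketch of what the setup-maxmem page describes: peflags (from the Cygwin `rebase` package) can show and change the initial Cygwin heap size stored in an executable's header via its `--cygwin-heap` option. The path to gawk and the 1024 MB value below are assumptions chosen for illustration, and it is not confirmed that heap size is what causes the hang.

```shell
#!/bin/sh
# Assumptions: 32-bit Cygwin, the rebase package is installed (it provides
# peflags), and gawk lives at the path below. Adjust both for your setup.
exe=/usr/bin/gawk.exe
heap_mb=1024        # the user guide describes values between 4 and 2048 MB

if command -v peflags >/dev/null 2>&1 && [ -f "$exe" ]; then
    peflags --cygwin-heap "$exe"               # show the current value (0 means default)
    peflags --cygwin-heap="$heap_mb" "$exe"    # request a larger initial heap
else
    echo "peflags or $exe not found; nothing changed"
fi
```

The change is written into the executable itself, so it applies to every later run of that gawk binary until the value is set back to 0.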

If Cygwin64 is superior to Cygwin32, handles memory allocation dynamically, and won't give me issues when reading 1M+ lines, then I guess I don't have a problem!!

However, just to be safe, are the above links a way to increase memory when using Cygwin64 to eliminate any future memory issues?

Sorry for being so verbose, but I felt it's better to provide enough information the first time without having to go back and forth with Q&As.

Thanks in advance guys!
 
10-15-2019, 10:20 AM   #2
JeremyBoden
Senior Member
Registered: Nov 2011
Location: London, UK
Distribution: Debian
Posts: 1,947

Is there any reason why you can't just use the following?
Code:
cat file1 file2 file3 | sort > output-file
 
10-15-2019, 10:45 AM   #3
datafanatik
LQ Newbie
Registered: Oct 2019
Posts: 2
Original Poster

Yes, Jeremy. The 800k-line file requires manipulating its fields to look up data from file1 and file2, which have been loaded into associative arrays. I also apply logic within the gawk script to generate values based on business rules. For example, dates arrive in different formats and have to be converted to specific formats, and I have to concatenate fields to create new fields for the output file.
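
To make that kind of per-field work concrete, here is a small sketch under made-up assumptions: the input layout (date|last|first), the two date formats, and the helper name `iso_date` are all hypothetical and do not come from the real script or its business rules.

```shell
#!/bin/sh
# Prefer gawk, fall back to whatever awk is installed.
AWK=$(command -v gawk || command -v awk)

# Hypothetical input: date|last_name|first_name, with two date formats mixed.
result=$(printf '10/15/2019|Smith|John\n2019-10-16|Doe|Jane\n' |
"$AWK" -F'|' -v OFS='|' '
# Normalise a date to YYYY-MM-DD; only MM/DD/YYYY is converted here.
function iso_date(d,    p) {
    if (split(d, p, "/") == 3)
        return sprintf("%04d-%02d-%02d", p[3], p[1], p[2])
    return d    # assumed to be YYYY-MM-DD already
}
{
    full_name = $3 " " $2    # concatenate two fields into a new field
    print iso_date($1), full_name
}')

echo "$result"
```

This is work that `cat ... | sort` cannot do on its own, which is why a per-record awk pass is still needed.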
 
  

