LinuxQuestions.org


Wim Sturkenboom 10-10-2009 12:11 AM

Java and String
 
Note in advance: I'm coming from a C world and this is my first Java stuff.

I have the following piece of code to read an ASCII file; FileData is of type String.
Code:

    public static boolean ReadFile(File infile) {
        FileData = "";
        FileName = "";
        try {
            Scanner myfile = new Scanner(infile);

            try {
                while (myfile.hasNext()) {
                    FileData += myfile.nextLine();
                    LineCnt++;
                    FileData += "\n";
                }
                myfile.close();
            }
            ...
            ...

This works, but what worries me is that I cannot figure out what will happen when I read a massive text file. Somewhere memory will get exhausted and an exception should be thrown, but I can't find that exception.
Can somebody advise?

gzunk 10-10-2009 03:05 AM

If the JVM cannot allocate any more memory, an OutOfMemoryError is thrown. Strictly speaking it is an Error rather than an Exception, and it is unchecked, which means any method can throw it without declaring that it throws it.

But you really don't want that to happen, because once it has been thrown the program is usually in no state to recover and carry on.
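
To make the Error-versus-Exception point concrete, here is a minimal sketch. The class OomDemo and its helpers are hypothetical names made up for illustration; only OutOfMemoryError, Error and Exception come from the JDK.

```java
final class OomDemo {
    // Hypothetical helper: tries to allocate n chars and reports whether it worked.
    static boolean tryAllocate(int n) {
        try {
            char[] buf = new char[n];
            return buf.length == n;
        } catch (OutOfMemoryError e) {
            // A catch (Exception e) clause would never reach this point,
            // because OutOfMemoryError is not a subclass of Exception.
            return false;
        }
    }

    // True if OutOfMemoryError is an Error and not an Exception.
    static boolean isErrorNotException() {
        return Error.class.isAssignableFrom(OutOfMemoryError.class)
            && !Exception.class.isAssignableFrom(OutOfMemoryError.class);
    }

    public static void main(String[] args) {
        System.out.println("allocated: " + tryAllocate(1024));
        System.out.println("Error, not Exception: " + isErrorNotException());
    }
}
```

Catching the error like this is only sensible around a single large allocation where giving up cleanly is an option; most programs simply let it propagate.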

Also, you really don't want to be using Strings like that. In Java, strings are immutable: they never change once created. What actually happens in your program is that a new String gets allocated for FileData twice per loop iteration.

Firstly, memory gets allocated for a new String whose length is the current length of FileData plus the length of the new line, then the data is copied across, then the old String becomes unreachable so that the garbage collector can free it.

Secondly, memory gets allocated for another new String whose length is the current length of FileData plus the length of "\n", then the data is copied across, then the old String again becomes unreachable so that the garbage collector can free it.

Every line therefore copies everything read so far, twice, so the total work grows with the square of the file size. While each copy is being made, the old and the new version are both in memory, so you're going to run into trouble far sooner than you would expect.

Use StringBuffer (or StringBuilder, its unsynchronized counterpart since Java 5) for this kind of thing. It appends into a buffer that grows in large steps, so it doesn't copy everything on every append like String concatenation does.

The reason for String being immutable is so that the JVM can safely share String instances across your application: identical string literals such as "fred" appearing in several classes refer to a single interned instance, and nobody holding a reference can change it under anyone else. In the Sun JVM at the time of writing, substring() also returns a String that shares the original's character array, with its own offset and length.
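
The StringBuffer suggestion above can be sketched roughly as follows. This is an illustration under assumptions, not the original poster's code: the class LineJoiner and the method joinLines are made-up names, StringBuilder stands in for StringBuffer, and the input is any Reader so the example runs without a file on disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class LineJoiner {
    // Reads all lines from the source and joins them with "\n",
    // appending into one growing buffer instead of creating a new String per line.
    static String joinLines(Reader source) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            // Wrapped so callers are not forced to handle a checked exception.
            throw new UncheckedIOException(e);
        }
        // One final copy into an immutable String.
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(joinLines(new StringReader("first\nsecond\n")));
    }
}
```

For a real file, passing new FileReader(infile) as the source should work the same way.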

lutusp 10-10-2009 04:14 AM

For a large file, you don't want to concatenate onto the end of a string as you are doing -- it's very inefficient and for large files the slowdown becomes dramatic (Java has to copy the entire string built so far on each append operation). Instead, append each line to a StringBuffer instance. Then, after the file is completely read, call toString() on the StringBuffer to get the result as a String. Much faster and more efficient.

This is a mistake:

Code:

FileData += "\n";

Always use the platform line ending in a Java program; System.getProperty("line.separator") returns it. Remember that Java programs are supposed to be platform-portable.
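
As a small sketch of the platform line ending in use: the class LineEndings and the method withPlatformEndings are hypothetical names, and the only JDK call relied on is System.getProperty("line.separator").

```java
final class LineEndings {
    // Joins the given lines, terminating each with the platform's line separator.
    static String withPlatformEndings(String... lines) {
        String eol = System.getProperty("line.separator");
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append(eol);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(withPlatformEndings("first", "second"));
    }
}
```

This matters most when the text is written back to a file. Swing text components keep "\n" internally whatever the platform, so for text that is only displayed the original "\n" may well be adequate.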

For truly huge files, you may need to launch your program with a larger memory allotment:

Code:

$ java -Xmx1000m ProgramName
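
To check whether the -Xmx setting took effect, something like the following sketch could be run with and without the flag. HeapInfo and maxHeapMegabytes are made-up names; Runtime.maxMemory() is the standard call.

```java
final class HeapInfo {
    // Upper limit of the heap the JVM will try to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

The reported figure is usually close to, but not exactly, the value given on the command line.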

Wim Sturkenboom 10-10-2009 07:22 AM

Thanks to both.

OK, I will read up on StringBuffer.

I have two problems with nextLine.
1) It reads up to and including the line terminator but returns the string without it. When displaying the data in a text area, it is one long line unless I manually insert a line terminator.
2) It seems to ignore empty lines at the end of the text file and therefore does not give a true reflection of the file.

Because of this, I'm actually looking for a class and method that can read a file in one go (but have not found exactly that yet). I did find some other classes (http://java.sun.com/docs/books/tutorial/essential/io/) that I must look at; I will probably first determine the file size and then use a read. Any hints for this?
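
A possible explanation for the second point, assuming the loop still tests hasNext() as in the first post: hasNext() looks for another token, and trailing blank lines contain none, so the loop stops before reaching them. hasNextLine() asks about lines instead. The sketch below compares the two on an in-memory string; ScannerLines and its methods are hypothetical names.

```java
import java.util.Scanner;

final class ScannerLines {
    // Counts lines the way the original loop does: hasNext() paired with nextLine().
    static int countWithHasNext(String text) {
        int count = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNext()) {
                sc.nextLine();
                count++;
            }
        }
        return count;
    }

    // Counts lines with the matching pair: hasNextLine() and nextLine().
    static int countWithHasNextLine(String text) {
        int count = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextLine()) {
                sc.nextLine();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "a\n\n"; // one line of text followed by one empty line
        System.out.println(countWithHasNext(text));
        System.out.println(countWithHasNextLine(text));
    }
}
```

Even with hasNextLine(), the original terminators are not preserved, so reading the file in one go remains the more faithful approach.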

lutusp 10-10-2009 06:31 PM

In that case, why not read the entire file, not line by line as you are doing? If your intention is to get an internal representation of the exact content of the target file, you aren't going about it right.

Here is one example (somewhat out of date, but just to show the idea):

Code:

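  // Reads the whole file into a String using the platform default encoding.
  public String readFile(String path) throws IOException
  {
      File f = new File(path);
      // f.length() is a byte count; for common encodings it is at least the char count
      char[] buff = new char[(int)f.length()];
      FileReader fin  = new FileReader(f);
      try {
          int total = 0, n;
          // a single read() may return fewer chars than requested, so loop until EOF
          while (total < buff.length && (n = fin.read(buff, total, buff.length - total)) != -1) {
              total += n;
          }
          return new String(buff, 0, total);
      } finally {
          fin.close();
      }
  }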
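
On Java 7 or later, the same idea can be sketched with java.nio.file. The class WholeFile, its methods, and the choice of UTF-8 are assumptions made for the example; none of them come from the thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class WholeFile {
    // Reads the entire file and decodes it as UTF-8 (an assumption; pick the real encoding).
    static String read(Path path) {
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical self-check: writes text to a temporary file and reads it back.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("wholefile", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return read(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(roundTrip("line one\n\nline three\n\n"));
    }
}
```

Because the bytes are decoded in one piece, line terminators and trailing empty lines come back exactly as they are in the file.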


Wim Sturkenboom 10-10-2009 11:48 PM

Thanks lutusp,

Your reply came slightly too late: I had already reworked my code after some more research, and it now looks like this.
Code:

/*
 * To change this template, choose Tools | Templates
 * and open the template in the editor.
 */

package demo_read_a_file;

import java.io.File;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.FileNotFoundException;

import javax.swing.JOptionPane;
/**
 *
 * @author wim
 */
public class FileOperations {
    private static String FileName;
    private static String FileError;
    private static char[] FileData = null;
    private static long FileSize;

    public static String getFileName () {
        return FileName;
    }
    public static char[] getFileData () {
        return FileData;
    }
    public static long getFileSize () {
        return FileSize;
    }
    public static String getFileError () {
        return FileError;
    }

    public static boolean ReadFile(File infile) {
        // slightly modified from
        // http://www.exampledepot.com/egs/java.io/File2ByteArray.html?l=rel
        FileName = "";

        // Get the size of the file
        FileSize = infile.length();

        // You cannot create an array using a long type.
        // It needs to be an int type.
        // Before converting to an int type, check
        // to ensure that file is not larger than Integer.MAX_VALUE.
        if (FileSize > Integer.MAX_VALUE) {
            // File is too large
            FileError = "Software limitation; file too large";
            JOptionPane.showMessageDialog(null, FileError, "Software Limitation", JOptionPane.ERROR_MESSAGE);
            return false;
        }

        int offset=0;
        int numRead=0;
        FileData = new char[(int)FileSize];
        try {
            BufferedReader inputstream = new BufferedReader(new FileReader(infile));

            try {
                while (offset < FileData.length
                  && (numRead=inputstream.read(FileData, offset, FileData.length-offset)) >= 0) {
                    offset += numRead;
                }
                inputstream.close();
                // Ensure the whole file has been read in
                // (note: offset counts chars, while FileSize was measured in bytes)
                if (offset < FileData.length) {
                    FileError = "Could not completely read file "+infile.getName();
                    JOptionPane.showMessageDialog(null, FileError, "File Error", JOptionPane.ERROR_MESSAGE);
                    return false;
                }

            }
            catch (IOException e1) {
                try {
                    inputstream.close();
                }
                catch (IOException e2) {
                    FileError = "Error closing file";
                    JOptionPane.showMessageDialog(null, FileError, "File Error", JOptionPane.ERROR_MESSAGE);
                    return false;
                }
                FileError = "IOException occurred";
                JOptionPane.showMessageDialog(null, FileError, "File Error", JOptionPane.ERROR_MESSAGE);
                return false;
            }
        }
        catch (FileNotFoundException e) {
            FileError = "File '" + infile.getName() + "' not found";
            JOptionPane.showMessageDialog(null, FileError, "File Error", JOptionPane.ERROR_MESSAGE);
            return false;
        }

        FileName = infile.getName();
        FileError = "";
        return true;
    }

    public static void WriteFile() {

    }
}

If you see major issues with it, please let me know so I can learn.
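
One point that may be worth checking in the code above, offered as a sketch rather than a verdict: File.length() reports bytes, while a Reader delivers chars. For plain ASCII the two counts match, but with a multi-byte encoding the reader reaches end of file before the array is full, so the "Could not completely read file" branch could fire on a file that was in fact read in full. The class ByteVsChar is a made-up name and UTF-8 is assumed purely for illustration.

```java
import java.nio.charset.StandardCharsets;

final class ByteVsChar {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "cafe";
        String accented = "caf\u00e9"; // the last letter is e with an acute accent
        System.out.println(ascii.length() + " chars, " + utf8Bytes(ascii) + " bytes");
        System.out.println(accented.length() + " chars, " + utf8Bytes(accented) + " bytes");
    }
}
```

Comparing the number of chars read against the file size is therefore only reliable for single-byte encodings; treating end of file as success and keeping the actual char count would avoid the false alarm.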


All times are GMT -5.