How to Read File Without Loading File Into Memory in Java

Inexperienced programmers often think that Java's automatic garbage collection completely frees them from worrying about memory management. This is a common misperception: while the garbage collector does its best, it's entirely possible for even the best programmer to fall prey to crippling memory leaks. Let me explain.

A memory leak occurs when object references that are no longer needed are unnecessarily maintained. These leaks are bad. For one, they put unnecessary pressure on your machine as your programs consume more and more resources. To make things worse, detecting these leaks can be difficult: static analysis often struggles to precisely identify these redundant references, and existing leak detection tools track and report fine-grained information about individual objects, producing results that are hard to interpret and lack precision.

In other words, leaks are either too difficult to identify, or identified in terms that are too specific to be useful.

There are actually four categories of memory issues with similar and overlapping symptoms, but varied causes and solutions:

  • Performance: usually associated with excessive object creation and deletion, long delays in garbage collection, excessive operating system page swapping, and more.

  • Resource constraints: occurs when there's either too little memory available or your memory is too fragmented to allocate a large object. This can be native or, more commonly, Java heap-related.

  • Java heap leaks: the classic memory leak, in which Java objects are continuously created without being released. This is usually caused by latent object references.

  • Native memory leaks: associated with any continuously growing memory utilization that is outside the Java heap, such as allocations made by JNI code, drivers or even JVM allocations.

In this memory management tutorial, I'll focus on Java heap leaks and outline an approach to detect such leaks based on Java VisualVM reports and utilizing a visual interface for analyzing Java technology-based applications while they're running.

But before you can prevent and find memory leaks, you should understand how and why they occur. (Note: If you have a good handle on the intricacies of memory leaks, you can skip ahead.)

Memory Leaks: A Primer

For starters, think of memory leakage as a disease and Java's OutOfMemoryError (OOM, for brevity) as a symptom. But as with any disease, not all OOMs necessarily imply memory leaks: an OOM can occur due to the generation of a large number of local variables or other such events. On the other hand, not all memory leaks necessarily manifest themselves as OOMs, especially in the case of desktop applications or client applications (which aren't run for very long without restarts).

Why are these leaks so bad? Among other things, leaking blocks of memory during program execution often degrades system performance over time, as allocated but unused blocks of memory will have to be swapped out once the system runs out of free physical memory. Eventually, a program may even exhaust its available virtual address space, leading to the OOM.

Deciphering the OutOfMemoryError

As mentioned above, the OOM is a common indication of a memory leak. Essentially, the error is thrown when there's insufficient space to allocate a new object. Try as it might, the garbage collector can't find the necessary space, and the heap can't be expanded any further. Thus, an error emerges, along with a stack trace.

The first step in diagnosing your OOM is to determine what the error actually means. This sounds obvious, but the answer isn't always so clear. For example: Is the OOM appearing because the Java heap is full, or because the native heap is full? To help you answer this question, let's analyze a few of the possible error messages:

  • java.lang.OutOfMemoryError: Java heap space

  • java.lang.OutOfMemoryError: PermGen space

  • java.lang.OutOfMemoryError: Requested array size exceeds VM limit

  • java.lang.OutOfMemoryError: request <size> bytes for <reason>. Out of swap space?

  • java.lang.OutOfMemoryError: <reason> <stack trace> (Native method)

"Java heap space"

This error message doesn't necessarily imply a memory leak. In fact, the problem can be as simple as a configuration issue.

For example, I was responsible for analyzing an application which was consistently producing this type of OutOfMemoryError. After some investigation, I figured out that the culprit was an array instantiation that was demanding too much memory; in this case, it wasn't the application's fault, but rather, the application server was relying on the default heap size, which was too small. I solved the problem by adjusting the JVM's memory parameters.
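When a heap-size configuration issue like this is suspected, it can help to confirm which limits the JVM is actually running with. The following is a minimal sketch, not the code from the application described above: the class name HeapReport is made up for illustration, and it only relies on the standard java.lang.Runtime methods.

```java
// Hypothetical helper (not part of the original application): reports the
// heap limits the running JVM is actually using.
class HeapReport {
    static final long MB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** Heap currently occupied (live objects plus uncollected garbage), in megabytes. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    public static void main(String[] args) {
        System.out.println("Max heap:  " + maxHeapMb() + " MB");
        System.out.println("Used heap: " + usedHeapMb() + " MB");
    }
}
```

Running it with an explicit limit, for example java -Xmx512m HeapReport, should report a maximum heap in the neighborhood of 512 MB; the exact figure depends on the JVM and collector in use.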

In other cases, and for long-lived applications in particular, the message might be an indication that we're unintentionally holding references to objects, preventing the garbage collector from cleaning them up. This is the Java language equivalent of a memory leak. (Note: APIs called by an application could also be unintentionally holding object references.)
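To make the idea of unintentionally held references concrete, here is a small sketch. Everything in it (the SessionRegistry class, its methods, the 1 KB buffer) is invented for illustration; it simply shows a static map that keeps objects reachable after the program is logically done with them, next to a version that releases them.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: a registry that remembers every session ever opened.
class SessionRegistry {
    // A static map is reachable for the lifetime of the class, so everything
    // stored in it stays reachable (and uncollectable) too.
    private static final Map<Integer, byte[]> SESSIONS = new HashMap<>();

    static void open(int id) {
        SESSIONS.put(id, new byte[1024]); // 1 KB of per-session state
    }

    // Leaky version: the session is "closed" logically, but its entry
    // (and the buffer it references) is never removed from the map.
    static void closeLeaky(int id) {
        // intentionally does nothing
    }

    // Fixed version: dropping the reference lets the GC reclaim the buffer.
    static void close(int id) {
        SESSIONS.remove(id);
    }

    static int retained() {
        return SESSIONS.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            open(i);
            closeLeaky(i);
        }
        System.out.println("After leaky close: " + retained() + " entries retained");
        for (int i = 0; i < 1000; i++) {
            close(i);
        }
        System.out.println("After real close:  " + retained() + " entries retained");
    }
}
```

In a long-lived process, the leaky variant grows by one entry per session until the heap fills up, which is the pattern the "Java heap space" message can point to.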

Another potential source of these "Java heap space" OOMs arises with the use of finalizers. If a class has a finalize method, then objects of that type do not have their space reclaimed at garbage collection time. Instead, after garbage collection, the objects are queued for finalization, which occurs later. In the Sun implementation, finalizers are executed by a daemon thread. If the finalizer thread cannot keep up with the finalization queue, then the Java heap could fill up and an OOM could be thrown.

"PermGen space"

This error message indicates that the permanent generation is full. The permanent generation is the area of the heap that stores class and method objects. If an application loads a large number of classes, then the size of the permanent generation might need to be increased using the -XX:MaxPermSize option.

Interned java.lang.String objects are also stored in the permanent generation. The java.lang.String class maintains a pool of strings. When the intern method is invoked, the method checks the pool to see if an equivalent string is present. If so, it's returned by the intern method; if not, the string is added to the pool. In more precise terms, the java.lang.String.intern method returns a string's canonical representation; the result is a reference to the same class instance that would be returned if that string appeared as a literal. If an application interns a large number of strings, you might need to increase the size of the permanent generation.
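The behavior of intern described above can be checked with a short sketch. The class name InternDemo and the string "leak" are arbitrary choices for this illustration; only String.intern itself comes from the JDK.

```java
// Illustrative only: shows that intern() returns the pooled (canonical) instance.
class InternDemo {
    /** Compares a literal with a freshly constructed copy by reference. */
    static boolean sameBeforeIntern() {
        String literal = "leak";           // literals are pooled automatically
        String built = new String("leak"); // a distinct object with equal contents
        return literal == built;
    }

    /** Compares the literal with the interned form of the copy by reference. */
    static boolean sameAfterIntern() {
        String literal = "leak";
        String built = new String("leak");
        return literal == built.intern();  // intern() returns the pooled instance
    }

    public static void main(String[] args) {
        System.out.println("Same reference before intern(): " + sameBeforeIntern()); // false
        System.out.println("Same reference after intern():  " + sameAfterIntern());  // true
    }
}
```

Each distinct string passed to intern adds an entry to the pool, which is why interning a very large number of different strings can put pressure on the area that holds it.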

Note: you can use the jmap -permgen command to print statistics related to the permanent generation, including information about internalized String instances.

"Requested array size exceeds VM limit"

This error indicates that the application (or APIs used by that application) attempted to allocate an array that is larger than the heap size. For example, if an application attempts to allocate an array of 512MB but the maximum heap size is 256MB, then an OOM will be thrown with this error message. In most cases, the problem is either a configuration issue or a bug that results when an application attempts to allocate a massive array.
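A minimal sketch of how an oversized array request surfaces is shown below. The class HugeArrayDemo is hypothetical, and the exact message text depends on the JVM and its heap settings, so the code reports whatever message the error carries rather than assuming one.

```java
// Illustrative only: requests an array and reports how the JVM responds.
class HugeArrayDemo {
    /** Tries to allocate an int array of the given length and describes the outcome. */
    static String tryAllocate(int length) {
        try {
            int[] data = new int[length];
            return "allocated " + data.length + " ints";
        } catch (OutOfMemoryError e) {
            // The message distinguishes heap exhaustion from an array-size limit.
            return "OutOfMemoryError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAllocate(16));
        // Integer.MAX_VALUE ints would need roughly 8 GB in a single array.
        System.out.println(tryAllocate(Integer.MAX_VALUE));
    }
}
```

Catching OutOfMemoryError is done here purely to print the message; in production code it is rarely safe to continue after one.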

"Request <size> bytes for <reason>. Out of swap space?"

This message appears to be an OOM. However, the HotSpot VM throws this apparent exception when an allocation from the native heap failed and the native heap might be close to exhaustion. Included in the message are the size (in bytes) of the request that failed and the reason for the memory request. In most cases, the <reason> is the name of the source module that's reporting an allocation failure.

If this type of OOM is thrown, you might need to use troubleshooting utilities on your operating system to diagnose the issue further. In some cases, the problem might not even be related to the application. For example, you might see this error if:

  • The operating system is configured with insufficient swap space.

  • Another process on the system is consuming all available memory resources.

It's also possible that the application failed due to a native leak (for example, if some bit of application or library code is continuously allocating memory but fails to release it to the operating system).

<reason> <stack trace> (Native method)

If you see this error message and the top frame of your stack trace is a native method, then that native method has encountered an allocation failure. The difference between this message and the previous is that the Java memory allocation failure was detected in a JNI or native method rather than in Java VM code.

If this type of OOM is thrown, you might need to use utilities on the operating system to further diagnose the issue.

Application Crash Without OOM

Occasionally, an application might crash soon after an allocation failure from the native heap. This occurs if you're running native code that doesn't check for errors returned by memory allocation functions.

For example, the malloc function returns NULL if there is no memory available. If the return from malloc is not checked, then the application might crash when it attempts to access an invalid memory location. Depending on the circumstances, this type of issue can be hard to locate.

In some cases, the information from the fatal error log or the crash dump will be sufficient. If the cause of a crash is determined to be a lack of error-handling in some memory allocations, then you must hunt down the reason for said allocation failure. As with any other native heap issue, the system might be configured with insufficient swap space, another process might be consuming all available memory resources, etc.

Diagnosing Leaks

In most cases, diagnosing memory leaks requires very detailed knowledge of the application in question. Warning: the process can be lengthy and iterative.

Our strategy for hunting down memory leaks will be relatively straightforward:

  1. Identify symptoms

  2. Enable verbose garbage collection

  3. Enable profiling

  4. Analyze the trace

1. Identify Symptoms

As discussed, in many cases, the Java process will eventually throw an OOM error at runtime, a clear indicator that your memory resources have been exhausted. In this case, you need to distinguish between a normal memory exhaustion and a leak. Analyze the OOM's message and try to find the culprit based on the discussions provided above.

Oftentimes, if a Java application requests more storage than the runtime heap offers, it can be due to poor design. For instance, if an application creates multiple copies of an image or loads a file into an array, it will run out of storage when the image or file is very large. This is a normal resource exhaustion. The application is working as designed (although this design is clearly boneheaded).

But if an application steadily increases its memory utilization while processing the same kind of data, you might have a memory leak.
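As a contrast to the "load the whole file into an array" design mentioned above, the sketch below streams a file line by line so that memory use stays roughly constant regardless of file size. The LineCounter class and its demo method are made up for this illustration; the I/O calls are standard java.nio.file and java.io APIs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Illustrative only: processes a file without holding its full contents in memory.
class LineCounter {
    /** Streams the file; only the current line and the reader's buffer are in memory. */
    static long countLines(Path file) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            while (reader.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    /** Writes a three-line temporary file, counts its lines, and cleans up. */
    static long demo() {
        try {
            Path tmp = Files.createTempFile("line-counter", ".txt");
            try {
                Files.write(tmp, Arrays.asList("first", "second", "third"), StandardCharsets.UTF_8);
                return countLines(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Lines counted: " + demo()); // 3
    }
}
```

By comparison, a call such as Files.readAllBytes needs heap space proportional to the file size, which is exactly the kind of normal resource exhaustion described above rather than a leak.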

2. Enable Verbose Garbage Collection

One of the quickest ways to assert that you indeed have a memory leak is to enable verbose garbage collection. Memory constraint problems can usually be identified by examining patterns in the verbosegc output.

Specifically, the -verbosegc argument (written -verbose:gc on HotSpot JVMs) allows you to generate a trace each time the garbage collection (GC) process is begun. That is, as memory is being garbage-collected, summary reports are printed to standard error, giving you a sense of how your memory is being managed.

Here's some typical output generated with the -verbosegc option:

[Figure: verbose garbage collection output]

Each block (or stanza) in this GC trace file is numbered in increasing order. To make sense of this trace, you should look at successive Allocation Failure stanzas and look for freed memory (bytes and percentage) decreasing over time while total memory (here, 19725304) is increasing. These are typical signs of memory depletion.
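As a programmatic complement to reading the verbose GC trace, the standard java.lang.management API exposes per-collector counters. The sketch below is only an illustration (the GcStats class is hypothetical, and collector names vary by JVM and configuration); it prints how many collections each collector has run and how long they took in total.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative only: summarizes GC activity using the platform MXBeans.
class GcStats {
    /** One line per collector: name, number of collections, accumulated time. */
    static String summary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", total time=").append(gc.getCollectionTime()).append(" ms\n");
        }
        return sb.toString();
    }

    /** Sum of collection counts across collectors that report one. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.print(summary());
        System.out.println("Total collections so far: " + totalCollections());
    }
}
```

Logging these counters periodically gives a coarse view of the same trend the trace shows: collections becoming more frequent while freeing less memory each time.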

3. Enable Profiling

Different JVMs offer different ways to generate trace files to reflect heap activity, which typically include detailed information about the type and size of objects. This is called profiling the heap.

4. Analyze the Trace

This post focuses on the trace generated by Java VisualVM. Traces can come in different formats, as they can be generated by different Java memory leak detection tools, but the idea behind them is always the same: find a block of objects in the heap that should not be there, and determine if these objects accumulate instead of releasing. Of particular interest are transient objects that are known to be allocated each time a certain event is triggered in the Java application. The presence of many object instances that ought to exist only in small quantities generally indicates an application bug.

Finally, solving memory leaks requires you to review your code thoroughly. Learning about the type of object leaking can be very helpful and considerably speed up debugging.

How Does Garbage Collection Work in the JVM?

Before we start our analysis of an application with a memory leak issue, let's first look at how garbage collection works in the JVM.

The JVM uses a form of garbage collector called a tracing collector, which essentially operates by pausing the world around it, marking all root objects (objects referenced directly by running threads), and following their references, marking each object it sees along the way.
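The marking step described above can be sketched as a graph traversal. This is a toy model, not how any real JVM collector is implemented: the MarkDemo class, its Node type, and the four-object graph are all invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the mark phase of a tracing collector.
class MarkDemo {
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        Node(String name) { this.name = name; }
    }

    /** Returns every node reachable from the given roots. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // first visit: mark, then follow references
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    /** Builds a small graph with one root and returns how many nodes get marked. */
    static int demoReachableCount() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        b.refs.add(a); // cycle between reachable objects
        c.refs.add(d);
        d.refs.add(c); // cycle between unreachable objects
        return mark(Collections.singletonList(a)).size();
    }

    public static void main(String[] args) {
        System.out.println(demoReachableCount() + " of 4 objects are reachable"); // 2 of 4
    }
}
```

Note that c and d reference each other but are still unmarked, so a tracing collector can reclaim them. A leak, by contrast, is an object that is no longer needed yet is still reachable from some root.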

Java implements something called a generational garbage collector based upon the generational hypothesis, which states that the majority of objects that are created are quickly discarded, and objects that are not quickly collected are likely to be around for a while.

Based on this assumption, Java partitions objects into multiple generations. Here's a visual interpretation:

[Figure: Java partitions objects into multiple generations]

  • Young Generation - This is where objects start out. It has two sub-generations:

    • Eden Space - Objects start out here. Most objects are created and destroyed in the Eden Space. Here, the GC does Minor GCs, which are optimized garbage collections. When a Minor GC is performed, any objects that are still referenced are migrated to one of the survivor spaces (S0 or S1).

    • Survivor Space (S0 and S1) - Objects that survive Eden end up here. There are two of these, and only one is in use at any given time (unless we have a serious memory leak). One is designated as empty, and the other as live, alternating with every GC cycle.

  • Tenured Generation - Also known as the old generation (old space in Fig. 2), this space holds older objects with longer lifetimes (moved over from the survivor spaces, if they live for long enough). When this space is filled up, the GC does a Full GC, which costs more in terms of performance. If this space grows without bound, the JVM will throw an OutOfMemoryError - Java heap space.

  • Permanent Generation - A third generation closely related to the tenured generation, the permanent generation is special because it holds information required by the virtual machine to describe objects that do not have an equivalence at the Java language level. For example, objects describing classes and methods are stored in the permanent generation.
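The generations listed above are visible from inside a running JVM as memory pools. The following sketch uses the standard java.lang.management API to list them; the PoolReport class is hypothetical, and the pool names are JVM- and collector-specific (for instance, Java 8 and later report a Metaspace pool instead of a permanent generation).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Illustrative only: lists the memory pools (generations) of the running JVM.
class PoolReport {
    /** One line per pool: type (heap or non-heap), name, and current usage. */
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            sb.append(pool.getType()).append(" | ")
              .append(pool.getName())
              .append(" | used=").append(usage.getUsed() / 1024).append(" KB\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Watching the pool that corresponds to the old generation grow across successive reports is a simple, tool-free way to spot the unbounded growth described above.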

Java is smart enough to apply different garbage collection methods to each generation. The young generation is handled using a tracing, copying collector called the Parallel New Collector. This collector stops the world, but because the young generation is generally small, the pause is short.

For more information about the JVM generations and how they work in more detail, visit the Memory Management in the Java HotSpot™ Virtual Machine documentation.

Detecting a Memory Leak

To find memory leaks and eliminate them, you need the proper memory leak tools. It's time to detect and remove such a leak using the Java VisualVM.

Remotely Profiling the Heap with Java VisualVM

VisualVM is a tool that provides a visual interface for viewing detailed information about Java technology-based applications while they are running.

With VisualVM, you can view information related to local applications and those running on remote hosts. You can also capture data about JVM software instances and save the data to your local system.

In order to benefit from all of Java VisualVM's features, you should run the Java Platform, Standard Edition (Java SE) version 6 or above.

Enabling Remote Connection for the JVM

In a production environment, it's often difficult to access the actual machine on which our code will be running. Luckily, we can profile our Java application remotely.

First, we need to grant ourselves JVM access on the target machine. To do so, create a file called jstatd.all.policy with the following content:

          grant codebase "file:${java.home}/../lib/tools.jar" {
              permission java.security.AllPermission;
          };

Once the file has been created, we need to enable remote connections to the target VM using the jstatd - Virtual Machine jstat Daemon tool, as follows:

          jstatd -p <PORT_NUMBER> -J-Djava.security.policy=<PATH_TO_POLICY_FILE>                  

For example:

          jstatd -p 1234 -J-Djava.security.policy=D:\jstatd.all.policy                  

With jstatd started on the target machine, we are able to connect to it and remotely profile the application with memory leak issues.

Connecting to a Remote Host

On the client machine, open a prompt and type jvisualvm to open the VisualVM tool.

Next, we must add a remote host in VisualVM. As the target JVM is enabled to allow remote connections from another machine with J2SE 6 or greater, we start the Java VisualVM tool and connect to the remote host. If the connection with the remote host was successful, we will see the Java applications that are running in the target JVM, as seen here:

[Figure: Java applications running in the target JVM]

To run a memory profiler on the application, we just double-click its name in the side panel.

Now that we're all set up with a memory analyzer, let's investigate an application with a memory leak issue, which we'll call MemLeak.

MemLeak

Of course, there are a number of ways to create memory leaks in Java. For simplicity we will define a class to be a key in a HashMap, but we will not define the equals() and hashCode() methods.

A HashMap is a hash table implementation for the Map interface, and as such it defines the basic concepts of key and value: each value is related to a unique key, so if the key for a given key-value pair is already present in the HashMap, its current value is replaced.

It's mandatory that our key class provides a correct implementation of the equals() and hashCode() methods. Without them, the defaults inherited from Object compare keys by identity, so there is no guarantee that two logically identical keys will be treated as the same key.

By not defining the equals() and hashCode() methods, we add the same key to the HashMap over and over and, instead of replacing the existing entry as it should, the HashMap grows continuously, failing to identify these identical keys and eventually throwing an OutOfMemoryError.

Here's the MemLeak class:

          package com.post.memory.leak;

          import java.util.Map;

          public class MemLeak {
              // Key class deliberately lacks equals() and hashCode()
              public final String key;

              public MemLeak(String key) {
                  this.key = key;
              }

              public static void main(String[] args) {
                  try {
                      // System properties are a Properties object (a Hashtable subclass)
                      Map<Object, Object> map = System.getProperties();

                      // Every iteration adds a new entry: no two MemLeak keys are equal
                      for (;;) {
                          map.put(new MemLeak("key"), "value");
                      }
                  } catch (Exception e) {
                      e.printStackTrace();
                  }
              }
          }

Note: the memory leak is not due to the infinite for(;;) loop in main: the infinite loop can lead to a resource exhaustion, but not a memory leak. If we had properly implemented the equals() and hashCode() methods, the code would run fine even with the infinite loop, as we would only have one element inside the HashMap.
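The difference that equals() and hashCode() make can be seen in a bounded version of the same experiment. This is a sketch for illustration, not part of MemLeak: the KeyDemo class and its BadKey and GoodKey types are made-up names, and the loop is capped so that it terminates.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative only: compares map growth with and without equals()/hashCode().
class KeyDemo {
    // Same shape as MemLeak: inherits identity-based equals()/hashCode() from Object.
    static final class BadKey {
        final String key;
        BadKey(String key) { this.key = key; }
    }

    // Value-based equality: two keys with the same string are the same key.
    static final class GoodKey {
        final String key;
        GoodKey(String key) { this.key = key; }

        @Override
        public boolean equals(Object o) {
            return o instanceof GoodKey && Objects.equals(key, ((GoodKey) o).key);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(key);
        }
    }

    static int sizeWithBadKeys(int puts) {
        Map<Object, String> map = new HashMap<>();
        for (int i = 0; i < puts; i++) {
            map.put(new BadKey("key"), "value"); // always a new entry
        }
        return map.size();
    }

    static int sizeWithGoodKeys(int puts) {
        Map<Object, String> map = new HashMap<>();
        for (int i = 0; i < puts; i++) {
            map.put(new GoodKey("key"), "value"); // replaces the existing entry
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("Entries with BadKey:  " + sizeWithBadKeys(1000));  // 1000
        System.out.println("Entries with GoodKey: " + sizeWithGoodKeys(1000)); // 1
    }
}
```

With identity-based keys the map grows by one entry per put, which is the unbounded growth MemLeak exhibits; with value-based keys it stays at a single entry no matter how long the loop runs.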

(For those interested, here are some alternative means of (intentionally) generating leaks.)

Using Java VisualVM

With Java VisualVM, we can monitor the Java heap and identify whether its behavior is indicative of a memory leak.

Here's a graphical representation of MemLeak's Java Heap analyzer just after initialization (recall our discussion of the various generations):

[Figure: monitoring memory leaks using Java VisualVM]

After just 30 seconds, the Old Generation is almost full, indicating that, even with a Full GC, the Old Generation is ever-growing, a clear sign of a memory leak.

One means of detecting the cause of this leak is shown in the following image, generated using Java VisualVM with a heapdump. Here, we see that Hashtable$Entry objects account for 50% of the heap, while the second line points us to the MemLeak class. Thus, the memory leak is caused by a hash table used within the MemLeak class.

[Figure: hash table memory leak]

Finally, observe the Java Heap just after our OutOfMemoryError, in which the Young and Old generations are completely full.

[Figure: Java heap after the OutOfMemoryError]

Conclusion

Memory leaks are among the most difficult Java application problems to resolve, as the symptoms are varied and difficult to reproduce. Here, we've outlined a step-by-step approach to discovering memory leaks and identifying their sources. But above all, read your error messages closely and pay attention to your stack traces; not all leaks are as simple as they appear.

Appendix

Along with Java VisualVM, there are several other tools that can perform memory leak detection. Many leak detectors operate at the library level by intercepting calls to memory management routines. For example, HPROF is a simple command line tool bundled with the Java 2 Platform Standard Edition (J2SE) for heap and CPU profiling. The output of HPROF can be analyzed directly or used as an input for other tools like JHAT. When we work with Java 2 Enterprise Edition (J2EE) applications, there are a number of heap dump analyzer solutions that are friendlier, such as IBM Heapdumps for Websphere application servers.

pattontiontems.blogspot.com

Source: https://www.toptal.com/java/hunting-memory-leaks-in-java
