Contributed by Christopher Faylor

Cygwin has recently adopted something called the "cygwin heap".  This is
an internal heap that is inherited by forked/execed children.  It
consists of process-specific information that should be inherited.  So
things like the file descriptor table, the current working directory,
and the chroot value live there.

The cygheap is also used to pass argv information to a child process.
There is a problem here, though.  If you allocate space for argv on the
heap and then exec a process, child process (1) will happily use the
space in the heap.  But what happens when that process execs another
process (2)?  The space used by child process (1) is still allocated in
child process (2), but nothing there refers to it, so it is basically
just a memory leak.

To rectify this problem, memory used by child process (1) is tagged in
such a way that child process (2) will know to delete it.  This is done
in cygheap_fixup_in_child.

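The tagging idea can be sketched in portable C.  This is only an
illustration under stated assumptions, not the code in
cygheap_fixup_in_child: the names toy_heap, toy_alloc,
toy_fixup_after_exec and the three tags are hypothetical, and plain
malloc stands in for the cygheap allocator.  A block handed to the next
exec'ed process is demoted to "stale" when that process starts, and a
stale block is freed by the process after that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tags, invented for this sketch. */
enum block_tag {
    TAG_KEEP,       /* inherited for as long as the heap lives          */
    TAG_NEXT_EXEC,  /* meant for the next exec'ed process (e.g. argv)   */
    TAG_STALE       /* was used by the previous process; free on exec   */
};

struct block {
    struct block *next;
    enum block_tag tag;
    /* payload follows the header */
};

struct toy_heap {
    struct block *head;
    size_t live;    /* number of blocks currently on the list */
};

/* Allocate a tagged block and link it into the heap's block list. */
void *toy_alloc(struct toy_heap *h, enum block_tag tag, size_t size)
{
    struct block *b = malloc(sizeof *b + size);
    if (b == NULL)
        return NULL;
    b->next = h->head;
    b->tag = tag;
    h->head = b;
    h->live++;
    return b + 1;
}

/* Run once at startup of an exec'ed process.  Frees blocks that only
   the previous process needed and demotes the blocks handed to this
   process, so that the next exec'ed process will free them in turn.
   Returns the number of blocks freed. */
size_t toy_fixup_after_exec(struct toy_heap *h)
{
    size_t freed = 0;
    struct block **pp = &h->head;

    while (*pp != NULL) {
        struct block *b = *pp;
        if (b->tag == TAG_STALE) {
            *pp = b->next;          /* unlink, then release */
            free(b);
            assert(h->live > 0);
            h->live--;
            freed++;
            continue;
        }
        if (b->tag == TAG_NEXT_EXEC)
            b->tag = TAG_STALE;     /* still usable here, freed later */
        pp = &b->next;
    }
    return freed;
}
```

In this sketch the argv block allocated for child process (1) survives
the fixup in child (1) and is released by the fixup in child (2), which
is the behavior described above.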
The cygheap memory allocation functions are adapted from memory
allocators developed by DJ Delorie.  They are similar to early BSD
malloc and are intended to be relatively lightweight and relatively
fast.

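For readers unfamiliar with that style of allocator, here is a minimal
sketch of the general technique: power-of-two size classes, one free
list per class, and chunks carved from a fixed arena.  It is not DJ
Delorie's code or the cygheap allocator.  The names toy_cmalloc and
toy_cfree, the arena size, and the number of size classes are all
assumptions made for the example, and payloads are only aligned to the
size of the chunk header.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_CHUNK  16           /* smallest chunk, header included     */
#define NBUCKETS   8            /* chunk sizes 16, 32, ... 2048 bytes  */
#define ARENA_SIZE (64 * 1024)

/* Header in front of every chunk.  A free chunk stores the free-list
   link here; an allocated chunk stores its bucket number. */
union chunk {
    union chunk *next;
    size_t bucket;
};

static union chunk arena[ARENA_SIZE / sizeof(union chunk)];
static size_t arena_used;                 /* bytes carved so far */
static union chunk *freelist[NBUCKETS];

/* Smallest bucket whose chunk holds the header plus `size` bytes. */
static unsigned bucket_for(size_t size)
{
    size_t need = size + sizeof(union chunk);
    unsigned b = 0;
    while (((size_t)MIN_CHUNK << b) < need)
        b++;
    return b;
}

void *toy_cmalloc(size_t size)
{
    size_t largest = (size_t)MIN_CHUNK << (NBUCKETS - 1);
    if (size > largest - sizeof(union chunk))
        return NULL;                      /* too big for any bucket */

    unsigned b = bucket_for(size);
    union chunk *c = freelist[b];
    if (c != NULL) {
        freelist[b] = c->next;            /* reuse a freed chunk */
    } else {
        size_t total = (size_t)MIN_CHUNK << b;
        if (arena_used + total > sizeof arena)
            return NULL;                  /* arena exhausted */
        c = arena + arena_used / sizeof(union chunk);
        arena_used += total;
    }
    c->bucket = b;
    return c + 1;                         /* payload follows the header */
}

void toy_cfree(void *p)
{
    if (p == NULL)
        return;
    union chunk *c = (union chunk *)p - 1;
    unsigned b = (unsigned)c->bucket;
    assert(b < NBUCKETS);
    c->next = freelist[b];                /* push onto its bucket's list */
    freelist[b] = c;
}
```

Freeing never coalesces or returns memory to the arena, which is what
keeps this kind of allocator small and fast at the cost of some wasted
space.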
How is the cygheap propagated to the child?

Well, it depends on whether you are running on Windows 9x or Windows NT.

On both NT and 9x, just before CreateProcess is called in fork or exec,
a shared memory region is prepared to receive a copy of the cygwin
heap.  This is done in cygheap_setup_for_child.  The handle to this
shared memory region is passed to the new process in the 'child_info'
structure.

If there are no handles that need "fixing up" prior to starting another
process, cygheap_setup_for_child will also copy the contents of the
cygwin heap to the shared memory region.

If there are any handles that need "fixing up" prior to invoking
another process (i.e., sockets), then the creation of the shared
memory region and the copying of the current cygwin heap is a two-step
process.

First the shared memory region is created and the process is started
in a "CREATE_SUSPENDED" state, inheriting the handle.  After the
process is created, the fixup_before_*() functions are called.  These
set information in the heap and duplicate handles in the child,
essentially ensuring that the child's fd table is correct.

(Note that it is vital that the cygwin heap not grow during this
process.  Currently, there is no guard against this happening, so this
operation is not thread safe.)

Meanwhile, back in fork_parent, the function
cygheap_setup_for_child_cleanup is called.  In the simple "one step"
case above, all that happens is that the shared memory is unmapped and
the handle referring to it is closed.

In the two-step process, the cygheap is now copied to the shared memory
region, complete with new fdtab info (the child process will see the
updated information as soon as it starts).  Then the memory is unmapped,
the handle is closed, and upon return the child process is started.

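The ordering of the one-step and two-step cases can be modeled without
any Windows calls.  The following is a simulation only: plain arrays
stand in for the cygheap and the shared memory region, and every name
(sim_parent, sim_setup_for_child, sim_fixup_before,
sim_setup_for_child_cleanup) is hypothetical.  It does not model
CreateProcess, handle duplication, or how the parent's own heap is
treated after the fixup.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define HEAP_SIZE 64

struct sim_parent {
    unsigned char heap[HEAP_SIZE];    /* stands in for the cygheap       */
    unsigned char shared[HEAP_SIZE];  /* stands in for the shared region */
    bool copied;                      /* heap already copied to shared?  */
};

/* Before the child is created: always prepare the region, but copy the
   heap right away only when no handles need fixing up (one step). */
void sim_setup_for_child(struct sim_parent *p, bool needs_fixup)
{
    memset(p->shared, 0, sizeof p->shared);
    p->copied = false;
    if (!needs_fixup) {
        memcpy(p->shared, p->heap, sizeof p->heap);
        p->copied = true;
    }
}

/* Stand-in for a fixup_before_*() function: while the child is
   suspended, record in the heap the value the child should see. */
void sim_fixup_before(struct sim_parent *p, size_t slot,
                      unsigned char child_value)
{
    assert(slot < HEAP_SIZE);
    p->heap[slot] = child_value;
}

/* After the child exists: in the two-step case perform the deferred
   copy, so the region includes the fixed-up data. */
void sim_setup_for_child_cleanup(struct sim_parent *p)
{
    if (!p->copied) {
        memcpy(p->shared, p->heap, sizeof p->heap);
        p->copied = true;
    }
    /* Real code would unmap the view and close the handle here. */
}
```

The point of deferring the copy is visible in the model: a value written
by the fixup step reaches the region only because the copy happens
after it.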
It is in the child process that the difference between Windows 9x and
Windows NT becomes evident.

Under Windows NT, the operation is simple.  The shared memory handle is
used to map the information that the parent has set up into the cygheap
location in the child.  This means that the child has a copy of the
cygwin heap existing in "shared memory" but the only process with a view
to this "shared memory" is the child.

Under Windows 9x, due to address limitations, we can't just map the
shared memory region into the cygheap position.  So, instead, the memory
is mapped wherever Windows wants to put it, a new heap region is
allocated at the same place as in the parent, the contents of the shared
memory are *copied* to the new heap, and the shared memory is unmapped.
Simple, huh?

Why do we go to these contortions?  Previous versions (<1.3.3) of cygwin
used to block when creating a child so that the child could copy the
parent's cygheap.  The problem with this was that when a cygwin process
invoked a non-cygwin child, it would block forever waiting for the child
to let it know that it was done copying the heap.  That caused
understandable complaints from people who wanted to run non-cygwin
applications "in the background".

In Cygwin 1.3.3 (and presumably beyond) the location of the cygwin heap
has been fixed to be at the end of the cygwin1.dll address space.
Previously, we let the "OS" choose where to allocate the cygwin heap in
the initial cygwin process and attempted to use this same location in
subsequent cygwin processes started from this parent.

The reason for putting the cygheap at a fixed, known location is that
it incorporates pointers to entities within itself.  So, when a process
forks or execs, the memory referred to by the pointers has to exist at
the same place in both the parent and the child.

(It "might be nice" to use something like Microsoft's "based pointers"
for the cygheap.  Unfortunately gcc does not support that feature, as of
this writing.)

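A small, portable illustration of why the address matters.  The
structure and function names below are hypothetical and have nothing to
do with the real cygheap layout.  The structure refers to a string
stored inside itself twice, once through an absolute pointer and once
through an offset from its own start, the latter being a hand-rolled
approximation of what a based pointer provides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DATA_SIZE 64

struct toy_heap {
    char  *cwd_ptr;          /* absolute address inside this heap       */
    size_t cwd_off;          /* same target, as an offset from the base */
    char   data[DATA_SIZE];
};

/* Store a string in the heap and record both kinds of reference. */
void toy_set_cwd(struct toy_heap *h, const char *cwd)
{
    strncpy(h->data, cwd, DATA_SIZE - 1);
    h->data[DATA_SIZE - 1] = '\0';
    h->cwd_ptr = h->data;
    h->cwd_off = offsetof(struct toy_heap, data);
}

/* Resolve the offset against wherever this copy of the heap lives. */
const char *toy_cwd_by_offset(const struct toy_heap *h)
{
    return (const char *)h + h->cwd_off;
}

/* Does the stored absolute pointer point into this copy of the heap? */
int toy_ptr_is_internal(const struct toy_heap *h)
{
    return h->cwd_ptr == h->data;
}
```

After a byte-for-byte copy to a different address, the offset still
resolves inside the copy, while the absolute pointer still points into
the original.  A heap full of absolute pointers is therefore only usable
in the child if it lands at the same address as in the parent.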
The reason for choosing a fixed, arbitrary location is to accommodate
Windows XP, although there were sporadic complaints of cygwin heap
failures in other pathological situations with both NT and 9x.  In
Windows XP, Microsoft made the allocation of memory less deterministic.
This is certainly their right.  Cygwin was previously relying on
undocumented and "iffy" behavior.  So, now we always allocate space
immediately after the dll on the theory that there is not going to be
anything else living there.

Recent (2001-09-20) cygwin email threads have indicated that we're not
exactly on completely firm ground now, though.  We are assuming that
there is sufficient space after the cygwin DLL for the allocation of the
cygwin heap.  Unfortunately the ld option '--enable-auto-image-base'
has a tendency to allocate DLLs immediately after cygwin1.dll.  This
causes the dreaded "Couldn't reserve space for cygwin's heap" message.

Solutions for this behavior are currently in the musing state.