<sect1 id="ov-ex-win">
<title>Quick Start Guide for those more experienced with Windows</title>
<para>
If you are new to the world of UNIX, you may find it difficult to
understand at first. This guide is not meant to be comprehensive,
so we recommend that you use the many available Internet resources
to become acquainted with UNIX basics (search for "UNIX basics" or
"UNIX tutorial").
</para>
<para>
To install a basic Cygwin environment, run the
<command>setup.exe</command> program and click <literal>Next</literal>
on each page.  The default settings are correct for most users. If you
want to know more about what each option means, see
<xref linkend="internet-setup"></xref>. Use <command>setup.exe</command>
any time you want to update or install a Cygwin package.  If you are
installing Cygwin for a specific purpose, use it to install the tools
that you need. For example, if you want to compile C++ programs, you
need the <systemitem>gcc-g++</systemitem> package and probably a text
editor like <systemitem>nano</systemitem>.  When running
<command>setup.exe</command>, click on categories and packages in the
package installation screen to control what is installed or updated.
</para>
<para>
Another option is to install everything by clicking on the
<literal>Default</literal> field next to the <literal>All</literal>
category. However, be advised that this will download and install
several hundred megabytes of software to your computer. The best
plan is probably to click on individual categories and install either
entire categories or individual packages from within them.
After installation, you can find Cygwin-specific documentation in
the <literal>/usr/share/doc/Cygwin/</literal> directory.
</para>
<para>
Developers coming from a Windows background will find a set of tools capable of
writing console or GUI executables that rely on the Microsoft Win32 API.  The
<command>dlltool</command> utility may be used to write Windows Dynamically
Linked Libraries (DLLs).  The resource compiler <command>windres</command> is
also provided.
</para>
</sect1>

<sect1 id="ov-ex-unix">
<title>Quick Start Guide for those more experienced with UNIX</title>
<para>
If you are an experienced UNIX user who misses a powerful command-line
environment, you will enjoy Cygwin. Note that there are some workarounds
that cause Cygwin to behave differently from most UNIX-like operating
systems; these are described in more detail in
<xref linkend="using-effectively"></xref>.
</para>
<para>
Any time you want to update or install a Cygwin package, use the
graphical <command>setup.exe</command> program. By default,
<command>setup.exe</command> only installs a minimal set of packages,
so look around and choose your favorite utilities on the package
selection screen. You may also search for specific tools on the Cygwin
website's <ulink url="http://cygwin.com/packages/">Setup Package
Search</ulink>. For more information about what each option in
<command>setup.exe</command> means, see <xref linkend="internet-setup"></xref>.
</para>
<para>
Another option is to install everything by clicking on the
<literal>Default</literal> field next to the <literal>All</literal>
category. However, be advised that this will download and install
several hundred megabytes of software to your computer. The best
plan is probably to click on individual categories and install either
entire categories or individual packages from within them.
After installation, you can find Cygwin-specific documentation in
the <literal>/usr/share/doc/Cygwin/</literal> directory.
</para>
<para>
Developers coming from a UNIX background will find a set of utilities
they are already comfortable using, including a working UNIX shell.  The
compiler tools are the standard GNU compilers most people will have previously
used under UNIX, ported to the Windows host.  Programmers wishing to port
UNIX software to Windows NT or 9x will find that the Cygwin library provides
an easy way to port many UNIX packages with only minimal source code
changes.
</para>

</sect1>

<sect1 id="highlights"><title>Highlights of Cygwin Functionality</title>

<sect2 id="ov-hi-intro"><title>Introduction</title> <para>When a binary linked
against the library is executed, the Cygwin DLL is loaded into the
application's text segment.  Because we are trying to emulate a UNIX kernel,
which needs access to all processes running under it, the first Cygwin DLL to
run creates shared memory areas that other processes using separate instances
of the DLL can access.  These areas are used to keep track of open file
descriptors and to assist fork and exec, among other purposes.  In addition to
the shared memory regions, every process also has a per_process structure that
contains information such as process id, user id, signal masks, and other
similar process-specific information.</para>

<para>The DLL is implemented using the Win32 API, which allows it to run on all
Win32 hosts.  Because processes run under the standard Win32 subsystem, they
can access both the UNIX compatibility calls provided by Cygwin and
any of the Win32 API calls.  This gives programmers complete flexibility in
choosing which APIs their program uses.  For
example, they could write a Win32-specific GUI using Win32 API calls on top of
a UNIX back-end that uses Cygwin.</para>

<para>Early on in the development process, we made the important design
decision that it would not be necessary to strictly adhere to existing UNIX
standards like POSIX.1 where doing so was not possible or would significantly
diminish the usability of the tools on the Win32 platform.  In many cases, an
environment variable can be set to override the default behavior and force
standards compliance.</para>
</sect2>

<sect2 id="ov-hi-win9xnt"><title>Supporting both Windows NT and 9x</title>
<para>While Windows 95 and Windows 98 are similar enough to each other that we
can safely ignore the distinction when implementing Cygwin, Windows NT is an
extremely different operating system.  For this reason, whenever the DLL is
loaded, the library checks which operating system is active so that it can act
accordingly.</para>

<para>In some cases, the Win32 API differs between the two only for
historical reasons.  In this situation, the same basic functionality is
available under Windows 9x and NT, but the method used to reach it
differs.  A trivial example: in our implementation of
uname, the library examines the sysinfo.dwProcessorType structure
member to figure out the processor type under Windows 9x.  This field
is not supported in NT, which has its own operating system-specific
structure member called sysinfo.wProcessorLevel.</para>

<para>Other differences between NT and 9x are much more fundamental in
nature.  The best example is that only NT provides a security model.</para>
</sect2>

<sect2 id="ov-hi-perm"><title>Permissions and Security</title>
<para>Windows NT includes a sophisticated security model based on Access
Control Lists (ACLs).  By default, Cygwin maps Win32 file ownership and
permissions to the older, more standard UNIX model.  Cygwin version 1.1
introduced support for ACLs modeled on the system calls used on newer versions
of Solaris. This ability is used when the <literal>ntsec</literal> feature,
described in <xref linkend="ntsec"></xref>, is switched on.
The chmod call maps UNIX-style permissions
back to the Win32 equivalents.  Because many programs expect to be able to find
the /etc/passwd and /etc/group files, we provide <ulink
url="http://cygwin.com/cygwin-ug-net/using-utils.html#mount">utilities</ulink>
that can be used to construct them from the user and group information
provided by the operating system.</para>
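
<para>From a program's point of view, the mapping is visible only through the
ordinary POSIX calls.  The following is a minimal sketch, not taken from the
Cygwin sources: the helper name <literal>mode_after_chmod</literal> and the
scratch file name are hypothetical.  It creates a file, applies a mode with
chmod, and reads the permission bits back with stat.</para>
<programlisting><![CDATA[
```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path`, chmod it to `mode`, and return the
   permission bits that stat reports afterwards, or -1 on error. */
int mode_after_chmod(const char *path, mode_t mode)
{
    struct stat st;
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    close(fd);

    if (chmod(path, mode) != 0 || stat(path, &st) != 0) {
        unlink(path);
        return -1;
    }
    unlink(path);                       /* clean up the scratch file */
    return (int)(st.st_mode & 0777);    /* keep only the rwx bits */
}
```
]]></programlisting>
<para>On a system with full UNIX permissions the returned bits equal the
requested mode.  Where the underlying file system cannot store all of the
bits, the value read back may differ from the one requested.</para>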

<para>Under Windows NT, users with Administrator rights are permitted to
chown files.  With version 1.1.3 Cygwin introduced a mechanism for setting real
and effective UIDs under Windows NT/W2K. This is described in
<xref linkend="ntsec"></xref>.  As of version 1.5.13, the Cygwin developers
are not aware of any feature in the Cygwin DLL that would allow users to gain
privileges or to access objects to which they have no rights under Windows.
However, there is no guarantee that Cygwin is as secure as the Windows it runs
on. Cygwin processes share some variables and are thus easier targets for
denial-of-service attacks.
</para>

<para>Under Windows 9x, the situation is considerably different.  Since a
security model is not provided, Cygwin fakes file ownership by making all
files look like they are owned by a default user and group id.  As under NT,
file permissions can still be determined by examining their read/write/execute
status.  Rather than return an unimplemented error, the chown call under
Windows 9x succeeds immediately without performing any action.
This is appropriate since essentially all users jointly own the
files when no concept of file ownership exists.</para>

</sect2>

<sect2 id="ov-hi-files"><title>File Access</title> <para>Cygwin supports
both Win32- and POSIX-style paths, using either forward or back slashes as the
directory delimiter.  Paths coming into the DLL are translated from Win32 to
POSIX as needed.  As a result, the library believes that the file system is a
POSIX-compliant one, translating paths back to Win32 paths whenever it calls a
Win32 API function.  UNC pathnames (starting with two slashes) are
supported.</para>

<para>The layout of this POSIX view of the Windows file system space is stored
in the Windows registry.  While the slash ('/') directory points to the system
partition by default, this is easy to change with the Cygwin mount utility.
In addition to selecting the slash partition, it allows mounting arbitrary
Win32 paths into the POSIX file system space.  Many people use the utility to
mount each drive letter under the slash partition (e.g. C:\ to /c, D:\ to /d,
etc.).</para>
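
<para>The drive-letter convention just described can be sketched as a plain
string rewrite.  This is an illustration only: the function name
<literal>drive_path_to_posix</literal> is hypothetical, it is not part of the
Cygwin API, and it assumes every drive is mounted directly under the slash
directory rather than consulting the real mount table.</para>
<programlisting><![CDATA[
```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a Cygwin API: rewrite "C:\dir\file" as
   "/c/dir/file", assuming each drive letter is mounted directly under
   the slash directory.  Returns 0 on success, -1 if the input has no
   drive prefix or the output buffer is too small. */
int drive_path_to_posix(const char *win32, char *out, size_t outsize)
{
    size_t len, i;

    if (!isalpha((unsigned char)win32[0]) || win32[1] != ':')
        return -1;

    len = strlen(win32);                /* output is as long as the input */
    if (outsize < len + 1)
        return -1;

    out[0] = '/';
    out[1] = (char)tolower((unsigned char)win32[0]);
    for (i = 2; i < len; i++)
        out[i] = (win32[i] == '\\') ? '/' : win32[i];
    out[len] = '\0';
    return 0;
}
```
]]></programlisting>
<para>A real translation has to honor whatever mount points are configured,
so code that needs correct results should rely on the conversion facilities
Cygwin provides rather than on a fixed rule like this one.</para>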

<para>The library exports several Cygwin-specific functions that can be used
by external programs to convert a path or path list from Win32 to POSIX or vice
versa.  Shell scripts and Makefiles cannot call these functions directly.
Instead, they can do the same path translations by executing the cygpath
utility program that we provide with Cygwin.</para>

<para>Win32 file systems are case preserving but case insensitive.  Cygwin
does not currently support case distinction because, in practice, few UNIX
programs actually rely on it.  While we could mangle file names to support case
distinction, this would add unnecessary overhead to the library and make it
more difficult for non-Cygwin applications to access those files.</para>

<para>Symbolic links are emulated by files containing a magic cookie followed
by the path to which the link points.  They are marked with the System
attribute so that only files with that attribute have to be read to determine
whether or not the file is a symbolic link.  Hard links are fully supported
under Windows NT on NTFS file systems.  On a FAT file system, the link call
falls back to simply copying the file, a strategy that works in many
cases.</para>
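
<para>The cookie scheme can be sketched with ordinary file I/O.  The code
below is an illustration under stated assumptions, not the Cygwin
implementation: the cookie value <literal>LINK_COOKIE</literal> and both
function names are chosen for the example, and the System attribute check is
left out because it has no portable equivalent.</para>
<programlisting><![CDATA[
```c
#include <stdio.h>
#include <string.h>

/* Assumed cookie value, for illustration only. */
#define LINK_COOKIE "!<symlink>"

/* Write an emulated link file: the cookie followed by the target path. */
int write_fake_link(const char *linkfile, const char *target)
{
    FILE *fp = fopen(linkfile, "wb");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fputs(LINK_COOKIE, fp) != EOF && fputs(target, fp) != EOF;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the target back.  Returns 0 and fills `target` if the file starts
   with the cookie, 1 if it is an ordinary file, -1 on error. */
int read_fake_link(const char *linkfile, char *target, size_t size)
{
    char buf[1024];
    size_t n, cookie_len = strlen(LINK_COOKIE);
    FILE *fp = fopen(linkfile, "rb");

    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf - 1, fp);
    fclose(fp);
    buf[n] = '\0';

    if (n < cookie_len || memcmp(buf, LINK_COOKIE, cookie_len) != 0)
        return 1;                       /* no cookie: not a link */
    if (strlen(buf + cookie_len) + 1 > size)
        return -1;                      /* caller's buffer too small */
    strcpy(target, buf + cookie_len);
    return 0;
}
```
]]></programlisting>
<para>Because a reader has to open the file to find the cookie, restricting
the check to files that carry a marker attribute keeps the cost away from
ordinary files, which is the role the System attribute plays in the
description above.</para>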

<para>The inode number for a file is calculated by hashing its full Win32 path.
The inode number generated by the stat call always matches the one returned in
d_ino of the dirent structure.  It is worth noting that the number produced by
this method is not guaranteed to be unique.  However, we have not found this to
be a significant problem because of the low probability of generating a
duplicate inode number.</para>
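
<para>The text does not say which hash function is used, so the following
sketch substitutes 32-bit FNV-1a purely as a stand-in.  The function name
<literal>path_to_ino</literal> is hypothetical.  It shows the two properties
the paragraph relies on: the same path always yields the same number, and
distinct paths usually, but not provably, yield distinct numbers.</para>
<programlisting><![CDATA[
```c
#include <stdint.h>

/* Illustrative stand-in for the real hash: 32-bit FNV-1a over the bytes
   of a path string.  Deterministic, but collisions are possible. */
uint32_t path_to_ino(const char *path)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */

    while (*path != '\0') {
        h ^= (unsigned char)*path++;
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}
```
]]></programlisting>
<para>Since the number depends only on the path string, stat and a directory
read agree as long as both hash the same full path.</para>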

<para>Chroot has been supported since release 1.1.3. Note that chroot isn't
supported natively by Windows. This implies some restrictions. First of all,
the chroot call isn't a privileged call. Any user may call it. Second, the
chroot environment isn't safe against native Windows processes. If you
want to support a chroot environment, for example to allow
anonymous ftp with restricted access, you must ensure that only
native Cygwin applications are accessible inside the chroot environment.
Since those applications use only the Cygwin POSIX API to access the
file system, their access can be restricted as intended. This includes
not only POSIX paths but Win32 paths (containing drive letter and/or
backslashes) and CIFS paths (//server/share or \\server\share) as well.</para>
</sect2>

<sect2 id="ov-hi-textvsbinary"><title>Text Mode vs. Binary Mode</title>
<para>Interoperability with other Win32 programs such as text editors was
critical to the success of the port of the development tools.  Most Red Hat
customers upgrading from the older DOS-hosted toolchains expected the new
Win32-hosted ones to continue to work with their old development
sources.</para>

<para>Unfortunately, UNIX and Win32 use different end-of-line terminators in
text files.  Consequently, each carriage-return/newline pair has to be
translated on the fly by Cygwin into a single newline when reading in text
mode.</para>
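
<para>The translation itself is simple; the following sketch shows one way to
do it on a buffer that has already been read.  The function name
<literal>crlf_to_lf</literal> is hypothetical and the code is not taken from
the Cygwin sources.  Note that the returned length is smaller than the number
of bytes consumed from the file.</para>
<programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical helper: collapse each "\r\n" pair in buf[0..len) to "\n"
   in place and return the new length.  A lone '\r' is left untouched. */
size_t crlf_to_lf(char *buf, size_t len)
{
    size_t in, out = 0;

    for (in = 0; in < len; in++) {
        if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
            continue;                   /* drop the CR, keep the LF */
        buf[out++] = buf[in];
    }
    return out;
}
```
]]></programlisting>
<para>For the six bytes "a\r\nb\r\n" the function returns 4, so a caller that
counts the bytes it received no longer knows the true offset in the
file.</para>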

<para>This solution addresses the compatibility requirement at the expense of
violating the POSIX requirement that text and binary modes be
identical. Consequently, processes that attempt to lseek through text files can
no longer rely on the number of bytes read as an accurate indicator of position
in the file.  For this reason, the CYGWIN environment variable can be
set to override this behavior.</para>
</sect2>

<sect2 id="ov-hi-ansiclib"><title>ANSI C Library</title>
<para>We chose to include Red Hat's own existing ANSI C library
"newlib" as part of the library, rather than write all of the libc
and math calls from scratch.  Newlib is a BSD-derived ANSI C library,
previously only used by cross-compilers for embedded systems
development.</para>

<para>The reuse of existing free implementations of such things
as the glob, regexp, and getopt libraries saved us considerable
effort.  In addition, Cygwin uses Doug Lea's free malloc
implementation that successfully balances speed and compactness.  The
library accesses the malloc calls via an exported function pointer.
This makes it possible for a Cygwin process to provide its own
malloc if it so desires.</para>
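
<para>The function-pointer arrangement can be sketched in a few lines.  All
names below (<literal>set_allocator</literal>, <literal>lib_alloc</literal>,
<literal>counting_malloc</literal>) are hypothetical and do not correspond to
symbols exported by the Cygwin DLL; the sketch only shows the general
pattern of routing allocations through a replaceable pointer.</para>
<programlisting><![CDATA[
```c
#include <stddef.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t);

/* The "library" allocates only through this pointer. */
static alloc_fn current_malloc = malloc;

/* A replacement a program might install; it counts its calls. */
size_t counted_calls = 0;

void *counting_malloc(size_t size)
{
    counted_calls++;
    return malloc(size);
}

/* Install a replacement allocator; NULL restores the default. */
void set_allocator(alloc_fn fn)
{
    current_malloc = (fn != NULL) ? fn : malloc;
}

/* Every internal allocation goes through the pointer. */
void *lib_alloc(size_t size)
{
    return current_malloc(size);
}
```
]]></programlisting>
<para>A program that wants its own allocator calls
<literal>set_allocator</literal> once at startup; from then on the library
side uses the replacement without being recompiled.</para>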
</sect2>

<sect2 id="ov-hi-process"><title>Process Creation</title>
<para>The fork call in Cygwin is particularly interesting because it
does not map well on top of the Win32 API.  This makes it very
difficult to implement correctly.  Currently, the Cygwin fork is a
non-copy-on-write implementation similar to what was present in early
flavors of UNIX.</para>

<para>The first thing that happens when a parent process
forks a child process is that the parent initializes a space in the
Cygwin process table for the child.  It then creates a suspended
child process using the Win32 CreateProcess call.  Next, the parent
process calls setjmp to save its own context and sets a pointer to
this in a Cygwin shared memory area (shared among all Cygwin
tasks).  It then fills in the child's .data and .bss sections by
copying from its own address space into the suspended child's address
space.  After the child's address space is initialized, the child is
run while the parent waits on a mutex.  The child discovers it has
been forked and calls longjmp using the saved jump buffer.  The child then
sets the mutex the parent is waiting on and blocks on another mutex.
This is the signal for the parent to copy its stack and heap into the
child, after which it releases the mutex the child is waiting on and
returns from the fork call.  Finally, the child wakes from blocking on
the last mutex, recreates any memory-mapped areas passed to it via the
shared area, and returns from fork itself.</para>
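
<para>The setjmp and longjmp step is the part of this sequence that can be
shown in isolation.  The sketch below runs inside a single process and is
only an analogy: in the sequence described above the context is saved by the
parent and resumed by the child after its memory has been filled in.  The
function name <literal>resume_saved_context</literal> is hypothetical.</para>
<programlisting><![CDATA[
```c
#include <setjmp.h>

static jmp_buf saved_context;

/* Save a context with setjmp, then resume it with longjmp.  Returns the
   number of times control passed the setjmp call. */
int resume_saved_context(void)
{
    /* volatile: the value is modified between setjmp and longjmp and
       must still be valid after the jump. */
    volatile int passes = 0;

    if (setjmp(saved_context) == 0) {
        passes = 1;                     /* first return: context saved */
        longjmp(saved_context, 1);      /* jump back into the saved context */
    }
    passes = passes + 1;                /* reached only via longjmp */
    return passes;
}
```
]]></programlisting>
<para>The function returns 2: control passes the setjmp call once when the
context is saved and once more when longjmp resumes it, which is how the
child ends up executing from the point where the parent called fork.</para>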

<para>While we have some
ideas as to how to speed up our fork implementation by reducing the
number of context switches between the parent and child process, fork
will almost certainly always be inefficient under Win32.  Fortunately,
in most circumstances the spawn family of calls provided by Cygwin
can be substituted for a fork/exec pair with only a little effort.
These calls map cleanly on top of the Win32 API.  As a result, they
are much more efficient.  Changing the compiler's driver program to
call spawn instead of fork was a trivial change and increased
compilation speeds by twenty to thirty percent in our
tests.</para>
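
<para>The idea of replacing a fork/exec pair with a single call can be
sketched with posix_spawnp, a standard POSIX interface used here as a
portable stand-in.  It is not one of the Cygwin spawn calls the paragraph
refers to, and the helper name <literal>spawn_and_wait</literal> is
hypothetical.</para>
<programlisting><![CDATA[
```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Start argv[0] (looked up in PATH) with one call instead of a separate
   fork and exec, wait for it, and return its exit status, or -1 on
   failure. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```
]]></programlisting>
<para>Because the new program is named at creation time, the parent's address
space never has to be copied into the child, which is the cost that makes
fork expensive in the description above.</para>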

<para>However, spawn and exec present their own set of
difficulties.  Because there is no way to do an actual exec under
Win32, Cygwin has to invent its own Process IDs (PIDs).  As a
result, when a process performs multiple exec calls, there will be
multiple Windows PIDs associated with a single Cygwin PID.  In some
cases, stubs of each of these Win32 processes may linger, waiting for
their exec'd Cygwin process to exit.</para>
</sect2>

<sect2 id="ov-hi-signals"><title>Signals</title>
<para>When
a Cygwin process starts, the library starts a secondary thread for
use in signal handling.  This thread waits for Windows events used to
pass signals to the process.  When a process notices it has a signal,
it scans its signal bitmask and handles the signal in the appropriate
fashion.</para>

<para>Several complications in the implementation arise from the
fact that the signal handler operates in the same address space as the
executing program.  The immediate consequence is that Cygwin system
functions are interruptible unless special care is taken to avoid
this.  We go to some lengths to prevent the sig_send function that
sends signals from being interrupted.  In the case of a process
sending a signal to another process, we place a mutex around sig_send
so that it will not be interrupted until it has completely
finished sending the signal.</para>

<para>In the case of a process sending
itself a signal, we use a separate semaphore/event pair instead of the
mutex.  sig_send starts by resetting the event and incrementing the
semaphore that flags the signal handler to process the signal.  After
the signal is processed, the signal handler signals the event to indicate
that it is done.  This procedure keeps intraprocess signals synchronous, as
required by POSIX.</para>
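
<para>The synchronous behavior is what a program observes when it signals
itself.  The following sketch uses only standard POSIX calls and does not
show sig_send or any Cygwin internals; the names
<literal>on_usr1</literal> and <literal>self_signal_is_synchronous</literal>
are hypothetical.  It checks that the handler has already run by the time
raise returns.</para>
<programlisting><![CDATA[
```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handled = 0;

static void on_usr1(int signo)
{
    (void)signo;
    handled = 1;
}

/* Install a handler, send SIGUSR1 to the current process, and report
   whether the handler had already run when raise returned. */
int self_signal_is_synchronous(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    handled = 0;
    if (raise(SIGUSR1) != 0)
        return -1;
    return handled;                     /* 1 if the handler ran first */
}
```
]]></programlisting>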

<para>Most standard UNIX signals are provided.  Job
control works as expected in shells that support
it.</para>
</sect2>

<sect2 id="ov-hi-sockets"><title>Sockets</title>
<para>Socket-related calls in Cygwin simply
call the functions by the same name in Winsock, Microsoft's
implementation of Berkeley sockets.  Only a few changes were needed to
match the expected UNIX semantics.  One of the most troublesome
differences was that Winsock must be initialized before the first
socket function is called.  As a result, Cygwin has to perform this
initialization when appropriate.  In order to support sockets across
fork calls, child processes initialize Winsock if any inherited file
descriptor is a socket.</para>

<para>Unfortunately, implicitly loading DLLs
at process startup is usually a slow affair.  Because many processes
do not use sockets, Cygwin explicitly loads the Winsock DLL the
first time it calls the Winsock initialization routine.  This single
change sped up GNU configure times by thirty
percent.</para>
</sect2>

<sect2 id="ov-hi-select"><title>Select</title>
<para>The UNIX select function is another
call that does not map cleanly on top of the Win32 API.  Much to our
dismay, we discovered that the Win32 select in Winsock only worked on
socket handles.  Our implementation allows select to function normally
when given different types of file descriptors (sockets, pipes,
handles, and a custom /dev/windows pseudo-device for Windows
messages).</para>

<para>Upon entry into the select function, the first
operation is to sort the file descriptors into the different types.
There are then two cases to consider.  The simple case is when at
least one file descriptor is a type that is always known to be ready
(such as a disk file).  In that case, select returns as
soon as it has polled each of the other types to see if they are
ready.  The more complex case involves waiting for socket or pipe file
descriptors to be ready.  This is accomplished by the main thread
suspending itself, after starting one thread for each type of file
descriptor present.  Each thread polls the file descriptors of its
respective type with the appropriate Win32 API call.  As soon as a
thread identifies a ready descriptor, that thread signals the main
thread to wake up.  This case is now the same as the first one since
we know at least one descriptor is ready.  So select returns, after
polling all of the file descriptors one last time.</para>
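
<para>For an application, the practical result is that select can be called
on a pipe exactly as on UNIX.  The sketch below shows the caller's side only,
using standard POSIX calls; it does not reproduce the thread-per-type
mechanism described above, and the function name
<literal>pipe_readable</literal> is hypothetical.</para>
<programlisting><![CDATA[
```c
#define _POSIX_C_SOURCE 200809L
#include <sys/select.h>
#include <unistd.h>

/* Create a pipe, optionally write one byte to it, and poll the read end
   with a zero timeout.  Returns the select result: the number of ready
   descriptors, or -1 on error. */
int pipe_readable(int write_first)
{
    int fds[2], ready;
    fd_set readfds;
    struct timeval tv = { 0, 0 };       /* do not block */

    if (pipe(fds) != 0)
        return -1;
    if (write_first && write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    FD_ZERO(&readfds);
    FD_SET(fds[0], &readfds);
    ready = select(fds[0] + 1, &readfds, NULL, NULL, &tv);

    close(fds[0]);
    close(fds[1]);
    return ready;
}
```
]]></programlisting>
<para>With a byte waiting in the pipe the call reports one ready descriptor;
with an empty pipe and a zero timeout it reports none.</para>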
</sect2>
</sect1>