

This document is also available as part number 95114-001.
An interface is MT-safe if
MT-safe is defined in the Sun document Interfaces and Threads: A Taxonomy and Guidelines. (7 pages, or 60 K postscript.) Some practical hints on how to make your interfaces MT-safe are collected in a hints file.
Quoting from the Guide to Multithread Programming, Nov 1993, page 42:
There is no general need to allocate stack space for threads. The threads library allocates one megabyte of virtual memory for each thread's stack with no swap space reserved. (The library uses the MAP_NORESERVE option of mmap(2) to do the allocations.)
Each thread stack created by the threads library has a red zone. A stack red zone is made by rounding the stack size to the next page boundary and appending an invalid page. Red zones are appended to all automatically allocated stacks whether the size is specified by the application or the default size is used.
Don't specify stacks or their sizes to thr_create() unless you're absolutely certain you know it's correct. There are very few occasions when it is sensible to specify a stack, its size or both to thr_create(). It is difficult even for an expert to know if the size was specified right. The problem is that an ABI compliant program can't determine its stack size statically. Its size is dependent on the needs of the particular runtime in which it executes.
If you specify the size of a thread stack, be sure to take into account the allocations needed by the invoked function and by each function called. The accounting should include calling sequence needs, local variables and information structures.
Date: Thu, 21 Apr 1994 10:19:50 +0500
From: Richard.Marejka@Canada (Richard Marejka - OpCom Staff)
Subject: Re: Thread Stack Size question by ImagiNation Networks
Gentlemen,
Below is the e-mail that I sent to Vince with regard to thread size. And further to neils@nymets. A small program that I wrote this morning that just mmap's /dev/zero with MAP_NORESERVE was limited to between 2,000,000,000 and 2,100,000,000 bytes. Therefore Neil's concerns about exceeding process size are quite valid.
RWMarejka
Solaris 2 Migration Support Centre (and threads too)
Date: Tue, 19 Apr 1994 17:00:07 +0500
From: richard@scooter (Richard Marejka - OpCom Staff)
Subject: Re: Thread stack size
The stack memory is mmapped from /dev/zero. Pages are only allocated when they are needed/referenced. For example, the threads library uses a default stack size of 1Mb. The virtual address space of the process will increase by 1Mb, however, the amount of required resources (swap and physical memory) will only increase as the memory is used.
That being said, you can create your own stacks. Such stacks can be any size you need provided they are larger than the return value of thr_min_stack(3T). To determine the amount of stack required you would have to have detailed information about the call graph of the thread. Then estimate the amount of local (automatic) storage for each function and total for the leaves in the graph. As well, you need to add a "slop" factor for register window overflow; this is dictated by the depth of the call graph. For every 8 levels you will need register window storage of about:
16 registers * 4 bytes / register * 8 levels = 512 bytes
You also have to account for functions that you call and estimate storage required by them.
Having said all this, simple experiments here show that about 16 - 32 kb is usually enough (provided you don't have large local arrays and data structures).
Yet more... If you create a stack for the library you should supply your own red-zone to provide stack overflow detection.
And another thing... Avoid using malloc(3) for the memory. Probably the best thing to do is to call thr_create(3T) with a NULL stack pointer and a non-zero stacksize.
Wow, hope this helps,
Richard Marejka
Solaris 2 Migration Support Centre (and threads too)
If your threads need local storage, use thr_keycreate(3t), as in this tiny example.
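As a rough illustration of thread-specific data, the sketch below uses the POSIX counterparts pthread_key_create(), pthread_setspecific() and pthread_getspecific(). This assumes a pthreads library; the Solaris calls thr_keycreate(), thr_setspecific() and thr_getspecific() follow the same pattern. The names slot_key, my_counter(), bump() and tsd_demo() are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t slot_key;
static pthread_once_t slot_once = PTHREAD_ONCE_INIT;

/* Create the key exactly once; free() is run on each thread's
 * non-NULL value when that thread exits. */
static void make_key(void)
{
    (void)pthread_key_create(&slot_key, free);
}

/* Return the calling thread's private counter, allocating it on first use. */
static int *my_counter(void)
{
    int *p;

    pthread_once(&slot_once, make_key);
    p = pthread_getspecific(slot_key);
    if (p == NULL) {
        p = calloc(1, sizeof *p);
        if (p != NULL)
            pthread_setspecific(slot_key, p);
    }
    return p;
}

/* Increment the private counter *arg times, then report the total in *arg. */
static void *bump(void *arg)
{
    int *n = arg;
    int *c = my_counter();
    int i;

    if (c == NULL) {
        *n = -1;
        return NULL;
    }
    for (i = 0; i < *n; i++)
        ++*c;
    *n = *c;
    return NULL;
}

/* Returns 1 if each thread saw only its own increments. */
int tsd_demo(void)
{
    pthread_t a, b;
    int na = 3, nb = 5;

    if (pthread_create(&a, NULL, bump, &na) != 0)
        return 0;
    if (pthread_create(&b, NULL, bump, &nb) != 0) {
        pthread_join(a, NULL);
        return 0;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return na == 3 && nb == 5;
}
```

If the counter were an ordinary global, the two threads would share it and the totals would mix; with a key, each thread gets its own slot.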
Solaris (UNIX International, or UI) threads will be fully interoperable with POSIX pthreads, which is an emerging standard.
POSIX has not yet defined a standard for threads. The POSIX 1003.1c committee (previously the 1003.4a committee) is working on defining such a standard and is in Draft 9 of the specification at the time of writing. (September, 1994.) The final standard is expected to be quite close to the D9 specification.
Semantically and functionally, the Solaris or UNIX International (UI) thread interface is identical to that of the current POSIX D9 threads draft. There are some syntactic differences which will not prevent full interoperability in any way. Programming to the currently shipping Solaris threads API gives developers an interface they can write to now, and which will be supported in the future. At the same time, since the Solaris and POSIX interfaces are semantically identical, there will be a smooth transition path for the application to convert to the eventual POSIX interface, if such a transition is necessary. In most cases, only syntactic name changes will be required to port the application to POSIX threads.
One difference between the two is that Solaris has no equivalent for POSIX's pthread_cancel.
An early access version of the POSIX threads draft 8 specification is available from SunSoft. To order a version of this threads package, send mail to threads@sun.com or call 1-800-363-6200 (the Solaris Developer Support Center.) This package comes with a document which compares the two interfaces in detail, with examples showing the similarities and differences between the two interfaces.
For more details, refer to the 59-page document, pthreads and Solaris threads: A comparison of two user-level threads APIs, Early Access Edition, May 1994; pthreads based on POSIX 1003.4a/D8. (This document is a 350kb postscript file.)
The default for thread creation is to create UNDETACHED threads. This means that the thread structure stays around after the thread exits, so that the creating thread can do a thr_join on it. If you don't do a thr_join, then you want to create your threads with an explicit DETACHED flag (THR_DETACHED).
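A minimal sketch of creating a thread that nobody will join is shown below. It uses the POSIX attribute PTHREAD_CREATE_DETACHED, assuming a pthreads library; with Solaris threads the corresponding step is passing THR_DETACHED in the flags argument of thr_create(). The names task() and start_detached() are made up for the example.

```c
#include <pthread.h>

/* Trivial thread body; its resources are reclaimed automatically on exit. */
static void *task(void *arg)
{
    (void)arg;
    return NULL;
}

/* Start a detached thread. Returns 0 on success, an error number otherwise. */
int start_detached(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (err == 0)
        err = pthread_create(&tid, &attr, task, NULL);

    pthread_attr_destroy(&attr);
    return err;
}
```

Once a thread is detached, joining it is an error, so pick one style per thread and stick with it.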
User-level threads, created by calling thr_create from libthread.so, are the primary portable programming interface. A good analogy is with standard UNIX I/O. UNIX provides read/write system calls, but C programmers use stdio functions like printf and scanf from libc.
In the same way, Solaris 2 provides LWPs, and application programmers typically program with threads.
Threads are lighter than LWPs, in that threads do not require kernel resources. You can use thousands of threads per process. Thread synchronization is part of the user-level threads library, so a process does not need to enter the kernel to synchronize threads. It is much faster to cause an LWP to switch from running a user-level thread to running a different user-level thread than it is to schedule the running of an LWP, which is a kernel-schedulable entity.
Since threads may be bound to LWPs, you can cause a user-level thread to have the status of a kernel schedulable entity, if that is important to your application.
The following performance numbers highlight the dramatic differences among threads, LWPs, and processes, as measured by creation time and synchronization time.
Thread creation:                    microseconds
_____________________________________
Unbound thread                      101
Bound thread                        422
Fork()                              3,057

Thread synchronization:             microseconds
_____________________________________
No contention mutexes               1.8
Process-local unbound               48
Process-shared                      105
Measuring an application's performance, we found that an RPC server that uses four threads, running on a 4-CPU SPARCstation 10, with Solaris 2.3, achieves the following performance gains:
On a CPU-bound remote procedure,
                            Per-thread    Aggregate
Single-threaded server      54.1 ms       15.6 ms
Multi-threaded server       22.7 ms       5.9 ms
On an I/O-bound procedure,
                            Per-thread    Aggregate
Single-threaded server      2,854 ms      1,138 ms
Multi-threaded server       1,837 ms      478 ms
The simple answer is "no".
The complex answer is: ANSI C only states that (3.5.3),
"An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 2.1.2.3. Furthermore, at every sequence point the value of the last stored object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously. What constitutes an access to an object that has volatile-qualified type is implementation-defined."
and from ANSI C (2.1.2.3)
"... The least requirements on a conforming implementation are: - At sequence points, volatile objects are stable in the sense that previous evaluations are complete and subsequent evaluations have not occurred. ..."
ANSI C provides no detail about the abstract machine with regard to multithreaded issues or multiprocessor hosted environments. I will suggest (although not stated in ANSI C) that the abstract machine envisaged by the committee was a single-threaded, single processor environment. ANSI started working on a C standard before 1984 when such issues were still in the "distant" future (remember the processors of the day were the Intel 80286 and Motorola 68010).
The answer: writing to a volatile-qualified type object does not cause a store buffer flush.
The thr_yield function does voluntary preemption in favour of a thread at the same priority within the same process. (There cannot be a runnable thread of higher priority!)
There are two timing issues with cond_wait and cond_signal. The first is the "lost wake-up" that is caused by a thread not owning the "mutex" executing a cond_signal using the CV associated with the mutex. This can cause the waiter to block forever, since condition signals do not pend. In the example below the wakeup is sent when the waiter is between the condition evaluation and cond_wait. Since the signal does not pend, it is lost and the consumer will block until (if) the condition variable is again signalled.
BAD PROGRAMMER!
        consumer                        producer
    1)  mutex_lock( lockp );
    2)  while ( !condition )
    3)                                  cond_signal( cvp );
    4)      cond_wait( cvp, lockp );
The solution to this is to write proper code. You must own the mutex before calling cond_signal().
GOOD PROGRAMMER!
        consumer                        producer
    1)  mutex_lock( lockp );
    2)  while ( !condition )
    2a)                                 mutex_lock( lockp );
    2b)                                 condition = TRUE;
    3)                                  cond_signal( cvp );
    3a)                                 mutex_unlock( lockp );
    4)      cond_wait( cvp, lockp );
The second timing issue with cond_wait and cond_signal comes from POSIX.1c/D10 11.4.3.6.1 (Multiple Awakenings by Condition Signal). In this section POSIX declares that a single cond_signal on a CV may cause more than one thread in the corresponding cond_wait to return. This will only happen on an MP system and is self-correcting if the condition evaluation occurs within a loop.
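The "good programmer" pattern can be sketched as compilable C. This version assumes POSIX pthreads (pthread_mutex_lock, pthread_cond_wait, pthread_cond_signal) in place of the Solaris mutex_lock, cond_wait and cond_signal calls; the names consumer(), producer() and handoff_demo() are invented for the example.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv   = PTHREAD_COND_INITIALIZER;
static int condition = 0;

static void *consumer(void *arg)
{
    int *seen = arg;

    pthread_mutex_lock(&lock);
    while (!condition)                  /* re-test after every wakeup */
        pthread_cond_wait(&cv, &lock);  /* releases lock while asleep */
    *seen = condition;
    pthread_mutex_unlock(&lock);
    return NULL;
}

static void producer(void)
{
    pthread_mutex_lock(&lock);          /* own the mutex first */
    condition = 1;
    pthread_cond_signal(&cv);
    pthread_mutex_unlock(&lock);
}

/* Returns 1 if the consumer observed the condition. */
int handoff_demo(void)
{
    pthread_t tid;
    int seen = 0;

    if (pthread_create(&tid, NULL, consumer, &seen) != 0)
        return 0;
    producer();
    pthread_join(tid, NULL);
    return seen;
}
```

Because the producer changes the condition only while holding the mutex, the consumer cannot be caught between its test and its wait, and the while loop absorbs any multiple awakenings.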
Yes, in fact semaphores and readers/writer locks are written in terms of mutex and condition variable. In turn, a condition variable is written in terms of a mutex. The implementation of the cond_timedwait function requires extra work. The thread library maintains a callout queue (much like the kernel's) in order to wake up threads in a cond_timedwait.
There is more to a CV than just a mutex. To simplify matters the mutex and CV should be considered the base primitives.
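To show how a higher-level primitive can be layered on the two base primitives, here is a toy counting semaphore built from one mutex and one condition variable. It is only a sketch under the assumption of a POSIX pthreads library, and it is not the library's actual implementation; the toy_sema_* names are hypothetical.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;     /* protects count */
    pthread_cond_t  nonzero;  /* signalled when count becomes positive */
    unsigned        count;
} toy_sema_t;

int toy_sema_init(toy_sema_t *s, unsigned count)
{
    int err = pthread_mutex_init(&s->lock, NULL);

    if (err != 0)
        return err;
    err = pthread_cond_init(&s->nonzero, NULL);
    if (err != 0) {
        pthread_mutex_destroy(&s->lock);
        return err;
    }
    s->count = count;
    return 0;
}

/* Block until the count is positive, then take one unit. */
void toy_sema_wait(toy_sema_t *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Give back one unit and wake a waiter, if any. */
void toy_sema_post(toy_sema_t *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->nonzero);
    pthread_mutex_unlock(&s->lock);
}

unsigned toy_sema_value(toy_sema_t *s)
{
    unsigned v;

    pthread_mutex_lock(&s->lock);
    v = s->count;
    pthread_mutex_unlock(&s->lock);
    return v;
}
```

Note that toy_sema_post() takes a mutex, so unlike the library's sema_post() it should not be called from a signal handler.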
They are awoken in priority order, based on the priority at the time of going to sleep on the mutex. Changing a thread's priority once it has entered this state will have no effect on its position within the sleep queue. Note that the situation is different for POSIX.1c, which requires that threads are awoken based on their priority at the time the owner calls pthread_mutex_unlock.
As for FIFO, don't even think about it! Assuming so means you're basing your application on order of execution, not synchronization of data objects.
The student who asked this question needs to re-think his approach.
The mutex_destroy function is present to invalidate a previously valid mutex, i.e. to place the mutex in a state where it can no longer be used. POSIX.1c states that attempts to use a "destroyed" mutex will result in undefined behavior. As for how it should be used, this is open to debate, since POSIX.1c leaves the effect on waiters or on the owner undefined when a mutex is destroyed.
We recommend that you program in a strictly correct sense, and call mutex_destroy() for all dynamically allocated mutexes when you're done with them.
POSIX does not implement it. So if you are writing a program that you intend to port to other platforms, avoid it. If you are only concerned with Solaris threads, then it is legal, acceptable, and works just fine.
The SPILT package implements a Solaris thread readers/writer lock written in terms of POSIX.1c primitives. The POSIX.1c/D9 document provides sample code for a RW lock in section 18.2.5 (Examples) that will work.
There is a bug on page 182, line 622 that should be:
pthread_cond_broadcast( &l->rcond );
SPILT (Solaris POSIX Interface Layer for Threads) has semaphores and RWlocks. It is available via http://www.sun.com/sunsoft/Products/Developer-products/sig/threads/apps.html
In Solaris threads, a zero-filled SV is always a valid USYNC_THREAD variable (this includes semaphores, however the initial count will then be zero). POSIX.1c requires that SVs be initialized by either using a function call, for example pthread_mutex_init or by assignment of a type-specific symbol, i.e. PTHREAD_MUTEX_INITIALIZER.
We encourage programmers using Solaris threads to use the initialization functions. The reason is that if they want to port to POSIX.1c using SPILT, the SPILT package cannot detect Solaris thread initialization by zero-fill, and the result would be an uninitialized SV (which would result in undefined behavior under POSIX.1c).
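The two POSIX.1c initialization styles, plus the matching destroy call for a dynamically allocated mutex, might look like the sketch below. It assumes a pthreads library; struct account and the account_* and accounts_live() names are hypothetical and exist only for the illustration.

```c
#include <pthread.h>
#include <stdlib.h>

/* Statically allocated SV: use the initializer symbol, not zero-fill. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static int live_accounts = 0;

struct account {
    pthread_mutex_t lock;   /* dynamically allocated SV */
    long balance;
};

static void adjust_live(int delta)
{
    pthread_mutex_lock(&table_lock);
    live_accounts += delta;
    pthread_mutex_unlock(&table_lock);
}

int accounts_live(void)
{
    int n;

    pthread_mutex_lock(&table_lock);
    n = live_accounts;
    pthread_mutex_unlock(&table_lock);
    return n;
}

struct account *account_new(long balance)
{
    struct account *a = malloc(sizeof *a);

    if (a == NULL)
        return NULL;
    /* Dynamically allocated SV: initialize with the function call. */
    if (pthread_mutex_init(&a->lock, NULL) != 0) {
        free(a);
        return NULL;
    }
    a->balance = balance;
    adjust_live(1);
    return a;
}

long account_deposit(struct account *a, long amount)
{
    long result;

    pthread_mutex_lock(&a->lock);
    a->balance += amount;
    result = a->balance;
    pthread_mutex_unlock(&a->lock);
    return result;
}

/* Caller must ensure no thread holds or waits on the lock any more. */
void account_free(struct account *a)
{
    pthread_mutex_destroy(&a->lock);
    free(a);
    adjust_live(-1);
}
```

Calling the init function explicitly costs little, and it keeps the code portable to libraries that cannot treat a zero-filled object as initialized.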
The only things that do not cause a potential preemption are SV _init functions.
RW locks are slower than mutex locks by a factor of two or three. The benefit of the RW lock is of course concurrent access by readers. This has to be weighed against: 1) how much concurrent read access is expected and 2) for how long. A mutex can be used in place of an RW lock and the worst that will happen is that read access will serialize when it could have been concurrent and writers will not have been given preferential access. The latter can at least partially be corrected by having writers execute at a higher priority.
Yes. However, the fork(2) man page does not state this fact.
When a thread releases a mutex and there are waiters, the first waiter is moved from the waiting queue to the runnable queue. At that point it will contend for the mutex along with any other threads that want it. The mutex is not "passed" to the first waiter.
Shame on whoever asked the question! They're trying to rely on order of execution, instead of synchronization of data!!
The USYNC_PROCESS SV objects are maintained in shared memory (either mmap or SysV shared memory). If a call to lock an SV is successful, then the kernel is not involved.
When there is contention, then the LWP that carries the thread which is going to block on the SV will go to sleep in the kernel. When the owner releases the lock, it is woken up and competes for the SV as usual.
The kernel only arbitrates access to the object (lock/unlock), it does not record owner, waiters or state. If a process owning a shared mutex exits without releasing the mutex, the kernel will not perform any "clean-up" on behalf of the process.
There are a small number of functions that are async-safe. POSIX.1b defines a short list, POSIX.1c adds to the list as does the Solaris thread library. They are only:
sema_post(), thr_sigsetmask(), and thr_exit().
How to handle synchronous signals in a library is still an open issue.
When a thread is in a system call the thread library will not interrupt it (except for signal delivery), and the system call will run to completion.
In the scenario described above the lower priority thread would remain in the system call. The "different" thread would wake up the "higher priority" thread, moving it from the sleeping queue to the runnable queue. The "higher priority" thread will remain on the runnable queue. The low priority thread will then be preempted as soon as it returns from the system call.
(This is done by a clever kernel hack where the preemption signal, SIGLWP, is designed to remain pending when sent to an LWP in the middle of a system call.)
In a time-sliced scheduler, the amount of time a process (or LWP, or thread) runs is recorded, and when it has run for some predetermined amount of time, it is interrupted, and sent to the back of the run queue. (This could be a prioritized run queue, or a simple linear one -- it's not important.)
In a preemptive scheduler, when a high priority thread becomes runnable, it will kick a lower priority thread off of its LWP. There is no time-slicing involved. A high-priority thread can run forever if it wants.
The Solaris threads library is a strict-priority preemptive scheduler. The Solaris kernel implements a Time-shared class which is a preemptive, time-sliced scheduler which adjusts priorities based upon run time.
No. The threads library takes care of the masks for you, and prevents you from doing this when using thr_sigsetmask().
No. It is not even theoretically possible, because you cannot know what has been altered under the protection of the lock which caused the deadlock. You must write your programs to avoid deadlock.
For approved standards, contact:
IEEE Customer Service 445 Hoes Lane P.O. Box 1331 Piscataway, NJ 08855-1331 USA (800)678-IEEE
For draft standards (which POSIX.1c still is), contact:
IEEE Draft Standards Office 445 Hoes Lane P.O. Box 1331 Piscataway, NJ 08855-1331 USA (908)562-3834
The MT draft is DS-5314, and will cost you $24 if you're an IEEE member, $30 if not.
We don't KNOW, but a good guess is March of 1995.
Best answer: RECOMPILE!
Practical answer: Yes. Within limits. There are three issues at the heart of this question:
As a special way to handle exactly this problem, the threads library defines the thread-specific location of errno for the MAIN thread to be the old global location. Thus referencing errno from that one thread will work correctly in both recompiled libraries AND legacy libraries.
In short if the legacy library:
Our stock answer to this question is #4. The other answers are refinements, and it's easy to get those refinements wrong, so we don't recommend them.
This is actually somewhat problematic. The POSIX API does not deal with this issue at all, leaving it up to the individual library writers to implement solutions of their own (if they wish to). Most of the implementations don't deal with this issue at all (since they don't have a two-level model where it makes sense). So for them, it's moot. They just won't work as well.
Solaris provides the libthread routine thr_setconcurrency(), and this is how we recommend you deal with the issue -- just use that alongside your pthreads code:
#include <pthread.h>
#ifdef _SOLARIS_2
#include <thread.h>
#endif
...
#ifdef _SOLARIS_2
thr_setconcurrency(10);
#endif
pthread_create(...);
where the symbol _SOLARIS_2 (or something similar) is defined by you on the compile line when compiling for Solaris:
% cc -mt -D_SOLARIS_2 my_program.c
(Also see the next question.)
If an application links with both the Solaris and POSIX thread libraries, then POSIX overrides Solaris wherever there is conflict in semantics and functionality. To be specific about fork(), if you link with POSIX (no matter whether you linked with Solaris or not), fork() behaves like fork1().
The two libraries will continue to interoperate in this way, in future releases of Solaris.
This is one of those things that happens when you're ahead of the standards committees. They won't always agree with your solution to things, and you'll be forced to change if you wish to adhere to the standard when it finally emerges. Here is the official announcement about the change:
In Solaris 2.5, we are announcing the eventual End of Life for two Solaris features. The features are per-LWP POSIX timers and per-thread alarms. Both features are being supplemented in 2.5 with per-process variants.
With 2.5, an application compiled defining the macro _POSIX_PER_PROCESS_TIMERS (or with the symbol _POSIX_C_SOURCE having a value greater than 199500L) will create per-process timers. The timer IDs of per-process timers are usable from any LWP and the expiration signals are generated for the process rather than directed to a specific LWP. Further, per-process timers are deleted only by timer_delete(3r) or at process termination.
Applications compiled prior to 2.5, or without the feature test macros described above, will continue to create per-LWP POSIX timers. In some future release, calls to create per-LWP timers will return per-process timers. The end of life for POSIX per-LWP timers is announced on the timer_create(3r) man page.
Such timers are automatically deleted upon termination of the creating LWP. Hence for multi-threaded programs, only bound threads can use this facility. Even with this restricted use, under 2.3/2.4, the use of alarm(2) and setitimer(2) timers in multi-threaded applications is unreliable with respect to masking the resulting signals from the bound thread which issues these calls. If such masking is not required, then these two system calls work reliably from bound threads.
With 2.5, an application linking with -lpthread (POSIX threads), will get per-process delivery of SIGALRM when calling alarm(2), but the effect due to setitimer(2) will continue to be per-LWP. The SIGALRM generated by alarm(2) is generated for the process rather than directed to a specific LWP. Further, the alarm is reset at process termination.
Applications compiled prior to 2.5 or not linked with -lpthread will continue to see a per-LWP delivery of signals generated by alarm(2) and/or setitimer(2).
In some future release, calls to alarm(2) and/or to setitimer(2) with the ITIMER_REAL flag will cause the resulting SIGALRM to be sent to the process. The end of life for per-LWP SIGALRM generation is announced on the alarm(2) and setitimer(2) man pages.
Flags other than the ITIMER_REAL flag to setitimer(2) will continue to result in the generated signal being delivered to the LWP that issued the call, and hence will be usable only from bound threads.
If you really need to send an alarm to an individual thread (this is rare!), then you will have to manage this yourself.
In addition to all of the usual concerns such as locking shared data, a library should be well-behaved with respect to forking a child process in which only one thread is running (the one which called fork()). The problem is that the sole thread in the child process may try to grab a lock which is held by a thread that wasn't duplicated in the child.
This is not a problem most programs are likely to run into. Most programs call exec() in the child right after the return from fork(). However, if the program wishes to carry out some actions in the child before the call to exec(), or never calls exec(), then the child COULD encounter deadlock scenarios. As a library writer you need to provide a safe solution, though not providing a fork()-safe library is not that big a deal, since it is a rare condition.
For example, assume T1 was in the middle of printing something (hence holding a lock for printf()), when T2 forked a new process. In the child process, should the sole thread (T2) call printf(), it will promptly deadlock.
The call fork() (POSIX threads) or fork1() (Solaris threads) will duplicate only the one thread that calls it. Calling fork() (Solaris threads) will duplicate all threads, so that issue does not come up in this case.
What you must do is ensure that there are no such locks being held at the time of forking. The most obvious way of doing this is to have the forking thread acquire all of the locks that could possibly be used by the child. You cannot do this for locks such as the one in printf() that is owned by libc. (The programmer must ensure that printf() is not being used at fork() time.) However, here is what you must do for locks within your own library.
You must identify all the locks used by the library. Let's say this list is {L1,...Ln}. Also identify the locking order for these locks. Let's say that this order is also L1...Ln. (If you don't use a strict locking order, then you must manage the lock acquisition carefully.)
Next you must arrange to acquire those locks at fork time. In Solaris threads you'll have to require the programmer to do this by hand -- obtaining each of the locks just before calling fork1(), and releasing them right afterwards:
mutex_lock(L1);
mutex_lock(L2);
fork1(...);
mutex_unlock(L1);
mutex_unlock(L2);
In Pthreads, you can just add a call to pthread_atfork(f1, f2, f3) in your library's .init section. The arguments are, in order, the prepare handler, the parent handler and the child handler, so f1, f2, f3 are defined as follows:
f1() /* This is executed just before the process forks. */
{
pthread_mutex_lock(L1);   /* acquire in lock order: L1 first, Ln last */
pthread_mutex_lock(...);
pthread_mutex_lock(Ln);
}
f2() /* This is executed in the parent after the process forks. */
{
pthread_mutex_unlock(L1);
pthread_mutex_unlock(...);
pthread_mutex_unlock(Ln);
}
f3() /* This is executed in the child after the process forks. */
{
pthread_mutex_unlock(L1);
pthread_mutex_unlock(...);
pthread_mutex_unlock(Ln);
}
It's a bit of extra work, and it does require some careful thought to get it right, but it's not all that rough. And it will give you happier users.
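For a library with just two locks, the fork handlers could be sketched like this. The sketch assumes POSIX pthread_atfork() and default (non-recursive, non-error-checking) mutexes; the lock names and the functions prepare(), release(), lib_init() and fork_and_use_locks() are hypothetical. The same release handler serves for both the parent and the child, since both simply unlock.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t L1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t L2 = PTHREAD_MUTEX_INITIALIZER;

/* Runs just before fork(): take every library lock, in lock order. */
static void prepare(void)
{
    pthread_mutex_lock(&L1);
    pthread_mutex_lock(&L2);
}

/* Runs after fork() in both parent and child: drop the locks again. */
static void release(void)
{
    pthread_mutex_unlock(&L2);
    pthread_mutex_unlock(&L1);
}

/* Call once when the library is loaded. Returns 0 on success. */
int lib_init(void)
{
    return pthread_atfork(prepare, release, release);
}

/* Fork, let the child use both locks, and report whether it managed to.
 * Returns 0 if the child ran to completion, -1 otherwise. */
int fork_and_use_locks(void)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* The handlers left both locks free in the child. */
        pthread_mutex_lock(&L1);
        pthread_mutex_lock(&L2);
        pthread_mutex_unlock(&L2);
        pthread_mutex_unlock(&L1);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Because prepare() blocks until no other thread holds L1 or L2, the child can never inherit one of them in a locked state owned by a thread that does not exist there.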
This is a bit of a sticky question to which we cannot give a very satisfactory reply. As you know, Pthreads is still just a draft. (Come June '95 it will probably be official, but...) So about the only official thing we can say is that we will implement a Pthreads library as part of the Solaris distribution upon ratification. We can't tell you with 100% certainty what will be in it, since there is a possibility (very low) of changes to draft 10 before it becomes a standard.
We expect (with 99% certainty) that the current Solaris 2.5 EA (with fixes) will become our first Pthreads release. We do not currently have any comments to make about the implementation of the optional parts of the POSIX draft. If you have strong feelings or anticipated requirements in this area, do let us know. We can be influenced.
Solaris Pthreads draft 10 EA does not implement any of the optional features (such as SCHED_FIFO, SCHED_RR, POSIX Semaphores, Priority Mutexes, PTHREAD_INHERITSCHED). These can generally be worked around by using (a) different scheduling parameters, (b) implementing more complex constructs yourself, (c) using calls from Solaris threads (ie, linking both libraries), (d) simply ignoring those distinctions which are very minor.
Copyright 1995 Sun Microsystems, Inc., 2550 Garcia Ave., Mt. View, CA 94043-1100 USA. All rights reserved.