FALL 2000
Lecture Notes 12:


EVALUATING OPERATING SYSTEMS

The performance of any one component (the File, Device, Memory, and Processor Managers) depends on the performance of the other three.

Think about being given $250 to put into your home machine for improvements – what would you do?

Memory Management

There is a trade-off between the memory allocation scheme and CPU overhead. E.g. suppose you run a college mainframe where the average job takes only 100 ms; upgrading to a "better" memory allocation scheme that adds 60 ms of overhead per job is not good.
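A quick back-of-the-envelope calculation (a sketch using the hypothetical numbers above, not part of any real measurement) shows why this upgrade hurts:

```python
# Hypothetical numbers from the example: average job = 100 ms of useful
# work, and the new allocation scheme adds 60 ms of overhead per job.
job_ms = 100
overhead_ms = 60

# Fraction of processor time spent on overhead rather than user work.
overhead_fraction = overhead_ms / (job_ms + overhead_ms)
print(f"{overhead_fraction:.1%} of CPU time is overhead")  # 37.5%
```

More than a third of the processor's time would go to the allocation scheme itself.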

Processor Management

Again, this is a highly interrelated situation. For example, suppose you decide to move to a multiprogramming approach: several processes now run interleaved in the system, so the demand on devices, files and memory increases dramatically.

Device management

Buffering for I/O devices can bridge the gap between high CPU speed and low device speed, but the buffer memory has to be allocated and monitored. More overhead.

File management

There are many examples of the file manager's performance affecting the performance of the machine as a whole.

e.g. a file system that permits file fragmentation risks having files unavailable while the CPU performs compaction.

The location of the file directory also affects performance. Win32's VFAT is loaded into memory – this speeds things up, but causes problems if the system goes down improperly.

File management is closely related to the device on which the files are stored.

Measuring OS performance

Performance is not easy to measure – it depends on three components: user programs, OS programs and hardware units. There is also the human element: does the OS perform differently depending on which human is using it?

Throughput is a composite measure that indicates the productivity of the system as a whole. It is usually measured under "steady state" conditions, e.g. the number of jobs processed per day or the number of on-line transactions handled per hour. One can also look at an individual component's throughput.

Capacity is the maximum throughput level. Bottlenecks develop when components reach their capacity and processes in the system do not get passed on – e.g. thrashing in a saturated disk drive. Or memory can become so overextended by multiprogramming that it can't keep the "working set" of pages in memory, and the CPU works slowly because it is spending all its time servicing page interrupts. Bottlenecks can be detected by monitoring the queues at each resource – a rapidly growing queue is a bad sign.
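The queue-monitoring idea can be sketched as follows; the resource names and queue-length samples here are made up for illustration:

```python
# Hypothetical queue-length samples for each resource, taken at regular
# intervals. A queue that grows at every interval signals saturation.
samples = {
    "disk1":    [2, 5, 9, 14, 22],   # growing steadily: likely bottleneck
    "printer1": [3, 1, 4, 2, 3],     # fluctuating around a steady level: fine
}

def is_bottleneck(queue_lengths):
    """Flag a resource whose queue grew between every pair of samples."""
    return all(b > a for a, b in zip(queue_lengths, queue_lengths[1:]))

for resource, qs in samples.items():
    if is_bottleneck(qs):
        print(f"{resource}: queue growing rapidly – possible bottleneck")
```

A real monitor would of course smooth out noise over longer windows, but the principle is the same: sustained queue growth means the resource has hit capacity.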

To interactive users, response time is paramount. This is the interval required to process a request from when the user presses the key to submit the request until the system responds to the request. This is the same as turnaround time for batch jobs. To provide useful information, the variance of these values, as well as the mean, should be computed, because two systems could have the same mean response time, but one of them be highly erratic in its response.
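A small sketch (with made-up response times) of why the variance matters – two systems with identical mean response times can feel very different to users:

```python
import statistics

# Hypothetical response times in seconds for two systems with equal means.
system_a = [1.0, 1.1, 0.9, 1.0, 1.0]   # consistent
system_b = [0.1, 2.9, 0.2, 1.7, 0.1]   # highly erratic

for name, times in (("A", system_a), ("B", system_b)):
    print(name,
          "mean:", statistics.mean(times),
          "variance:", round(statistics.pvariance(times), 3))
```

Both systems average 1.0 s, but system B's much larger variance tells you its users sometimes wait almost three seconds – which a mean alone would hide.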

Resource utilization is a measure of how much each unit is contributing to the operation. Usually given as percentage of time component is actually in use. Can determine whether system is IO bound or CPU bound.

Availability is a measure of the probability that a resource will be available when a user needs it. It relates to user-requested items such as printers, network services, etc. rather than CPU, memory, etc. In its simplest form it means that a unit will be operational and not out of service when the user needs it. Availability is influenced by the mean time between failures (MTBF), which measures the average time a unit is operational before it breaks down, and the mean time to repair (MTTR), the average time needed to fix it. The formula used to calculate a unit's availability is:

A = MTBF / (MTBF + MTTR)

So if a component has an MTBF of 4000 hours (manufacturer usually indicates this), and the repair time average (and this will be based on a multitude of factors) is 2 hours, then the availability will be 0.9995, meaning that the unit would be available 999.5 hours per 1000 hours.
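The formula and the example above can be checked directly:

```python
def availability(mtbf_hours, mttr_hours):
    """A = MTBF / (MTBF + MTTR), as given in the notes."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# The example above: MTBF of 4000 hours, average repair time of 2 hours.
a = availability(4000, 2)
print(round(a, 4))   # 0.9995 – available about 999.5 hours per 1000
```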

Reliability measures the probability that a unit will not fail during a given time period.

In general, to prevent the processor from spending more time doing overhead than executing jobs, the OS must monitor the system. The Job Scheduler uses this information either to allow more jobs to enter the system or to prevent new jobs from entering. A feedback loop is used for this.

A negative feedback loop monitors the system and, when it is too busy, signals the appropriate manager to slow the arrival rate.

e.g. a negative feedback loop monitoring the I/O devices would inform the Device Manager that Printer 1 has too many jobs in its queue, causing the Device Manager to direct new jobs to Printer 2.
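The printer example can be sketched as follows; the queue limit and names are illustrative, not from any real Device Manager:

```python
# A minimal sketch of a negative feedback loop for the printer example.
# The threshold and printer names are hypothetical.
QUEUE_LIMIT = 10

printer_queues = {"printer1": 14, "printer2": 3}

def route_job(queues):
    """Direct a new job to the shortest queue, avoiding any printer whose
    queue has grown past the limit (the negative feedback signal)."""
    candidates = {p: n for p, n in queues.items() if n < QUEUE_LIMIT}
    target = min(candidates or queues, key=queues.get)
    queues[target] += 1
    return target

print(route_job(printer_queues))   # printer2 – printer1 is over the limit
```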

A positive feedback loop works in the opposite way: when some resource is under-utilized, it causes the arrival rate to increase. However, positive feedback loops must be watched more carefully than negative ones, because they can cause problems. For example, if a positive feedback loop monitoring the CPU informs the Job Scheduler that the CPU is underused, the Scheduler allows more jobs to enter the system. But as more jobs enter, the amount of memory available per job decreases; if too many jobs enter the job stream, page interrupts increase, resulting in poor CPU usage. In fact, in poorly designed operating systems, positive feedback can put the system into an unstable state. A positive feedback loop should therefore include a further step: checking whether system performance has actually improved.
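That safeguard – admit more jobs only while utilization actually improves – can be sketched like this (all numbers and names are illustrative):

```python
def admit_jobs(utilization_before, utilization_after, jobs):
    """Positive feedback with a safety check: keep admitting jobs while each
    admission improved CPU utilization; back off as soon as performance
    stops improving (e.g. paging starts to dominate)."""
    if utilization_after > utilization_before:
        return jobs + 1          # still improving: admit another job
    return max(jobs - 1, 0)      # no improvement: stop feeding the system

print(admit_jobs(0.40, 0.55, jobs=5))   # improving: admit a 6th job
print(admit_jobs(0.55, 0.50, jobs=6))   # thrashing begins: back off to 5
```

Without the check, the loop would keep admitting jobs even as page interrupts drove utilization down – exactly the unstable state described above.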

Various types of software for monitoring and benchmarking are available.

System Security

Viruses, worms, Trojan horses, etc.
Encryption – can be at file level, machine level or network level. Increases overhead.

Passwords

Backups

WinNT security management

WinNT has an object-based security model. An object can be any resource in the system. The system administrator can give precise security access to specific objects.

WinNT includes the following:
  • secure log on facility requiring users to identify themselves.
  • discretionary access control allowing the owner of a resource to determine who else can access the resource.
  • auditing ability to detect and record important security-related events and any attempt to create, access or delete system resources.
  • memory protection preventing anyone from reading information written by someone else after memory has been deallocated.

Multilayered security:
Passwords provide the first layer.
NTFS gives a second layer of security for files. The creator of a file is its owner. Owners can designate a set of users (a group) to use the file, can prevent some of the members from using it, and can also determine which operations are permitted on a file.

When a user logs on to NT, the system returns an access token. Afterwards, whenever the user creates a process, the process contains a copy of that user's access token. Access tokens indicate individual rights and group rights.

Objects in NT have a security descriptor, which the system applies to the object when it is constructed. The owner of an NT object can always change its security information.

An access token identifies a process and its threads to the operating system, while a security descriptor lists which of these processes can access an object. When a thread opens a handle to an object, the object manager and the security system examine this information to determine whether the caller should be given the handle.
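The token-versus-descriptor check can be sketched as set operations. This is an illustrative model only – the names and structures below are not the real Win32 API:

```python
# Illustrative model of NT's access check: the access token carries the
# user's identity and group memberships; the object's security descriptor
# lists which operations each user or group is granted.
access_token = {"user": "alice", "groups": {"staff", "developers"}}

security_descriptor = {
    # principal -> set of permitted operations on this object
    "alice":      {"read", "write", "delete"},
    "developers": {"read"},
}

def access_check(token, descriptor, requested):
    """Grant the handle only if the user, or one of the user's groups,
    is granted every requested operation."""
    principals = {token["user"]} | token["groups"]
    granted = set()
    for p in principals:
        granted |= descriptor.get(p, set())
    return requested <= granted

print(access_check(access_token, security_descriptor, {"read", "write"}))  # True
```

Real NT descriptors also carry explicit deny entries and ordering rules, which this sketch omits.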