In the previous article, we discussed the history of UNIX, various flavors of UNIX, helpful basic commands like 'who', 'uname', and 'ps', environment variables, and the shell.
Continuing from the previous article, this time let us focus on some more Unix internals: processes, jobs, and the file system.
Getting Started Guide to Unix Based Testing: Part 2
Processes and Jobs:
Processes and jobs are vital elements of any operating system. Whenever an application is started, it has a process associated with it. Even if no application is running on a machine, there are still a few 'system processes' running.
A process is the execution of a program. Some operating systems call the basic unit of execution a “job,” some call it a “task.” In Unix it’s called a process. In the Unix kernel, anything that’s done, other than autonomous operations, is done by a process issuing system calls. To list the current processes running on the machine use the command ps. This command returns the PID (the Process ID), TTY (Console name), Time and the Process Name. Listing A shows the output from the ps command.
$ ps
  PID TTY      TIME CMD
 9279 pts/5    0:00 tcsh
 9298 pts/5    0:00 bash
 9299 pts/5    0:00 ps
As seen above, at any given instance, there is more than one process running on a machine. This denotes the multi-processing feature of UNIX.
ps too has many options like:
-e: List information about all processes currently running.
-f: Generates a full listing. See Listing B for details.
$ ps -f
     UID   PID  PPID  C    STIME TTY      TIME CMD
  patqa1  9279  9276  0 06:44:19 pts/5    0:00 -tcsh
  patqa1  9298  9279  0 06:44:26 pts/5    0:00 bash
  patqa1  9371  9298  0 06:54:18 pts/5    0:00 ps -f

-j: Prints the session ID and process group ID.
-l: Generates a long listing.
-u uidlist: Lists only processes whose effective user ID number or login name is given in uidlist.
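As a quick sketch of how these switches are combined in practice (the grep target 'bash' below is purely illustrative), -e and -f are often used together to search the full process table:

```shell
# Full listing of every process, narrowed down to one command
# name; the [b] trick stops grep from matching itself.
ps -ef | grep '[b]ash'

# Restrict the listing to the current shell only; -p selects
# by PID and $$ expands to the PID of the running shell.
ps -f -p $$
```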
Just like the shell (discussed in the previous article), one process can give birth to a child process. This is called 'spawning a process'. When a process creates or spawns another process, the original process is known as the parent while the process it creates is called a child process. The child process inherits the file access and execution privileges belonging to the parent. Also, the child process has its 'Parent Process ID (PPID)' set to the PID of its parent process. See Figure 1 below.
From the above figure:
- The shell (in this case the bash) would be the parent process for all other processes after the user logs in.
- The PID of bash is 9298 and its PPID is 9279; now we run the command ps -f from this shell.
The ps -f then becomes the child process of bash, and it also displays the process information. (Here we say that bash spawns ps -f.)
- The PPID of ps -f becomes 9298 (the PID of bash) and its PID is allotted the next available number, which in this case is 9371.
The /etc/init process is the top-most process and the first process invoked when a UNIX machine starts. It has a PID of 1 and a PPID of 0. (On some machines, though, the first process may be 'sched', the scheduler, which has both a PID and a PPID of 0.)
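The parent/child relationship described above is easy to observe for oneself. A minimal sketch (the -o format option assumes a System V style ps):

```shell
# Print the PID and PPID of the current shell.
ps -o pid,ppid,comm -p $$

# Spawn a child shell and let it report its own PID and PPID;
# the child's PPID matches the PID printed above.
sh -c 'ps -o pid,ppid,comm -p $$'
```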
When two or more processes are running at any given instant of time, they can run simultaneously and may communicate with each other. Processes that run in this way are called coroutines.
Now let’s take a look at Jobs, how to view the status of jobs and how to bring a job in foreground and send it to background.
UNIX is a multi-tasking system that allows you to run multiple jobs, programs, or processes at the same time. There are some simple commands to control your foreground and background jobs as well as determine the status of your jobs.
As mentioned in the previous article, in UNIX, jobs can be sent to the background for processing while other work continues in the foreground on the same shell. This is a very useful feature of the UNIX architecture, in the sense that one does not need to wait for a job to finish before starting another.
To see which jobs are currently running on a machine, use the command 'jobs' with appropriate switches. This command lists all the jobs running in the background as well as in the foreground. Listing C shows the use of the jobs command.
$ jobs
$ xcalc &
$ jobs
[1] + Running    xcalc &
From Listing C above, the following details can be obtained:
- First, we ran the jobs command when there were no jobs running; hence, it returned a blank line.
- Next, we instantiated a job by running the application 'xcalc' and sent it to the background using '&'. Notice that once a job is sent to the background, irrespective of whether it is running or stopped, we still get the $ prompt to enter the next command.
- We then used the jobs command to list the running jobs. The line '+ Running xcalc &' denotes that one job, named xcalc, is currently running in the background.
Some of the important switches for the jobs command are:
-p: Report only the process IDs of the jobs.
-n: Display only jobs that have stopped or exited since last notified.
-r: Display only jobs that are running at the moment.
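The following sketch shows the jobs command in action from a shell session; xcalc is replaced by a plain sleep here so that no X display is needed:

```shell
# Start a long-running command in the background.
sleep 60 &

# List the current jobs; a line such as
# "[1]+ Running    sleep 60 &" is printed.
jobs

# -p prints only the process IDs of the jobs.
jobs -p

# Clean up the background job by its job number.
kill %1
```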
Apart from displaying the jobs, to bring a particular job into the foreground, use the command 'fg %n', where n is the job number shown by jobs.
So, to bring the ‘˜xcalc’ job into foreground, use:
$ jobs
[1] + Running    xcalc &
$ fg %1
Notice that once a job is moved into the foreground, no other command can be entered.
To stop a particular job, first bring it to the foreground and then interrupt it with Ctrl-C.
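Alternatively, a background job can be terminated without bringing it to the foreground, since kill accepts either a job specification or a plain PID. A small sketch:

```shell
# Start a disposable background job.
sleep 60 &

# Kill it by job number ...
kill %1

# ... or, equivalently, by process ID; $! holds the PID of the
# most recently started background job.
# kill $!
```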
To start a job, a process is needed; usually this is the shell one is working in. If a job consists of more than one command (a pipeline, for example), a process is created for each command to accomplish the given task. In short, one job has at least one process associated with it. From the above example itself, when the job 'xcalc' is running in the background, observe that there is a process associated with it:
$ jobs
[1] + Running    xcalc &
$ ps
  PID TTY      TIME CMD
14230 pts/3    0:00 bash
14231 pts/3    0:00 xcalc
14232 pts/3    0:00 ps
The jobs command only shows us jobs that we started directly from a particular shell. Once we exit that shell, we can no longer use their job numbers, nor can we manipulate them with the jobs, bg, or fg commands. Instead, we have to deal with them as processes.
Now comes the most important part of any operating system: the file system. The file system forms the core of any operating system. One must understand the UNIX file system, in relation to processes and jobs, before moving on to actual testing.
The UNIX file system has a reputation for being robust but complicated. I will try to explain this beauty and make it sound a bit simpler.
In Unix, files are organized into a tree structure with a root named by the character '/'. The standard system directories are shown below; each one contains specific types of files. The details may vary between different UNIX systems, but these directories should be common to all. Figure 2 below gives the basic directory structure of Unix.
The following gives a short description of what the above directories usually contain:
/(root): the directory located at the top of the Unix file system.
/bin: This directory contains the commands and utilities that you use day to day. These are executable binary files.
/dev: This directory contains special files used to represent real physical devices such as printers and terminals.
/etc: This directory contains various commands and files that are used for system administration.
/home: This directory contains a home directory for each user of the system.
/lib: This directory contains libraries that are used by various programs and languages.
/tmp: This directory acts as a “reserve” area in which any user can store files on a temporary basis.
/usr: This directory contains system files and directories that you share with other users.
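These standard directories can be inspected on any UNIX machine; the exact contents will of course differ between systems. A quick sketch:

```shell
# Confirm that the standard top-level directories exist.
for dir in /bin /dev /etc /tmp /usr; do
    [ -d "$dir" ] && echo "$dir is present"
done

# Peek at a few of the day-to-day binaries under /bin.
ls /bin | head -5
```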
In UNIX, everything is considered to be a file. A text file is a file, an application or an executable is a file, a directory is a file, and a device is a file. But these do not all fall into the same category; they are divided into various categories, and each category is treated in a different manner.
– Ordinary files:
Ordinary files can contain text, data, or program information. An ordinary file cannot contain another file, or directory.
– Directories:
Directories can be described as containers that can hold files and other directories. A directory is actually implemented as a file that has one line for each item contained within it. Each line in a directory file contains only the name of the item and a numerical reference to the location of the item. The reference is called an i-number and is an index into a table of i-nodes (covered later). The i-node list is a complete list of all the storage space available to the file system.
– Special/Device files:
Special files represent input/output (i/o) devices, like a tty (terminal), a disk drive, or a printer. Since UNIX treats such devices as files, a degree of compatibility can be achieved between device i/o, and ordinary file i/o, allowing for the more efficient use of software. This way, the same read() and write() functions used to read and write real files can also be used to read from and write to these devices. Special files can be either character special files that deal with streams of characters or block special files, which operate on larger blocks of data. Typical block sizes are 512 bytes, 1024 bytes, and 2048 bytes.
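The distinction is visible in the output of ls -l, where the first character of the mode field is 'c' for character special files and 'b' for block special files. A sketch using /dev/null, which exists on every UNIX system:

```shell
# /dev/null is a character special file; note the leading 'c'
# in the mode field (for a disk device it would be 'b').
ls -l /dev/null

# The test utility can query the file type directly.
[ -c /dev/null ] && echo "/dev/null is a character special file"
```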
– Links:
A link is a pointer to another file. Remember that a directory is nothing more than a list of the names and i-numbers of files. A directory entry can be a hard link, in which the i-number points directly to another file. A hard link to a file is indistinguishable from the file itself: when a hard link is made, the i-numbers of two different directory entries point to the same i-node. For that reason, hard links cannot span file systems. A soft link (or symbolic link) provides an indirect pointer to a file; it is implemented as a directory entry containing a pathname. Soft links are distinguishable from files and can span file systems. Not all (older) versions of UNIX support soft links.
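The difference between hard and soft links can be demonstrated with ln and ls -i, which prints i-numbers. A sketch using throwaway files in /tmp (the file names are illustrative):

```shell
cd /tmp
echo "hello" > original.txt

# A hard link shares the i-number of the original file.
ln original.txt hardlink.txt

# A soft (symbolic) link is a separate file containing a pathname.
ln -s original.txt softlink.txt

# The first two entries show identical i-numbers; the soft link
# has an i-number of its own.
ls -i original.txt hardlink.txt softlink.txt

rm original.txt hardlink.txt softlink.txt
```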
File System ‘“ Internal Structure:
The Boot Block:
The boot block is usually a part of the disk label, a special set of blocks containing information on the disk layout. The boot block holds the loader to boot the operating system.
The Superblock:
Each UNIX partition usually contains a special block called the superblock. The superblock contains the basic information about the entire file system, such as:
- Size of the file system
- Number of free blocks on the system
- A list of free blocks
- Index to next free block on the list
- Size of the inode list
- Number of free inodes
- A list of free inodes
- Index to next free inode on the list
- Lock fields for free block and free inode lists
- Flag to indicate modification of the superblock
The I-node:
A Unix file is described by an information block called an i-node. There is an i-node on disc for every file on the disc, and there is also a copy in kernel memory for every open file. All the information about a file, other than its name, is stored in the i-node. This information includes:
- File access and type information, collectively known as the mode.
- File ownership information.
- Time stamps for last modification, last access and last mode modification.
- Link count.
- File size in bytes.
- Pointers to the data blocks for the file.
- Access permissions.
- Addresses of physical blocks.
There are 13 physical block addresses in an i-node, each of these addresses is 3 bytes long. The first ten block addresses refer directly to data blocks, the next refers to a first level index block (which holds the addresses of further data blocks), the next refers to a second level index block (which holds the addresses of further index blocks) and the last refers to a third level index block (which holds the addresses of further second level index blocks).
All physical addresses associated with a file are implicitly assumed to reside on the same disc (I believe DFS, the Distributed File System, has solved this problem). There is no requirement that the physical addresses of a file be contiguous (i.e. adjacent), and with multiple files being handled on a disc it is unlikely that contiguity would offer any performance advantage.
Assume 512-byte blocks and 3 bytes per address, which is equivalent to a disc capacity of about 8 GBytes. An index block of 512 bytes is then capable of holding 170 3-byte addresses. The size of the largest file can be calculated as follows.
Directly addressed blocks: 10 × 512 bytes = 5120 bytes
Blocks addressed via the first level index block: 170 × 512 bytes = 87040 bytes
There will be 170 index blocks addressed via the second level index block. This will address 170 × 170 × 512 bytes = 14796800 bytes.
Via the third level index block there will be 170 × 170 × 170 × 512 bytes of addressable data. This comes to 2515456000 bytes.
The total addressable space comes to 2530344960 bytes (approximately 2.5 Gbytes), which means the size of largest file can be 2.5 Gbytes.
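The arithmetic above can be verified mechanically with shell arithmetic:

```shell
# Maximum file size with 512-byte blocks and 170 addresses per
# index block, summing the four levels derived above.
direct=$((10 * 512))
single=$((170 * 512))
double=$((170 * 170 * 512))
triple=$((170 * 170 * 170 * 512))
echo $((direct + single + double + triple))   # prints 2530344960
```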
The Data Block:
The data blocks contain the actual data of the files and directories. The Unix file system allocates data blocks one at a time from a pool of free blocks. Typical block sizes range from 512 bytes to 4K, depending on the file system.
To sum up:
To give a simple example: an installation program, install.sh, may have a number of small shell scripts associated with it. Suppose there is a 'version.sh' script that verifies the version of a browser. Start Netscape using 'netscape &' and run install.sh. Check whether install.sh first starts a process for version.sh, and whether version.sh is killed as soon as the verification is over. If version.sh is followed by a program written in Perl, check that a process for perl is started and killed as soon as its processing is over. The ps command can be used in a number of ways; one just needs to know where to use it.
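A check along those lines can be scripted rather than done by eye. The sketch below uses the hypothetical script names from the example (install.sh, version.sh) and simply polls the process table:

```shell
# Returns success while at least one process whose command name
# begins with the given name exists.
is_running() {
    ps -e -o comm= | grep -q "^$1"
}

# Hypothetical flow: launch the installer, then confirm that
# version.sh appears and later disappears once verification ends.
# ./install.sh &
# is_running version.sh && echo "version.sh is running"
```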
Knowledge of the file system is vital before proceeding with testing on the UNIX platform; it is not limited to a particular test case, but is needed from the moment one logs in until all the test cases are over and one logs out. Although I have tried to present a fair idea of the file system, one should keep updating this information because there is much more to explore within the file system itself. Knowledge of processes and jobs is also helpful from a testing point of view; for example, it can be used to detect memory leaks within programs.
Since we discussed files in this article, one key difference between Windows and Unix that I would like to point out is case sensitivity. In UNIX, the files 'Config.ini' and 'config.ini' are two altogether different files, whereas on Windows they denote a single file.
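This is easy to verify with a couple of throwaway files (the directory name below is arbitrary):

```shell
# On UNIX these create two distinct files; on a Windows file
# system the second command would overwrite the first.
mkdir -p /tmp/case_demo
echo "upper" > /tmp/case_demo/Config.ini
echo "lower" > /tmp/case_demo/config.ini

# Both Config.ini and config.ini are listed.
ls /tmp/case_demo

rm -r /tmp/case_demo
```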
In the next article, we will focus on 'File Permissions' and 'Filters'.
Abhijit Potdar has a master's degree in computer management and is a Certified Software Tester (CSTE). He has extensive experience in software testing, specifically in the Unix environment, as well as in test automation and database testing. He can be reached at (abhijit.potdar AT gmail)
- Getting Started Guide to Unix Based Testing:Part 2 - July 23, 2005
- Getting Started Guide to Unix Based Testing - May 17, 2005