Thursday, 26 May 2011

ASSEMBLERS

Definitions
Forward Reference: - A forward reference of a program entity is a reference to the entity which precedes its definition in the program.

Language processor pass: - A language processor pass is the processing of every statement in a source program to perform a language processing function.

Intermediate representation:- An intermediate representation is a representation of a source program which reflects the effect of some, but not all, analysis and synthesis tasks performed during language processing.
The first pass performs analysis of the source program and records its results in the IR. The second pass reads and analyses the IR instead of the source program. This avoids repeated processing of the source program. Desirable properties of an IR are:-
1. Ease of use:- It should be easy to construct and analyse.
2. Memory efficiency:- The IR should be compact.
3. Processing efficiency:- Efficient algorithms should exist for constructing and analysing it.

A simple assembly language:-

In this language, each statement has two operands. The first operand is always a register, which can be any one of AREG, BREG, CREG and DREG. The second operand refers to a memory word using a symbolic name and an optional displacement.

The figure lists the mnemonic opcodes for machine instructions.

• The MOVE instructions move a value between a memory word and a register. In the MOVER instruction, the second operand is the source operand and the first operand is the target operand. The converse is true for the MOVEM instruction.

• All arithmetic is performed in a register, i.e. the result replaces the contents of a register, and sets a condition code.

• A comparison instruction sets a condition code analogous to a subtract instruction, without affecting the values of its operands.

• The condition code can be tested by a Branch on Condition (BC) instruction. The assembly language instruction corresponding to BC has the format:-

            BC [condition code spec],[memory address]
It transfers control to the memory word with the address [memory address] if the current value of the condition code matches [condition code spec]. For simplicity, we assume [condition code spec] to be a character string.
• In a machine language program, we show all addresses and constants in decimal form.

The figure shows the machine instruction format corresponding to an assembly language instruction. The opcode, register operand and memory operand occupy 2, 1 and 3 digits respectively. The sign is not a part of the instruction.
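The 2-1-3 digit layout described above can be illustrated with a short Python sketch. The function name and the sample opcode, register code and address are assumptions made for illustration only; the actual opcode values come from the figure, which is not reproduced here.

```python
def encode_instruction(opcode: int, register: int, address: int) -> str:
    """Pack the three fields into the 2-1-3 decimal digit layout."""
    if not 0 <= opcode <= 99:
        raise ValueError("opcode must fit in 2 decimal digits")
    if not 0 <= register <= 9:
        raise ValueError("register code must fit in 1 decimal digit")
    if not 0 <= address <= 999:
        raise ValueError("address must fit in 3 decimal digits")
    # Zero-pad each field to its fixed width; spaces only aid readability.
    return f"{opcode:02d} {register:d} {address:03d}"


# Hypothetical values: opcode 4, register code 2, operand at address 115.
print(encode_instruction(4, 2, 115))
```

Keeping every field at a fixed width means an instruction can be decoded again simply by position.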

Assembly language Statements:- An assembly language contains three kinds of statements:-
1. Imperative statements
2. Declaration statements
3. Assembler directives

1. Imperative statements:- An imperative statement indicates an action to be performed during the execution of the program. Each imperative statement translates into one machine instruction.

2. Declaration statements:- The syntax of declaration statements is as follows:
            [Label] DS [constant]
            [Label] DC ‘[value]’
DS is short for declare storage. The DS statement reserves areas of memory and associates names with them.
For example:-
            A DS 1
            G DS 200
The first statement reserves a memory area of 1 word and associates the name A with it.
The second statement reserves a block of 200 memory words. The name G is associated with the first word of the block. Other words can be accessed through offsets from G,
e.g. G+5 for the 6th word of the memory block.
DC is short for declare constant. The DC statement constructs memory words containing constants.
The statement:-
           ONE DC ‘1’
associates the name ONE with a memory word containing the value ‘1’. Constants can be declared by the programmer in different forms: decimal, binary, hexadecimal.

Use of Constants: - The DC statement does not really implement constants; it just initializes memory words to the given values. These values may be changed by moving a new value into the memory word. An assembly program can use constants in two ways: as immediate operands and as literals.
The immediate operands can be used in an assembly statement only if architecture of target machine includes the necessary features.

Consider the assembly statement
             ADD AREG,5

This statement is translated into an instruction with two operands: AREG and the value ‘5’ as an immediate operand.

• A literal is an operand with the syntax =‘[value]’.
It differs from a constant because its location cannot be specified in the assembly program. This helps to ensure that its value is not changed during the execution of the program. It differs from an immediate operand because no special architectural feature is needed for its use. An assembler handles a literal by mapping its use into other features of the assembly language, for example by allocating a memory word that holds the value and using the address of that word as the operand.

3. Assembler Directives:- Assembler directives instruct the assembler to perform certain actions during the assembly of a program. Some assembler directives are:-
             1 START [constant]
This directive indicates that the first word of the target program should be placed in the memory word with address [constant].
             2 END [operand spec]
This directive indicates the end of the source program. The [operand spec] indicates the address of the instruction where execution of the program should begin.

Advantages of Assembly Language: - The main advantages of assembly language programming arise from the use of symbolic operand specifications. If a machine language program is changed, for example by inserting a statement, the addresses of constants and memory areas change. As a result, the addresses used in most instructions of the program have to be changed. No such changes need to be made in an assembly language program, because its operand specifications are symbolic.
Assembly language also holds an advantage over high level language programming in situations where it is necessary to use specific architectural features of a computer,
e.g. special instructions supported by the CPU.

DESIGN SPECIFICATION OF AN ASSEMBLER

To develop a design specification for an assembler, a four-step approach is used:
1. Identify the information necessary to perform a task.
2. Design a suitable data structure to record the information.
3. Determine the processing necessary to obtain and maintain the information.
4. Determine the processing necessary to perform the task.
Two phases are mainly involved:-
1. Synthesis phase
2. Analysis phase

The fundamental information requirements arise in the synthesis phase. Once these are known, it is decided whether each item of information should be collected during analysis or derived during synthesis.

Synthesis phase:- Consider the assembly statement
             MOVER BREG, ONE
To synthesize the machine instruction corresponding to this statement, we must have the following information:-
1. Address of the memory word with which the name ONE is associated.
2. Machine operation code corresponding to the mnemonic MOVER.
The first item of information depends on the source program, so it must be made available by the analysis phase.
The second item does not depend on the source program; it depends only on the assembly language. Hence the synthesis phase can determine this information itself.
During the synthesis phase, two data structures are used:-
1.) Symbol Table
2.) Mnemonics Table

Each entry of the symbol table has two fields: name and address. This table is built by the analysis phase. An entry in the mnemonics table has two fields: mnemonic and opcode.
Using the symbol table, the synthesis phase obtains the machine address with which a name is associated.
Using the mnemonics table, it obtains the machine opcode corresponding to a mnemonic.
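The two lookups can be sketched in Python with the tables held as dictionaries. The opcode values and the address of ONE are assumed for illustration; the register codes 1 to 4 for AREG to DREG follow the coding used in the intermediate code later in these notes.

```python
# Mnemonics table: fixed for the language. The opcode values are assumed.
MNEMONICS = {"MOVER": 4, "MOVEM": 5, "ADD": 1}
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}

# Symbol table: built by the analysis phase. The address is assumed.
symbol_table = {"ONE": 115}


def synthesize(mnemonic, register, symbol):
    """Return (opcode, register code, operand address) for one statement."""
    opcode = MNEMONICS[mnemonic]       # depends only on the assembly language
    address = symbol_table[symbol]     # depends on the source program
    return (opcode, REGISTERS[register], address)


print(synthesize("MOVER", "BREG", "ONE"))
```

A `KeyError` from either lookup would correspond to an invalid mnemonic or an undefined symbol.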

Analysis phase:- The primary function performed by the analysis phase is the building of the symbol table. For this purpose, it must determine the addresses with which the symbolic names used in a program are associated. Some addresses can be determined directly, while others must be inferred.
To determine the address of an element, we must fix the addresses of all elements preceding it. This is called memory allocation. To implement memory allocation, a data structure called the Location Counter (LC) is introduced. The location counter contains the address of the next memory word in the target program. LC is initialized to the constant specified in the START statement.
To update the contents of LC, the analysis phase needs to know the lengths of different instructions. This information depends on the assembly language. To include it, the mnemonics table is extended with a new field called length.
The processing involved in maintaining the LC is called LC processing.
The mnemonics table is a fixed table which is merely accessed by the analysis and synthesis phases. The symbol table is constructed during analysis and used during synthesis.

Tasks performed by Analysis Phase:-

(1)    Isolate the label, mnemonic opcode and operand fields of a statement.
(2)    If a label is present, enter the pair (symbol, [LC contents]) in a new entry of the symbol table.
(3)    Check the validity of the mnemonic opcode by a look-up in the mnemonics table.
(4)    Perform LC processing, i.e. update the value contained in LC.

Tasks performed by Synthesis Phase:-

(1)    Obtain the machine opcode corresponding to the mnemonic.
(2)    Obtain address of memory operand from symbol table.
(3)    Synthesize the machine form of a constant, if any.

Pass Structure of Assemblers:- A pass of a language processor is one complete scan of the source program. There are mainly two assembly schemes:-
(1)    Two pass assembly scheme.
(2)    One pass assembly scheme.

(1) Two Pass Translation / Two Pass Assembler:- Two pass translation of an assembly language program can handle forward references easily. LC processing is performed in the first pass, and the symbols defined in the program are entered into the symbol table in this pass. The second pass synthesizes the target form using the address information found in the symbol table. In effect, the first pass performs analysis of the source program while the second pass performs synthesis of the target program. The first pass constructs an intermediate representation (IR) of the source program for use by the second pass. This representation consists of two main components: data structures, e.g. the symbol table, and a processed form of the source program. The processed form of the source program is also called Intermediate Code (IC).







(2) Single Pass Translation / Single Pass Assembler:- In a single pass assembler, LC processing and construction of the symbol table proceed as in a two pass assembler. The problem of forward references is tackled using a process called backpatching.
Initially, the operand field of an instruction containing a forward reference is left blank. When the definition of the forward referenced symbol is encountered, its address is put into the field that was left blank.

Consider the following statement:-
            MOVER   BREG,   ONE
The instruction corresponding to this statement can be only partially synthesized, as ONE is a forward reference. So the instruction opcode and the code of BREG are assembled to reside in location 101, and the need to insert the second operand's address at a later stage is recorded by adding an entry to the Table of Incomplete Instructions (TII). This entry is a pair ([instruction address], [symbol]), e.g. (101, ONE) in this case.
By the time the END statement is processed, the symbol table contains the addresses of all symbols defined in the source program and the TII contains information describing all forward references. The assembler can now process each entry in the TII to complete the concerned instruction.
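Backpatching with a TII can be sketched in Python as follows. This is a simplified illustration and not a complete assembler: every statement is assumed to occupy one word, statements are given as (label, mnemonic, register, operand) tuples, and the function and variable names are hypothetical.

```python
def assemble_single_pass(statements, start=100):
    symtab = {}   # symbol -> address
    tii = []      # Table of Incomplete Instructions: (instruction address, symbol)
    code = {}     # address -> [mnemonic, register, operand address or None]
    lc = start
    for label, mnemonic, register, operand in statements:
        if label:
            symtab[label] = lc
        if mnemonic == "DC":
            code[lc] = ["DC", None, int(operand)]
        elif operand in symtab:
            code[lc] = [mnemonic, register, symtab[operand]]
        else:
            # Forward reference: leave the operand blank and remember it.
            code[lc] = [mnemonic, register, None]
            tii.append((lc, operand))
        lc += 1   # assumption: every statement occupies one word
    # END reached: complete each instruction recorded in the TII.
    for address, symbol in tii:
        code[address][2] = symtab[symbol]   # KeyError means an undefined symbol
    return code, tii


program = [
    (None, "MOVER", "BREG", "ONE"),
    (None, "ADD", "BREG", "ONE"),
    ("ONE", "DC", None, "1"),
]
code, tii = assemble_single_pass(program, start=101)
print(tii)
print(code[101])
```

With the program starting at 101, the first TII entry is (101, ONE), matching the example in the text, and the blank operand is filled in once ONE has been defined at address 103.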

Design of a Two Pass Assembler:- Tasks performed by the passes of a two pass assembler are:-

PASS 1:-

(1)    Separate the symbol, mnemonic opcode and operand fields.
(2)    Build the symbol table.
(3)    Perform LC processing.
(4)    Construct intermediate representation

PASS 2:-

(1)    Synthesize the target program.

Pass 1 performs analysis of the source program and synthesis of the intermediate representation, while Pass 2 processes the intermediate representation (IR) to synthesize the target program. Before going into the details of the design of the assembler passes, we should know about the advanced assembler directives.

Advanced Assembler Directives:-

(i) ORIGIN:- The syntax of this directive is
     ORIGIN      [address spec]
where [address spec] is an [operand spec] or [constant]. This directive indicates that LC should be set to the address given by [address spec]. The ORIGIN statement is useful when the target program does not consist of consecutive memory words. The ability to use an [operand spec] in the ORIGIN statement makes it possible to perform LC processing in a relative rather than an absolute manner.
(ii) EQU:- The syntax of this directive is
   [symbol]   EQU   [address spec]
where [address spec] is an [operand spec] or [constant]. The EQU statement defines the symbol to represent [address spec]. This differs from the DC/DS statements in that no LC processing is implied; EQU simply associates the name [symbol] with [address spec].
(iii) LTORG:-
The LTORG statement permits a programmer to specify where literals should be placed. By default, the assembler places the literals after the END statement. At every LTORG statement, as at the END statement, the assembler allocates memory to the literals of a literal pool. The pool contains all the literals used in the program since the start of the program or since the last LTORG statement. The LTORG directive has little relevance (applicability) for the simple assembly language considered here; allocating literals at intermediate points in the program rather than at the end matters mainly on machines where an operand must lie close to the instruction that uses it.

Pass 1 of the Assembler:- Pass 1 uses the following data structures:-
OPTAB: - A table of mnemonic opcodes and related info.
SYMTAB: - Symbol Table.
LITTAB: - A table of literals used in the program.

OPTAB contains the fields mnemonic opcode, class and mnemonic information. The ‘class’ field indicates whether the opcode corresponds to an imperative statement (IS), a declaration statement (DL) or an assembler directive (AD). For an imperative statement, the mnemonic information field contains the pair (machine opcode, instruction length); otherwise it contains the id of a routine that handles the declaration or directive statement.

SYMTAB contains the fields symbol, address and length.
LITTAB contains the fields literal and address.

  • The processing of an assembly statement begins with the processing of its label field. If it contains a symbol, the symbol and the value in LC are copied into a new entry of SYMTAB.
  • After this, the class field of the OPTAB entry is examined to determine whether the mnemonic belongs to the class of imperative, declaration or assembler directive statements.
  • If it is an imperative statement, the length of the machine instruction is simply added to the LC.
  • If it is a declaration or assembler directive statement, the routine mentioned in the mnemonic information field is called to perform appropriate processing of the statement.
  • LITTAB is used to collect all literals used in the program. Awareness of the different literal pools is maintained by an auxiliary table, POOLTAB. This table contains the literal number of the starting literal of each literal pool.
  • At any stage, the current literal pool is the last pool in LITTAB. On encountering an LTORG statement (or the END statement), the literals in the current pool are allocated addresses starting with the current value in LC, and LC is incremented appropriately.
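The interplay of LITTAB and POOLTAB described in the last two points can be sketched in Python. The sketch makes some assumptions: list indices are 0-based (the notes number literals from 1), every literal occupies one word, and the function names are hypothetical.

```python
littab = []     # entries: [literal, address or None]
pooltab = [0]   # index in littab of the first literal of each pool (0-based)


def note_literal(literal):
    """Record a literal in the current pool unless it is already there."""
    current_pool = littab[pooltab[-1]:]
    if all(entry[0] != literal for entry in current_pool):
        littab.append([literal, None])


def allocate_pool(lc):
    """On LTORG or END: give addresses to the current pool, return the new LC."""
    for entry in littab[pooltab[-1]:]:
        entry[1] = lc
        lc += 1                        # assumption: one word per literal
    if pooltab[-1] != len(littab):     # start a new pool only if this one was used
        pooltab.append(len(littab))
    return lc


note_literal("='5'")
note_literal("='1'")
note_literal("='5'")          # already in the current pool, not repeated
lc = allocate_pool(205)       # LTORG met when LC = 205
note_literal("='1'")          # a new pool, so the literal gets a new entry
lc = allocate_pool(lc + 10)   # END met ten words later
print(littab, pooltab, lc)
```

Because a pool is closed at each LTORG, the same literal used after the LTORG receives a fresh entry and a fresh address in the next pool.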






Pass 1 Algorithm: -

(1) Loc ctr = 0 (default value)
       Littab ptr = 1
       Pooltab ptr = 1

(2) While the next statement is not an END statement:
        (a) If a label is present then
            (i) this label = symbol in the label field;
            (ii) Store [this label, Loc ctr] in SYMTAB.

        (b) If an LTORG statement then
            (i) Process the literals in LITTAB and allocate memory.
            (ii) Pooltab ptr = Pooltab ptr + 1;
            (iii) Littab ptr = Littab ptr + 1;

        (c) If a START or ORIGIN statement then
            (i) Loc ctr = value specified in operand field;

        (d) If an EQU statement then
            (i) this address = value of [address spec];
            (ii) Store [this label, this address] in SYMTAB.

        (e) If a declaration statement (DC/DS) then
            (i) Code = code of the declaration statement;
            (ii) Size = size of memory area required by DC/DS;
            (iii) Loc ctr = Loc ctr + Size;
            (iv) Generate IC (DL, code).

        (f) If an imperative statement then
            (i) Code = machine opcode from OPTAB;
            (ii) Loc ctr = Loc ctr + instruction length from OPTAB;
            (iii) Generate IC (IS, code).

(3) (Processing of END statement)
        (a) Perform step 2(b).
        (b) Generate IC (AD, 02).
        (c) Go to Pass 2.
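A reduced version of this algorithm can be sketched in Python. The sketch leaves out literal processing (step 2(b)), and it makes several assumptions: the numeric codes in OPTAB are chosen for illustration (only the code 02 for END comes from the algorithm above), DC is taken to occupy one word, statements are (label, mnemonic, operand) tuples, and each IC unit records just the LC value and the (class, code) pair.

```python
OPTAB = {
    # mnemonic: (class, code, length); numeric codes are assumed for illustration
    "MOVER": ("IS", 4, 1), "MOVEM": ("IS", 5, 1), "ADD": ("IS", 1, 1),
    "DC": ("DL", 1, None), "DS": ("DL", 2, None),
    "START": ("AD", 1, None), "END": ("AD", 2, None),
    "ORIGIN": ("AD", 3, None), "EQU": ("AD", 4, None),
}


def value_of(spec, symtab):
    """Evaluate an address spec: a constant, a symbol, or symbol+constant."""
    if "+" in spec:
        name, offset = spec.split("+")
        return symtab[name] + int(offset)
    return int(spec) if spec.isdigit() else symtab[spec]


def pass1(statements):
    symtab, ic, lc = {}, [], 0
    for label, mnemonic, operand in statements:
        cls, code, length = OPTAB[mnemonic]
        if mnemonic == "EQU":                   # step 2(d): no LC processing
            symtab[label] = value_of(operand, symtab)
            continue
        if label:                               # step 2(a)
            symtab[label] = lc
        if mnemonic == "END":                   # step 3
            ic.append((lc, (cls, code)))
            break
        if mnemonic in ("START", "ORIGIN"):     # step 2(c)
            lc = value_of(operand, symtab)
        elif cls == "DL":                       # step 2(e)
            size = int(operand) if mnemonic == "DS" else 1
            ic.append((lc, (cls, code)))
            lc += size
        else:                                   # step 2(f)
            ic.append((lc, (cls, code)))
            lc += length
    return symtab, ic


program = [
    (None, "START", "200"),
    (None, "MOVER", "AREG, A"),
    ("LOOP", "ADD", "AREG, ONE"),
    (None, "MOVEM", "AREG, A"),
    ("A", "DS", "1"),
    ("ONE", "DC", "1"),
    ("BACK", "EQU", "LOOP"),
    (None, "END", ""),
]
symtab, ic = pass1(program)
print(symtab)
```

Note that A and ONE are used before they are defined; Pass 1 does not need their addresses, which is why forward references cause no difficulty in the two pass scheme.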

Intermediate Code forms: - The intermediate code consists of a set of IC units, each IC unit consisting of the following three fields: -

(1) Address.
(2) Representation of mnemonic op code.
(3) Representation of operands.


There are two main criteria for the choice of intermediate code (IC), viz. processing efficiency and memory economy. The variant forms of intermediate code arise mainly in the operand and address fields, due to the trade-off between processing efficiency and memory economy. The information in the mnemonic field is the same in all the variants.
  • The mnemonic field contains a pair of the form
                  (statement class, code)
Here, statement class can be one of IS, DL and AD, standing for imperative statement, declaration statement and assembler directive respectively. For an imperative statement (IS), code is the instruction opcode in machine language.
For DL and AD, code is an ordinal number within the class.



Variant 1: - The first operand is represented by a single digit number which is a code for a register (1-4 for AREG-DREG) or for a condition code.
The second operand, which is a memory operand, is represented by a pair of the form
               (operand class, code)
where operand class is one of C, S and L, standing for constant, symbol and literal respectively.
  • For a constant, the code field contains the internal representation of the constant itself.

  • For a symbol or literal, the code field contains the ordinal number of the operand’s entry in SYMTAB or LITTAB.
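The Variant 1 operand encoding can be sketched in Python. Here SYMTAB and LITTAB are reduced to lists of names so that an entry's ordinal number is its list position plus one. The function names are hypothetical, and entering an unknown symbol into the table on first use is an assumption of this sketch (it gives a forward reference an ordinal number before its address is known).

```python
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}


def encode_operand(operand, symtab_names, littab_names):
    """Return the (operand class, code) pair for a memory operand."""
    if operand.startswith("="):                 # literal such as ='5'
        if operand not in littab_names:
            littab_names.append(operand)
        return ("L", littab_names.index(operand) + 1)
    if operand.isdigit():                       # constant: the value itself
        return ("C", int(operand))
    if operand not in symtab_names:             # forward reference: enter it now
        symtab_names.append(operand)
    return ("S", symtab_names.index(operand) + 1)


def ic_unit(opcode, register, operand, symtab_names, littab_names):
    """Build (mnemonic field, register code, memory operand) for an IS."""
    return (("IS", opcode), REGISTERS[register],
            encode_operand(operand, symtab_names, littab_names))


symtab_names, littab_names = ["A"], []
print(ic_unit(4, "BREG", "ONE", symtab_names, littab_names))
print(ic_unit(1, "AREG", "='5'", symtab_names, littab_names))
```

Since every operand is fully processed here, Pass 2 only has to replace each ordinal number by the address stored in the corresponding table entry.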

Variant 2: - This variant differs from Variant 1 in that the operand fields of source statements are only selectively replaced by their processed forms.
For declaration statements and assembler directives, the operand fields contain processed forms, since these are needed for LC processing. For imperative statements, the operand field is processed only to identify literal references. Variant 2 thus reduces the work of Pass 1 by transferring the burden of operand processing from Pass 1 to Pass 2.


Pass 2 Algorithm: -

(1) Code area address = address of the code area (where the target code is to be assembled)
       Pooltab ptr = 1
       Loc ctr = 0 (default)
(2) While the next statement is not an END statement:
      (a) Clear the memory buffer area.
      (b) If an LTORG statement then
          (i) Process the literals and assemble them in the memory buffer.
          (ii) Size = size of memory area required for the literals.
          (iii) Pooltab ptr = Pooltab ptr + 1

      (c) If a START or ORIGIN statement then
          (i) Loc ctr = value specified in operand field.
          (ii) Size = 0

      (d) If a declaration statement then
          (i) If a DC statement, assemble the constant in the memory buffer.
          (ii) If a DS statement, no code is assembled; the memory area is only reserved.
          (iii) Size = size of memory required by DC/DS

      (e) If an imperative statement then
          (i) Assemble the instruction in the memory buffer.
          (ii) Size = size required to store the instruction

      (f) If Size != 0 then
          (i) Store the contents of the memory buffer in the code area.
          (ii) Loc ctr = Loc ctr + Size

(3) (Processing of END statement)
      (a) Perform steps 2(b) and 2(f).
      (b) Write the code area into the output file.
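A compact Python sketch of the second pass, working on Variant 1 style IC units, is given below. Several simplifications are assumed: each IC unit already carries its LC value, so START, ORIGIN and LTORG are not handled; SYMTAB is reduced to a list of addresses and LITTAB to a list of (value, address) pairs; constants are written as "00 0 ddd"; and the opcode and class codes are the assumed ones (DC = 1, DS = 2, END = 2).

```python
def pass2(ic, symtab, littab):
    """ic: list of (lc, (class, code), register code, operand) units."""
    code_area = {}
    for lc, (cls, code), register, operand in ic:
        if cls == "IS":                         # step 2(e)
            kind, value = operand
            if kind == "S":
                address = symtab[value - 1]     # ordinal number -> address
            elif kind == "L":
                address = littab[value - 1][1]
            else:
                address = value                 # constant used directly
            code_area[lc] = f"{code:02d} {register} {address:03d}"
        elif cls == "DL" and code == 1:         # DC: assemble the constant
            code_area[lc] = f"00 0 {operand[1]:03d}"
        elif cls == "AD" and code == 2:         # END: assemble the literal pool
            for value, address in littab:
                code_area[address] = f"00 0 {value:03d}"
        # DS falls through: storage is reserved but nothing is assembled.
    return code_area


ic = [
    (200, ("IS", 4), 1, ("S", 1)),   # MOVER AREG, A
    (201, ("IS", 1), 1, ("L", 1)),   # ADD   AREG, ='5'
    (202, ("IS", 5), 1, ("S", 1)),   # MOVEM AREG, A
    (203, ("DL", 2), 0, ("C", 1)),   # A   DS 1
    (204, ("DL", 1), 0, ("C", 1)),   # ONE DC '1'
    (205, ("AD", 2), 0, None),       # END
]
target = pass2(ic, symtab=[203, 204], littab=[(5, 205)])
for address in sorted(target):
    print(address, target[address])
```

No source statement is read in this pass; everything needed comes from the IC and the tables built by Pass 1.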

Wednesday, 20 April 2011

System Software

Contents of the Chapter
1. System Software
2. Components of System Software
3. Evolution of System Software
4. Model of Computer System
   (i) Translators
   (ii) Loaders
   (iii) Interpreters

System Software

System software is computer software which involves data and program management, including operating systems, control programs and database management systems. It is also known as a system package. System software performs the basic functions necessary to start and operate a computer. It controls and monitors the various activities of the computer and makes it easier and more efficient to use. It is a collection of programs which help us to operate, control and extend the processing capabilities of a computer. In general, system software supports the running of other software, helps to communicate with the peripheral devices, helps in the development of other software, and monitors the use of hardware resources such as memory and CPU. Hence system software makes the operation of a computer system more effective.


System software is classified into three main categories.

1.) System Control Software :- It contains programs that control and manage system resources and functions. The most important system control software is the Operating System (OS), which acts as an interface between the user and the computer.

2.) System Support Software :- It contains programs that support the execution of various applications and the smooth and efficient operation of a computer. It mainly involves language translators and database management systems.

3.) System Development Software :- This software helps system developers design and build better systems.

Components of System Software

The main components of System Software are the following :-

1.) Operating System:-

Just as the processor is the nucleus of a computer system, the Operating System is the nucleus of all software activities. It is an important piece of system software found in all computer installations, and it controls the overall operation of the computer. It is the first program to be loaded into RAM on any general purpose computer system. Depending on the type of computer, the Operating System performs a number of functions. It acts as an interface between the user and the computer. Together with the language translators, it allows programs written in a programming language to be run in a form which the CPU can understand. An Operating System loads programs, performs and manages I/O operations, manages files, manages the use of computer memory and so on. It is the Operating System that provides access to different I/O devices and releases these devices when a task is completed, so that they can be used by other programs.
With the help of the Operating System, a computer can read, create, delete and rename files and can also perform other file related tasks.
The Operating System also manages hard disk storage so that users can create, execute, save and retrieve various applications.
Managing computer resources such as the CPU, primary memory, secondary storage, I/O devices and other peripherals is also done by the Operating System.
Thus the Operating System is a very important component of System Software.

2.) Database Management System:-

It is another important component of System Software. A DBMS is a system in which related data is stored in an efficient and compact manner. Efficient means that the data stored in the DBMS can be accessed quickly, and compact means that the data stored in the DBMS takes very little space in the computer's memory.
A DBMS is a system software package that supports the use of databases. A database is a collection of data records and files. In large systems, a DBMS allows users and other software to store and retrieve data in a structured way. The DBMS accepts requests for data from an application program and instructs the Operating System to transfer the appropriate data.
A DBMS includes four main parts:- modeling language, data structure, database query language and transaction mechanism.

3.) Assemblers, Loaders, Macros and Compilers:-

1.) These components belong to the programming system. Programmers found it difficult to write or read programs in machine language, so they began to use a mnemonic for each machine instruction. Such a mnemonic machine language is called an assembly language. Programs called assemblers were written to translate assembly language into machine language. The input to an assembler is called the source program and the output is a machine language translation called the object program.

2.) Once the assembler produces an object program, that program must be placed into memory and executed. It is the purpose of the loader to ensure that object programs are placed in memory in an executable form. Thus the loader may be defined as a program that places programs into memory and prepares them for execution. The loader places the machine language version of the user's program into memory and transfers control to it.

3.) To relieve programmers from repeating identical parts of their programs, the Operating System provides a macro processing facility. This facility allows the programmer to define an abbreviation for a part of his program. The macro-processor treats the part of the program defined by the abbreviation as a macro-definition and saves the definition. The macro-processor then substitutes the definition for every occurrence of the abbreviation. The various high level languages like FORTRAN, COBOL and ALGOL are processed by compilers and interpreters. These high level languages were developed to allow users to express scientific, business related and statistical problems easily and concisely.
A compiler is a program that accepts a program written in a high level language and produces an object program.
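The macro substitution described in point 3 can be sketched in Python. The MACRO and MEND keywords, the one-word macro call and the absence of parameters are assumptions made to keep the illustration short; they are not taken from the text.

```python
def expand_macros(lines):
    """Save each macro definition, then replace every call by its body."""
    macros, output, current = {}, [], None
    for line in lines:
        words = line.split()
        if words[:1] == ["MACRO"]:          # start of a definition: MACRO name
            current = words[1]
            macros[current] = []
        elif words == ["MEND"]:             # end of the definition
            current = None
        elif current is not None:           # inside a definition: save the line
            macros[current].append(line)
        elif len(words) == 1 and words[0] in macros:
            output.extend(macros[words[0]]) # a call: substitute the saved body
        else:
            output.append(line)
    return output


source = [
    "MACRO INCR",
    "MOVER AREG, A",
    "ADD AREG, ONE",
    "MOVEM AREG, A",
    "MEND",
    "INCR",
    "MOVER BREG, A",
    "INCR",
]
for line in expand_macros(source):
    print(line)
```

The definition itself does not appear in the output; only the two calls of INCR are replaced by the three saved statements.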

Evolution of System Software

System software began to be used extensively with the second generation computers in the early 1960's. Before that, the operation of a computer was controlled primarily by human operators, who monitored the processing of each job. Typically, when a job ended, a bell rang or a light flashed to indicate that another job should be input to the computer and started by the operator. In addition, the operator had to activate each peripheral device when that device was needed by the computer. This wasted a large amount of computer time and human resources.
With the development of system software and operating systems, a queue of waiting jobs can be read onto a disk. The operating system starts each job when system resources are available for its execution. Since human intervention is largely eliminated, the computer's idle time is reduced.

Now, the evolution of operating system can also be explained under this topic:-

Evolution of Operating System:-

In the early days, a FORTRAN programmer would approach the computer with his source deck in one hand and a green deck of cards containing the FORTRAN compiler in the other.

This system of decks was unsatisfactory, so there was strong motivation for moving to a more flexible system. Another reason was that valuable computer time was being wasted as the machine stood idle during card handling activities and between jobs (a job is a unit of specified work).

To eliminate this waste of time, the facility of batching jobs was provided. A batch operating system performed the task of batching jobs.

However, in batch systems the memory resource was allocated entirely to a single program. Thus, if a program did not need the entire memory, a portion of that resource was wasted.

To overcome this problem, multiprogramming operating systems with partitioned core memory were developed. Multiprogramming allows multiple programs to reside in separate areas of core at the same time.

Even in such partitioned memory systems, some portions of memory could not be used because they were too small to contain a program. This problem of unused portions (holes) is called fragmentation. Fragmentation has been minimized by the techniques of relocatable partitions and paging. Relocatable partitioned core allows the unused portions to be condensed into one continuous part of core.
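The effect of condensing holes in relocatable partitioned core can be illustrated with a small Python sketch. Core is modelled, as an assumption of this sketch, by a list of (program name or None, size) pairs in address order, where None marks a hole; the function name is hypothetical.

```python
def compact(partitions):
    """Move all programs together and merge the holes into one free area."""
    used = [(name, size) for name, size in partitions if name is not None]
    free = sum(size for name, size in partitions if name is None)
    # The single merged hole, if any, ends up after the last program.
    return used + ([(None, free)] if free else [])


core = [("P1", 30), (None, 10), ("P2", 40), (None, 15), ("P3", 20)]
print(compact(core))
```

Before compaction a program needing 20 words cannot be loaded, because the holes of 10 and 15 words are each too small; afterwards the single hole of 25 words can hold it.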

After this, some other programs and systems were also developed like:-
(i) The resource of processor time is allocated by a program called as scheduler.
(ii) I/O processor is a processor concerned with input/output.
(iii) The files of information are allocated by the file system.
(iv) Time-Sharing is a method of allocating processor time.
(v) The segmentation technique provides a large name space and a good protection mechanism.

These were the main stages in the evolution of the operating system towards more efficient use of the computer.

The Model of Computer System, Translators, Loaders and Interpreters:-

(1) TRANSLATORS:-

A translator can be designed to translate programs in a high level language into equivalent programs in the machine language of the actual computer. Translator is a general term which denotes any language processor that accepts programs in some source language (high or low level) as input and produces functionally equivalent programs in an object language (high or low level) as output. There are several specialized types of translators with particular names:-

(i)Assembler:-

An assembler is a translator whose source language is an assembly language and whose object language is some variety of machine language.

(ii)Compiler:-

A compiler is a translator whose source language is a high level language and whose object language may be some variety of machine language.

(iii) Preprocessor or Macroprocessor:-

It is a type of translator whose source language is an extended form of some high-level language such as C++ or Java and whose object language is the standard form of the same language.

(2) LOADERS:-

A loader is a program which accepts the object program, places it into memory and prepares it for execution. It is the purpose of the loader to ensure that object programs are placed in memory in an executable form. The loader performs the functions of allocation, linking, relocation and loading. The general loading scheme for a loader is shown in the figure:-

The loader accepts the object program decks and places them in memory for execution.

(3)INTERPRETERS:-

An interpreter is a language processor which bridges an execution gap without generating a machine language program. The interpreter is also a type of language translator, although many differences exist between translators and interpreters.
In an interpreter, the absence of a target program means there is no output interface. Thus the language processing activities of an interpreter cannot be separated from its program execution activities. The interpreter does not produce machine code for the computer. Instead, the interpreter produces some intermediate form of the program that is more easily executable than the original program. However, program execution is slower when an interpreter is used.