Assembly language

   

Assembly language or simply assembly is a human-readable notation for the machine language that a specific computer architecture uses. Machine language, a pattern of bits encoding machine operations, is made readable by replacing the raw values with symbols called mnemonics.

For example, a computer with the appropriate processor will understand this x86/IA-32 machine instruction:

 10110000 01100001

For programmers, however, it is easier to remember the equivalent assembly language representation:

 mov  al, 0x61

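which means: move the hexadecimal value 61 (97 decimal) into the processor register named al. The mnemonic "mov" is short for "move", and it is followed by a comma-separated list of arguments or parameters; this is the typical form of an instruction. In the machine instruction, the first byte (10110000, hexadecimal B0) identifies both the operation and the register al, and the second byte (01100001) is the value 61 hexadecimal.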

Unlike in high-level languages, there is (to a close approximation) a 1-to-1 correspondence between simple assembly and machine language. Transforming assembly into machine language is accomplished by an assembler, and the reverse by a disassembler.
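
The correspondence can be sketched in Python for the single instruction form used above. This is a minimal illustration, not a real assembler: the function names are hypothetical, and only the x86 form "mov r8, imm8" (opcode byte B0 hexadecimal plus the register number, followed by the immediate byte) is handled.

```python
# Register numbers for the eight 8-bit x86 registers, as used in the
# "mov r8, imm8" encoding (opcode byte 0xB0 + register number).
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}
REG8_NAMES = {number: name for name, number in REG8.items()}


def assemble_mov_imm8(line):
    """Translate 'mov <reg8>, <imm8>' into its two-byte machine encoding."""
    mnemonic, operands = line.strip().split(None, 1)
    if mnemonic.lower() != "mov":
        raise ValueError("this sketch only handles mov")
    reg, imm = (part.strip() for part in operands.split(","))
    value = int(imm, 0)  # base 0 accepts both 0x61 and 97
    if not 0 <= value <= 0xFF:
        raise ValueError("immediate does not fit in 8 bits")
    return bytes([0xB0 + REG8[reg.lower()], value])


def disassemble_mov_imm8(code):
    """Translate the two-byte encoding back into assembly text."""
    opcode, value = code
    if not 0xB0 <= opcode <= 0xB7:
        raise ValueError("not a 'mov r8, imm8' opcode")
    return f"mov  {REG8_NAMES[opcode - 0xB0]}, {value:#04x}"


machine_code = assemble_mov_imm8("mov  al, 0x61")
bits = " ".join(f"{byte:08b}" for byte in machine_code)
print(bits)                                # the bit pattern shown earlier
print(disassemble_mov_imm8(machine_code))  # back to assembly text
```

Because each line of assembly maps to one fixed byte pattern, both directions are simple table lookups; a real assembler mostly adds many more such tables, plus the handling of symbols and addresses.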

Every computer architecture has its own machine language, and therefore its own assembly language (the example above is for IA-32, the architecture of the i386 and its successors). These languages differ in the number and type of operations that they support. They may also have different sizes and numbers of registers, and different representations of data types in storage. While all general-purpose computers can carry out essentially the same computations, the way they do so differs.

In addition, multiple sets of mnemonics or assembly-language syntaxes may exist for a single instruction set. For x86, for example, both Intel syntax and AT&T syntax are in wide use. In these cases, the most popular one is usually the one the manufacturer uses in its documentation.
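
How two syntaxes can describe the same instruction may be sketched in Python. The hypothetical helper below rewrites the earlier example from Intel syntax (destination first) into the AT&T syntax accepted by the GNU assembler (source first, "$" before immediates, "%" before registers, and a size suffix on the mnemonic). It assumes a byte-sized register destination and an immediate source, and does no further validation.

```python
def intel_to_att_mov_imm(line):
    """Rewrite 'mov <reg8>, <imm>' (Intel order) as 'movb $<imm>, %<reg8>'."""
    mnemonic, operands = line.strip().split(None, 1)
    if mnemonic.lower() != "mov":
        raise ValueError("this sketch only handles mov")
    dest, source = (part.strip() for part in operands.split(","))
    # AT&T syntax puts the source first, prefixes immediates with '$' and
    # registers with '%', and adds a size suffix ('b' for a byte operand).
    return f"movb ${source}, %{dest}"


att_line = intel_to_att_mov_imm("mov  al, 0x61")
print(att_line)
```

Both spellings assemble to the same two bytes; only the notation differs.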

Machine instructions

Instructions in assembly language are generally very simple, unlike statements in a high-level language. Any instruction that references memory (for data or as a jump target) also has an addressing mode, which determines how the required memory address is calculated. More complex operations must be built up from these simple ones. Operations available in most instruction sets include:

  • moving
    • set a register (a temporary "scratchpad" location in the CPU itself) to a fixed constant value
    • move data from a memory location to a register, or vice versa. This is done to obtain the data to perform a computation on it later, or to store the result of a computation.
    • read data from, and write data to, hardware devices
  • computing
    • add, subtract, multiply, or divide the values of two registers, placing the result in a register
    • perform bitwise operations, taking the conjunction/disjunction (and/or) of corresponding bits in a pair of registers, or the negation (not) of each bit in a register
    • compare two values in registers (for example, to see if one is less, or if they are equal)
  • affecting program flow
    • jump to another location in the program and execute instructions there
    • jump to another location if a certain condition holds
    • jump to another location, but save the location of the next instruction as a point to return to (a call)
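
The three groups of operations listed above can be sketched as a toy register machine in Python. Everything here is an assumption made for illustration: the operation names (set, load, store, add, cmp, jmp, jle, call, ret, halt), the four registers and the tuple format do not belong to any real instruction set.

```python
def run(program, memory):
    """Execute a list of instruction tuples on a toy register machine."""
    regs = {"r0": 0, "r1": 0, "r2": 0, "r3": 0}
    flags = {"less": False, "equal": False}
    return_stack = []
    pc = 0  # index of the next instruction to execute
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "set":        # moving: register <- constant
            regs[args[0]] = args[1]
        elif op == "load":     # moving: register <- memory
            regs[args[0]] = memory[args[1]]
        elif op == "store":    # moving: memory <- register
            memory[args[1]] = regs[args[0]]
        elif op == "add":      # computing: dst <- a + b
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "cmp":      # computing: compare two registers
            flags["less"] = regs[args[0]] < regs[args[1]]
            flags["equal"] = regs[args[0]] == regs[args[1]]
        elif op == "jmp":      # flow: unconditional jump
            pc = args[0]
        elif op == "jle":      # flow: jump if the last compare was <=
            if flags["less"] or flags["equal"]:
                pc = args[0]
        elif op == "call":     # flow: jump, remembering where to return
            return_stack.append(pc)
            pc = args[0]
        elif op == "ret":      # flow: go back to the saved location
            pc = return_stack.pop()
        elif op == "halt":
            return regs
        else:
            raise ValueError(f"unknown operation {op!r}")


# Sum the integers 1..n, where n is read from memory[0]; result in memory[1].
program = [
    ("load", "r1", 0),           # 0: r1 <- n
    ("call", 4),                 # 1: call the subroutine at index 4
    ("store", "r0", 1),          # 2: memory[1] <- r0
    ("halt",),                   # 3
    ("set", "r0", 0),            # 4: total = 0
    ("set", "r2", 1),            # 5: i = 1
    ("set", "r3", 1),            # 6: constant 1
    ("cmp", "r2", "r1"),         # 7: compare i with n
    ("jle", 10),                 # 8: if i <= n, go to the loop body
    ("ret",),                    # 9: otherwise return to the caller
    ("add", "r0", "r0", "r2"),   # 10: total += i
    ("add", "r2", "r2", "r3"),   # 11: i += 1
    ("jmp", 7),                  # 12: repeat the test
]
memory = [5, 0]
run(program, memory)
print(memory)
```

Even this small loop needs over a dozen instructions, which is the sense in which more complex operations must be built up out of simple ones.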

Specific instruction sets often provide a single instruction, or a short sequence of instructions, for a common operation that would otherwise take many instructions. Examples:

  • saving many registers on the stack at once
  • moving large blocks of memory
  • complex and/or floating-point arithmetic (sine, cosine, square root, etc.)
  • applying a simple operation (for example, addition) to a vector of values

Assembly language directives

In addition to mnemonics for machine instructions, assembly languages have directives for assembling blocks of data and for assigning addresses to instructions or data.
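
What such directives do can be sketched in Python. The sketch assumes two directives, here called org (set the address of the next byte) and db (define bytes), modelled on names that several x86 assemblers use. The exact names and syntax vary between assemblers, and the tuple-based input format is purely illustrative.

```python
def place_data(lines):
    """Return {address: byte} for a list of (directive, operands) pairs."""
    image = {}
    location = 0  # the location counter: address of the next byte emitted
    for directive, operands in lines:
        if directive == "org":      # set the location counter
            location = operands[0]
        elif directive == "db":     # define bytes: emit each operand in turn
            for item in operands:
                if isinstance(item, str):
                    data = item.encode("ascii")
                else:
                    data = bytes([item])
                for byte in data:
                    image[location] = byte
                    location += 1
        else:
            raise ValueError(f"unknown directive {directive!r}")
    return image


image = place_data([
    ("org", [0x100]),    # following data starts at address 0x100
    ("db", ["Hi", 0]),   # two characters and a terminating zero byte
    ("org", [0x200]),
    ("db", [0x61]),
])
for address in sorted(image):
    print(f"{address:#06x}: {image[address]:#04x}")
```

Neither directive corresponds to a machine instruction; both only tell the assembler what to place in the output and where.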

Assemblers usually provide a simple symbolic capability: values can be defined as symbolic expressions, which are evaluated at assembly time. This makes it possible to write code that is easier to read and understand.
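
Assembly-time evaluation of symbolic expressions can be sketched in Python as follows. The "NAME equ EXPRESSION" form is modelled on a directive found in several assemblers. The symbol names, the restriction to integer +, -, * and integer division (written // here, since the sketch borrows Python's expression parser), and the function names are assumptions made for this example.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.FloorDiv: operator.floordiv}


def evaluate(expression, symbols):
    """Evaluate an integer expression that may mention earlier symbols."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.Name):
            return symbols[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)


def define_symbols(source):
    """Process lines of the form 'NAME equ EXPRESSION' in order."""
    symbols = {}
    for line in source.strip().splitlines():
        name, keyword, expression = line.split(None, 2)
        if keyword != "equ":
            raise ValueError("expected 'NAME equ EXPRESSION'")
        symbols[name] = evaluate(expression, symbols)
    return symbols


symbols = define_symbols("""
ROWS        equ 25
COLUMNS     equ 80
SCREEN_SIZE equ ROWS * COLUMNS * 2
LAST_CELL   equ SCREEN_SIZE - 2
""")
print(symbols)
```

The machine code that results contains only the final numbers; the names and the arithmetic exist solely in the source text.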

As in most computer languages, comments can be added to the source code; the assembler ignores them.

Assemblers also usually include an embedded macro language, which makes it easier to generate complex pieces of code or data.
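
Macro expansion amounts to textual substitution before assembly, which can be sketched in Python. The macro table format, the "%name" parameter markers and the save_pair macro are all hypothetical; real macro languages have their own syntax and usually also support nesting, conditionals and local labels, none of which appear here.

```python
def expand(source_lines, macros):
    """Replace each macro invocation with its body, substituting arguments."""
    output = []
    for line in source_lines:
        name, _, rest = line.strip().partition(" ")
        if name not in macros:
            output.append(line.strip())  # ordinary instruction: keep as is
            continue
        parameters, body = macros[name]
        if rest.strip():
            arguments = [arg.strip() for arg in rest.split(",")]
        else:
            arguments = []
        if len(arguments) != len(parameters):
            raise ValueError(f"{name} expects {len(parameters)} argument(s)")
        for body_line in body:
            # Plain text replacement; parameter names must not be
            # prefixes of one another in this simple scheme.
            for parameter, argument in zip(parameters, arguments):
                body_line = body_line.replace("%" + parameter, argument)
            output.append(body_line)
    return output


macros = {
    # name: (parameter names, body lines)
    "save_pair": (["a", "b"], ["push %a", "push %b"]),
}
expanded = expand(["mov ax, 1", "save_pair ax, bx", "ret"], macros)
print("\n".join(expanded))
```

The assembler proper only ever sees the expanded lines, so a macro costs nothing at run time.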

None of this survives assembly: comments are discarded, and symbols are replaced by the numbers they stand for. In practice, this makes disassembled code considerably harder for a human to interpret than the original source would be.

Usage of assembly language

There is some debate over the usefulness of assembly language. It is often said that modern compilers can render higher-level languages into code that runs as fast as hand-written assembly, but counterexamples exist, and there is no clear consensus on the question. It is reasonably certain that, given the increasing complexity of modern processors, effective hand-optimization is becoming more difficult and requires a great deal of knowledge.

However, some discrete calculations can still be rendered into faster-running code with assembly, and some low-level programming is simply easier to do in assembly. Some system-dependent tasks performed by operating systems cannot be expressed in high-level languages at all. In particular, assembly is often used to write the low-level interaction between the operating system and the hardware, for instance in device drivers. Many compilers also translate high-level languages into assembly before producing machine code, which allows the assembly code to be inspected for debugging and optimization purposes.

It is also common, especially in relatively low-level languages such as C, to be able to embed assembly language in the source code using special syntax. Programs that use such facilities, such as the Linux kernel, often construct abstractions in which different assembly code is used on each supported platform, but portable code calls it through a uniform interface.

Many embedded systems are also programmed in assembly to obtain the absolute maximum functionality out of what are often very limited computational resources, though this is gradually changing in some areas as more powerful chips become available for the same minimal cost.

Assembly language is also valuable in reverse engineering, since many programs are distributed only in machine code form. Machine code is usually easy to translate into assembly language, where it can be examined carefully, but very difficult to translate into a higher-level language. Tools such as the Interactive Disassembler make extensive use of disassembly for this purpose.


Retrieved from "http://www.mywiseowl.com/articles/Assembly_language"

This page was last modified 14:52, 25 Nov 2004. All text is available under the terms of the GNU Free Documentation License (see Copyrights for details).