Pipeline

   

This article refers to the mechanical, electrical, and software systems meaning of pipeline. For pipelines used to transport fluids like water or petroleum, see pipeline transport.


The term pipeline is used in mechanical and electrical systems as well as in software. In general, it means splitting a job into stages so that the output of one stage feeds into the next, much as water flows from one pipe segment to the next.

Mechanical analogy

A mechanical example of a pipeline is a washer/dryer system for clothing. Instead of one unit that both washes and dries, two units together form a pipeline: the output of the washer enters the dryer. If washing takes 1 hour and drying takes 1 hour, the pipeline finishes a full load of laundry every hour once both units are busy, compared with every 2 hours for a single (non-pipelined) unit that washes and then dries. Each item of clothing still needs two hours to complete its wash/dry cycle, of course.
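The arithmetic behind the laundry example can be sketched in a few lines of shell. The function names below are hypothetical and exist only for this illustration; the sketch assumes every stage takes the same whole number of hours and that a new load can start as soon as the first stage is free.

```shell
# sequential_hours LOADS STAGES HOURS_PER_STAGE
# One combined unit: each load occupies it for STAGES * HOURS_PER_STAGE hours.
sequential_hours() {
    echo $(( $1 * $2 * $3 ))
}

# pipelined_hours LOADS STAGES HOURS_PER_STAGE
# Separate units: the first load needs STAGES * HOURS_PER_STAGE hours,
# and every later load finishes HOURS_PER_STAGE hours after the previous one.
pipelined_hours() {
    echo $(( ($2 + $1 - 1) * $3 ))
}

sequential_hours 4 2 1   # 4 loads, wash + dry, 1 hour each: prints 8
pipelined_hours 4 2 1    # same work through the pipeline: prints 5
```

With a single load both functions give 2 hours, which matches the observation that the time for one item to pass through is unchanged.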

Pipelined processors

In electronics, pipelines are used in microprocessors so that complex logic sequences can execute at higher speeds. Pipelining raises throughput (the amount of work completed per unit of time) without reducing latency (the time one item needs to pass through all stages). See Instruction pipeline and Classic RISC pipeline for a fuller discussion.

Software pipelines

In computer software, a pipeline is a command line feature prevalent in UNIX and other UNIX-like operating systems. Douglas McIlroy, one of the early UNIX developers, noticed that much of the time users were feeding the output of one program to another as its input. The UNIX pioneers established a means of chaining running programs together as co-processes so that the output of the first program becomes the input to the second. This became the well-known pipes and filters design pattern. A pipeline may be extended to any number of commands, with the output of each serving as the input to the next.
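As a small illustration of chaining, the sketch below connects four standard commands with the | operator. The function name count_words is hypothetical; the commands themselves (tr, sort, uniq) are standard utilities.

```shell
# count_words: read text on standard input and print each distinct word
# with its number of occurrences, most frequent first.
count_words() {
    tr -s ' ' '\n' |   # put each word on its own line
    sort |             # bring identical words together
    uniq -c |          # collapse runs of identical lines, prefixing a count
    sort -rn           # order by count, largest first
}

printf 'pipe filter pipe\n' | count_words
```

Each command starts as soon as the pipeline is launched and processes data as it arrives from its left-hand neighbour.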

Unix pipes

The programs used in a UNIX pipeline are commonly filters, which usually obey a few conventions: they treat their data as line-structured records, read from the standard input, and write to the standard output.
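A filter that follows these conventions can be very small. The sketch below defines a hypothetical filter, number_lines, as a shell function: it reads one record per line from the standard input and writes one record per line to the standard output, so it can sit anywhere in a pipeline.

```shell
# number_lines: prefix every input line with its line number.
number_lines() {
    n=0
    # IFS= and -r keep leading blanks and backslashes in each record intact.
    while IFS= read -r line; do
        n=$((n + 1))
        printf '%d: %s\n' "$n" "$line"
    done
}

# The filter composes with standard tools on either side.
printf 'alpha\nbeta\n' | number_lines | grep beta   # prints "2: beta"
```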

Below is an example of a pipeline that implements a kind of spell checker for the web resource indicated by a URL (http://www.wikipedia.org/wiki/Pipeline).

curl http://www.wikipedia.org/wiki/Pipeline |
sed 's/[^a-zA-Z ]//g' |
tr 'A-Z ' 'a-z\n' |
grep '[a-z]' |
sort -u |
comm -23 - /usr/dict/words

Here is an explanation of the pipeline:

  • First, the curl program obtains the HTML contents of a web page.
  • The contents of this page are piped through sed, which removes every character that is neither a space nor a letter.
  • tr then converts each uppercase letter to its lowercase counterpart and each space to a newline, so that each 'word' is now on a separate line.
  • grep keeps only the lines that contain at least one letter, which removes the empty lines.
  • sort sorts the list of 'words' into alphabetical order and, because of the -u option, removes duplicates.
  • Finally, comm finds which of the words in the list are not in the given dictionary file (in this case, /usr/dict/words). The -23 option suppresses the lines that appear only in the dictionary and the lines common to both inputs.
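The same steps can be tried without network access or a system dictionary. The sketch below wraps the text-processing stages in a hypothetical function, misspelled, and runs it against a tiny word list created in a temporary file; the word list and the sample sentence are made up for the illustration. LC_ALL=C is set as an assumption so that sort and comm agree on the collating order, which comm requires of its inputs.

```shell
# misspelled DICTFILE
# Read text on standard input and print the words that do not appear
# in DICTFILE, which must contain one lowercase word per line, sorted.
misspelled() {
    sed 's/[^a-zA-Z ]//g' |      # drop everything except letters and spaces
    tr 'A-Z ' 'a-z\n' |          # lowercase, one word per line
    grep '[a-z]' |               # discard empty lines
    LC_ALL=C sort -u |           # sorted, duplicates removed
    LC_ALL=C comm -23 - "$1"     # keep lines found only in the input
}

dict=$(mktemp)
printf 'a\nis\npipeline\n' | LC_ALL=C sort > "$dict"
printf 'A pipeline is a pipline.\n' | misspelled "$dict"   # prints "pipline"
rm -f "$dict"
```

On many current systems the word list lives at /usr/share/dict/words rather than /usr/dict/words, so the dictionary path is best treated as a parameter, as it is here.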

Hartmann pipelines

John Hartmann, a Danish engineer with IBM, extended the basic pipes and filters paradigm in a number of useful ways. His product, known as CMS Pipelines, is available on a number of IBM platforms.

Some of the salient characteristics that distinguish Hartmann pipelines from ordinary Unix pipes are:

  • Filters may have multiple inputs and multiple outputs. For example, a selection filter can send the found records down one output pipe and the not found records down another.
  • A linear notation for representing pipeline networks.
  • An interface that allows REXX programs to act as filters.
  • A pacing strategy in the pipeline supervisor that allows, for example, a stream to be split by a selection filter, the records on each output leg to be processed by other filters, and the legs then to be merged by a join filter with the original record order preserved in the result stream.
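The first of these characteristics, a selection filter with two outputs, can be approximated in an ordinary Unix shell, although only loosely. The sketch below is not CMS Pipelines notation: it is a hypothetical shell function, split_records, that uses awk to send matching records to one file and the remaining records to another, because a plain Unix pipe carries only a single output stream. It does not attempt the pacing or order-preserving merge described above.

```shell
# split_records PATTERN FOUND NOTFOUND
# Copy standard input lines matching the regular expression PATTERN to the
# file FOUND and every other line to the file NOTFOUND.
split_records() {
    awk -v pat="$1" -v found="$2" -v notfound="$3" '
        $0 ~ pat { print > found; next }   # the "found" output leg
                 { print > notfound }      # the "not found" output leg
    '
}

dir=$(mktemp -d)
printf 'apple\nbanana\navocado\n' | split_records '^a' "$dir/found" "$dir/notfound"
cat "$dir/found"      # apple and avocado, in input order
cat "$dir/notfound"   # banana
rm -rf "$dir"
```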

The utility of the many filters supplied with the program is exemplified by the LOOKUP filter:

LOOKUP matches records in its primary input stream with records in its secondary input stream and writes matched and unmatched records to different output streams. The records are matched on the basis of a key field (the contents of a specified range of columns in the records).

LOOKUP reads records from its primary and secondary input streams and writes records to its primary, secondary, and tertiary output streams, if each is connected. The secondary input stream must be defined and connected.

The records in the secondary input stream are the master records. LOOKUP first reads the master records into a buffer, where records with duplicate key fields are discarded; the first occurrence of a key is retained. The records in the buffer are referred to as the reference.

The records in the primary input stream are the detail records. LOOKUP compares detail records to records in the reference. LOOKUP writes records to three output streams, if each is connected:

  • The primary output stream contains matching records. You can specify the sequence of the master and detail records written to the primary output stream and what is written to the primary output stream: both detail and master records, only detail records, or only master records.
  • The secondary output stream contains detail records that do not have a matching master record.
  • The tertiary output stream contains master records in ascending order by their key fields. The primary and secondary output streams are severed at the end of file on the primary input stream before records are written to the tertiary output stream.
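The matching behaviour described above can be imitated with awk to make the data flow concrete. This is only a sketch under stated assumptions, not the LOOKUP filter or its syntax: the function name lookup_sketch is hypothetical, the key is taken to be the first whitespace-separated field rather than a column range, only detail records are written for a match, and the tertiary output is omitted.

```shell
# lookup_sketch MASTERFILE UNMATCHEDFILE
# Detail records arrive on standard input. Details whose key has a master
# record go to standard output (the "primary" output); the others are
# written to UNMATCHEDFILE (the "secondary" output).
lookup_sketch() {
    awk -v master="$1" -v unmatched="$2" '
        BEGIN {
            # Build the reference: the first record seen for a key is kept,
            # later records with the same key are discarded.
            while ((getline rec < master) > 0) {
                split(rec, f, " ")
                if (!(f[1] in ref)) ref[f[1]] = rec
            }
            close(master)
        }
        ($1 in ref) { print; next }          # matching detail record
                    { print > unmatched }    # detail record without a master
    '
}

dir=$(mktemp -d)
printf 'cat feline\ndog canine\ncat duplicate\n' > "$dir/master"
printf 'dog Rex\nemu Eddie\ncat Tom\n' | lookup_sketch "$dir/master" "$dir/unmatched"
cat "$dir/unmatched"
rm -rf "$dir"
```

In this run the records for dog and cat are matched and written to standard output in their original order, while the emu record ends up in the unmatched file.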

This arrangement allows one to use other filters to prepare the dictionary, or master records, for input to LOOKUP from whatever source is required. The many input/output filters, or drivers, allow a Hartmann pipeline to interact directly with a variety of data sources, from files to the system itself to such things as TCP/IP ports. The repertoire of filters and drivers is rich enough that one could, for example, write a server that consists solely of a Hartmann pipeline.



This page was last modified 01:16, 10 Nov 2004. All text is available under the terms of the GNU Free Documentation License (see Copyrights for details).