CSEPDJVU(1)                      DjVuLibre-3.5                     CSEPDJVU(1)



NAME
       csepdjvu - DjVu encoder for separated data files.


SYNOPSIS
       csepdjvu  [options] [sepfiles]... outputdjvufile


DESCRIPTION
       This  program creates a DjVuDocument file outputdjvufile from separated data files sepfiles.  It can read sepa-
       rated data from the standard input when given a single dash instead of the separated  data  file  names.   This
       feature is intended for pre-processing programs that push separated data into csepdjvu via a pipe.

       Each  separated  data  file  represents  one  or more page images.  When the program arguments specify multiple
       pages, all the pages are encoded and saved as a bundled multi-page document.  When the program arguments  spec-
       ify a single page, the page is encoded and saved as a single page file.


OPTIONS
       -d n   Specify the resolution information encoded into the output file, expressed in dots per inch.  The
              resolution information encoded in DjVu files determines how the decoder scales the image on a
              particular display.  Meaningful resolutions range from 25 to 6000.  The default value is 300 dpi.

       -q n,...,n

       -q n+...+n
              Specify the encoding quality of the IW44 encoded background layer.  The option argument contains
              several integers (one per chunk) separated by either commas or pluses.  This option is similar to
              option -slice of program c44.  Please refer to the c44(1) man page for additional details.  The
              default quality specification is -q 72,83,93,103.

              This option does not apply to uniformly white backgrounds that were not specified by the separated
              data but are called for by the DjVu specification.  Such background images always come at the lowest
              possible resolution and with a standard quality setting that ensures color uniformity.

       -t     Program csepdjvu interprets certain comments in the separated file to construct a hidden text layer
              in the DjVu file.  By default, this layer records the location of each word for highlighting
              purposes.  This option reduces the file size by recording only the location of each line.

       -v     Display a brief message describing each page.

       -vv    Display extensive informational messages during encoding.
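
       As an illustration only, the options above could be assembled from a script as sketched below.  The
       helper names csepdjvu_argv and run_csepdjvu are hypothetical and not part of DjVuLibre; the sketch assumes
       that csepdjvu is installed and reachable through the search path.

```python
import subprocess


def csepdjvu_argv(output, sepfiles=("-",), dpi=None, quality=None,
                  lines_only=False, verbose=0):
    """Build an argument list for csepdjvu (hypothetical helper).

    Only the options documented in this page are used:
    -d, -q, -t, -v and -vv.
    """
    argv = ["csepdjvu"]
    if dpi is not None:
        if not 25 <= dpi <= 6000:
            raise ValueError("resolution must be between 25 and 6000 dpi")
        argv += ["-d", str(dpi)]
    if quality:
        # One integer per chunk, separated by commas.
        argv += ["-q", ",".join(str(q) for q in quality)]
    if lines_only:
        argv.append("-t")
    if verbose == 1:
        argv.append("-v")
    elif verbose >= 2:
        argv.append("-vv")
    argv += list(sepfiles)   # a single dash means standard input
    argv.append(output)
    return argv


def run_csepdjvu(sep_data, output, **options):
    """Pipe separated data (bytes) into csepdjvu via standard input."""
    argv = csepdjvu_argv(output, sepfiles=("-",), **options)
    subprocess.run(argv, input=sep_data, check=True)
```

       For instance, csepdjvu_argv("out.djvu", dpi=300, verbose=1) yields the argument list for the command
       "csepdjvu -d 300 -v - out.djvu".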


SEPARATED DATA FILE FORMAT
       Each separated data file contains a concatenation of one or more separated page images.  Each page is logically
       represented by a foreground image with a transparent color and by a background image visible through the trans-
       parent pixels.  The data for each separated page image is the concatenation of the following data blocks:

       *  A foreground image encoded using either the "Color RLE format" or the "Bitonal RLE format".   These  formats
          are described later in this section.

       *  An  optional  background image encoded as a "Portable Pixmap" ( PPM ).  This well known format is summarized
          later in this section.  The absence of a background image simply indicates that a uniformly white background
          should be assumed.

       *  An  arbitrary  number  of  comment lines starting with character "#" and terminated by a linefeed character.
          Comment lines whose first word starts with a capital letter have special meanings documented later  in  this
          document.
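
       The block order listed above can be sketched as a plain concatenation.  The helper name assemble_page is
       hypothetical, and the checks on the leading characters are only a sanity test, not a full validation of
       each block.

```python
def assemble_page(foreground, background=None, comments=()):
    """Concatenate the data blocks of one separated page (hypothetical helper).

    foreground -- bytes in Color RLE ("R6") or Bitonal RLE ("R4") format
    background -- optional bytes in PPM ("P6") format
    comments   -- iterable of bytes, each a "#" line ending with a linefeed
    """
    if foreground[:2] not in (b"R4", b"R6"):
        raise ValueError("foreground must start with R4 or R6")
    parts = [foreground]
    if background is not None:
        if background[:2] != b"P6":
            raise ValueError("background must start with P6")
        parts.append(background)
    for line in comments:
        if not (line.startswith(b"#") and line.endswith(b"\n")):
            raise ValueError("comment lines start with # and end with a linefeed")
        parts.append(line)
    return b"".join(parts)


def assemble_file(pages):
    """A separated data file is the concatenation of its pages."""
    return b"".join(pages)
```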

       The dimensions (width and height) of the background image must be obtained by rounding up the quotient of the
       foreground image dimensions by an integer reduction factor ranging from 1 to 12.  Assume, for instance, that
       the width of the foreground is 2507 and the reduction factor is 3.  The width of the background image must
       then be the integer quotient (2507+2)/3 = 836.
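
       The rounding rule can be written as an integer ceiling division.  This is a minimal sketch; the function
       name background_size is hypothetical.

```python
def background_size(fg_width, fg_height, reduction):
    """Return the (width, height) required for the background image.

    Each foreground dimension is divided by the reduction factor
    and rounded up, using integer arithmetic only.
    """
    if not 1 <= reduction <= 12:
        raise ValueError("reduction factor must range from 1 to 12")
    width = (fg_width + reduction - 1) // reduction
    height = (fg_height + reduction - 1) // reduction
    return width, height
```

       With a 2507x3300 foreground and a reduction factor of 3, this gives a background of 836x1100 pixels.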


   Color RLE format
       The Color RLE format is a simple run-length encoding scheme for color images with a limited number of  distinct
       colors.   The  data always begin with a text header composed of the two characters "R6", the number of columns,
       the number of rows, and the number of color palette entries.  All  numbers  are  expressed  in  decimal  ASCII.
       These  four  items  are  separated by blank characters (space, tab, carriage return, or linefeed) or by comment
       lines introduced by character "#".  The last number is followed by exactly one character  which  usually  is  a
       linefeed character.

       The  header  is  followed by the color palette containing three bytes per color entry.  The bytes represent the
       red, green, and blue components of the color.

       The palette is followed by a sequence of four-byte integers (most significant byte first) representing runs
       of pixels with an identical color.  The twelve upper bits of each integer indicate the index of the run color
       in the palette.  The twenty lower bits of the integer indicate the run length.  Color indices greater
       than 0xff0 are reserved.  Color index 0xfff is used for transparent runs.  Each row is represented by a
       sequence of runs whose lengths add up to the image width.  Rows are encoded starting with the top row and
       progressing toward the bottom row.
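
       A writer for this layout might look like the sketch below.  The function name encode_color_rle and its
       input representation (one list of (index, length) pairs per row) are assumptions made for the example; the
       header uses single spaces and a final linefeed, which is one of the separators the format allows.

```python
import struct

TRANSPARENT = 0xFFF   # color index used for transparent runs
MAX_INDEX = 0xFF0     # indices greater than this are reserved
MAX_RUN = 0xFFFFF     # largest length that fits in twenty bits


def encode_color_rle(width, height, palette, rows):
    """Encode an image in Color RLE format (hypothetical helper).

    palette -- list of (red, green, blue) tuples, one byte each
    rows    -- list of rows, top row first; each row is a list of
               (color_index, run_length) pairs
    """
    if len(rows) != height:
        raise ValueError("expected one run list per row")
    out = bytearray(b"R6 %d %d %d\n" % (width, height, len(palette)))
    for red, green, blue in palette:
        out += bytes((red, green, blue))
    for runs in rows:
        if sum(length for _, length in runs) != width:
            raise ValueError("run lengths must add up to the image width")
        for index, length in runs:
            valid = index == TRANSPARENT or (
                0 <= index < len(palette) and index <= MAX_INDEX)
            if not valid:
                raise ValueError("bad color index %r" % (index,))
            if not 0 <= length <= MAX_RUN:
                raise ValueError("run length does not fit in twenty bits")
            # Twelve upper bits: palette index.  Twenty lower bits: length.
            out += struct.pack(">I", (index << 20) | length)
    return bytes(out)
```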


   Bitonal RLE format
       The Bitonal RLE format is a simple run-length encoding scheme for bitonal images.  The data always begin with a
       text header composed of the two characters "R4", the number of columns, and the number of  rows.   All  numbers
       are  expressed  in  decimal  ASCII.   These three items are separated by blank characters (space, tab, carriage
       return, or linefeed) or by comment lines introduced by character "#".  The last number is followed  by  exactly
       one character which usually is a linefeed character.

       The rest of the file encodes a sequence of numbers representing the lengths of alternating runs of transparent
       and black pixels.  Lines are encoded starting with the top line and progressing toward the bottom line.  Each
       line starts with a transparent (white) run.  The decoder knows that a line is finished when the sum of the run
       lengths for that line is equal to the number of columns in the image.  Numbers in range 0 to 191 are
       represented by a single byte in range 0x00 to 0xbf.  Numbers in range 192 to 16383 are represented by a
       two-byte sequence: the first byte, in range 0xc0 to 0xff, encodes the six most significant bits of the number;
       the second byte encodes the remaining eight bits.  This scheme allows for runs of length zero, which are
       useful when a line starts with a black pixel, or when a very long run (whose length exceeds 16383) must be
       split into smaller runs.
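
       The number encoding and the use of zero-length runs can be sketched as follows.  The function names
       encode_run and encode_bitonal_rle are hypothetical, and rows are assumed to be sequences of pixels where 0
       is transparent and any other value is black.

```python
MAX_RUN = 16383   # largest length representable in two bytes


def encode_run(length):
    """Encode one run length, splitting runs longer than 16383."""
    out = bytearray()
    while length > MAX_RUN:
        out += b"\xff\xff"   # a run of 16383 pixels
        out += b"\x00"       # zero-length run of the other color
        length -= MAX_RUN
    if length < 192:
        out.append(length)                  # single byte 0x00..0xbf
    else:
        out.append(0xC0 | (length >> 8))    # six most significant bits
        out.append(length & 0xFF)           # remaining eight bits
    return bytes(out)


def encode_bitonal_rle(width, rows):
    """Encode rows of 0/1 pixels in Bitonal RLE format (top row first)."""
    out = bytearray(b"R4 %d %d\n" % (width, len(rows)))
    for row in rows:
        if len(row) != width:
            raise ValueError("every row must have exactly 'width' pixels")
        color, run = 0, 0            # each line starts with a transparent run
        for pixel in row:
            pixel = 1 if pixel else 0
            if pixel == color:
                run += 1
            else:
                out += encode_run(run)   # may be zero if the line starts black
                color, run = pixel, 1
        out += encode_run(run)
    return bytes(out)
```

       A line that begins with two black pixels followed by two transparent pixels is thus encoded as the three
       run lengths 0, 2, 2.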


   Portable Pixmap (PPM) format
       The Portable Pixmap format is a well known format for representing color images.  Check the ppm(5) man page for
       complete information.

       The data always begin with a text header composed of the two characters "P6", the number of columns, the number
       of rows, and the maximal value of a color component (usually 255).  All numbers are expressed in decimal ASCII.
       These four items are separated by blank characters (space, tab, carriage return, or linefeed) or by comment
       lines introduced by character "#".  The last number is followed by exactly one character which usually is a
       linefeed character.

       The rest of the file encodes all the pixels.  Each pixel is represented by three bytes giving the red,
       green, and blue components of the pixel.  Pixels are ordered left to right, top to bottom.
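
       A background block with a maximal component value of 255 could be produced as sketched below.  The
       function name encode_ppm is hypothetical.

```python
def encode_ppm(width, height, pixels):
    """Encode pixels as a binary PPM ("P6") block (hypothetical helper).

    pixels -- iterable of (red, green, blue) tuples with components in
              0..255, ordered left to right, top to bottom
    """
    pixels = list(pixels)
    if len(pixels) != width * height:
        raise ValueError("expected width * height pixels")
    # Header: magic, columns, rows, maximal component value, one linefeed.
    out = bytearray(b"P6 %d %d 255\n" % (width, height))
    for red, green, blue in pixels:
        out += bytes((red, green, blue))
    return bytes(out)
```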


   Comments in separated files
       Each  page  is followed by an arbitrary number of comment lines starting with character "#" and terminated by a
       linefeed character.  Comment lines whose first word starts with a capital letter  have  special  meanings.  The
       following constructs are currently defined:

       *  # T px:py dx:dy wxh+x+y (string)
          This construct indicates that the piece of text string must be associated with an area of size wxh at
          position x,y relative to the lower left corner of the page.  The string is UTF-8 encoded.  Special
          characters can be escaped as in PostScript using the backslash character.  Integers px and py represent
          the position of the current point on the text baseline before the text was drawn.  The drawing operation
          then moves the current point by dx and dy pixels.  When such comments are present, csepdjvu produces a
          hidden text layer for the corresponding pages.

       *  # L wxh+x+y (url)
          This construct indicates that a hyperlink to URL url should be associated with an area of size wxh at
          position x,y.  When such comments are present, csepdjvu produces pages with an annotation chunk containing
          the specified hyperlinks.

       *  # B count (string) (#pageno)
          This construct provides outline information for the document.  An outline entry entitled string is
          associated with page pageno.  Integer count indicates how many of the following outline entries must be
          attached to the current entry as subentries.  When such comments are present in the first page, csepdjvu
          produces a navigation chunk with the specified outline.
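
       The three constructs can be generated with simple string formatting, as in the sketch below.  All function
       names are hypothetical.  The escaping shown (backslash before backslashes and parentheses, \n and \r for
       line breaks) is an assumption based on PostScript string syntax, and coordinates are assumed to be
       non-negative.

```python
def ps_escape(text):
    """Escape a string in the PostScript manner (assumed subset)."""
    text = text.replace("\\", "\\\\")
    text = text.replace("(", "\\(").replace(")", "\\)")
    # A raw linefeed would terminate the comment line, so escape it.
    return text.replace("\n", "\\n").replace("\r", "\\r")


def text_comment(px, py, dx, dy, w, h, x, y, string):
    """Build a '# T' comment: text string over the area wxh+x+y."""
    line = "# T %d:%d %d:%d %dx%d+%d+%d (%s)\n" % (
        px, py, dx, dy, w, h, x, y, ps_escape(string))
    return line.encode("utf-8")


def link_comment(w, h, x, y, url):
    """Build a '# L' comment: hyperlink over the area wxh+x+y."""
    line = "# L %dx%d+%d+%d (%s)\n" % (w, h, x, y, ps_escape(url))
    return line.encode("utf-8")


def outline_comment(count, title, pageno):
    """Build a '# B' comment: outline entry with 'count' subentries."""
    line = "# B %d (%s) (#%d)\n" % (count, ps_escape(title), pageno)
    return line.encode("utf-8")
```

       For example, an outline with one chapter on page 1 containing one section on page 2 would be written as
       outline_comment(1, "Chapter", 1) followed by outline_comment(0, "Section", 2).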


CREDITS
       This  program  was  initially  written  by  Leon  Bottou <leonbATusers.net> and was improved by Bill
       Riemers <docbillATsourceforge.net> and many others.


SEE ALSO
       djvu(1), ppm(5), c44(1)



DjVuLibre-3.5                     10/11/2001                       CSEPDJVU(1)