DJVUSED(1)                       DjVuLibre-3.5                      DJVUSED(1)



NAME
       djvused - Multi-purpose DjVu document editor.


SYNOPSIS
       djvused [options] djvufile



DESCRIPTION
       Program  djvused  is  a  powerful  command line tool for manipulating multi-page documents, creating or editing
       annotation chunks, creating or editing hidden text layers, pre-computing thumbnail images, and more.  The  pro-
       gram first reads the DjVu document djvufile and executes a number of djvused commands.

       Djvused  commands  can  be  read from a specific file (when option -f is specified), read from the command line
       (when option -e is specified), or read from the standard input (the default).


OPTIONS
       -v     Cause djvused to print a command line prompt before reading commands and a brief message describing  how
              each command was executed.  This option is very useful for debugging djvused scripts and also for inter-
              actively entering djvused commands on the standard input.

       -f scriptfile
              Cause djvused to read commands from file scriptfile.

       -e command
              Cause djvused to execute the commands specified by the option argument command.  It is advisable to
              surround the djvused commands with single quotes in order to prevent unwanted shell expansion.

       -s     Cause djvused to save the file djvufile after executing the specified commands.  This is similar to exe-
              cuting command save immediately before terminating the program.

       -u     Cause djvused to print hidden text and annotations as UTF-8 instead  of  encoding  non-ASCII  characters
              with  octal  escape sequences for maximal portability. This option is convenient for manually editing or
              viewing the djvused output.  This option also causes the emission of a UTF-8 BOM under Windows.

       -n     Cause djvused to disregard save commands.  This is useful for debugging djvused  scripts  without  over-
              writing files on your disk.


DJVUSED EXAMPLES
       There  are  many  ways to use program djvused.  The following examples illustrate some common uses of this pro-
       gram.


   Obtaining the size of a page
       Command size outputs the width and height of the selected pages using an HTML-friendly syntax.  For instance,
       the following command prints the size of page 3 of document myfile.djvu.

          djvused myfile.djvu -e 'select 3; size'



   Extracting the hidden text
       Command  print-pure-txt  outputs  the  text  associated with a page or a document.  For instance, the following
       shell command outputs the text for the entire document.  Lines and pages are delimited  by  the  usual  control
       characters.

          djvused myfile.djvu -e 'print-pure-txt'
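
       As a rough sketch of how a program might consume this output, the following Python function splits the text
       into pages and lines using the form feed and newline delimiters.  The function name split_pure_txt is
       illustrative.  The sketch simply discards the finer-grained separators (vertical tab, group separator, unit
       separator) listed under command print-pure-txt below, and it assumes that a form feed may also follow the
       last page.

```python
def split_pure_txt(text):
    """Split print-pure-txt output into pages, each a list of lines."""
    pages = text.split("\f")
    # Assumption: a form feed may also terminate the last page, which
    # would leave one empty trailing chunk.
    if pages and pages[-1] == "":
        pages.pop()
    # Discard column (VT), region (GS), and paragraph (US) separators.
    drop = str.maketrans("", "", "\x0b\x1d\x1f")
    result = []
    for page in pages:
        # split("\n") is used instead of splitlines(), which would also
        # break on other control characters.
        lines = page.translate(drop).split("\n")
        if lines and lines[-1] == "":
            lines.pop()
        result.append(lines)
    return result
```

       The input would typically be the captured standard output of the command shown above.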

       Command  print-txt  produces  a  more  extensive  output  describing the structure and the location of the text
       components.  The syntax of this output is described later in this man page.  For instance, the following  shell
       command outputs extended text information for page 3 of document myfile.djvu.

          djvused myfile.djvu -e 'select 3; print-txt'



   Extracting the annotations
       Annotation data can be extracted using command print-ant.  The syntax of the annotation data is described later
       in this man page.  For instance, the following shell command outputs the annotation data for the first page  of
       document myfile.djvu.

          djvused myfile.djvu -e 'select 1; print-ant'

       Command  print-ant only prints the annotations stored in the selected component file.  Command print-merged-ant
       also retrieves annotations from all the component files referenced by the current page (using INCL chunks)  and
       prints the merged information.


   Dumping/restoring annotations and text
       Three  commands,  output-txt, output-ant, and output-all, produce djvused scripts.  For instance, the following
       shell command produces a djvused script, myfile.dsed, that recreates all the text and annotation data in  docu-
       ment myfile.djvu.

          djvused myfile.djvu -e 'output-all' > myfile.dsed

       Script  myfile.dsed  is  a text file that can be easily edited.  The following shell command then recreates the
       text and annotation information in file myfile.djvu.

          djvused myfile.djvu -f myfile.dsed -s
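
       Such a script can also be generated from scratch.  The following Python sketch assumes the layout implied by
       the descriptions of commands select and set-txt below: one command per line, with the hidden text data
       terminated by a line containing a single period.  The function name set_txt_script and its argument are
       illustrative, and the output has not been compared against a script produced by output-txt.

```python
def set_txt_script(page_texts):
    """Return a djvused script that sets the hidden text of several pages.

    page_texts maps a page number to a hidden text expression such as
    '(page 0 0 100 100 "hello")'.
    """
    lines = []
    for pageno in sorted(page_texts):
        lines.append("select %d" % pageno)   # select the page by number
        lines.append("set-txt")              # data follows on the next lines
        lines.append(page_texts[pageno])
        lines.append(".")                    # a single period ends the data
    return "\n".join(lines) + "\n"
```

       The returned text can be written to a file such as myfile.dsed and applied with option -f as shown above.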


   Extracting a page
       Both commands save-page and save-page-with create a DjVu file representing the selected  component  file  of  a
       document.   The  following  shell  command, for instance, creates a file p05.djvu containing page 5 of document
       myfile.djvu.

          djvused myfile.djvu -e 'select 5; save-page p05.djvu'

       Each page of a document might import data from another component file using the so-called inclusion  (  INCL  )
       chunks.   Command  save-page  then  produces  a  file with unresolved references to imported data.  Such a file
       should then be made part of a multi-page document containing the required data in other  component  files.   On
       the  other  hand,  command  save-page-with  copies  all  the  imported data into the output file.  This file is
       directly usable. Yet collecting several such files into a multi-page document might lead to useless data repli-
       cation.


   Pre-computing thumbnails
       Command set-thumbnails constructs thumbnails that DjVu viewers can later display.  The following shell
       command, for instance, computes thumbnails of size 64x64 pixels for all pages of file myfile.djvu.

          djvused myfile.djvu -e 'set-thumbnails 64' -s


DJVUSED COMMANDS
       Command lines might contain zero, one, or more djvused commands and an optional comment.  Multiple djvused com-
       mands  must be separated by a semicolon character ';'.  Comments are introduced by the '#' character and extend
       until the end of the command line.
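
       When djvused is driven from another program, the command line can be assembled as an argument list.  The
       following Python sketch joins the commands with semicolons and passes them through option -e.  The helper
       name djvused_argv and its parameters are illustrative and not part of DjVuLibre.  Because the list is meant
       to be handed to the operating system without going through a shell, the single quotes recommended for option
       -e are not needed.

```python
def djvused_argv(djvufile, commands, save=False, dry_run=False):
    """Build an argument list for running djvused (hypothetical helper)."""
    # Multiple djvused commands are separated by semicolons.
    argv = ["djvused", djvufile, "-e", "; ".join(commands)]
    if save:
        argv.append("-s")    # save djvufile after executing the commands
    if dry_run:
        argv.append("-n")    # disregard save commands
    return argv
```

       The result can be passed to subprocess.run, for instance djvused_argv("myfile.djvu", ["select 3", "size"]).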


   Selection commands
       Multi-page DjVu documents are composed of a number of component files.  Most component files  describe  a  spe-
       cific  page  of  a  document.   Some component files contain information shared by several pages such as shared
       image data, shared annotations or thumbnails.  Many djvused commands operate on selected component files.   All
       component files are initially selected.  The following commands are useful for changing the selection.

       n      Print the total number of pages in the document.

       ls     List all component files in the document.  Each line contains an optional page number, a letter
              describing the component file type, the size of the component file, and the identifier of the
              component file.  Component file type letters P, I, A, and T respectively stand for page data, shared
              image data, shared annotation data, and thumbnail data.  Page numbers are only listed for component
              files containing page data.  When it is set, the optional page title (see command set-page-title
              below) is displayed after the component file identifier.

       select [fileid]
              Select the component file identified by argument fileid.  Argument fileid must be either a  page  number
              or a component file identifier.  The select command selects all component files when the argument fileid
              is omitted.

       select-shared-ant
              Select a component file containing shared annotations.  Only one such component file is supported by the
              current  DjVu  software.  This component file usually contains annotations pertaining to the whole docu-
              ment as opposed to specific pages.  An error message is displayed if there is no such component file.

       create-shared-ant
              Create and select a component file containing shared annotations.  This command only selects the  shared
              annotation  component  file  if such a component file already exists.  Otherwise it creates a new shared
              annotation component file and makes sure that it is imported by all pages in the document.

       showsel
              Shows the currently selected component files with the same format as command ls.


   Text and annotation commands
       print-pure-txt
              Print the text stored in the hidden text layer of the selected pages.  A similar capability  is  offered
              by  program  djvutxt.  Structural information is sometimes represented by control characters.  Text from
              different pages is delimited by form feed characters ("\f").  Lines are delimited by newline  characters
              ("\n").   Columns, regions, and paragraphs are sometimes delimited by vertical tab ("\013"), group sepa-
              rators ("\035") and unit separators ("\037") respectively.

       print-txt
              Prints extensive hidden text information for the selected pages.  This information describes the  struc-
              ture of the text on the document page and locates the structural elements in the page image.  The syntax
              of this output is described later in this man page.

       remove-txt
              Remove the hidden text information from the selected component files.  For instance, executing  commands
              select and remove-txt removes all hidden text information from the DjVu document.

       set-txt [djvusedtxtfile]
              Insert  hidden  text  information into the selected pages.  The optional argument djvusedtxtfile names a
              file containing the hidden text information.  This file must contain data similar to what is produced by
              command print-txt.  When the optional argument is omitted, the program reads the hidden text information
              from the djvused script until reaching an end-of-file or a line containing a single period.

       output-txt
              Prints a djvused script that reconstructs the hidden text information  for  the  selected  pages.   This
              script can later be edited and executed by invoking program djvused with option -f.

       print-ant
              Prints  the annotations of the selected component file.  The annotation data is represented using a sim-
              ple syntax described later in this document.

       print-merged-ant
              Merge the annotations stored in the selected component files with the annotations imported from other
              component files such as the shared annotation component file.  The annotation data is represented
              using a simple syntax described later in this document.

       remove-ant
              Remove the annotation information from the selected component files.  For instance,  executing  commands
              select and remove-ant removes all annotation information from the DjVu document.

       set-ant [djvusedantfile]
              Insert  annotations into the selected component file.  The optional argument djvusedantfile names a file
              containing the annotation data.  This file must contain data similar to  what  is  produced  by  command
              print-ant.   When  the  optional  argument  is  omitted,  the program reads the annotation data from the
              djvused script itself until reaching an end-of-file or a line containing a single period.

       output-ant
              Print a djvused script that reconstructs the annotation information for the selected pages.  This script
              can later be edited and executed by invoking program djvused with option -f.

       print-meta
              Print the meta-data part of the annotations for the selected component file.  This command displays a
              subset of the information printed by command print-ant using a different syntax.  Meta-data are
              organized as key-value pairs.  Each printed line contains the key name, such as author or title,
              followed by a tab character ("\t") and a double-quoted string representing the UTF-8 encoded
              meta-data value.
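
              The line format described above is simple to read back.  The following Python sketch turns
              print-meta output into a dictionary, decoding backslash escapes according to the string rules given
              later in this page.  The names unquote and parse_print_meta are illustrative.

```python
import re

# Single-character escapes from the Strings syntax of this page.
_SIMPLE = {b"a": 7, b"b": 8, b"t": 9, b"n": 10, b"v": 11, b"f": 12,
           b"r": 13, b"\\": 92, b'"': 34}
_ESCAPE = re.compile(rb"\\([0-7]{1,3}|.)", re.S)

def unquote(quoted):
    """Decode a double-quoted djvused string into Python text."""
    if len(quoted) < 2 or not (quoted.startswith('"') and quoted.endswith('"')):
        raise ValueError("not a quoted string: %r" % quoted)

    def decode_escape(match):
        esc = match.group(1)
        if esc in _SIMPLE:
            return bytes([_SIMPLE[esc]])
        return bytes([int(esc, 8)])    # ValueError on an illegal escape

    # Work on bytes so that octal escapes of UTF-8 sequences recombine.
    raw = _ESCAPE.sub(decode_escape, quoted[1:-1].encode("utf-8"))
    return raw.decode("utf-8")

def parse_print_meta(output):
    """Parse 'key<TAB>"value"' lines into a dict."""
    meta = {}
    for line in output.split("\n"):
        if not line.strip():
            continue
        key, sep, value = line.partition("\t")
        if not sep:
            raise ValueError("missing tab in line: %r" % line)
        meta[key.strip()] = unquote(value.strip())
    return meta
```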

       remove-meta
              Remove the meta-data part of the annotations of the selected component files.

       set-meta [djvusedmetafile]
              Set the meta-data part of the annotations of the selected component file.  The remaining part of the
              annotations is left unchanged.  The optional argument djvusedmetafile names a file containing the
              meta-data.  This file must contain data similar to what is produced by command print-meta.  When the
              optional argument is omitted, the program reads the meta-data from the djvused script itself until
              reaching an end-of-file or a line containing a single period.

       print-xmp
              Print the XMP metadata string contained in the annotation chunk of the selected component file.  This
              command in fact displays a subset of the information printed by command print-ant.

       remove-xmp
              Removes the XMP tag from the annotation chunk of the selected component file.

       set-xmp [xmpfile]
              Set  the XMP metadata part of the annotations of the selected component file.  The remaining part of the
              annotations is left unchanged.  The optional argument xmpfile names a file containing the  XMP  metadata
              in  a  format similar to that produced by command print-xmp.  When the optional argument is omitted, the
              program reads the XMP annotation data from the djvused script itself until reaching an end-of-file or  a
              line containing a single period.

       output-all
              Print  a  djvused  script  that reconstructs both the hidden text and the annotation information for the
              selected pages.  This script can later be edited and executed by invoking program  djvused  with  option
              -f.


   Outline/bookmarks commands
       print-outline
              Print the outline of the document.  Nothing is printed if the document contains no outline.

       remove-outline
              Removes the outline from the document.

       set-outline [djvusedoutlinefile]
              Insert outline information into the document.  The optional argument djvusedoutlinefile names a file
              containing the outline information.  This file must contain data similar to what is produced by command
              print-outline.  When the optional argument is omitted, the program reads the outline information
              from the djvused script until reaching an end-of-file or a line containing a single period.


   Thumbnail commands
       set-thumbnails sz
              Compute thumbnails of size sz x sz pixels and insert them into the document.  DjVu viewers can later
              display these thumbnails very efficiently without needing to download the data for each page.
              Typical thumbnail sizes range from 48 to 128 pixels.

       remove-thumbnails
              Remove the pre-computed thumbnails from the DjVu document.  New thumbnails can then  be  computed  using
              command set-thumbnails.


   Save commands
       The  above commands only modify the memory image of the DjVu document.  The following commands provide means to
       save the modified data into the file system.

       save   Save the modified DjVu document back into the input file djvufile specified by the arguments of the
              program djvused.  Nothing is done if the DjVu file was not modified.  Passing option -s to program
              djvused is equivalent to executing command save before exiting the program.

       save-bundled filename
              Save the current DjVu document as a bundled multi-page DjVu document named filename.  A similar capabil-
              ity is offered by program djvmcvt.

       save-indirect filename
              Save  the current DjVu document as an indirect multi-page DjVu document.  The index file of the indirect
              document will be named filename.  All other files composing the indirect document will be saved into the
              same directory as the index file.  A similar capability is offered by program djvmcvt.

       save-page filename
              Save the selected component file into DjVu file filename.  The selected component file might import data
              from another component file using the so-called inclusion ( INCL ) chunks.  This command then produces a
              file  with unresolved references to imported data.  Such a file should then be made part of a multi-page
              document containing the required data in other component files.

       save-page-with filename
              Save the selected component file into DjVu file filename.  All data imported from other component  files
              is  copied into the output file as well.  This command always produces a usable DjVu file.  On the other
              hand, collecting several such files into a multi-page document might lead to useless data replication.


   Miscellaneous commands
       help   Display a help message listing all commands supported by djvused.

       dump   Display the EA IFF 85 structure of the document or of the selected component file.  A similar capability
              is offered by program djvudump.

       size   Display the width and the height of the selected pages.  The dimensions of each page are displayed using
              a syntax suitable for direct insertion into the <EMBED...></EMBED> tags.

       set-page-title title
              Sets a page title for the selected page.  When page titles are available, recent versions of the DjVuLi-
              bre  viewers  display  these  page titles instead of page numbers and also accept them in page selection
              options.  Command ls can be used to see both the page titles and page  identifiers.   To  unset  a  page
              title, simply make it equal to the page identifier.


DJVUSED FILE FORMATS
       Djvused uses a simple parenthesized syntax to represent both annotations and hidden text.

       *  This  syntax  is  the native syntax used by DjVu for storing annotations.  Program djvused simply compresses
          the annotation data using the bzz(1) algorithm.

       *  This syntax differs from the native syntax used by DjVu for storing the hidden text.  Program  djvused  per-
          forms  the  translations  between  the  compact binary representation used by DjVu and the easily modifiable
          parenthesized syntax.



   General syntax
       Djvused files are ASCII text files.  The legal characters in djvused files are the printable  ASCII  characters
       and the space, tab, cr, and nl characters.  Using other characters has undefined results.

       Djvused  files are composed of a sequence of expressions separated by blank characters (space, tab, cr, or nl).
       There are four kinds of expressions, namely integers, symbols, strings, and lists.

       Integers:
              Integer numbers are represented by one or more digits, with the usual interpretation.

       Symbols:
              Symbols, or identifiers, are sequences of printable ascii characters representing a name or  a  keyword.
              Acceptable characters are the alpha-numeric characters, the underscore "_", the minus character "-", and
              the hash character "#".  Names should not begin with a digit or a minus character.

       Strings:
              Strings denote an arbitrary sequence of bytes, usually interpreted as a sequence of UTF-8 encoded
              characters.  Strings in djvused files are similar to strings in the C language.  They are surrounded
              by double quote characters.  Certain sequences of characters starting with a backslash ("\") have a
              special meaning.  A backslash followed by letter "a", "b", "t", "n", "v", "f", or "r", by a
              backslash, or by a double quote stands for the ASCII character BEL(007), BS(010), HT(011), LF(012),
              VT(013), FF(014), CR(015), BACKSLASH(134), or DOUBLEQUOTE(042) respectively, where the codes are
              given in octal.  A backslash followed by one to three octal digits stands for the byte whose octal
              code is expressed by the digits.  All other backslash sequences are illegal.  All non-printable
              ASCII characters must be escaped.

       Lists: Lists are sequences of expressions separated by blanks and surrounded by parentheses.  All expression
              types are acceptable within a list, including sub-lists.
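
       The four expression kinds map naturally onto a small recursive data structure.  The following Python sketch
       reads djvused text into nested lists of integers, symbols, and decoded strings.  It is an illustration of
       the syntax described above, not the parser used by djvused.  The names Symbol and parse are illustrative,
       and only unsigned integers are recognized, as stated above.

```python
import re

class Symbol(str):
    """A djvused symbol, kept distinct from a decoded string."""

_TOKEN = re.compile(r'''
    (?P<open>\() | (?P<close>\)) |
    (?P<string>"(?:\\.|[^"\\])*") |
    (?P<atom>[^\s()"]+)
''', re.X | re.S)
_SIMPLE = {b"a": 7, b"b": 8, b"t": 9, b"n": 10, b"v": 11, b"f": 12,
           b"r": 13, b"\\": 92, b'"': 34}
_ESCAPE = re.compile(rb"\\([0-7]{1,3}|.)", re.S)

def _decode(token):
    """Strip the quotes of a string token and resolve its escapes."""
    def repl(match):
        esc = match.group(1)
        return bytes([_SIMPLE[esc] if esc in _SIMPLE else int(esc, 8)])
    raw = _ESCAPE.sub(repl, token[1:-1].encode("utf-8"))
    return raw.decode("utf-8", errors="replace")

def _tokens(text):
    pos = 0
    while True:
        while pos < len(text) and text[pos] in " \t\r\n":
            pos += 1                      # skip blanks between expressions
        if pos == len(text):
            return
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError("bad input at offset %d" % pos)
        yield match.lastgroup, match.group()
        pos = match.end()

def parse(text):
    """Return the list of top-level expressions found in text."""
    stack = [[]]
    for kind, token in _tokens(text):
        if kind == "open":
            stack.append([])
        elif kind == "close":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        elif kind == "string":
            stack[-1].append(_decode(token))
        elif token.isascii() and token.isdigit():
            stack[-1].append(int(token))
        else:
            stack[-1].append(Symbol(token))
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]
```

       Symbols are returned as a str subclass so that a caller can tell the symbol page from the string "page".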


   Hidden text syntax
       The building blocks of the hidden text syntax are lists representing each structural component  of  the  hidden
       text.  Structural components have the following form:

          (type xmin ymin xmax ymax ... )


       The symbol type must be one of page, column, region, para, line, word, or char, listed here by decreasing order
       of importance.  The integers xmin, ymin, xmax, and ymax represent the coordinates of a rectangle indicating the
       position  of the structural component in the page.  Coordinates are measured in pixels and have their origin at
       the bottom left corner of the page.  The remaining expressions in the list are either a single string
       representing the encoded text associated with this structural component, or a sequence of structural
       components with a lesser type.

       The hidden text for each page is simply represented by a single structural element of type page.  Various
       levels of structural information are acceptable.  For instance, the page level component might only specify
       a page level string, or might only provide a list of lines, or might provide a full hierarchy down to the
       individual characters.
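
       To illustrate the nesting, the following Python sketch formats structural components as text.  The helper
       names quote and text_zone are illustrative.  The sketch does not check that nested components have a lesser
       type than their parent, and the coordinates in the usage note are made up.

```python
def quote(text):
    """Return text as a djvused string with the required escapes."""
    out = ""
    for byte in text.encode("utf-8"):
        if byte in (0x22, 0x5C):           # double quote, backslash
            out += "\\" + chr(byte)
        elif 0x20 <= byte < 0x7F:          # printable ASCII
            out += chr(byte)
        else:
            out += "\\%03o" % byte         # any other byte as octal
    return '"' + out + '"'

def text_zone(kind, box, content):
    """Format one structural component of the hidden text.

    kind is one of page, column, region, para, line, word, or char.
    box is (xmin, ymin, xmax, ymax) in pixels from the bottom left corner.
    content is either a string or a list of formatted sub-components.
    """
    if isinstance(content, str):
        body = quote(content)
    else:
        body = " ".join(content)
    xmin, ymin, xmax, ymax = box
    return "(%s %d %d %d %d %s)" % (kind, xmin, ymin, xmax, ymax, body)
```

       For instance, text_zone("page", (0, 0, 100, 100), [text_zone("line", (10, 80, 90, 95), "Hello")]) yields a
       page component that contains a single line.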


   Outline/Bookmark syntax
       The outline syntax is a single list of the form

          (bookmarks ...)

       The first element of the list is symbol bookmarks.  The subsequent elements are lists representing the toplevel
       outline entries.  Each outline entry is represented by a list with the following form:

          (title url ... )

       The string title is the title of the outline entry.  The destination string url can be either an arbitrary per-
       cent encoded URL, or composed of the hash character ("#") followed by a page name or number, or composed of the
       question mark character ("?")  followed by cgi-style arguments interpreted by the djvu viewer.   The  remaining
       expressions in the list describe subentries of this outline entry.
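
       As an illustration, the following Python sketch builds an outline expression from nested tuples of the form
       (title, url) or (title, url, subentries).  The function names are illustrative, and the titles and page
       numbers used in the usage note are made up.

```python
def quote(text):
    """Return text as a djvused string with the required escapes."""
    out = ""
    for byte in text.encode("utf-8"):
        if byte in (0x22, 0x5C):           # double quote, backslash
            out += "\\" + chr(byte)
        elif 0x20 <= byte < 0x7F:          # printable ASCII
            out += chr(byte)
        else:
            out += "\\%03o" % byte         # any other byte as octal
    return '"' + out + '"'

def outline_entry(title, url, children=()):
    """Format one outline entry followed by its subentries."""
    parts = [quote(title), quote(url)]
    parts.extend(outline_entry(*child) for child in children)
    return "(" + " ".join(parts) + ")"

def outline(entries):
    """Format the complete (bookmarks ...) list."""
    parts = ["bookmarks"]
    parts.extend(outline_entry(*entry) for entry in entries)
    return "(" + " ".join(parts) + ")"
```

       For instance, outline([("Preface", "#1"), ("Chapter 1", "#5", [("Section 1.1", "#7")])]) produces text that
       can be saved to a file and given to command set-outline.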


   Annotation syntax
       Annotations  are represented by a sequence of annotation expressions.  The following annotation expressions are
       recognized:

       (background color)
              Specify the color of the viewer area surrounding the DjVu image.  Colors are represented  with  the  X11
              hexadecimal syntax #RRGGBB.  For instance, #000000 is black and #FFFFFF is white.

       (zoom zoomvalue)
              Specify the initial zoom factor of the image.  Argument zoomvalue can be one of stretch, one2one, width,
              page, or composed of the letter d followed by a number in range 1 to  999  representing  a  zoom  factor
              (such as in d300 or d150 for instance.)

       (mode modevalue)
              Specify the initial display mode of the image.  Argument modevalue is one of color, bw, fore, or back.

       (align horzalign vertalign)
              Specify  how  the image should be aligned on the viewer surface.  By default the image is located in the
              center.  Argument horzalign can be one of left, center, or right.  Argument vertalign can be one of top,
              center, or bottom.

       (maparea url comment area ...)
              Define a hyper-link for the specified destination.

              Argument url can have one of the following forms:

                 href
                 (url href target)

              where href is a string representing the destination and target is a string representing the target frame
              for the hyper-link, as defined by the HTML anchor tag <A>.  The destination string href can be either an
              arbitrary  percent  encoded URL, or composed of the hash character ("#") followed by a page name or num-
              ber, or composed of the question mark character ("?")  followed by cgi-style  arguments  interpreted  by
              the  djvu  viewer.  Page numbers may be prefixed with an optional sign to represent a page displacement.
              For instance the strings "#-1" and "#+1" can be used to access the previous page and the next page.

              Argument comment is a string that might be displayed by the viewer when the user moves  the  mouse  over
              the hyper-link.

              Argument area defines the shape and the location of the hyperlink.  The following forms are recognized:

                 (rect xmin ymin width height)
                 (oval xmin ymin width height)
                 (poly x0 y0 x1 y1 ... )
                 (text xmin ymin width height)
                 (line x0 y0 x1 y1)

              All  parameters are numbers representing coordinates.  Coordinates are measured in pixels and have their
              origin at the bottom left corner of the page.

              The remaining expressions in the maparea list represent the visual effect  associated  with  the  hyper-
              link.

              A first set of options defines how borders are drawn for rect, oval, poly, or text hyperlink areas.

                 (none)
                 (xor)
                 (border color)
                 (shadow_in [thickness])
                 (shadow_out [thickness])
                 (shadow_ein [thickness])
                 (shadow_eout [thickness])

              where  parameter  color  has syntax #RRGGBB as described above, and parameter thickness is an integer in
              range 1 to 32.  The last four border options are only supported for rect hyperlink areas.   The  default
              border is a simple black line.  Border options do not apply to line areas.

              When  a  border  option  is specified, the border becomes visible when the user moves the mouse over the
              hyperlink. The border may be made always visible by using the following option:

                 (border_avis)

              The following two options may be used with rect hyperlink areas.  The complete area will be  highlighted
              using the specified color at the specified opacity (0-100, default 50).

                 (hilite color)
                 (opacity op)

              This is often used with an empty URL for simply emphasizing a specific segment of an image.

              The  following  three  options may be used with line areas to specify an optional ending arrow, the line
              width and color.  The default is a black line with width 1 and without arrow.

                 (arrow)
                 (width w)
                 (lineclr color)

              Finally the following three options can be used with text areas.  The default background color is trans-
              parent.  The default text color is black.  The pushpin option indicates that the text is symbolized by a
              small pushpin icon.  Clicking the icon reveals the text.

                 (backclr bkcolor)
                 (textclr txtcolor)
                 (pushpin)
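
              To show how these pieces combine, the following Python sketch assembles a maparea expression from a
              destination, a comment, an area, and any number of effect options.  The helper names are
              illustrative, only the simple href form of the url argument is covered, and the coordinates and
              colors in the usage note are made up.

```python
def quote(text):
    """Return text as a djvused string with the required escapes."""
    out = ""
    for byte in text.encode("utf-8"):
        if byte in (0x22, 0x5C):           # double quote, backslash
            out += "\\" + chr(byte)
        elif 0x20 <= byte < 0x7F:          # printable ASCII
            out += chr(byte)
        else:
            out += "\\%03o" % byte         # any other byte as octal
    return '"' + out + '"'

def rect(xmin, ymin, width, height):
    """Format a rectangular hyperlink area."""
    return "(rect %d %d %d %d)" % (xmin, ymin, width, height)

def maparea(href, comment, area, *effects):
    """Format a maparea annotation.

    area and effects are already formatted expressions such as
    rect(...) or "(border_avis)".
    """
    parts = ["maparea", quote(href), quote(comment), area]
    parts.extend(effects)
    return "(" + " ".join(parts) + ")"
```

              For instance, maparea("#+1", "Next page", rect(100, 200, 300, 40), "(border #FF0000)",
              "(border_avis)") describes a link to the next page with an always visible red border, and
              maparea("", "", rect(0, 0, 50, 50), "(hilite #FFFF00)", "(opacity 30)") highlights a region without
              linking anywhere.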



       (metadata ... (key value) ... )
              Define meta-data entries.  Each entry is identified by a symbol key representing the nature of the
              meta-data entry.  The string value represents the value associated with the corresponding key.  Two
              sets of keys are noteworthy: keys borrowed from the BibTeX bibliography system, and keys borrowed
              from the PDF DocInfo metadata.  BibTeX keys are always expressed in lowercase, such as year,
              booktitle, editor, author, etc.  DocInfo keys start with an uppercase letter, such as Title, Author,
              Subject, Creator, Producer, Trapped, CreationDate, and ModDate.  The values associated with the last
              two keys should be dates expressed according to RFC 3339.
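
              As a small illustration, the following Python sketch formats a metadata expression from a
              dictionary and produces an RFC 3339 date for keys such as ModDate.  The function names are
              illustrative, the key check follows the symbol rules given under General syntax, and the sample
              values in the usage note are made up.

```python
import re
from datetime import datetime, timezone

# Symbols: alphanumerics, "_", "-", "#", not starting with a digit or "-".
_KEY = re.compile(r"[A-Za-z_#][A-Za-z0-9_#-]*")

def quote(text):
    """Return text as a djvused string with the required escapes."""
    out = ""
    for byte in text.encode("utf-8"):
        if byte in (0x22, 0x5C):           # double quote, backslash
            out += "\\" + chr(byte)
        elif 0x20 <= byte < 0x7F:          # printable ASCII
            out += chr(byte)
        else:
            out += "\\%03o" % byte         # any other byte as octal
    return '"' + out + '"'

def rfc3339(moment):
    """Format a timezone-aware datetime as an RFC 3339 date."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    return moment.replace(microsecond=0).isoformat()

def metadata(entries):
    """Format a (metadata ...) annotation from a dict of key/value pairs."""
    parts = ["metadata"]
    for key, value in entries.items():
        if not _KEY.fullmatch(key):
            raise ValueError("not a valid symbol: %r" % key)
        parts.append("(%s %s)" % (key, quote(value)))
    return "(" + " ".join(parts) + ")"
```

              For instance, metadata({"Title": "Annual report", "ModDate": rfc3339(datetime(2005, 5, 22, 12, 0,
              tzinfo=timezone.utc))}) yields an expression suitable for command set-ant.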


LIMITATIONS
       The current version of program djvused only supports selecting one  component  file  or  all  component  files.
       There is no way to select only a few component files.


CREDITS
       This program was initially written by Leon Bottou <leonbATusers.net> and was improved by Yann Le Cun
       <profshadokoATusers.net>, Florin Nicsa, Bill Riemers <docbillATsourceforge.net> and many others.


SEE ALSO
       djvu(1), djvutxt(1), djvmcvt(1), djvudump(1), bzz(1), Emacs djvused front end djvu.el in the GNU ELPA repository.



DjVuLibre-3.5                      5/22/2005                        DJVUSED(1)