     DOCX2TROFF(1)                                       DOCX2TROFF(1)

     NAME
          docx2troff, docx2txt, word2troff, word2txt - translate
          Microsoft™ Office™ documents

     SYNOPSIS
          docx2troff [ file.docx ]
          docx2txt [ file.docx ]
          opc/word2troff
          opc/word2txt

     DESCRIPTION
          Microsoft's new format for Office documents is a zipped
          directory hierarchy containing XML files. This format is
          known as the ``Open Packaging Conventions'' or OPC.
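
          The package layout can be inspected with any zip reader.
          The following Python sketch is illustrative only and is not
          part of the opc tools; the helper name list_parts is
          hypothetical. It lists the parts stored in a package.

```python
import zipfile

def list_parts(source):
    """Return the sorted part names stored in an OPC package.

    source is a filename or a binary file-like object.
    list_parts is a hypothetical helper, not part of the opc tools.
    """
    with zipfile.ZipFile(source) as z:
        return sorted(z.namelist())
```

          For a Word document the list normally includes
          [Content_Types].xml and word/document.xml.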

          Docx2txt is an rc(1) script that uses fs/zipfs(1) and
          opc/word2txt to extract the printable text from the body of
          a Microsoft Word docx document and write it on the standard
          output. Typically this is then piped through fmt(1) to wrap
          paragraphs.
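
          The kind of extraction described above can be approximated
          in a few lines. This is a minimal Python sketch, not the
          implementation used by opc/word2txt: it assumes the body
          lives in word/document.xml and that text is carried in w:t
          elements inside w:p paragraphs. The name docx_text is
          hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def docx_text(source):
    """Return the body text of a docx, one line per paragraph.

    source is a filename or a binary file-like object.
    docx_text is a hypothetical helper, not the opc/word2txt code.
    """
    with zipfile.ZipFile(source) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    lines = []
    for para in root.iter(W + "p"):
        # Join the text runs (w:t) found inside this paragraph (w:p).
        lines.append("".join(t.text or "" for t in para.iter(W + "t")))
    return "\n".join(lines)
```

          Each paragraph comes out as a single long line, which is
          why a pass through a paragraph filler such as fmt(1) is
          useful afterwards.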

          Docx2troff is similar, but emits troff source corresponding
          to the document. If the document contains tables, additional
          commands are emitted for tbl(1). Opc/word2troff does not
          attempt to produce an exact facsimile of the source layout,
          but rather a reasonable looking troff version of the docu-
          ment.

     SOURCE
          /sys/src/cmd/opc

     SEE ALSO
          xlsx2txt(1)
          libxml(2)
          ``2007 Office Document: Open XML Markup Explained'',
          http://www.microsoft.com/en-us/download/details.aspx?id=15359