P5-docx2txt

Jul 20, 2023

Utility to convert Docx documents to equivalent Text documents

docx2txt is a Perl-based command-line utility that converts Microsoft Office™ Docx documents to equivalent text documents. The latest version supports the following features during text extraction:

  • Character conversions for “ ‘ < & > - …, fractions, some mathematical symbols, etc.; currency characters are converted to their respective names, such as Euro.
  • Capitalisation of text blocks.
  • Center and right justification of text that fits within a line (80 columns by default, configurable).
  • Horizontal rulers, line breaks, paragraph separation, and tabs.
  • Indicating hyperlinked text along with the hyperlink (configurable).
  • Handling bullet, decimal, letter, and roman lists, along with an attempt at indentation.
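
To give a rough feel for what this kind of extraction involves, here is a minimal Python sketch. It is not docx2txt's own code (docx2txt is written in Perl) and it covers only a small part of the feature list above: joining the text runs of each paragraph and applying center or right justification within a configurable line width. It relies on the fact that a Docx file is a ZIP archive whose main text lives in word/document.xml. The names docx_to_lines and make_toy_docx are hypothetical helpers invented for this illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_lines(source, width=80):
    """Return one text line per paragraph (hypothetical helper, not docx2txt's API).

    Paragraphs marked as centered or right-aligned are padded with spaces,
    but only when their text fits within `width` columns.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    lines = []
    for para in root.iter(W + "p"):
        # A paragraph's text is split across one or more <w:t> elements.
        text = "".join(node.text or "" for node in para.iter(W + "t"))
        # Alignment, if any, is stored as <w:pPr><w:jc w:val="..."/></w:pPr>.
        jc = para.find(W + "pPr/" + W + "jc")
        align = jc.get(W + "val") if jc is not None else None
        if len(text) <= width:
            if align == "center":
                text = text.center(width).rstrip()
            elif align == "right":
                text = text.rjust(width)
        lines.append(text)
    return lines


def make_toy_docx():
    """Build a tiny in-memory archive holding only word/document.xml.

    This is a toy input for the sketch, not a complete Docx file.
    """
    body = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"><w:body>'
        '<w:p><w:pPr><w:jc w:val="center"/></w:pPr>'
        '<w:r><w:t>Title</w:t></w:r></w:p>'
        '<w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>world.</w:t></w:r></w:p>'
        '<w:p><w:pPr><w:jc w:val="right"/></w:pPr>'
        '<w:r><w:t>Signed</w:t></w:r></w:p>'
        '</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", body)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for line in docx_to_lines(make_toy_docx(), width=40):
        print(line)
```

Character conversions, list numbering, hyperlinks, and the other features listed above are deliberately left out of this sketch.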


Check out these related ports:
  • Zxing-cpp - ZXing C++ Library for QR code recognition
  • Zu-hunspell - Zulu hunspell dictionaries
  • Zu-aspell - Aspell Zulu dictionary
  • Zq - Easier and faster alternative to jq
  • Zorba - General purpose C++ XQuery processor
  • Zenxml - Simple C++ XML Processing
  • Zed - Command-line tool to manage and query Zed data lakes
  • Yq - Command-line YAML and XML processor, jq wrapper for YAML/XML documents
  • Yould - Pronounceable word generator
  • Yodl - Easy to use but powerful document formatting/preparation language
  • Yi-hunspell - Yiddish hunspell dictionaries
  • Yi-aspell - Aspell Yiddish dictionary
  • Yelp-xsl - DocBook XSLT stylesheets for yelp
  • Yelp-tools - Utilities to help manage documentation for Yelp and the web
  • Ydiff - Diff readability enhancer for color terminals