Converting Word document format with PowerShell

Do you have a file server full of documents in the old Word document format? This blog explains how to use PowerShell to bulk convert files from .DOC to .DOCX. The script can be run against a folder full of documents, automatically creating a new version of each in .DOCX format. The same script can easily be modified to convert Word documents of any format to PDF.

The PC running the script must have PowerShell and Microsoft Word installed.

The example script processes all .DOC files in the C:\Olddocuments folder:


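# Folder containing the .doc files to convert
$path = "C:\Olddocuments\"
$word_app = New-Object -ComObject Word.Application

# wdFormatXMLDocument (value 12) is the .docx format
$Format = [Microsoft.Office.Interop.Word.WdSaveFormat]::wdFormatXMLDocument

# -Filter *.doc can also match .docx files, so check the extension explicitly
Get-ChildItem -Path $path -Filter *.doc | Where-Object { $_.Extension -eq '.doc' } | ForEach-Object {
    $document = $word_app.Documents.Open($_.FullName)
    $docx_filename = "$($_.DirectoryName)\$($_.BaseName).docx"
    $document.SaveAs([ref] $docx_filename, [ref]$Format)
    $document.Close()
}
$word_app.Quit()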
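
Before running the conversion against a large folder, it can help to preview which files would be picked up. The following is a minimal sketch that does not start Word; the function name Get-ConversionPlan and its output properties are hypothetical and not part of the original script.

```powershell
# Hypothetical helper (an assumption, not part of the original script):
# lists the .doc files that would be converted and the .docx name each would get.
function Get-ConversionPlan {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Path
    )

    # -Filter *.doc can also match .docx files on Windows,
    # so the extension is checked explicitly afterwards.
    Get-ChildItem -Path $Path -Filter *.doc -File |
        Where-Object { $_.Extension -eq '.doc' } |
        ForEach-Object {
            $target = Join-Path $_.DirectoryName "$($_.BaseName).docx"
            [PSCustomObject]@{
                Source        = $_.FullName
                Target        = $target
                # True when a .docx with the same base name is already present
                AlreadyExists = Test-Path -LiteralPath $target
            }
        }
}
```

Calling Get-ConversionPlan -Path "C:\Olddocuments" returns one object per .DOC file, so the list can be reviewed or filtered on AlreadyExists before any document is opened.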

If you need to convert the documents to PDF, change the output file extension and the “SaveAs” line in the script as shown here. The value 17 corresponds to the PDF file format (wdFormatPDF) when doing a Save As in Microsoft Word.


$pdf_filename = "$($_.DirectoryName)\$($_.BaseName).pdf"
$document.SaveAs([ref] $pdf_filename, [ref]17)

One of the big benefits of converting files is the reduction in size. In a test across several thousand documents I noticed a 40% disk space saving. In addition, DOCX files are less likely to become corrupted, and they support newer Microsoft Word features.
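
To check the saving on your own files, the sizes of each .DOC file and its converted .DOCX counterpart can be compared once the conversion has run. This is a rough sketch only; the function name Get-ConversionSavings and its output properties are hypothetical, and it assumes the converted files sit next to the originals with the same base name.

```powershell
# Hypothetical helper (an assumption, not part of the original script):
# totals the size of every .doc file that has a matching .docx beside it
# and reports the percentage saved.
function Get-ConversionSavings {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Path
    )

    [long]$oldBytes = 0
    [long]$newBytes = 0

    $docs = Get-ChildItem -Path $Path -Filter *.doc -File |
        Where-Object { $_.Extension -eq '.doc' }

    foreach ($doc in $docs) {
        $docx = Join-Path $doc.DirectoryName "$($doc.BaseName).docx"
        # Only count pairs, so unconverted files do not skew the result
        if (Test-Path -LiteralPath $docx) {
            $oldBytes += $doc.Length
            $newBytes += (Get-Item -LiteralPath $docx).Length
        }
    }

    $percent = 0
    if ($oldBytes -gt 0) {
        $percent = [math]::Round(100 * ($oldBytes - $newBytes) / $oldBytes, 1)
    }

    [PSCustomObject]@{
        OldBytes     = $oldBytes
        NewBytes     = $newBytes
        PercentSaved = $percent
    }
}
```

Running Get-ConversionSavings -Path "C:\Olddocuments" after the conversion gives the combined size before and after, along with the percentage saved for that folder.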
