In previous articles, readers learned how to fix the issue of the Word icon not displaying on .doc and .docx files. In this article, Mytour will guide you through fixing character encoding corruption in Word.
1. Text characters in Word are corrupted
Users working with plain text files (.txt file extension) may encounter text display issues. This typically happens when the document was written in a language that does not use the Latin alphabet and was saved with an encoding different from the one used to open it.
Character corruption happens when a file is saved with an encoding that differs from the default encoding of the program that later opens it. Many programs default to UTF-8, but many languages also have one or more legacy encodings specific to that language.
Many East Asian legacy encodings (for example Shift_JIS, GBK, or Big5) use two bytes for most characters, so when such a document is opened as single-byte text, or with the wrong encoding such as UTF-8, the text is replaced with garbled symbols.
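The mismatch described above can be reproduced in a few lines. The following is a minimal sketch in Python, not part of Word itself. The sample string and the choice of Shift_JIS and Latin-1 are assumptions made purely for illustration.

```python
# Illustrative only: the sample text and the two encodings are assumptions.
original = "日本語"

# Save the text with a legacy Japanese encoding (two bytes per character here).
raw = original.encode("shift_jis")

# Reading the same bytes with the wrong encoding yields garbled symbols.
garbled = raw.decode("latin-1")

# Reading them with the encoding they were saved in restores the text.
restored = raw.decode("shift_jis")

print(len(raw), "bytes on disk")
print("wrong encoding:", repr(garbled))
print("right encoding:", restored)
```

The bytes on disk never change. Only the encoding used to interpret them decides whether the text is readable.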
There are various ways to fix character encoding errors in Word, including using specialized software for error correction. In this article, Mytour will guide you through some methods to fix character encoding errors in Word.
2. Fixing Character Encoding Errors in Word
Microsoft Word comes with a built-in character encoding converter that can be used to save files in the encoding format of your choice. Follow the steps below to fix the issue.
Note: This method applies to Word 2003 and later versions.
Step 1: Open the document in Microsoft Word.
By default, Windows opens plain text files (files with the .txt extension) with Notepad. To open corrupted or damaged documents in Word, follow the steps below:
Right-click on the corrupted or damaged document file.
Select Open with.
Choose Word.
Step 2: Convert the file from encoded text
The Convert File dialog will automatically open if a corrupted or damaged encoded file is detected. Select Encoded Text from the list of options, then click OK.
If the dialog does not appear, you'll need to enable it manually. Go to File => Options => Advanced and scroll down to the General section. There, check the box next to Confirm file format conversion on open. Close and reopen Word, then open the corrupted or damaged document again. This time, the Convert File dialog will appear on the screen.
Step 3: Choose the correct encoding format
The encoding format selection dialog will automatically suggest the correct encoding format. If not, you can manually select the encoding format from the list.
Choose Auto-Select if you're unsure about the source encoding, or select from the list if you know the language used in the document file. Use the preview window to check whether the text displays correctly with the selected encoding before confirming.
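Outside of Word, the idea of trying encodings until the text becomes readable can be sketched as follows. This is a rough analogy in Python and not a description of how Word's Auto-Select works. The function name guess_decode and the candidate list are hypothetical and chosen only for this example.

```python
def guess_decode(raw: bytes, candidates=("utf-8", "shift_jis", "gbk", "cp1252")):
    """Return (encoding, text) for the first candidate that decodes without error.

    Hypothetical helper: the candidate list is an assumption and should be
    adapted to the languages you expect in the file.
    """
    for enc in candidates:
        try:
            return enc, raw.decode(enc)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding, try the next
    raise ValueError("none of the candidate encodings matched")


# Example: bytes that were saved as Shift_JIS are rejected by UTF-8 first.
encoding, text = guess_decode("日本語".encode("shift_jis"))
print(encoding, text)
```

A decode that succeeds is not proof that the encoding is right, since several encodings can accept the same bytes. As with Word's preview window, the result still has to be checked by eye.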
Step 4: Save the document as readable plain text
After the text has been restored and is readable in Word, it may still appear corrupted or damaged in plain text processing software, because many programs are not written to handle special character encodings. To prevent this, the best approach is to save the document in a commonly used text encoding format, such as UTF-8 or UTF-16.
To do this, click the File tab at the top left corner of the document window and select Save As from the list. Choose the folder to save the document in, select Plain Text as the file format, then click Save.
The File Conversion dialog will appear on the screen. From the list, choose an encoding format for the final document. The preview window marks in red any text that cannot be saved correctly in the chosen encoding. It's best to use a Unicode encoding, as Unicode is designed to accommodate all writing systems worldwide.
Finally, click OK to save your edited document.
From now on, your document will display correctly in plain text processing software, such as Notepad.
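For readers who prefer to script the conversion instead of using the Save As dialog, the same result can be approximated as shown below. This is a minimal sketch under stated assumptions: the function name convert_to_utf8 and the file names are hypothetical, and the source encoding must already be known, for example from the previous step.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str) -> None:
    """Re-save a plain text file as UTF-8 (hypothetical helper).

    source_encoding is an assumption supplied by the caller; if it is wrong,
    read_text raises UnicodeDecodeError or returns garbled text.
    """
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")


# Example usage with placeholder file names:
# convert_to_utf8("report_sjis.txt", "report_utf8.txt", "shift_jis")
```

Writing to a new file instead of overwriting the original keeps the source intact in case the chosen encoding turns out to be wrong. If an older Windows editor fails to recognize the result as UTF-8, passing encoding="utf-8-sig" in the write step adds a byte order mark that such editors use for detection.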
In this article, Mytour has guided you through fixing character encoding errors in Word. If you have any questions that need clarification, for example about a Word file that is encrypted and cannot be opened, please leave a comment below the article.
