Cleaning up text pasted from emails or Web sites

The ease of copying and pasting text from Web sites and email greatly simplifies many tasks in Word, but problems often arise in making the pasted text conform to the style of the document into which it is pasted. One of the most common chores is getting rid of excess line breaks, which cause the text to wrap short of the right margin. There are several ways to work around this problem.

Assessing the problem text

The most efficient method of reformatting short lines of text depends on whether the breaks are line breaks or paragraph breaks. So the first line of attack must be to display nonprinting characters:

  • Word 2003 and earlier: Click the Show/Hide button on the Standard toolbar.

  • Word 2007 and above: Click the Show/Hide button in the Paragraph group on the Home tab.

  • All versions: Press Ctrl+* (Ctrl+Shift+8 on U.S. keyboards).

For more on nonprinting characters, see “What do all those funny marks, like the dots between the words in my document, and the square bullets in the left margin, mean?”

Figure 1. Lines ending in paragraph breaks

If each line ends in a pilcrow or paragraph mark (¶), as shown in Figure 1 above, then AutoFormat may be all you need. If each line ends in a bent arrow (↵, signifying a line break), as shown in Figure 2 below, you will need to use a different approach.

Figure 2. Lines ending in line breaks

Using AutoFormat

Access AutoFormat settings in various Word versions as follows:

  • Word 2000 and earlier: Tools | AutoCorrect | AutoFormat

  • Word 2002 and 2003: Tools | AutoCorrect Options | AutoFormat

  • Word 2007: Office Button | Word Options | Proofing | AutoCorrect Options | AutoFormat

  • Word 2010 and 2013: File | Options | Proofing | AutoCorrect Options | AutoFormat

No matter what other AutoFormat options you have enabled here, when you select a block of text with a paragraph break at the end of each full line, AutoFormat will delete all the paragraph breaks but the last (in some cases, it will be even smarter and will determine where the “real” paragraph breaks are and leave them intact). To run AutoFormat:

  • Word 2003 and earlier: Use Format | AutoFormat: AutoFormat now (there may be an AutoFormat button on the Formatting toolbar in some versions, or you can add one using Tools | Customize).

  • Word 2007 and above: These versions don’t provide access to AutoFormat through the Ribbon, so you will have to add an AutoFormat Now button to the Quick Access Toolbar (QAT) from the “Commands Not in the Ribbon” section of the Customize the Quick Access Toolbar dialog; the easiest way to access this dialog in all Ribbon versions is to right-click on the Ribbon or QAT and choose Customize Quick Access Toolbar. Alternatively, you can use the keyboard shortcut Alt+Ctrl+K.

Unfortunately, text pasted from the Web or email nowadays rarely has lines ending in paragraph breaks. But you can force this format by using Paste Special and selecting “Unformatted Text” (in Word 2002 and above, if you have “Paste Options” enabled, you can just Paste and then select the “Keep Text Only” option). This pastes your selection with paragraph breaks instead of line breaks, and AutoFormat will then do the trick.

Using Find and Replace

Sometimes, however, you will not want to paste as unformatted text. In that case, what you will most likely get is text with a line break at the end of each line. Provided there is an empty line at the end of each paragraph, cleanup is still relatively simple. It takes just two Replace operations.

First pass

  1. Press Ctrl+H to open the Replace dialog.

  2. In the “Find what” box, type ^l^l (those are lowercase Ls, representing two line breaks).

  3. In the “Replace with” box, type ^p (the code for a paragraph break).

  4. Click Replace All. You will now have a paragraph break at the end of each true paragraph.

Second pass

  1. In the “Find what” box, type ^l.

  2. In the “Replace with” box, type a space. (If there is already a space at the end of each line, leave the box empty.)

  3. Click Replace All.

This removes the line breaks and allows text to wrap naturally.

Note: If the lines end in paragraph breaks rather than line breaks, you can use a similar replace operation, adding a step before the first pass above to replace ^p with ^l. This replaces (1) the paragraph breaks with line breaks, (2) two line breaks with a paragraph break, and (3) the remaining line breaks with a space or nothing.
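For readers who handle the text outside Word, the same passes can be sketched in Python. This is only an illustration on a plain string, not a Word macro: it assumes the text uses the character codes Word uses internally (character 11 for a manual line break, character 13 for a paragraph mark), and the function name `unwrap_lines` is made up for this example.

```python
LINE_BREAK = "\v"  # character 11 (vertical tab): Word's manual line break, ^l
PARA_BREAK = "\r"  # character 13 (carriage return): Word's paragraph mark, ^p


def unwrap_lines(text: str, lines_end_in_paragraphs: bool = False) -> str:
    """Mimic the Replace passes described above on a plain string."""
    if lines_end_in_paragraphs:
        # Preliminary step from the note: ^p -> ^l
        text = text.replace(PARA_BREAK, LINE_BREAK)
    # First pass: ^l^l -> ^p (an empty line marks a true paragraph end)
    text = text.replace(LINE_BREAK * 2, PARA_BREAK)
    # Second pass: ^l -> space, so the text can wrap naturally
    return text.replace(LINE_BREAK, " ")


sample = "first line\vsecond line\v\vnext paragraph\vcontinues"
print(repr(unwrap_lines(sample)))
```

As in Word, the order of the passes matters: if single line breaks were replaced first, the double breaks that mark the real paragraph ends would be lost.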

Harder cases

If there is not an empty line between paragraphs, you will probably have to insert paragraph breaks by hand. If the amount of text is not large, you can scroll through and press Enter wherever a paragraph break is needed. Then use Replace, as above, to replace each line break with a space. This will leave an extra space at either the beginning or the end of each paragraph. You can use Replace again to replace <space>^p or ^p<space> (as appropriate) with ^p. (Note that “<space>” represents pressing the spacebar; you don’t type “<space>”!)

An alternative approach is to press Shift+Enter to enter an extra line break at the end of each paragraph, then follow the instructions in the section above.
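The first approach in this section (paragraph breaks inserted by hand, then line breaks replaced with spaces, then stray spaces trimmed) can be sketched in Python on a plain string. This is an illustration only, not a Word macro: it assumes character 11 stands for a line break and character 13 for a paragraph break, and the function name `tidy_after_manual_breaks` is hypothetical.

```python
import re


def tidy_after_manual_breaks(text: str) -> str:
    """Replace line breaks with spaces, then trim spaces around paragraph breaks."""
    # ^l -> space
    text = text.replace("\v", " ")
    # <space>^p or ^p<space> -> ^p (also collapses runs of several spaces there)
    return re.sub(r" *\r *", "\r", text)


sample = "one\vtwo\v\rthree\vfour"
print(repr(tidy_after_manual_breaks(sample)))
```

Unlike a single Replace All in Word, the regular expression handles spaces on both sides of the paragraph break in one step.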

Even when the amount of text is very large, there is no really good alternative to manual editing. But if you Paste Special as Unformatted Text and run AutoFormat, you may find that Word is almost as clever as you are at finding where a paragraph ends.

Note that the methods described above are suitable only for simple text. If you have copied and pasted an entire Web page, with graphics, tables, and frames, much more work will be required to format it for use in a Word document.

Other nonprinting characters worth replacing

  • Often when you paste from the Web, and also from some other applications, characters come in that display as paragraph marks but behave like manual line breaks rather than “proper” paragraph breaks. So you might find that when you center a “paragraph,” several other adjacent paragraphs also get centered. To cure this, use Replace: in the “Find what” box, type ^013; in the “Replace with” box, type ^p; then click Replace All.

  • When pasting from the Web, nonbreaking spaces often come in, rather than ordinary spaces. You can get rid of them with a Replace operation: in the “Find what” box, type ^s; in the “Replace with” box, insert a <space> character (press the spacebar); then click Replace All.

If you want to automate any of the above steps you can record them using the macro recorder and play them back as needed.
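If the text passes through a script before it reaches Word, the nonbreaking-space cleanup has a simple plain-text equivalent. The Python sketch below is not a recorded Word macro; the function name is invented for illustration. The ^013 fix has no counterpart here, because in plain text those characters are already ordinary carriage returns.

```python
NBSP = "\u00a0"  # nonbreaking space, the character Word's ^s code matches


def replace_nonbreaking_spaces(text: str) -> str:
    """Swap every nonbreaking space for an ordinary space."""
    return text.replace(NBSP, " ")


sample = "When\u00a0pasting\u00a0from the Web"
print(replace_nonbreaking_spaces(sample))
```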

Text in tables

Often when you paste text from the Web, it is in a table. Display table gridlines so that you can see what you’re dealing with.

  • Word 2003 and earlier: On the Table menu, select Show Gridlines (if this is already selected, so that the menu displays Hide Gridlines, leave it as is).

  • Word 2007 and above: If the insertion point is in a table, the contextual Table Tools tabs will be displayed. On the Layout tab, in the Table group, click the View Gridlines button to turn it on.

If you have ascertained that you are dealing with a table, you have two possible approaches.

Leave the table and reformat it

If the text is in several columns, such that a table is actually the most effective way to present it, you can reformat the table so that it will fit your margin width. You can then drag column borders as desired to fine-tune the layout.

  • Word 2003 and earlier: On the Table menu, click AutoFit and choose AutoFit to Window.

  • Word 2007 and above: On the Layout tab of the Table Tools, in the Cell Size group, click AutoFit and choose AutoFit Window.

Convert the table to text

If the text is in a single table column, you could just AutoFit that one column to your margin width as instructed above, but often there will be several successive tables involved, and often these tables will be nested, so you may find it easier to work with the text if you get it out of the table.

  • Word 2003 and earlier: Select the table using the table handle at the top left corner (or use Table | Select | Table). On the Table menu, click Convert, then Table to Text. You will see the dialog shown in Figure 3. If the table is a single column, it won’t matter which radio button you select because the characters used to “separate text” are used between columns. Click OK.

  • Word 2007 and above: Select the table using the table handle at the top left corner. On the Layout tab of the Table Tools, in the Data group, click Convert to Text. You will see the dialog shown in Figure 3. If the table is a single column, it won’t matter which radio button you select. Click OK.

Figure 3. Convert Table to Text dialog

If the pasted content includes several tables, you will have to repeat the process for each one. If the “Convert nested tables” check box is enabled, you will know you are dealing with nested tables; be prepared for more of a mess to sort out!

Neat tips

The following tip has appeared in Woody’s Office Watch (WOW). When you cut and paste text from a Web site, there are often leading spaces at the beginning of each line. A very quick way to remove all these spaces is to select the text, center the selection (Ctrl+E), and then left-align the selection (Ctrl+L). All the extra spaces will have disappeared!
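Outside Word, a comparable result can be had with a few lines of Python. This sketch assumes the text is a plain string with newline-separated lines and removes only leading space characters; the function name `strip_leading_spaces` is hypothetical.

```python
def strip_leading_spaces(text: str) -> str:
    """Remove the spaces at the start of every line, leaving the rest untouched."""
    return "\n".join(line.lstrip(" ") for line in text.split("\n"))


sample = "   first line\n  second line\nthird line"
print(strip_leading_spaces(sample))
```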

If the text you have pasted has “reply” characters, such as “greater than” symbols (>) at the beginnings of lines, you could use Replace repeatedly to search for this character followed by a space. An easier way to remove them, however, is to use column select (Alt+drag) to select just the leading characters. When they are selected, press Delete.
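Where the quoted text is available as a plain string, the reply markers can also be stripped with a regular expression. The Python sketch below is one possible approach, not a Word feature. It assumes the markers are “greater than” symbols, each optionally followed by one space, at the very start of a line. The function name is invented for this example.

```python
import re

# One or more ">" markers, each optionally followed by a space, at a line start
REPLY_PREFIX = re.compile(r"^(?:> ?)+", re.MULTILINE)


def strip_reply_markers(text: str) -> str:
    """Delete leading reply markers; ">" characters elsewhere are kept."""
    return REPLY_PREFIX.sub("", text)


sample = "> hello\n>> nested\n> > deep\nplain > kept"
print(strip_reply_markers(sample))
```

Like column select in Word, this touches only the leading characters, so a “greater than” symbol in the middle of a line survives.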

