DTP

      How to convert from PDF to Word format

      When PDF files are converted to Word format for translation, a few things have to be taken into account, depending on the result the client wants.

      Specific client instructions

      For specific client instructions, see the section "Example on client specific instructions".

      This instruction is intended as a general checklist of things to consider for each level of end result. As a general rule, avoid text boxes whenever possible.

      General things to consider

      • The files must be in docx format.
      • Avoid section breaks if possible, or at least minimize their use.
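
      To see how many section breaks a converted file still contains, the .docx package can be inspected directly. The following is a minimal sketch, assuming the standard WordprocessingML layout in which each section break is stored as a `w:sectPr` element inside the paragraph properties (`w:pPr`) of the last paragraph of a section. The function name `count_section_breaks` is illustrative only.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_section_breaks(docx_file) -> int:
    """Count section breaks in a .docx file (path or file-like object).

    Assumption: a section break is a w:sectPr inside a paragraph's w:pPr.
    The final w:sectPr directly under w:body describes the last section
    and is not a break, so it is not counted.
    """
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return len(root.findall(f".//{{{W_NS}}}pPr/{{{W_NS}}}sectPr"))
```

      A result of 0 means the whole document uses one set of page settings. Headers and footers are stored in separate parts of the package and are not examined here.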

      Different levels of conversion

      The client can ask for one of at least three different levels of conversion:

      • Only the text should be extracted and the layout is not important (level 1)
      • Everything should be extracted but the layout need not be exactly as the original (level 2)
      • The extracted file should look like the original (level 3)

      Conversion for level 1

      This is the easiest level. Since the text can be extracted without regard to layout or formatting, the main tasks are to remove unnecessary line breaks and to make sure the flow of the text makes sense.
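
      Removing unnecessary line breaks can be partly automated once the text has been extracted. The sketch below assumes that blank lines mark real paragraph boundaries and that every other line break comes from the PDF layout; `unwrap_paragraphs` is a hypothetical helper name, and the result still needs a manual read-through.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines; blank lines are kept as paragraph breaks."""
    # Split on blank lines (lines holding only whitespace count as blank).
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, collapse every run of whitespace,
    # including line breaks, into a single space.
    unwrapped = [" ".join(paragraph.split()) for paragraph in paragraphs]
    return "\n\n".join(unwrapped)
```

      Headings and list items that are not separated by blank lines will be merged into the surrounding paragraph, so those have to be restored by hand.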

      Conversion for level 2

      In this case the extracted file should resemble the original more closely. All content, including graphics, should be extracted, but the layout does not need to match the original exactly as long as the flow of the text is similar and, of course, readable. The original number of pages should be kept as well.

      Examples of things that can be skipped/simplified:

      • Columns
        Multi-column layout can be removed
      • Flow charts
        Flow charts can be represented by, for example, a table
      • Drawings
        These are mostly extracted as gibberish and are difficult to handle. Leave them as pictures; if translation is needed, it can be done in a separate file.

      Conversion for level 3

      The layout of the extracted file should look like the original. How this is done can vary, but since the file will be translated, it should be easy to work with.

      When recreating the file in Word (if not instructed to use another program), keep the following recommendations in mind:

      • It is preferable to remove all section breaks and use file setup to have the same settings for all pages
      • Use header/footer
      • Use paragraph formats to control the general formatting of the text
      • If there is a table of contents, make it automatically generated
      • Set up tables as described under "Extracting tables"
      • Do not leave text in text boxes unless absolutely necessary
      • Do not use tables to create columns

      Extracting tables

      Depending on the contents of the tables, these can be handled in two different ways.

      • Most cells contain text to be translated
        Extract the table as it is with all text included
      • A heading row to be translated and remaining rows containing numbers
        (or other text not in need of translation)
        Extract the heading row as text and insert the rest of the table as a picture

      Example on client specific instructions

      • Target language will be German.
      • Declaration of incorporation, certificate and declaration of conformity do not need translation and can be added as images in the Word files.
      • Do not make text boxes, except for caption/image texts.
      • Drawings do not need translation, but captions need to be editable.
      • Please only deliver Word files, not PDFs.
      • Double spaces and non-breaking spaces should be replaced with single, ordinary spaces in body text.
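
      The last instruction in this example lends itself to a simple search-and-replace. A minimal sketch, assuming the body text is available as a plain string and that "non-breaking space" means the character U+00A0; `normalize_spaces` is an illustrative name.

```python
import re

NBSP = "\u00a0"  # non-breaking space (assumed to be the only variant in use)


def normalize_spaces(text: str) -> str:
    """Replace non-breaking spaces and runs of spaces with one ordinary space."""
    # Tabs and line breaks are deliberately left untouched.
    return re.sub(f"[ {NBSP}]+", " ", text)
```

      For example, `normalize_spaces("Druck:\u00a010  bar")` returns `"Druck: 10 bar"`.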

      How to perform a layout check

      A layout check of the translated material is often carried out after the translation has been completed. This is done to ensure that all text has been translated, all images are still in the right position, and so on.

      This is intended as a general checklist of things that should be checked at this inspection.

      Before layout check

      Before starting the layout check, check that

      • All translated files have been included
      • All necessary fonts are available
      • All necessary image files are available (if necessary for creating PDFs etc.)
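
      The first point can be checked by comparing the source and target folders. The sketch below assumes that translated files keep the same names and folder structure as the source files, which may not hold for every project; `missing_translations` is a hypothetical helper.

```python
from pathlib import Path


def missing_translations(source_dir, target_dir) -> list[str]:
    """Return relative paths of source files that have no same-named file in target_dir."""
    source, target = Path(source_dir), Path(target_dir)
    # Collect every file below each folder as a path relative to that folder.
    source_files = {p.relative_to(source).as_posix() for p in source.rglob("*") if p.is_file()}
    target_files = {p.relative_to(target).as_posix() for p in target.rglob("*") if p.is_file()}
    return sorted(source_files - target_files)
```

      An empty list means every source file has a counterpart in the target folder. It says nothing about whether the content has actually been translated.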

      Go through the client-specific DTP checklist if there is one.

      Check of translated files

      The translated files should be checked against the source files in the following areas:

      Formatting

      Make sure the correct

      • Styles are used (title, body, ...)
      • Formatting (bold, italic, underline, ...) is used
      • Language is selected for the document if hyphenation is turned on

      Images

      • If high-resolution PDFs have been requested, make sure all images have been linked and are displayed properly
      • Images may not be moved without approval from the client; see the DTP checklist, if attached

      Page layout

      Check that

      • Headers and footers are translated and look good
      • Pages flow correctly (some target languages take up more space than the source language; to maintain the number of pages, font size and/or line spacing sometimes needs to be adjusted, which must be approved by the client; see the DTP checklist, if attached)

      Text content

      Check that

      • All text is included and has been translated
      • Cross-references still function and have been translated
      • Table of contents, indexes, and other generated content are updated
      • Codes, variables, and bookmarks have been translated and function properly
      • All text in text boxes is visible and translated

      Special characters, etc.

      Check that

      • Line breaks and tabs are correct

      Also keep an eye out for

      • Double spaces
      • Hard spaces
      • Spaces before periods and commas
      • Double periods
      • Quotation marks (must follow the language conventions, unless otherwise indicated)
      • Inch symbols (several programs replace the quotation marks)
      • Units of measurement used in accordance with the target language rules or client-specific guidelines
      • Correct hyphenation
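
      Several of the items above can be flagged automatically in exported text before the visual check. The following is only a sketch: the check names and regular expressions are assumptions that have to be adapted to the target language, and a match is a prompt for manual review, not proof of an error.

```python
import re

# Illustrative patterns only; adjust them to the target language and client rules.
CHECKS = {
    "double space": re.compile(r" {2,}"),
    "hard space": re.compile("\u00a0"),
    "space before period or comma": re.compile(r" [.,]"),
    # Exactly two periods; a three-dot ellipsis is not reported.
    "double period": re.compile(r"(?<!\.)\.\.(?!\.)"),
    # Straight quotes may be inch symbols or unconverted quotation marks.
    "straight double quote": re.compile(r'"'),
}


def find_typography_issues(text: str) -> list[tuple[int, str]]:
    """Return (line number, check name) pairs for every line that matches a check."""
    issues = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in CHECKS.items():
            if pattern.search(line):
                issues.append((number, name))
    return issues
```

      Each reported line still has to be judged in context, since hard spaces, for instance, may be intentional between a number and its unit.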

      PDF and final delivery to Semantix

      Check the instructions for delivery in PDF format for the project. Unless otherwise specified, create a low-resolution PDF for reference. Insert comments for the DTP to implement if necessary.

      All other files must be delivered in the correct version (.fm/.indd) and in the correct folder structure. In some cases it might also be necessary to deliver interchange formats (.mif/.idml).

      Read more in Language Talents' Forum