Unicode Characters in a Table of Contents - Lex Jansen

PhUSE 2016

Paper CC02

John Hendrickx, Danone Nutricia Research, Utrecht, The Netherlands

ABSTRACT

In SAS, the ODS inline formatting statement ^{unicode <value>} can be used to insert special characters such as Greek letters or mathematical symbols. Unfortunately, this method does not work in a Table of Contents generated with the CONTENTS option of the ODS RTF statement. This paper discusses how to repair this problem in Word. A Word macro is presented that will repair all occurrences in a document.

INTRODUCTION

Sometimes, 256 just isn't enough (256 being the number of symbols* you could represent with a single byte consisting of 8 bits).

The Unicode system was developed to use two or more bytes to represent a much wider range of symbols: mathematical symbols, Greek, Chinese, other languages, even emoji. In SAS, the ODS inline formatting function unicode lets SAS programmers insert these symbols into documents generated using ODS RTF, PDF and other ODS destinations (with the exception of ODS LISTING). This works excellently except in Table of Contents entries, where the ODS unicode statement is not processed properly. This paper discusses how to fix such TOC entries in RTF documents.

UNICODE IN SAS

In SAS, you can use ^{unicode 2265} to print a ≥ symbol, where ^ is the ODS escape character and 2265 is the hex code for the symbol required.
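As a quick check outside SAS, the mapping from a hex code to its symbol can be sketched in a few lines of Python. This is only an illustration and not part of the paper's method; the function names `describe_code_point` and `sas_inline` are hypothetical helpers introduced here.

```python
import unicodedata


def describe_code_point(hex_code):
    """Return (character, official Unicode name) for a hex code such as '2265'."""
    ch = chr(int(hex_code, 16))  # interpret the text as a base-16 number
    return ch, unicodedata.name(ch)


def sas_inline(hex_code, escape_char="^"):
    """Build the ODS inline formatting text, assuming ^ is the escape character."""
    return f"{escape_char}{{unicode {hex_code}}}"


if __name__ == "__main__":
    symbol, name = describe_code_point("2265")
    print(symbol, name, sas_inline("2265"))
```

Looking up the name this way is a convenient sanity check that the hex code you copied really is the symbol you intended before it goes into a SAS program.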

The website is a good source for finding the hex code you need for the symbol you want, or just use Google to search for "Unicode greater equal". Search results tend to give more information than strictly needed. Look for "U+" and you'll find the hex code. See the SAS online documentation for the ODS ESCAPECHAR Statement under "Using Unicode Symbols" for further details.

TOC ENTRIES IN SAS

One of the strengths of SAS is its ability to generate publication-ready output, with the option to include a Table of Contents (TOC). TOC entries can be specified in SAS by placing the ODS PROCLABEL statement just before the procedure that generates output.

The TOC itself can be generated using the ODS RTF options TOC_DATA and CONTENTS. See Lawhorn (2011) for further details.

SAS uses a somewhat unusual method for TOC entries in RTF output. Usually in MS Word, the styles "Heading 1", "Heading 2", etc. are used to generate the TOC. SAS, on the other hand, inserts a TC field into the RTF output, which can also be used to create a TOC.

What is a field (in a Word document)? I suppose fields need some description; not everyone knows what they're capable of, although almost all documents have some fields in them. If you go to the Insert tab of the ribbon, click on "Quick Parts" and then select "Field", you'll get a full list.

Page numbers, hyperlinks and the Table of Contents are examples of frequently used fields. To view fields as text, press Alt+F9. For this document, the page numbers in the footer appear as {page}. Pressing Alt+F9 toggles them back. Pressing F9 will update all fields in a selection.

Back to the TC fields used by SAS. In the case of the TC field, it's not necessary to use Alt+F9 to make these visible; you can just press Ctrl+Shift+8 or click the ¶ symbol on the Home tab of the ribbon to show hidden formatting symbols. This is what a TC field looks like:

* Actually, the ASCII character set uses the first 32 values of a byte for non-printable control characters. But 256 is such a nice round number...

Hex sounds rather evil, particularly to a SAS novice I suppose. Hex is an abbreviation of hexadecimal and uses the numbers 0 to 9 together with the letters A to F to represent 16 values. That way, two hexadecimal values can be used to represent 1 byte. For example, 41 corresponds with capital A (01000001 in binary). If you're a SAS novice, don't let hex values intimidate you! You don't have to know all the ins and outs of them to use Unicode values in your SAS programs. Just look up the codes you need and you'll see easily enough if you're getting the intended results.
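The hex arithmetic in this note can be verified with a small Python sketch. It is an illustration only, and `byte_info` is a hypothetical helper name.

```python
def byte_info(hex_pair):
    """Return (decimal value, 8-bit binary string, character) for two hex digits."""
    value = int(hex_pair, 16)          # '41' -> 65
    bits = format(value, "08b")        # zero-padded to the 8 bits of one byte
    return value, bits, chr(value)


if __name__ == "__main__":
    print(byte_info("41"))
    # Two hex digits give 16 * 16 combinations, i.e. one byte.
    print(16 ** 2)
```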

In a more readable form, the TC field contains:

{tc "A level 1 TOC entry " \f C \l 1}

The \f and \l specifications are switches for TC field options. They can be ignored but for the curious, \f C means this is a type C entry ("Contents", as opposed to illustrations) and \l 1 indicates that it is a level 1 entry.

If the CONTENTS option is used in the ODS RTF statement, then a Word TOC field will be inserted on the first page of the RTF document. This field will be invisible unless Alt+F9 is pressed. For the TOC field, the \f switch is the same as for the TC field and indicates that a type C TOC is to be created.
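To make the structure of a TC field concrete, the following Python sketch pulls the entry text and the two switches out of the readable form quoted in the text. It is an assumption-laden illustration: it handles only the \f and \l switches in that order, and `parse_tc_field` is a hypothetical name, not something Word or SAS provides.

```python
import re

# Matches the readable form: {tc "entry text" \f <type> \l <level>}
TC_PATTERN = re.compile(
    r'\{\s*tc\s+"(?P<text>[^"]*)"'     # entry text in double quotes
    r'(?:\s+\\f\s+(?P<type>\w+))?'     # optional \f switch: entry type
    r'(?:\s+\\l\s+(?P<level>\d+))?'    # optional \l switch: TOC level
    r'\s*\}',
    re.IGNORECASE,
)


def parse_tc_field(field):
    """Split a TC field string into its text, type and level."""
    match = TC_PATTERN.search(field)
    if match is None:
        raise ValueError("not a TC field: " + field)
    level = match.group("level")
    return {
        "text": match.group("text").strip(),
        "type": match.group("type"),
        "level": int(level) if level is not None else None,
    }


if __name__ == "__main__":
    print(parse_tc_field(r'{tc "A level 1 TOC entry " \f C \l 1}'))
```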

The \h switch means that hyperlinks are to be used. The TOC field is empty when SAS generates the RTF document. To generate the TOC, press Ctrl+A to select all, then press F9 to update all fields. Voilà: your Table of Contents!

UNICODE AND TOC ENTRIES IN SAS

Basically, the unicode inline formatting function works brilliantly. The ODS PROCLABEL statement also works brilliantly. It's when the two are combined that problems arise. If ^{unicode nnnn} is used in an ODS PROCLABEL statement, the unicode specification is not processed properly. The curly braces are stripped and the specification appears as "^unicode 263A" in the TOC rather than a smiley face.

The TOC generated with this TC field:

For RTF output*, the problem can be repaired. The key to this is a little-known command in Word called ToggleCharacterCode. If you select the hex code that corresponds with a Unicode symbol and press Alt+X, the symbol is displayed. Press Alt+X a second time to revert to the hex code. These are the steps to repair your Table of Contents containing unprocessed unicode specifications:

1. Press Ctrl+Shift+8 to make hidden text and the TC field visible.
2. Locate the TC fields (usually the first cell of a table).
3. Delete the "^unicode" text in the TC field.
4. Select the xxxx specification, then press Alt+X. This will transform 263A into ☺.
5. Repeat for all TC fields.
6. Use Ctrl+A to select all, then press F9 to update all fields. This will repair your TOC.

The Table of Contents as intended:

AUTOMATING THE PROCESS

Fixing all TOC entries can be a tedious process if the number of unicode specifications is large. Appendix A contains a SASUnicode Word macro that automates the steps described above. The macro assumes that ^ is the ODS ESCAPECHAR and that TC fields can contain ^unicode nnnn strings.
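The text-level effect of the repair can be sketched in Python. This is not the Word macro from Appendix A; it only mimics, on a plain string, what deleting the "^unicode" text and toggling the hex code achieves. The function name `repair_toc_entry` is hypothetical, and the sketch assumes exactly four hex digits after the keyword, as in the nnnn form described in the text.

```python
import re


def repair_toc_entry(text, escape_char="^"):
    """Replace unprocessed '<escape>unicode nnnn' strings with the symbol itself."""
    pattern = re.compile(
        re.escape(escape_char) + r"\s*unicode\s+([0-9A-Fa-f]{4})\b",
        re.IGNORECASE,
    )
    # chr(int(..., 16)) plays the role of Alt+X: hex code in, symbol out.
    return pattern.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    print(repair_toc_entry("Have a nice day ^unicode 263A"))
    print(repair_toc_entry("Age ^ Unicode 2265 13"))
```

Allowing optional whitespace and ignoring case is a deliberate choice in this sketch, so that both "^unicode 263A" and "^ Unicode 263A" are caught; text without such a specification is returned unchanged.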

