Target audience: CAST AI Administrators
This document describes the character sets that can be used in the CAST Services (CAST Storage Service, Analysis Service, Dashboard Service and Management Service). For further information, see also "Collation compatibility between server hosting the Analysis Service and potential participating Servers" in "Appendix - RDBMS requirements and configuration".
The supported languages with Unicode are:
Any unsupported character present in a source code file that is encoded with one of the supported encodings will be converted by CAST to an arbitrary supported character. The impact of this conversion on the analysis result depends on the context in which the conversion occurs and on the character that results from it. The impact therefore cannot be predicted in general, as for example:
Please also note that the analysis results depend on whether the storage service (CAST Storage Service or commercial RDBMS) supports Unicode (see the details in the section below).
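As a rough illustration of why such a conversion has unpredictable effects, the sketch below round-trips text through a single-byte code page in Python. It is not CAST's conversion routine: the function name `to_supported`, the choice of code page 1252 and the `?` replacement character (Python's own behaviour with `errors="replace"`) are all assumptions made for the example.

```python
def to_supported(text, codec="cp1252"):
    """Round-trip text through a single-byte codec (hypothetical helper).

    Characters the codec cannot represent are replaced by '?', which is
    Python's substitution, not necessarily the character CAST would choose.
    """
    return text.encode(codec, errors="replace").decode(codec)


# Two distinct identifiers containing Turkish letters absent from code page 1252.
name_a = "a\u011f"  # 'a' + LATIN SMALL LETTER G WITH BREVE
name_b = "a\u015f"  # 'a' + LATIN SMALL LETTER S WITH CEDILLA

print(to_supported(name_a))                          # a?
print(to_supported(name_b))                          # a?
print(to_supported(name_a) == to_supported(name_b))  # True
```

After the conversion the two names are indistinguishable, so whether the analysis result changes depends entirely on where such names occur in the source code.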
The language of the code page used by the operating system hosting the CAST Analysis workstation (the machine on which CAST Management Studio is run) must be the same as the language used in the source code to be analyzed. For example, on a Turkish-language OS, the source code you analyze must be Unicode encoded for the Turkish language.
The CAST Storage Service (CSS) can be used to store analysis results of Unicode-encoded source files, provided the files use one of the encodings listed below:
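A quick pre-analysis check of the workstation's code page could look like the following Python sketch. The helper name `same_code_page` is hypothetical, and the comparison relies on Python's codec aliases, so it only recognises code pages that Python itself knows about.

```python
import codecs
import locale


def same_code_page(expected, actual=None):
    """Return True if two code page names refer to the same Python codec.

    `expected` is the code page matching the source code language (for
    example "cp1254" for Turkish). When `actual` is omitted, the preferred
    encoding reported by the operating system locale is used.
    """
    if actual is None:
        actual = locale.getpreferredencoding(False)
    # codecs.lookup() resolves aliases such as "windows-1254" to "cp1254".
    return codecs.lookup(expected).name == codecs.lookup(actual).name


# Compare the Turkish code page with whatever this machine reports.
print(same_code_page("cp1254"))
```

The result depends on the machine on which the check is run, which is the point of the requirement above.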
The following character sets are single-byte character sets based on ASCII that include an encoding for the '€' character. They can be used with CAST AIP products:
| Character set | Description |
|---|---|
| WE8PC858 | IBM-PC Code Page 858 8-bit West European |
| EL8ISO8859P7 | ISO 8859-7 Latin/Greek |
| WE8ISO8859P15 | ISO 8859-15 West European |
| EE8MSWIN1250 | MS Windows Code Page 1250 8-bit East European |
| CL8MSWIN1251 | MS Windows Code Page 1251 8-bit Latin/Cyrillic |
| WE8MSWIN1252 | MS Windows Code Page 1252 8-bit West European |
| EL8MSWIN1253 | MS Windows Code Page 1253 8-bit Latin/Greek |
| TR8MSWIN1254 | MS Windows Code Page 1254 8-bit Turkish |
| BLT8MSWIN1257 | MS Windows Code Page 1257 8-bit Baltic |
Please note that using an Oracle Database Server does not provide support for any Unicode encoding.
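To check in advance whether a piece of text fits one of the single-byte character sets listed above, something like the following Python sketch can help. The pairing of each character set name with a Python codec is an assumption made for this illustration (based on the code page named in each description), and `fits_character_set` is a hypothetical helper, not part of any CAST tool.

```python
# Assumed correspondence between the listed character set names and Python codecs.
CHARSET_TO_PYTHON_CODEC = {
    "WE8PC858": "cp858",
    "EL8ISO8859P7": "iso8859_7",
    "WE8ISO8859P15": "iso8859_15",
    "EE8MSWIN1250": "cp1250",
    "CL8MSWIN1251": "cp1251",
    "WE8MSWIN1252": "cp1252",
    "EL8MSWIN1253": "cp1253",
    "TR8MSWIN1254": "cp1254",
    "BLT8MSWIN1257": "cp1257",
}


def fits_character_set(text, charset_name):
    """Return True if every character of `text` can be encoded in the character set."""
    codec = CHARSET_TO_PYTHON_CODEC[charset_name]
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


# The Turkish letter g-with-breve fits the Turkish code page but not the West European one.
print(fits_character_set("\u011f", "TR8MSWIN1254"))
print(fits_character_set("\u011f", "WE8MSWIN1252"))
```

Any character for which the check fails is one that would be subject to the arbitrary conversion described earlier in this document.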
With Microsoft SQL Server, CAST recommends using Windows collations. These collations are tied to Windows locales; code page 1252 is suitable for Western Europe, the Americas and Australia. In addition, the CS (case-sensitive) and AS (accent-sensitive) attributes must be active.
Please note that using Microsoft SQL Server does not provide support for any Unicode encoding.
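Because SQL Server collation names carry their sensitivity attributes as underscore-separated suffixes, the CS/AS requirement can be checked from the name alone. The Python sketch below assumes that naming pattern; the helper `is_cs_as` is hypothetical, and the collation names are merely examples that are not prescribed by this document.

```python
def is_cs_as(collation_name):
    """Return True if the collation name carries both the CS and AS attributes."""
    tokens = collation_name.upper().split("_")
    return "CS" in tokens and "AS" in tokens


# Example collation names, chosen only for illustration.
for name in ("Latin1_General_CS_AS", "Latin1_General_CI_AS", "Latin1_General_CI_AI"):
    print(name, is_cs_as(name))
```

Binary collations carry neither suffix, so this simple check reports them as not matching; whether they are acceptable is not covered here.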
The Dashboard Service, Analysis Service, Management Service and Measurement Service schemas support the following encodings:
The following CAST analyzers:
support the following encodings:
The following CAST AIP components:
support the following encodings:
Any CAST AIP component not mentioned above may raise an arbitrary error when analyzing or working with a Unicode-encoded source code file.
Note: BOM = Byte Order Mark, an indicator at the beginning of a Unicode-encoded file that specifies the order in which the bytes of a multi-byte character appear in the file ("Little Endian" vs. "Big Endian" encodings).
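To see which BOM, if any, a source file starts with, its first bytes can be compared against the BOM constants in Python's standard `codecs` module. This is a minimal sketch: the helper name `detect_bom` and the returned labels are illustrative, and the check is a heuristic rather than full encoding detection.

```python
import codecs

# UTF-32 is tested first because the UTF-32 LE BOM begins with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(data):
    """Return a label for the BOM at the start of `data`, or None if there is none."""
    for bom, label in _BOMS:
        if data.startswith(bom):
            return label
    return None


print(detect_bom("hello".encode("utf-8-sig")))                    # UTF-8
print(detect_bom(codecs.BOM_UTF16_BE + "hi".encode("utf-16-be")))  # UTF-16 BE
print(detect_bom(b"hello"))                                        # None
```

For a file on disk, reading the first four bytes in binary mode (`open(path, "rb").read(4)`) is enough input for this check.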