
I like Chinese They only come up to your knees But they're cute, cuddly and eager to please They come from a long way overseas I like their tiny trees Their Zen, their ping-pong, their yin, and yang-ese We sometimes bomb their embassies But we don't really mean to; we thought they were trees Their food is guaranteed to please A fourteen, a seven, a nine and lychees Eric Idle
You wouldn't like Chinese if you had to build a MS Access application to support it.
Internationalisation addresses the User’s locale without necessarily providing an interface in the User’s local language.
Outside of Microsoft, this is known as i18N.
This addresses :
|
Data Input | |
|
Data Display | |
|
Convention | |
|
Culture |
It ensures that an application will run correctly when the User’s data is processed in the User’s language in the User’s locale.
Currency values, dates and numbers will be handled correctly and displayed in their local formats.
Users will be able to use their own local keyboards, and in the case of Asian languages such as Japanese, Chinese and Korean, use their Input Method Editor which allows the input of characters too numerous to hold on a keyboard.
You can find a lot of good stuff on the Microsoft Global Development and Computing Portal.
Multinationalisation addresses the management of data from locales other than that of the User. Outside of Microsoft this is known as M18N.
This addresses issues such as :
|
The processing of data, files and filenames from locales different from that of the User. | |
|
The management of data that contains more than one language. |
Our work may be not to resolve the differences between international languages, but those between client offices.
The biggest compromise may be to conduct business in American “English”. This is maintained to a significant degree in many offices. For instance, the French may interact with an application in English, even whilst using their “AZERTY” keyboards, maintaining their date/number/currency formats, entering the odd character ç, and losing the occasional accent off a character ǎ.
Both Japanese and Chinese are traditionally written vertically, but compromise by processing and displaying data horizontally, although China maintains right-to-left orientation.
Numbering (such as Hangzhou for Japan, China and Korea), Calendar, Sort and Find functionality may also be conceded.
The more compromises that are made, the less work we will have to perform.
This certainly seems a practical solution as all code pages include English. Some of the reasons against this could be :
|
The User can’t speak/understand English, well or at all | |
|
Local laws or procedures may require some data to be processed in the local language, convention or format. | |
|
English date, calendar, currency and number formats could cause errors or even software failures in other countries. | |
|
Data is not available in English. | |
|
Text may have to include non-English text, such as Greek Invoice Number ΨΩ01234λ. | |
|
Existing document names may be in other languages such as the dated Japanese document 平成十五年一月十三日.doc | |
|
Existing documents such as spreadsheets may contain non-English characters that may be processed by our applications. | |
|
Even if only English data is allowed, as the Users have local keyboards, accidental non-English characters will appear. Additional validation will have to be added to screen data input. | |
|
Regional settings may display/process data incorrectly. |
This sometimes is the solution, but is expensive.
Drawbacks are :
|
Local versions could not be developed until the main application is tested and signed off, delaying roll-out | |
|
Far more software development is required making timescales long and costs high. | |
|
Bug-fixes and enhancements would be difficult and costly. |
This part of the site will grow to include many of the issues to be addressed in providing i18N and M18N. A taste of this follows :
|
Windows 2000 works differently to Windows XP. Language packs are optional. | |
|
There are different calendar systems. | |
|
Some existing middleware, databases and Software Languages may need to be replaced as they cannot cope. Existing applications may need to be rewritten in a completely different programming language. | |
|
Functionality of applications may need to change to work around issues which have no resolution. | |
|
Applications, middleware, databases and servers may malfunction when coming across foreign characters. | |
|
Existing applications may malfunction as Find and Sort rules in some countries process words by their sound, or meaning, rather than the position they are in a dictionary. | |
|
Different types of Unicode | |
|
Format or convention conflicts | |
|
API calls (calls to Windows) may no longer work. | |
|
IME (Input Method Editor) where keyboards cannot have a key for each of the many characters. (Japanese, Chinese and Korean) | |
|
System components such as local networking, or non-English network IDs | |
|
Text orientation such as Chinese right-to-left | |
|
Formatted dates and numbers may not be able to be converted to their “True” values causing the software to crash. | |
|
Dual currency issues | |
|
Smaller currencies can cause number size issues. |
It should be noted that some of these countries have more than one major language, which may require other character-sets. Such countries include Finland, Italy and Switzerland. Furthermore, some countries have “minority” languages that could feature.
Some countries may include data from other countries, for instance an Italian company processing a Greek invoice, or a Russian company processing one from Belarus (different Cyrillic character-sets).
Although the vocabulary of a language may be similar in two different countries, the grammar could be quite different. This should be remembered when presenting structured text, such as sentences, to the User. For example, an application supporting Iberian Portuguese will not necessarily support Brazilian Portuguese. The phrase “data that is stored in a Microsoft Jet or Microsoft SQL Server database” would be “dados que estão armazenados em uma base de dados do Microsoft Jet ou do Microsoft SQL Server” in Brazilian Portuguese, but “esses dados são armazenados numa base de dados do Microsoft Jet ou do Microsoft SQL Server” in Iberian Portuguese. Fairly different. When we deal with the Chinese language, such differences are more severe between Simplified and Traditional Chinese.
Even more fun is Hebrew, Arabic and Farsi, with the need to address bi-directional support or additional calendars.
Our problems are just challenges. There exists no manual to resolve all difficulties on the way. They require thoughtful engineering. Many companies can do it, so can we.
The globalisation of the business is not a temporary phenomenon.
We’ve stopped playing Chequers and we’re starting to play Chess.
Sorting is more important than just a convenience. If the sort order is not as anticipated, serious errors can occur. In Swedish, for example, some vowels with an accent sort after "Z," whereas in other European countries the same accented vowel comes right after the non-diacritic vowel.
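To make the Swedish point concrete, here is a minimal Python sketch. It is only an illustration: the two key functions (`swedish_key` and `base_letter_key`) and the sample names are my own inventions, and a real application would lean on the collation support of its database or operating system rather than a hand-rolled alphabet.

```python
import unicodedata

# Swedish alphabet order: å, ä and ö follow z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
SWEDISH_RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def swedish_key(word):
    """Hypothetical sort key that places å, ä and ö after z."""
    folded = unicodedata.normalize("NFC", word).lower()
    # Characters outside the alphabet sort last.
    return [SWEDISH_RANK.get(ch, len(SWEDISH_RANK)) for ch in folded]


def base_letter_key(word):
    """Hypothetical sort key that ignores diacritics, so ä sorts with a."""
    decomposed = unicodedata.normalize("NFD", word).lower()
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


names = ["Zorn", "Ärlig", "Adam", "Östen", "Åke"]
swedish_order = sorted(names, key=swedish_key)
base_letter_order = sorted(names, key=base_letter_key)
```

With the Swedish key, Zorn comes second in the list; with the base-letter key it comes last. Same data, same names, two different answers to "where is Zorn?".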
If a User displays a list of names, and the name does not appear where they expected, they will assume that it is not in the system and perform an erroneous task.
Even worse would be if processes were conducted by the application without User intervention. For instance a User has five transactions which are all to be deleted. To execute this, the records are sorted, the first one is located, and five subsequent records deleted.
But what if the list was not in English-alphabetic order? Some European countries sort certain character groupings such as “ch” or “ll” as a single character, moving its position in the list.
Some countries such as Spain (Traditional/Modern) and Hungary (Normal/Technical) have more than one sort order type. Czech, Danish, Norwegian, Swedish, Finnish, Lithuanian, Polish, Spanish and Turkish have different sorting rules for Latin characters.
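The "ch" and "ll" groupings can be sketched the same way. The following Python is an assumption-laden illustration of Traditional Spanish ordering, where "ch" sorts as one letter after "c" and "ll" as one letter after "l". The helper name `traditional_spanish_key` and the word list are made up for the example.

```python
# Traditional Spanish treats "ch" and "ll" as single letters.
TRADITIONAL_SPANISH = [
    "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l", "ll",
    "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
RANK = {letter: i for i, letter in enumerate(TRADITIONAL_SPANISH)}


def traditional_spanish_key(word):
    """Split a word into collation elements, trying digraphs first."""
    word = word.lower()
    elements = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair in RANK:
            elements.append(RANK[pair])
            i += 2
        else:
            # Unknown characters sort after the whole alphabet.
            elements.append(RANK.get(word[i], len(RANK)))
            i += 1
    return elements


words = ["cuna", "chico", "dama", "luz", "llama", "cosa"]
plain_order = sorted(words)
traditional_order = sorted(words, key=traditional_spanish_key)
```

In plain character order "chico" is first; in Traditional Spanish order it falls after every other word beginning with "c".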
Many Latin-character languages have diacriticals which have their own sort rules.
Turkish has four versions of the character “I”.
Furthermore, some sort by the sound of the word, with many sort variations, and some by the meaning.
This can be exacerbated by the inclusion of more than one character-set in the list.
In Japanese, data is sorted in Shift-JIS order, which sorts Katakana, Hiragana and Romaji (Latin) in phonetic order, and Kanji in radical order. Japanese data freely uses all these character-sets. Sort that one out!
Japanese can also be sorted in Unicode order, which is perhaps preferable when dealing with multi-language data.
Korean is also sorted in phonetic order, and the Hangul and Hanja character-sets are mixed as the pronunciations overlap.
Traditional Chinese is sorted by the number of strokes to draw the character.
Simplified Chinese can be sorted phonetically or by stroke count.
Japanese, Korean, Traditional Chinese and Simplified Chinese can also be sorted in Unicode order. However, it is essential that all users appreciate the sort order in use. Chinese, Japanese and Korean have a potential 40,000 ideograms (characters) which are currently being adopted as part of the surrogate Unicode range by the Unicode Technical Committee.
In applications, many different technologies such as Windows, Microsoft Access, Javascript, Web Browser, middleware, SQL Server and Oracle can be involved in the sorting of data. It is important to ensure that the anticipated sort order is uniform, whichever technology is used.
Difficulties can arise from using unique indexes on text fields where the sort mechanism may approximate different characters to the same sort character, such as the four Turkish “I”s.
It must be remembered that this may not just be an issue with Microsoft Access, but also may affect the Enterprise Server and SQL, using the international extensions of the ORDER BY statement. SQL Server Sort Orders depend on what Service Packs have been fitted.
“Find” functionality has similar issues to Sorting.
Finding the wrong record can produce significant errors, especially if part of an automatic process. Using “English” Microsoft Office, the variants of Find are Case Sensitive or Case Insensitive. Each may return different records. But in Japanese, the sound of words is important, and even on MS Office, there are 22 Japanese Find options.
Up to now most of us think that there are only two Find options :
|
Case Sensitive | |
|
Case Insensitive |
But Japanese Microsoft Office Find options include :
|
Case Sensitive. To not distinguish between uppercase and lowercase characters | |
|
Width Sensitive. To not distinguish between full-width and half-width characters | |
|
Hiragana/Katakana Sensitive. To not distinguish between Hiragana and Katakana characters. | |
|
Match Contractions (yo-on, soku-on). Searches without distinguishing characters with diphthongs and double consonants and plain characters. | |
|
Match minus / dash (cho-on). Searches without distinguishing between minus signs, dashes and long vowel sounds. | |
|
Match 'repeat character' marks. Searches without distinguishing between repeat character marks. | |
|
Match variant-form kanji (itaiji). Searches without distinguishing between standard and non-standard ideography. | |
|
Match old kana forms. Searches without distinguishing between new and old kana. | |
|
Match cho-on used for vowels. Searches without distinguishing between characters with long vowel sounds and plain characters. | |
|
Match di/zi, du/zu. Searches without distinguishing between ヂ and ジ or ヅ and ズ. | |
|
Match ba/va, ha/fa. Searches without distinguishing バ and ヴァ or ハ and ファ | |
|
Match tsi/thi/chi, dhi/zi. Searches without distinguishing ツィ, ティ and チ, or ディ and ジ | |
|
Match hyu/fyu, byu/vyu. Searches without distinguishing ヒュ and フュ, or ビュ and ヴュ | |
|
Match se/she, ze/je. Searches without distinguishing セ and シェ, or ゼ and ジェ | |
|
Match ia/iya. Searches without distinguishing ア and ヤ following イ-row and エ-row characters. | |
|
Match ki/ku. Searches without distinguishing between キ and ク before サ-row characters. | |
|
Punctuation Characters. Searches without distinguishing between punctuation characters. | |
|
Whitespace Characters. Searches without distinguishing between characters used as blank spaces, such as full-width spaces, half-width spaces and tabs. |
Although Microsoft Word supports the above, Microsoft Access does not. However the Find options within OLE DB providers and Enterprise SQL could be explored.
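If you end up rolling your own, a few of these options can be approximated by folding both the search term and the data before comparing. This Python sketch covers only three of them (case, width and Hiragana/Katakana) and makes no attempt at the phonetic options. The function names `fold_for_search` and `loose_find` are hypothetical, not part of any Microsoft API.

```python
import unicodedata

KATAKANA_START, KATAKANA_END = 0x30A1, 0x30F6  # ァ .. ヶ
KANA_OFFSET = 0x60  # distance from a Katakana letter to its Hiragana twin


def fold_for_search(text):
    """Fold width, case and Katakana/Hiragana differences (a sketch only)."""
    # NFKC maps half-width Katakana and full-width Latin to their usual forms.
    text = unicodedata.normalize("NFKC", text)
    text = text.casefold()
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_START <= ord(ch) <= KATAKANA_END
        else ch
        for ch in text
    )


def loose_find(needle, haystack):
    """True if needle occurs in haystack once both have been folded."""
    return fold_for_search(needle) in fold_for_search(haystack)
```

For instance, `loose_find("ｱｸｾｽ", "マイクロソフト アクセス")` matches a half-width search term against full-width data. Whatever folding you choose, apply the same one to indexes, searches and duplicate checks, or they will disagree.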
Although not a language issue, the spread of an application around the globe does bring up a problem with time-zones.
It is very common for a software application to use the Time/Date of their local PC, whilst sharing a common database. The system will store and display date/time information based on this clock.
As the application spreads beyond its original time-zone, this information becomes both confusing and inaccurate. A record added in Budapest at 4pm can be deleted later at 3:15pm in London.
Once an application spreads beyond its time-zone, then server time should be used.
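The Budapest and London example can be reproduced in a few lines of Python. This is a sketch under stated assumptions: the date is arbitrary, and the fixed offsets (UTC+1 for Budapest, UTC+0 for London) are winter values that I have hard-coded for simplicity rather than looked up from a time-zone database.

```python
from datetime import datetime, timedelta, timezone

# Assumed winter offsets, hard-coded for the sake of the example.
BUDAPEST = timezone(timedelta(hours=1))
LONDON = timezone(timedelta(hours=0))

# What each PC clock showed when the event happened.
added_local = datetime(2005, 11, 1, 16, 0)     # 4pm in Budapest
deleted_local = datetime(2005, 11, 1, 15, 15)  # 3:15pm in London

# Comparing raw local clock values suggests the delete came first.
naive_says_deleted_first = deleted_local < added_local

# Converting both to one common clock (UTC here) restores the true order.
added_utc = added_local.replace(tzinfo=BUDAPEST).astimezone(timezone.utc)
deleted_utc = deleted_local.replace(tzinfo=LONDON).astimezone(timezone.utc)
utc_says_deleted_first = deleted_utc < added_utc
```

Stored against a single clock, the delete correctly lands 15 minutes after the add.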
There is always a conflict in setting Microsoft Access date formats.
| Medium Date formats are hard-wired as USA English | |
| Short Date and Long Date rely on the User's Regional Settings, which the User can set to non-Y2K compliant, or just STUPID formats. | |
| Custom Formats may not be compatible with the User's locale, or may produce unpredictable results. EG ddd, dddd, mmm or mmmm will be in Greek in Greece, but in English in Japan. |
Let's look at Tuesday, 25th October 2005.
In Long Date format, assuming your User has not been fiddling, it may appear :
| Locale | Long Date | Short Date |
| UK English | Tuesday, 25 October 2005 | 25/10/2005 |
| US English | Tuesday, October 25, 2005 | 10/25/2005 |
| Spanish | martes 25 de octubre de 2005 | 25/10/2005 |
| Japanese | 2005年10月25日 | 2005/10/25 |
US English is one of the most bizarre foreign date formats, as it is neither left-to-right in value, nor right-to-left. And they say Americans can't do good comedy! It certainly is not ISO 8601 compliant.
With Long Date formats the day/month names can be in local language, but not always. Japanese Long Date formats do not include day/month names; they have three calendars, so date formats can change significantly. The Spanish day/month names are not Capitalised. In Russian, if wednesday is capitalised it becomes environment, and sunday becomes resurrection.
With Short Date, European date progression reads backwards, Japanese reads forwards, and US English curves backwards on itself.
In Microsoft Access two-digit years are resolved by OLEAUT32.DLL. However, different versions of this file can produce different dates. Also, although string literals within queries may appear to have 4-character years, I believe that they are stored as two-character years then re-interpreted by the DLL. You should use the DateSerial() function to store your date literals.
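The danger of formatted date text is easy to demonstrate. The Python below is an analogy rather than Access code: it reads the same string under three Short Date conventions, then builds a date from explicit components, which is roughly what DateSerial() does for you. The variable names are mine.

```python
from datetime import date, datetime

text = "03/04/05"

# One string, three readings, depending on whose convention is assumed.
as_european = datetime.strptime(text, "%d/%m/%y").date()
as_us = datetime.strptime(text, "%m/%d/%y").date()
as_japanese = datetime.strptime(text, "%y/%m/%d").date()

# Building the date from year, month and day leaves nothing to interpret.
unambiguous = date(2005, 10, 25)
iso_text = unambiguous.isoformat()
```

The three readings are 3 April 2005, 4 March 2005 and 5 April 2003. Note too that the two-digit year had to be guessed by the library, which is exactly the kind of decision you do not want made on your behalf.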
Time formats have similar issues :
|
Not all locales use "English" numerals | |
|
Some use a 12-hour clock, and others use a 24-hour clock. | |
|
Some use AM/PM, or different characters, in different positions | |
|
Most use the colon : separator, but not all. |
The Regional And Language Options property sheet allows the user to:
|
Select an alternative calendar (if applicable to the selected locale). | |
|
Define a two-digit year range for each one of the available calendars. | |
|
Define a default long-date and short-date formatting for each available calendar type. |
Some countries have different or multiple Calendars.
Windows in Japan has three :
|
English Gregorian | |
|
Japanese Gregorian | |
|
Emperor Era |
Japanese Government offices usually require official documents to have their dates in Emperor Era years rather than the Gregorian calendar. Eras now only change when the Emperor changes. The current era, that of Akihito, started in 1989 and is called Heisei (Achieving Peace). The era year is calculated by taking 1989 from the current year and adding 1. So 2005 – 1989 + 1 = H17.
Imperial dates are formatted with the name of the era followed by year, month, and day. The Japanese characters for year 年, month 月 and day 日 are used as separators. The date 2003-01-24 (ISO 8601 yyyy-mm-dd) might therefore be written in Japanese western style as 2003年1月24日, and in Japanese imperial style as 平成15年1月24日. The year 2003 is the 15th year of the era Heisei 平成, which began in 1989 as year 1.
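The era arithmetic is simple enough to sketch in Python. This handles the Heisei era only and works from the year alone, so it ignores the fact that the era began partway through 1989. The function names are hypothetical.

```python
HEISEI_START_YEAR = 1989  # Heisei year 1


def heisei_year(gregorian_year):
    """Gregorian year to Heisei era year (year-only simplification)."""
    if gregorian_year < HEISEI_START_YEAR:
        raise ValueError("year is before the Heisei era")
    return gregorian_year - HEISEI_START_YEAR + 1


def western_style(year, month, day):
    """Japanese western style: Gregorian year with 年, 月, 日 separators."""
    return f"{year}年{month}月{day}日"


def imperial_style(year, month, day):
    """Japanese imperial style: era name, era year, then month and day."""
    return f"平成{heisei_year(year)}年{month}月{day}日"
```

So `imperial_style(2003, 1, 24)` gives 平成15年1月24日, and `heisei_year(2005)` gives 17, matching the H17 worked out above. A date picker has to go the other way as well, and cope with dates that fall in an earlier era.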
This makes applications more difficult when we are providing “date pickers” to minimise data entry errors and Y2K errors.
Although the Gregorian Calendar is used throughout most of the English-speaking world, other calendars exist such as
|
Korean Tangun | |
|
Japanese Emperor Era | |
|
Buddhist Era | |
|
Hijri | |
|
Hebrew Lunar | |
|
Taiwan |
Remember that the first day of the year might not fall on January 1, and may vary between years.
The length of the year and months might also vary, as well as ways of handling leap years. Can you remember the Leap Year calculations in the era when we made millions on the Y2K projects? Year 2000 was a Leap Year: century years are normally not Leap Years, but 2000 is divisible by 400.
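For the record, the Gregorian rule fits in one line. This Python sketch (the function name is mine) applies to the Gregorian calendar only; the other calendars listed above have their own rules.

```python
import calendar


def is_gregorian_leap_year(year):
    """Divisible by 4, except century years, unless divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Cross-check against Python's standard library for a range of years.
agrees_with_library = all(
    is_gregorian_leap_year(y) == calendar.isleap(y) for y in range(1600, 2401)
)
```

So 1900 was not a Leap Year, 2000 was, and 2100 will not be.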
The first day of the week might start on another day besides Sunday, even in the same culture.
Currency formatting needs to take into consideration these following elements:
Currency symbol - This can be a pre-defined symbol like the European Euro € or a combination of letters like the use of GBP for British Pound.
The Currency symbol can be placed before or after the digits.
There are many ways to display negative amounts.
Most currencies use the same decimal and thousands separator that the numbers in the locale use, but this is not always true. In some places in Switzerland, they use the period as a decimal separator for Swiss francs (Sfr. 127.54), but then use commas as the decimal separator everywhere else (127,54). No wonder they make cuckoo clocks! I used to work in Geneva and Sion, and the only periods that I remember, were the ones I spent in the bar.
Thousands Separators. Germany uses a period ., and the United Kingdom uses a comma ,. One thousand two hundred and thirty-four is displayed as 1.234 in Germany and 1,234 in the United Kingdom. In Sweden, the thousands separator is a space. Too much time in the sauna, and not enough in the classroom!
Decimal Separators. In the United States, this character is a period (.). In Germany, it is a comma (,). Thus one thousand twenty-five and seven tenths is displayed as 1,025.7 in the United States and 1.025,7 in Germany.
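The point is that separators are data, not constants. Here is a minimal Python sketch in which they are passed in as parameters. The function `format_number` is hypothetical, and in a real application the separator values would come from the User's regional settings rather than from the caller.

```python
def format_number(value, thousands=",", decimal=".", places=1):
    """Format a number with caller-supplied separators (a sketch only)."""
    # Start from a known layout with "," for thousands and "." for decimals.
    us_style = f"{value:,.{places}f}"
    swap = {",": thousands, ".": decimal}
    # Replace each separator in a single pass so the two cannot collide.
    return "".join(swap.get(ch, ch) for ch in us_style)


us = format_number(1025.7)
german = format_number(1025.7, thousands=".", decimal=",")
swedish = format_number(1234, thousands=" ", decimal=",", places=0)
```

The same value comes out as 1,025.7 or 1.025,7. The reverse trip matters too: text typed as 1.234 must be parsed with the separators it was entered under.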
Negative Numbers. The negative sign can be used at the beginning of the number, but it can also be used at the end of the number. Alternatively, the number can be displayed with parentheses around it or even in a colour such as red. Thus a negative one hundred and twenty-three could be displayed as:
|
-123 | |
|
123- | |
|
(123) | |
|
123 (displayed in red) |
Number Shapes. These vary from one locale to another.
Different counting systems. You could try Arabic counting and characters for a laugh, but when I go to bed, I count the sheep in Japanese, remembering that they have an extra digit for ten : 〇, 一, 二, 三, 四, 五, 六, 七, 八, 九, 十 ......z z z z z z z z
3-digit grouping is used for most cultures, such as for the United Kingdom: 123,456,789.00.
Other groupings exist. For example, Hindi uses a 2-digit grouping, except for the 3-digit grouping for denoting hundreds: 12,34,56,789.00 Crazy!
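The 2-digit grouping is not hard to produce, provided you know it is wanted. A minimal Python sketch follows. The function name `indian_grouping` is my own, and the comma and period separators are assumed.

```python
def indian_grouping(value, places=2):
    """Group the last three digits, then every two digits before them."""
    sign = "-" if value < 0 else ""
    whole, _, fraction = f"{abs(value):.{places}f}".partition(".")
    head, tail = whole[:-3], whole[-3:]
    groups = []
    while head:
        groups.insert(0, head[-2:])  # peel two digits at a time from the right
        head = head[:-2]
    groups.append(tail)
    grouped = ",".join(groups)
    return sign + grouped + ("." + fraction if fraction else "")
```

`indian_grouping(123456789)` gives 12,34,56,789.00, matching the example above. Anything that validates or parses numbers by assuming commas every three digits will reject it.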
Percentages can be written 123%, 123 %, 123 pct or %123. Please don't hard code this.
These can be modified in Windows Regional Options.
Personally I am uncomfortable with Microsoft Access percentage formatting, and always prefer to handle it in my code.
Donchya just hate it when you are entering your credit card details into your favourite bunny-rabbit girly site, and it insists on you selecting which US State you live in, and what ZIP code, even though you have selected a different country.
Canadian postal codes consist of two groups of three characters, such as "M5R 3H5"; a French postal code is a five-digit number, as in 92300. In some places, people might add a country or region code in front of the postal code (for example, F-92300).
Microsoft Access, in some Language Settings, has a Postal Address property for text-boxes. This appears on Asian Access (I don't know why it does not appear on all). It allows the conversion of Postcodes into Address fragments. In design mode there is a wizard for this.
Telephone numbers vary significantly around the world. For example :
| China | 1234 5678 |
| France | 01-23-45-67-89 |
| Poland | (12) 345.67.89 |
| Singapore | 123 4567 |
| Thailand | (01) 234-5678 or (012) 34-5678 |
| United Kingdom | 0123 456 7890 or 01234 567890 |
| United States | (123) 456 7890 |
Notice that there are different separators such as hyphens -, periods ., and spaces, different groupings (two, three, four, five, and six digits per group), and different numbers of total digits used (7-11). Also, the examples just given didn't include country codes, which could be anything from one to three digits.
The ITU-T standard E.164, defined by the Comité Consultatif International Téléphonique et Télégraphique (CCITT), states that the maximum number of digits is 15, but this doesn't include space for things like:
|
Long-distance access codes | |
|
Passwords | |
|
Credit card numbers | |
|
Extensions |
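One practical consequence is to store the digits and the presentation separately. This Python sketch strips the separators seen in the table above and checks the 15-digit limit. The function names are hypothetical, and it deliberately does nothing clever with trunk prefixes or extensions, which vary by country.

```python
MAX_E164_DIGITS = 15
ASCII_DIGITS = "0123456789"


def strip_separators(raw):
    """Keep only the digits; drop spaces, hyphens, periods and brackets."""
    return "".join(ch for ch in raw if ch in ASCII_DIGITS)


def fits_e164(country_code, national_number):
    """True if country code plus national number is within 15 digits."""
    digits = strip_separators(country_code) + strip_separators(national_number)
    return 0 < len(digits) <= MAX_E164_DIGITS
```

For example, `strip_separators("(12) 345.67.89")` gives 123456789. Anything in the list above, such as an extension, needs a field of its own rather than being squeezed into the number.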
Code pages map the data code of a character to an actual displayed character.
Most code pages are 256 characters long, and their characters can therefore be expressed as 8-bit numbers.
The first 128 characters of any code page contain the “standard” English characters, and the latter characters are those local variants.
One code page can support many languages. For instance the Western European code page contains the German ß and the French ç.
Code pages include :
|
1250 CP_EASTEUROPE Central European (Latin 2) | |
|
1251 CP_RUSSIAN Cyrillic | |
|
1252 CP_WESTEUROPE Western European (Latin 1) | |
|
1253 CP_GREEK Greek | |
|
1254 CP_TURKISH Turkish (Latin 5) | |
|
1257 CP_BALTIC Baltic Rim | |
|
932 CP_JAPAN Japanese | |
|
936 CP_CHINA Simplified Chinese | |
|
949 CP_KOREA Korean |
The Asian code pages are large and use double-byte character encoding (DBCS)
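You can watch code pages at work with Python's built-in codecs, which include the Windows code pages listed above. This sketch decodes one and the same byte under three code pages, then shows a double-byte character. The sample text is made up.

```python
# One byte value, three code pages, three different characters.
raw = bytes([0xE7])
western = raw.decode("cp1252")   # Western European
greek = raw.decode("cp1253")     # Greek
cyrillic = raw.decode("cp1251")  # Cyrillic

# The first 128 values decode identically under every one of these.
ascii_bytes = b"Invoice 01234"
single_byte_pages = ("cp1250", "cp1251", "cp1252", "cp1253", "cp1254", "cp1257")
decoded_variants = {ascii_bytes.decode(cp) for cp in single_byte_pages}

# A Japanese character needs two bytes in code page 932.
kanji_bytes = "日".encode("cp932")
```

The byte 0xE7 is ç in Western European, η in Greek and з in Cyrillic. The data has not changed at all; only the assumption about which code page it was written in.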
The big problem with code pages is that they localise a global application to the languages of one area of the world. Significant Windows/Software Language functionality still uses it, and it can be difficult to work around.
Gradually such functionality is moving to Unicode.
One of the reasons that VBA functions fail, I guess, is that they use the default code page of the Client PC.
The failure of Esperanto was the development of different flavours of the spoken language. Unicode promises one size to fit all; but there are different Unicodes!
From Unicode to Punycode, it is the direction that our industry and Microsoft are going. Windows and Microsoft Access are not quite there yet, but each version becomes more compliant.
Data needs to be displayed by the correct Glyphs and Fonts.
Where I write the date 平成十五年一月十三日, MS Word used font MS Mincho. Without this, the document would still have the correct characters but the machine would be unable to display it.
There are also Unicode fonts such as Arial Unicode MS. It would be preferable to have our applications use a Unicode font to reduce local issues when the required fonts are not available on the User’s machine. Note that fonts whose names begin with "@" tip some of the glyphs on their side.
Font selection and font licensing need consideration.
When creating a locale–aware application, you'll need to consider handling of linguistic nuances. These nuances might seem trivial, but could have a large impact on application design and functionality. For example, Windows allows you to convert characters into either uppercase or lowercase equivalents. Some applications use this feature to automatically convert the first letter of every sentence into uppercase or to assume that certain types of words should always be capitalized. In Russian, however, names of the days of the week are never capitalized–capitalizing the word for "Wednesday" changes the meaning to "environment," and capitalizing the word for "Sunday" changes the meaning to "resurrection."
In the past as localized products were developed, language–sensitive issues–such as casing–were sometimes handled with what were thought of as well–designed, intelligent algorithms. For example, an uppercasing macro that relies on the code–point numbers of ASCII characters and the linear relationship between uppercase characters (A = 0x41) and lowercase characters (a = 0x61) can be written as: #define ToUpper(ch) ((ch)<='Z' ? (ch) : (ch)+'A' - 'a')
You can see the problems this English–centric approach presented when representing uppercasing on non–Latin scripts or languages with accented characters where, for example, character mapping doesn't follow the assumed relationship between lowercase and uppercase characters. There are several other reasons why algorithmic solutions for case–folding do not cover all occurrences.
First, some languages do not have a one–to–one mapping between their uppercase and lowercase characters. For instance, the uppercase equivalent of the German ß is "SS." (Don't mention the war...) Second, some characters have different mappings depending upon the language in which they are used. For example, the lowercase "i" in English maps to a dotless uppercase letter: "I." However, in Turkish the lowercase "i" maps to a dotted uppercase letter: "İ." Finally, most non–Latin scripts do not even use the concept of lowercase and uppercase, as in the case of Chinese, Japanese, and Korean; Arabic, Farsi, and Hebrew; as well as Thai. For example, since Farsi has no notion of uppercasing, applying such an algorithm to Farsi text produces string output composed of random and unsupported glyphs.
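These cases are quick to try out. The Python below ports the ASCII macro quoted above and then shows the one-to-many and Turkish cases. `naive_to_upper` and `turkish_upper` are hypothetical helpers written for this illustration, and Python's own `upper()` is language-neutral, so the Turkish rule has to be supplied by hand.

```python
def naive_to_upper(ch):
    """Port of the ASCII-only macro: shift anything above 'Z' by 'A' - 'a'."""
    return ch if ch <= "Z" else chr(ord(ch) + ord("A") - ord("a"))


# Turkish pairs: dotted i with dotted İ, dotless ı with dotless I.
TURKISH_UPPER = {"i": "İ", "ı": "I"}


def turkish_upper(text):
    """Uppercase using the Turkish i rules, default rules for the rest."""
    return "".join(TURKISH_UPPER.get(ch, ch.upper()) for ch in text)


ascii_ok = naive_to_upper("a")           # works for plain ASCII
sharp_s_broken = naive_to_upper("ß")     # lands on an unrelated character
sharp_s_proper = "ß".upper()             # one character becomes two
default_city = "istanbul".upper()        # language-neutral result
turkish_city = turkish_upper("istanbul")
```

The macro turns ß into ¿, whereas the proper uppercase is the two-character "SS", so even the string length changes. The Turkish result begins with a dotted İ, which a case-insensitive comparison written for English will not treat as equal to "I".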
Along with Complex Scripts, word and line breaking add a special case when multilingual text is to be parsed or displayed.
Latin script follows some straightforward rules for word and line breaking, such as breaking a line at a space, tab, or hyphen. For languages like Thai and Khmer, words run together (with no space between the characters that end one word and those that begin the next, as there is in Latin script). This makes word breaking in such languages a more complex process, since syntax rules require line breaking on word boundaries. Thus for languages like Thai and Khmer, word breaking is based on grammatical analysis and on word matching in dictionaries during text processing at run time. Other languages also have rules of their own.
Unlike most Western written languages, Chinese, Japanese, Korean, and Thai do not necessarily indicate the distinction between words by using spaces. Although the Thai language does not use spacing between words, it still requires lines to be broken on word boundaries.
For these languages, world–ready software applications cannot conveniently base line–breaking and word–wrapping algorithms on a space character or on standard hyphenation rules. They must follow different guidelines.
Take Japanese, for example. Japanese line breaking is based on the kinsoku rules–you can break lines between any two characters, with several exceptions. The first exception is that a line of text cannot end with any leading characters–such as opening quotation marks, opening parentheses, and currency signs–that shouldn't be separated from succeeding characters. The second exception is that a line of text cannot begin with any following characters–such as closing quotation marks, closing parentheses, and punctuation marks–that shouldn't be separated from preceding characters. The third exception is that certain overflow characters (such as punctuation characters) are allowed to extend beyond the right margin for horizontal text or below the bottom margin for vertical text.
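As a rough illustration of the first two exceptions, here is a small Python sketch. It is far from a full kinsoku implementation: the character sets are short illustrative samples, widths are counted in characters rather than measured, and the third exception (overflow characters hanging past the margin) is not attempted. All names are hypothetical.

```python
# Illustrative samples only, not the complete kinsoku character classes.
NO_LINE_END = set("「『（〔［｛〈《¥＄$")        # leading characters
NO_LINE_START = set("」』）〕］｝〉》、。，．！？")  # following characters


def kinsoku_wrap(text, width):
    """Break text into lines of at most `width` characters.

    A break is moved to the left while it would leave a leading character
    at the end of a line or put a following character at the start of one.
    """
    lines = []
    start = 0
    while start < len(text):
        end = min(start + width, len(text))
        if end < len(text):
            while end > start + 1 and (
                text[end] in NO_LINE_START or text[end - 1] in NO_LINE_END
            ):
                end -= 1
        lines.append(text[start:end])
        start = end
    return lines


sample = "彼は「はい」と言った。"
wrapped = kinsoku_wrap(sample, 3)
```

With a width of three, the first line is cut short at two characters so that the opening bracket 「 moves down to start the second line, and the final 。 is kept from starting a line of its own.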
All language versions of Windows 2000 and Windows XP are enabled for all supported languages, thereby empowering applications that use Unicode as their encoding model to handle mixed text from any of the supported scripts. For example, in Notepad you can display text containing English, Farsi, Greek, Hindi, Korean, and Thai text all at once. Among these scripts there are several that require special processing to display and edit because the characters are not laid out in a simple linear progression from left to right, as most European characters are. These writing systems are referred to as Complex Scripts.
The special processing required by a complex script can involve one or more of the following characteristics: character reordering; contextual shaping; display of combining characters and diacritics; specialized word break and justification rules; cursor positioning; filtering out illegal character combinations. Scripts considered complex are: Arabic, Hebrew, Thai, Vietnamese and the Indic family.
Oot sdrawkcab ti etirw ot meht tcepxe ton dluow ew tub, tpircs latnoziroh rof tpircs lacitrev esimorpmoc yam esenihC ehT.
I mean, the Chinese may compromise vertical script for horizontal script, but we would not expect them to write it backwards too.
We have to provide for data being handled in right-to-left orientation, but ensure that it is effectively stored as left-to-right script so it can be searched or ordered. The operational complexity is increased when a field on a form may display data of either orientation.
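When one field may hold either orientation, the form needs some way to decide which way to lay the text out. One common approach, sketched below in Python, is to look for the first character with a strong direction. This is my own simplification (the helper name `base_direction` is hypothetical), not a description of how Microsoft Access behaves.

```python
import unicodedata


def base_direction(text):
    """Guess a display direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-style and Arabic-style letters
            return "rtl"
        if bidi_class == "L":          # Latin and other left-to-right letters
            return "ltr"
    return "neutral"                   # digits, punctuation or empty text
```

For instance, `base_direction("Invoice 42")` is "ltr" and `base_direction("שלום")` is "rtl". The stored character sequence is the same whichever way it is displayed, which is what keeps searching and ordering workable.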
Field widths and displayed widths must facilitate the cultural differences, either with size of fonts or number of characters. German, for instance, requires 20% - 40% more characters because of longer words and grammatical constructs.
The Japanese language uses English, Japanese and Chinese characters. English (Romaji) characters can get by with 5x7 pixels, Japanese (Hiragana and Katakana) with 16x16 pixels, and Chinese (Kanji) with 24x24 pixels. Not only do Kanji need space but their proportions are different.
| Size | English | Japanese | Chinese |
| 8 point | abcdefghijklmnopqrstuvwxyz | あぃいぅうぇえぉおかがきぎくぐけァアィイゥウェエォオカガキギクグ | 乿偓偊偣偕偐偲做偟健倦偈偶偽偖偌倐偆偱偦偁偅偸偧側偬 |
| 10 point | abcdefghijklmnopqrstuvwxyz | あぃいぅうぇえぉおかがきぎくぐけァアィイゥウェエォオカガキギクグ | 乿偓偊偣偕偐偲做偟健倦偈偶偽偖偌倐偆偱偦偁偅偸偧側偬 |
| 12 point | abcdefghijklmnopqrstuvwxyz | あぃいぅうぇえぉおかがきぎくぐけァアィイゥウェエォオカガキギクグ | 乿偓偊偣偕偐偲做偟健倦偈偶偽偖偌倐偆偱偦偁偅偸偧側偬 |
| 14 point | abcdefghijklmnopqrstuvwxyz | あぃいぅうぇえぉおかがきぎくぐけァアィイゥウェエォオカガキギクグ | 乿偓偊偣偕偐偲做偟健倦偈偶偽偖偌倐偆偱偦偁偅偸偧側偬 |
Users who use Right-to-Left, usually have their applications mirrored so that everything becomes “back-to-front”.
Care needs to be taken when orientating components on the screen, such as direction-sensitive graphics. Some direction-sensitive graphics can have a different meaning when mirrored.
Go to the Input Method Editor (IME) to understand the use of these properties.
Really good XP IME operational information can be found on Gregg Tavares's pages.
Windows Regional Settings are mainly a blessing. They provide local keyboard, sorting, find, currency, date and number format support in the User’s locale.
The downside is that they may not be helpful when viewing data of a different locale, such as in M18N applications.
Rich-client applications, such as those developed in Microsoft Access, may use components such as ActiveX controls or COM objects.
Even though programming languages and Microsoft Access applications may resolve language issues, their components may not.
Units of Measure matter, especially when you are resizing and moving objects around your Microsoft Access application. Throughout the world things are measured using different units and scales. The most popular one used is the metric system (metres, litres, grammes, etc).
Perhaps the most bizarre foreign system is that used in the USA, called the Imperial System. Not only does it have unusual non-metric units, but the counting system is asymmetrical, and definitely not metric.
|
Inch. Approx 25.4mm. Equal to the distance between the tip of the thumb and the first joint of the thumb. In French, Italian, Spanish and Swedish the word for inch is similar to the word for thumb. The Swedes once had a decimal inch! There are also the US Survey Inch and the International Inch. There are 12 of these to the next measure : the Foot | |
|
Foot. Approx 305mm. As we travel around this tour of the human body, we plummet to the foot. You've got it; a human foot. Strangely enough the average European human foot is about 240mm. I guess that the Yanks have bigger feet, or just don't take their boots off. Does it include spurs? The Foot measure goes back to 2575BC and the Sumerians. The Imperial Foot was adapted from an Egyptian measure by the Greeks, with a subsequent larger foot being adopted by the Romans. Ironically it is a different size to the US Survey Foot. You aren't going to believe this but there are 3 of these to the next measure : the Yard. | |
|
Yard. Approx 914mm. This is named after a straight branch. It is the girth of a person's waist, or the distance from the tip of the nose to the tip of the thumb of King Henry I. At least THIS was binary, being broken into successive halves called half-yard, span, finger and nail. There are 5.5 of these to the next measure : the Rod | |
|
Rod. Approx 5029mm. This is the length of the ox-goad used by medieval English ploughmen. There are 4 in a Chain. | |
|
Chain. Approx 20.11 metres. You can still buy 0.1 chain measuring wheels in the USA. It is a more civilised unit of measure as it is the length of a Cricket pitch, and used decimal : 10 to a Furlong | |
|
Furlong : Approx 201 metres. It is the length of a ploughed furrow in an acre field. 8 furlongs to a Mile |
Stop! You're kidding me! So US units go 12, 3, 5.5, 4, 10, 8, and are indeterminate measures of body parts and medieval stuff. No wonder a Mars probe was lost because its guidance system was developed for the Imperial system whilst the scientists using it thought it was in Metric. It's odd that, even though the USA don't use metric, they insist on spelling metres, litres, grammes differently. We can't avoid Imperial as we have the 3.5 inch drive, the 17 inch screen and the 8-track tape. Well, I do anyway. My recommendation is not to use furlongs when scaling your Microsoft Access labels.
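Access itself adds one more unit to the list: in VBA, control positions and sizes (Left, Top, Width, Height) are expressed in twips, at 1440 twips to the inch. The following is a minimal sketch of converting from millimetres, so that layout code can be written in metric. The function names MmToTwips and TwipsToMm are made up for this illustration.

```vba
Option Explicit

' Access measures control geometry in twips: 1440 twips = 1 inch.
Private Const TWIPS_PER_INCH As Long = 1440
Private Const MM_PER_INCH As Double = 25.4

' Hypothetical helper: convert millimetres to the nearest whole twip.
Public Function MmToTwips(ByVal millimetres As Double) As Long
    MmToTwips = CLng(millimetres / MM_PER_INCH * TWIPS_PER_INCH)
End Function

' Hypothetical helper: convert twips back to millimetres.
Public Function TwipsToMm(ByVal twips As Long) As Double
    TwipsToMm = twips / TWIPS_PER_INCH * MM_PER_INCH
End Function
```

For example, Me.lblTitle.Width = MmToTwips(40) would make a label (lblTitle is a hypothetical control name) 40mm wide, and MmToTwips(25.4) gives 1440.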
The paper sizes in the United States and Canada (such as letter, legal, and so on) do not satisfy the needs of all users in the world market. For example, most countries in Europe and Asia use a slightly larger standard known as "A4" (297 x 210 mm) that is slightly longer and narrower than the U.S. letter size (279 x 216 mm). Thus if your application needs to print, you should allow the default paper size to be configurable, and the data to still fit on the paper.
In the 1980s, before Microsoft Access had appeared, and dBASE ruled the earth, Ronald Reagan decreed that the US Government switch to Letter size 8.5" x 11" from the then-current Government Letter 8" x 10.5". The smaller size was also used for children's writing in schools, which had meant savings in the purchase of paper, due to discounts. Of course the changeover cost a fortune in IT and machinery, and made a handsome profit for those in the know. So be careful: Letter should be 8.5" x 11", but sometimes Government Letter 8" x 10.5" is just called Letter.
The Japanese use metric paper sizes and their "A" sizes are ISO standard. However, their "B" paper sizes are not the same as ISO paper sizes. For instance, ISO B5 is 176mm x 250mm, whereas Japanese B5 is 182mm x 257mm.
Report Margins can also be an issue as some countries favour their home-grown manufacturers. For instance, I assumed HP, yet in Japan they favour Epson, and I found that some models could not support a 10mm margin; my reports wrapped over to a new near-blank page.
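One way to make the data fit whichever paper turns up is to design for the area common to both sizes: the width of A4 (210mm) and the height of Letter (279mm). The following is only a sketch using the dimensions quoted above. FitsA4AndLetter is a hypothetical helper, and the margin is whatever the least capable printer needs.

```vba
Option Explicit

' Portrait paper sizes in millimetres, as quoted in the text.
Private Const A4_WIDTH_MM As Double = 210
Private Const A4_HEIGHT_MM As Double = 297
Private Const LETTER_WIDTH_MM As Double = 216
Private Const LETTER_HEIGHT_MM As Double = 279

' Hypothetical helper: True if a layout of the given size, with the same
' margin on every edge, fits on both A4 and US Letter (portrait).
Public Function FitsA4AndLetter(ByVal layoutWidthMm As Double, _
                                ByVal layoutHeightMm As Double, _
                                ByVal marginMm As Double) As Boolean
    Dim safeWidthMm As Double
    Dim safeHeightMm As Double

    ' The safe area is the narrower width and the shorter height.
    safeWidthMm = IIf(A4_WIDTH_MM < LETTER_WIDTH_MM, A4_WIDTH_MM, LETTER_WIDTH_MM)
    safeHeightMm = IIf(A4_HEIGHT_MM < LETTER_HEIGHT_MM, A4_HEIGHT_MM, LETTER_HEIGHT_MM)

    FitsA4AndLetter = (layoutWidthMm + 2 * marginMm <= safeWidthMm) And _
                      (layoutHeightMm + 2 * marginMm <= safeHeightMm)
End Function
```

With a 15mm margin, a 180mm x 249mm layout passes, while 180mm x 250mm fails because it is 1mm too tall for Letter.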
For uniform printing of Microsoft Access reports, put this in the header :
Each subsequent version of Windows has improved its ability to handle foreign character-sets.
Windows XP loads most of the international requirements for drivers, character sets, code pages and fonts, irrespective of the locale of the machine.
There are currently 33 different binary versions of Windows XP. English Windows XP with Japanese settings is not the same as Japanese Windows XP.
Comparing Windows XP Professional Multilingual Options
If you use API calls, these may not work everywhere.
API function names that take strings end with either an “A” or a “W”.
|
“A” (ANSI) APIs take strings in a multi-byte character set, using the default code page. | |
|
“W” (Wide) APIs assume that strings are in UTF-16/UCS-2 Unicode. |
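VBA strings are held internally as UTF-16, but a Declare statement with an As String parameter converts them to the ANSI code page on the way out, which is what the “A” functions expect. To call a “W” function, the usual trick is to pass a pointer to the string with StrPtr. The following is a hedged sketch using GetFileAttributesW from kernel32. The wrapper name FileExistsW is an assumption of this example, and the #If VBA7 branch only matters on Office 2010 and later.

```vba
Option Explicit

#If VBA7 Then
    Private Declare PtrSafe Function GetFileAttributesW Lib "kernel32" _
        (ByVal lpFileName As LongPtr) As Long
#Else
    Private Declare Function GetFileAttributesW Lib "kernel32" _
        (ByVal lpFileName As Long) As Long
#End If

Private Const INVALID_FILE_ATTRIBUTES As Long = -1
Private Const FILE_ATTRIBUTE_DIRECTORY As Long = &H10

' Hypothetical wrapper: True if fullPath names an existing file (not a folder).
' StrPtr passes the UTF-16 buffer directly, so no code-page conversion occurs.
Public Function FileExistsW(ByVal fullPath As String) As Boolean
    Dim attributes As Long

    attributes = GetFileAttributesW(StrPtr(fullPath))
    If attributes <> INVALID_FILE_ATTRIBUTES Then
        FileExistsW = ((attributes And FILE_ATTRIBUTE_DIRECTORY) = 0)
    End If
End Function
```

Declaring the parameter As String instead would hand the “W” function an ANSI string, which it would misread.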
Of course, Microsoft Access is part of Microsoft Office.
Office XP in a Multilingual Environment
Microsoft Access has Language-Specific Properties and Methods. MSDN is a little out of date, and refers mainly to Office 2000; property names and options have changed a little since. You really need to appreciate a little about Asian languages, such as Japanese, and the Input Method Editor (IME) to understand the use of these properties. Really good XP IME operational information can be found on Gregg Tavares's pages.
For instance, for Japan we have :
|
FELineBreak Property This is also known as Asian LineBreak. This property only appears on Asian Access. |
|
FuriganaControl Property The use of Furigana is to display characters that explain the phonetic pronunciation of an Ideogram, such as a Kanji (Chinese) character. This property only appears on Japanese Access. |
This is a demo of Furigana in action. The FuriganaControl property of the top control has the name of the lower control. I enter Kanji characters in the top control. When the data in the top control is updated, such as focus being moved, the Furigana appears below :

|
IMEHold Property This simple Yes/No choice allows the hold of the Kanji Conversion Mode from the PREVIOUS control. NO is the default. | |
|
IMEMode Property This sets which Kanji Conversion Mode should be used when the control has focus. NO CONTROL is the default. | |
|
IMESentenceMode Property This unusual mode controls the way that the IME controls groups of words, such as sentences. NORMAL is the default. | |
|
KeyboardLanguage Property This sets the keyboard language for the current control. The languages are those set in the Microsoft Office Language Settings. | |
|
NumeralShapes Property This sets the numeral types. Japanese use the same numeral types as Europeans/US. However, they also use number names, which are not covered by this property. | |
|
Orientation Property This governs the overall orientation (right-to-left or left-to-right) of forms and reports, in the absence of setting this at control level. For Japanese use Left-to-right. | |
|
PostalAddress Property This appears on Asian Access (I don't know why it does not appear on all versions). It allows the conversion of Postcodes into Address fragments. In design mode there is a wizard for this. | |
|
ReadingOrder Property This controls the right-to-left or left-to-right orientation of displayed text in a control. If you select CONTEXT it reads the first character and decides the orientation. As both Japanese (right-to-left) and Chinese (left-to-right) use Kanji characters, I don't know what it bases its decision on. | |
|
ScrollBarAlign Property This property selects whether the scrollbar should be on the left/right of the form/report, or whether it should follow the Orientation of the parent object. For Japanese use Left-to-Right. |
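The IME properties can be set from VBA as well as in the property sheet. The following is a small sketch, assuming two text boxes passed in by the caller, one for Japanese text and one for a plain code. The procedure and parameter names are invented for the example, and the acImeMode constants are taken to be those of the Access AcImeMode enumeration.

```vba
Option Explicit

' Hypothetical helper: prepare one text box for Japanese entry and
' switch the IME off entirely for a second, code-only text box.
Public Sub ConfigureImeForJapaneseEntry(ByVal kanjiBox As Access.TextBox, _
                                        ByVal codeBox As Access.TextBox)
    ' Start in Hiragana conversion mode when the control gets focus.
    kanjiBox.IMEMode = acImeModeHiragana
    ' Do not carry the conversion mode over from the previous control.
    kanjiBox.IMEHold = False

    ' Part numbers, phone numbers and the like: IME disabled.
    codeBox.IMEMode = acImeModeDisable
End Sub
```

From a form's Load event this might be called as ConfigureImeForJapaneseEntry Me.txtName, Me.txtPartNo, where both control names are hypothetical.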
VBA still seems to use Code Pages, and can fail when accessing data of another Code Page. For instance, functions such as Dir() cannot handle file names containing characters outside the current Code Page.
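One commonly used alternative to Dir() is the Scripting.FileSystemObject from the Windows Script Runtime, which works with Unicode file names. The following is a minimal sketch, assuming the scripting runtime is installed and enabled on the User's machine. ListFileNames is a hypothetical helper name.

```vba
Option Explicit

' Hypothetical helper: return the names of all files in a folder,
' using FileSystemObject instead of Dir() so that names outside the
' current code page survive intact.
Public Function ListFileNames(ByVal folderPath As String) As Collection
    Dim fso As Object
    Dim fileItem As Object
    Dim names As Collection

    Set names = New Collection
    ' Late binding avoids needing a project reference to the library.
    Set fso = CreateObject("Scripting.FileSystemObject")

    For Each fileItem In fso.GetFolder(folderPath).Files
        names.Add fileItem.Name
    Next fileItem

    Set ListFileNames = names
End Function
```

The names come back intact as VBA strings, even though the Immediate window may show off-page characters as question marks.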
One of my favourite sites to cut foreign phrases and paste them into my applications is Encyclopedia - Common Phrases in different languages
I had an application that ran OK all around Europe, and on my Japanese test machine. However, it crashed in Japan with “Procedure Call or Argument is wrong”.
This was due to a hard-coded reference by English name in some library code (not mine) : Application.CommandBars("File").Controls("Exit"). This was hard to work around as, although I could use integers, the required integer would be different for different Users.
I think this is a problem due to the configuration of the Microsoft Office Multilanguage User Interface, as when the German office switched back to German Access, they also had this error (but in German).
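A possible way around the localised names is to look the control up by its Id property, not by caption or position, using CommandBars.FindControl. Built-in controls should keep the same Id whatever the interface language, though this sketch has not been tried against the Japanese and German set-ups described above. The procedure names are hypothetical. ListFileMenuIds is meant to be run once on an English machine to discover the Id of Exit.

```vba
Option Explicit

' Hypothetical helper: print the Id and caption of each control on the
' "File" command bar, so the Id of "Exit" can be noted down.
Public Sub ListFileMenuIds()
    Dim ctl As Object

    For Each ctl In Application.CommandBars("File").Controls
        Debug.Print ctl.Id, ctl.Caption
    Next ctl
End Sub

' Hypothetical helper: find a built-in control by its Id.
' Returns Nothing when no control with that Id exists.
Public Function FindBuiltInControl(ByVal controlId As Long) As Object
    Set FindBuiltInControl = Application.CommandBars.FindControl(Id:=controlId)
End Function
```

With the Id noted down, the library code could call FindBuiltInControl with that Id, check the result is not Nothing, and then use the control as before.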