Win32::MultiLanguage - Interface to IMultiLanguage I18N routines
use Win32::MultiLanguage;
# @@
Win32::MultiLanguage is an experimental wrapper module for the
Windows IMultiLanguage interfaces that come with Internet
Explorer version 4 and later. Mlang.dll implements routines for
dealing with character encodings, code pages, and locales.
- DetectInputCodepage($octets [, $flags [, $codepage]])
-
Detects the code page of the given string $octets. An optional
$flags parameter may be specified as a combination of MLDETECTCP
constants (listed below); if it is not specified, MLDETECTCP_NONE
is used. An optional $codepage can also be specified. If this value
is zero (the default), the API returns all possible encodings;
otherwise, it lists only those encodings related to $codepage.
-
Returns a reference to an array of hash references, each of which
represents a DetectEncodingInfo structure with the following keys:
-
'LangID' => ..., # primary language identifier
'CodePage' => ..., # detected Win32-defined code page
'DocPercent' => ..., # Percentage in the detected language
'Confidence' => ..., # degree to which the detected data is correct
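-
Because the result is a plain array of hashes, choosing the most likely code page needs no further module calls. The following is a minimal sketch: the helper name best_candidate and the sample data are hypothetical, and only the key names are taken from the description above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the candidate hash with the highest
# 'Confidence' from a DetectInputCodepage-style result, or nothing if
# the result is missing or empty.
sub best_candidate {
    my ($result) = @_;
    return unless ref($result) eq 'ARRAY' && @$result;
    my ($best) = sort { $b->{Confidence} <=> $a->{Confidence} } @$result;
    return $best;
}

# Made-up sample data in the documented shape (not real detector output).
my $sample = [
    { LangID => 9, CodePage => 1252,  DocPercent => 100, Confidence => 40 },
    { LangID => 9, CodePage => 65001, DocPercent => 100, Confidence => 90 },
];

my $best = best_candidate($sample);
print "Most likely code page: $best->{CodePage}\n";
```

On Windows, the argument would presumably be the return value of Win32::MultiLanguage::DetectInputCodepage($octets). Calling the function by its fully qualified name is an assumption, since the synopsis does not show what the module exports.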
- GetCodePageInfo($codepage, $langid = GetUserDefaultLangID())
-
If $codepage is known, returns a hash reference describing it, with
strings in $langid, for example:
-
'GDICharset' => 1,
'HeaderCharset' => 'utf-8',
'Flags' => 805701387,
'WebCharset' => 'utf-8',
'CodePage' => 65001,
'ProportionalFont' => 'Arial',
'BodyCharset' => 'utf-8',
'Description' => 'Unicode (UTF-8)',
'FixedWidthFont' => 'Courier New',
'FamilyCodePage' => 1200
- GetCodePageDescription($codepage, $locale = GetUserDefaultLCID())
-
Returns a human-readable description of $codepage in $locale, for example,
-
Western European (Windows)
for Windows-1252 in 1033 (English (United States)).
- GetRfc1766FromLcid($lcid = GetUserDefaultLCID())
-
Get the RFC 1766 identifier that corresponds to $lcid.
- DetectOutboundCodePage($utf8 [, $flags [, \@cp ]])
-
...
- GetCharsetInfo($charset)
-
If the $charset string is known, returns a hash reference like this:
-
InternetEncoding => 65001,
CodePage => 1200,
Charset => 'utf-8'
-
Returns nothing if the encoding is unknown.
- IsConvertible($CodePageIn, $CodePageOut)
-
Checks if transcoding from $CodePageIn to $CodePageOut can be performed.
- GetRfc1766Info($lcid = GetUserDefaultLCID(), $langid = GetUserDefaultLangID())
-
Returns information about $lcid, with strings in $langid, for example:
-
'Lcid' => 1033,
'LocaleName' => 'English (United States)',
'Rfc1766' => 'en-us'
- GetLcidFromRfc1766($rfc1766)
-
Get the locale identifier that corresponds to the $rfc1766 identifier.
Also returns a second value indicating whether "the returned LCID matches
the primary language of the RFC1766-conforming name only" [@@ in list context].
- GetFamilyCodePage($codepage)
-
The family code page that corresponds to $codepage.
- GetNumberOfCodePageInfo()
-
The number of available code pages.
- GetNumberOfScripts()
-
The number of available scripts.
- EnumCodePages($flags = 0, $langid = GetUserDefaultLangID())
-
Returns a list of code page information hash references matching $flags
with strings in $langid.
- EnumScripts($flags = 0, $langid = GetUserDefaultLangID())
-
Returns a list of script information hash references matching $flags
with strings in $langid.
- EnumRfc1766($langid = GetUserDefaultLangID())
-
Returns a list of RFC 1766 information hash references with strings in $langid.
- Transcode($CodePageIn, $CodePageOut, $String, $Flags)
-
Uses the IMLangConvertCharset::DoConversion method to convert between
the two code pages. Sets the UTF-8 flag if CodePageOut == CP_UTF8 and
croaks if the conversion is not supported. Returns nothing if it fails.
It seems the empty string cannot be transcoded by DoConversion, so do
not pass the empty string to it, unless you handle the failure in some
way.
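-
One way to handle the empty-string case described above is a small guard around the call. This is only a sketch: transcode_or_empty is a hypothetical name, and the transcoder is passed in as a code reference so the guard can be tried with a stand-in on any platform.

```perl
use strict;
use warnings;

# Hypothetical wrapper: $transcoder is any code reference taking the same
# arguments as Transcode. Empty or undefined input is returned as an
# empty string without calling the transcoder at all.
sub transcode_or_empty {
    my ($transcoder, $cp_in, $cp_out, $string, $flags) = @_;
    return '' unless defined $string && length $string;

    # Transcode croaks for unsupported conversions; trap that here.
    my $out = eval { $transcoder->($cp_in, $cp_out, $string, $flags) };
    return $out;    # undef if the call croaked or returned nothing
}

# Stand-in transcoder for demonstration only; it just upper-cases.
my $fake = sub { my (undef, undef, $s) = @_; return uc $s };

print transcode_or_empty($fake, 1252, 65001, 'abc', 0), "\n";
print length(transcode_or_empty($fake, 1252, 65001, '', 0)), "\n";
```

On Windows one would presumably pass \&Win32::MultiLanguage::Transcode as the first argument. Note that this guard treats an unsupported conversion and a failed conversion alike, returning undef in both cases.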
The following constants are currently not exported/exportable.
- MLDETECTCP_NONE
-
Default setting will be used.
- MLDETECTCP_7BIT
-
Input stream consists of 7-bit data.
- MLDETECTCP_8BIT
-
Input stream consists of 8-bit data.
- MLDETECTCP_DBCS
-
Input stream consists of double-byte data.
- MLDETECTCP_HTML
-
Input stream is an HTML page.
- MIMECONTF_MAILNEWS
-
Code page is meant to display on mail and news clients.
- MIMECONTF_BROWSER
-
Code page is meant to display on browser clients.
- MIMECONTF_MINIMAL
-
Code page is meant to display in minimal view. This value is generally
not used.
- MIMECONTF_IMPORT
-
Value that indicates that all of the import code pages should be enumerated.
- MIMECONTF_SAVABLE_MAILNEWS
-
Code page includes encodings for mail and news clients to save a document in.
- MIMECONTF_SAVABLE_BROWSER
-
Code page includes encodings for browser clients to save a document
in.
- MIMECONTF_EXPORT
-
Value that indicates that all of the export code pages should be
enumerated.
- MIMECONTF_PRIVCONVERTER
-
Value that indicates the encoding requires (or has) a private
conversion engine. A client of IEnumCodePage doesn't use this value.
- MIMECONTF_VALID
-
Value that indicates the corresponding encoding is supported on the
system.
- MIMECONTF_VALID_NLS
-
Value that indicates that only the language support file should be
validated. Normally, both the language support file and the
supporting font are checked.
- MIMECONTF_MIME_IE4
-
Value that indicates the Microsoft® Internet Explorer 4.0 MIME data
from MLang's internal data should be used.
- MIMECONTF_MIME_LATEST
-
Value that indicates that the latest MIME data from MLang's internal
data should be used.
- MIMECONTF_MIME_REGISTRY
-
Value that indicates that the MIME data stored in the registry should
be used.
- MLDETECTF_MAILNEWS
-
Not currently supported.
- MLDETECTF_BROWSER
-
Not currently supported.
- MLDETECTF_VALID
-
Detection result must be valid for conversion and text rendering.
- MLDETECTF_VALID_NLS
-
Detection result must be valid for conversion.
- MLDETECTF_PRESERVE_ORDER
-
Preserve preferred code page order. This is meaningful only if you have
set the @@puiPreferredCodePages parameter in DetectOutboundCodePage.
- MLDETECTF_PREFERRED_ONLY
-
Only return one of the preferred code pages as the detection result.
This is meaningful only if you have set the @@puiPreferredCodePages
parameter in DetectOutboundCodePage.
- MLDETECTF_FILTER_SPECIALCHAR
-
Filter out graphical symbols and punctuation.
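The MIMECONTF constants above are bit flags, so a combined value such as the 'Flags' entry in the GetCodePageInfo example (805701387) can be split into names with a bitwise AND. The sketch below is self-contained and does not load the module. The numeric values are an assumption based on the MIMECONTF enumeration in mlang.h and should be checked against your SDK header; mimecontf_names is a hypothetical helper.

```perl
use strict;
use warnings;

# ASSUMPTION: numeric values as given for the MIMECONTF enumeration in
# mlang.h; they are not stated in this document.
my @MIMECONTF = (
    [ MIMECONTF_MAILNEWS         => 0x00000001 ],
    [ MIMECONTF_BROWSER          => 0x00000002 ],
    [ MIMECONTF_MINIMAL          => 0x00000004 ],
    [ MIMECONTF_IMPORT           => 0x00000008 ],
    [ MIMECONTF_SAVABLE_MAILNEWS => 0x00000100 ],
    [ MIMECONTF_SAVABLE_BROWSER  => 0x00000200 ],
    [ MIMECONTF_EXPORT           => 0x00000400 ],
    [ MIMECONTF_PRIVCONVERTER    => 0x00010000 ],
    [ MIMECONTF_VALID            => 0x00020000 ],
    [ MIMECONTF_VALID_NLS        => 0x00040000 ],
    [ MIMECONTF_MIME_IE4         => 0x10000000 ],
    [ MIMECONTF_MIME_LATEST      => 0x20000000 ],
    [ MIMECONTF_MIME_REGISTRY    => 0x40000000 ],
);

# Return the names of all flags set in $flags, in table order.
sub mimecontf_names {
    my ($flags) = @_;
    return map { $_->[0] } grep { $flags & $_->[1] } @MIMECONTF;
}

print join(' | ', mimecontf_names(805701387)), "\n";
```

Since the constants are not exported, code that uses the module directly would presumably refer to them by their fully qualified names instead of a local table like this one.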
Please report bugs via mail to bug-Win32-MultiLanguage@rt.cpan.org, or
via http://rt.cpan.org/NoAuth/Bugs.html
Copyright (c) 2004-2008 Bjoern Hoehrmann <bjoern@hoehrmann.de>.
This module is licensed under the same terms as Perl itself.