Win32::LongPath - provide functions to access long paths and Unicode in the Windows environment
use File::Spec::Functions;
use Win32::LongPath;
use utf8;

# make a really long path w/Unicode from around the world
$path = 'c:';
while (length ($path) < 5000) {
  $path = catdir ($path, 'ελληνικά-русский-日本語-한국-中國的-עִברִית-عربي');
  if (!testL ('e', $path)) {
    mkdirL ($path) or die "unable to create $path ($^E)";
  }
}
print 'ShortPath: ' . shortpathL ($path) . "\n";

# next, create a file in the path
$file = catfile ($path, 'more interesting characters فارسی-தமிழர்-ພາສາລາວ');
openL (\$FH, '>:encoding(UTF-8)', $file) or die ("unable to open $file ($^E)");
print $FH "writing some more Unicode characters\n";
print $FH "דאס שרייבט אַ שורה אין ייִדיש.\n";
close $FH;

# now undo everything
unlinkL ($file) or die "unable to delete file ($^E)";
while ($path =~ /[\/\\]/) {
  rmdirL ($path) or die "unable to remove $path ($^E)";
  $path =~ s#[/\\][^/\\]+$##;
}
Although Perl natively supports functions that can access files in Windows, these functions fail for Unicode or long file paths (i.e. greater than the Windows MAX_PATH value, which is 260 characters). Win32::LongPath overcomes these limitations by using Windows wide-character functions, which support Unicode and extended-length paths. The end result is that you can process any file in the Windows environment without worrying about Unicode or path length.
Win32::LongPath provides replacement functions for most of the native Perl file functions. These functions attempt to imitate the native functionality and format as closely as possible and accept file paths which include Unicode characters and can be up to 32,767 characters long.
Some additional functions are also available to provide low-level features that are specific to Windows files.
File and directory paths can be provided containing any of the following components.
drive letter: The path can begin with a drive letter, as in C:/path (fullpath) or c:path (relative path).
UNC: The path can begin with a UNC path in the form \\server\share or //server/share.
extended-length: The path can begin with an extended-length prefix in the form of \\?\ or //?/.
All input paths will be converted (normalized) to a fullpath using the extended-length format and wide characters. This allows paths to be up to 32,767 characters long and to include Unicode characters. The Microsoft specification still limits the directory component to MAX_PATH (about 255) characters.
Output paths will be converted back (denormalized) to a UTF-8 fullpath that begins with a drive letter or UNC.
NOTE: See the Naming Files, Paths, and Namespaces topic in the Microsoft MSDN Library for more information about extended-length paths.
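The prefix handling described above can be sketched in plain Perl. This is only an illustration, not the module's implementation: to_extended and from_extended are hypothetical names, the sketch assumes the input is already a fullpath (it does not resolve relative paths), and it ignores the wide-character conversion entirely.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Win32::LongPath): add the
# extended-length prefix to a path that is already a fullpath.
sub to_extended {
  my ($path) = @_;
  (my $p = $path) =~ tr{/}{\\};                      # forward slashes to backslashes
  return $p if $p =~ /^\\\\\?\\/;                    # already starts with \\?\
  return '\\\\?\\UNC\\' . $1 if $p =~ /^\\\\(.+)/s;  # \\server\share form
  return '\\\\?\\' . $p if $p =~ /^[A-Za-z]:\\/;     # drive letter form
  return;                                            # relative paths are out of scope here
}

# Hypothetical helper: strip the prefix again so the path begins
# with a drive letter or UNC.
sub from_extended {
  my ($path) = @_;
  return '\\\\' . $1 if $path =~ /^\\\\\?\\UNC\\(.*)/s;
  return $1 if $path =~ /^\\\\\?\\(.*)/s;
  return $path;
}

print to_extended ('C:/temp/file.txt'), "\n";        # prints \\?\C:\temp\file.txt
print to_extended ('//server/share/file.txt'), "\n"; # prints \\?\UNC\server\share\file.txt
```

Note that a UNC path takes the \\?\UNC\ prefix rather than a plain \\?\, which is why the two cases are handled separately.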
Unless stated otherwise, all functions return true (a numeric value of 1) if successful or false (undef) if an error occurred. Generally, if a function fails it will set the $! value to the failure. However, $^E will have the more specific Windows error value.
This section lists the replacements for native Perl file functions. Since openL returns a native Perl file handle, functions that use open file handles (read, write, close, binmode, etc.) can be used as is and do not have replacement functions. Functions that are specific to the Unix environment (chmod, chown, umask, etc.) do not have replacements. A replacement for sysopen was not provided since it uses the fdopen () C library.
linkL ('goodbye', 'до свидания') or die ("unable to link file ($^E)");
FILEHANDLEREF cannot be a bareword file handle or a scalar variable. It must be a reference to a scalar value which will be set to be a Perl file handle. For example:
openL (\$fh, '<', $file) or die ("unable to open $file: ($^E)");
For the most part, MODE matches the native definition and can begin with <, >, >>, +<, +> and +>> to indicate read/write behavior. The |-, -|, <-, -, >- modes are not valid since they apply to pipes, STDIN and STDOUT. Read-only is assumed if the read/write symbols are not used. MODE can also include a colon followed by the I/O layer definition. For example:
openL (\$fh, '>:encoding(UTF-8)', $file);
PATH is the relative or fullpath name of the file. It cannot be undef for temporary files, a reference to a variable for in-memory files or a file handle.
# these are WRONG!
openL ($infh, '', $infile);
openL (INFILE, '', $infile);
openL (\$infh, '', undef);
openL (\$infh, '', \$memory);
openL (\$infh, '', INFILE);
openL (\$infh, '-|', "file<$infile");

# these are correct
# append infile to outfile
openL (\$infh, '', $infile) or die ("unable to open $infile: ($^E)");
openL (\$outfh, '>>', $outfile) or die ("unable to open $outfile: ($^E)");
while (<$infh>) {
  print $outfh $_;
}
eof ($infh) or print "terminated before EOF!\n";
close $infh;
close $outfh;
# symlinks should always be equal
symlinkL ($orig, $slink) or die ("unable to symlink file ($^E)");
$rlink = readlinkL ($slink) or die ("unable to read link ($^E)");
die ("links not equal!") if ($rlink ne $orig);

# hard links should always be undef
linkL ($orig, $hlink) or die ("unable to link file ($^E)");
!readlinkL ($hlink) or die ("should have failed!");
NOTE: See MoveFile in the Microsoft MSDN Library for more information.
# should work
renameL ('c:/file', 'c:/newfile');
# fails, can't move file to directory
renameL ('d:/file', '.');
# should work for files
renameL ('e:/file', 'f:/newfile');
# should work
renameL ('d:/dir', 'd:/topdir/subdir');
# fails, can't move directory across volumes
renameL ('c:/dir', 'd:/newdir');
atime: Last access time in seconds. NOTE: Different file systems have different time resolutions. For example, FAT has a resolution of 1 day for the access time. See the Microsoft MSDN Library for more information about file time.
attribs: File attributes as returned by the Windows GetFileAttributes () function. Use the following constants to retrieve the individual values. See the Microsoft MSDN Library for more information about the meaning of these values. Import these values into your environment if you do not want to refer to them with the Win32::LongPath:: prefix.
ctime: Although defined to be inode change time in seconds for native Perl, it will reflect the Windows creation time.
dev: The Windows serial number for the volume. See the Microsoft MSDN Library for more information.
gid: Is always zero.
ino: Is always zero.
mode: File mode (type and permissions). use Fcntl ':mode' can be used to extract the meaning of the mode. Regardless of the actual user and group permissions, the following bits are set.
Directories: S_IFDIR, S_IRWXU, S_IRWXG and S_IRWXO
Files: S_IFREG, S_IRUSR, S_IRGRP and S_IROTH
Files without read-only attribute: S_IWUSR, S_IWGRP and S_IWOTH
Files with BAT, CMD, COM and EXE extension: S_IXUSR, S_IXGRP and S_IXOTH
mtime: Last modify time in seconds. NOTE: Different file systems have different time resolutions. For example, FAT has a resolution of 2 seconds for the modification time. See the Microsoft MSDN Library for more information about file time.
nlink: Is always one.
rdev: Same as dev.
size: Total size of the file in bytes. Has a value of zero for directories.
uid: Is always zero.
use Fcntl ':mode';
use Win32::LongPath qw(:funcs :fileattr);

# get object
testL ('e', $file) or die "$file doesn't exist!";
$stat = statL ($file) or die ("unable to get stat for $file ($^E)");

# this test for directory
$stat->{mode} & S_IFDIR ? print "Directory\n" : print "File\n";
# is the same as this one
$stat->{attribs} & FILE_ATTRIBUTE_DIRECTORY ? print "Directory\n" : print "File\n";

# show file times as local time
printf "Created: %s\nAccessed: %s\nModified: %s\n",
  scalar localtime $stat->{ctime},
  scalar localtime $stat->{atime},
  scalar localtime $stat->{mtime};
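The rules for the mode value can be illustrated with a small sketch that needs only the core Fcntl module. It is a guess at how such bits might be assembled, not the module's own code: mode_for is a hypothetical helper, and its is_dir, read_only and name keys are assumptions made for the example.

```perl
use strict;
use warnings;
use Fcntl ':mode';

# Hypothetical helper (not part of Win32::LongPath): build the mode bits
# from the item type, the read-only attribute and the file extension.
sub mode_for {
  my (%item) = @_;    # assumed keys: is_dir, read_only, name

  # directories always get full permissions
  return S_IFDIR | S_IRWXU | S_IRWXG | S_IRWXO if $item{is_dir};

  # files are always readable
  my $mode = S_IFREG | S_IRUSR | S_IRGRP | S_IROTH;

  # writable unless the read-only attribute is set
  $mode |= S_IWUSR | S_IWGRP | S_IWOTH if !$item{read_only};

  # executable if the extension is BAT, CMD, COM or EXE
  $mode |= S_IXUSR | S_IXGRP | S_IXOTH
    if ($item{name} // '') =~ /\.(?:bat|cmd|com|exe)$/i;
  return $mode;
}

# show the permission bits in octal for a few sample items
printf "%04o\n", S_IMODE (mode_for (name => 'notes.txt', read_only => 1)); # prints 0444
printf "%04o\n", S_IMODE (mode_for (name => 'notes.txt'));                 # prints 0666
printf "%04o\n", S_IMODE (mode_for (name => 'setup.exe'));                 # prints 0777
```

Because the bits depend only on these attributes, the mode says nothing about what the current user is actually permitted to do with the file.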
OLDFILE can be a relative or full path. If relative path is used, it will not be converted to an extended-length path.
NOTE: See CreateSymbolicLink in the Microsoft MSDN Library for more information about symbolic links.
symlinkL ('no problem', '問題ない') or die ("unable to link file ($^E)");
symlinkL ('c:/', 'rootpath') or die ("unable to link file ($^E)");
# these are equivalent
die 'unable to read!' if !-r $file;
die 'unable to read!' if !testL ('r', $file);
The supported TYPEs and their values are:
# if you do this you don't know which failed
die ("delete of some files failed!") if !unlinkL ($f1, $f2, $f3, $f4);

# this identifies the failures
foreach my $file ($f1, $f2, $f3, $f4) {
  unlinkL ($file) or print "Unable to delete $file ($^E)\n";
}
PATH must be the path to a file.
If successful, it returns the number of files changed. It returns undef if an error occurs, and the error variable is set to the value of the last error encountered.
NOTE: This function is not supported in Cygwin and will return an error.
NOTE: Different file systems have different time resolutions. For example, FAT has a resolution of 2 seconds for modification time and 1 day for the access time. See the Microsoft MSDN Library for more information about file time.
# set back 24 hours
$yesterday = time () - (24 * 60 * 60);
utimeL ($yesterday, $yesterday, $file) or die ("unable to change time on $file ($^E)");

# this is the same as the touch command
utimeL (undef, undef, $file) or die ("unable to change time on $file ($^E)");
NOTE: Although extended-length paths are used, the Microsoft specification still limits the directory component to MAX_PATH (about 255) characters.
If PATH is not provided, the directory is changed to $ENV{HOME} if it is set, or $ENV{LOGDIR} if that is set. If neither is set then it will do nothing and return.
Unlike other functions, the PATH cannot exceed MAX_PATH characters, although it can contain Unicode and be in the extended-path format.
chdirL ($path) or die ("unable to change to $path ($^E)");
print "The current directory is: ", getcwdL (), "\n";
mkdirL ($dir) or die ("unable to create $dir ($^E)");
rmdirL ($dir) or die ("unable to delete $dir ($^E)");
Unlike the openL function which returns a native handle, the open directory functions must create a directory object and then use that object to manipulate the directory. The native Perl rewinddir, seekdir and telldir functions are not supported.
$dir = Win32::LongPath->new ();
$dir->closedirL ();
$dir->opendirL ($path) or die ("unable to open $path ($^E)");
NOTE: Only the item name is returned, not the whole path to the item.
use Win32::LongPath qw(:funcs :fileattr);
# search down the whole tree
search_tree ($rootdir);
exit 0;

sub search_tree {
  # open directory and read contents
  my $path = shift;
  my $dir = Win32::LongPath->new ();
  $dir->opendirL ($path) or die ("unable to open $path ($^E)");
  foreach my $file ($dir->readdirL ()) {
    # skip parent dir
    if ($file eq '..') {
      next;
    }

    # get file stats
    my $name = $file eq '.' ? $path : "$path/$file";
    my $stat = lstatL ($name) or die "unable to stat $name ($^E)";

    # recurse if dir
    if (($file ne '.') && (($stat->{attribs} & (FILE_ATTRIBUTE_DIRECTORY | FILE_ATTRIBUTE_REPARSE_POINT)) == FILE_ATTRIBUTE_DIRECTORY)) {
      search_tree ($name);
      next;
    }

    # output stats
    print "$name\t$stat->{attribs}\t$stat->{size}\t",
      scalar localtime $stat->{ctime}, "\t",
      scalar localtime $stat->{mtime}, "\n";
  }
  $dir->closedirL ();
  return;
}
The following functions are not native Perl functions but are useful when working with Windows.
$short = '../SYSTEM~2.PPT';
$long = abspathL ($short);
print "$short = $long\n";
# if it exists it could print something like
# ../SYSTEM~2.PPT = c:\rootdir\subdir\System File.ppt
# if not, it might print
# ../SYSTEM~2.PPT = c:\rootdir\subdir\SYSTEM~2.PPT

# probably not the same because TMP is short path
chdirL ($ENV {TMP}) or die "unable to change to TMP dir!";
$curdir = getcwdL ();
if (abspathL ($curdir) ne $curdir) {
  print "not the same!\n";
}
ATTRIBS is a string that identifies the attributes to enable or disable. A plus sign (+) enables and a minus sign (-) disables the attributes that follow. If not provided, a plus sign is assumed.
The attributes are identified by letters which can be upper or lower case. The letters and their values are:
# sets System and hidden but disables read-only
# could also be '-r+sh', 's-r+h', '+hs-r', etc.
attribL ('sh-r', $file) or die "unable to set attributes for $file ($^E)";
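To make the ATTRIBS syntax concrete, the sketch below parses such a string into the letters to enable and the letters to disable. It is an illustration only: parse_attribs is a hypothetical helper, it accepts any letter rather than checking against the supported set, and letting the last sign win when a letter is repeated is an assumption, not documented behavior.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Win32::LongPath): split an ATTRIBS
# string into the letters to enable and the letters to disable.
sub parse_attribs {
  my ($attribs) = @_;
  my (%enable, %disable);
  my $sign = '+';    # a plus sign is assumed if none is provided

  foreach my $char (split //, lc $attribs) {
    # a sign applies to all the letters that follow it
    if ($char eq '+' || $char eq '-') {
      $sign = $char;
      next;
    }
    return if $char !~ /^[a-z]$/;    # reject anything that is not a letter

    # assumption: if a letter is repeated, the last sign wins
    if ($sign eq '+') {
      $enable{$char} = 1;
      delete $disable{$char};
    }
    else {
      $disable{$char} = 1;
      delete $enable{$char};
    }
  }
  return ([sort keys %enable], [sort keys %disable]);
}

my ($on, $off) = parse_attribs ('sh-r');
print "enable: @$on, disable: @$off\n";    # prints enable: h s, disable: r
```

With this reading, 'sh-r', '-r+sh', 's-r+h' and '+hs-r' all parse to the same two sets, which matches the equivalences noted in the example above.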
copyL ($from, $to) or die "unable to copy $from to $to ($^E)";
if (shortpathL ($file) eq '') {
  die "unable to get shortpath for $file";
}
maxlen: The maximum length of path components (the characters between the backslashes; usually directory names).
name: The name of the volume.
serial: The Windows serial number for the volume.
sysflags: System flags. Indicates the features that are supported by the file system. Use the following constants to retrieve the individual values. Import these values into your environment if you do not want to refer to them with the Win32::LongPath:: prefix.
NOTE: See the Microsoft MSDN Library for more information about this feature.
use Win32::LongPath qw(:funcs :volflags);
$vol = volinfoL ($file) or die "unable to get volinfo for $file";
if (!($vol->{sysflags} & FILE_SUPPORTS_REPARSE_POINTS)) {
  die "symbolic links will not work on $vol->{name}!";
}
All functions are automatically exported by default. The following tags export specific values:
This module was developed for the Microsoft WinXP and greater environment. It also supports the Cygwin environment.
Robert Boisvert <rdbprog@gmail.com>
Many thanks to Jan Dubois for getting Windows support started with the Win32 manpage. It remains the number one module in use on almost every Windows installation of Perl.
A big thank you (どうもありがとうございました) to Yuji Shimada for the Win32::Unicode manpage. The concepts used there are the basis for much of Win32::LongPath.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.