Public bug announcement: Beware that GNU find in findutils 4.4.2 (as shipped on Ubuntu Lucid) may not find all your files when it is run in a UTF-8 locale: even if a file is there, find may silently skip printing its name. Solution: If you have non-ASCII characters in your file names, use LC_CTYPE=C find instead of find.
Example:
$ echo $LC_CTYPE
en_US.UTF-8
$ ls foo*
ls: cannot access foo*: No such file or directory
$ perl -e 'die if !open F, ">", "foo\x80bar"'
$ ls foo*
foo?bar
$ find -type f
...
./foo?bar
...
$ find -name 'foo*'
$ LC_CTYPE=C find -name 'foo*'
./foo?bar
Possible explanation: The file name matcher won't match a file if its name cannot be parsed properly in the current locale (LC_CTYPE). That is, since foo\x80bar is not valid UTF-8, the -name test in GNU find 4.4.2 fails to match it, even though find without -name still lists the file.
This strange behavior can be very surprising and possibly dangerous, especially in automated shell scripts.
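For scripts, one way to apply the workaround consistently is a small wrapper function. The sketch below is only an illustration: safe_find, demo_dir, bad_name and match_count are made-up names, and it sets LC_ALL=C rather than LC_CTYPE=C on the assumption that the script may run in an environment where LC_ALL is set (LC_ALL, when set, overrides LC_CTYPE). It also assumes a file system that accepts arbitrary bytes in file names, as typical Linux file systems do.

```shell
#!/bin/sh
# Hypothetical helper: run find in a byte-oriented locale so that -name
# patterns can also match file names that are not valid UTF-8.
# LC_ALL is used because, when set, it takes precedence over LC_CTYPE.
safe_find() {
    LC_ALL=C find "$@"
}

# Demonstration in a scratch directory.
demo_dir=$(mktemp -d)

# printf's octal escape \200 produces the single byte 0x80,
# giving the same file name as in the example above.
bad_name=$(printf 'foo\200bar')
: > "$demo_dir/$bad_name"

# Count the matches; wc -l is safe here because the name has no newline.
match_count=$(safe_find "$demo_dir" -name 'foo*' | wc -l)
echo "matches: $match_count"

rm -rf "$demo_dir"
```

Calling safe_find everywhere a script would otherwise call find keeps the locale override in one place, so it is harder to forget on a single invocation.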