find(1), trailing slashes and symbolic links to directories

Here's a nice little "gotcha" in find(1).

First, let's create a directory and a symlink to that directory. We'll add an empty file just underneath to illustrate what is going on:

$ mkdir a
$ ln -s a b
$ touch a/file

If we invoke find with a trailing slash, everything works as expected:

$ find a/
a/
a/file

$ find b/
b/
b/file

... but if we omit the trailing slash, find does not traverse the symlink:

$ find a
a
a/file

$ find b
b

This implies that any normal-looking invocation of find such as:

find /path/to/dir -name 'somefile.ext' ...

... is subtly buggy, as it won't accommodate the sysadmin replacing that path with a symlink.
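
To make that failure mode concrete, here is a minimal sketch of the scenario. The scratch directory and the names under it (real/dir, moved) are made up for illustration; only the find behaviour comes from the example above.

```shell
# Scratch area; every path below is hypothetical.
tmp=$(mktemp -d)
mkdir -p "$tmp/real/dir"
touch "$tmp/real/dir/somefile.ext"

# Before: the path is a real directory, so find descends into it.
before=$(find "$tmp/real/dir" -name 'somefile.ext')

# The "sysadmin" moves the data and leaves a symlink at the old path.
mv "$tmp/real/dir" "$tmp/moved"
ln -s "$tmp/moved" "$tmp/real/dir"

# After: the same command now sees only the symlink and does not descend.
after=$(find "$tmp/real/dir" -name 'somefile.ext')

echo "before: ${before:-<nothing>}"
echo "after:  ${after:-<nothing>}"
```

The command itself never changed; only the thing it points at did, which is what makes this easy to miss.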


This is, of course, well-covered in the find(1) manpage (spoiler: the safest option is to specify -H, or simply to append the trailing slash), but I would still class this as a "gotcha" because of the subtle difference between the trailing and non-trailing slash variants.
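
As a rough sketch of those two workarounds, using the same a/b setup as above (recreated here in a scratch directory, which is an assumption for the sake of a self-contained example):

```shell
# Recreate the setup from the top of the post in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir a
ln -s a b
touch a/file

# Default behaviour: the symlink is reported but not traversed.
plain=$(find b)

# -H: follow symlinks that are named on the command line (and only those).
with_H=$(find -H b | sort)

# Trailing slash: the path resolves to the target directory itself.
with_slash=$(find b/ | sort)

printf 'plain:\n%s\n\n-H:\n%s\n\nslash:\n%s\n' "$plain" "$with_H" "$with_slash"
```

Note that the two fixes print slightly different paths (b versus b/ as the starting point), which may matter if the output is fed to another tool.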

Putting it another way, it's completely reasonable that find doesn't follow symlinks, but when this behaviour is based on the presence of the trailing slash (a usually meaningless syntactic distinction), it crosses the Rubicon to being counter-intuitive.

Comments (3)

Felix

Well it's the same as with ls -l:
Using exactly the same setup, 'ls -l b' will show:
[...] b -> a
'ls -l b/'
[...] file
This is a UNIX distinction: if you reference the symlink itself as a filesystem object, you get information on the symlink. A '/' suffix references the content of the directory or symlink target directory, i.e. it's equivalent to '/.' - this UNIX distinction is not always useful but the tools behave consistently.

Dec. 25, 2014, 12:08 a.m. #
Fred

It's the same thing as when you do (with b still a symlink) "rm b/" which will fail because you're trying to unlink() a directory VS "rm b" which will succeed since you're doing the same on a symlink.

Dec. 25, 2014, 12:33 a.m. #
chrysn

differing behavior based on presence of the slash is not as meaningless as you portray it, even if you leave rsync out of the considerations:

* when related to file systems, it is the widespread way to distinguish between a symlink to a directory and its target. `ls -l b` and `ls -l b/` behave in exact analogy to find (the former shows a symlink, the latter its target). same goes for `cp -a`, `du` and `stat`.
* when related to urls: most servers automagically redirect you from /foo to /foo/, but serving foo/index.html under the name /foo would often result in garbled displays as relative links get offset.

Dec. 26, 2014, 10:54 a.m. #