pwalkdir: parallel implementation of filepath.WalkDir
This is a wrapper for filepath.WalkDir which may speed it up by calling multiple callback functions (WalkDirFunc) in parallel, utilizing goroutines.
By default, it uses 2*runtime.NumCPU() goroutines for callbacks. To change this, use the WalkN function, which takes an additional parameter specifying the number of goroutines (concurrency).
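To illustrate the general idea described above, here is a minimal sketch of a parallel wrapper around filepath.WalkDir. It is not the package's actual source: the names parallelWalk, entry and demoCount are hypothetical, and details such as the channel buffer size are assumptions made for the example. One goroutine walks the tree and feeds entries into a channel, while num goroutines run the callback.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"runtime"
	"sync"
	"sync/atomic"
)

// entry is one item produced by the walker and consumed by a worker.
type entry struct {
	path string
	d    fs.DirEntry
}

// parallelWalk is a hypothetical, simplified stand-in for a WalkN-style
// function: the calling goroutine walks the tree, num workers run fn.
func parallelWalk(root string, fn fs.WalkDirFunc, num int) error {
	if num < 1 {
		return fmt.Errorf("parallelWalk(%q): num must be > 0", root)
	}
	entries := make(chan entry, 2*num) // buffer size is an assumption
	errCh := make(chan error, 1)       // keeps the first callback error only

	var wg sync.WaitGroup
	wg.Add(num)
	for i := 0; i < num; i++ {
		go func() {
			defer wg.Done()
			for e := range entries {
				if err := fn(e.path, e.d, nil); err != nil {
					select {
					case errCh <- err: // first error wins
					default: // later errors are dropped
					}
				}
			}
		}()
	}

	walkErr := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err // walk errors are not passed to fn
		}
		select {
		case e := <-errCh:
			return e // a callback failed: stop queueing new entries
		default:
			entries <- entry{path: p, d: d}
			return nil
		}
	})
	close(entries)
	wg.Wait() // entries already queued may still be processed
	if walkErr != nil {
		return walkErr
	}
	select {
	case e := <-errCh: // error reported after the walk finished
		return e
	default:
		return nil
	}
}

// demoCount builds a small temporary tree and counts the visited entries.
func demoCount() (int64, error) {
	root, err := os.MkdirTemp("", "pwalk-sketch")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(root)
	if err := os.MkdirAll(filepath.Join(root, "a", "b"), 0o755); err != nil {
		return 0, err
	}
	for _, name := range []string{"f1", "a/f2", "a/b/f3"} {
		if err := os.WriteFile(filepath.Join(root, filepath.FromSlash(name)), nil, 0o644); err != nil {
			return 0, err
		}
	}
	var n int64 // updated atomically because callbacks run concurrently
	err = parallelWalk(root, func(string, fs.DirEntry, error) error {
		atomic.AddInt64(&n, 1)
		return nil
	}, 2*runtime.NumCPU())
	return n, err
}

func main() {
	n, err := demoCount()
	fmt.Println("entries visited:", n, "error:", err)
}
```

Because the callback runs in several goroutines at once, any state it shares (the counter here) has to be updated atomically or under a lock.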
pwalk vs pwalkdir
This package is very similar to pwalk, but uses filepath.WalkDir (added in Go 1.16), which does not call stat(2) on every entry and is therefore faster (up to 3x, depending on the usage scenario).
Users who are OK with requiring Go 1.16+ should switch to this implementation.
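The following sketch, using only the standard library, shows why the WalkDir-style callback can avoid stat(2): the fs.DirEntry passed to the callback already carries the entry type, and d.Info() is called only when file metadata is really needed. The function names countAndSize and demoSize, and the file names used in the demo, are hypothetical.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// countAndSize counts regular files using only the type bits from the
// directory entry, and calls d.Info() (which may need a stat) only for
// files with the given extension, summing their sizes.
func countAndSize(root, ext string) (files int, size int64, err error) {
	err = filepath.WalkDir(root, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if !d.Type().IsRegular() {
			return nil // directories, symlinks etc.: no metadata needed
		}
		files++
		if filepath.Ext(p) != ext {
			return nil // type check was enough for this entry
		}
		info, err := d.Info() // metadata fetched only when needed
		if err != nil {
			return err
		}
		size += info.Size()
		return nil
	})
	return files, size, err
}

// demoSize builds a small temporary tree and runs countAndSize on it.
func demoSize() (int, int64, error) {
	root, err := os.MkdirTemp("", "walkdir-sketch")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(root)
	if err := os.Mkdir(filepath.Join(root, "sub"), 0o755); err != nil {
		return 0, 0, err
	}
	contents := map[string]string{
		"x.log":     "12345",
		"y.txt":     "abc",
		"sub/z.log": "1234567",
	}
	for name, data := range contents {
		path := filepath.Join(root, filepath.FromSlash(name))
		if err := os.WriteFile(path, []byte(data), 0o644); err != nil {
			return 0, 0, err
		}
	}
	return countAndSize(root, ".log")
}

func main() {
	files, size, err := demoSize()
	fmt.Println("regular files:", files, "bytes in .log files:", size, "error:", err)
}
```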
Caveats
Please note the following limitations of this code:
- Unlike filepath.WalkDir, the order of calls is non-deterministic;
- Only primitive error handling is supported:
  - fs.SkipDir is not supported;
  - no errors are ever passed to WalkDirFunc;
  - once any error is returned from any WalkDirFunc instance, no more calls to WalkDirFunc are made, and the error is returned to the caller of Walk;
  - if more than one WalkDirFunc instance returns an error, only one of those errors is propagated to and returned by Walk; the others are silently discarded.
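A callback written with these caveats in mind might look like the sketch below. The collector type and the demoCollect helper are hypothetical. The callback guards shared state with a mutex (calls may run concurrently and in any order) and ignores an unwanted subtree by path prefix instead of returning fs.SkipDir; note that this only ignores the entries, the walker still descends into the directory. The visit method has the fs.WalkDirFunc signature; to keep the example standard-library only, the demo passes it to filepath.WalkDir.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"strings"
	"sync"
)

// collector gathers paths from callbacks that may run concurrently.
type collector struct {
	mu    sync.Mutex
	skip  string // subtree to ignore, used in place of fs.SkipDir
	paths []string
}

// visit has the fs.WalkDirFunc signature.
func (c *collector) visit(p string, d fs.DirEntry, err error) error {
	if err != nil {
		// Kept so the function is also correct under filepath.WalkDir.
		return err
	}
	if p == c.skip || strings.HasPrefix(p, c.skip+string(filepath.Separator)) {
		return nil // ignore the subtree without returning fs.SkipDir
	}
	c.mu.Lock()
	c.paths = append(c.paths, p)
	c.mu.Unlock()
	return nil
}

// sorted returns a sorted copy, since the call order must not be relied on.
func (c *collector) sorted() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := append([]string(nil), c.paths...)
	sort.Strings(out)
	return out
}

// demoCollect builds a tree with "keep" and "skip" directories and
// returns the collected paths.
func demoCollect() ([]string, error) {
	root, err := os.MkdirTemp("", "pwalkdir-caveats")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	for _, dir := range []string{"keep", "skip"} {
		if err := os.MkdirAll(filepath.Join(root, dir), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(filepath.Join(root, dir, "file"), nil, 0o644); err != nil {
			return nil, err
		}
	}
	c := &collector{skip: filepath.Join(root, "skip")}
	if err := filepath.WalkDir(root, c.visit); err != nil {
		return nil, err
	}
	return c.sorted(), nil
}

func main() {
	paths, err := demoCollect()
	fmt.Println(len(paths), "paths collected, error:", err)
}
```

Sorting the result after the walk is one way to get deterministic output when the order of calls is not.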
Documentation
For the official documentation, see https://pkg.go.dev/github.com/opencontainers/selinux/pkg/pwalkdir
Benchmarks
For a WalkDirFunc that consists solely of a return statement, this implementation is about 15% slower than the standard library's filepath.WalkDir.
Otherwise (if a WalkDirFunc is actually doing something) this is usually
faster, except when WalkN(..., 1) is used. Run go test -bench .
to see how different operations can benefit from it, as well as how the
level of parallelism affects the speed.