Writing the F# recursive folder visitor in C# - seq vs IEnumerable

c# f#

Question

I often use this recursive 'visitor' in F#

let rec visitor dir filter= 
    seq { yield! Directory.GetFiles(dir, filter)
          for subdir in Directory.GetDirectories(dir) do yield! visitor subdir filter} 

Recently I've started working on implementing some F# functionality in C#, and I'm trying to reproduce this as IEnumerable, but I'm having difficulty getting any further than this:

static IEnumerable<string> Visitor(string root, string filter)
{
    foreach (var file in Directory.GetFiles(root, filter))
        yield return file;
    foreach (var subdir in Directory.GetDirectories(root))
        foreach (var file in Visitor(subdir, filter))
            yield return file;
}

What I don't understand is why I have to do a double foreach in the C# version for the recursion, but not in F#... Does the seq {} implicitly do a 'concat'?

1
8
11/21/2008 11:26:38 AM

Accepted Answer

yield! does a 'flatten' operation, so it integrates the sequence you passed it into the outer sequence, implicitly performing a foreach over each element of the sequence and yield on each one.

13
11/21/2008 12:27:08 PM

There is no simple way to do this. You could workaround this by defining a C# type that can store either one value or a sequence of values - using the F# notation it would be:

type EnumerationResult<'a> = 
  | One of 'a
  | Seq of seq<'a>

(translate this to C# in any way you like :-))

Now, you could write something like:

static IEnumerable<EnumerationResult<string>> Visitor
       (string root, string filter) {
   foreach (var file in Directory.GetFiles(root, filter))
      yield return EnumerationResult.One(file);
      foreach (var subdir in Directory.GetDirectories(root))
           yield return EnumerationResult.Seq(Visitor(subdir, filter))
   }
}

To use it, you'd have to write a function that flattens EnumerationResult, which could be an extension method in C# with the following signature:

IEnumerable<T> Flatten(this IEnumerable<EnumerationResult<T>> res);

Now, this is a part where it gets tricky - if you implemented this in a straighforward way, it would still contain "forach" to iterate over the nested "Seq" results. However, I believe that you could write an optimized version that wouldn't have quadratic complexity.

Ok.. I guess this is a topic for a blog post rather than something that could be fully described here :-), but hopefully, it shows an idea that you can try following!

[EDIT: But of course, you can also use naive implementation of "Flatten" that would use "SelectMany" just to make the syntax of your C# iterator code nicer]


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon