Cached IEnumerable Implementation
Recently I had a little discussion with a colleague about IEnumerable
and that it evaluates its entries each time it is enumerated. This peaked my interest, and I decided to implement a cached version.
There are many other implementations out there, some of them very similar to mine. It still seemed like a fun little project, so here it goes.
Implementation Concept
To create a cached IEnumerable
, an implementation for IEnumerable
and IEnumerator
is needed.
The basic idea is that the CachedIEnumerable
wraps the source IEnumerable
(the one that should be cached), and includes a List
of already evaluated entries, so that they don't get evaluated multiple times.
The CachedIEnumerator
receives both, the IEnumerable
and the shared cached values. Only if it moves past the cached values, will it evaluate the next entry and add it to the cached entries. Then, all other enumerators will have access to the new entry without having to evaluate it.
Further Explanations
There are 2 details I want to elaborate on.
First, disposing the enumerator. IEnumerator
implements IDisposable
, so, the enumerator should, at least at some point, be disposed.
Since C# developers are not used to disposing collections, it seemed like a bad choice to implement IDisposable
on CachedEnumerable
. Because of this,
I decided to do it in the destructor. Additionally, it would be good to do it as soon as the source was fully enumerated, and then save that state somewhere.
This will be added in the next update of this blogpost.
Secondly, I'll address invalidating the enumerators if the source collection was modified.
Some collections, like lists, can be modified. If that happens, the source enumerator will throw an InvalidOperationException
when moving to the next value.
When that happens, the whole cache is not valid anymore. Therefore, the cache will be cleared and all existing enumerators invalidated.
This is implemented by CachedIEnumerable
, which holds a weak reference to all enumerators. I chose weak reference, so that the garbage collector can still
clean up the enumerators if they are not needed anymore.
Actual Code
Here is the implementation for CachedIEnumerable
:
public class CachedIEnumerable<T> : IEnumerable<T>
{
private List<T> cache;
private IEnumerable<T> source;
private IEnumerator<T> enumerator;
private List<WeakReference<CachedIEnumerator<T>>> enumerators;
public IReadOnlyCollection<T> CachedValues => cache;
public CachedIEnumerable(IEnumerable<T> source)
{
this.cache = new List<T>();
this.source = source;
this.enumerator = source.GetEnumerator();
this.enumerators = new List<WeakReference<CachedIEnumerator<T>>>();
}
~CachedIEnumerable()
{
enumerator.Dispose();
}
public IEnumerator<T> GetEnumerator()
{
var e = new CachedIEnumerator<T>(cache, enumerator, reset);
enumerators.Add(new WeakReference<CachedIEnumerator<T>>(e));
return e;
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
/// <summary>
/// resets the cached IEnumerable and invalidates all existing enumerators
/// called when the underlying collection changed
/// </summary>
private void reset()
{
cache.Clear(); //the cache is not valid at this point
//the enumerator of the underlying collection must be renewed (it threw the exception)
enumerator.Dispose();
enumerator = source.GetEnumerator();
//invalidate all enumerators of the cached enumerable
foreach(var weakEnumeratorReference in enumerators)
{
CachedIEnumerator<T> e;
if (weakEnumeratorReference.TryGetTarget(out e)) {
e.Invalidate();
e.Dispose();
}
}
enumerators.Clear(); //since all enumerators are invalidated, there is no need to reference them any more
}
}
And here the implementation for CachedIEnumerator
:
public class CachedIEnumerator<T> : IEnumerator<T>
{
private List<T> sharedCache;
private IEnumerator<T> enumerator;
private int currentIndex = -1;
private bool valid = true;
private bool pastEnd = false;
private Action invalidateAll;
public CachedIEnumerator(List<T> sharedCache, IEnumerator<T> enumerator, Action invalidateAll)
{
this.sharedCache = sharedCache;
this.enumerator = enumerator;
this.invalidateAll = invalidateAll;
}
private T current;
public T Current
{
get
{
if (!pastEnd) return current;
else throw new InvalidOperationException("The enumerator reached past the end of the collection");
}
private set
{
current = value;
}
}
object IEnumerator.Current
{
get { return Current; }
}
public void Dispose()
{
}
public void Invalidate()
{
valid = false;
}
public bool MoveNext()
{
if (!valid)
throw new InvalidOperationException("The collection was modified after the enumerator was created.");
currentIndex++;
if (currentIndex < sharedCache.Count)
{
Current = sharedCache[currentIndex];
return true;
}
try
{
var success = enumerator.MoveNext();
if (success)
{
Current = enumerator.Current;
sharedCache.Add(Current);
return true;
}
else
{
Current = default(T);
pastEnd = true;
return false;
}
}
catch (InvalidOperationException)
{
invalidateAll();
throw;
}
}
public void Reset()
{
throw new NotSupportedException();
}
}
Additional Info
You can also find the code on Github as a dotnet core project, including a lot of tests.
Happy Coding! :)
Update 1 (16.10.2016)
I have removed the implementation for IList
, because LINQ treats lists differently in some cases, since it expects that generators don't implement IList
. If you want to see the original blogpost, which includes the IList
implementation, you find it here.
The updated code also includes additional edge-case handling, like a changing source list.