## Tamed FileSystemWatcher

[For details see my article “Gezähmte Beobachter” (“Tamed Watchers”) in Windows Developer Magazin, Sept 2015]

[Feb 21, 2018: Fixed bug in code: Filter not being applied with OrderByOldestFirst=false]
[Mar 02, 2018: Fixed existing files event argument fileName passing full path with OrderByOldestFirst=false]

This post shares robust wrappers around the standard FileSystemWatcher (FSW), fixing problems commonly encountered when using it to monitor the file system in real-world applications.

Buffering and Recovering FSW

Simply replace the standard FSW with my BufferingFileSystemWatcher and you no longer need to worry about InternalBufferOverflowExceptions. Use my RecoveringFileSystemWatcher to automatically recover from typical transient watch path accessibility problems. Download complete code. For a file system watcher using polling instead of file system events see my FilePoller. To process files detected in either way I recommend using TPL DataFlow ActionBlocks. They let you process files without having to spawn a Thread or create a Task yourself, and they let you configure the desired degree of parallelism. For tips about handling lots of files and using contig.exe to defragment NTFS indexes see NTFS performance and large volumes of files and directories.

# Typical FileSystemWatcher Problems

If used properly the standard FileSystemWatcher (FSW) is way better than its reputation. However, there are typical problems one may encounter when first using the FSW:

• Unexpected events.
• Lost events.
• InternalBufferOverflowExceptions.
• No option to report files existing before the FSW started.

The standard FileSystemWatcher:

• Reports exceptions via its Error event. Not via raising exceptions!
• Does not report files that existed before .EnableRaisingEvents =True.
• Does detect network disruptions, but does not automatically recover from them.
• Does automatically handle renames of its watch path.

# Unexpected Event Floods

Some applications trigger lots of file system events for a single action. The FSW simply reports these events. Ex: Excel triggers 15 NTFS events for 4 different files when creating a single new .xlsx file, and on saving changes it triggers 8 events for 3 different files, none of which is the Changed event for the modified file that one would naively expect:

File system event flood triggered by Excel for single actions like “Save”

If you have control over the file producer you can easily tame this event flood by renaming the files to the watched directory or watched extension only when they are totally complete. For .NET applications a file rename is an atomic operation. Your watcher now only needs to watch for a single Renamed event per file. An easy way to create lots of files and file system events is repeatedly copying and pasting all files in a directory via Ctrl+A, Ctrl+C, Ctrl+V.
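The rename approach can be sketched on the producer side as follows. This is only a minimal illustration under assumptions: the class name `AtomicFileProducer`, the method `WriteThenRename` and the `.tmp` extension are hypothetical and not part of my download, and the temporary file is assumed to live in the same directory as the final file so that the move is a plain rename.

```csharp
using System;
using System.IO;

static class AtomicFileProducer
{
    // Writes the content to "<finalPath>.tmp" first and renames it only when it is
    // complete, so a watcher filtering on the final extension never sees a partial file.
    public static void WriteThenRename(string finalPath, string content)
    {
        string tempPath = finalPath + ".tmp"; // hypothetical extension the consumer does not watch
        File.WriteAllText(tempPath, content);
        File.Move(tempPath, finalPath);       // throws IOException if finalPath already exists
    }
}
```

A consumer watching `*.log` would then only need to handle the Renamed event for files produced this way.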

# Lost Events

The FSW only throws exceptions on problems when setting its properties. While watching for changes the FSW reports exceptions via its Error event only and does not raise them. This is typical for the Event-based Asynchronous Pattern (EAP). To prevent exceptions going unnoticed one must handle the Error event. Strangely, this important FSW Error event does not show up in the Win Forms designer; my wrappers fix this. Maybe not discovering the need to implement an OnError handler, and being misled by the FSW throwing exceptions only until it is started, is the reason for the wrongly perceived bad reliability of the FSW. The FSW minimizes usage of precious non-paged memory via its InternalBufferSize property and reports an InternalBufferOverflowException “Too many changes at once in directory …” via the Error event when the buffer is exceeded.
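For the plain FSW, handling the Error event could look roughly like this. It is a minimal sketch, not code from my wrappers; the helper name `WatcherSetup.Create` and the `onError` callback are assumptions made for illustration.

```csharp
using System;
using System.IO;

static class WatcherSetup
{
    // Creates a watcher whose Error event is wired up before events are enabled.
    public static FileSystemWatcher Create(string path, Action<Exception> onError)
    {
        var watcher = new FileSystemWatcher(path);
        watcher.Created += (_, e) => Console.WriteLine($"Created: {e.FullPath}");
        // Without this handler problems such as InternalBufferOverflowException go unnoticed.
        watcher.Error += (_, e) => onError(e.GetException());
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

Typical usage would be `using (var watcher = WatcherSetup.Create(@"C:\temp", ex => Console.Error.WriteLine(ex))) { Console.ReadLine(); }`, where the path is just a placeholder.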

# Robust File Operations

For file operations one should consider using those from the Microsoft.VisualBasic namespace because they are more robust than those in System.IO and offer more features like automatically overwriting existing files.

When working with files this is typically done in a situation where producers and consumers work concurrently. Here if/then constructs are not robust because changes can happen in between. Thus one should rely on exceptions instead.
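As a small illustration of relying on exceptions instead of if/then checks, the following sketch tries to move a detected file and treats a missing source as a normal outcome. The names `RobustMove` and `TryClaim` are hypothetical; the sketch assumes the destination directory exists.

```csharp
using System;
using System.IO;

static class RobustMove
{
    // Tries to move the file. Returns false if the source is already gone,
    // e.g. because another consumer processed it first.
    public static bool TryClaim(string sourcePath, string destinationPath)
    {
        try
        {
            // No File.Exists check: the file could vanish between check and move.
            File.Move(sourcePath, destinationPath);
            return true;
        }
        catch (FileNotFoundException)
        {
            return false;
        }
    }
}
```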

With IOException.HResult no longer being protected since .NET 4.5 and C# 6.0 finally supporting exception filters (VB.NET always had this feature) it is now easy to handle typical exceptions like FileNotFound, FileInUse or NetworkNameNoLongerAvailable.
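A sketch of such an exception filter is shown below. The class and method names are hypothetical, and the constants are my assumption of the HRESULT forms of the Win32 error codes ERROR_SHARING_VIOLATION (32) and ERROR_LOCK_VIOLATION (33) as reported on Windows.

```csharp
using System;
using System.IO;

static class FileErrors
{
    // Assumed HRESULT forms (0x8007xxxx) of Win32 errors 32 and 33.
    const int SharingViolation = unchecked((int)0x80070020);
    const int LockViolation = unchecked((int)0x80070021);

    public static bool IsFileInUse(IOException ex) =>
        ex.HResult == SharingViolation || ex.HResult == LockViolation;

    // Returns the file content, or null if the file is currently in use.
    // All other exceptions propagate to the caller.
    public static string TryReadAllText(string path)
    {
        try
        {
            return File.ReadAllText(path);
        }
        catch (IOException ex) when (IsFileInUse(ex))
        {
            return null; // probably still being written, the caller may retry later
        }
    }
}
```

Because the filter runs before the catch block is entered, exceptions that do not match keep their original stack trace.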

# BufferingFileSystemWatcher

My BufferingFileSystemWatcher wraps the standard FSW:

• Buffers FSW events in a BlockingCollection. It is better to buffer in a BlockingCollection than consuming precious non-paged memory by increasing InternalBufferSize.
• Supports limiting the BlockingCollection.BoundedCapacity via the EventQueueSize property. Must be set before EnableRaisingEvents=True!
• Offers reporting existing files via a new event Existed. Existing files are reported before any ones detected by NTFS events.
• Offers sorting events by oldest (existing) file first. Gets enabled when subscribing to the events Existed or All.
• Offers a new event All reporting all FSW events. Real-world apps typically subscribe to all change types because the FSW change types triggered often do not correspond to the action of the producer. Ex: On saving changes Excel triggers 8 events for 3 different files with no(!) change event for the changed file, see picture above.
• Offers the Error event in Win Forms designer.
• Wraps FSW via composition, not breaking its API. Thus you can simply replace your FileSystemWatcher instances with BufferingFileSystemWatcher and your InternalBufferOverflowExceptions are gone without increasing InternalBufferSize.
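The buffering idea from the list above can be illustrated in isolation with a small, self-contained sketch. It is not my BufferingFileSystemWatcher: the class `EventBuffer`, its members and the use of InvalidOperationException for the overflow are simplifications made up for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified illustration of buffering FSW events in a BlockingCollection.
sealed class EventBuffer
{
    private readonly BlockingCollection<FileSystemEventArgs> _queue;
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();

    public event FileSystemEventHandler All;
    public event ErrorEventHandler Error;

    public EventBuffer(int capacity)
    {
        _queue = new BlockingCollection<FileSystemEventArgs>(capacity);
    }

    // Attach to the FileSystemWatcher events. Returns quickly so that the
    // watcher's small internal buffer gets drained as fast as possible.
    public void Enqueue(object sender, FileSystemEventArgs e)
    {
        if (!_queue.TryAdd(e)) // false when the bounded capacity is reached
            Error?.Invoke(this, new ErrorEventArgs(
                new InvalidOperationException($"Event queue size {_queue.BoundedCapacity} exceeded.")));
    }

    // Raises the buffered events on a background task until Stop is called.
    public Task Start()
    {
        return Task.Run(() =>
        {
            try
            {
                foreach (var e in _queue.GetConsumingEnumerable(_cts.Token))
                    All?.Invoke(this, e);
            }
            catch (OperationCanceledException) { } // normal shutdown
        });
    }

    public void Stop() => _cts.Cancel();
}
```

With a standard watcher this would be wired up as `watcher.Created += buffer.Enqueue;` (and likewise for Changed, Deleted and Renamed), while slow consumers subscribe to `buffer.All`.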

The following listing shows key parts of the BufferingFileSystemWatcher in C# 6.0:

    public class BufferingFileSystemWatcher : Component
    {
        private FileSystemWatcher _containedFSW = null;
        ...
        public BufferingFileSystemWatcher()
        {
            _containedFSW = new FileSystemWatcher();
        }
        ...
        public bool EnableRaisingEvents
        {
            get
            {
                return _containedFSW.EnableRaisingEvents;
            }
            set
            {
                if (_containedFSW.EnableRaisingEvents == value) return;

                StopRaisingBufferedEvents();
                _cancellationTokenSource = new CancellationTokenSource();

                _containedFSW.EnableRaisingEvents = value;
                if (value) RaiseBufferedEventsUntilCancelled();
            }
        }
        ...
        public event FileSystemEventHandler All
        {
            add
            {
                if (_onAllChangesHandler == null)
                {
                    _containedFSW.Created += BufferEvent;
                    _containedFSW.Changed += BufferEvent;
                    _containedFSW.Renamed += BufferEvent;
                    _containedFSW.Deleted += BufferEvent;
                }
                _onAllChangesHandler += value;
            }
            ...
        }
        ...
        private void BufferEvent(object _, FileSystemEventArgs e)
        {
            if (!_fileSystemEventBuffer.TryAdd(e))
            {
                var ex = new EventQueueOverflowException($"Event queue size {_fileSystemEventBuffer.BoundedCapacity} events exceeded.");
                InvokeHandler(_onErrorHandler, new ErrorEventArgs(ex));
            }
        }
        ...
        private void RaiseBufferedEventsUntilCancelled()
        {
            Task.Run(() =>
            {
                try
                {
                    if (_onExistedHandler != null || _onAllChangesHandler != null)
                        NotifyExistingFiles();

                    foreach (FileSystemEventArgs e in _fileSystemEventBuffer.GetConsumingEnumerable(_cancellationTokenSource.Token))
                    {
                        if (_onAllChangesHandler != null)
                            InvokeHandler(_onAllChangesHandler, e);
                        else
                        {
                            switch (e.ChangeType)
                            {
                                case WatcherChangeTypes.Created:
                                    InvokeHandler(_onCreatedHandler, e);
                                    break;
                                case WatcherChangeTypes.Changed:
                                    InvokeHandler(_onChangedHandler, e);
                                    break;
                                case WatcherChangeTypes.Deleted:
                                    InvokeHandler(_onDeletedHandler, e);
                                    break;
                                case WatcherChangeTypes.Renamed:
                                    InvokeHandler(_onRenamedHandler, e as RenamedEventArgs);
                                    break;
                            }
                        }
                    }
                }
                catch (OperationCanceledException)
                { } //ignore
                catch (Exception ex)
                {
                    BufferingFileSystemWatcher_Error(this, new ErrorEventArgs(ex));
                }
            });
        }

        private void NotifyExistingFiles()
        {
            var searchSubDirectoriesOption = (IncludeSubdirectories) ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly;
            if (OrderByOldestFirst)
            {
                var sortedFileNames = from fi in new DirectoryInfo(Path).GetFiles(Filter, searchSubDirectoriesOption)
                                      orderby fi.LastWriteTime ascending
                                      select fi.Name;
                foreach (var fileName in sortedFileNames)
                {
                    InvokeHandler(_onExistedHandler, new FileSystemEventArgs(WatcherChangeTypes.All, Path, fileName));
                    InvokeHandler(_onAllChangesHandler, new FileSystemEventArgs(WatcherChangeTypes.All, Path, fileName));
                }
            }
            else
            {
                // Apply the Filter and pass only the file name, as all other events do
                foreach (var fileName in Directory.EnumerateFiles(Path, Filter, searchSubDirectoriesOption))
                {
                    var name = System.IO.Path.GetFileName(fileName);
                    InvokeHandler(_onExistedHandler, new FileSystemEventArgs(WatcherChangeTypes.All, Path, name));
                    InvokeHandler(_onAllChangesHandler, new FileSystemEventArgs(WatcherChangeTypes.All, Path, name));
                }
            }
        }
        ...
    }

# RecoveringFileSystemWatcher

My RecoveringFileSystemWatcher wraps the BufferingFileSystemWatcher:

• Detects and reports watch path accessibility problems, using a poll timer monitoring the watch path and the FSW Error event. For robustness, restarting from the Error event is not done directly but also via the timer!
• Automatically recovers from watch path accessibility problems by restarting the BufferingFileSystemWatcher. New files created during the outage are reported via the Existed event.
• Allows the consumer to cancel auto recovery for selected exceptions using e.Handled=True.

The following listing shows key parts of the RecoveringFileSystemWatcher:

    public class RecoveringFileSystemWatcher : BufferingFileSystemWatcher
    {
        public TimeSpan DirectoryMonitorInterval = TimeSpan.FromMinutes(5);
        public TimeSpan DirectoryRetryInterval = TimeSpan.FromSeconds(5);

        private System.Threading.Timer _monitorTimer = null;
        ...
        public new bool EnableRaisingEvents
        {
            get { return base.EnableRaisingEvents; }
            set
            {
                if (value == EnableRaisingEvents) return;

                base.EnableRaisingEvents = value;
                if (EnableRaisingEvents)
                {
                    base.Error += BufferingFileSystemWatcher_Error;
                    Start();
                }
                else
                {
                    base.Error -= BufferingFileSystemWatcher_Error;
                }
            }
        }

        private void Start()
        {
            try
            {
                _monitorTimer = new System.Threading.Timer(_monitorTimer_Elapsed);

                Disposed += (_, __) =>
                {
                    _monitorTimer.Dispose();
                    _trace.Info("Obeying cancel request");
                };

                ReStartIfNeccessary(TimeSpan.Zero);
            }
            ...
        }

        private void _monitorTimer_Elapsed(object state)
        {
            try
            {
                if (!Directory.Exists(Path))
                {
                    throw new DirectoryNotFoundException($"Directory not found {Path}");
                }
                else
                {
                    _trace.Info($"Directory {Path} accessibility is OK.");
                    if (!EnableRaisingEvents)
                    {
                        EnableRaisingEvents = true;
                        if (_isRecovering)
                            _trace.Warn("<== Watcher recovered");
                    }
                    ReStartIfNeccessary(DirectoryMonitorInterval);
                }
            }
            catch (Exception ex) when (
                ex is FileNotFoundException
                || ex is DirectoryNotFoundException)
            {
                if (ExceptionWasHandledByCaller(ex))
                    return;

                if (_isRecovering)
                {
                    _trace.Warn("...retrying");
                }
                else
                {
                    _isRecovering = true;
                }

                EnableRaisingEvents = false;
                _isRecovering = true;
                ReStartIfNeccessary(DirectoryRetryInterval);
            }
            catch (Exception ex)
            {
                _trace.Error($"Unexpected error: {ex}");
                throw;
            }
        }

        private void ReStartIfNeccessary(TimeSpan delay)
        {
            try
            {
                // One-shot timer: re-armed after each check with the given delay
                _monitorTimer.Change(delay, Timeout.InfiniteTimeSpan);
            }
            catch (ObjectDisposedException)
            { } //ignore timer disposed
        }

        private void BufferingFileSystemWatcher_Error(object sender, ErrorEventArgs e)
        {
            var ex = e.GetException();
            if (ExceptionWasHandledByCaller(ex))
                return;

            EnableRaisingEvents = false;

            if (ex is InternalBufferOverflowException || ex is EventQueueOverflowException)
            {
                ReStartIfNeccessary(DirectoryRetryInterval);
            }
            else if (ex is Win32Exception && (ex.HResult == NetworkNameNoLongerAvailable || ex.HResult == AccessIsDenied))
            {
                ReStartIfNeccessary(DirectoryRetryInterval);
            }
        }
        ...
    }

The following picture shows a console trace of the RecoveringFileSystemWatcher working and auto recovering:

RecoveringFileSystemWatcher working and auto recovering


### 27 Responses to Tamed FileSystemWatcher

1. Peter,

Thanks for sharing!

I noticed that the delete handler is not working. It is due to a missing _onDeletedHandler += value in the delete event handler.

Regards,
Michal

2. Bart says:

Is there a chance that the dlls are going to be available via nuget?

• Peter Meinl says:

No. I have not tested this version well enough for that.

3. Joy says:

Hi Peter,
Thanks for sharing this very useful application.
I have created a filewatcher service which was failing when there were more than 40 files. This has been resolved by using your BufferingFileSystemWatcher API in my application. I have tested with more than 200 files and they all processed! But the processing time increased significantly. Is there any place in your code I may look into to improve the performance?

Thanks a lot!
Joy

• Peter Meinl says:

I doubt that your performance problems are related to my BufferingFileSystemWatcher. You might be able to optimize your performance by using a TPL DataFlow ActionBlock

• pregunton says:

did you improve performance ? tips ?

• Peter Meinl says:

What kind of performance problems do you mean?

4. Anghell says:

Hi Peter, Thanks for sharing!
Greetings from Mexico City!
Angel

5. greggman says:

Thank you for sharing. I got here in a roundabout way, but basically watching a Samba share with FileSystemWatcher, at least in my setup (Windows 10 x64, share on NAS), notices single file changes but responds with

System.IO.InternalBufferOverflowException: Too many changes at once in directory:\\10.0.1.1\gregg\src.

Searching eventually led me here but even with the solution above I get the same error. I guess that is a deeper issue with .NET

Started looking through the dotcorefx code but it’s way out of my experience.

https://github.com/dotnet/corefx/blob/master/src/System.IO.FileSystem.Watcher/src/System/IO/FileSystemWatcher.Win32.cs

6. D Soa says:

This looks like a very cool tool. I noticed that BufferingFileSystemWatcher.NotifyExistingFiles does not honor the Filter but returns all files if OrderByOldestFirst is false. Move the definition of searchSubDirectoriesOption out of the if and change the foreach in the else to this:

foreach (var fileName in Directory.EnumerateFiles(Path, Filter, searchSubDirectoriesOption))

Cheers!

• Peter Meinl says:

Thanks for reporting this bug and describing a solution.

7. D Soa says:

Could you explain how the ReStartIfNecessary() is restarting the BufferingFileSystemWatcher? I don’t see that the contained FileSystemWatcher is getting recreated. We have run into issues where FileSystemWatcher just stops monitoring a path. Does changing EnableRaisingEvents somehow “poke” it to tell it to look again?

Thanks!

• Peter Meinl says:

Only the RecoveringFileSystemWatcher does recover automatically from watch path accessibility problems (not the BufferingFileSystemWatcher).

You can verify the auto recovery working by using the TestConsoleApp:
– Run the .exe (don’t run it from within Visual Studio).
– Uncomment RunRecoveringWatcher();
– Uncomment watcher.DirectoryMonitorInterval = TimeSpan.FromSeconds(10);
Because the default (for real world usage) is 5 min.

I don’t recall why I simply change EnableRaisingEvents and do not create a new instance of the contained FSW. Maybe creating a new instance caused unwanted side effects or it was simply unnecessary.

• D Soa says:

Yeah, maybe it isn’t necessary to create a new one because internally the FSW will create a new directory handle when you enable raising events again. Creating a new one would be difficult also because you’d have to re-set all the options that were set on the old one. If this works, it is much simpler!

Thanks for the quick replies.

8. Tim Meyer says:

Great work on this! I noticed that the Existing files event argument passes the full path and file name back for FullPath and Name when not sorting by oldest file. All the other events just pass the file name in the Name property. Something similar to below should remedy in NotifyExistingFiles method.
foreach (var fileName in Directory.EnumerateFiles(Path, Filter, searchSubDirectoriesOption))
{
var name = System.IO.Path.GetFileName(fileName);
InvokeHandler(_onExistedHandler, new FileSystemEventArgs(WatcherChangeTypes.All, Path, name));
InvokeHandler(_onAllChangesHandler, new FileSystemEventArgs(WatcherChangeTypes.All, Path, name));
}

Thanks again for the article and code.

• Peter Meinl says:

9. Rich says:

Hi:
Thanks for the solution and all your hard work. I usually go with polling change events by file size comparisons after the creation event has been issued. I now need a solution to monitor 8 separate directories of very large log files on an AWS instance and when all the files have been copied; the newly rsynced files need to be copied together into a separate directory and automatically updated into a MSSQL database and then have all the file handles and garbage collection done on all open file handle resources. What do you think is the best solution for this? Since this action will run once a day and move files from 8 different directories to one directory and then upload the one file (copied together from the 8 separate files) into an MSSQL database. Do you think it would be better to watch each separate file with a FSW or, since we know that all the files will definitely be created by 12:00 A.M. E.S.T., just check if the files exist and save the file size and re-check 50 seconds later to make sure the file size hasn’t changed and then just move the files into one directory and concatenate them together? Then have the one file which will always have the one name attribute change and that is the date the file was created and use that as a way to trigger the upload of the concatenated file. Sorry for the lengthy explanation but in short we have 8 different files rsynced to different directories. The file names will be exactly the same with the date appended at the end. Each of these separate files will be copied together and placed into a separate upload directory and when the concatenated file has finished copying it needs to be loaded into a MSSQL database. This would be in an ASP.NET MVC 4.5 framework application running MSSQL 2008 R2.
I’m just trying to decide should I go with SSIS and a .INI file in each 8 separate directories and just upload each separately or go with 8 FSW and one more in the upload directory to trigger the load event on a timed creation response from the FSW in the upload directory. Thanks so much for your time

• Peter Meinl says:

As always: keep your solution as simple as possible!

Make it robust – e.g. against:
– the producer system hanging
– producer unexpectedly completing at a different time than 12:00
– your consumer app crashing and being restarted and then processing the same input files again

With file-based interfaces it is always good if you have control over the producer. Instead of having to test in the consumer if the producer has completed file creation one can then simply have the producer rename the file to an extension being watched by the consumer once it is complete – e.g. produce filexyz.TMP and when complete rename it to filexyz.LOG. Renaming in Windows is an atomic operation and once the renamed file (.LOG) exists the consumer knows it is complete.

If you can’t control the producer instead of comparing file sizes consider trying to open the file exclusively. As long as you can’t get exclusive access the file is still being written to:

    'HRESULT forms (0x8007xxxx) of the Win32 error codes 32 and 33
    Const ERROR_SHARING_VIOLATION As Integer = &H80070020
    Const ERROR_LOCK_VIOLATION As Integer = &H80070021

    Function IsFileComplete(filePath As String) As Boolean
        Try
            'Opening with FileShare.None only succeeds if nobody else has the file open
            Using fs As New IO.FileStream(filePath, IO.FileMode.Open, IO.FileAccess.Read, IO.FileShare.None)
            End Using
            Return True
        Catch ex As IO.IOException When IsFileInUse(ex)
            Return False
        End Try
    End Function

    Function IsFileInUse(ByVal ex As IO.IOException) As Boolean
        'https://msdn.microsoft.com/en-us/library/ms681382%28v=VS.85%29.aspx
        Return (ex.HResult = ERROR_SHARING_VIOLATION OrElse ex.HResult = ERROR_LOCK_VIOLATION)
    End Function

I would simply check if all 8 files are complete (in case you used rename in the producer simply test for their existence) and if so do the import.
If not the next run of your consumer will get them.

Consider creating a Windows Scheduled Task or Windows Service instead of an ASP.NET MVC app.

For a scheduled task create a console app and register it with the Win Task Scheduler to run every hour or so.

For a service see my post: Windows Service Worker Options

__Do
____Check if all files are complete
____Import to SQLServer
____ ….WaitOne(TimeSpan.FromHours(1))
__Loop

You should not have to deal with file handles and garbage collection. Make use of the C# or VB.NET “using” statement and delete the input files after importing (or maybe for even better robustness delete them in the producer at the start of the resync step you mentioned).

Add a field to the database data holding the date suffix you mentioned so you can handle the situation when running multiple times with the same input files.

Consider using MSSQL FileStream to store files in the DB. See my post Managing BLOBs using SQL Server FileStream via EF and WCF streaming

If you have control over the producer and thus can use the rename trick I mentioned you might need no code at all and could simply create an SSIS package and schedule it as periodically running SQL Server job. If all files (*.LOG) exist they are complete and will get imported. You just have to skip the import if one of the files does not exist yet.

10. Ben says:

Hello,

What an impressive piece of work! Thanks for sharing. After struggling a long time with all the annoying FileSystemWatcher drawbacks (event buffer filled, recovery,…) to watch a Samba share, it seems like I found a winner here.

I’m now testing RecoveryFileSystemWatcher (after BufferingFileSystemWatcher) and I wonder how to respond to an error event; currently I get a type conversion error:

// RecoveryFileSystemWatcher version
_fswatcher.Created += new FileSystemEventHandler(event_created);
_fswatcher.Error += new ErrorEventHandler(event_error);

// OK
private void event_created(object _src, FileSystemEventArgs _arg)
{
// …
}

// Not OK: Error message in VS is “Cannot implicitly convert type System.IO.EventHandler to ‘System.EventHandler”

private void event_error(object _src, ErrorEventArgs _arg)
{
Exception _ex = _arg.GetException();
string _msg;

// … Code here
}

Could you give me a hint on how to handle errors from ‘RecoveryFileSystemWatcher’?

Plus what would be your design advice for post-detection processing (Tasks, threads,…) for a large amount of file (on Created & Existed events)?

Thanks

• Peter Meinl says:

The signature of the Error event differs between BufferingFileSystemWatcher and RecoveringFileSystemWatcher. The RecoveringFileSystemWatcher’s event args are of type FileWatcherErrorEventArgs, not ErrorEventArgs.

The TestConsoleApp in my sample code notifies errors like this:

    watcher.Error += (_, e) => { WriteLineInColor(e.Error.Message, ConsoleColor.Red); };

FileWatcherErrorEventArgs being derived from HandledEventArgs allows the consumer to suppress auto error handling by setting e.Handled = true:

    watcher.Error += (_, e) =>
    {
        WriteLineInColor($"Suppressing auto handling of error {e.Error.Message}", ConsoleColor.Red);
        // some error handling
        e.Handled = true;
    };

I generally implement file processors as Windows services. See my posts “Windows Service Worker Options” and “Simple WCF”.

How to best process your files depends on the type of processing required. If the processing is compute-bound you might want to parallelize to make full use of the processors available. A nice solution for this is using TPL Dataflow. See my post “A WebCrawler demonstrating the Beauty of TPL Dataflow”.

Here is a simplified example using a TPL Dataflow ActionBlock with MaxDegreeOfParallelism = Environment.ProcessorCount to process files in parallel:

    Imports System.IO
    Imports System.Text.RegularExpressions
    Imports System.Threading
    Imports System.Threading.Tasks.Dataflow

    Sub Main()
        'Reading file content asynchronously
        Dim reader As New TransformBlock(Of String, String)(
            Async Function(path)
                Using sr As New StreamReader(path)
                    Dim fileString = Await sr.ReadToEndAsync()
                    Return fileString
                End Using
            End Function)

        'Processing file content concurrently
        Dim totalWordCount As Integer
        Dim processor As New ActionBlock(Of String)(
            Sub(content)
                Dim allWordsRegEx As New Regex("\S+")
                Dim wordCount = allWordsRegEx.Matches(content).Count
                Interlocked.Add(totalWordCount, wordCount)
                Debug.Print("Processing result={0}", wordCount)
            End Sub,
            New ExecutionDataflowBlockOptions With {.MaxDegreeOfParallelism = Environment.ProcessorCount})
        processor.Completion.ContinueWith(
            Sub()
                Console.WriteLine("totalWordCount={0}", totalWordCount)
            End Sub)

        'Feed the reader output into the processor and pass completion along
        reader.LinkTo(processor, New DataflowLinkOptions With {.PropagateCompletion = True})

        For Each path In Directory.EnumerateFiles("x:\temp1\testfiles\in", "*.txt")
            reader.Post(path)
        Next
        reader.Complete()

        PromptUser("Working… Press to exit:", ConsoleColor.White)
    End Sub

11. Dave Newman says:

Hi Peter –

Thanks for sharing this fix for a “very powerful tool” (MSDN quote) — with limitations. I haven’t done much with it yet but I am encouraged by the comments here!

June 6, 2015: Changed from .NET 4.6 to 4.5

Why the change from 4.6 to 4.5? Was this due to an issue with your code fixes or just to make it easier to use in codebases that reference 4.5?

Thanks,
Dave

• Peter Meinl says:

Sorry, but I don’t remember the reason. I assume it was “just to make it easier to use in codebases that reference 4.5”. The code does not use any “special” framework features – only C# language features that were relatively new at this time (e.g. string interpolation with {}).

• Dave Newman says:

Thanks for the reply, and thanks for creating and sharing this! I was considering doing something like this so I could reliably wait on files with FileSystemWatcher, but you have saved me the effort!

12. Jaime Stuardo says:

Hello… thanks for sharing this.

I have a question.

I am using RecoveringFileSystemWatcher and assign TimeSpan.FromSeconds(10) to DirectoryMonitorInterval property because a sample appears that way.

For its name, I guess there is a timer that checks the monitored folder and if there is a file, the event will be raised. That is for assuring all files will be detected, in case the Created event is not raised when it should (situation that may occur some times). Please confirm this.

On the other hand, when should EventQueueCapacity property be used? if not specified, the default of int.MaxValue is used.

Thanks
Jaime

• Peter Meinl says:

The standard FSW does only detect network disruptions, but does not automatically recover from them.
My RecoveringFileSystemWatcher uses the DirectoryMonitorInterval (default 5 min, the sample uses 10 sec to make testing easier) to detect and automatically recover from all watch path accessibility problems. This is not intended to compensate for lost events. There are no lost events because FSW errors and watch path accessibility errors are handled automatically by restarting the FSW and checking for existing files. In an older version I polled for overlooked files but this proved to be unnecessary and introduced race conditions with the standard event handling.

EventQueueCapacity (default int.MaxValue) exposes the upper bound of the underlying BlockingCollection used to queue FSW events. I don’t see any reason to limit it but exposed the parameter in case someone might need it.