Managing BLOBs using SQL Server FileStream via EF and WCF streaming

Für mehr Informationen zu diesem Thema in Deutsch siehe meinen Artikel “Stream das BLOB” im dotnetpro Magazin 6/2012.

With FileStream SQL Server 2008+ stores BLOBs in the NTFS file system instead of its table store and offers fast streaming access.

This post shows how to manage FileStream BLOBs via the Entity Framework (EF 4.1) DbContext API and how to stream them via Windows Communication Foundation (WCF).
Download sample code.
View UI demo video.

The main options for storing large objects (BLOBs) in SQL Server are:

  • Store BLOB in a DB table.
  • Store BLOB via FileStream in NTFS.
  • Store BLOB via FileTable in NTFS.
  • Store File in NTFS with a reference to its path in a DB Table.

If you prefer to manually store your BLOBs as NFTS Files consider adding transaction support, see Using Transactional NTFS (TxF) with SQL Server in .NET.

BLOB storage options

FileStream and its internal NFTS folder

FileTable and ist virtual NTFS file share

The main features of FileStream are:

  • BLOBs are stored in NTFS.
  • Transactional consistency.
  • Access via SQL and fast Win32 streaming .
  • Consistent data recovery via full Backup/Restore support.

SQL Server 2012 FileTable exposes the FileStream BLOBS via an NFTS share. This enables users to mange (create, read, edit, rename, delete) the BLOBs stored in SQL Server via Windows Explorer and applications to manage the BLOBs via the standard System.IO API. While access via the NTFS share is non-transactional, you can still access the same BLOBs transactionally via the underlying FileStream.

Sample Application

The sample application uses a 3-layer architecture: WPF Client, WCF Service with streaming over NetTcpBinding and SQL Server with FileStream. 2-layer architectures are generally a bad choice, because they allow end users to access the data bypassing application logic (see Beware of End-User Permissions in Databases).

Sample Architecture

Database Table

Specifying the FILESTREAM attribute on a varbinary(max) column causes SQL Server to store the data in the local NTFS file system. To be updatable via Entity Framework the ID must be PRIMARY KEY, UNIQE is not sufficient.


CREATE TABLE FileStreamTest.dbo.Document
(
	[ID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL PRIMARY KEY,
	[Content] VARBINARY(MAX) FILESTREAM NULL,
	[Name] VARCHAR(255) NOT NULL,
	[Type] VARCHAR(5) NULL
)

WCF Service

If a message contains a stream it must be the only parameter in the body. Thus the other parameters are defined in the message header. For best performance the service accesses the BLOBs via the SQLFileStream class and not via SQL.

SQLFileStream can only be used within a transaction.

For downstreaming the BLOB (see Function GetDocument) we defer closing the transaction to the Dispose method. If you dislike tying the transaction length to the length of client operations you can alternatively return a copy of the BLOB stream. Depending on BLOB size this may cost a substantial amount of  service memory.

<ServiceContract()>
Public Interface IDocumentService
    <OperationContract()> Sub SaveDocument(documentInfo As DocumentInfo)
    <OperationContract()> Function GetDocumentKeysByFullText(searchCondition As String) As List(Of DocumentKeys)
    <OperationContract()> Function GetDocument(ID As Guid) As System.IO.Stream
End Interface

<MessageContract()>
Public Class DocumentInfo
    <MessageBodyMember()> Public DocumentStream As IO.Stream
    <MessageHeader()> Public FileName As String
    <MessageHeader()> Public DocumentStreamLength As Long
End Class

<MessageContract()>
Public Class DocumentKeys
    <DataMember()> Public ID As Guid
    <DataMember()> Public Name As String
End Class

<ServiceBehaviorAttribute(InstanceContextMode:=InstanceContextMode.PerCall)>
Public Class DocumentService
    Implements IDocumentService, IDisposable

    Private Shared _readCommittedTransactionOptions As New TransactionOptions With {.IsolationLevel = Transactions.IsolationLevel.ReadCommitted, .Timeout = TransactionManager.DefaultTimeout}

    Class FileStreamContext
        Public Property InternalPath As String
        Public Property TransactionContext As Byte()
    End Class

    Public Sub SaveDocument(documentInfo As DocumentInfo) Implements IDocumentService.SaveDocument
        _trace.Debug("")
        Dim stw = Stopwatch.StartNew
        Using tran As New TransactionScope(TransactionScopeOption.RequiresNew, _readCommittedTransactionOptions)
            Using DB As New FileStreamTestEntities
                'Insert structured data
                Dim ID = Guid.NewGuid
                'Initialize  Content with an empty stream because NULL does not create a file to be used with SQLFileStream later
                Dim enptyStream = New Byte() {}
                DB.Document.Add(New Document With {.Content = enptyStream, .ID = ID, .Name = documentInfo.FileName, .Type = IO.Path.GetExtension(documentInfo.FileName)})
                DB.SaveChanges()
                Dim fsc = DB.Database.SqlQuery(Of FileStreamContext)("SELECT Content.PathName() AS InternalPath, GET_FILESTREAM_TRANSACTION_CONTEXT() AS TransactionContext FROM Document WHERE ID={0}", ID).First

                'Insert filestream via Win32 IO (avoiding SQL for best performance)
                Using sqlFileStream = New SqlFileStream(fsc.InternalPath, fsc.TransactionContext, IO.FileAccess.Write)
                    documentInfo.DocumentStream.CopyTo(sqlFileStream)
                End Using
                tran.Complete()
            End Using
        End Using
        stw.Stop()
        _trace.InfoFormat("Saved document {0} length {1:n0} elapsed {2}", documentInfo.FileName, documentInfo.DocumentStreamLength, stw.Elapsed)
    End Sub

    Private _downloadCommittableTransaction As CommittableTransaction
    Private _downloadSqlFileStream As SqlFileStream
    Private _downloadDBContext As New FileStreamTestEntities
    Public Function GetDocument(ID As Guid) As System.IO.Stream Implements IDocumentService.GetDocument
        _trace.Debug("")
        _downloadCommittableTransaction = New Transactions.CommittableTransaction(_readCommittedTransactionOptions)
        With _downloadDBContext.Database
            .Connection.Open()
            .Connection.EnlistTransaction(_downloadCommittableTransaction)
            Dim fsc = .SqlQuery(Of FileStreamContext)("SELECT Content.PathName() AS InternalPath, GET_FILESTREAM_TRANSACTION_CONTEXT() as TransactionContext FROM Document WHERE ID={0}", ID).First
            _downloadSqlFileStream = New SqlFileStream(fsc.InternalPath, fsc.TransactionContext, IO.FileAccess.Read)
        End With
        Return _downloadSqlFileStream
    End Function

    Private disposedValue As Boolean
    Protected Overridable Sub Dispose(disposing As Boolean)
        _trace.Debug("xx")
        If Not Me.disposedValue Then
            If disposing Then
                If _downloadSqlFileStream IsNot Nothing Then _downloadSqlFileStream.Dispose()

                'Because we don't know the intrinsics of connection pooling and DbContext
                'we dispose everything explicitely to ensure we are not leaking any resources
                If _downloadCommittableTransaction IsNot Nothing Then
                    If _downloadCommittableTransaction.TransactionInformation.Status = TransactionStatus.Active Then _downloadCommittableTransaction.Commit()
                    _downloadCommittableTransaction.Dispose()
                End If
                If _downloadDBContext IsNot Nothing Then
                    _downloadDBContext.Database.Connection.Dispose()
                    _downloadDBContext.Dispose()
                End If
            End If
        End If
        Me.disposedValue = True
    End Sub

WCF Configuration

For best performance we use NetTcpBinding. To protect against denial-of-service attacks MaxReceivedMessageSize must be specified even with streaming. MaxBufferSize only applies to the message header.

 <netTcpBinding>
   <binding name="NetTcpBindingEndpoint" closeTimeout="00:01:00" openTimeout="00:01:00"
       receiveTimeout="00:20:00" sendTimeout="00:20:00"
       transferMode="Streamed"
       maxReceivedMessageSize="9223372036854775807" maxBufferSize="65536">
     <security mode="Transport">
       <transport clientCredentialType="Windows" protectionLevel="EncryptAndSign" />
       <message clientCredentialType="Windows" />
     </security>
   </binding>
 </netTcpBinding>

WPF Client

To avoid blocking the UI thread, synchronous service calls are made via Task Parallel Library (TPL) tasks. To allow the completion code to update the UI  TaskScheduler FromCurrentSynchronizationContext is used.

Private Sub _saveToDBButtton_Click(sender As System.Object, e As System.Windows.RoutedEventArgs)
    Dim ofd As New Microsoft.Win32.OpenFileDialog
    If Not ofd.ShowDialog Then Exit Sub

    'Note: getting .Length from FileStream would hurt performance
    Dim fileInfo = New FileInfo(ofd.FileName)
    Dim stw = Stopwatch.StartNew
    Task.Factory.StartNew(Sub()
                              Using localFileStream As New IO.FileStream(fileInfo.FullName, IO.FileMode.Open, IO.FileAccess.Read, IO.FileShare.Read)
                                  _documentService.SaveDocument(fileInfo.Length, fileInfo.Name, localFileStream)
                              End Using
                          End Sub).ContinueWith(Sub()
                                                    stw.Stop()
                                                    Me.UploadedDocuments.Insert(0, New UploadInfo With {.FileName = fileInfo.Name, .LengthKB = fileInfo.Length / 1024, .ElapsedTime = stw.Elapsed})
                                                End Sub, TaskScheduler.FromCurrentSynchronizationContext)
End Sub

Private Sub _showButton_Click(sender As System.Object, e As System.Windows.RoutedEventArgs)
    Dim documentKey = CType(_foundDocumentsGrid.SelectedItem, DocumentKeys)
    Task.Factory.StartNew(Sub()
                              'See: Those pesky temp files http://web.archive.org/web/20080724182921/http://dotnet.org.za/markn/archive/2006/04/15/51594.aspx
                              Dim tempFilePath = Path.Combine(Path.GetTempPath, Path.GetRandomFileName)
                              tempFilePath = Path.ChangeExtension(tempFilePath, Path.GetExtension(documentKey.Name))
                              _tempFileCollection.AddFile(tempFilePath, keepFile:=False)

                              Using sqlFileStream = _documentService.GetDocument(documentKey.ID),
                                    localFileStream As New IO.FileStream(tempFilePath, FileMode.Create, FileAccess.Write)

                                  sqlFileStream.CopyTo(localFileStream)
                              End Using
                              Process.Start(tempFilePath)
                          End Sub)
End Sub

The following log4net service trace shows how a uploading a small document (thread 21) overtakes a large document (thread 20):

11.03.2012 09:35:31,018| [10]Main.Main@9 | -> Service starting
11.03.2012 09:36:20,865| [20]DocumentService.Dispose@70 | xx
11.03.2012 09:36:54,890| [20]DocumentService..ctor@26 | **
11.03.2012 09:36:54,901| [20]DocumentService.SaveDocument@30 | -> DVC11.wmv
11.03.2012 09:36:56,923| [21]DocumentService..ctor@26 | **
11.03.2012 09:36:56,924| [21]DocumentService.SaveDocument@30 | -> DSC01775.JPG
11.03.2012 09:36:57,126| [21]DocumentService.SaveDocument@50 | <- Saved document:DSC01775.JPG length:2,131,220B elapsed:00:00:00.2011388
11.03.2012 09:36:57,128| [21]DocumentService.Dispose@70 | xx
11.03.2012 09:37:03,435| [20]DocumentService.SaveDocument@50 | <- Saved document:DVC11.wmv length:404,618,331B elapsed:00:00:08.5327333
11.03.2012 09:37:03,437| [20]DocumentService.Dispose@70 | xx

Enabling FileStream

To enable FileStream for a database:

  • Enable FileStream in Windows:
    SQLFileStream_7_EnableFS
  • Enable FileStream with streaming access for the database instance.
  • Create a filegroup for FileStream data.
  • Create a table with a varbinary(max) column using the FILESTREAM attribute.
--Enable FileStream and with file I/O streaming access in SQL Server
EXEC sp_configure filestream_access_level, 2
RECONFIGURE

--Create FileStream enabled database
CREATE DATABASE FileStreamTest
ON
PRIMARY ( NAME = FileStreamTest,
	FILENAME = 'D:\FileStreamDemo\DB\FileStreamTest.mdf'),
FILEGROUP FileStreamGroup1 CONTAINS FILESTREAM( NAME = FileStreamTest_Files,
	FILENAME = 'D:\FileStreamDemo\DB\FileStreamTest_Files')
LOG ON  ( NAME = FileStreamTest_Log,
	FILENAME = 'D:\FileStreamDemo\DB\FileStreamTest_Log.ldf')
GO

About Peter Meinl

IT Consultant
This entry was posted in Computers and Internet and tagged , , , , , , . Bookmark the permalink.

4 Responses to Managing BLOBs using SQL Server FileStream via EF and WCF streaming

  1. veera says:

    what happens when _downloadSqlFileStream is not closed but DocumentService is disposed??

    • Peter Meinl says:

      My sample service implements closing the connection in via IDisposable (see sample code). The service is configured with InstanceContextMode.PerCall, thus the service instance gets disposed (and the transaction closed) after the client call completes. You can verify the behavior by tracing.

      I you try to close the connection in GetDocument with a large document you will receive an WCF error “connection unexpectedly closed” because it will be closed while the client is still downstreaming.

  2. veera says:

    Until a CommittableTransaction has been committed, all the resources involved with the transaction are still locked.

  3. Pingback: FILESTREAM, Entity Framework and WCF Sample Application « svenaelterman

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s