We want to save multiple type of data, documents, image, audio or video. BLOB data type can be used to store this kind of data but it’s limited to 2GB in size. BLOB data slows down database performance and takes significant resources to read. For detail, refer here ;
https://shahzadblog.com/2021/03/29/store-and-retrieve-blob-data-without-file-stream/
Filestream allows storing these large documents onto the file system itself. There is not limit on storage as compared to 2GB in BLOB storage. Application can access these files using NTFS streaming API. Filestream is not a SQL Server data type to store data.
Filestream does not use the buffer pool memory for caching these objects. If we cache these large objects in the SQL Server memory, it will cause issues for normal database processing. Therefore, FILESTREAM provides caching at the system cache providing performance benefits without affecting core SQL Server performance.
This is how to enable Filestream feature in SQL server;
Since I will be doing a lot of inserts/update/delete so I am going to change recovery mode of selected database to “Simple”.
Make sure that you have enable FILESTREAM on database level;
Checking the status of Filestream;
--check filestream status
USE master
Go
EXEC sp_configure
Go
0 = FILESTREAM is disabled.
1 = only T-SQL access to FILESTREAM data is allowed.
2 = T-SQL access and local streaming access is allowed.
3 = T-SQL access and local and remote streaming access is allowed.
You can also use this query to check status;
exec sp_configure 'filestream access level'
I am going to create my first table with filestream enabled;
--Create a table to work with Filestream
USE TODO
GO
CREATE TABLE [dbo].[Document](
[Doc_Num] uniqueidentifier ROWGUIDCOL unique NOT NULL DEFAULT newsequentialid(),
[Extension] [varchar](50) NULL,
[FileName] [varchar](200) NULL,
[FileSize] numeric(10,5),
[Doc_Content] [varbinary](max) FILESTREAM
) ON [PRIMARY]
Oops. I got this error;
I haven’t enabled Filestream feature in SQL server. Open SQL Server Configuration manager. Right click on SQL Server Services, properties and FileSTREAM.
Restart SQL server service.
Re-run table creation query. This time you will see this error;
What it means that we have not created File group. Remember Filestream require its own File groups. Right click on your database;
After creating filegroup, open database properties again and click on files. Click on Add to add new file and connect with previously created filegroup.
Navigate to your filesystem path and you should see this;
As shown in the above image, the $FSLOG directory and the filestream.hdr file have been created. $FSLOG is like SQL server T-Log, and filestream.hdr contains metadata of FILESTREAM. Make sure that you do not change or edit those files.
re-run our table creation sql script. This time table should be created successfully.
It’s time to add a new excel document in this table. Run following sql script;
--insert a new excel document
INSERT [dbo].[Document1] ([Extension] ,[FileName], [FileSize], [Doc_Content] )
SELECT 'xlsx', 'SampleExcelDoc.xlsx', 10, [Doc_Data].*
FROM OPENROWSET
(BULK 'D:\SimpleExcelDoc.xlsx', SINGLE_BLOB) [Doc_Data]
Run this statement;
SELECT * FROM [dbo].[Document]
Here is the file on file system;
This file can be opened directly with compatible application program, in this example Excel. Right click, open with and select Excel. When we insert the document using the FILESTREAM feature, SQL Server copies the file into the FILESTREAM path. It does not change the file properties.
To access Filestream data using managed API, we need to figure out Filestream content path. Here is the script;
--access Filestream data using managed API
SELECT Doc_Content.PathName() AS FilePath, *
FROM [dbo].[Document]
You should see these results;
To delete a document from Filestream container, run this;
--delete files from Filestream container.
USE TODO
GO
DELETE FROM dbo.Document
WHERE Doc_Num = '658020B3-1690-EB11-90BB-00155D717606'
In this case, we can see that file still exists for FILESTREAM document. SQL Server removes the old files using the garbage collection process. This process will remove the old file if it is no longer required by the SQL Server. Do you remember about a folder, $log into the FILESTREAM path. It works similar to a transaction log in the SQL Server database.
SQL Server has a particular internal filestream_tombstone table, which contains an entry for this old file.
SELECT * FROM sys.internal_tables where name like ‘filestream_tomb%’
The garbage collection process removes the old file only if there is no entry for the file in the filestream_tombstone table. We cannot see the content of this filestream_tombstone table. However, if you wish to do so, use the DAC connection.
The database recovery model plays an essential role in the entry in the filestream_tombstone table.
- Simple recovery mode: In a database in simple recovery model, the file is removed on next checkpoint
- Full recovery model: If the database is in full recovery mode, we need to perform a transaction log backup to remove this file automatically
We can run the garbage collector process file manually to remove the files from the container before the automatic garbage collector process cleans the file. We can do it using ‘sp_filestream_force_garbage_collection’ stored procedure.
Resource
https://www.pmichaels.net/tag/default-filestream-filegroup-is-not-available-in-database/
https://www.sqlshack.com/managing-data-with-sql-server-filestream-tables/
https://www.pmichaels.net/tag/default-filestream-filegroup-is-not-available-in-database/
https://www.jitendrazaa.com/blog/sql/sqlserver/export-documents-saved-as-blob-binary-from-sql-server/