Getting Documents Out of a Sharepoint (MOSS) Database

April 9, 2008

Our migration to the new MOSS farm has gone really well. We had a small hiccup with our CRM integration code, but it was quickly solved and I was able to leave last night a little after 11pm.

So a little background before I get to my utility. Our CRM system contains a custom entity that, on creation, uses a callout to interact with the MOSS web service and create a new site collection that we use to store documents and the like related to that entity. It works really well when MOSS is responding – but last year we had lots of timeout problems, and when that happened the developer wrote a utility to try to create the site again.

It looks like MOSS has somehow become confused about the nature of those sites that were recreated, because they are not browsable in the new farm. Users were getting to them in the old farm, and the documents are all in the database that was copied to the new farm…you just “can’t get there from here.”

Our solution will be to create new sites for those that we know are problems using our existing template and security process, but we needed to get the files out of the database with the folder structure intact so that we could just drag and drop them into the new site shared document lists. Full credit: this is largely taken from a blog post here – the only real change I made was to use a command line argument to select the shared document list for a specific site without having to recompile the program.

Here we go…and I apologize, wordpress “vanilla” doesn’t do a very good job of preserving all my pretty resharper-assisted formatting from Visual Studio.

using System;
using System.Data.SqlClient;
using System.IO;
namespace SharepointFiles
class Program
static void Main(string[] args)
// Connection string to your Sharepoint content database - substitute Server and DB names
string dbConnString = "server=servername;database=WSS_Content_DatabaseName;trusted_connection=true";

// If no command line argument then quit
if(args.Length == 0)
Console.WriteLine("ERROR - No Command Line Argument Provided");

// The ProgramId is the internal identifier for the site I want to get from the system.
string programId = args[0];

// Set up the SQL Connection and Command
SqlConnection con = new SqlConnection(dbConnString);
SqlCommand cmd = con.CreateCommand();

// Query that will return the file path, name and content from the database - Note the use of the formatted string.
cmd.CommandText =
"SELECT d.DirName, d.LeafName, ds.Content FROM AllDocs d JOIN AllDocStreams ds on d.Id = ds.Id
AND d.Level = ds.Level where d.DirName like 'sites/{0}/shared documents%'
and d.DirName not like '%shared documents/forms%' order by d.DirName, d.LeafName",

// Execute a datareader for the command
SqlDataReader reader = cmd.ExecuteReader();
string dirname = (string) reader["dirname"];
string leafname = (string) reader["leafname"];
// Note that the directory and files will be created from the directory that the exe file is located
Console.WriteLine("Creating Directory: " + dirname);

// Filestream is used for the file binary data
FileStream fs = new FileStream(dirname + "/" + leafname,FileMode.Create,FileAccess.Write);
BinaryWriter writer = new BinaryWriter(fs);

int bufferSize = 1000000;
long startindex = 0;
long retval = 0;
byte[] outbyte = new byte[bufferSize];

// Create the file
retval = reader.GetBytes(2, startindex, outbyte, 0, bufferSize);
startindex += bufferSize;
writer.Write(outbyte, 0, (int) retval);
} while (retval == bufferSize);

// Finish writing the file

Console.WriteLine("Finished writing file: " + leafname);