The new XML format for Office files makes them interesting to build. Each part of a document lives in its own XML file, and relationship files determine how all of the files relate to each other within the package. If you rename a typical .docx, .pptx, or .xlsx file to a .zip extension, you can open it in WinZip or as a Windows compressed folder. The first thing you will notice in a .docx package is that there is a _rels folder and a word folder. The _rels folder contains the package-level relationships, which determine where the rest of the data files lie. The word folder contains these data files. Within the word folder is a relationship file (in its own _rels subfolder) and at least a document.xml. There could also be a styles.xml, header1.xml, or footer1.xml, among others. The folder structure will look similar to the following drawing.
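The same layout can also be inspected in code rather than by renaming the file. The following is a minimal sketch, assuming a file named sample.docx (a hypothetical name) and the System.IO.Packaging classes that ship with .NET; it prints the package-level relationships and then every part in the package.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // WindowsBase.dll on the .NET Framework

class ListPackageParts
{
    static void Main(string[] args)
    {
        // Hypothetical default path; pass your own .docx as the first argument.
        string path = args.Length > 0 ? args[0] : "sample.docx";

        using (Package package = Package.Open(path, FileMode.Open, FileAccess.Read))
        {
            // Package-level relationships, stored in the _rels folder.
            foreach (PackageRelationship rel in package.GetRelationships())
            {
                Console.WriteLine("rel  {0} -> {1}", rel.RelationshipType, rel.TargetUri);
            }

            // Every part in the package, for example /word/document.xml.
            foreach (PackagePart part in package.GetParts())
            {
                Console.WriteLine("part {0} ({1})", part.Uri, part.ContentType);
            }
        }
    }
}
```

On a typical Word document, one of the relationships should point at word/document.xml, which is how a reader of the package finds the main document.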

To work with this package structure more efficiently, you can download the Microsoft SDK for Open XML Formats. The Microsoft.Office.DocumentFormat.OpenXml.Packaging namespace exposes classes that make it easier to construct or modify a Word 2007 document.
The first class you will want to access is the WordprocessingDocument class. You create an object in code similar to this:
WordprocessingDocument myDocument = WordprocessingDocument.Open(myPackage);
where myPackage is a System.IO.Packaging.Package. You can also create a WordprocessingDocument from a Stream or a file.
Once you have the WordprocessingDocument, you can get back the main part of the document through the MainDocumentPart property:
MainDocumentPart mainDoc = myDocument.MainDocumentPart;
The MainDocumentPart represents the main part of the document which resides in the document.xml file.
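Put together, the steps so far might look like the sketch below. It assumes a file named sample.docx (a hypothetical name) and uses the DocumentFormat.OpenXml.Packaging namespace from later SDK releases; with the original SDK the namespace is Microsoft.Office.DocumentFormat.OpenXml.Packaging. It opens the document once from a file path and once from a stream, both read-only, and prints the location of the main part.

```csharp
using System;
using System.IO;
// Assumption: a later Open XML SDK release. With the original SDK, use
// Microsoft.Office.DocumentFormat.OpenXml.Packaging instead.
using DocumentFormat.OpenXml.Packaging;

class OpenWordDocument
{
    static void Main(string[] args)
    {
        // Hypothetical default path; pass your own .docx as the first argument.
        string path = args.Length > 0 ? args[0] : "sample.docx";

        // Open directly from a file. The second argument is isEditable;
        // false opens the document read-only.
        using (WordprocessingDocument fromFile = WordprocessingDocument.Open(path, false))
        {
            MainDocumentPart mainDoc = fromFile.MainDocumentPart;
            Console.WriteLine("Main part (file):   {0}", mainDoc.Uri);
        }

        // Open from a stream instead, again read-only.
        using (FileStream stream = File.OpenRead(path))
        using (WordprocessingDocument fromStream = WordprocessingDocument.Open(stream, false))
        {
            MainDocumentPart mainDoc = fromStream.MainDocumentPart;
            Console.WriteLine("Main part (stream): {0}", mainDoc.Uri);
        }
    }
}
```

Wrapping each document in a using block closes the underlying package when you are done with it, which matters once you open documents for editing.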
You can also gain access to headers and footers:
IEnumerable<HeaderPart> headerParts = mainDoc.HeaderParts;
The HeaderParts property returns an IEnumerable of the part type, so you declare the variable as IEnumerable<HeaderPart>. You can then enumerate the headerParts collection. Headers and footers are exposed as collections because a document can have multiple headers and footers.
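As a rough illustration of enumerating these collections, the sketch below lists every header and footer part in a document. The file name sample.docx is hypothetical, and the namespace is the one used by later SDK releases (the original SDK uses Microsoft.Office.DocumentFormat.OpenXml.Packaging). FooterParts is the footer counterpart of HeaderParts.

```csharp
using System;
using System.Collections.Generic;
// Assumption: a later Open XML SDK release. With the original SDK, use
// Microsoft.Office.DocumentFormat.OpenXml.Packaging instead.
using DocumentFormat.OpenXml.Packaging;

class ListHeadersAndFooters
{
    static void Main(string[] args)
    {
        // Hypothetical default path; pass your own .docx as the first argument.
        string path = args.Length > 0 ? args[0] : "sample.docx";

        using (WordprocessingDocument myDocument = WordprocessingDocument.Open(path, false))
        {
            MainDocumentPart mainDoc = myDocument.MainDocumentPart;

            // A document can carry several headers (first page, odd, even, per section).
            IEnumerable<HeaderPart> headerParts = mainDoc.HeaderParts;
            foreach (HeaderPart header in headerParts)
            {
                Console.WriteLine("header: {0}", header.Uri);
            }

            // Footers are exposed the same way.
            IEnumerable<FooterPart> footerParts = mainDoc.FooterParts;
            foreach (FooterPart footer in footerParts)
            {
                Console.WriteLine("footer: {0}", footer.Uri);
            }
        }
    }
}
```

A document with no headers or footers simply produces no output, since both collections are then empty.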
In the next part I will get back the XML and modify it.
Would you be willing to post your code for programmatically adding a document to a SharePoint Document Library?
I don't want to use the SharePoint Object Model as the code will run on individual client PCs, not the server where SharePoint lives.
I also need to populate some custom fields I have in the library at the same time.
Hope I'm not asking too much!
Many thanks in advance
Joe
Joe,
I actually used the object model to add the document to the document library. You could use a custom web service that is floating around the Internet to upload documents through the web service and then you could run it from client computers.
For this project I could have the code run on the server and so simply utilized the object model.
(Also I no longer have this code as I have switched companies since the original post.)
Michael