Word processing

ImagePart: images inside a Word document

In Open XML, an image placed in a Word document is represented by a Blip object, which is an ordinary element.

Object->OpenXmlElement->OpenXmlCompositeElement->Blip

Object->OpenXmlElement->OpenXmlCompositeElement->Paragraph

Object->OpenXmlElement->OpenXmlCompositeElement->Run

Object->OpenXmlElement->OpenXmlCompositeElement->Drawing

Object->OpenXmlElement->OpenXmlCompositeElement->OfficeMath

Object->OpenXmlElement->OpenXmlCompositeElement->SdtElement

The classes derived from SdtElement are:

SdtBlock

SdtCell

SdtRow

SdtRun

SdtRunRuby

Object->OpenXmlPartContainer->OpenXmlPart->ImagePart

Object->OpenXmlPartContainer->OpenXmlPart->MainDocumentPart

Object->OpenXmlElement->OpenXmlCompositeElement->BodyType->Body

An OOXML document contains many parts. MainDocumentPart exposes them through properties such as:

ChartParts

DiagramDataParts

FooterParts

HeaderParts

.....

Each of these properties has a type of the form IEnumerable<T>, for example IEnumerable<ImagePart> for ImageParts.
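As a rough sketch of how these collections can be used, the following counts a few kinds of parts attached to the main document part. It assumes the Open XML SDK (the DocumentFormat.OpenXml package) is referenced; the class and method names (`PartCounter`, `CountParts`) are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;

static class PartCounter
{
    // Returns the number of parts per collection for the document at docxPath.
    public static Dictionary<string, int> CountParts(string docxPath)
    {
        // Open read-only; nothing is modified.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, false))
        {
            MainDocumentPart main = doc.MainDocumentPart;
            // Each property is an IEnumerable<T> of the matching part type.
            return new Dictionary<string, int>
            {
                { "ImageParts", main.ImageParts.Count() },
                { "HeaderParts", main.HeaderParts.Count() },
                { "FooterParts", main.FooterParts.Count() },
                { "ChartParts", main.ChartParts.Count() },
            };
        }
    }
}
```

Calling `PartCounter.CountParts("sample.docx")` (a placeholder path) returns one count per collection name.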

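DrawingML is the language that defines the graphical objects in an OOXML document: pictures, shapes, charts, and so on. It is a declarative definition language, comparable in that sense to SQL.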

http://www.officeopenxml.com/drwOverview.php

A Drawing object represents an image and is embedded in a Run. A typical XML fragment looks like this:

<w:r>
  <w:drawing>
    <wp:inline>
      …
    </wp:inline>
  </w:drawing>
</w:r>

This mirrors the way text is embedded:

Run->Text

Run->Drawing

Getting information about each picture on a PowerPoint slide:

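    // Namespaces: DocumentFormat.OpenXml.Packaging (ImagePart),
    // DocumentFormat.OpenXml.Presentation (Picture). `slide` is a Slide element.
    foreach (var pic in slide.Descendants<Picture>())
    {
        // First, get the relationship id of the image.
        string rId = pic.BlipFill?.Blip?.Embed?.Value;
        if (rId == null) continue; // no embedded image (e.g. a linked picture)

        ImagePart imagePart = (ImagePart)slide.SlidePart.GetPartById(rId);

        // Part URI inside the package (e.g. /ppt/media/image1.jpeg);
        // this is not the original file name on disk.
        Console.Out.WriteLine(imagePart.Uri.OriginalString);
        // Content type (e.g. image/jpeg).
        Console.Out.WriteLine("content-type: {0}", imagePart.ContentType);

        // GetStream() returns the image data; dispose the stream and image when done.
        using (var stream = imagePart.GetStream())
        using (var img = System.Drawing.Image.FromStream(stream))
        {
            // Save to disk. Note that every iteration overwrites the same file.
            img.Save(@"c:\temp\temp.jpg");
        }
    }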

An image in OOXML consists of the image data plus an ID. The ID can be found in the body, and the image data can be replaced or overwritten.

<w:p>
  <w:r>
    <w:drawing>
      <wp:inline>
        <wp:extent cx="3200400" cy="704850" /> <!-- describes the size of the image -->
        <wp:docPr id="2" name="Picture 1" descr="filename.JPG" />
        <a:graphic>
          <a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
            <pic:pic>
              <pic:nvPicPr>
                <pic:cNvPr id="0" name="filename.JPG" />
                <pic:cNvPicPr />
              </pic:nvPicPr>
              <pic:blipFill>
                <a:blip r:embed="rId5" /> <!-- this is the ID you need to find -->
                <a:stretch>
                  <a:fillRect />
                </a:stretch>
              </pic:blipFill>
              <pic:spPr>
                <a:xfrm>
                  <a:ext cx="3200400" cy="704850" />
                </a:xfrm>
                <a:prstGeom prst="rect" />
              </pic:spPr>
            </pic:pic>
          </a:graphicData>
        </a:graphic>
      </wp:inline>
    </w:drawing>
  </w:r>
</w:p>

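The ID is stored in the Blip element, in its r:embed attribute.

The following code extracts the Inline elements from all Runs, using Descendants<Run>, and produces an IEnumerable<Inline>: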

https://stackoverflow.com/questions/2810138/replace-image-in-word-doc-using-openxml

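using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;          // Run
using DocumentFormat.OpenXml.Drawing.Wordprocessing;  // Inline
using Blip = DocumentFormat.OpenXml.Drawing.Blip;

using (WordprocessingDocument document = WordprocessingDocument.Open("docfilename.docx", true)) {

  // go through the document and pull out the first inline image element of each run
  // (FirstOrDefault avoids the exception First() throws for runs without an image)
  IEnumerable<Inline> imageElements = document.MainDocumentPart.Document.Descendants<Run>()
      .Select(run => run.Descendants<Inline>().FirstOrDefault())
      .Where(inline => inline != null);

  // select the image that has the correct filename (chooses the first if there are many);
  // in the sample XML above the filename is in the descr attribute of wp:docPr
  Inline selectedImage = imageElements.FirstOrDefault(image =>
      image.DocProperties?.Description?.Value == "image filename");

  // get the ID from the blip inside the inline element
  string imageId = "default value";
  Blip blipElement = selectedImage?.Descendants<Blip>().FirstOrDefault();
  if (blipElement?.Embed != null) {
      imageId = blipElement.Embed.Value;
  }
}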
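Once the relationship ID is known, the image data behind it can be overwritten. The sketch below shows one way to do that with `GetPartById` and `FeedData`; the class and method names are invented for illustration, and it assumes the new file has the same image format as the part it replaces, because the part's content type is left unchanged.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

static class ImageReplacer
{
    // Overwrites the data of the image part with relationship id `imageId`.
    // GetPartById throws if the id does not exist in the main document part.
    public static void ReplaceImage(string docxPath, string imageId, string newImagePath)
    {
        using (WordprocessingDocument document = WordprocessingDocument.Open(docxPath, true))
        {
            ImagePart imagePart = (ImagePart)document.MainDocumentPart.GetPartById(imageId);
            using (FileStream newImage = File.OpenRead(newImagePath))
            {
                // Replace the part's content with the bytes of the new image.
                imagePart.FeedData(newImage);
            }
        }
    }
}
```

Because only the part's bytes change, the `<a:blip r:embed="…">` reference in the body stays valid and the picture keeps its position and size.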

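If you rename a .docx file to .zip and extract it, you will find a media directory (under word/) that holds the image files. The mapping between IDs and image files is kept in word/_rels/document.xml.rels.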
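The same mapping can be read without the Open XML SDK, because a .docx file is an ordinary zip archive. The following is a minimal sketch using only System.IO.Compression and System.Xml.Linq. It assumes the main document lives at the conventional location word/document.xml, so its relationships are in word/_rels/document.xml.rels; the class and method names are illustrative.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

static class DocxRelationships
{
    // XML namespace used by package relationship files.
    static readonly XNamespace Rel =
        "http://schemas.openxmlformats.org/package/2006/relationships";

    // Returns relationship Id -> Target (e.g. "rId5" -> "media/image1.jpeg")
    // for the image relationships of the main document.
    public static Dictionary<string, string> ReadImageMap(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/_rels/document.xml.rels");
            if (entry == null)
            {
                return new Dictionary<string, string>();
            }
            using (Stream stream = entry.Open())
            {
                XDocument rels = XDocument.Load(stream);
                return rels.Root.Elements(Rel + "Relationship")
                    // Image relationships have a Type URI ending in "/image".
                    .Where(r => ((string)r.Attribute("Type") ?? "").EndsWith("/image"))
                    .ToDictionary(r => (string)r.Attribute("Id"),
                                  r => (string)r.Attribute("Target"));
            }
        }
    }
}
```

The Target values are relative to the word/ folder, so "media/image1.jpeg" refers to word/media/image1.jpeg inside the archive.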

For how to save the images of a docx file to another directory, see:

https://stackoverflow.com/questions/2810138/replace-image-in-word-doc-using-openxml
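One possible way to do this with the Open XML SDK is to copy the stream of every ImagePart to a file. This is a sketch rather than the approach from the linked answer; the names `ImageExporter` and `ExportImages` are hypothetical, and files are named after the last segment of each part's URI.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

static class ImageExporter
{
    // Copies every image of the main document part into outputDir.
    public static void ExportImages(string docxPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, false))
        {
            foreach (ImagePart part in doc.MainDocumentPart.ImageParts)
            {
                // part.Uri looks like /word/media/image1.png; keep only the file name.
                string fileName = Path.GetFileName(part.Uri.OriginalString);
                using (Stream source = part.GetStream())
                using (FileStream target = File.Create(Path.Combine(outputDir, fileName)))
                {
                    source.CopyTo(target);
                }
            }
        }
    }
}
```

Images that belong to headers, footers, or other parts are not covered; those would need the same loop over the ImageParts of each such part.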

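The MathML generated by omml2mml carries a namespace, so that namespace has to be declared at the top of the HTML for the output to work.

Section: Sections are subdivisions of a document. Once a document is divided into sections, a single section can be formatted on its own, for example by changing its page orientation or its number of columns.

PageSize: This element specifies the properties (size and orientation) for all pages in the current section.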

oleobjectbinarypart

A twip is one twentieth of a printer's point, which is 1/1440 of an inch, or roughly 1/567 of a centimeter.

At a DPI setting of 96, one pixel is (1/96)*1440 = 15 twips.

https://baike.baidu.com/item/twip/1554184?fr=aladdin

https://en.wikipedia.org/wiki/Twip

<w:pgSz w:w="11907" w:h="16839" />

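The standard size of A4 paper is 210 * 297 mm.

In inches that is 8.27 * 11.69 in, which matches the pgSz values above: 11907/1440 ≈ 8.27 and 16839/1440 ≈ 11.69.

1 inch equals 2.54 cm.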
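The conversions above are simple enough to put into a small helper. The following sketch is plain C# with no SDK dependency; the `Twips` class and its method names are invented for illustration.

```csharp
static class Twips
{
    public const double PerInch = 1440.0;   // 20 twips per point, 72 points per inch
    public const double MmPerInch = 25.4;

    public static double ToInches(double twips)
    {
        return twips / PerInch;
    }

    public static double ToMillimeters(double twips)
    {
        return twips / PerInch * MmPerInch;
    }

    public static double FromMillimeters(double mm)
    {
        return mm / MmPerInch * PerInch;
    }

    // Pixel count depends on the DPI setting, e.g. 96.
    public static double ToPixels(double twips, double dpi)
    {
        return twips / PerInch * dpi;
    }
}
```

For the page size in the example, `Twips.ToMillimeters(11907)` is about 210.03 and `Twips.ToMillimeters(16839)` is about 297.02, so the element describes an A4 page; `Twips.ToPixels(15, 96)` is 1.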

PageMargin has a header attribute: the distance from the top edge of the page to the top edge of the header.

<w:sectPr> <w:pgMar w:header="720" w:bottom="1440" w:top="1440" w:right="1440" w:left="1440"/> …</w:sectPr>

sectPr (Document Final Section Properties)

This element defines the section properties for the final section of the document. [Note: For any other section the properties are stored as a child element of the paragraph element corresponding to the last paragraph in the given section. end note]

[Example: Consider a document with multiple sections. For all sections except the final section, the sectPr element is stored as a child element of the last paragraph in the section. For the final section, this information is stored as the last child element of the body element, as follows:


<w:body>
  <w:p>
    …
  </w:p>
  …
  <w:sectPr>
    (final section's properties)
  </w:sectPr>
</w:body>

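In short: when a document has several sections, the sectPr that is the last child of body describes the final section. For every other section, the sectPr is stored inside the last paragraph of that section (within its pPr).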
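To see both kinds of sectPr in a real file, a sketch like the following walks all SectionProperties elements and reports where each one sits. It assumes the Open XML SDK is referenced; `SectionLister` and `PrintSections` are illustrative names.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class SectionLister
{
    // Prints page size and header distance (in twips) for every section.
    public static void PrintSections(string docxPath)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, false))
        {
            Body body = doc.MainDocumentPart.Document.Body;
            // Descendants finds the sectPr elements inside paragraphs
            // as well as the final one directly under body.
            foreach (SectionProperties sectPr in body.Descendants<SectionProperties>())
            {
                bool isFinal = sectPr.Parent is Body;
                PageSize size = sectPr.GetFirstChild<PageSize>();
                PageMargin margin = sectPr.GetFirstChild<PageMargin>();
                Console.WriteLine("{0} section: w={1} h={2} header={3}",
                    isFinal ? "final" : "inner",
                    size?.Width?.Value, size?.Height?.Value, margin?.Header?.Value);
            }
        }
    }
}
```

Missing pgSz or pgMar elements simply print as empty values, since the null-conditional operators pass null through.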

A summary of Word operations with OpenXml: generating Word documents without the Word component.

https://blog.csdn.net/u011394397/article/details/78142860

new ContextualSpacing() { Val = false }

This tells Word to uncheck "Don't add space between paragraphs of the same style" in the paragraph options.
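For context, ContextualSpacing belongs inside a paragraph's ParagraphProperties. A minimal sketch of a paragraph built this way (the `ParagraphFactory` helper is hypothetical):

```csharp
using DocumentFormat.OpenXml.Wordprocessing;

static class ParagraphFactory
{
    // Builds a paragraph whose contextual spacing option is switched off.
    public static Paragraph CreateParagraph(string text)
    {
        return new Paragraph(
            new ParagraphProperties(
                new ContextualSpacing() { Val = false }),
            new Run(new Text(text)));
    }
}
```

The returned Paragraph can then be appended to a Body with AppendChild.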

Merging multiple Word documents with AltChunk
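A commonly used pattern for this is to import the second document as an AlternativeFormatImportPart and reference it from the body with an AltChunk element; Word merges the content when it next opens the file. The sketch below assumes the Open XML SDK is referenced, that `altChunkId` is a relationship ID not yet used in the target document (for example "AltChunkId1"), and uses made-up names for the class and method.

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class DocumentMerger
{
    // Appends the document at sourcePath to the end of the document at destPath.
    public static void AppendDocument(string destPath, string sourcePath, string altChunkId)
    {
        using (WordprocessingDocument dest = WordprocessingDocument.Open(destPath, true))
        {
            MainDocumentPart mainPart = dest.MainDocumentPart;

            // Store the whole source file as an alternative-format part.
            AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
                AlternativeFormatImportPartType.WordprocessingML, altChunkId);
            using (FileStream source = File.OpenRead(sourcePath))
            {
                chunk.FeedData(source);
            }

            // Reference the part from the body, before the final sectPr if there is one.
            AltChunk altChunk = new AltChunk() { Id = altChunkId };
            Body body = mainPart.Document.Body;
            SectionProperties finalSectPr = body.Elements<SectionProperties>().LastOrDefault();
            if (finalSectPr != null)
            {
                body.InsertBefore(altChunk, finalSectPr);
            }
            else
            {
                body.AppendChild(altChunk);
            }
            mainPart.Document.Save();
        }
    }
}
```

To merge several documents, call the method once per source file with a different ID each time.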

 
