Programmatically creating MS Office-compatible charts

Programmatically creating MS Office-compatible charts

If you have a web application that has chart visualizations and customer requests a downloadable copy of the report that can be viewed in PowerPoint, How do you go about it?

You can clone or download source code from GitHub project with the following URL: https://github.com/sekhar-rangam/docx4j-charts-api

Choice of Open Source Library:

Apache POI:

Apache POI can create tables, paragraphs, loading images etc., but not possible to create complex editable charts in office document

Docx4j:

Docx4j can create a complex editable charts or it can also modify existing chart in the office document, according to their site:

Here is a paid product which allows to create/modify all types of complex charts in Office Documents

In this article I’ll show you a process with example to generate (editable) chart in PowerPoint document using open source docx4j java library.

Pre-requisites:

  1. Download docx4j library and its dependencies from http://www.docx4java.org/downloads.html

A PresentationML or .pptx file is a zip file (a package) containing a number of “parts” as XML files. You can look at the file structure and the files that comprise a PresentationML file by simply unzipping the .pptx file, try to understand the structure and chart xml by just creating a simple column chart in presentation and then unzipping.

Below are the steps which we follow to create the PowerPoint document with chart:

  • Load/Create the PowerPoint document to add chart
  • Create a new slide into the PowerPoint document
  • Create a new chart with xml file as input
  • Embed chart Excel worksheet to enable edit option in office document
  • Save Office document

The approach here is to create a new PowerPoint document or load the existing document as a template with your own header/footer styles and finally create one slide with one chart. In the following section I am going to create a column chart for instance which looks exactly as below.

editable charts - diagram2

Load/Create the PowerPoint document to add chart:

Let’s create a simple PowerPoint document with your own styles that we can use as a template.

Now it’s time load the template and create a Column Chart in “C:\ppt\docx4JChart.pptx”, for this just create a standalone java program and copy the following piece of code from all steps inside main method

String templateFile = “/main/resources/template.pptx”;
InputStream in = Docx4jChart.class.getResourceAsStream(templateFile);
PresentationMLPackage presentationMLPackage= (PresentationMLPackage) OpcPackage.load(in);

This will return a PresentationMLPackage  object which has full access to modify your document

Create a new slide into the PowerPoint document:

All slides inside the document can be found in the main presentation part and charts inside the document can be found in slide layout part, so to access these objects you can you use the following piece of code

MainPresentationPart pp = (MainPresentationPart)
                                                 presentationMLPackage.getParts().getParts().get(new
                                                 PartName(“/ppt/presentation.xml”));

SlideLayoutPart layoutPart = (SlideLayoutPart)
                                                presentationMLPackage.getParts().getParts()
                                               .get(new PartName(“/ppt/slideLayouts/slideLayout2.xml”));

Now it’s time to create a new slide by using the slide layout part,

SlidePart slide1 = presentationMLPackage.createSlidePart(pp,layoutPart, new                                                 PartName(“/ppt/slides/slide” + slideIndex+ “.xml”));
StringBuffer slideXMLBuffer = new StringBuffer();
BufferedReader br = null;
String line = “”;
String slideDataXmlFile = “/main/data/slide_data.xml”;
InputStream in = Docx4jChart.class.getResourceAsStream(slideDataXmlFile);
Reader fr = new InputStreamReader(in, “utf-8”);
br = new BufferedReader(fr);
while ((line = br.readLine()) != null) {
           slideXMLBuffer.append(line);
          slideXMLBuffer.append(” “);
}
Sld sld = (Sld) XmlUtils.unmarshalString(slideXMLBuffer.toString(),Context.jcPML,
               Sld.class);
slide1.setJaxbElement(sld);

Note: you can change the slide title and title background color by modifying the slide_data.xml file which is a input for creating a slide

Create a new chart with xml file as input:

In the previous section we created a slide by providing XML file.

In this section we will also use another XML file as input for creating chart, first we have to create the chartpart object by providing XML file which has all data points and assign chartPart object to slide, you can download XML file from here

Below is the sample code for creating a chart in slide:

org.docx4j.openpackaging.parts.DrawingML.Chart chartPart = new org.docx4j.openpackaging.parts.DrawingML.Chart(new PartName(“/ppt/charts/chart” + slideIndex + “.xml”));
StringBuffer chartXMLBuffer = new StringBuffer();
String chartDataXmlFile = “/main/data/chart_data.xml”;
in = Docx4jChart.class.getResourceAsStream(chartDataXmlFile);
fr = new InputStreamReader(in, “utf-8”);
br = new BufferedReader(fr);
while ((line = br.readLine()) != null) {
            chartXMLBuffer.append(line);
            chartXMLBuffer.append(” “);
}
CTChartSpace chartSpace = (CTChartSpace) XmlUtils.unmarshalString(
chartXMLBuffer.toString(), Context.jcPML,CTChartSpace.class);
chartPart.setJaxbElement(chartSpace);
slide1.addTargetPart(chartPart);

With this we created a column chart and added to chartpart object, just to explain a little more about the above XML which we used to create a CTChartSpace object, chart_data.xml contains the following

  1. Type of chart to display, in this example it is BarChart
  2. All data points for bars
  3. Series name
  4. Excel sheet column references for related data points and series

Below is the snapshot of XML which highlights all of these

editable charts - diagram3

Note: you can change the chart type, series names, category names and data points by modifying the chart_data.xml file, when you change the series names and data points also change the data.xlsx file which is mentioned in the below section

Embed Excel worksheet to enable edit option in office document:

Now you can copy the below code to create the Embedded Package Part which is used to embed the excel file into the document for editing the chart

Below is snapshot of the data.xlsx file typically it looks like, you can download from here

editable charts - diagram4

EmbeddedPackagePart embeddedPackagePart = new EmbeddedPackagePart(
                       new PartName(“/ppt/embeddings/Microsoft_Excel_Worksheet”+
slideIndex + “.xlsx”));
embeddedPackagePart.setContentType(contentType);
embeddedPackagePart.setRelationshipType(“http://schemas.openxmlformats.org/officeDocument/2006/relationships/package”);
RelationshipsPart owningRelationshipPart = new RelationshipsPart();
PartName partName = new PartName(“/ppt/charts/_rels/chart”+ slideIndex + “.xml.rels”);
owningRelationshipPart.setPartName(partName);
Relationship relationship = new Relationship();
relationship.setId(“rId1”);
relationship.setTarget(“../embeddings/Microsoft_Excel_Worksheet”+ slideIndex + “.xlsx”);
relationship.setType(“http://schemas.openxmlformats.org/officeDocument/2006/relationships/package”);
owningRelationshipPart.setRelationships(new Relationships());
owningRelationshipPart.addRelationship(relationship);
embeddedPackagePart.setOwningRelationshipPart(owningRelationshipPart);
embeddedPackagePart.setPackage(presentationMLPackage);
String dataFile = “/main/data/data.xlsx”;
InputStream inputStream = Docx4jChart.class.getResourceAsStream(dataFile);
embeddedPackagePart.setBinaryData(inputStream);
chartPart.addTargetPart(embeddedPackagePart);

Save Office document:

Save the document with the following of piece of code you should now get the editable column chart in the PowerPoint document

              presentationMLPackage.save(new java.io.File(destinationFolderStr));

That’s it, the same approach can also be used to create different type of charts with your own data into the document by just modifying the Slide XML , Chart XML and Excel Data.

This blog is written by Sekhar Rangam, Analytics Consultant at BRIDGEi2i

About BRIDGEi2i: BRIDGEi2i provides Business Analytics Solutions to enterprises globally, enabling them to achieve accelerated business impact harnessing the power of data. Our analytics services and technology solutions enable business managers to consume more meaningful information from big data, generate actionable insights from complex business problems and make data driven decisions across pan-enterprise processes to create sustainable business impact. To know more visit www.bridgei2i.com

Connect with us:
facebook BRIDGEi2i on twitter BRIDGEi2i on LinkedIn BRIDGEi2i on Google+ BRIDGEi2i on YouTube

The views and opinions expressed in this article are those of the author and do not necessarily reflect the official position or viewpoint of BRIDGEi2i.

 

 

Related Posts

Comments (5)

Dear Admin , could you please provide complete tutorial on creating charts in PPTX using docx4j and pptx4j.

Thank You.

Hi Anurag,

in case if you still have the problem in creating charts in PPTX, you can down load the sample source code from github from the below link.

https://github.com/sekhar-rangam/docx4j-charts-api

Regards,
Sekhar

Hi thanks a lot for this article, I still do not get why chart_data.xml would contain data points, *and* a link to the excel sheet containing the data : why duplicate the information ?

Hi Antonin,

Excel directly depends on chart_data.xml to plot the graph in the spreadsheet, but once the graph is created and when user wants to modify the data points on right click of the graph+edit then the excel sheet containing the data will open, so we are placing the excel file data and linking to allow edit the graph.

I hope answered to your question, let me know if you need any clarifications.

Regards,
Sekhar

Thank you very much Sekhar, and I found out that if I wanted to create non editable charts, I could replace strRef and numRef by strLit and numLit elements.
Thanks again for sharing this code.

Best regards,
Antonin PA

Leave a comment