Sunday 9 September 2012

Integrating Excel into a web application

The new version of my website www.physiotherapyexercises.com with new HTML 5 functionality continues to perform well, so I decided to continue the work on getting everything working as best as I can make it.

Originally, there was a data maintenance page where the physiotherapists could edit english text and translators could enter data in for the translations.

We modified the translation functionality so we could download the text as a CSV file and export it to excel. The translators would enter their text and we could upload it back to the database.

This had several problems
  • Excel has a method of "guessing" what the column types are, by looking at the first "X" rows, so columns with numbers and text would be imported incorrectly (as well as text potentially getting truncated).
  • I was a semi-manual process and time-consuming.
  • The translators would "break" the excel file by sorting partial row/column selections and hence corrupt the file because the text and key column would get misaligned.
  • The files were quite large
So I wanted to better automate this and also make it so the file couldn't be corrupted by translators sorting it incorrectly.

I first attempted to use Open Xml to automate the generation of excel files (in 2007+ xml format).

This worked reasonably well, but
  • Documentation on how to do things was not very good
  • Generating files from scrach with code could be quite difficult, a single change can result in the file refusing to open with very cryptic error messages.
  • Reading in large recordsets was quite slow
So I did some research and found the Closed Xml library on codeplex. It provides a more structured interface and generates the xml document for you in the background. This had good documentation and was an active project.

With a bit of work, we now have a simple webpage where the translators can just click on their language of choice and they get :
  • A protected workbook where they can only enter text in the translation column.
  • The translation sheet has frozen columns and can be filtered, this helps them in the translation process.
  • The rows are coloured coded indicating the need for translation/re-translation.
  • The sheet cannot be sorted without using the pre-defined sorting macros, so the translators cannot break the file.
  • The exercise text include a hyperlink to the exercise image. This helps the translators understand the exercise.
We have another simple page, where we can upload the modified file, and add a comment. It then summarises the changes and updates only the records that have been changed. It keeps a copy of the file so we can revert back to a previous version if something goes wrong.

The closed xml library is used in the reading and writing. It is very fast and works very well. I highly recommend it. The ability to base the excel file on pre-existing files is really useful as you can pre-program your own macros and have other sheets in the workbook.

An example of the Norwegian translation file can be found here.

This was received very well, so the next step was to use the library to update the english. With some work, we now have a page where they can :
  • Choose the data sections they wish to update, and it generates a custom file.
  • Columns have validation (ie number ranges, text max length and so on)
  • The sheets have frozen columns and can be filtered.
  • The last row has comments indicating what needs to be entered.
Much like the translations, they can upload the file and it will modify records, delete ones that are no longer required or insert new records. With the validation built in, we can have quite a bit of confidence of it being able to be used by the physiotherapists without "IT assistance".

An example of the English data file can be found here.

This has been very well received, they love the ability for them to use excel with spell checking and being able to email it to one another.

I gave a presentation to the Newcastle Coders Group in September, and there was quite a few developers there who were quite interested in using this library. It may be useful for your own projects.

No comments:

Post a Comment