As part of our Earth2.0 application we provide the ability for environmental surveyors to define their own set of data entry fields for any survey work they have carried out. When entering survey data into the application users use the fields that they have defined, including custom drop-down lists and validations, this is great when in the office and connected to the internet but when out in the middle of the Pilbara there is a distinct lack of internet connection points. We wanted to give our users the easiest method of recording data in the field and uploading that data to Earth2.0 when they return to the office.
In an ideal world we envisaged a user downloading a field sheet in the form of a Microsoft Excel spreadsheet with all of the correct field headings included, the user could then enter data into the spreadsheet whilst in the field on a PDA or laptop and upload all of the data in just a few mouse clicks on returning to the office. How to do this though?
Enter Apache POI and the Horrible Spreadsheet Format.
Apache POI is a collection of Java ports of file formats used in Microsoft Office applications. The APIs allow Java developers to read from and write to documents in Microsoft Office formats such as .xls and .doc. The part of POI that deals with xls files is lovingly named HSSF, Horrible Spreadsheet Format.
However the developers feel about Microsoft’s file formats, POI is very powerful. Using POI it was simple to generate spreadsheets containing all of the users custom data entry fields and write a parsing tool to prepare the data for importing into the Earth2.0 database.
With the spreadsheet uploaded, Earth2.0 uses POI to extract the data and then validates the site location details that were entered on the sheet (happily managing a variety of coordinate systems, WGS84, GDA94, and others), validate each of the taxonomic names that were recorded, run any custom validations that the user has requested and then save all of the data into database for immediate review and analysis using the Earth2.0 data review tools.
Apache POI gave us the ability to provide this very powerful and timesaving feature to our clients. Without it, we would have been stuck using formats like CSV or XML that are not as user friendly as Microsoft Excel.
The POI team are currently working on providing support for OOXML files as generated by the very latest version of Microsoft Office. When these features are released we will be using them to give our clients the most flexibility in the format that they can upload data in. If a client requests it, we will provide support for Open Office Calc documents as well, I doubt the API for that is called “horrible”!