SKOS HASSET Development Process

John Payne

One of the deliverables of the SKOS HASSET project was to provide the ability to present the HASSET thesaurus as SKOS linked data. This required several strands of work, including database cleansing, application creation and software configuration. As described in an earlier blog entry, we chose to deliver SKOS by using the open source project Pubby, converting our SQL Server based HASSET thesaurus into RDF and storing this in a BrightstarDB triple store. For more information on SKOS, Pubby and triple stores, refer to this post.

At the Data Archive, the Application Development and Maintenance team adopt Agile methodologies wherever possible to see projects through to deployment.  We employed these Agile techniques in the SKOS HASSET project in order to focus and drive development through to delivery of the completed product.

At the Archive, we have used JIRA for three years to manage both issue tracking and development tasks across our complete range of projects, both development and maintenance. We have a plugin called GreenHopper installed within JIRA that delivers an Agile presentation layer on top of the issues/tasks and provides configurable Scrum and Kanban views and functionality on top of any project or combination of projects you choose. The documentation even suggests you combine the two and try Scrumban!

The combination of JIRA and Kanban was used for these sprints to track issues and progress. JIRA issues contained the tasks, including current status, comments, time logging etc. Every morning, rather than using physical post-it notes and a whiteboard, we used the Kanban JIRA plugin to give a visual representation of the current state of play and the progression of tasks from ‘not started’ through ‘in progress’ to ‘complete’.

We decided to use two sprints during the development of SKOS-HASSET, with the second following several weeks after the first. This is our preferred strategy. Sprint one had specific goals: laying the groundwork in terms of data quality and producing, internally, a valid SKOS file. Sprint two tied everything together and addressed any issues that came to light during sprint one.

The process we adopted was:

Sprint Preparation

During sprint preparation, the three developers involved met and picked through the complete list of issues/requirements to familiarise everyone with the task at hand. These tasks were then created within JIRA, and each was assigned to an individual and prioritised.


The initial sprint lasted for five days and primarily involved validating and cleaning the data and creating an application to generate valid triples from our relational database version of the thesaurus. Every morning of the sprint began with a short ‘stand-up’ meeting where each developer briefly described progress, problems and proposed work for the current day. This was backed up visually by the Kanban view provided by GreenHopper. All application code created was stored in SVN source control and built from within Jenkins, our continuous integration server, in order to satisfy our coding quality standards.
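To give a flavour of what creating triples from a relational thesaurus involves, here is a minimal sketch in Python. It is not the application we built. The base URI, the function names and the shape of the input rows (a list of term id/label pairs and a list of narrower/broader id pairs) are all assumptions made for illustration; only the SKOS and RDF namespace URIs are the real ones. It writes each concept out in N-Triples form, with a preferred label and its broader/narrower links.

```python
# Illustrative sketch only: the base URI and the row layout are hypothetical,
# not the real HASSET schema or URIs.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/hasset/"  # hypothetical base URI


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def concept_uri(term_id):
    """Build the URI reference for a thesaurus term from its database id."""
    return f"<{BASE}{term_id}>"


def rows_to_ntriples(terms, broader_links):
    """Turn (id, label) rows and (narrower_id, broader_id) rows into N-Triples lines."""
    lines = []
    for term_id, label in terms:
        subject = concept_uri(term_id)
        lines.append(f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .")
        lines.append(f'{subject} <{SKOS}prefLabel> "{escape_literal(label)}"@en .')
    for narrower_id, broader_id in broader_links:
        narrower, broader = concept_uri(narrower_id), concept_uri(broader_id)
        # State the hierarchy in both directions so either end can be browsed.
        lines.append(f"{narrower} <{SKOS}broader> {broader} .")
        lines.append(f"{broader} <{SKOS}narrower> {narrower} .")
    return lines


if __name__ == "__main__":
    # Made-up sample rows standing in for the result of a database query.
    sample_terms = [(1, "EDUCATION"), (2, "SCHOOLS")]
    sample_links = [(2, 1)]  # SCHOOLS has the broader term EDUCATION
    print("\n".join(rows_to_ntriples(sample_terms, sample_links)))
```

In practice the rows would come from a query against the thesaurus database, and the resulting file would be loaded into the triple store rather than printed.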

Post sprint review

In the week following the sprint, the developers met to reflect on what we had achieved and what issues we encountered.  This was also a good opportunity to make sure that both the addressed and remaining issues had been documented and commented upon in readiness for the second sprint.

Sprint 2

Sprint 2 was a smaller, two-day sprint and was the final push to move from our development environment to a production environment ready for external consumption. The requirements for sprint 2 were not data-related but focused instead on implementing Pubby on a newly set up production server and ensuring that all underlying data was now being created in and supplied from the production environment.

Sprint preparation

The developers once again met to discuss the remaining list of issues/requirements. In certain instances these were reassigned, and all were reprioritised. During this second preparation phase, we also tried to resolve any external dependencies that would otherwise hamper the forthcoming sprint, such as setting up domain names and preparing for firewall changes.


The second sprint was better described as a dash with it being so short!  Most of this sprint involved configuring a new production web server to host Pubby, correctly installing our Triple Store onto its live server and deploying application code and tables from development to production.

Post sprint

It would be lovely to say that after sprint 2 all our issues were closed, but this is not quite true. We still have a couple of small internal loose ends, but these either do not directly affect the SKOS HASSET product or were moved out of the scope of this development cycle. One advantage of using JIRA to manage tasks is that these remaining issues are formally documented and must be commented on, resolved and closed by the project manager before the project is completed.

As I started out by saying, in terms of scale, the SKOS-HASSET development was only small, but our decision to adopt the ‘sprinting mind-set’ was a sensible choice. The Agile techniques of sprinting and holding short stand-up morning meetings are insightful: they not only deliver information, they also act as glue between the team members and provide the focus and impetus to keep momentum going and deliver results in a short timeframe.
