“Am I really a Philomath*?”

*a person who enjoys learning or studying.

From April to June this year, I participated in the Data Science Accelerator Programme. This is a capability-building programme that gives analysts from across the public sector the opportunity to develop their data science skills. The programme is delivered by the Government Data Science Partnership and this year took place remotely.

Over three months, participants have just 12 days to complete a data science project that will hopefully make a difference to a chosen area of their organisation. Each participant is also allocated a mentor to help them upskill and develop as a data scientist. I normally work with data relating to Adult or Children’s social care and had no prior experience of analysing geospatial data. With this in mind, I proposed a project to analyse geospatial data and hopefully support current work to tackle carbon emissions in Essex.

My project

Essex is developing an ambitious Climate Action programme to try to achieve net zero carbon emissions by 2050. A Climate Focus Area (CFA) has been identified to serve as both a pilot study and, in turn, an example of how land use can be transformed through effort, engagement and partnership working. The aim is to increase natural green infrastructure in the CFA from 13% to 30% by 2030. When I started, it had not yet been identified where within the CFA the areas for transformation would be.

My project had two components. Firstly, I would carry out an NLP (natural language processing) analysis of tweets from Essex residents within the CFA, to pull out key themes in residents’ perception and enjoyment of green space.

Secondly, I would use geospatial data to map the current land use and the amount of natural green infrastructure within the CFA, then use this information in a model to highlight areas that may be suitable for transformation.
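As a rough illustration of the mapping step, the share of natural green infrastructure can be estimated by summing parcel areas by land-use class and comparing the result with the 30% target. The sketch below is not the method used in my project: the parcel coordinates, the land-use labels and the choice of which labels count as "green" are all invented placeholders, the coordinates are assumed to be in a projected system measured in metres, and the areas come from the shoelace formula rather than a GIS library.

```python
from collections import defaultdict

# Target share of natural green infrastructure by 2030 (from the programme aim).
TARGET_GREEN_SHARE = 0.30

# Hypothetical land-use labels treated as natural green infrastructure.
GREEN_CLASSES = {"woodland", "grassland"}


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula.

    `vertices` is a list of (x, y) tuples in a projected coordinate
    system (metres), so the result is in square metres.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def land_use_shares(parcels):
    """Return {land_use: fraction of total area} for (land_use, vertices) pairs."""
    areas = defaultdict(float)
    for land_use, vertices in parcels:
        areas[land_use] += polygon_area(vertices)
    total = sum(areas.values())
    return {land_use: area / total for land_use, area in areas.items()}


# Invented placeholder parcels: rectangles tiling a 1,000 m x 1,000 m square.
parcels = [
    ("arable", [(0, 0), (600, 0), (600, 1000), (0, 1000)]),
    ("woodland", [(600, 0), (700, 0), (700, 1000), (600, 1000)]),
    ("grassland", [(700, 0), (750, 0), (750, 1000), (700, 1000)]),
    ("urban", [(750, 0), (1000, 0), (1000, 1000), (750, 1000)]),
]

shares = land_use_shares(parcels)
green_share = sum(shares.get(c, 0.0) for c in GREEN_CLASSES)
shortfall = max(0.0, TARGET_GREEN_SHARE - green_share)

print(f"Green share: {green_share:.1%}")
print(f"Shortfall against target: {shortfall:.1%}")
```

With real boundary data, a GIS library would be needed to handle holes, overlapping parcels and coordinate reprojection. The point here is only the share calculation.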

Project evaluation

I completed the NLP analysis of Essex residents’ tweets, which showed positive associations with, and sentiment towards, green space. This supports the Council’s proposal to increase natural green infrastructure.

I used geospatial data to define the shape of the CFA and its composition with regard to land use and existing green infrastructure. My machine learning model is not yet finalised, but we have a proof of concept for using a random forest classifier to identify desirable areas for transformation.

It was really good to have dedicated time to work on a project and a supportive peer group.

Challenges I faced included many(!) technology issues; learning to use new software and analysis techniques whilst completing a project; and missing out on the informal knowledge transfer that would have taken place with my mentor had I completed the programme on site and in person.

I now aim to present my project outcomes to sustainability leads at ECC. If I can further refine the model, it would be great to expand it to classify the whole of Essex, to identify further potential locations for transformation. I would encourage others to apply for future programmes. There is also now an additional programme for those interested in developing data visualisation skills. More information can be found here.
