Anna Shipman : JFDI

Good Strategy Bad Strategy

17 May 2019

A while ago, the excellent Russell Davies gave me a copy of Good Strategy, Bad Strategy. He said it was the best book he’d read on the topic, and I found it extremely useful. Here are some of my notes, but I recommend reading it.

Strategy is diagnosis, vision, and a plan

The kernel of a strategy contains 3 things:

  1. Diagnosis: the current situation
  2. Vision: where you want to get to; this is your guiding policy
  3. Coherent action: how you get from your current situation to your desired end state. Coherent action consists of feasible plans, resource commitments and actions.

The core of strategy is discovering the critical factors and designing a way to coordinate and focus actions to deal with them, including risk mitigation.

Good strategy is not just what you are trying to do, it’s also why and how.

Good strategy almost always looks simple

The book opens with a description of The Battle of Trafalgar, and how the British (led by Lord Nelson) won, even though there were fewer British ships than French and Spanish ships.

Instead of following what was the usual tactic at the time, approaching in a single line, Nelson’s fleet approached in two columns, one aiming at the centre of the French and Spanish line, in order to break up their formation. However, this did put the ships at the front of the columns in greater danger.

In summary, his strategy was to risk his lead ships to break the coherence of his opponents’ fleet.

Good strategy almost always looks this simple.

What strategy is not

The author talks a lot about things that are generally described as strategy and explains why they are not. For example:

And strategy, responsive to innovation and ambition, selects the path; identifying how, where, and why determination and leadership are to be applied.

A strategy is like a lever that magnifies force

Leaders must identify the critical obstacles to forward progress and develop a coherent approach to overcoming them.

The plan should focus on the highest-impact areas. Which single feasible objective will make the biggest difference?

A strategy coordinates action to address a specific challenge. The job of the leader is to create the conditions that will make that push effective; to have a strategy worthy of the effort called upon.

It’s not enough just to focus – we need to think about why that is the focus. We need to apply power to the right target.

It involves making hard choices

Creating a strategy involves choice, and the difficult work of casting out other things.

It is problem solving. By its very nature, you need to make hard choices. You need to address the elephant in the room.

Good strategy usually emphasises focus over compromise.

Universal buy-in means a choice hasn’t been made.


Diagnosis should replace the overwhelming complexity of reality with a simpler story that calls attention to its crucial aspects.

The diagnosis part of the strategy is handing the organisation a problem it can solve.

Working on the strategy

The disconnect between current results and current action is what makes strategy hard and interesting.

A good strategy is a hypothesis about what will work formed by educated judgement. Exploit your rivals’ weaknesses and avoid leading with your own.

A strategy should be episodic, though not necessarily annual.

You need to make your strategy robust

You have to be able to defend your strategy.

He has some interesting suggestions for how to question your own judgement. For example, keep in your head a panel of people whose views you know well, and imagine them critiquing your strategy.

He also suggests noting down the judgements you make over time, and referring back to them later to improve your process.

It involves constant work

Making such a policy work takes more than a plan on paper – you need to work to maintain the coherence of the plan, every quarter, year, decade.

This Twitter thread from Deepa Subramaniam is also very useful, particularly the practical tips on how to do that work.

You should read it

These are some of the notes I’ve found useful to refer back to. These slides by Sophie Dennis are also an excellent summary of some of the main points.

But the book itself is extremely worth reading. It has made a big difference to how I think about and work on strategy.

Finance for non-accountants

23 April 2019

This is the first role I’ve had managing a large budget and I recently had some excellent training from my finance director colleague Isabelle Campbell. Here are my notes.

The role of the finance team in an organisation

Isabelle started by answering the question “what is even the point of a finance team?”

There is a lot to know about finance. The current regulations are on dictionary-thin paper and stack up about four feet high. So the finance team takes care of keeping on top of regulations, tax and the latest accounting standards, as well as reporting.

They do budgeting and forecasting, and can also advise on strategic decisions.

Accounting years are important

The first non-obvious thing I learned was that the concept of an accounting year is very important. This is because a profit & loss statement (P&L) only covers the current accounting year. So whether something is capital or operating expenditure (more on that in a moment) is defined by whether its value lies within or beyond the accounting year.

Most UK companies use the calendar year (i.e. 1 January to 31 December). Many others follow the tax year (6 April to 5 April) or the financial year (1 April to 31 March). In the UK you can set your accounting year however you like.

Annual accounts

Every accounting year, most UK companies have a requirement to produce annual accounts. This can include a profit and loss statement (a P&L), a balance sheet and a cash flow statement of varying levels of detail and complexity.

We spent more time on the P&L than on the other two, so I'll talk mainly about that, but in brief:

How to read a P&L

For example, here is John Lewis’s P&L for last year. It was an image in a PDF, so I’ve typed out the contents below.

Screenshot of John Lewis P&L, text is below this image

| Notes | | 2018 | 2017 |
|---|---|---|---|
| 1.2, 2.1 | Gross sales | 11,597.7 | 11,374.2 |
| 2.1 | Revenue | 10,204.0 | 10,026.2 |
| | Cost of sales | (6,839.5) | (6,633.1) |
| | Gross profit | 3,364.5 | 3,393.1 |
| | Other operating income | 111.3 | 92.6 |
| 2.2 | Operating expenses before exceptional items and Partnership Bonus | (3,114.0) | (3,007.8) |
| 3.3 | Share of (loss)/profit of joint venture (net of tax) | (1.0) | 0.3 |
| 2.1 | Operating profit before exceptional items and Partnership Bonus | 360.8 | 478.2 |
| 2.3 | Exceptional items | (111.3) | 171.2 |
| 2.1 | Operating profit before Partnership Bonus | 249.5 | 649.4 |
| 5.1 | Finance costs | (85.7) | (109.7) |
| 5.1 | Finance income | 141.1 | 1.9 |
| | Profit before Partnership Bonus and tax | 177.9 | 541.6 |
| | Partnership Bonus | (74.0) | (89.6) |
| 2.4 | Profit before tax | 103.9 | 452.2 |
| 2.7 | Taxation | (29.8) | (98.7) |
| | Profit for the year | 74.1 | 353.5 |
| 2.1 | Profit before Partnership Bonus, tax and exceptional items | 289.2 | 370.4 |

There will be a scale. At the top, we can see the scale is millions (“£m”).

If a sum is in brackets, it's negative. For example, "Cost of sales" in 2018 was "(6,839.5)": that means it was -£6,839,500,000, i.e. it cost them about £6.8bn to buy or produce what they sold. (This may not include all costs – to see what all the parts are, we'd have to refer to note 2.1.)

The bottom line (hence the phrase!) is the revenue minus expenditure, i.e. the profit.

In this example the bottom line is £289.2m, though taxes, bonuses and exceptional items will come out of that. Sometimes the bottom line may be used to indicate just gross profit, i.e. before any deductions at all.
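You can check the arithmetic of a P&L line by line. Here is a minimal sketch in Python using the 2018 figures from the statement above (all in £m; the bracketed amounts become negative numbers):

```python
# 2018 figures from the John Lewis P&L above, in £m.
# Amounts shown in brackets in the statement are negative here.
revenue = 10_204.0
cost_of_sales = -6_839.5
other_operating_income = 111.3
operating_expenses = -3_114.0  # before exceptional items and Partnership Bonus
share_of_jv = -1.0             # share of (loss)/profit of joint venture
exceptional_items = -111.3

# Each subtotal in the statement is just a running sum of the lines above it.
gross_profit = revenue + cost_of_sales
print(round(gross_profit, 1))  # → 3364.5, matching the "Gross profit" line

operating_profit_before_exceptionals = (
    gross_profit + other_operating_income + operating_expenses + share_of_jv
)
print(round(operating_profit_before_exceptionals, 1))  # → 360.8

operating_profit = operating_profit_before_exceptionals + exceptional_items
print(round(operating_profit, 1))  # → 249.5, "Operating profit before Partnership Bonus"
```

This also makes the brackets convention concrete: reading down the statement, you add every line, and the brackets simply tell you the sign.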

Capex and opex

Capex is capital expenditure. This means money spent on something that will have a longer term value; specifically this means a value outside of this accounting year. Buying a car for £30,000 would probably be capex, because you expect that car to last longer than a year.

Capex is money you spend on buying assets.

Opex is operational expenditure. This is money you spend on things that do not retain value for the company. For example, if you lease that car for £8,000 per year, that’s opex. There’s no value to the company. When you stop paying, you no longer have the car. Examples of opex include paying salaries and rent, buying stationery, etc.

Opex is the cost of operating the business in the year.

Capex and opex in the 21st century

This is all reasonably straightforward when talking about buying vs leasing cars. With some things it is obvious whether the spend is capex (for example, buying a fleet of cars) or opex (renting a garage).

But there are a lot of grey areas. For example, what if our car had actually cost £8,000 to buy? Should we capitalise that if we expect it to last more than a year, or should we in fact just consider it part of this year’s operating expenses? There are guidelines, but this kind of thing is a judgment call and something the finance department can help with.

It becomes even more complicated when you get into the kinds of work most companies do now.

For example, staff costs such as salary are generally opex. But if they are working on something capitalisable, like software, then there is a question there. Say you have a team of people working on building an A/B testing framework that you expect will be used for the next 5-10 years. That A/B testing framework is an asset that you could potentially capitalise, in which case you may be able to capitalise some or all of the staff costs.

Capex is not included in the P&L

Capex and opex are treated in different ways. One extremely important difference is that capex does not appear on the P&L. Instead, it appears on the balance sheet as an asset.

Technology is usually a cost centre

A cost centre is a part of the business to which you allocate costs and which is not responsible for revenue. This is in contrast to a profit centre, which generates revenue as well as incurring costs.

In most businesses, technology is a cost centre, meaning that the main way the technology department can increase the overall profitability is by cutting costs, and/or making the revenue-generating areas of the business more economical or efficient.

Cost centres generally do not produce their own balance sheet.

This can lead to misalignment of incentives

Focusing on a P&L rather than having full visibility of what sits on the company’s balance sheet can potentially lead to incentives that are not aligned.

A big example in software is buy vs build. In general, you should build things that represent your core competency and things that will give you a competitive edge, and you should buy everything else. So if you’re not a hosting provider, use cloud computing for your infrastructure and hosting.

However, that cost is not capitalisable. The company does not gain any asset from paying AWS or Heroku. This means the costs will appear on your P&L. If, instead of using cloud computing, you bought your own hardware and hired a team of people working full-time in a data centre you own, that would be very capitalisable and the costs would not appear on your P&L.

Running your own data centre is clearly a bad idea for a number of other reasons, but the incentives here can be poorly aligned.

Amortisation, depreciation and write-downs

Capitalised assets appear on the balance sheet. However, normally they lose value over time. This is called depreciation, and the annual cost of this is deducted from the balance sheet and charged to the P&L.

In the UK, ‘depreciation’ refers to the loss of value on tangible assets. There is also ‘amortisation’, which in the UK usually refers to loss of value of intangible assets, like brand. (US companies usually call depreciation amortisation, which can be confusing).

Sometimes assets lose more value than has been allowed for in a pre-determined depreciation or amortisation rate. In this case, the company might “write down” the cost, which means further reducing the value on the balance sheet. The amount it is reduced by is charged to the P&L. Whether to do this is usually a judgment call, though a company may receive pressure from their auditors.
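To make the mechanics concrete, here is a small sketch of how a depreciation charge flows from the balance sheet to the P&L. It reuses the £30,000 car from earlier; the five-year useful life and the straight-line method are illustrative assumptions (straight-line is only one of several permitted depreciation methods):

```python
# Straight-line depreciation of a hypothetical £30,000 car over 5 years.
# Each year, the same charge reduces the asset's value on the balance
# sheet, and that charge appears as a cost on that year's P&L.
cost = 30_000
useful_life_years = 5
annual_depreciation = cost / useful_life_years  # £6,000 charged to the P&L each year

book_value = cost
for year in range(1, useful_life_years + 1):
    book_value -= annual_depreciation
    print(f"End of year {year}: balance sheet value £{book_value:,.0f}")
# After year 5 the book value is £0: the asset is fully depreciated.

# A write-down works the same way, but as a one-off judgement call:
# if the car were suddenly worth £4,000 less than its current book
# value, that £4,000 would be deducted from the balance sheet and
# charged to the P&L in the year it happened.
```

The pre-determined rate mentioned above is the `annual_depreciation` here; a write-down is an extra, unplanned deduction on top of it.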

Impairment and goodwill

A recent big example of a write-down was at the end of last year, when Verizon took an impairment charge of $4.6bn on their purchase of Oath (the company formed from Yahoo & AOL).

When a company buys another company, they have an expectation of the value that the asset will generate, as with any asset. If the purchased company turns out not to be delivering the value, for example due to changes in the market, government regulations, loss of brand reputation etc, then the asset is no longer worth what it was, and the owning company’s balance sheet suffers.

Part of the agreed purchase price covers the value of those assets. The remainder is what is called "goodwill": the difference between the assets you can put a value on and how much you are willing to pay for the company.

In the case of Verizon and Oath, the $4.6bn was a “non-cash goodwill impairment charge”, meaning the “goodwill” aspect of the deal was no longer considered to be worth what Verizon had paid, so they were taking a “write-down”, i.e. reducing the overall value of Verizon’s assets by $4.6bn, a figure which would have come out of Verizon’s 2018 P&L.

Budgets and forecasts

The P&L budget for the year is worked out towards the end of the previous year. It covers everything that might happen in the business in the next year: how many employees the company is likely to have, how much revenue can be expected from existing and new customers, and so on. It also involves asking the question of what can be capitalised.

A forecast can be done at any point during that year, and takes into account new information. For example, you might do a Q1 (Quarter 1, i.e. the first three months of your accounting year) forecast at the end of March. Knowing what we know now about what’s happened in the first three months of the year, what do we know about the rest of the year? Forecasts might be a few months, or a few years.

A forecast may not always change the budget; so a Q1 forecast that shows a higher or lower than predicted revenue may not result in a revised budget, but it does mean at least that you are armed with the information.

How to find a P&L to read

If you want to browse the accounts of companies, you can! The Companies House search (currently in beta) lets you look up any UK company. I searched for John Lewis Partnership, and then clicked 'Filing history'. The image above is from page 85 of their company accounts. Quite a useful tip if you want information on a company you are thinking of joining, or even if you'd like to know more about the company you work at.

Smaller companies only have to file restricted information in their accounts, but there is still interesting information about them on Companies House, e.g. who the directors are.

This has only scratched the surface

There were loads of other things we didn't cover (and, given that the regulations stack four feet high, may never cover), but it was a really useful session and has given me a much better grasp of some of the competing priorities we need to think about.

How to make the most out of meetings with vendors

22 March 2019

Building relationships with vendors is an important part of my role as Tech Director and when I joined the Financial Times, the excellent Rob Shilston gave me his five top tips on how to make the most of those meetings. He has kindly agreed to share them here.

1. Use it as an opportunity to learn about how others are doing things

2. Reiterate that we aren’t special

3. Do push for industry wide solutions and collaboration

4. Avoid talking about absolute price; instead, probe the pricing model itself. Is it a fair model?

5. Talk about table stakes. The absolute basics that they should be achieving without charging extra.

He also had some other tips

These were Rob’s five top tips, but he did have some suggestions for other things the meetings could be useful for:

And one to avoid:

Finding the next level tech job

16 January 2019

Last April, I joined the Financial Times as Technical Director. This has been a brilliant move. The job search was different from any of my previous job searches, as it involved an explicit step into technical leadership, and here I share how I did it and some things I learned.

This was very different from previous job changes

I joined my previous organisation, the Government Digital Service in 2012. At that point, I was a developer and I was looking for a role as a developer.

While at GDS I attained my long-held ambition to become a technical architect, ultimately leading a project to build a PaaS for government.

In 2016, I moved into a new, exciting role as the Open Source Lead. This role involved influencing rather than delivery, which meant I could have a huge impact, but I had no team and wasn’t delivering a product.

After a year, I had achieved what I set out to do in that role and knew I was ready for a new job; but unlike my previous moves, it wasn't obvious what my next step would be.

First, I had to work out what I actually wanted to do

I knew that I wanted a job that married my two previous roles – something that used my influencing skills and had a large impact, as in my Open Source Lead role, with the team and delivery of my Technical Architect role.

So I was looking for some kind of technical leadership. But within that, there is a lot of variation. For example, tech leadership can be just leading/managing technical people, or it can be just setting vision and direction for a technical product, or it can be a combination of both.

So it wasn’t immediately clear what kind of role I’d be looking for next.

Inventorying my skills

I was at an advantage because while I felt ready to move on, I was not obliged to rush. I still had a job and plenty of interesting things to do, so I could take a bit of time to make sure my next step was the right one.

The first important step was to work out what I was good at. This is something worth doing because although it seems like it might be obvious, I tend to focus on getting things done, and don’t always reflect on what skills it is that mean I’m succeeding (or what weaknesses mean I’m not).

There were four main ways I did this:

  1. I asked people directly what they valued about me. For example, a friend made an intro to her boss and I asked how she’d described me, and she reported that she’d said “more single minded than anyone else I know”. Another former boss called me “terrifyingly competent” (which I think was a compliment…).
  2. I did an exercise called a Johari window to learn what colleagues thought my strengths were, which I’ve written up here.
  3. As I started having interviews, I made sure to ask for feedback at every stage of the process. A good question to draw that out can be something along the lines of “Do you have any concerns about my ability to do this job? Are there any gaps I can perhaps set your mind at rest about?”
  4. And of course, as I live my life by lists, I made a list, and updated it when I noticed I’d done something well or badly.

Now that I’ve done this exercise it’s made me much more conscious and reflective about my strengths and weaknesses, which I think is helping me develop myself better.

Working out what I wanted in my next role

Because I wasn’t sure what my next steps were, I just started looking at job descriptions. As things happened at work, or I saw a job I was interested in (or very much wasn’t), I would make a note about what it was about it that appealed or repelled.

This was quite an interesting exercise, because it made me realise some things that are really important to me but that it wouldn't necessarily have occurred to me to mention to a recruiter.

For example, I cycle to work. This is a hugely important part of my life and I realised that when I see a job I’m interested in, the first thing I do is find out where the main location is, and then calculate the cycle time using Citymapper. If it’s too long a bike ride, that’s it, the job is out of contention. The interesting bit was that this was an almost unconscious process I was going through – the location was completely non-negotiable, but it didn’t appear in my list of characteristics of my ideal job.

Apart from cycling, I came up with a list of what I was looking for. It included:

Once I had this list, it was much easier to ask the right questions of recruiters and understand how well a job advert fit what I was looking for.

Parsing job titles is difficult

Unfortunately, identifying the kind of work I wanted to do didn’t mean I could then identify the job from the job title.

“Engineering Manager”, “Engineering Director” and “Technical Director” could all be either entirely people-focused, entirely tech focused, or a mix. Even “CTO” varies hugely from company to company. It’s not just that in a small company you’ll be hands-on whereas in a large one you’ll be many levels away from the code. In some companies, the CTO is actually in sales, or otherwise outward facing. (This is an interesting paper on the different kinds of CTO roles.)

I think this will be the case for the next few steps in my career; I'll need to know more about what the actual work is. A recruiter with whom you have a good relationship and who knows what you are looking for is useful here, as they will be able to parse the job titles for you.

I had 24 hours of chats

As soon as I started thinking about moving on, I reached out to people for advice, and by the time I’d found my next role I had had chats with 20 different people, including former colleagues, contacts and people I’d worked for.

Some of the excellent advice I got from the people I talked to included:

If you do nothing else, definitely have some chats

Many of the chats led to introductions to other people for more chats, and many of those resolved into an actual job that I could potentially apply for.

This meant some of the chats were with potential employers, almost a pre-interview stage, and these were also very useful. It was a lot of effort but it was good practice. The interview process is about learning about the company as well as them interviewing you, and I made some good contacts with people in companies that I never entered into a formal process with.

If you are looking for a new job and you only follow one suggestion from this post, make it this: reach out to people for chats. Be explicit that you are looking for a new role. It took a lot of time, but it was the most fruitful part of this process.

I had 21 hours of interviews

In the end, I interviewed with four companies, which involved 21 hours of interviews with 31 different people. At this level, application processes can be very long.

It’s worth mentioning that I found out about all four of those jobs through chats. The jobs were all advertised, but that wasn’t how I found out about them.

I withdrew from one partway through the process (“Engineering manager”) because having been through two interview rounds, I felt that the role would not be challenging enough for me. I was unsuccessful in another one (“CTO”), having made it to the final round, but it was a very useful experience and I made some good contacts.

And I was ultimately offered two, “Engineering Director” and “Technical Director”. Both reported to the CTO, both had opportunities for development and both met all I wanted in my next role. Ultimately, I chose the Financial Times because I really believe in the mission, but it was a tough choice.

This all took a lot of time, but it doesn’t need to

I said earlier that I was at an advantage in that I didn't have to rush, and if I'd moved when I started looking (September) I would have missed the amazing opportunity of my current job (which came up in the first week of January). That aside, though, it's not clear the long search was entirely an advantage; if I'd had more of an impetus, it might not have taken so long.

As my excellent friend pointed out, the risks of taking an imperfect next step might not have been so high.

I’m really happy with where I landed, and I learned a lot from the process, but if you plan to follow what I did, you don’t necessarily need to allow four months.

However, the right technical leadership role can take some time to come up – unlike developer roles, there isn’t always one out there somewhere. If you do find it takes time, you can always use that time to think about particular things you can achieve in your current role that will help you with what you are looking for.

Good questions I asked

Here are some of the questions that I asked in chats or interviews that got me really useful information about the job.

A good question to ask at the end of an interview is the one I mentioned above: "Do you have any reservations about my suitability for this role?" One person gave me ten minutes' worth of very useful, actionable feedback after I asked this. So much so that I suspected I wasn't going to get the job, but he put me through to the next round.

And a bonus, not necessarily for interviews but a really good question someone asked in a presentation once and I share here: “You’ve said how you see it working, how do you see this failing?”

Good questions I got asked

I got asked a lot of very good and/or difficult questions (though almost no competency-based questions). One advantage of doing a lot of interviews at once is that you get into the swing of answering questions like this. But it’s always worth practising, so here are some of the ones I was asked.

Bad questions I got asked

I also got asked some bad questions. A red flag was when people only wanted to talk about my previous job but one, i.e. my job as Technical Architect for GOV.UK PaaS.

My year as Open Source Lead was great, and I developed a lot of very useful skills, particularly around influencing, and identifying the highest impact areas to focus on. However, a few potential companies didn’t find my most recent experience relevant and only wanted to ask me very specific questions about team management, like ‘what dashboards do you make sure your team has up?’

Yes, I can manage a team, but that is not my main value to your organisation. When interviewers focused on this, it made it clear to me that this wasn’t the role I was looking for.

The hardest question I got asked

I had three interviews for my job at the Financial Times (with a total of 8 people). The last one was with the CTO John Kundert, now my boss, and the very first question was the hardest question I was asked in all of the 21 hours of interviews over the previous few months:

As with some of the questions above, all of the thinking I'd done about my strengths and values, plus the practice I'd recently had interviewing elsewhere, made this something I could answer both honestly and usefully.

It was a very difficult question, especially as the opener, but it also demonstrated that this was an organisation, and a manager, who would challenge me to be my best.

How it worked out

I’ve been at the FT for eight months now as Technical Director and it’s everything I hoped for. It makes use of some of the skills I already have, for example influence, clarity of thought, bias for structure and execution; while giving me the opportunity to develop some others, for example stakeholder management, managing a budget.

I'm sure that at least one of the other jobs in contention would also have been great, and approaching this process in good faith had unintended benefits. For example, it was a very good way to network. I received a free invitation to a tech conference from one job that didn't work out, and another invited me to be on a tech advisory board. Tech is a small world, and I'm sure I will be working with many of the people I met later in my career.

And I’ve even set up my own circuits club!

What I learned in six years at GDS

08 December 2018

When I joined the Government Digital Service in April 2012, GOV.UK was just going into public beta. GDS was a completely new organisation, part of the Cabinet Office, with a mission to stop wasting government money on over-complicated and underperforming big IT projects and instead deliver simple, useful services for the public.

Lots of people who were experts in their fields were drawn in by this inspiring mission, and I learned loads from working with some true leaders. Here are three of the main things I learned.

1. What is the user need?

The main discipline I learned from my time at GDS was to always ask ‘what is the user need?’ It’s very easy to build something that seems like a good idea, but until you’ve identified what problem you are solving for the user, you can’t be sure that you are building something that is going to help solve an actual problem.

A really good example of this is GOV.UK Notify. This service was originally conceived of as a status tracker; a “where’s my stuff” for government services. For example, if you apply for a passport online, it can take up to six weeks to arrive. After a few weeks, you might feel anxious and phone the Home Office to ask what’s happening. The idea of the status tracker was to allow you to get this information online, saving your time and saving government money on call centres.

The project started, as all GDS projects do, with a discovery. The main purpose of a discovery is to identify the users’ needs. At the end of this discovery, the team realised that a status tracker wasn’t the way to address the problem. As they wrote in this blog post:

Status tracking tools are often just ‘channel shift’ for anxiety. They solve the symptom and not the problem. They do make it more convenient for people to reduce their anxiety, but they still require them to get anxious enough to request an update in the first place.

What would actually address the user need would be to give you the information before you get anxious about where your passport is. For example, when your application is received, email you to let you know when to expect it, and perhaps text you at various points in the process to let you know how it’s going. So instead of a status tracker, the team built GOV.UK Notify, to make it easy for government services to incorporate text, email and even letter notifications into their processes.

Making sure you know your user

At GDS user needs were taken very seriously. We had a user research lab on site and everyone was required to spend two hours observing user research every six weeks. Ideally you’d observe users working with things you’d built, but even if they weren’t, it was an incredibly valuable experience, and something you should seek out if you are able to.

Even if you think you understand your users very well, it is very enlightening to see how they actually use your stuff. Partly because those of us in technology tend to be power users, and the average user doesn't use technology the same way we do. But even if you are building things for other developers, someone who is unfamiliar with your work will interact with it in a way that may be very different from what you envisaged.

User needs is not just about building things

Asking the question “what is the user need?” really helps focus on why you are doing what you are doing. It keeps things on track, and helps the team think about what the actual desired end goal is (and should be).

Thinking about user needs has helped me with lots of things, not just building services. For example, you are raising a pull request. What’s the user need? The reviewer needs to be able to easily understand what the change you are proposing is, why you are proposing that change and any areas you need particular help on with the review.

Or you are writing an email to a colleague. What’s the user need? What are you hoping the reader will learn, understand or do as a result of your email?

2. Make things open: it makes things better

The second important thing I learned at GDS was ‘make things open: it makes things better’. This works on many levels: being open about your strategy, blogging about what you are doing and what you’ve learned (including mistakes), and – the part that I got most involved in – coding in the open.

Talking about your work helps clarify it

One thing we did really well at GDS was blogging – a lot – about what we were working on. Blogging about what you are working on is really valuable for the writer, because it forces you to think logically about what you are doing in order to tell a good story. If you are blogging about upcoming work, it makes you think clearly about why you’re doing it; and it also means that people can comment on the blog post. Often people had really useful suggestions or clarifying questions.

It’s also really valuable to blog about what you’ve learned, especially if you’ve made a mistake. It makes sure you’ve learned the lesson and helps others avoid making the same mistakes. As well as blogging about lessons learned, GOV.UK also publishes incident reports when there is an outage or service degradation. Being open about things like this really engenders an atmosphere of trust and safe learning, which helps make things better.

Coding in the open has a lot of benefits

In my last year at GDS I was the Open Source Lead, and one of the things I focused on was the requirement that all new government source code should be open. From the start, GDS coded in the open (the GitHub organisation still has the non-intuitive name alphagov, because it was created by the team doing the original Alpha of GOV.UK, before GDS was even formed).

When I first joined GDS I was a little nervous about the fact that anyone could see my code. I worried about people seeing my mistakes, or receiving critical code reviews. (Setting people’s minds at rest about these things is why it’s crucial to have good standards around communication and positive behaviour – even a critical code review should be considerately given.)

But I quickly realised there were huge advantages to coding in the open. In the same way as blogging your decisions makes you think carefully about whether they are good ones and what evidence you have, the fact that anyone in the world could see your code (even if, in practice, they probably won’t be looking) makes everyone raise their game slightly. The very fact that you know it’s open, makes you make it a bit better.

It helps with lots of other things as well, for example it makes it easier to collaborate with people and share your work. And now that I’ve left GDS, it’s so useful to be able to look back at code I worked on to remember how things worked.

Share what you learn

It’s sometimes hard to know where to start with being open about things, but it gets easier and becomes more natural as you practise. It helps you clarify your thoughts and follow through on what you’ve decided to do. Working at GDS, where this was a very important principle, really helped me learn how to do this well.

3. Do the hard work to make it simple (tech edition)

‘Start with user needs’ and ‘Make things open: it makes things better’ are two of the excellent government design principles. They are all good, but the third thing that I want to talk about is number 4: ‘Do the hard work to make it simple’, and specifically, how this manifests itself in the way we build technology.

At GDS, we worked very hard to make the code, systems and technology we built simple for those who came after us. For example, writing good commit messages was taken very seriously. There was commit message guidance, and it was not unusual for a pull request review to ask for a commit message to be rewritten to make it clearer.

We worked very hard on making pull requests good, keeping the reviewer in mind and making it clear to the user how best to review it.

Reviewing others’ pull requests was the highest priority, so that no-one was blocked, and teams had screens showing the status of open pull requests (using Fourth Wall). We even had a ‘pull request seal’, a bot that published pull requests to Slack and got angry if they went uncommented for more than two days.

Making it easier for developers to support the site

Another example of doing the hard work to make it simple was the opsmanual. I spent two years on the web operations team on GOV.UK, and one of the things I loved about that team was the huge effort everyone made to be open and inclusive to developers.

The team had some people who were real experts in web ops, but they were all incredibly helpful when bringing me on board as a developer with no previous web ops experience, and they patiently explained things whenever other devs in similar positions came with questions.

The main artefact of this was the opsmanual, which contained write-ups of how to do lots of things. One of the best things was that every alert that might lead to someone being woken up in the middle of the night had a link to documentation on the opsmanual which detailed what the alert meant and some suggested actions that could be taken to address it.

This was important because most of the devs on GOV.UK were on the on-call rota, so if they were woken at 3am by an alert they’d never seen before, the opsmanual information might give them everything they needed to solve it, without the years of web ops training and the deep familiarity with the GOV.UK infrastructure that came with working on it every day.

Developers are users too

Doing the hard work to make it simple means that users can do what they need to do, and this applies even when the users are your developer peers. At GDS I really learned how to focus on simplicity for the user, and how much better this makes things work.

These three principles help us make great things

I learned so much more in my six years at GDS. For example, the civil service has a very fair way of interviewing. I learned about the importance of good comms, about working late responsibly, and about the value of content design.

And the real heart of what I learned, the guiding principles that help us deliver great products, is encapsulated by the three things I’ve talked about here: think about the user need, make things open, and do the hard work to make it simple.

This post originally appeared on 24ways.

How to interview job applicants fairly

22 November 2018

Interviews are not a great way to find out how good someone will actually be at the job. They can also make it easy for your unconscious biases to have too much influence. This has led some to suggest that interviews are broken and we need to find other ways; but it is possible to do interviews in a much fairer and more informative way.

The best predictor of future performance is past performance

Let’s say you want someone who is good at setting vision and direction for a large team. A question like “How would you set vision and direction for a large team?” is only going to give you hypothetical answers.

All you will learn is how they talk about doing it, how much they’ve read around the topic of vision-setting, and also how much what they say chimes with what you believe. But we know that planning to do something a certain way and actually doing it that way are two very different things. They might say all the right things, but then not be able to follow through.

A better question would be “Tell us about a time that you set vision for a team”, because that will get them to talk about their actual past behaviour, which is a much better indicator of their future behaviour. It also gives you the opportunity to ask follow-up questions to find out more about their approach, like “How did you do it?”, “What was the result?”, “What didn’t go so well?” None of those questions are hypothetical, so you’re not asking the candidate to make up something you want to hear; you’re asking them what they did, and how they thought about it.

Note that it doesn’t have to be the exact same thing, because chances are this job will be a step up in one way or another. You need to work out what the important part is, and ask about that. So my question above was about setting vision for a team, not for a large team, because the important thing I wanted to know about was vision-setting.

Make the questions relevant to the main skills in the job they will be doing

Interviewing like this is something the civil service does very well. When advertising a job, the hiring manager works out what competencies the job will involve.

Let’s say you are hiring a tech lead. What should this person be able to do? You might come up with some things like set technical direction, communicate effectively with other teams, communicate effectively with the business, unblock team members.

You may come up with a very long list. Identify the four or five that are essential to the role, and then devise questions around those skills. “Can you give us an example of a time that you unblocked a team member?” for example. Or, “Tell us about a time that there was a breakdown in communication between engineers and the business. What did you do to improve the situation?”

Spending time thinking about what exactly you want from the candidate is time consuming, but nowhere near as time consuming as hiring the wrong person.

Make sure the questions have clear criteria that you can judge them on

As well as being relevant to the role and asking about past performance, the questions need to be answerable in a way that you can judge objectively. For example, with the vision-setting question, you’d be looking for things like whether they considered a diverse audience, and whether and how they measured the impact.

If you ask a vague question like “Have you worked in an Agile environment? What did you find positive and negative about it?”, what criteria can you judge their answer on, other than whether that sounds like the kind of thing you’d agree with?

Ask all candidates the same questions

Once you’ve identified the most important areas, write out the questions and ask all candidates the same ones. This may feel unnatural, as you will be following a script, but it’s the only way to make sure that you aren’t just making biased assumptions and are actually comparing the information each candidate gives you with each other candidate.

I recommend you come up with around 5 or 6 questions. Once you’ve allowed time for introducing the panel, explaining the interview, and 5-10 minutes at the end for answering their questions, then six is the most you can reasonably cover in an hour.

Some people do go into a lot of detail, so be prepared to cut people off if necessary – you want to give them the best chance possible to demonstrate the full range of skills.

Encourage them to give you all the information you need for the answer

You want to know as much as possible about the extent and effectiveness of what they did, so you can understand whether they have the skills you are looking for.

A useful model to think of is the context/action/result model (a useful mnemonic is CAR). If they don’t tell you all that as part of their answer, you can prompt them.

For example, take the question about unblocking team members. Context: why were the team members blocked? Action: what did you do to unblock them? Result: what was the outcome?

This will give them a better opportunity to make sure you have the information you need. Remember, this is about giving them the opportunity to give you the most useful information to help you make a decision about whether they are right for the role.

(For candidates, I’ve got a bit more advice about how to structure your answers and think of examples in this blog post for the Government Digital Service or read this excellent Twitter thread by Beth Fraser.)

Score the questions immediately

In order to make the most of the process, you need to give a score for each question as soon as you can. Ideally immediately after the interview, so that you can remember what has been discussed. This helps you focus on how they performed against each of the important skills, rather than being influenced by what you thought of them personally, or your general sense of whether they’d be good (which will be informed by your unconscious biases).

This also helps address the primacy/recency effect, where you will think more highly of them if they answered the first and last question well, or if they were the first or last candidate.

At GDS we used a score of 0-3.

  0. They demonstrated no evidence of this skill (e.g. if the question is about Perl, they cannot write it at all).
  1. They showed that, with support, they can master this skill (they made a few mistakes; they’ll get there).
  2. They have definitely demonstrated this skill (they can start writing Perl on day one).
  3. They’ve exceeded the skill level required for this role (they are Larry Wall).

Have more than one person in each interview

To avoid biases, you need at least two perspectives and ideally three, and the interviewers should be different from each other to help with that: for example, from different disciplines, or different levels (e.g. a junior and a senior), or different ethnic backgrounds, etc.

Set people up to succeed

Interviews are not meant to be a stressful test, or have trick questions. You want to get the right person for the job, not outwit anyone. So the whole process should be clear and transparent to the candidate.

In the job advert, spell out the skills you are looking for. When you invite candidates to interview, let them know that they can prepare by thinking about examples of times they’ve demonstrated those skills, and perhaps tell them about the context/action/result model.

This means that people will be set up to do their best and you’ll be able to really give them a chance to shine. And it will also mean that you can give clear feedback on what areas to work on for unsuccessful candidates.

Further reading

If you want to read more about this, here is an article about why interviews are useless. But, almost as a throwaway at the end, when discussing what can be done, it says: “One option is to structure interviews so that all candidates receive the same questions, a procedure that has been shown to make interviews more reliable and modestly more predictive of job success. Alternatively, you can use interviews to test job-related skills, rather than idly chatting or asking personal questions.”

How to do exactly that is what I’ve described here.

High output management

05 November 2018

Earlier this year I read High Output Management. This is one of the best books I’ve read on leadership, and I really recommend you read it. Some of the things I took from it are here.

It’s really excellent on what the important things are when running a large team or business. Andy Grove was co-founder and CEO of Intel, so it’s written by someone who was actually doing the work at the time, unlike lots of other management books.

A lot of it really resonated with me because it reflected some of my own views, but it also changed my perspective on a few things; particularly on the value of 1-1s and on how much time it is reasonable to spend in meetings.

The art of management is selecting which activities will provide most leverage

“Like a housewife’s, a manager’s work is never done.”

There is always more you could or should be doing, so the skill in being a manager is in shifting your attention to the activities which will have the most impact on increasing the output of your organisation. In other words, working out which activities are the ones where your leverage will be greatest.

The highest leverage activity is developing your reports

This is the highest impact thing you can do, because the more senior your reports are, the more they can take on big pieces of work that will free you up.

This makes 1:1s with your reports one of the most important things you can spend your time on. He suggests the agenda for the 1:1 should be set by the report, but also says you should cover current problems, plans and what’s worrying them/what they see as future problems.

Another suggestion he makes is that being part of a peer group is a way to increase your leverage by affecting the work of all. This one was counter-intuitive to me, but after six months of being involved in the Technical Leadership Group at the Financial Times, I can really see what he means by this.

Your output is the output of your team

The output of a manager is the output of the group under her supervision or management – or, as my excellent boss Cait O’Riordan puts it, “you’re nothing without your team”.

He gives the example of a fire station: you cannot predict where the next fire will be, so you have to shape an energetic and efficient team that can deal with whatever comes up.

Delegating is extremely high leverage

He points out that the “delegator” and the “delegatee” need to have a common set of operational ideas about how to go about solving problems, otherwise the delegatee can only become an effective proxy if given explicit instructions. Like with micro-managing, this really isn’t effective in multiplying your impact, or, as he puts it “meddling produces low managerial leverage”.

So a manager must communicate her objectives, priorities and preferred approaches, and you need a shared set of values to effectively delegate. He also says that following the corporate culture makes it easy for people to make decisions.

If that common base of information and similar ways of doing and handling things is created, this can exert enormous leverage.

He has a lot of very useful advice about how to delegate

Something that is a small task to you may be a much larger proposition to the person you delegate it to. “A senior manager’s tactics might be the next manager down’s strategy.”

He also has suggestions for how to delegate effectively, following the principles of quality assurance in manufacturing that he introduced earlier in the book, e.g. review rough drafts of reports and sample the output (more often if they are unfamiliar with the task, then ease off). The approach depends on the “task-relevant maturity” of the employee. Operational reviews and presentations are a good source of motivation, as people will want to make a good impression.

“Monitoring” – giving people objectives and checking in – is on paper a manager’s most productive approach but we have to work our way up to it, and if things suddenly change we may have to revert quickly to much closer management of the task. But we are biased against that, so sometimes we don’t take it up where necessary until too late. We need to think about the most effective way to manage, not be guided by what we feel is good/bad.

Meetings are the medium of managerial work

This one is really interesting, and he calls out that he disagrees with Peter Drucker on this. Drucker says that spending more than 25% of your time in meetings shows that “there is time-wasting malorganization”. Certainly, coming from an engineering background, I tend to feel the fewer the meetings, the better.

But Andy Grove expressly disagrees with this. He points out that, as above, a big part of your role is to share know-how, and “impart a sense of the preferred method of handling things” to your reports. This can only really be done in person, and therefore, during meetings. “Thus I will assert again that a meeting is nothing less than the medium through which managerial work is performed”. So we shouldn’t be fighting their existence, but just making sure that we are using the time as efficiently as possible.

This has really changed my perspective on a lot of the meetings I have.

Be available

The most important thing in making good decisions is getting information; and usually this is better if verbal and quick, rather than written reports. It’s useful to go round and chat to people. Reports are mostly useful because they force people to clarify their thoughts, so the writing is more important than the reading.

So make sure you schedule slack into your calendar. But you should also have a list of non-urgent things you can do that will increase productivity – otherwise you will use your free time to meddle in your reports’ work, which will slow things down.

How to deal with interruptions

Hiding physically in order to get work done is not great, because people do have genuine questions or issues and you’ll slow things down.

One approach is to standardise responses, which can then also help with delegation.

Another is batching – if you have regular catch-ups people can’t complain about being asked to batch and bring to you at scheduled times. You could also have office hours.

“To make something regular that was once irregular is a fundamental production principle and that’s how you should try to handle the interruptions that plague you”.

Performance review is really important

Don’t ask your reports to review themselves first. You should be paying attention. “Reviewing the performance of subordinates is a formal act of leadership.”

He scrutinises a selection of performance reviews given by other managers to their reports throughout the year, with as much visibility as possible, to make it clear it’s the most important kind of task-relevant feedback we can give our reports.

When delivering a performance review make sure your report receives the message. Don’t say too many things in a performance review as they may not take it in – focus on the most important things. He also has some good, practical advice on how to prepare for and run a performance review.

The performance rating of a manager cannot be higher than what you would give the organisation under her control.

If there is a problem and they don’t agree with your assessment but commit to your proposed solution, that’s fine. To make things work people don’t need to side with you, they only need to commit to pursue a course of action that’s been decided on. He also has some practical advice on how to get that commitment.

When reviewing a star performer make sure to focus on how they can improve their performance

He points out that often the reviews of star performers focus a lot on what they’ve done well, whereas reviews of poor performers focus on how they can improve, but in fact we should spend more time on focusing on how star performers can improve. It’s a high leverage activity because it will have a big impact on group output.

Develop culture and values by what you do as well as what you say

He had two important examples of this:

  1. When you choose to promote someone you are signalling your values and who you see as role models
  2. If a valued employee wants to quit your top priority becomes that – not going to whatever important meeting you are on your way to. Your first reaction counts.

How to use objectives and key results

OKRs are usually attributed to Google, but while reading this book I realised that in fact they originated with Andy Grove at Intel. Grove developed Peter Drucker’s Management by Objectives into OKRs, and John Doerr then learned them at Intel and took them to Google. Reading about them here was the first time they actually made sense to me, rather than feeling cargo-culted.

Objectives are what you need to do; key results are how you know you are on your way. The example that really made it clear to me was: your objective is to reach the airport in an hour. Key results are: pass through town A at 10 mins, B at 20 mins, C at 30 mins. If after 30 mins there is no sign of town A, you know you’ve gone off track. So they need to be clear enough that you know you’ve met them, and that you are on track.

He points out that the system requires judgment and common sense. Objectives are not a legal document. If the manager mechanically relies on the OKRs for the review, or the report ignores an emerging opportunity because it wasn’t one of the objectives, “then both are behaving in a petty and unprofessional fashion”.

And finally, a very important point: you should not have too many! “To focus on everything is to focus on nothing”.

Everyone should be able to contribute to decisions, but then commit even if they disagree

The first stage of making decisions with a team needs to be free discussion. It’s very important that everyone feels they can contribute, because more junior people are likely to have a better grasp of the technology than senior managers, but senior people have experience from previous errors, etc.

Once a decision is made it needs to be clear, and it needs to be supported by everyone, even those who don’t agree.

Confidence in these discussions comes from realising it’s OK to make mistakes, so it’s important to make sure everyone understands this. That is, in this book written 35 years ago, he’s telling us about the importance of psychological safety.

Say no as early as possible

In manufacturing, it is vitally important to reject defective material at its lowest value (i.e. when you’ve not done much work on it). Similarly with managerial work. Say no as early as possible. Getting an unnecessary meeting called off early is the same as stopping work on something at a low value-added time.

Remember: saying yes to one thing means saying no to another. And we can say no in two ways – one by saying no, and one by saying yes but not actually doing it. The latter wastes time and energy.

Plan for the future, not fixing the present

The final thing I wanted to note was what he says about planning. If you see a gap today, that represents a failure of planning in the past, and if you focus your energy on fixing that, you are constantly chasing after things that should already have happened. You need to instead focus on future events.

“Remember that as you plan you must answer the question: What do I have to do today to solve – or better, avoid – tomorrow’s problem?”

You should read the book

It was really hard to write this post because I made so many notes while reading it. When refreshing my memory of the book while drafting it, I wanted to make so many more, and this really just scratches the surface. It’s an excellent book and well worth your time reading it.

Three other books I would recommend

I’ve read lots of other good books on work-related topics, but the three that have had the most impact on me are:

Don’t make me think

This is an excellent book on web usability. If you have anything at all to do with producing websites and haven’t read it already, you absolutely must.

Thinking fast and slow

This is a quite dense read but it’s extremely worth the effort. It explains how our minds trick us and what we can do to avoid making mistakes by biased thinking.


Slack

In itself this is not an amazing book, but it’s worth reading for the point about how you need some slack at work. It’s also a short read. Two important points from it have stuck with me:

  1. The example of a sliding tile puzzle. Without the empty slot, it’s a much more efficient use of space, but you can’t actually do the puzzle. This has stuck with me as a really powerful metaphor for why we need slack at work.
  2. His explanation of why “people are not fungible resources”. If you have someone working half-time on two teams, it’s not the case that they are 50% on each team and that adds up to 100%; it’s more like 30 or 40% on each team. The more teams you split people across, the less of their time is actually usable.

For a better explanation of these two points alone, this is definitely worth the short time it will take you to read it.

Do let me know what you think

I’d be interested to hear about books that you recommend, or what you thought of these books if you’ve read them.

Moving SPA to HTTPS

03 August 2018

I had long been thinking I must move the SPA site to HTTPS. However, everything is difficult when using shared hosting and I wasn’t sure how to do it. So I was delighted when Johan Peeters and Nelis Boucké proposed a session for the 2017 SPA conference called Why SPA should switch to HTTPS and how easy that is.

It was a great session. In short, they recommended three approaches:

  1. Get the hosting provider to do it. Before the session, Yo and Nelis asked me who the SPA hosting provider was (it isn’t discoverable through dig as it’s resellers all the way down). Once I had told them, they did a bit of investigation, and the hosting provider we use charges £180 per year for HTTPS. A bit much for a non-profit conference run by a registered charity.
  2. Use Let’s Encrypt. This is a free certificate authority, provided by the non-profit Internet Security Research Group whose aim is to make a secure and privacy-respecting web. The advantage of this is that you are managing it yourself (though that is also a disadvantage as it’s then up to you to make sure that you get the configuration right and keep up to date with changes). If you are using nginx or Apache, it can be a little tricky to configure, though that is addressed if you use Caddy. However, Let’s Encrypt won’t work for SPA, as the shared hosting does not allow us to install anything.
  3. Use Cloudflare’s free plan. You need to point your nameservers at Cloudflare. It means that they terminate your TLS and you are sharing a certificate with many other websites, plus the connection from your server to Cloudflare may still be unencrypted, but it’s free, easy and massively reduces the attack vectors and ability for people to sniff your network traffic. (This is an excellent article about the security aspects of Cloudflare’s service).

So it seemed like Cloudflare was the only plausible option for SPA.

First, I set Cloudflare up for this site

Prior to that, my blog did not use HTTPS, and I didn’t see why it would need to, as all the content is open anyway.

But Yo and Nelis made the excellent point that even static sites with non-private content should be HTTPS, because if only sites that need to be private use TLS then that leaks information just by that fact.

Setting Cloudflare TLS up for my site was easy. I did that in the break after the session, and it had propagated within a few hours.

That went so well I decided to move on to doing the SPA site shortly after lunch. This proved to be significantly more tricky.

CSS stopped working

After the nameservers had been changed to Cloudflare’s nameservers, the CSS was no longer available. The site looked like this:

SPA homepage with no CSS

My initial response was to publish the site. Big mistake.

SPA programme page with no content at all

Frantically trying to get the site back online

Before I go on, I should point out that this was in the afternoon of the last day of the conference. (Thank goodness it wasn’t the first day!) So while most presenters and attendees were at the post-conference pub drinks, I was at home frantically trying to get something back where the website had been so that when presenters went to upload their outputs from sessions there was something there for them to upload to.

In fact, the /scripts pages were fine, so they could have uploaded their outputs… if they’d known to navigate directly to the page.

The problem was with how we used the CMS

At that stage, the static content on the website was edited in a staging directory using a CMS (a fairly dated version of CMS Made Simple), and then published using a publish.php script which copied the content from staging to production.

It turned out that the issue with the blank pages was with the publish script. Somehow, running the script, instead of copying pages from staging to production, had replaced the production pages with empty strings.

The copying was done with the following code:

function getUrlContents($url, $credentials="") {
    if ($credentials == "") {
        return file_get_contents($url, false);
    } else {
        $context = stream_context_create(array(
            'http' => array(
                'header' => "Authorization: Basic " . base64_encode($credentials)
            )
        ));
        return file_get_contents($url, false, $context);
    }
}
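For reference, the same authenticated fetch can be reproduced from the command line; a minimal sketch, with made-up credentials and URL:

```shell
# Build the same Authorization header the publish script sends (credentials are made up)
credentials="chair:s3cret"
auth_header="Authorization: Basic $(printf '%s' "$credentials" | base64)"
echo "$auth_header"
# The copy step is then essentially:
#   curl -H "$auth_header" https://example.org/staging/lead-a-session.html
```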

My initial thought was that the issue was with stream_context_create, possibly with the 'http' options key. But reading the documentation for that and for file_get_contents, and extensive googling for “stream context create HTTPS”, got me nowhere. Eventually, I created a small test file to run this code locally. But there it output the full page correctly.

So then I deployed my test file to the server, and it failed there. I then tried curl on the server, and got this output:

curl: (35) error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure
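That (35) is curl's generic TLS handshake failure code. A quick sketch of decoding the SSL-related curl error codes (the messages are my own summaries):

```shell
# curl error 35 (CURLE_SSL_CONNECT_ERROR) means the TLS handshake itself failed
curl_error=35
case "$curl_error" in
  35) echo "TLS handshake failed: no protocol or cipher in common" ;;
  60) echo "peer certificate could not be verified" ;;
  *)  echo "see the EXIT CODES section of 'man curl'" ;;
esac
```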

A StackOverflow answer led me to question the version of OpenSSL:

OpenSSL> version
OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008

Essentially, the version of OpenSSL on the server was too old to complete the TLS handshake, so neither curl nor file_get_contents could fetch pages over HTTPS.

Unfortunately, with shared hosting, there’s nothing I can do about that. I can’t upgrade OpenSSL, and I certainly can’t rebundle curl and whatever file_get_contents is using.

The only solution that allowed publishing using the current method was to serve the site over both HTTPS and HTTP.

This meant I had to change the settings on Cloudflare to no longer be “always use HTTPS”, which also meant that I couldn’t use HSTS. This was very annoying as it meant I then had to rely on the user to try to use HTTPS in order to protect themselves. On the plus side, it did allow me to publish the content again rather than an empty string.

Fixing the CSS

The issue with the CSS was that the page source contained the tag <base href="" />. I could not see where this was being set. After much investigation, it turned out to be a metadata tag set by the CMS.

It turned out that to change this setting, I had to change some code in the template from {metadata} to {metadata showbase="false"}.

Documentation about the metadata tag

It was not possible to find this by grepping; it involved understanding how the CMS worked. Not for the first time, I thought that I must remove the CMS.
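Grepping the CMS source got me nowhere, but grepping the rendered output does at least show which pages carry the tag; a sketch with a sample file (the real pages live in the site's output directory):

```shell
# Write a sample page carrying the tag, then list every file that contains it
printf '%s\n' '<base href="https://example.org/" />' > /tmp/sample-page.html
grep -l '<base href' /tmp/sample-page.html
```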

Fixing this meant that the site was available over HTTPS, with CSS, but only if the user went to https:// rather than http://. Not ideal, but a step in the right direction.

A series of embarrassing roll-backs

However, when it came to launch the 2018 site, I had to roll back even further.

The first roll-back I mentioned above was almost immediate, switching off HTTPS by default in July 2017, in theory until such time that I could fix the publishing/remove the CMS.

Then, in September 2017, when I was deploying the site for 2018, the whole thing went wrong. Instead of the site, I got a Cloudflare warning page.

Cloudflare warning

I needed to finish getting the 2018 site out, and didn’t have time to investigate and fix (often the way on side projects when the next priority should be going to bed). In order to move forward, I had to switch off HTTPS entirely, which would have been an embarrassing full climbdown if anyone paid attention to the site outside of the conference season.

I couldn’t remove the CMS immediately, as I was in the middle of an epic of removing MediaWiki from the site. While I was writing up that epic, I realised that the session and user pages on the 2017 site didn’t have CSS because I’d optimistically updated the links to point to HTTPS. So I then had to find and replace all those links.

However, the new site was being published with an HTTPS base tag: <base href="" /> which meant that the CSS could not be retrieved as it was no longer available over HTTPS. I knew that I’d set that somewhere but I couldn’t remember where, and I spent a long time trying to figure that out. I eventually found the answer… above, in a draft of this blog post. (A huge benefit of having a blog when you don’t have an amazing memory!)

Then I realised even my new 404 and 500 error pages, added as part of the MediaWiki removal had a link to the HTTPS version of the homepage. Even that had to be removed.

It was a series of ever more embarrassing roll-backs until, in October 2017, I was exactly where I had been when the session was proposed.

A side note

At some point during this process, while looking at the certificate, I saw this.

Shows I have visited site 1,920 times

That’s depressing.

Fixing the staging issue

In October, once I’d finished removing MediaWiki, I returned to this problem.

The current status was that I could only deploy from staging to live via HTTP, because doing so meant copying the contents of the staging pages to production, which didn’t work over HTTPS with the available version of OpenSSL. It seemed to me there were three potential ways to address this:

  1. Move staging to its own subdomain. This would mean I could put the main site on HTTPS while leaving staging on HTTP to allow the data to be copied.
  2. Find another way to copy from staging to production that doesn’t depend on OpenSSL, i.e. defer the problem for now with an interim fix until I’m ready to tackle removing the CMS.
  3. Bring forward my plan to sack the CMS from ‘someday’ to ‘now’.

I dismissed the first idea pretty quickly. It would be a lot of work for what would ultimately be a temporary solution. If I’m going to do a lot of work at this stage, it should be to remove HTTP entirely.

The third idea was clearly the best in the long term – removing the CMS was something I wanted to do anyway – but it would be a lot of work, and I didn’t want to go down that rabbithole yet; I just wanted to get HTTPS back. So I looked closely at option 2.

An interesting diversion

My initial supposition was that the files would be generated on the server so I could just copy them over, but they’re not. CMS Made Simple stores the data in separate blobs in the database, and composing them is difficult. However, CMS Made Simple is open source, and because we are on shared hosting, the code was vendored, so I could dig into it and work out how the files were composed.

This was a very interesting diversion and involved a lot of diving into output buffering in PHP. I spent quite some time down this rabbithole, dealing with issues like my script’s output buffer not working, which turned out to be because of an unexpected ob_get_clean() in the CMS Made Simple PHP; the buffers are not nested, so that call clears all buffers for the rest of that process.

I finally managed to generate the Lead a session page using the following code.


$alias = 'lead-a-session';

function buffer_callback($buffer_contents) {
    // Rewrite CMS-internal links (index.php?page=foo) to static foo.html links
    $buffer_contents = preg_replace("/index\\.php\\?page=([^\\\"]*)\\\"/", "\\1.html\"", $buffer_contents);
    return $buffer_contents;
}

$_SERVER['REQUEST_URI'] = "/staging/index.php/" . $alias;


The script only sets REQUEST_URI itself when invoked from the command line, guarded by:

`if (!isset($_SERVER['REQUEST_URI']) && isset($_SERVER['QUERY_STRING']))`

and I could then generate the page with:

`QUERY_STRING='page=lead-a-session' php -q -d error_reporting=0 index.php > test.html`

So essentially the script recreates the page in the same way the CMS does for display, and then copies that page to production.
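The link-rewriting step in the buffer callback can be sketched with sed; the sample anchor tag here is illustrative:

```shell
# Rewrite a CMS-internal link (index.php?page=foo) to its static .html form,
# the same transformation the buffer callback's preg_replace performs
printf '%s\n' '<a href="index.php?page=lead-a-session">' \
  | sed -E 's/index\.php\?page=([^"]*)"/\1.html"/'
```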

And this worked! I could use this as the publish script.

Except… it was missing the menu, which is constructed differently.

And a number of other things:

Remaining TODO for generating CMS pages

  - Needs nav and other sorting
  - Needs programme
  - Needs not to output the output of the script!
  - Needs to suppress header warning
  - Needs switching to HTTPS
  - Clear the extra buffer handlers

At this point, I gave up. This was, again, turning into a rabbithole for an uncertain and interim outcome. It was time to face facts and remove the CMS.

This was a huge piece of work that took several months (as this is something I maintain in my not hugely copious spare time), and I’ve written that up in a separate post.

Finally, having removed the CMS…

I switched HTTPS back on. I had only paused it in Cloudflare, so this wasn’t a lot of setting back up again.

This time round there was no issue with the CSS, as the links were relative and the base tag had been removed. For the same reason there was no mixed content; all much neater.

I turned ‘Always use HTTPS’ back on. This redirects all HTTP requests to HTTPS, which is an extra hop, so ideally I want to update all links in the previous sites but this is not a high priority.

Some of the previous sites don’t look great, but there is a Cloudflare setting ‘Automatic HTTPS Rewrites’ which helps fix mixed content by changing HTTP to HTTPS for all resources that can be served with HTTPS, which fixes some of them. I’ll update the others when I have time.

I changed back my new error pages and all the URLs in the code, and then switched HSTS back on.

Moving the site to HTTPS caused the build of the Jekyll site that replaced the CMS to fail in a way I hadn’t anticipated, but the amazing advantage of the static pages now being open meant that someone else identified and fixed this problem before I was aware of it.

The main issue outstanding is that the local version of the scripts doesn’t work over HTTPS. This is annoying, but currently only for me as I’m the only person working on it, so I’ll fix it at some point. I also set up the SPA account with Cloudflare under my account rather than a separate one, which will make it harder for me to hand over when I do, but that’s also a story for another day.

The great success is I finally managed to switch HTTPS back on in time for this year’s conference.

I’m not sure it was that easy

So, despite Yo and Nelis’s claims, it was not in fact that easy. To be fair, they didn’t know the full story of the 20 year old PHP site. And it’s a great relief to have done it.

I look forward to the next epic task they set me.

Replacing the SPA CMS

20 July 2018

As part of my ongoing mission to update the SPA conference website, I had long wanted to replace the CMS. When I realised it was blocking the move to HTTPS, I knew the time was now. This post is about how I did that and what I learned.

The conference website used a CMS to generate the static pages

The functionality of the conference cycle, for example programme submission, reviews etc, is done through PHP scripts, but the generation of the static pages used CMSMadeSimple.

The CMS didn’t edit the site directly; it edited a staging site, which was in fact a subdirectory. When the chairs were happy with the content in staging, they published it to the live site (another subdirectory) using a script called publish.php.

It wasn’t very secure

The staging site and the publish script were behind Basic Auth, which, coupled with the fact that the site was still HTTP (more on that another day), meant it wasn’t very secure. When new chairs joined the committee, I had to create their passwords for Basic Auth as I was the only one with access to .htpasswd, and email the passwords to them; very insecure. In addition, the publish script had the previous web admin’s credentials in it to allow it to work, which was a bit of a blocker to my plan of eventually open sourcing this code.

However, this is not a very high-profile website so the risk and impact of someone unauthorised gaining access is quite low. The CMS, while a bit dated, was relatively intuitive and easy for new chairs to use, so replacing it was not a high priority.

It was blocking the move to HTTPS

It became a priority to remove it when I discovered it was blocking moving the site to HTTPS. The issue wasn’t the CMS itself, but the publish script, which literally copied the page from the staging subdirectory to the current year subdirectory using curl. The version of OpenSSL running on the shared hosting didn’t support copying the page contents over HTTPS.

Because it’s shared hosting, it’s not possible for me to upgrade OpenSSL, so after considering various other ways to copy the site from staging to live, it became clear that the right solution was to bring forward my plan to remove the CMS and publish the site a different way.

Most of the pages were static so could be replaced with a Jekyll project

It was quite easy to choose Jekyll. I’m familiar with it from using it on this blog, and I know others are too. Hosting the project on GitHub meant that I could offload the issues around authorisation to GitHub, dealing with the security issues I mentioned above.

Using GitHub also has the advantage that people can see when others are editing and what changes are being published, and comment if necessary. With the CMS, there was no way to tell if another user was editing at the same time. The publish script published everything, and a couple of the conference chairs had mentioned to me that this was a bit stressful, because there was no way of knowing if you were about to publish something half-finished that another chair was working on. With GitHub, you can see what commits have been made and what pull requests are open.

I also removed the necessity to run a separate publish script by setting the project up so that it deploys automatically to the live site on merge to master. This turned out to be quite tricky on shared hosting, so I wrote up how to do it.

The programme generation was a different kettle of fish

While most of the site content is completely straightforward, the programme generation was managed in a completely different way, not using the CMS. The programme was created via the PHP scripts, using a plug-in called Xinha as a WYSIWYG editor. Because the site is on shared hosting, this was vendored into our code, and was an old version.

Image showing how the edit programme with Xinha looked

The accepted sessions were available to be added via a drop-down menu. When chairs published the programme, the accepted session links were expanded to include all the details about the session and the session leaders, with links to the supplementary pages.

The session and session leader pages themselves had to be generated separately, using a publishallsessions.php script.

The process was not intuitive, but in theory it worked quite nicely because you could just add the sessions you wanted via a drop-down, and publishing it took care of generating all the rest of what was needed.

However, when I looked into it, it turned out that the last programme chairs who actually used it were me and Andrew in 2014. Subsequent chairs had generated or hand-written the HTML in a different way, including manually adding all the session details.

I wanted to create a process that had the good features of the old process (like automatically creating the session and session leader pages) but was easy enough to use that subsequent chairs would actually use it.

The programme had to look like the rest of the site

The way it had worked previously was that the “publish” button saved the HTML for the programme (so not the whole page, i.e. not the header and footer) to the file system on the server. If you look closely at the diagram above you can see the ‘Publish’ button says “to staging only”.

Close up of button that says 'Publish, to staging only'

The CMS then picked up the HTML from the file system, added the same header and footer as the rest of the site and published it along with everything else when the publish script was run. (This did mean that the preview and published programme HTML was actually available on the file system if you knew where to look, not even behind Basic Auth.)

Before launching the new Jekyll site, I did a proof of concept to make sure it would be possible to publish directly, rather than via the CMS.

I also looked around to see how other conferences generated their programmes, but I couldn’t find a good tool.

Working out how to generate the programme took AGES

It took a very long time to turn the proof of concept into a working process for generating the programme because there was a lot to unpick. The main challenge was making it look like the rest of the site.

I didn’t want to add it to the Jekyll project, because that is public and I think it’s important that the programme generation is done privately. You wouldn’t want session proposers to find out their session had been rejected by being able to observe you editing the draft programme. You might also not want it to be public who had pulled out of the conference or other changes you might make in the run up to publishing it.

I considered a number of options for how to do it, and concluded that adding it directly to the correct place on the filesystem on publish was the least worst option.

How programme generation works now

Instead of generating HTML and saving that to the file system, my changes saved the programme into the database. I created a very basic form for editing.

Basic form for editing programme

I had big plans for making it much prettier, but meanwhile the conference cycle marches on and I had to get something out before the programme meeting in March of this year, so MVP it is.

To keep it in synch with the Jekyll site, I added programme includes. These are included in the programme page so that when the rest of the site is generated they will change too – so, for example, if another item is added to the menu this will show up the same on the programme page.

<?php include($GLOBALS['pathToCurrentYear'] . '/programme_includes/header.html'); ?>

This means that when I’m retiring the site to go in the list of previous conferences, I’ll need to make sure to replace that with the actual HTML.
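The include mechanism amounts to assembling the page from shared fragments, so a change to the header shows up everywhere; a sketch outside PHP (the paths and fragment contents are made up):

```shell
# Build shared fragments, then assemble a page from header + body + footer
mkdir -p /tmp/programme_includes
echo '<header>SPA</header>'     > /tmp/programme_includes/header.html
echo '<footer>(c) SPA</footer>' > /tmp/programme_includes/footer.html
echo '<main>programme</main>'   > /tmp/programme_includes/body.html
cat /tmp/programme_includes/header.html \
    /tmp/programme_includes/body.html \
    /tmp/programme_includes/footer.html
```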

I also added the ability to preview the page exactly as it would look on the site. previewing programme looks as it would on site

And the publish button now actually publishes to the live site, by saving it in the correct place on the filesystem.

I’m not very happy with the separation of the generation of the static pages from the generation of the programme, especially as there is a chance they could be briefly out of synch (e.g. if you add a new menu item and republish the site but do not republish the programme). In practice, this will rarely happen as once the programme is published it tends to be frequently republished as things change, e.g. session leaders update details of their sessions. However, I see this as an area to be improved in future.

The preview and publish code both reused the code from the previous editing of the programme to generate the full info for the session, including a link to the session page and the user pages.

In addition, I made it so that publishing the programme also publishes those session and user pages rather than requiring the extra step of the publishallsessions.php script.

(This meant, before my changes, if a user updated their bio and asked you to publish it you would have to run publishallsessions.php in the browser, then navigate to the edit programme page in the site and hit the publish (to staging only) button, and then run publish.php in the browser. Now, you just navigate to the programme and hit the publish button.)

Tidying up after the programme

Finally, I deleted all the Xinha code. This was INCREDIBLY SATISFYING.


Tidying up after the CMS

But that was NOTHING compared to the joy of getting rid of the CMS. Not just the code – that was great – but also tidying the database.

CMSMadeSimple stores everything in the database, so once I’d removed it I could remove 54 tables. Added to the removal of the MediaWiki tables, in a few months I had gone from a database with 113 tables to one with 10. This is so much better.

Also, because the publish.php script copied everything from staging to live, there were several unused files that got copied over year after year. For example, one file (check out the URL) turns up again in 2017, and in every year in between. It’s nice to tidy those things up.

Making it easier to develop locally

Removing the CMS also made it much easier to set up a local development environment. When I took over running the website in 2014, I did what I always do: set up a Vagrant box, managed by Puppet, so I have to do as little manual set up as possible.

Part of our setup of the CMS involved some modifications to the staging site, so I set up Puppet to copy over those modifications. These were things like the .htaccess BasicAuth. Removing the CMS meant I could remove a whole Puppet module, and also some manual instructions for setting up the site locally.

It also meant that developing locally didn’t require setting up a local version of the CMS.

What did the new chairs think?

I rushed to get this ready so that this year’s programme chairs could set up the programme for this year. They did this, using the site, and didn’t ask me any questions about it so I assumed it was straightforward enough to use.

At the conference, I managed to grab them to get their feedback, and they hadn’t felt quite as positive about it as I’d hoped. Their initial response was “completely incomprehensible interface”.

However, they managed to use it rather than handwriting HTML, unlike any other chair since 2014, so I still call it a win.

We came up with some clear ideas of how to improve the interface to make it more useful and easier to use. However, one thing worth mentioning is that a lot of the programme chairs’ concerns were that it didn’t allow them to add all the information involved in generating the programme, like the speakers’ availability, their AV requirements, etc. That is by design; the aim of this form is to give chairs an easy way to generate the programme in the style of the rest of the site. The complexity of putting the programme together needs to be done elsewhere because we can’t assume chairs will do it in the same way. Some might use Trello, some spreadsheets, some index cards, etc.

However, I’m definitely going to make some improvements to make it more user-friendly to do the job it’s meant to do. Now it is in my control, rather than using an external tool, I can continue to iterate it to make it better.

Was it worth it?

If this had been part of my job, it would have been very hard to justify doing this work, especially as it took so long. The right solution would probably have been to buy a tool that doesn’t reflect our process, and change our process.

But this is a side project for me, and I enjoy tinkering and improving it, and in doing so it allows us to support our community-led and anonymous submission process.

And it is really satisfying to be able to do things that people like, and get that feedback. One of the programme chairs said the CMS made her feel stressed because she didn’t know what needed changing, but with the new Jekyll site, she could see what everything was and it felt “calming”. That is high praise indeed. When it’s all code, you can grep, but with the CMS you might not be able to find what you are looking for.

And another huge advantage of making the static pages open and accessible to others is that when something on the site broke, in this case the build, it wasn’t just an email to me to do something; someone else could fix it.

Roll on open sourcing the rest of it!

Understanding my strengths using a Johari window

02 June 2018

When leaving my last job I decided to do an exercise which compares what other people think my strengths are with what I think my strengths are. This is a write-up.

Others may know you better than you know yourself

Various studies over the years have found that your colleagues are actually better at identifying some of your strengths than you are yourself. Here’s an interesting article in the Guardian from 13 years ago and a more recent article in the Atlantic.

The Guardian article suggests you try an exercise called a Johari window, and leaving GDS seemed like a good time to try it out.

To do this, you ask some co-workers to choose 5 or 6 adjectives that they think best describe you, from a list of 56. You do the exercise yourself separately, and then you compare your answers with theirs.

Adjectives you select and they also select are strengths that you have self-awareness of. Things they select and you don’t, are traits that are obvious to others and not to you. Adjectives you select and they don’t are things that aren’t as obvious to others as they are to you – maybe because you hide them (e.g. you might try and hide anxiety) or perhaps because you don’t know yourself as well as you think.
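The quadrant sorting is just set operations on the two lists; a sketch with made-up adjective lists, using comm on sorted files:

```shell
# Two adjective lists: what I picked, and what colleagues picked (made-up data)
printf '%s\n' intelligent trustworthy caring | sort > /tmp/self.txt
printf '%s\n' intelligent trustworthy calm   | sort > /tmp/others.txt
echo "Known to self and others:";  comm -12 /tmp/self.txt /tmp/others.txt
echo "Known to others, not self:"; comm -13 /tmp/self.txt /tmp/others.txt
echo "Known to self, not others:"; comm -23 /tmp/self.txt /tmp/others.txt
```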

I set the exercise up using a Google form

I listed the adjectives in the order they are presented in the Guardian article (rather than Wikipedia which has them in alphabetical order). There is one adjective on Wikipedia (empathy) that is not in the Guardian so I added that. The form was anonymous, so I wouldn’t know who had said what.

I then emailed 11 people I’d worked with closely over the past few years and asked them to fill it in, with a link to the Guardian article for more explanation.

Screenshot of the Google form

For my own results, I went through the list deleting until I had the set. It was pretty hard to get it from around 12 to 6 and I had to do a lot of hard thinking (“Am I more this than that?”).

I also found that some of my strengths didn’t seem to be represented. For example, one of my strong points is that I get things done: I’d characterise this as determination or single-mindedness. What adjective in the list would represent that?

I put the results into a Johari window

Of the 11 people I asked, 10 filled it in. Here is a photo of the results. The number in brackets refers to the number of other people (not including me) who selected it.

If you can’t read my handwriting or see the picture, I’ve listed the full results at the end.

Image of the results. Full details and results listed in appendix

The results were positive and interesting

This seems to me to be a good result; of the 6 strengths that I identified, 5 of them were agreed with by others, showing that I have a good amount of self-awareness. Additionally, the ones that I value most highly in myself (intelligence, trustworthiness) were selected by a high number of the others, which is pleasing. The one that I selected and no-one else did was “caring” which I dithered over whether I meant “caring” or “kind” – and 2 people selected “kind”.

There were some interesting results. Three people picked “calm”, which isn’t something I’d use to describe myself (it didn’t make the top 12). But does it really matter? In a work context, if people think I am calm then that’s a good thing, whether or not I consider myself to be calm on the inside.

Some of them also seem very similar. What’s the difference between “independent” (which 4 picked and I did not) and “self-assertive” (which I picked and 2 others also picked)? More interestingly, what’s the difference between “intelligent” (which I and 5 others picked) and “clever” (which no-one picked)? Is the difference there just a factor of “intelligent” being earlier in the list?

I’m not sure of the value of the fourth quadrant, “Not known to self or others” - it might just not be known because it isn’t true. This is referred to as the untapped area, but some of them you wouldn’t necessarily want to tap, like “silly”, or “tense”. And some of them contradict others, e.g. “introvert”/”extravert”, so you’d hope for at least one of those in the fourth quadrant. I think this is where the model doesn’t quite work, but I haven’t read the original paper.

If I did it again I might do a few things differently

The Google form was not very user-friendly. If I was doing it again, I think I would print out pieces of paper and ask people to circle the relevant ones, as that might be a much easier way for them to parse all the information and make their choices. It is a lot to ask of people, so anything that could make it easier would be helpful.

I’d also be tempted to remove some of the options, both to make it easier for the participants, and also because some of them are odd. For example “loving” is not something I would expect colleagues to comment on. “Religious” is odd as well, is that relevant to your personality? Could we remove some of the near duplicates, like “intelligent”/”clever”?

Another possibility was suggested to me by my excellent art therapist and trainer friend, Dr Sarah Haywood: you could just ask people to supply their own adjectives and then put them into a word cloud or similar. That might also have the advantage of drawing out some more negative words, which is also really useful to know.

I’m glad I finally did it

I first saw the article in the newspaper when it came out, over a decade ago, and at the time I did not feel confident enough to ask colleagues to do that for me. That change in itself is interesting – I may be more self-aware than I was then, but I’m also much more comfortable with asking people to do things like this for me.

In all, it was an interesting and worthwhile exercise, and it was nice to see that my self-assessment is reasonably close to what others think of me. If you do try it for yourself, do let me know how you found it!

Appendix: Full results

The number in brackets refers to the number of other people (not me) who selected it.

Known to self and others

Known to others and not known to self

Known to self and not others

Not chosen by anyone

Cleaning up SPA for GDPR

20 May 2018

I think GDPR is great. I am really pleased that companies are now being more thoughtful about how they handle my data. For SPA, a conference that has been running for over 20 years, this coincided with a project I had already started to clean up the user data, and here’s what I did.

We had a lot of dormant users

When I first decided to clean up the user data last summer, there were 1176 registered users, of whom 906 hadn’t logged in in the past 5 years. Some of them hadn’t logged in since the account was migrated from a previous database in 2004.

The data we store about users is mainly the information they enter in their profile: name, email address, and maybe bio, company name, Twitter handle etc. We used to also store address, phone number and fax (!) details if the user entered them, but I removed these fields and all the data as part of retiring the wiki.

We also store proposals users have made, and any feedback or reviews of other proposals, but only for the duration of one conference cycle (i.e. they are deleted every year).

First step: delete a large number of inactive users

As Mags Allen points out in this excellent developers’ guide to GDPR, one of the key points of GDPR is only to store the data you need. So the first step was easy; I deleted everyone who hadn’t logged in since 2010. I left a long gap as, because of the community nature of SPA, people might not submit for a few years and then come back into the fold.

It wasn’t completely straightforward, though; the dates were stored as strings, and the format had been changed at some point from d-M-Y to j F Y (i.e. from 20-Nov-2004 to 20 November 2004), which made identifying the earlier records to delete somewhat complicated. (Something like the below helped.)

SELECT * FROM users WHERE str_to_date(reg_date, '%d-%M-%Y') < '2010-01-01'

I changed the format back, so this will even itself out over time as users either log in or are deleted for being inactive.
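The same conversion between the two formats can be checked outside the database; a sketch assuming GNU date is available:

```shell
# Convert the old d-M-Y style (20-Nov-2004) to the new j F Y style (20 November 2004)
old="20-Nov-2004"
LC_ALL=C date -d "$(printf '%s' "$old" | tr '-' ' ')" '+%d %B %Y'
```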

This left me with 584 users.

We generally only email our users a few times a year, announcing the CFP, asking them if they’d like to get involved in giving feedback or reviewing sessions, and then to tell them about the conference and suggest they come.

The first two are arguably “legitimate interest” because we have a “relevant and appropriate relationship” to the people on the list, but the last one is definitely marketing. In any case, I think it’s best to be explicit.

So in November last year, I added a checkbox to ask for consent to receiving emails. I highlighted the box so that users who hadn’t seen it would have their attention drawn to it.

Checkbox with label text "Can we email you about the conference? We send occasional emails, for example announcing our call for papers each year. Check this box if that's OK"

I also made it easier for new users to register. Previously when signing up for an account, you were asked to fill in a whole load of stuff including a bio. In fact none of this info was needed unless you were a session leader, and it was offputting and a hassle to be asked for it all up front.

I changed this so on first registering you are now only asked for name, email and consent to contact. If you come to edit your profile later you are given the option then to enter the rest of the info. In most cases, you would only edit your profile if your session is accepted to the conference, at which point details such as bio become relevant.

You could see all the registered users

Because SPA is mostly interactive workshops, many sessions have more than one session leader. One person submits, and then adds the people they will be co-leading with.

If they were new to SPA, you added their email address and an account was created for them. But if they weren’t new to SPA, you could select their name from a drop-down list of all registered users.

all users in drop down

Like many aspects of the conference, this dates back to a couple of decades ago, when it was mostly a group of people who already knew each other. Fast-forward to the current situation with over 1,000 users, and being able to enumerate all users is really not great security. It’s also not a great user experience.

I changed this so that you just add the co-leader’s name, and behind the scenes we figure out if they are an existing user or not.

I tested the current list

Once I’d made those changes, in early December last year, I emailed all the remaining users to announce our call for papers, including an unsubscribe link and instructions on how to delete their account. This was the first time we’d emailed the list for a while.

It went to 584 people, had 95 hard bounces (including an entire company!) and 5 unsubscribes which helped clear up the list. (This, among other publicity, resulted in some excellent submissions for the conference this year and it’s a great programme – you should come!)

Four months later, I reviewed who had consented

By April this year, there were 553 users.

77 had said yes to being contacted, 15 had said no, and 461 had said neither yes nor no (i.e. they hadn’t edited their profile).

Of the 77 who had said yes, 67 had registered after the addition of the consent checkbox – so would have seen it on the page they created the account.

Of the 15 who said no, 14 had also registered after the change.

It’s not possible for you to say neither yes nor no on first registration, because not checking it saves it as no. However, it is possible then to log into your account and not see it, because it is on an ‘edit my profile’ page which you may not visit if you’ve logged in for another reason.

In fact, of the 461 who still had a NULL value for the checkbox, 23 had logged in since the change but clearly hadn’t edited their profile. Not a huge number but worth making it more prominent, so I also flagged it up on the first page you see when you log in. Over the next couple of weeks, a few more people changed the setting for that.

It was also nice to note that the majority of new users (88%) had consented to email, which was encouraging about the messaging.

Last chance to sign up

Alongside this I’d been updating our terms and conditions with the help of the other committee members.

On 11th May I sent an email announcing the programme. I sent two versions: one to all those who’d checked yes to emails, and a different one to the remaining people who had selected neither yes nor no, which in addition to the details of the programme also said that it was the last email they’d receive from us unless they opted in.

This had a good result; within 24 hours the number of people consenting to contact had gone up to 92, and a week later it was up to 100.

And we’re done for now

There’s some extra stuff I’d like to do, like make it easier for users to delete their own accounts rather than having to ask one of us to do it, but I’m really happy to have cleaned up the user data so we are only storing the minimum information about people, and are only contacting those who are actually engaged with the conference.

And it’s a great conference, you should come!

How coding in the open can help you release faster

29 April 2018

A few weeks ago I talked about how coding in the open can help you release faster at PipelineConf. This was my last talk as Open Source Lead at the Government Digital Service. Here is the video, slides, links, and a bonus sketchnote!

Video (30 minutes)



For a great summary, here is an excellent sketchnote of it by Sarah Jones.

Sarah Jones

I love to see sketchnotes of my talks, especially when they are as clear and nicely designed as this one. Thank you, Sarah!

Finally, I mention a lot of resources in my talk. Most of them can be found in this list of useful open code resources. Those specific to this talk are below.

SPA Software in Practice conference – you should totally come!

Examples of how coding in the open helps government

When ‘build a thing’ really works – blog post about using smart answers code.

How making our deployment code open improved our workflow

Point 8 of the Government Service Standard: Make all new source code open

Addressing challenges of coding in the open

GDS pull request guidance

Blog post: Easing the process of pull request reviews

John Allspaw’s blog post about the importance of being able to recover quickly: MTTR is more important than MTBF (for most types of F)

Companies that open code, even when it’s their secret sauce

Feedbin is open source

GitLab uses LibGit2

Open code is the foundation for a culture of openness

GOV.UK roadmap

Trello for the publishing platform team

GOV.UK architectural decisions

GOV.UK incident reports

Deploying to shared hosting with Travis

13 March 2018

The SPA conference website deploys to production automatically on merge to master. It is on shared hosting, and setting this up with Travis was a bit tricky, so I’m blogging how I did it in case it’s useful for others.

The documentation on deploying to shared hosting is not very clear

Configuring Travis to deploy automatically on merge to master is pretty straightforward if you’re using a supported provider. However, if you’re using shared hosting, the documentation is not amazing.

I figured it out eventually by juggling between a blog post (this one is useful for explanations of how Travis works but less so for step-by-step), the actual config that person uses, another blog post (useful for some of the steps), the Travis docs on encrypting files, custom deployment and the build lifecycle, and yet another blog post (very useful for the actual config, but for a different set-up than I wanted).

That juggling made me think my addition to the genre was worth it.

The steps I took

  1. Follow Travis’s get started instructions to sign in to Travis and add the repository you want to deploy.
  2. Create a .travis.yml file. The initial one can just have the language and how to build the site.

     language: ruby
     rvm:
       - 2.3.3
     script: scripts/
     branches:
       only:
         - master

    You can tell Travis how to build the site inline, or call out to a build script as I have done.

     #!/bin/sh
     set -e
     bundle exec jekyll build

    The build script has to be executable (chmod +x).

  3. Install travis locally (gem install travis), and, using the command line tool, encrypt any variables that you will use for deploy. For example, if you use FTP, you will want to encrypt at least the password.
     travis encrypt SOMEVAR="secretvalue" --add

    The --add flag in this command adds the required lines to your Travis file.

  4. Write a deploy script: a script that describes the steps you take to deploy to shared hosting. For example, if you use FTP, then write the command that you would use to FTP the built site to the server. In place of the secret values, use the variables you have encrypted (e.g. $SOMEVAR).

    I use Rsync to deploy, and have encrypted the username and deploy host. You can see the deploy script on GitHub.

    This also needs to be executable.

  5. Create a dedicated SSH key (no passphrase) for deploying. This makes it easy to identify and revoke if necessary.

     ssh-keygen -t rsa -b 4096 -C '[email protected]' -f ./deploy_rsa
  6. Log in to command line Travis (travis login) and get Travis to encrypt the private key file. It prints a helpful output reminding you to only commit the .enc version NOT the deploy_rsa itself.

     travis encrypt-file deploy_rsa --add

    Again, the --add flag automatically adds the required lines to your Travis file.

    Commit the changes to .travis.yml and the deploy_rsa.enc.

  7. Copy the public key to the remote host.

     ssh-copy-id -i deploy_rsa.pub <ssh-user>@<deploy-host>
  8. Delete the public key and the private key as they are no longer needed; you only need the encrypted key.
     rm -f deploy_rsa deploy_rsa.pub
  9. In the before_install, change the out location for the decrypted private key to /tmp/ so it doesn’t end up in any of the publicly accessible directories, and restrict its permissions (ssh will refuse a private key that is readable by others).

     - openssl [snip...] -out /tmp/deploy_rsa -d
     - chmod 600 /tmp/deploy_rsa
  10. In the same section, start the ssh agent and add the key to it.

     - eval "$(ssh-agent -s)"
     - ssh-add /tmp/deploy_rsa
  11. Get .travis.yml to call the deploy script.

    This is an example of how the Travis docs could be clearer. You don’t want to deploy if the build script fails, so you might be tempted to call the deploy script in an after_success step; but after_success runs after every successful build. In our case, we only want to deploy on merges to master, not on any successful build. For example, if we’ve just opened a PR and it builds successfully, we don’t want that branch deployed to production.

    So what we actually want is custom deployment.

     deploy:
       provider: script
       script: scripts/
       skip_cleanup: true
       on:
         branch: master

    skip_cleanup is required because otherwise Travis resets the working directory, so you lose the artefacts from the build (i.e. exactly what you want to deploy!)

What my config looks like

The full .travis.yml we’ve now created looks like this:

language: ruby
rvm:
  - 2.3.3
script: scripts/
branches:
  only:
    - master
env:
  global:
    - secure: qvSoY270qAXOtmWdRio9vvhLEf5HHdyzMS39yS4yZw74[snip for length]
    - secure: Hr7FV7lHFEblYfn7EYM/4qV3qV8zdHLebXzNyRvP8L/U[snip for length]
before_install:
  - openssl aes-256-cbc -K $encrypted_ed2cb1b127e1_key -iv $encrypted_ed2cb1b127e1_iv
    -in deploy_rsa.enc -out /tmp/deploy_rsa -d
  - chmod 600 /tmp/deploy_rsa
  - eval "$(ssh-agent -s)"
  - ssh-add /tmp/deploy_rsa
deploy:
  provider: script
  script: scripts/
  skip_cleanup: true
  on:
    branch: master

Further information

This (allegedly deprecated but still working) travis.yml linter is useful.

Have a look at the SPA website Travis file and the builds if you’d like to know more about how it works for us.

Resources for coding in the open

02 March 2018

One of my main aims as Open Source Lead at the Government Digital Service was to make sure that there were good resources in place to help people code in the open. Many of these didn’t exist when I started the role so I created them. I’ve collected useful resources here.

The benefits of coding in the open

I was often asked what the value of coding in the open is to the teams themselves, those who are opening the code. There is a lot of value. I’ve shared that in various formats:

How to make your code open

Security when coding in the open

From the security perspective, it’s also worth knowing that while GCHQ don’t code in the open, they have released quite a bit of open source code.

GDS open code

Coding in the open across government

Let me know if there’s anything else

I hope you find this list useful. I’ll update it as things change so let me know if there’s anything you’d like to see here.

How to open up closed code

19 February 2018

Every digital service designed within government has to meet the Digital Service Standard. One of the requirements of the standard is that new source code should be made open and published under an open source licence.

There are a few situations where it’s acceptable to keep code closed but in most cases it will need to be open.

The easiest way to make code public is to write the code in the open from the beginning, and I’ve previously blogged about the benefits of doing this.

Your team may have old closed code that it needs to open. If there is a lot of closed code, this can be challenging. Here are 3 ways to open it up.

Option 1: Cycle all credentials then open it

The commit history is a very useful part of a codebase as it explains the reasoning behind changes. If your team wants to maintain the commit history a good approach is to make sure your code contains no secrets and then make it public.

GOV.UK’s infrastructure team did this when it opened the GOV.UK Puppet code. The team reviewed the code closely and made sure that any credentials mentioned in the code were no longer in use. They were then able to open up the code. You can read the infrastructure team’s blog post about it to learn more about what tools and techniques they used to make sure the code was safe to open.

The advantage of this approach is that you maintain the full commit history.

Option 2: Rewrite history

If your team feels that the commit history is not suitable for publication, there is an alternative. After reviewing the code to make sure it’s safe to open, you can either rewrite history to remove or improve some commits, or squash all previous commits into one.

The MOJ’s Prison Visit Booking team took this approach when it moved from coding privately to coding in the open. The team opened the existing code in a snapshot commit (this was the first commit) and then carried on coding in the open.

The advantage of this approach is that you don’t have to touch your infrastructure to do things like change a password or other details. However, the disadvantage is that you lose useful commit history, which can provide context for people working on that codebase.

Option 3: Move code to a new repo as you use it

Another way to open closed code is to create a new repository and move code. GOV.UK used this method when improving its deployment workflow. This helped the GOV.UK team to spread out the work over a longer time and share the workload out between a larger group of people.

The advantage of this approach is that you can do the work in smaller chunks. And, all the code is examined as you make it open. The disadvantage is that it can take a long time. You’ll also need to make sure your team finishes the job, otherwise you’ll end up maintaining two repos alongside each other.

After the code is open, work on the open code

You may be tempted to open code in a snapshot, continue to work in private, and regularly release code to the open version. This is a bad idea because:

I hope these examples will help you find the method that will work best for your team, so you can enjoy all the benefits of coding in the open.

This post originally appeared on the GDS technology blog.

A year in the life of an Open Source Lead

08 December 2017

I have been the Open Source Lead at GDS for a year now. Here are some of the things I’ve achieved and learned.

Setting and implementing strategy

Open source in government covers a huge range of activities, including open sourcing our own code, using open source software and contributing to open source software. My first task was to define the strategy, work out what areas would have the most impact and then prioritise within those areas. I wrote about the strategy, and the areas I prioritised, on the GDS blog.

I had to be very focused to stick to these priorities, as lots of people had many interesting ideas for other things I could do. Most of these ideas were excellent suggestions that were just not as high a priority, though I was able to do some of this work as well without getting diverted.

Driving coding in the open

The UK government has committed to making all new source code open.

Many teams are doing this really well (you can see a lot of government code on the GitHub government page). However, some teams find it harder to meet this commitment, so one of my main priorities was helping those teams to overcome the barriers.

The first step to coding in the open is very small; you don’t need to jump in by opening all the code that exists in your organisation. The first thing can be just creating a repository on GitHub with a licence and a README. The step itself is easy; what is difficult is making the decision to take that step. There are many barriers, both organisational and emotional.

This year I’ve approached that from three main angles.

Firstly, I’ve shown the value of coding in the open; why it’s worth taking that step. I’ve had many one-to-one chats and given presentations to organisations, and to reach a wider audience I’ve given two conference talks, at GOTO Berlin and Turing Fest.

Most importantly, I’ve blogged about the benefits on the GDS blog; I’ve had feedback that this has been of huge help to open-source advocates across both government and industry.

Secondly, I’ve addressed the reservations teams have around the practicalities of taking that step. The first question most people have is whether coding in the open is secure. It is. To make that clear, and to share the safeguards you need, I’ve published guidance about when code should be open or closed and security considerations when coding in the open. This not only helps people with the details, it also demonstrates our commitment to the policy.

Thirdly, I’ve addressed some of the other practical details. For example, here is some guidance I wrote about how we license code at GDS.

Opening existing government code

As well as helping teams code in the open, I’ve also supported teams opening existing code that had been closed. Alongside lots of smaller services, I’m really proud to have supported the opening of three high-profile GDS projects this year: GOV.UK Verify, GOV.UK Pay and the Register to Vote frontend.

The Open Source Lead role is about strategic influence, and the versatility required to persuade and support such varied teams, working with everyone from developers to very senior executives, has been one of the most interesting parts of the role.

My work here has ranged from writing papers for completely non-technical audiences explaining why (and how) the code should be opened, through hands-on help with the code, to influencing senior stakeholders to push for quicker progress from their angle.

Building a community

One of the main ways to find out about useful code you can learn from and reuse is by hearing people talk about it. It’s also really useful to have people you can discuss shared problems with, so one of the most important things to do was to build the kind of environment where these conversations can happen.

To facilitate this I organised a series of cross-government open source meetups. They were very well attended, each with around 100 participants from 20+ government organisations, and they received 80% Net Promoter Scores. Khidr Suleman did a good write-up of the second meetup.

More important than the events themselves was the community I built around this work. We have a cross-government Slack organisation and there is an #open-code channel. When I started this role a year ago, this channel was very quiet, and I was involved in most of the conversations. Now I often log into Slack and see that several questions have been asked and answered by other people, and discussions have taken place that haven’t needed my input. This is really heartening to see, as it means the community is supporting each other without my active involvement being required on a daily basis.

Learning and sharing how to write business cases

One thing I had to do a lot more of this year was write business cases, and I wrote some guidelines on how to do it. I was really pleased when the head of GDS’s delivery management community sent the post round to his team, to show why writing business cases need not be laborious or confusing.

The most interesting piece of work that I wrote a business case for was a discovery into how we could promote reuse of open government code. One addendum is that I followed my own guidelines to write the business case, but although it was well-received (the report I got was that it was “super”) that wasn’t enough to get it approved; I had to keep pushing, and eventually rewrote it as a one-pager, which got a very quick result. In future I’ll try and keep such pitches to one page.

Keeping my eyes on the prize(s)

Another interesting aspect of roles at this level is that things can take a very long time to come to fruition. I had to demonstrate single-mindedness to see some of them through.

For example, the security guidance I published had a longer than expected road to publication, firstly because of the unexpected general election and then because of a new process around guidance, and it involved liaising with a wide range of stakeholders including NCSC and other government departments. I started work on it in January and wasn’t able to publish it until September; but it was worth waiting for: the blog post I wrote announcing it made it to #2 on Hacker News and had 11,000 views in the first 24 hours. More importantly, lots of people have given me feedback that it really helped them with opening their code.

Some other things I’d like to mention

Alongside these larger themes I did some smaller things I’m pleased with.

I worked with several departments to review and assist with their open source policies. One of the things I’m really proud of is that, based on my feedback, HMCTS changed their open source policy from “closed by default” to “open by default” and their technical architect later gave an interesting talk about how that cultural change had happened at our meetup.

I arranged a workshop on contributing to humanitarian open source software; two core committers from OpenMRS gave us an introduction to the project and the code, and then we worked on some tickets. We all managed to contribute some code to production during the workshop, which was great.

I’ve supported and encouraged several others to write blog posts about their open source work. Many have been published on the GDS technology blog as well as an excellent one from the Ministry of Justice.

I’ve also talked to a lot of my counterparts in other governments, and I even found myself writing some lines to send back to an MP responding to a constituent’s inquiry about open source.

I also did a lot of other work moving things along, but can’t yet report the results. That’s one of the interesting things about this level of work, the achievements are larger but the intervening steps tend to be things that are not interesting to talk about (“I had a chat with this person and now have an idea of what to say to that person to convince them that this other course of action might be a good bet…”).

Making open source in government self-sustaining

My aim in any job I do is to ultimately replace myself with a small shell script. For example, a major success this year was when my colleague Jenny Duckett was able to help a team through the whole process of opening their code, using guidance I’d written and with almost no input from me.

As Open Source Lead I’ve received a lot of very similar questions on a lot of very similar topics, and much of the work I’ve done this year has been to stop people having to ask those questions by publishing the answers and then publicising them. So my aim in this role is perhaps not to replace myself with a shell script, but rather with a series of blog posts, guidance, videos, and a shared culture and community. And from the evidence so far, I’ve made significant inroads into that!

Coding in the open in government: Turing Fest talk

30 November 2017

In August I gave a talk about coding in the open in government at Turing Fest. The video has just been published and you can watch it on their site (you can skip the email request).

The slides are here:

If you’d like more detail on any of the things I reference, links are below:

GDS code

Scottish government code

Some other central government code

Government blog posts that I discussed

Building a platform to host digital services

Coding in the open makes better code

When ‘build a thing’ really works

Digital service standard

Make all new source code open

GOV.UK coding styleguides

Commit message guidance

Pull request guidance and blog post

Easing the process of pull request reviews

Be able to recover quickly

MTTR is more important than MTBF (for most types of F)

Feedbin is open source

GOV.UK roadmap, Trello and incident reports

GOV.UK roadmap

GOV.UK publishing platform Trello

GOV.UK incident reports

Sign up to my new mailing list!

05 November 2017

I usually publicise new posts on Twitter, though I also have an Atom feed. However, Twitter relies on people seeing it at the right time, so I’ve set up a mailing list for new blog posts (and possibly very occasional announcements).

Skip the detail and just sign up

Email Format

Why I’ve set up a mailing list

Only publicising posts via Twitter relies on you seeing a tweet to know there’s a new post. Twitter is also problematic, and it seems like a good idea not to rely on it as my only way of sharing my writing.

Why you should sign up to my mailing list

I blog about a bunch of interesting topics, from open source in government through how to write a good pull request to tips on conference speaking, via more than you ever wanted to know about redirecting URLs, so signing up for the mailing list means you won’t miss out on any of that good stuff.

How I set it up

MailChimp currently have this market sewn up; I checked every newsletter I’m subscribed to (I highly recommend Sandi Metz’s Chainline and Benedict Evans’s newsletter, by the way) and they are all powered by MailChimp. The plan is free for the first 2,000 subscribers; after that it becomes relatively pricey and I don’t have a business model for this blog. However, you can extract your mailing list and I figured JFDI; it will take some time to get that many subscribers!

However, I had a lot of issues setting it up, and after the amount of time it took me to create the email template I somewhat wished I’d gone a different route. So if anyone has any suggestions of better mailing list software then do let me know.

A few issues with MailChimp

  1. The unsubscribe link in the email takes you to a form that requires you to write the email address you want to remove. This is an annoying pattern (we know what it is! You just followed a link from the very email we sent!) and was nearly a deal-breaker for me, but time is limited and I will work out how to migrate away/fix that later.

  2. There were some hidden rules about what email addresses are allowed for the mailing list, for example [email protected] wasn’t allowed. I don’t have a catchall for my domain, so I had to set up a new forwarding address each time without a hint as to which other addresses wouldn’t be allowed.

  3. I found the experience of setting up the mailing campaign frustrating; drag-and-drop editing the design of an HTML email must be challenging even if you’re not by default a Vim user. There is also a limit to how many test emails you can send which I didn’t find out about until I’d hit it (it’s 12 for free accounts) and there doesn’t seem to be any way to test a plain-text version of the HTML email — so if you do sign up to receive emails in plain text and it looks rubbish, please do let me know.

It’s also worth noting that they’ve changed their default sign-up to not request confirmation.

Sending blog posts automatically is easy

Having said that, it is free, and it offers some great advantages. It’s very easy to manage subscribers and unsubscribers.

The automation of emailing when a new blog is published via your RSS (or Atom) feed is great and straightforward to set up, it was just designing the email that was taxing. And I’m sure that there are a lot of other advantages to MailChimp that I’ll see when I start using it a bit more.

However, do let me know if you have other good solutions

There are some free/open source tools which I didn’t have time to look into, and if anyone has any recommendations I’d love to hear them. I think my requirements are:

Also, if you know any MailChimp solutions to my problems above, please let me know.

And do sign up!

An awk command I always forget

13 October 2017

There’s a task I have to do every now and again for which awk is the best tool, but it’s infrequent enough that I always have to remind myself how, usually by referring back to some shell scripts we wrote 5 years ago. So I thought I’d post it here instead.

Tell me how many different kinds there are

Given a CSV of people from different government organisations, tell me how many organisations are represented:

awk -F "," '{ print $4 }' output.csv | sort | uniq -c | wc -l

Instead of wc -l, I usually pipe to a file so I can manually edit out duplicates like MOJ/MoJ/Ministry of Justice, which is straightforward once the values are extracted and sorted.
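To make the pipeline concrete, here’s a hypothetical four-column CSV (names and organisations invented for illustration; the organisation is in column 4, as in the command above):

```shell
# Hypothetical sample data: name, role, email, organisation in column 4
cat > output.csv <<'EOF'
Ada,Speaker,ada@example.com,GDS
Grace,Reviewer,grace@example.com,MoJ
Alan,Speaker,alan@example.com,GDS
EOF

# Extract column 4, count occurrences of each value, then count the lines
awk -F "," '{ print $4 }' output.csv | sort | uniq -c | wc -l
```

Here uniq -c produces one line per distinct organisation (with its count), so wc -l reports 2: GDS and MoJ.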

While I’m here making notes of things I forget

find . -iname '*utput*'

Removing MediaWiki from SPA: Cool URIs don't change

01 October 2017

I run the website for SPA conference. This conference has been running for more than 20 years, and I’ve been the web admin since 2014. This is about one step in updating a 20-year-old legacy PHP site into something more modern: removing the integrated wiki. The story is in two parts and this part is about how I made sure links to the old wiki still worked. It involves many, many things I learned about .htaccess.

To read the other part of the story, about the actual changes I made to the site, see my previous post: Removing MediaWiki from SPA: changes to the site.

Getting rid of the wiki

The integrated wiki was mainly used for session outputs and user pages. I’ve described in my previous post how these are now handled on the main site.

When I’d made those changes, there was no longer any need for the wiki. However, there are lots of links to it from previous years, both of outputs and user pages, and Cool URIs don’t change.

By far the most time-consuming part of this work was redirecting old links to the new site. I thought it might be less work than my previous huge redirection of URLs when we launched GOV.UK, but it was close.

Saving the wiki as a static site

What I wanted to do was save the pages that already existed on the wiki as a static site. So the thing to do was spider the pages and save that. This is something I’m familiar with from /dev/fort, where we don’t have internet connectivity so we need to take the internet with us.

For a /dev/fort we use some Chef cookbooks but in this case, I only needed one command:

wget -m -k -p -nH -np \

man wget if you want to know what the arguments are, but to save you a bit of time if you’re only mildly interested:

-m = mirror
-k = convert links
-p = download all page requisites
-nH = Disable generation of host-prefixed directories
-np = no parent; i.e. do not ascend to the parent directory
when retrieving recursively.

So then all I had to do was put the output of the spider into a directory called mediawiki, retire everything that was in the old mediawiki directory and job done, right? Well, not quite.

The URLs give no information about content

The first issue was that the web server didn’t know what to do with the files. Most web servers guess at the content type based on the file extension. The files in my static site have names like index.php?title=Main_Page. Without a file extension, browsers will assume the file is binary.

Dialog box indicating server thinks files are binary

To solve this, I created a new .htaccess file in the mediawiki directory. I wanted to confine the strange config needed for serving the retired wiki to that directory itself, rather than messing with the already somewhat complex .htaccess for the whole site.

I also took this opportunity to move the new MediaWiki folder into a public repo on GitHub. The rest of the SPA code is currently private (which is a story for another day), but there is no reason for this folder to be, and there is no reason for it to be part of the main SPA repo.

The first change to the .htaccess was to use Apache’s FilesMatch directive to tell the server that pages where the URL contains .php? should be returned as HTML.

<FilesMatch "\.php\?">
  Header set Content-Type "text/html"
</FilesMatch>

? has a reserved meaning

The next issue was that all the saved MediaWiki files have names like index.php?title=Main_Page. But ? has a reserved meaning in the browser: it indicates a query string. So the browser will not look for a file with a ? in its name; it will look for what to do with a path of index.php and a query string of title=Main_Page.

So I needed to use Apache mod_rewrite to rewrite URLs so that index.php? would be replaced with index.php%3F (the URL-encoded form of ?). Then, instead of trying to interpret the query string, Apache would look for the literal file.

This special requirement took me an extremely long time to work out how to do.

More detail of what it all means in the commit message but the answer was:

RewriteCond %{QUERY_STRING} (.+)
RewriteRule ^(index\.php)$ $1\%3F%1 [L]
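To spell out what these two lines do (my annotation; the commit message has more detail):

```apache
# Request:  /mediawiki/index.php?title=Main_Page
# RewriteCond captures the query string:  %1 = title=Main_Page
# RewriteRule captures the path:          $1 = index.php
# [L] stops processing further rewrite rules for this request.
# Result: Apache serves the literal file  index.php%3Ftitle=Main_Page
```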

Spending ages on CSS and JavaScript

I’d handled the PHP pages in the two stages described above: first making the browser recognise the files as HTML, and then rewriting to the correct filename. So I tried to do the same with the JavaScript and CSS. I initially added a FilesMatch directive for the CSS files, but it didn’t have the desired effect. I then tried the JavaScript, with the same result: the content-type kept coming back as text/html.

Image showing JS and CSS as `text/html`

This also took me a long time to work out, but I eventually realised that this was because the rewrite wasn’t in place – after all, the filename didn’t contain index.php so it wasn’t getting rewritten. The text/html was coming from Apache’s 300 (Multiple Choices) response, which was, of course, HTML.

Image showing the response as a 300 and text/html

When I did both the redirect and the FilesMatch at the same time it had the desired effect.
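A sketch of what the combined fix looks like. The patterns here are illustrative assumptions, not the exact rules from the site’s .htaccess (in particular, the "gen=css" pattern is a guess at how the saved filenames distinguish CSS from HTML):

```apache
# First rewrite the request to the literal filename wget saved
# (query string folded into the name, with ? encoded as %3F)...
RewriteCond %{QUERY_STRING} (.+)
RewriteRule ^(index\.php)$ $1\%3F%1 [L]

# ...then override the content type for the saved stylesheet files.
<FilesMatch "gen=css">
  Header set Content-Type "text/css"
</FilesMatch>
```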

There was an extra JavaScript file with a URL that would have required a whole new level of regexing – /index.php?title=-&action=raw&gen=js&useskin=monobook&270 – but the lack of this file doesn’t appear to have any observable effect apart from a console error so I have left it.

I could also probably tidy up the regexes so there isn’t one rewrite rule for each different file, but there doesn’t seem to be much point – it would only make it slightly harder to understand and introduce the possibility for error. It is not easy to understand .htaccess files as it is. If I planned for this one to get very long then it might be worth doing, but in this case it’s a finite piece of work: I don’t intend to be updating this directory once this move is finished.

The spider hadn’t grabbed everything

It was at this point that I discovered that wget had not grabbed all the files. Earlier this year I had discovered that redirects had not been set up for one of the previous wikis (there were at least two before MediaWiki, to my knowledge). I had put a lot of redirects in place, and that meant I had a lot of test data that I could use to check links were working. But many were not.

I realised that wget had not downloaded all of the files because some were not linked to from within the wiki, so wget didn’t know about them. This was mainly user files, because the wiki didn’t know about all users; it was only the people.php script, which generated the ‘card index’, that knew about all users (more on that later).

Image showing the 'card index' list of all users

I could have crawled that page, but the point of retiring this content is not so much to back up all the data that was ever on the wiki (especially as many of the user pages were blank, so were just the person’s name and the wiki chrome); rather, it is to preserve old links. So the user pages that I needed were the ones linked to from previous programmes.

So I put the wiki back in place and crawled it again, this time starting from each of the programme pages between 2011 and 2016 (the years where MediaWiki was in operation and the programme linked out to users).

The command was almost the same as the one above, except without the -np (no parent) flag. In this case, we did want the spider to go up to the parent directory so that it could then go across to the mediawiki directory to get the users.

This did mean that the spider crawled a lot more than we needed, because it was grabbing everything linked to. However, it’s possible to put all the starting URLs in one command, in which case wget doesn’t download the same resource twice. Another way to avoid duplication is to run wget in the root of the site locally – it builds the entire folder structure from the root regardless of where you start – but I was doing it into a separate directory. I also used Amphetamine to stop my computer going to sleep, and directed STDERR to a file so I could check afterwards for 500s, 404s, etc.

wget -m -k -p -nH \
  [...etc] 2> file.txt

I then copied the new output over the existing mediawiki directory and committed only the new files. There was no need to commit any changed files because the changes were just things like timestamps.

Redirecting all user pages

At this point, the static site contained all the pages from the wiki, and I just needed to create the redirects. Links from previous conferences to users are of the form scripts/people.php?username={$USER}, which redirected to the person’s wiki page, i.e. /mediawiki/index.php?title=User:{$username}. So, in order to be able to remove the people.php script, I just needed to put the redirect in the .htaccess:

RewriteCond %{QUERY_STRING} ^username\=(.+)$
RewriteRule ^scripts/people.php/?$ /mediawiki/index.php?title=User:%1 [R,L]
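So, for example (borrowing a username that appears later in this post):

```apache
# Request:  /scripts/people.php?username=MarinaHaase
# RewriteCond captures:  %1 = MarinaHaase
# [R,L]: externally redirect (302 by default) to
#        /mediawiki/index.php?title=User:MarinaHaase
```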

I then needed to make sure all the redirects worked.

I also needed to deal with the case where the page wasn’t there.

Dealing with missing pages

On the wiki, where there wasn’t a user you would get:

Wiki page for a non-existent user

Not ideal, but now that there is no MediaWiki framework around it, a missing page was even worse:

Image showing Apache's unstyled 300 response page listing multiple possible file matches

If there’s no page, I definitely don’t want that 300 result, I want a 404. But it took me a long time to figure out what the issue was. I went down a rabbithole about content negotiation and turning off MultiView without success, and then eventually by chance discovered that the second part of the error message (“However, we found documents with names similar to the one you requested”) also appears when Apache is attempting to use the hilariously named mod_speling.

CheckSpelling is useful for the main site because it allows URLs with minor misspellings or case differences to go to the right place.

However, for the wiki site we didn’t want it, so I turned it off for the mediawiki directory only.

CheckSpelling Off

This now returns a 404 if the exact URL can’t be found.

Dealing with pages that had always been missing

There was some extra MediaWiki chrome to deal with. For example, all the pages have an IP address at the top which is apparently a link to a user page for the IP address the user is editing from. However, when I ran the wget the site was behind CloudFlare, so in fact the IP addresses here are CloudFlare IPs. In any case, the user pages had nothing in them, so a 404 seems appropriate.

Login/create account on any page took you to the MySPA login page and then back to the page of the wiki you were on, via a redirect URL and some clever code that linked the wiki to the main conference site. Since it’s no longer possible to log into the wiki (as it is no longer a wiki), I also made this a 404.

I also created a 404 page. Previously, it had been the unstyled Apache 404; not a good look.

Linking to new user pages

Turning off CheckSpelling avoided lots of 300 errors, but it also meant that a lot of pages 404ed because of their URLs. On the wiki, several of the links had extra query parameters. A really common one was &useskin=bio, because it turned out the PHP pages that created the ‘card index’ added this parameter.

The people pages were linked up to the wiki via scripts/people.php. The people script passed that call through to a Smarty template, code below:

<title>SPA Conference - People</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  <frameset cols="184,*" border=0>
    <frame name="index" src="/people/index.html" marginwidth=0>
    <frame name="person" src="/mediawiki/index.php?title=User:{$username}&useskin=bio" marginwidth=0>
    <body bgcolor="#FFFFFF" text="#000000">
        We're sorry - the SPA people page requires a browser that supports frames.<br>
        <a href="index.html">Click here</a> to return to the conference site.

The template created a page made up of two frames, one being the ‘card index’, which was actually just a list of all users, and the other frame loading the user’s page from the wiki. In the example above, the frame would display /mediawiki/index.php?title=User:MarinaHaase&useskin=bio.

(Either that, or it said “We’re sorry - the SPA people page requires a browser that supports frames.”)

An interesting result of that was that if you went to the wiki via a link like the one above and then chose someone else from the ‘card index’, the URL would remain the same.

However, it turned out that none of the saved files have &useskin=bio in their names, so this would lead to a 404 even when the page was present.

I solved this by adding a redirect to ignore all second query parameters.

RewriteCond %{QUERY_STRING} (.+)&(.+)
RewriteRule ^(index\.php)$ $1\%3F%1 [L]

This rule looks for all query strings that have two parameters and then discards the second one. (We still need the first one because it’s the user’s name, e.g. ?title=User:MarinaHaase.)
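For example (using a username that appears elsewhere in this post):

```apache
# Request:  /mediawiki/index.php?title=User:MarinaHaase&useskin=bio
# QUERY_STRING matches (.+)&(.+):  %1 = title=User:MarinaHaase
# Result: Apache serves the saved file index.php%3Ftitle=User:MarinaHaase
```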

Removing all the people files

With all these redirects in place, I could now remove all the PHP files that connected to the wiki, e.g. the Smarty template shown above.

I could also remove all the MediaWiki tables from the database – there were 49 MediaWiki tables in total.

This was extremely satisfying.

That wasn’t the end of it

Before MediaWiki was the wiki, there were (at least) two other wikis for the SPA site, and when they were replaced, there were redirects to the new site in the .htaccess. However, they just redirected to /mediawiki/, relying on index.php to do the right thing. For example:

RewriteRule ^cgi-bin/$ /mediawiki/ [L]

However, there isn’t an index.php any more, just a bunch of flat HTML files.

I updated all those redirects to the main wiki page, for example:

RewriteRule ^scripts/wiki$ /mediawiki/index.php?title=Main_Page [R,L]

I needed to add the [R] flag to redirect the URL. Without a redirect, the page doesn’t get the CSS or other assets as they are referenced using relative paths.
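The difference matters because of how the browser resolves relative paths. A sketch, with the behaviour noted in comments (the "skins/main.css" filename is just an illustrative assumption):

```apache
# Without [R], Apache rewrites internally: the browser still shows
# /scripts/wiki, so a relative link like "skins/main.css" resolves to
# /scripts/skins/main.css and 404s.
# With [R], the browser is sent (302 by default) to the new URL, so the
# same relative link resolves under /mediawiki/ and is found.
RewriteRule ^scripts/wiki$ /mediawiki/index.php?title=Main_Page [R,L]
```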

How do you turn a 403 into a 404?

These changes meant that there were a few more pages where Apache returned a 403 Forbidden. In particular, /mediawiki/ and /people/ both no longer have an index page to do the right thing, which resulted in a 403: ‘You don’t have permission to access /mediawiki/ on this server.’

Firstly, this was again an unstyled Apache response page, and secondly, it makes it sound more interesting than it is. It’s not that there’s something cool there you can’t see, it’s just “hey everyone! We use Apache!”.

My initial plan was to redirect the 403 to a 404 (This is an acceptable way to handle a 403, according to RFC2616.) However, I could not work out how. One of the problems with Apache is that it’s very hard to Google for help, as it’s been in use for a very long time, so the internet is full of people asking questions and giving answers, many of whom do not have a clue. Much Googling and RTFMing led me nowhere. In desperation I even tried a suggestion to replace the 403 error document with a 404, which led to the worst of both worlds:

Page showing unstyled 403 *and* 404

As an interim measure, I just returned the 404 page for 403 responses, which is suboptimal but fine from most users’ perspective.

ErrorDocument 403 /404.php

What are users trying to do?

When users are trying to access /people/, they are probably trying to find out whether there is a list of all people there. So my first plan was for people.html to contain some text explaining what had happened. Here it is, unstyled (so with no SPA conference chrome around it).

However, even styled, this would have been no use at all.

Page showing some not very useful text about the former existence of the people page

In addition, because the disabling of CheckSpelling is in the /mediawiki/ directory, not at the top level, /people is a 300, and the option it offers, people.html, is a 404.

Image of `/people/` directory being a 300

In summary, this was a terrible idea.

So I reconsidered the 403 to 404 approach and decided to redirect both /mediawiki/ and /people/ to the wiki home page. In the case of /people/ this is not really what users want either, but there is no longer a list of all SPA people, and this gives them something.

This was straightforward for /people/.

RewriteCond %{QUERY_STRING} ^$
RewriteRule ^people/?(\.html)?$ /mediawiki/index.php?title=Main_Page [R,L]
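Spelt out (my annotation):

```apache
# Only apply when there is no query string at all:
RewriteCond %{QUERY_STRING} ^$
# Matches /people, /people/ and /people.html, and sends the browser
# ([R], 302 by default) to the wiki home page:
RewriteRule ^people/?(\.html)?$ /mediawiki/index.php?title=Main_Page [R,L]
```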

In the case of /mediawiki it was slightly more interesting. /mediawiki to /mediawiki/ was already a 301 because of CheckSpelling, and if I put an index.html in the mediawiki directory then it would redirect to that, but I couldn’t use a ^$ redirect. I eventually figured out that this was because there is actually a directory there.

I found that if I put in a Redirect rather than a RewriteRule this worked, e.g.:

Redirect 302 /mediawiki /mediawiki/index.html

(index.html doesn’t need to exist for this redirect to work)

However, I then realised that this work shouldn’t be done in that directory. In fact, it’s a feature of the mediawiki folder that it doesn’t have an index.html and that its homepage is the oddly titled index.php?title=Main_Page. So I deleted the Redirect and did the work in the mediawiki .htaccess instead:

RewriteCond %{QUERY_STRING} ^$
RewriteRule ^$ /mediawiki/index.php?title=Main_Page [R,L]

This solves the 403 to 404 issue. But I’m still interested if you can tell me how to redirect 403s to 404s correctly in Apache.

And that is the end of it, I think…

And I think we’re done. MediaWiki has been removed, but all the old outputs and information are preserved at the same URLs. Please do let me know if you spot any issues.

This was a lot of work but it was really fun and I learned far more about .htaccess than I’d ever intended to. I’m happy it’s moved me one step closer to bringing the SPA site up to date.

Removing MediaWiki from SPA: Changes to the site

01 October 2017

I run the website for SPA conference. This conference has been running for more than 20 years, and I’ve been the web admin since 2014. This is about one step in updating a 20-year-old legacy PHP site into something more modern: removing the integrated wiki. The story is in two parts, and this part is about why I removed the integrated wiki and the changes I made to the function of the site to accommodate that.

The second part is about how I made sure the wiki pages were still available.

The site included a ‘discussion wiki’

In early years, the conference was residential and had a much smaller, close community, and the wiki was used much more.

The wiki was an installation of MediaWiki, modified so that the conference website and the wiki could share a single login.

In recent years, the wiki has been used for two main things: session outputs and user pages.

Session outputs

The sessions at SPA are mostly interactive, so session outputs can include slides, working code, links to GitHub repos and blog posts, etc.

Session leaders were encouraged to add output from the session to the wiki and you could then look at all outputs from the conference.

To replace this, I added the ability for conference presenters to update their sessions with the outputs. You can see an example under the heading ‘Outputs’ for the session Access control for REST services.

User pages

If you are submitting a proposal to SPA, you need to sign up to the conference site, where you fill in some details; previously, if you consented, a user page was then created on the wiki.

In previous years tickets were also purchased through the site, so this meant all attendees would also sign up to the conference site, and so most attendees would have a wiki page. The wiki (helped by some PHP scripts) showed a ‘card index’ of these user pages to help with the community feel:

Page showing a list of all user pages

However, the conference is now hosted by the BCS, who handle ticketing, so new profiles were only created for session leaders and conference committee members. The ‘card index’ wasn’t at all up to date with who you might meet at the conference.

I moved the user pages to the site itself, rather than the wiki. And we don’t need pages for everyone; just for speakers.


A user page on the wiki


The same user page on the site

As part of this, I also made a change to the way the site handled images. Previously, when a user uploaded an image, it would be stored by MediaWiki and that would be the image associated with them. A new image would update their user page. Now, user images are uploaded to a directory called mugshots. When a user is a session leader for a particular conference and they have an image in mugshots, their image will be copied over to the current year’s images directory and used on their user page.

Two reasons to copy the image over rather than just linking to mugshots:

Updating profile page

I also made some changes to the profile page at the same time. For example, MediaWiki markup is no longer allowed. It was also collecting a lot of information like address, phone number and fax (!), but now that bookings are no longer done through the site, there’s no need to collect this information. I also deleted those columns from the database; there’s no need to store it for existing users.

Old profile page showing out of date fields

I also checked whether that information was published on the wiki so I could remove it from there, but even though there were users who had checked the ‘publish address’ checkbox, no addresses were actually published on the wiki.

In the near future I’ll make more changes to the user page so that we actually get consent (or not) for marketing, rather than the implied consent seen here (“By registering with this site you are giving consent for the BCS SPA Specialist group to contact you…”); both to comply with GDPR and because that’s a horrible dark pattern.

This was also a good opportunity to change the markup from a table to more modern mark-up that will allow it to be read on a smaller device.

Users without details

Many of the sessions at SPA are led by two or more session leaders, and if their co-presenters are not already registered on the site, naming them in the proposal creates an empty record (i.e. it bypasses all the compulsory fields). This leads to user pages with no content. This is what that looks like:


A wiki page showing a user with no details


A page on the site showing a user with no details

I think the after looks better. However, one potential addition is a script to let the programme chairs know which speakers need to pad out their bios. There’s no need to force this information at the proposal stage, as some proposals will be rejected.

Getting rid of the wiki

Having made these changes, there was no longer any need for the wiki. However, there are lots of links to it from previous years, both of outputs and user pages, and Cool URIs don’t change.

This was a lot of work, so I’ve written it up in a separate post. Read on for part 2: cool URIs don’t change.

Don’t be afraid to code in the open: here’s how to do it securely

27 September 2017

There are two big concerns government organisations have around making source code open. They want to know which subsets of the code should be kept closed and how to code in the open securely. To address these questions I’ve introduced two pieces of guidance:

Both pieces of guidance are based on industry standards and have been reviewed by the GDS security engineering team as well as government colleagues from National Cyber Security Centre, Department for Work and Pensions, Home Office and Ministry of Justice.

Why we have updated the guidance

We previously blogged about what code not to open in our post ‘When is it ok not to open all source code?’, but as guidance it is no longer relevant, as our approach to specific areas such as configuration has changed. For example, last year we made GOV.UK’s Puppet repository publicly available on GitHub.

We’ve also evolved our thinking on security. The previous guidance discouraged people from sharing code that had anything at all to do with security. It didn’t take into account that coding in the open can actually make code more robust as it helps you design with security in mind.

The new guidance addresses why open sourcing code that performs a security-enforcing function is beneficial. In simple terms, we can compare coding in the open to how padlocks work. Everyone knows how padlocks work but they are still secure because you cannot open them without the key. Security enforcing software works in the same way, and good cryptographic algorithms are reviewed by many professional peers. Security is improved through public review.

We still specifically seek peer review on open code and subject our code to penetration testing, as part of following security design principles.

What we’re doing at GDS

Most of the code produced by GDS has been coded in the open from the beginning. Some services started closed source and, to ensure that we are practising what we preach, we are now opening those: we have recently completed open sourcing GOV.UK Pay and we are working on opening up more components of GOV.UK Verify.

This new guidance will make it easier for your organisation to develop and deploy secure and open services, and should address your concerns around coding in the open securely.

This post originally appeared on the GDS technology blog.

The benefits of coding in the open

04 September 2017

For any service to be put in front of the public, it has to meet the Digital Service Standard, a set of 18 criteria.

One of the criteria is that all new source code is made open and published under an open source licence.

This goes hand in hand with our tenth design principle: make things open: it makes things better.

In this blog post, I explain why coding in the open makes things better.

two developers in front of a screen with code on it

It encourages good practice

When you know someone is watching, you tend to take greater care. You're more inclined to document your work clearly. You make sure your code is secure by keeping secrets separate from the code. You are polite and constructive in code reviews, and you follow good architectural principles.

In short: when other people can see your work, you tend to raise your game.

It makes collaboration easier

If code is open, it is easier to work on it with others. You don't need to give them special access or make complicated business arrangements. You don't even need to be in the same building.

For example, someone from 18F, the government agency that provides digital services to the government of the United States, was able to help a colleague from GDS with a code-writing problem.

It worked because both sides coded in the open. We also worked with the Australian Government to help them establish their own Digital Marketplace.

Closer to home, it makes it easier to work on the same code between departments.

External users can help make it better

Open code makes it possible for people who don’t work for you to make improvements to your code.

For example, members of the public made improvements to the Government Petitions Service. Someone added the scheduled date for debates. Someone else made a change to the signature counter to make it update in real time.

People can 'scratch their own itches'. They can make the small improvements that aren't at the top of your list of priorities, and they can help make your code more robust.

Others can learn from your work

If your code is open, people can apply what you've learned from doing the work.

Skills Funding Agency used GOV.UK's Smart Answers code to build a tool for their apprenticeships service. It took less than a week.

Without the Smart Answers example to learn from, it would have taken at least two months.

It makes it easier to share standards

Open code makes it easy to follow other teams’ work. This promotes a common culture and way of working when you can see how other teams manage certain issues.

Anna Shipman and another member of GDS staff

Quite often, teams will make small improvements to other teams’ work. For example, a developer from GOV.UK made a correction to GOV.UK Verify.

GOV.UK publishes coding style guides. This makes it easy for everyone to find and stick to the same standards.

It improves transparency on government’s work

When code is developed in the open, you can see where public money goes.

It is a catalyst which encourages openness in other things. For example, the GOV.UK roadmap is open, and one of the teams on GOV.UK uses a public Trello board.

When there is an occasional outage on GOV.UK we investigate and publish a report. It’s important to show how we learn from mistakes.

It clarifies ownership

We want government to own and be able to make changes to its services, and lack of clarity on intellectual property (IP) can be a barrier to that.

Open coding from the beginning surfaces copyright and IP issues before work starts.

The Service Standard demands that code is published under an open source licence (at GDS we use MIT). Additionally, all the work we do as civil servants is Crown copyright.

In the past, government services have wanted to change a project but have been unclear about who owns the IP.

Clarifying the issue upfront is valuable. It means that departments can bring in a supplier to work on their alpha and then switch to another supplier for beta without losing their work.

They can even build up teams from many suppliers who can work on the code seamlessly.

It prevents supplier lock-in. Without clarification, the software created for you can be the thing that will prevent you from switching suppliers.

So resolving this can save a lot of money for government.

It helps make government technology seamless

People who move between departments can continue to work using the same tools as before. It saves time and money. They can share knowledge of projects they were working on, because it’s all open.

After someone moved from GDS to another department, they contributed to our single sign-on service.

Over time, it will make government technology seamless as people move towards the most useful tools.

It’s easier to code in the open than to open a closed repository

Coding in the open means you decide whether that code is suitable for publication as part of reviewing each small piece of work.

To open it later means having to go back through a body of work that has built up over time to make sure there is nothing that shouldn’t be made public, which can be significant extra work.

Make your own code open

Many people think that being able to reuse code is the biggest benefit of coding in the open. However, while reuse is a nice-to-have, I hope this blog post illustrates that there’s more to it than that.

Take a look at our open code and our guidance.

This post originally appeared on the GDS blog.

Break into public speaking

23 August 2017

Earlier this year I had the opportunity to work with my excellent colleagues Rosa Fox and Lucy Carey on a series of workshops to help get more underrepresented people in tech into public speaking. Lucy has written an excellent blog post about it including more details about the breakdown of the course.

This is something I’m really interested in. I’ve written before about how to get more women to speak at your conference and shared resources on getting started with conference speaking, so I was really happy to help with it.

You definitely have something interesting to talk about

The main thing that I want people to realise is that everyone has something interesting to speak about (or blog about). Jessica Ivins has written a great post about this. You know a lot of things that other people don’t know and would find interesting.

The topics that came out of the first workshop were many and varied, including a deeper dive into one aspect of a project someone was working on, advice about how to make it as a project manager, developing junior developers (which Emma Beynon went on to give at Brighton Ruby), and a day in the life of business support.

You only need to be a few hours ahead of the audience

We often think we have to be an expert in something in order to give a talk about it, but that’s not the case. Even with something deeply technical, you only need to be about three hours ahead of the audience; in fact, the closer you are to their level of understanding, the clearer your talk can be.

Finally, it’s worth remembering that the deadline of a talk is the best way to make sure you really know a subject. The reason I submitted my first conference workshop, Data Visualisations in JavaScript, was because I wanted to know how to do it. It was a lot of work, but I really knew it by the time of the session!

Break into public speaking got lots of interest on the back of Lucy’s post (including international attention!) and the next run of it starts next week. I really look forward to seeing where it goes next.

Cross government meetup on Open Source and Security

08 August 2017

We are hosting the second cross-government meetup on Open Source in London on Tuesday 26th September.

Make things open image

The event will focus on security considerations around Open Source and talks will be given by:

Following this, there will be a panel of security experts, including Ahana Datta, Head of Technical Security at Ministry Of Justice, Jenny Duckett from the GDS security engineering team and a representative from NCSC, who will answer questions from the audience about open source and security.

There will then be time for networking and open space discussion to explore security in-depth or discuss other topics.

The aim is to give you tools to take back to your organisations to support your work with Open Source. Everyone working in government is welcome, whether you’re against the idea of open source or already an advocate. In fact, if you’re not convinced, we’d be even more keen to have you come, so we can understand and try to address your reservations.

A recap of the previous event

The first in this series of cross-government meetups on Open Source was in February and had a strong turnout, with around 75 attendees from over 20 different government organisations. The feedback was generally positive, with over 80% saying they found it useful and would recommend it to a colleague.

We had talks from Dave Rogers, CTO of MOJ, explaining why they code in the open, someone from GCHQ talking about their open source projects, and Gemma Leigh talking about the GOV.UK Frontend Alpha.

This was followed by 10 Open Space sessions discussing topics suggested by participants. There was some really useful sharing of thoughts and experiences.

Giving people the tools to help

Something I noticed at the event was that in some ways we were preaching to the choir. People who will travel to a meetup about Open Source are people who are already working hard to promote it in their organisations. While an opportunity to discuss common areas of frustration can be cathartic, what would be more useful would be giving people the tools to have those conversations back in their organisations with people who are less enthusiastic about Open Source.

One of the primary concerns people have about using Open Source Software and coding in the open is the misconception that it is less secure. So this meetup will provide tools, experiences and information to help attendees combat those concerns.

Better opportunities for networking

A few people commented in their feedback for the last event that the structure of the day didn’t help them get to know the other attendees. I think it’s important to fix this because I want it to be easy for people to network and meet others in the Open Source community if they want to.

At this meetup we will have some structured networking activities so those who want to take part and would like an opportunity to meet others can do so. It will be organised by Jane O’Loughlin, who has a lot of experience running meetups and communities. Don’t worry, the networking will not be compulsory!

Location, location, location

In the original blog post I said that we would host other events elsewhere in the UK but feedback from people coming from outside London is that it’s a convenient location to get to. However, this was based on feedback from people who managed to make it, so if the location is preventing your attendance, please let us know, either in the cross-government Slack channel or in the comments below.

Let us know what else

The agenda of the next one is not set in stone so please feel free to share any thoughts or suggestions, for this or future events. And please do come along! Everyone working in government (both civil servants and contractors) is welcome. The invitation will be shared in the cross-government Slack and via mailing lists.

This post originally appeared on the GDS Technology Blog.

Join the SPA organising committee

13 June 2017

I help run SPA, a workshop-based conference on all aspects of advancements in software development – technology, processes, people and practice – and we are looking for people to join the organising committee for the 2018 conference.

Joining a conference committee is fun and good for developing skills

The advantage of helping shape conferences is that it gives you an opportunity to use and develop skills that you might not be using in your job, for example influencing people to submit sessions, or encouraging and supporting people who might not otherwise feel able to speak at a conference. It also means you can shape the conference into one that you would like to go to.

It helps you to understand a conference from the other side of the fence, so when you are applying for conferences you have a better idea of what people are looking for. It also looks good on your CV!

Once you’ve had a bit of experience organising one conference you can get involved in others. For example, after being the programme chair for SPA from 2013-2015 I went on to be on the programme committee for Velocity, Software Architecture and Continuous Lifecycle, all of which gave me an opportunity to shape high-profile conferences (and network with some people I admire!). You also tend to get free tickets to conferences you help organise, so it’s another way to get to attend.

SPA is an unusual and interesting conference to be part of

SPA is unusual in that all the sessions are interactive in some way. There are coding workshops, games to teach you about personal resilience, goldfish bowl discussions about technologies, guided group work sessions on Agile processes and more. For an idea of what the sessions are like, have a look at this year’s programme, the keynotes (for example, exploring the intersection of knitting and coding, needles provided) or previous conferences.

SPA has a submission process to encourage new presenters

SPA’s submission process includes a period where submitters can receive feedback on their submissions before the deadline to allow them to refine their submissions, and all speakers are offered ‘shepherding’ to help make their sessions the best they can be. Submissions are reviewed anonymously to promote diversity.

Being a conference or programme chair

The committee consists of two conference chairs, two programme chairs and other roles as needed. Traditionally, one of each pair of chairs moves on every year, so you are never thrown in at the deep end.

We’d also very much like a marketing chair though we’ve not had someone in that post this year.

The conference has been running for more than 20 years so there is a great community and many of the existing team will stay in place, so it’s an opportunity to learn the skills involved in running a conference in a supportive environment.

There is a lot of admin and organisational work involved in putting on an event and we are really lucky to have an excellent events executive, Mandy Bauer, at BCS, who manages that side of things. SPA is a specialist group of the BCS, and we also usually host the event at the BCS offices in London. So as a conference chair you can focus on making the conference the best it can be.

The time commitment of being on the committee

For programme chairs, the work is usually from around September to March. Initially you’ll be publicising the call for submissions, and contacting people to encourage them to submit. Ideally you’d reach out to lots of people to encourage a broad range of speakers. Around January, you will be working with people to help give feedback on the sessions (there are lots of people involved in this so you won’t have to do it all yourself), followed by a review period and then working out the programme.

For conference chairs the work is mostly January to June; getting sponsors, inviting keynote speakers, arranging evening diversions (this year you can build a laser bot for cats!) and then running the actual conference on the days themselves.

In addition, there are other, non-chair roles and the commitment here is as little or as much as you can offer – helping spread the word about the CFP, helping give feedback and review sessions, attending the programme meeting if you can, helping out at the conference itself.

Contact us if you’d like to know more

If any of this sounds like something you’d like to take part in, or you have more questions, please get in touch with the organisers. We are having a meeting about next year’s conference on July 11th, and if you’d like to come to that please do get in touch.

Whether you are interested or not, it’s worth attending this year’s conference. This year we have sessions on topics as varied as knitting and coding, blockchain and pairing across skill levels without the drama.

Writing a business case

01 March 2017

As I become more senior I have more of a need to write business cases. I turned to the excellent Peter Grzeszczak for advice, and this is what he said.

Start with the problem, not the solution

Instead of opening with the solution you are proposing, start by defining the problem you are trying to solve. This is really useful to clarify your own thoughts on the matter.

Make it strategic

Make the strategic point – why is this a problem that we want to solve? What is the outcome we are trying to get to and how does that support the organisation’s stated aims? In our case, why is this something government should care about?

Think of three or four potential solutions

The reason you are writing this is probably to make a case for doing something that you already think is a good idea. It’s a good exercise to think about some other ways you could achieve the same outcomes. You might realise there is a better way to approach the problem. If nothing else, it demonstrates to the people the business case is aimed at that you have properly considered other options.

Your first potential solution should be ‘do nothing’

What is the cost of doing nothing? Are there any benefits to taking no action? Again, this is very useful for clarifying your own thoughts, as we tend to have a bias towards taking action.

Outline the costs and benefits of each potential solution

Often the benefits will be “soft” benefits, i.e. intangible, or things it is hard to quantify. For example, something might be good for recruitment, or it might lead to a better alignment with government policy.

If you can give actual or estimated costs for things that’s even better. For example, how much does a developer on average cost and how long would this piece of work need them for? Tangible benefits are also good.

Case studies can be used to support your rationale

Case studies can be quantitative or qualitative. The former is where you outline how a similar solution to the one you are proposing saved £X and is therefore evidence for your suggestion that your proposal will save £X. The latter is an example where you can’t offer hard figures, and is where you are recognising that your solution may be a leap of faith but your case study shows why you believe it’s a punt worth making.

The former is better, but the latter can strengthen your case if used judiciously.

List the risks and potential mitigations

Make sure you’ve considered what potential extra costs there could be, and how you would make sure they are addressed.

Make your recommendation

If you’ve made a good case so far, then it should be obvious at this point what your solution is. You’ve considered other options, including doing nothing, you’ve weighed up the costs, benefits and risks, and this is the case for your desired solution.

Include a management case

If they approve this case, how would you manage it as a project? Do you already have an idea of the group of people you would get together to work on it? If so, include these details.

You should also cover measurement. When you evaluate this piece of work at the end, how will you know it’s been successful?

Make it easy for those reading the business case to see that if approved, you would be able to deliver this successfully.

The process should take time

Part of the point of making a business case is sharing the information. It shouldn’t come out of the blue for anyone who will be involved in making the decision or supporting the work.

As a rough rule of thumb, for a business case for a significant piece of work Peter suggested you would probably be looking at around a month: you might spend a week or so on discovery, making sure you’re clear on the problem (maybe having a workshop) and working out who will be the sponsor; once you’ve got the options you’ll probably spend another week or two getting the data to support those options, and then a week or two to write it up.

Be loud and open as you go; you want input from people who have something to add, and you want people to know it’s happening.

Seek feedback as you go

Good people to ask for feedback are the people who you will need to support it, for example the people in charge of business operations and the senior sponsor. Talk to people to see how they think the case will be received, and whether the information is presented in the way people will want to see it.

It’s also worth (as with most things) finding a critical friend who is not involved to read through it.

The five cases for a government business case

We were not talking about producing a business case for the Treasury, though that’s a lot of what Peter does. Although that is a much more involved exercise, I found it useful to think of the different cases you are required to make for Treasury cases: strategic, economic, commercial, financial and management. It can be useful to think of your problem in those lights, even if you don’t go into all the detail. This guide on assessing Treasury business cases (PDF) is useful for more information on that.

Ultimately, it’s about telling a story

Your business case should be a narrative. Here’s the problem. Here’s a good solution. It might cost us money, but here are the reasons that it’s the right thing to do.

Come to the cross-government Open Source meetup

25 January 2017

We are organising a series of cross-government Open Source meetups to exchange ideas, talk about code we can reuse or collaborate on and build a community around Open Source.

Post-it note saying Share knowledge

Making code open is the foundation of the transformation of government. One of the major benefits of open code is how easy it makes collaboration, and potentially reuse, saving other teams time and effort. As I wrote about in my post outlining the next steps for open source in government, the best way to make that happen is to talk to each other.

The first meetup will be co-hosted with the Ministry of Justice (MOJ), on the afternoon of Friday 24th February, at MOJ’s office in London.

There will be short talks from GCHQ, Home Office and GDS about the Open Source work we’re doing. We will then have some discussion sessions, organised on open space/unconference principles, so attendees can set the agenda for what they would like to cover. Throughout the afternoon there will be plenty of opportunities to talk to colleagues working on the same things in other departments.

This event is only open to people working in government (both civil servants and contractors are welcome). If you would like to attend, please sign up to the cross-government technical architecture mailing list or ask for the sign up details in the cross-government Slack.

I hope to see you there to continue the interesting discussions we’ve started.

This post originally appeared on the Government Technology blog.

Next steps for Open Source in government

15 December 2016

I was recently appointed Open Source Lead at the Government Digital Service (GDS) with the aim of making more government code open, improving the reusability of the open code we already have, and helping government as a whole be a better member of the Open Source community.

Making code open is vital to the transformation of government. Working openly also supports our work with other governments and last week, the UK government reaffirmed its commitment to making source code open by default at the Open Government Partnership summit in Paris.

By making our code open and reusable we increase collaboration across teams, helping make departments more joined up, and can work together to reduce duplication of effort and make commonly used code more robust.

A lot of great work has been done across government on this and it’s clear that developers across government are seeing the opportunity to better meet users’ needs through code reuse. We’ve seen that there is demand for more action and more support through our cross-government StackTech events and the code sharing unconference that took place earlier this year.

In this post, I am going to talk about my first priorities as Open Source Lead and let you know how you can get involved.

Open Sourcing government code

Over the past five years, a huge amount of government code has been released under Open Source licences. This has been great for transparency, collaboration and encouraging good practices. Making things open makes them better.

However, most of this code is what we call coded in the open rather than Open Source Software. The teams don’t guarantee that they will support it or maintain it in the way Open Source Software needs to be, and a lot of it is not set up to be easily reused.

When code would be useful for other teams there are clear advantages to supporting reuse. For the other teams, and for government in general, the advantage is the chance to save time and money. In these cases, it might be worth taking the extra steps to make this code Open Source Software.

There are advantages to the originating team as well. Your code will be used and tested in a variety of environments, and there is a greater chance of people finding issues and, in many cases, helping you to fix them. People who use the code often contribute bug fixes back to the original, and they may help set direction and contribute features as well.

However, it can be a lot of work to make the code reusable and maintaining it is an extra overhead, so it’s important to focus the effort on the projects which are meeting the greatest user needs.

Teams across government are already doing great work producing reusable code. For example the GOV.UK frontend alpha, GCHQ’s Gaffer and Home Office Forms, to name just a few. Initially, I will be doing user research to understand what code that has already been written by government would be useful more widely, and I will then identify a few projects to focus on making into Open Source Software. There will be opportunities to get involved in this user research which I will talk about more next year.

Not all code needs to be Open Source Software but all code needs to be open

Even where the project does not meet user needs for reuse beyond its originating team, it’s worth making it well documented, with good commit messages, and blogging and talking about it, so that other teams can reuse your learnings if not the code itself. A great example of this recently is the digital apprenticeship service using GOV.UK’s smart answers code.

There are many benefits to making your source code open even if not fully Open Sourced, including encouraging good practices and making it easy for teams to collaborate. All new code written in government has to be open by default.

However, it’s not always easy for teams who aren’t used to it to make this happen. Firstly, it’s clear that our guidance is not as joined up as it could be so I’m going to be working on clarifying that and filling any gaps, and then I’ll look at how to address any other barriers we find through user research.

It’s all about community

The most important thing when sharing code and making code open is to talk to others working on the same things, share ideas and learn about code you can reuse or collaborate on.

We held a cross-government meet-up on code sharing earlier this year and some great ideas came out of that. I will be building on this by organising meetups every few months as part of building a community around this work.

The next cross-government open source meetup will be in February, and GDS is co-hosting with the Ministry of Justice (MOJ). There will be a series of short talks from departments on what they are doing around open source, followed by some open space/unconference sessions. If you work in government and would like to attend (or speak about what you’re doing about Open Source in your department), sign up to the cross-government technical architects mailing list where I will post details next month. I will also blog more about it closer to the time.

Making higher impact contributions to Open Source Software

We depend a lot on Open Source Software. For example, you can see from the GOV.UK Colophon a small fraction of the Open Source Software we use at GDS. Many teams contribute back patches to help improve these projects, but next year I’m going to be looking into how we can make higher impact contributions. This will help make sure that the Open Source Software that the government depends on is more stable in the long term; and also, giving back to these projects that we use for free is the right thing to do.

It’s worth mentioning that the primary focus of my job is not about driving adoption of Open Source Software. Open Source Software is already used widely across government. The Technology Code of Practice is very clear that you must give it equal consideration when choosing technology, and the spend controls team are doing an excellent job making sure Open Source Software is given a level playing field.

How you can get involved

If you are in government and would like to attend the meetup in February, please sign up to the cross-government architects email (Google) group, where we will post more details next month. There is also a cross-government Slack with channels for a range of topics, including #open-code.

If you are interested in helping us with our user research on any of this please get in touch. I will also be talking next year about the specific work we are doing and how you can take part.

There is lots to do and these are just the first few things I’m focusing on. I’d be very happy to hear your thoughts.

This post originally appeared on the Government Technology Blog.

I am now Open Source Lead

18 November 2016

I’m pleased to announce that I’m now Open Source Lead at GDS. James Stewart has written a blog post about my appointment.

How to blog

16 September 2016

Someone recently asked my advice on blogging, particularly how I decide what to write about and whether I set aside specific time. Here’s what I said.

Share what you learn

This is my guiding principle, for blogging and otherwise. Anything you’ve learned is worth writing up, as someone else will then be able to learn it too. Writing it up also helps make sure you really understand it yourself.

Write about things you are asked about

If someone asks you for information, or you find yourself sharing the same information with different people, that could be worth writing a post about.

One of my most popular posts was one which mainly just links to other resources. I recently discovered that it had even been linked to in the call for papers for Web Directions Australia. It was basically the blog post form of an email I’d sent several times, trying to persuade people to submit to a conference.

Write about things you want to remember

I don’t have a very good memory, and one of the strategies I use for remembering things is to write them up. I have often referred back to posts I have written myself, for example on regex, on setting up a local wifi and on Git.

This helps build up a body of work, and also helps you to be less perfectionist about what you post. And it’s also very useful when you want to refresh your memory!

Write about things people ask you to write about

Sometimes people suggest you write a post about something you’ve explained to them or discussed with them, as the excellent Martin Jackson did for my post on how to raise good pull requests. I’m really glad he made that suggestion as I’ve seen that shared several times since (often in PRs).

If someone suggests you write a blog post about something that’s always a good hint as it means you’ll have at least one reader!

Don’t be perfectionist

Before I started blogging, I worried for quite some time about what I would say, how I would know what to write about, or whether it would be interesting. In the end I decided to just go for it (hence the name of my blog).

My early posts are really very boring; they were just details of me getting to grips with Unix. I doubt anyone read them. But it got me started, and it got me into the habit of writing things up and speaking into the void.

The more you write boring posts, the more you think of how things can be interesting. Don’t wait for ideas to be good enough.

You could try being disciplined about a schedule

Having a set schedule can help. For example, the excellent Cate Huston posts every Monday, Wednesday, Friday and Sunday. She points out that this means you get better with practice, and also, treating the schedule as more important than the content avoids you making the wrong decision about what will be interesting to other people. (That whole post is useful advice on how to maintain side projects.)

The inspiring Mark Needham started his blog by committing to write a blog post every day about one thing he’d learned. He didn’t manage it every day, but over seven years he has managed more than one every other day.

Write the post when you get the idea

However, I don’t have a schedule. My aim is just to keep my site roughly up to date, which for me means I start to feel anxious if I’ve not posted in the past three months. I usually try and write posts when the idea strikes me and if I’ve been writing a lot, I might park it to post later.

For example, I wrote the first draft of this in August 2015, but I had just posted on the GDS Technology blog and I knew I had two more GDS blog posts coming soon, so I parked this one.

Writing a draft of it then was very easy, as I’d just had the conversation offering advice so it was all fresh in my mind, and it was easy to redraft it last night as the meat of it was all there.

Structuring your post

How to write is a whole nother post (/book) in itself but three things I bear in mind are:

You’ll get into the habit

Once you get into the habit of blogging and figure out your own style, you start to recognise what things can be formed into a blog post as they are happening.

For example, I’ve written up discussions at conferences, advice people have given me, work I’ve done and work I plan to do. It’s not all freeform discussion of why roofing should be more like software development.

The executive summary

If I could only give you two pieces of advice, I’d say: share what you learn, and JFDI.

Making my site look better on small screens

04 July 2016

I’ve had this blog for five years, but I’ve only recently started using my phone to browse the internet, at which point I realised it displayed terribly on a small screen. It’s a wonder anyone ever read my posts.

Screenshot of site on an iPhone before redesign

As a predominantly back-end developer, it wasn’t immediately clear to me what I needed to do to improve matters, so I thought it was worth making a note here once I figured it out.

Responsive design

You want the site to respond to information about the client accessing it and display in the best way for that client. In this case, I wanted the site to respond to browsers with a smaller screen and display differently, rather than just showing everything as it would appear in a desktop browser, only much, much smaller.

A media query allows the site to find out about the capabilities of the device, such as its screen width, and apply different styles accordingly.

Redesign required

The first thing I needed to do was work out how I wanted the site to look on a mobile device, which actually took a bit of thinking about. I realised that the current layout wasn’t going to work well and, as is often the way of these things, probably already wasn’t working well.

I was using a three column layout. However, on some pages the right column was left blank, and on one page I was instead using a two column layout. Only one page was making full use of the three columns. It was time to let it go.

Only page using 3 columns Page using 2 columns

Redirecting a URL

I took that opportunity to also rename the Games page. I used to spend more time developing little games; now I do a more diverse range of side projects, so I can showcase more of that here. Because my site is hosted on GitHub Pages I could not do a 301 redirect, but I set up a meta refresh tag to redirect to the new page. Serving a 200 that refreshes to another 200 is not ideal, but it is better than a 404.
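A meta refresh goes in the head of the old page and tells the browser to fetch the new one. A minimal sketch of the approach, with hypothetical page names rather than the actual ones from this site:

```html
<!-- Placed in the <head> of the old page, e.g. games.html.
     "0" means redirect immediately; /projects is a hypothetical target. -->
<meta http-equiv="refresh" content="0; url=/projects">
```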

You can see the redesign changes I made on GitHub.

Use Jekyll for the whole site

When I originally started this blog I handcrafted the HTML for each post, and the rest of the site was also handcrafted HTML. My principle was just to get started rather than waiting until I’d figured out the best way to do it.

When I started using Jekyll, I only used the Jekyll features for the blog. However, the redesign away from the inconsistently applied three column layout made it much easier to Jekyllise the whole site and allowed me to remove a lot of the duplication and handcrafting.

These initial changes actually made the site worse on mobile because there was more padding on the right.

Screenshot of site even narrower on iPhone screen

Set viewport

The first change after the redesign was to add the viewport meta tag with an initial scale of 1. This sets the zoom level at 1:1 so the page is rendered at the width appropriate to the device width rather than zooming out to fit the whole page onto the screen.
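That tag looks like this:

```html
<!-- Render the page at the device's width, at 1:1 zoom -->
<meta name="viewport" content="width=device-width, initial-scale=1">
```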

Make embedded video responsive

After setting the viewport initial scale, most individual posts looked good on mobile. However the JFDI page has all the posts on it, and it looked very wrong. All the content was squished to the left.

JFDI page with all content squished to left

It turns out that the code provided by YouTube and SlideShare to embed videos/slides into your site is not responsive; it has a fixed width. This means that the site renders the text correctly for the size of the device, but when it gets to the fixed width video it then zooms out to allow space for it.

An embedded video pushing the page size out

These two articles were useful in working out how to fix this. I changed the HTML to not have the fixed sizes and added some CSS.
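The usual fix is to wrap the embed in a container whose height is derived from its width, and let the iframe fill the container. A sketch of that approach (the class name and video URL are illustrative, not the actual ones from this site):

```html
<style>
  /* padding-bottom is calculated relative to the container's width,
     so 56.25% keeps a 16:9 aspect ratio at any screen size */
  .video-container {
    position: relative;
    padding-bottom: 56.25%;
    height: 0;
    overflow: hidden;
  }
  /* The iframe stretches to fill the container instead of
     using its fixed width and height attributes */
  .video-container iframe {
    position: absolute;
    top: 0;
    left: 0;
    width: 100%;
    height: 100%;
  }
</style>

<div class="video-container">
  <iframe src="https://www.youtube.com/embed/some-video-id"></iframe>
</div>
```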

It also turned out that on Safari (and hence on the iPhone), long lines of code were not broken, leading to the same effect as the fixed-width video, which I fixed with an addition to the viewport tag.

Only have two horizontal columns if device is large enough

Once I’d done all this set up I was at the point I needed to be, where I could change the layout based on the size of the device.

To do this, I looked through the CSS for anything that changes the horizontal structure of any of the elements (e.g. width, float) and put that inside a @media query block. Initially this was just the two columns, but I later added the Twitter link.
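The effect is that the column styles only apply when the screen is wide enough, and narrower devices fall back to a single stacked column. A sketch, with illustrative class names and breakpoint rather than this site’s actual ones:

```css
/* Only float the columns side by side on screens at least 768px wide;
   below that width, these rules don't apply and the columns stack
   in normal document order */
@media (min-width: 768px) {
  .content-column { float: left; width: 65%; }
  .sidebar-column { float: right; width: 30%; }
}
```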

Move columns into preferred order

After all these changes, I then changed the HTML to make the columns appear in the order I would like if they cannot be side by side.

There are many other improvements to make to this site but hopefully if you are reading on mobile it’s much easier. Do let me know!

Useful questions for one-to-ones

30 March 2016

When I started line managing people at work, three years ago, I got some great advice from the excellent Andrew Seward. He structured one-to-ones around the following list of questions:

  1. How's it going? (they always say 'fine')
  2. How are you feeling about the work the team's been doing?
  3. Of that work, how are you feeling about the parts you've personally been involved with?
  4. Do you have everything you need to do your job? (important one!)
  5. Is there anything you’d like from me that would help you do your job better?
  6. How is ${new person} doing? (if they have a new team-mate)
  7. Have you had a chance to work with the others in your team? How has that gone?
  8. How have you been getting along when working with other teams within development?
  9. How about when you've had to deal with other parts of the company? How did that go?
  10. ${offer some feedback – always some positive and some negative if necessary}
  11. Is there anything else you wanted to raise?

When I started out I didn’t try to subtly steer the conversation to cover these areas; I just explained what I was doing and then went through the questions one by one. It became more natural in later one-to-ones, but even just going through the questions in a very structured way sparked discussion and brought out issues that might otherwise not have come up.

All of my reports who have since taken on line management duties have asked me to share this list, which suggests that it was a useful tool for them too. I’d be interested to hear similar questions that have worked in your one-to-ones.

Code sharing in large organisations

24 March 2016

At Scale Summit last week I led a session on code-sharing in large organisations. I was particularly interested in how other organisations raise the visibility of shareable code. This was one of the main themes that came out of the recent code-sharing unconference at GDS.

These are the main things I took away from the discussion. For more comprehensive notes, check out Barrie Bremner’s write-up.

Some ideas of how to raise visibility of code

People in the session shared the way they publicise code/projects in their organisation. Techniques include writing blogs, either internal or external, presenting at internal meetings and writing things up on internal forums. Facebook have a newsletter called the Weekly Push which is used, among other things, to publicise less well known features or new products. In some offices this is pinned up in the toilets!

However, apart from the last one, these are mostly pull methods of communication. You have to know they are there and how to find them. The newsletter is a great idea, but in less joined-up organisations you can’t be sure that push communications will get to everyone.

It’s useful to have a community

The discussion kept returning to the value of having a community around areas of interest. If you are regularly talking to other people in your field, you are more likely to find out about similar work other people are doing. There can be a lot of noise, and a community can help you learn what works and what doesn’t, and can make it easier to talk to relevant people at the start of a piece of work, rather than when you’ve gone too far down a certain path. In order to encourage useful sharing of information and discussions taking place at the right time, an organisation could support these kinds of communities.

As well as meetings and distribution lists, people talked about other ways to share information, for example, having a Slack channel for a particular community. You could drop in and say “I have this problem, does anyone have a solution?”.

One person pointed out that we often focus on code-sharing when in fact the real goal should be community. If code is shared or reused as a result of that community, that is good, but having a community is where the real value lies.

For those working in UK government, there are already some forums for these discussions.

How to build a community

One useful idea came from Paul Gillespie (quoted with permission). He explained that at Skyscanner, they copy Spotify’s org structure, and have something called guilds. These are essentially communities of interest. For example, they have one for web dev, one for AWS, one for Python. These guilds have Slack channels and mailing lists, and every two weeks they have a guild meeting, which is a one-hour webinar. They use Trello to set the agenda for this meeting, and each guild is run by a core committee.

I later talked to Paul a bit more and he said that in total they have around 6 or 7 guilds. You don’t want to have too many, because being involved in a guild is quite labour-intensive.

Beware of premature optimisation

Early on in the discussion it was pointed out that we should be mindful of what problem we are actually trying to solve before addressing how to share code. Many problems seem superficially similar but turn out not to be, so the same code/product/solution will not be appropriate to both. You may waste time coming to this conclusion or bloat software by trying to make it cover too many different scenarios.

There can also be a lot of pressure to consolidate, and some of this can come from senior management seeing false patterns, for example “too many case management systems”. It was noted that spotting actual duplication and finding prior art in this area is part of the role of an architect, but in a large organisation visibility of code is still difficult.

The problem we are trying to solve is to reduce duplication of work, rather than specifically duplication of code. More generally, we do not want teams to waste time reinventing the wheel. We do not necessarily want “the best tool for the job”, we want the most cost-effective tool, and that might be copying someone else’s code, or the team solving the same, or similar, problem in a different way.

Code-sharing isn’t always the answer

If someone has written code that is perfect for your use case, it can still be hard to share. Even if it is well documented, there is still a ramp-up cost to understanding it, and it is unusual that the code can be dropped right in. Several people mentioned that you need to weigh up the cost of these factors against the cost of duplication of code. Weighed against these factors, building it yourself might not turn out to be that expensive.

It’s important to weigh up the costs

The general feeling was that forming a community is very useful to prevent duplication of work or code, but it was also pointed out that there are costs: for example, the time cost of communication and staying in touch, keeping documentation up to date, and so on. Again, these may outweigh the costs of duplicating work.

There are other advantages to being involved in communities of interest, but it is worth considering the cost of the time and effort. For example, while the idea of a Slack channel for a community, mentioned above, can be very useful, Slack can also be a drain on productivity.

We also returned a few times to the topic of sharing services, or platforms, rather than the code itself. Instead of writing code to be shared, build a service or platform that provides functionality. However, the question of cost came up again: building and operating a service is expensive and takes skilled people, as well as maintenance costs.

My take-home message

The main thing I took away from this discussion is that you need to be clear about what problem you’re trying to solve and what the costs of the solutions are. Sometimes an increased awareness of code that has already been written will solve your problem, but sometimes what you need might be access to a service, or it might be to share knowledge across a community.

Thanks very much to all who took part in the discussion.

Choosing Cloud Foundry for the Government Platform as a Service

17 December 2015

We previously wrote about looking into offering a platform to host digital services. This post is about the technology we chose for the beta and how we made that decision.

Government PaaS team

Comparing technologies for the prototype

The first thing we did was look in detail at the various open source and proprietary options available. I’ve previously written about how we compared the open source options and I mentioned that the front-runners were Cloud Foundry, Deis and Tsuru.

Deis was a very good option, but we ruled it out for our prototype for two reasons: it didn’t have the granularity of user permissions we needed, and it didn’t have a mature service broker, which would allow us to connect to external services, eg Postgres. Both of these things are on their roadmap, but the timing wasn’t going to work for us. However, we had a very interesting conversation with the Deis team and this is definitely a technology to keep an eye on.

With the proprietary options, the method of comparison was slightly different because the source code isn’t available to look at, and because it’s not usually as easy to get a sample platform up and running.

There were four proprietary vendors we were particularly interested in: Apcera, OpenShift Enterprise (Red Hat), Pivotal Cloud Foundry and Stackato. Each of the vendors answered questions to a similar level of detail as we were able to learn by investigating the open source solutions ourselves. Because of commercial confidentiality, I can’t share the detail here, but it allowed us to compare the proprietary solutions with each other, and then with the open source ones.

They all had advantages, but the one that most suited our requirements was Apcera.

Comparing Tsuru, Cloud Foundry and Apcera

We wanted to get a prototype up and running quickly so we could start showing it to our users in government to see if we were on the right lines. So we decided to start with an open source solution because there are no licensing issues.

We built a prototype using Tsuru, because it’s easier to get started with Tsuru than Cloud Foundry. Then we used that prototype in our user research - we wanted to make sure we understood which features were most important to our potential users.

We then built another prototype in Cloud Foundry to compare its features, ease of use and maintenance requirements to those of Tsuru. Simultaneously, we spent some time exploring a trial installation of the Apcera platform with our engineers providing feedback about each of the three different options.

Why we decided to go for open source rather than proprietary

Paul Downey, the product owner on the Registers team, has described the work we’re doing on Government as a Platform as fitting into three categories: product, platform and standard.

Product, platform and standard sketch by Paul Downey. Licence: Creative Commons Attribution.

This is how we apply those categories:

If the technology we choose is not open source, the product would be the proprietary option. So in effect, we’d just be recommending a product to other departments, and they would then have to build their own platform and incur costs over and above supporting and developing the platform. But having the code available for use by other departments as a platform would encourage collaboration with our colleagues in those departments.

Using an open source solution brings a number of other important benefits. The chance to contribute code upstream means we can help make the product useful to us and others in a similar position. Maintaining and building open source components and the ‘glue’ (ie the code required to keep it all together) builds skills and learning in-house. And it also echoes our tenth design principle: ‘Make things open: it makes things better’.

Choosing Cloud Foundry

The three technologies we looked at in more detail each had a lot to recommend them.


Cloud Foundry:


While each of the technologies we looked at had advantages, the open source requirement is important to this project, so we had to rule Apcera out for now.

It was a very close contest between Tsuru and Cloud Foundry. After a lot of consideration we chose Cloud Foundry for our beta. The maturity of Cloud Foundry and the size of its community were the most significant factors in this decision.

However, all three technologies are very good and if your team’s requirements are similar to ours then it’s definitely worth considering them all.

Next steps

We hope to share more detail about what we learned about each technology in future posts, but in the meantime we’re now starting the beta build using Cloud Foundry. We’re coding in the open, and aiming to be hosting live services early next year.

This post originally appeared on the GDS Government as a Platform Blog.

Video of Operations: a developer's guide

27 November 2015

Video from my FFconf talk Operations: a developer’s guide.

Slides and links in previous post.

Slides and links from Operations: a developer's guide

07 November 2015

Slides from my talk Operations: a developer’s guide, at FFConf 2015.

The last slide is links; these are copied below so you can actually click on them!

Video to follow…

Configuration management


Using Vagrant (Useful getting started examples)


Containerisation vs Virtualisation

Make instead of Grunt/Gulp (Talk on using NPM as a build tool)

Tools for better dev (More detail on the 6-line Unix program)

A PaaS for Government

29 October 2015

I gave a keynote at Velocity EU about the work I’ve been doing on a PaaS for government.

Looking at open source PaaS technologies

27 October 2015

I’ve been working on a prototype of what a Platform as a Service (PaaS) for government might look like, as we wrote about in a previous post. One of the first things we did was look at the open source PaaS options that were available. This post is about how we did that and what we learned.

Comparison table of PaaS

The open source options we considered

We looked at a range of proprietary and open source options. In this post, I am focusing on open source. This is because much of the information we learned about the proprietary options was shared in commercial confidence. I’ll talk more about the proprietary options we considered and how we compared them in a later post.

Exploring the options

PaaS is a very fast-moving field at the moment and there are a lot of options. The first thing we did was take some time individually within the team to identify which options were worth investigating. We based that on previous experience, things we’d read about, and further research online. There were around eight of us on the team, so we had a lot to draw on.

It’s not always the case that you are comparing like-for-like with PaaS technologies. For example, Cloud Foundry is a fully-featured PaaS, whereas Mesos is a cluster-management system. While Mesos on its own has a number of PaaS features, it didn’t meet the combination of our "developer self-service" and "multi-tenancy" criteria (for example, it has no authentication or access control).

I wanted to investigate Mesos as it’s an interesting technology, so we looked at ways to combine it with other technologies that offer those features. We chose combinations based on what we found to be commonly used together. In this example, you can see we looked at both Mesos + Aurora, and Mesos + Marathon + Chronos (Mesosphere).

At this stage, we ruled out things that we didn’t think were worth investigating further (for example, they were nowhere near production-ready) and worked out some combinations that made sense to look more into.

The longlist of technologies we investigated is:

Our selection criteria

In our previous post I outlined the four main criteria we’d identified from our user research: a PaaS would have to be multi-tenant, self-service, allow application developer support and be able to run on multiple public clouds. This had already allowed us to rule out some technologies (for example, a number of PaaS technologies only run on AWS). We also had some further must-have criteria. The complete list of our selection criteria is below:

Must haves:

Investigation points

Brett Ansley, our business analyst, wrote these up very clearly and with acceptance criteria to clarify what we were looking for. For example, for zero downtime deploys:

Given: an application that is receiving requests
When: a new version of the application is deployed
Then: there is no downtime of the application.
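An acceptance criterion like this can be checked mechanically: poll the application throughout the deploy and count failed responses. The sketch below is illustrative and was not part of the original investigation; the `probe` callable is a stand-in for whatever health check the application exposes (typically an HTTP GET of a status endpoint).

```python
import time

def probe_during_deploy(probe, duration_s=30.0, interval_s=0.5):
    """Call `probe` repeatedly for `duration_s` seconds and count
    failed checks; zero failures means the deploy was zero-downtime.

    `probe` is any callable returning True when the application
    responds successfully.
    """
    failures = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        if not probe():
            failures += 1
        time.sleep(interval_s)
    return failures
```

Run concurrently with a deploy, a result of zero failures would confirm the "no downtime" criterion for that run.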

Comparing against our selection criteria

We then split into pairs and each pair took a technology in turn to evaluate it. Dan Carley, our tech lead, outlined some consistent steps to take in each investigation so that we could be sure each pair was investigating in the same way. For example, to investigate high availability:

Each pair spun up the technology they were using and investigated it. As they found the answer to each of the selection criteria, they marked it on the whiteboard (main photograph) so we (and any passers-by) could clearly see how we were progressing and which technologies had what. If any technology failed a must-have, the investigation would stop; otherwise it was time-boxed to two days.
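The stop-on-failure rule described above can be sketched as a small evaluation loop. This is purely illustrative (the real investigations were manual, whiteboard-driven work): criteria are checked in order, and a technology is abandoned as soon as it fails a must-have.

```python
def evaluate(tech, must_haves, investigation_points):
    """Check criteria in order, recording each result, and stop as
    soon as a must-have fails, since the technology is then ruled out.

    Each criterion is a (name, check) pair, where `check` is a
    callable taking the technology under test and returning a bool.
    """
    results = {}
    for name, check in must_haves:
        results[name] = check(tech)
        if not results[name]:
            return results, False  # failed a must-have: stop here
    for name, check in investigation_points:
        results[name] = check(tech)
    return results, True
```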


The overview of what we learned about each can be seen from the photograph of the whiteboard above, and is summarised in this spreadsheet. It’s worth noting that the spreadsheet is slightly more up-to-date than the photograph of the board; for example Rancher and Tsuru were late entries, and some answers were updated with more information that we learned later.

One thing that I found particularly interesting was that multi-tenancy is not a feature of many of these technologies. For example, Kubernetes and Mesos, two widely used and interesting technologies, do not support multi-tenancy. There’s no way to ensure that a team of developers can administer only their application and not the applications of another team. This meant that they were not suitable for our purposes.
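What multi-tenancy means in practice can be illustrated with a minimal ownership check: a team may administer only the applications it owns. The data model and names below are hypothetical, not any real platform's API.

```python
# Illustrative mapping of teams to the applications they own.
TEAM_APPS = {
    "digital-marketplace": {"marketplace-frontend", "marketplace-api"},
    "govuk-publishing": {"publisher", "content-store"},
}

def can_administer(team: str, app: str) -> bool:
    """A team may act on an application only if it owns it;
    everything else, including unknown teams, is denied."""
    return app in TEAM_APPS.get(team, set())
```

A platform without this kind of access control cannot stop one team from administering, or reading the data of, another team's applications.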

The tech that meets our needs

After going through this process of looking at and comparing a number of open source PaaS solutions, the clear front-runners were Deis, Tsuru, and Cloud Foundry. The next stage was to investigate these three technologies more and choose one to build a prototype. This has helped us with user research, which we'll share more on later. In the meantime, we hope sharing what we’ve learnt about these technologies is useful to you, and do let us know your thoughts in the comments below.

This post originally appeared on the GDS Technology Blog.

Building a platform to host digital services

08 September 2015

Right now, hosting services is one of the most time-consuming barriers for new digital services, and usually involves duplicating work done elsewhere. On the Government Platform as a Service team we’re working on solving that.

Repetition, repetition, repetition

Every digital service that government runs needs to have a place to run from; it needs to be hosted somewhere so that it is accessible to users via the internet. The service doesn’t ‘just work’; there is a lot of effort involved in setting up all the components required to host a service.

These components don’t vary much between services. Every service needs, for example, an automated way to let developers know when something is wrong. So, in practice, these groups of components end up looking very similar across very different services. The picture below shows you an example:

image showing three projects with the same technical stack, including alerting, monitoring, logging, each running on a cloud provider

As you can tell, there’s a lot of duplication. Teams all over government can end up duplicating work that’s already been done elsewhere. That means spending time on areas that aren’t their speciality, such as application monitoring or log aggregation, which stops teams from focusing on their areas of expertise.

It also leads to a lot of time searching for people with expertise in this area to hire. All of this takes time and money and leaves teams less time to focus on their users’ needs.

One way to address these issues is to provide a platform as a service (PaaS) that services could use for their cloud hosting. A shared PaaS would then change the diagram above into something more like the one below:

image showing three projects, each running on Government PaaS, which has a technical stack including alerting, monitoring, and logging, and running on three different cloud providers

A Government PaaS wouldn’t just solve the issues of duplication and where teams focus their effort. One thing that takes a lot of time in government is procuring commercial services and making sure they are accredited. If we could do that once, for the PaaS, then that could save service teams a great deal of time, while making sure that those aspects are being handled in the correct way.

What a Government PaaS needs

From the user research we’ve been doing it’s clear that it’s important that our platform has a concept of multi-tenancy - applications that run on the platform should be isolated from each other and not be able to read or change each others’ code, data or logs. It wouldn’t be appropriate, for example, if the Digital Marketplace application was able to access the data of the GOV.UK publishing platform.

We’ve also learned from our experience supporting GOV.UK that a platform where the people developing applications also support the application out of hours leads to better software and a better user experience. We want a platform that supports this model right from the beginning.

Apart from multi-tenancy and the support model, there are some other things that we feel are important in a shared PaaS.

It needs to be self-service. It needs to be easy and quick for application teams to get started, and the teams using the platform need to be able to make frequent changes. That means we need to make sure applications can be deployed and managed by the application teams, but also that they can make other administrative changes to their applications, for example configuring DNS. Allowing teams complete control of their applications will remove any unnecessary delays for them, and means the platform team can focus exclusively on iterating and improving the platform itself.

It needs to run on multiple public clouds. This approach ensures that we avoid being locked into a single provider, so we encourage price competition, while also removing the risk of a single point of failure. Changing infrastructure providers is very difficult to do if you’ve built to a single provider’s specification so this needs to be built in from the beginning.
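One common way to build in this portability is to keep all provider-specific code behind a small interface, so platform code never calls a vendor's API directly. The sketch below is a hypothetical illustration of the pattern; the class and method names are invented, and a real implementation would wrap each cloud vendor's SDK.

```python
from abc import ABC, abstractmethod

class CloudProvider(ABC):
    """The only surface the platform's own code is allowed to see."""

    @abstractmethod
    def create_instance(self, size: str) -> str:
        """Provision a VM of the given size and return its identifier."""

class FakeProvider(CloudProvider):
    """Stand-in implementation; real ones would wrap AWS, GCP, etc."""

    def __init__(self):
        self.instances = []

    def create_instance(self, size: str) -> str:
        instance_id = f"fake-{len(self.instances)}"
        self.instances.append((instance_id, size))
        return instance_id

def provision_app(provider: CloudProvider, count: int, size: str = "medium"):
    """Platform code depends only on the interface, not the vendor,
    so swapping providers means swapping one object."""
    return [provider.create_instance(size) for _ in range(count)]
```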

What we've been doing

We’ve spent a couple of months exploring what a Government PaaS might look like and how it could help teams running digital services across government. We’ve spoken to many potential users, and we’ve worked closely with colleagues in other departments who are addressing similar problems. We’ve found that no existing departmental solution meets all the needs we’ve identified.

We’ve evaluated several open source and commercial options, and we’ve built a prototype and shown it to potential users – developers, web operations engineers and services managers – both within GDS and in other departments. We’ve tested our prototype by seeing how it works with real applications (for example, we tested it using Digital Marketplace and GOV.UK’s Government Frontend).

We’ll write about all of this more in later blog posts.

What we're doing next

We expect to be in alpha until the end of November, by which time we will have completed a detailed comparison of two open source PaaS technologies and addressed issues around performance, security, and scalability, for example. We are really interested in talking to more potential users, so if you are interested in getting involved in our user research, or seeing a demo of what we’ve done so far, please get in touch.

This post originally appeared on the GDS Blog and was co-written with Carl Massa.

A career path for technologists at GDS

24 July 2015

Until recently, career development for technologists at GDS was relatively unstructured, and dependent on what opportunities came up. Over the last few months I’ve been leading on developing a more structured career path for developers and web operations engineers. This blog post describes what that involved.

Working out our values

First, we needed to be clear on what skills we expect someone to demonstrate at each level (e.g. junior, senior). Every organisation has implicit cultural values, and we wanted to draw these out as well as technical skills, so it’s clear what behaviours we want to reward at GDS.

It's really important in government that people are well rounded: for example, having the skills to share externally the things we do at GDS, and the ability to coach and lead colleagues. Making these things explicit is an important part of the exercise, and is one reason why it’s hard to take a career path from another organisation – their values might not reflect yours.

To identify these skills and values we held a series of facilitated workshops with a group of people from our technology community. In order to involve more of our technologists, bob walker, Daniel Roseman and Tom Byers gathered feedback between the workshops from others in the web operations, backend development and frontend development communities.

We also decided at this point that there shouldn't be a frontend/backend distinction; many of the skills are the same, and we want to encourage people to work across the full stack.

Adding more technical skills

The workshops produced an outline of what skills were expected at each level. However, it was quite light on details of technical expertise. It's harder to work that out by committee.

Brad Wright, James Stewart and I – three technical architects who are also senior line managers – listed the technical competencies we expect at each level. We also talked to line managers to see how what we had come up with fitted in with how we were already working on developing people.

At this point we had an alpha version of the career paths document, definitely not formatted as we would like, but with the information we had learned in it. You can download the first version as a PDF: Developer:web op career paths v1.0.

Use of the career path

The main aim of the career path work is to help people clarify where they are in their careers and what to focus on when working out how to progress. The document is meant to be used in a conversation with your line manager about where you want to get to, and what practical things to focus on to help you get there.

The conversation focuses on the four skill areas: technical expertise, delivery, digital evangelism and leadership. So for example, you might be a developer, with technical skills at a senior developer level, but weaker on evangelism and leadership. If your goal is to get to senior developer, we would work out some objectives that would help you develop in the evangelism skill area (for example, blogging, speaking at events) and leadership (for example being a tech lead, line management, or owning a non-technical project).

Too much information!

Once we had version 1.0 of the document, the next thing to do was use the tech line managers as guinea pigs. I pitched the work so far and the conversation template at a meeting of the tech line managers (there were about 18 at the time) and asked them to go through the process with their own line managers and let us know how it went.

This process of user-testing was very useful and we learned a lot.

For example, one thing I had tried to make very clear was that this is not meant to be a tick-box exercise. It should not be the case that if you can show one example of each skill at each level, you automatically progress to the next level. The examples are meant to be an indication of the sorts of things you can be expected to do.

However, including so much detail at each level made it very hard to not do it as a tick-box exercise. Contrary to what we had expected, we needed to include less detail, to make it clearer that they were examples.

It was a useful framework for working out specific areas of development

One important thing we discovered was that even though the document itself still needed further development, the conversations using it were some of the most constructive and useful line management conversations we’d had so far.

I line manage four people, and all four of them came out of our conversation with useful, practical development objectives that they were able to immediately start working on in order to develop their careers as a technologist within GDS.

It’s still a work in progress

We’ve made some changes based on the feedback from the tech line managers, and with input from James Holloway, then a writer at GDS, we produced the next version, which you can also download as a PDF: Career_pathing_for_technologists_v2.0. We’re now using it, and all technologists at GDS have had an opportunity to work out their career plan with their line managers. There are still many areas for improvement, and Alice Bartlett, Edd Sowden and others are currently working on the next iteration.

One interesting point is that we may have gone too far the other way on removing detail. I suspect the way to balance this will be around how we communicate the process, rather than reaching a genuinely “perfect” amount of detail (this is another reason this document may not, as it stands, work for your organisation).

There are also several areas we didn't touch on that will have to be addressed in future. For example, this is not currently related to salary. The career development conversation with your line manager is about where you sit at the moment, where you would like to get to and what steps to take to get there, purely from a skills and career development perspective.

In addition, at the moment this is separate to the annual review process, though the areas for development identified here are really useful for working out development objectives for the annual review process.

It’s definitely been worth doing

Working out our career paths has been a really useful process and we are now doing it for other roles, such as delivery managers and designers. One important thing we’ve learned is that while the finished artefact is useful, the process of how you create that artefact is more important. It has to reflect what you value as an organisation, and this process forces you to be clear about what that is.

GDS started in 2011 with a small team of 12 people, and then grew very rapidly until we numbered 600 in 2014. When your organisation is first getting established, things like structured career development are rarely the top priority. Now we are maturing and people are thinking about staying on for some time, it’s critical to make sure we put these things in place.

This post originally appeared on the GDS Technology Blog.

Tips on writing a technical CV

19 July 2015

In the last few years I’ve been more involved in recruitment and seeing a lot of CVs for developer roles has given me some thoughts on what makes a good technical CV. Here are some tips to improve yours.

Make it relevant!

What I want to know when looking at a CV is whether you have the skills I’m looking for, and if not, whether you show the potential to gain them. I want to know what technologies and practices you know, and what you are doing at the moment. Ideally I’d like the two to be related – if I’m hiring for a Ruby developer (we’re not, at GDS; we want good developers of any discipline) I want to know if you are going to hit the ground running.

So when you are describing your current or recent roles, make sure you highlight the things you are doing that are relevant to the job that you are trying to get. For example if the job spec asks for experience of leading a team, then make sure there is evidence for this in the description of what you’ve been working on. Use your cover letter or statement of suitability to draw attention to these areas. Make it easy for the hiring manager to see that you tick all the right boxes.

Together a CV and covering letter should leave no doubt that you know what is wanted and that you can provide it.

Address gaps and concerns

When you are reading a CV you notice gaps and short jobs, particularly recent ones. The good CVs are ones that address these rather than glossing over them. For example, on one CV I saw, the most recent employment was two years ago. Had they been unemployed since then? No – they had taken some time out to raise a family. Another had only been in their present job for three months. Why were they leaving so soon? They covered this in their letter – unfortunately the company was changing direction due to cash-flow problems. That’s fine – some of the best people get made redundant. But if you don’t explain, it invites the reader to wonder if you failed your probation, or if you changed your mind about working there and might change your mind about working here.

View anything like that through the eyes of the recruiter and offer an explanation, rather than attempting to gloss over it.

It’s also worth addressing any gaps in the required skills. In the example above, if you can’t provide evidence of team-leading from work, can you give other evidence, for example from captaining a sports team or running a Brownie pack?

Do what they asked for

Make sure you supply the requested information. In my case, I was reviewing CVs for a job as a developer at GDS. Our process has now changed but at the time our job adverts said that in order to apply, you need to send “a CV, a CV cover letter, and a written statement of suitability explaining how you meet all of the competencies and specialist skills required”.

A very large number of candidates do not include a written statement of suitability. This is an unusual requirement, but in the civil service it is an extremely important one because if you do not demonstrate evidence of the civil service competencies at some point in the process, we are not allowed to employ you.

OK, so this may not seem very relevant if you’re not thinking of applying to the civil service any time soon. But it is. As a developer, you want to demonstrate that you are able to understand users’ requirements, and a good place to start is with what they actually ask for.

Sell yourself

Essentially that’s what your CV is: a document for selling your skills. So make sure it does that. Say what you have done. Don’t say “I was asked to…” or “My team were given the task of…” Instead, make it clear that you actively seek out and take opportunities, rather than just being handed them.

It can be quite hard, as many of us are naturally quite modest, but it’s good to make it clear that you are a self-starter. It’s also worth phrasing things to show what your contribution to a team’s achievements was. I am interested in what your team achieved, but I am not hiring your team, I’m hiring you, and I want to know what you did to help the team reach those goals.

A profile can be useful

One thing that I find quite useful when looking at CVs is a profile at the top. Just a couple of lines that sum up your current role and what you’re looking for. It’s not essential, but it gives me an overview of what I’m going to read below and whether this person is likely to be a good fit for the role.

You don’t have to say everything!

I don’t really need to read four pages listing every job you’ve had since 1999. What I’m interested in is what you can do now.

Your CV should fit onto two pages. I want the current/most recent job, maybe the one or two before that if you’ve not been at your current place for very long, and if you do a lot of short contract work then maybe a bit more. And then summarise the rest – you just want to show a natural progression.

Similarly, very long lists of technologies are not that useful, particularly if they include things like SVN, Eclipse and Unix. If you are sending your CV to be included in a searchable database, for example for a recruitment consultancy, these will help, but if you are applying for a specific job, it is better to focus on the skills being asked for.

If you’ve worked on a full-stack project, you will have come into contact with a lot of technologies, and if you’re a good developer I don’t doubt you can pick up ones you don’t know quickly. Just concentrate on proving to me that you’re a good developer.

Context is useful

What shape has your experience been? A line summarising what a company’s business is and roughly how big it is is useful. Job titles in tech are very fluid and something like “senior developer” doesn’t mean the same thing from place to place. If you tell me how big the dev team is, then that gives me a bit of context.

And give details. Examples are good. You were a Linux admin; how many servers did you manage? Even better are examples of particular things you did. Don’t just say “I improved performance”, say “the site was experiencing extensive load spikes and I was able to diagnose the cause as X and implement solution Y which led to a reduction of Z%”.

Spend time on the layout

Yes, I know, you’re a developer not a designer. But get a designer to look over it if you can. Grey text, a lot of bold, tiny text – these all make it harder to read. When recruiting for the civil service we read each CV closely even if it is hard to do so, but in the private sector if your CV is too hard to read that might be enough for it to be rejected.

And be aware that it might be printed out.

A word on age

How old you are is irrelevant to how well you can do the job, and since the Equality Act 2010 it has been illegal for employers to discriminate on grounds of age. So there is no need to include that extraneous information in your CV. Don’t include your birthdate. You don’t even need to include dates of your formal educational qualifications, which would allow people to guess your age. Adjectives like “young” to describe yourself are also odd – you are inviting the recruiter to discriminate. Don’t do it.

In summary: make it easy for them to pick you!

It is much more cost-effective to rule people out at the CV stage than at the interview stage. It will probably take around 20 minutes for one person to look closely at your CV. But if you are invited in for interview, that’s a lot of extra investment of time on the part of the company.

Because we have a fair process in the civil service, you can be sure your CV will be properly reviewed. But the private sector is not bound by such rules. If your application is not great, it’s a much harder sell for the company to make the investment in interviewing you.

Essentially, you want to make it easy for whoever is in charge of hiring to choose your CV, because your CV is so obviously relevant to the role they are hiring for. This does mean that you have to revise your CV and cover letter for every job you apply for, but you should be used to that. You’re a good developer, so you are already prepared to do the hard work to make it simple. Right?

This post was edited on 12/12/2018 to add that your CV should be two pages.

How to raise a good pull request

21 April 2015

On our team we always commit code using pull requests, for review by someone who hasn’t worked on that code.

I was recently pairing with the excellent Martin Jackson. He had made a change to use Librarian-Ansible to manage our dependencies; but the pull request was difficult to review because most of the changes were in one commit. I paired with him to help make it easier to review, and he suggested I write up the guidelines I shared.

Write good commit messages

As an absolute minimum, you should use good commit messages. The GOV.UK styleguide on commit messages is a very good summary of how to do this and why.

Essentially, the diff shows you what has changed, and the commit message should tell you why. Ideally, you should make the explanation of why you made that change as if you were talking to someone who is looking at this code in two years, long after you’ve moved on to another project. What may seem obvious to you now, when you are head down in that code, probably won’t be obvious to you next month, let alone to someone else later on.
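As a sketch of that what/why shape (run in a scratch repository; the file name and message here are invented for illustration, loosely inspired by the Librarian-Ansible change mentioned below, not taken from it):

```shell
# Scratch repository for demonstration purposes.
cd "$(mktemp -d)" && git init -q
echo "roles" > Cheffile
git add Cheffile

# Subject line: what changed. Body: why, written for the person reading
# this commit in two years' time.
git -c user.name=demo -c user.email=demo@example.com commit -q \
  -m "Pin Ansible roles with Librarian-Ansible" \
  -m "Previously we vendored third-party roles by copying them into the
repository, which made it hard to tell which upstream version we were
running. Librarian-Ansible records each role and its version in a
manifest, so upgrades become a one-line diff."

git log -1 --format=%s
```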

Do one thing in a pull request

The smaller the PR is, the easier it is to review and so the quicker your change will make it onto master. A good way to keep it small and manageable is to focus on just doing one thing in a PR. The story you’re working on might involve several changes, but if you can, it’s worth splitting them into individual pull requests.

A clue as to when a PR might be doing too much comes when you’re writing the headline for the PR. If you find yourself saying “and” or “also” or otherwise trying to squeeze in a number of concepts, your PR might be better as two or more.

Make the pull request tell a story

When someone is reviewing the pull request, it should tell a story. Take the reviewer along with you. Each step should make sense in a story of how you introduced the feature.

For example, with the Librarian-Ansible change we rebased this commit into this series of commits. Each of those commits is self-contained and comes with a commit message explaining why that step is taken. Taken together, they tell a step-by-step story of how we introduced Librarian-Ansible.

This allows a reviewer to follow along with your process and makes it easier for them to think about what you’ve done and whether the changes you’ve made are the right ones.

For example a reviewer might get to the point where we configure Librarian-Ansible to use something other than the default directory and wonder whether we should instead have changed our Ansible code to refer to the librarian-roles directory. Without the separation of steps into a story, it would be difficult to spot that step at all, and that potential review point would be lost.

Make it a logical story

Ordering the commits so they tell a story can be quite hard to begin with, especially if you’re not sure how the piece of work is going to play out. After a while, you will get a feel for the flow of work and you’ll have a better idea of what small chunks to commit. Until then (and even then) interactive rebase (git rebase -i) is your friend.

Apart from making the PR tell a story, it’s worth rebasing to keep connected changes together. For example, instead of adding some commits at the end that say “I forgot to add this file” or “Also making the same change here”, it will be clearer for the reviewer if you rebase and add those changes to the original commit to keep the narrative. I’ve often reviewed a PR and made a comment like “this change also needs to be made in X”, only to find that has been done in a later commit.
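Git can fold those fix-ups back into the original commit for you. As a sketch (run here in a scratch repository, with invented file names), marking a later change as a fixup of an earlier commit and then rebasing with --autosquash keeps the narrative to a single, clean commit:

```shell
# Scratch repository for demonstration purposes.
cd "$(mktemp -d)" && git init -q
git config user.name demo
git config user.email demo@example.com

echo "step one" > feature.txt
git add feature.txt
git commit -qm "Add feature"

# The change we forgot: instead of a new "forgot this" commit, mark it as
# a fixup of the original commit.
echo "the forgotten change" >> feature.txt
git add feature.txt
git commit -q --fixup HEAD     # creates a commit titled "fixup! Add feature"

# --autosquash reorders and squashes the fixup into its target;
# GIT_SEQUENCE_EDITOR=: accepts the generated todo list without an editor.
GIT_SEQUENCE_EDITOR=: git rebase -i --autosquash --root
git log --oneline              # a single "Add feature" commit remains
```

The reviewer now sees one self-contained commit rather than a commit and its later correction.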

The cleaner and more logical the narrative of the commits in the pull request is, the easier it is for the reviewer to retain the whole context in their head and concentrate on the important things to review.

Provide as much context as possible

Imagine everyone else on the team has no idea what you have been working on. Ideally you want the pull request notification to arrive with a full explanation of what this change is and why you’re making it, so that anyone can pick it up and review it.

Link to the story it relates to (you can also use a webhook so the story or issue is automatically updated). Point to other PRs or issues that are related. Explain what tests you’ve done, and if relevant, what the reviewer can or should test to confirm your changes.

The more context you can provide, the easier the pull request is to review, which makes it more likely to be addressed quickly.

Make sure the pull request, when merged to master, is non-breaking

Ideally each commit should be an atomic, non-breaking change. This is not always possible. However, a pull request is a request to merge your code onto master, so you must make sure that it is complete, in that it is a non-breaking change. It should work, and the tests should still pass once your pull request is merged.
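One way to check this before raising the PR (a sketch, not something the post itself prescribes) is git’s --exec flag, which runs a command after applying each commit of a rebase and stops at the first commit where it fails. Here “true” stands in for a real test command such as “bundle exec rake”:

```shell
# Scratch repository with two commits, each of which should pass the tests.
cd "$(mktemp -d)" && git init -q
git config user.name demo
git config user.email demo@example.com
echo one > app.txt && git add app.txt && git commit -qm "First change"
echo two >> app.txt && git add app.txt && git commit -qm "Second change"

# Replay every commit on the branch, running the test command after each;
# the rebase halts on the first commit whose tests fail.
GIT_SEQUENCE_EDITOR=: git rebase -i --exec true --root
```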

Never commit directly to master, no matter how small the change is

This is a rule on my team, for two reasons: communication and review.

When you have a number of people working on a codebase, you want to communicate to everyone what changes you are making. Raising a pull request is a broadcast to everyone watching that repository, so even team members who have not been involved in that piece of work can keep an eye on what’s going on and changes do not come as a surprise to them. You also do not know what everyone’s experience is, and raising a pull request can sometimes trigger useful input from an unexpected source; committing directly to master would have lost you that opportunity.

As well as making sure the whole team can keep an eye on what changes are happening, raising a pull request also allows the team to maintain a level of quality through code review. Some changes are so tiny that they (probably) don’t need a review, but by making a rule that you never commit directly to master, there’s no chance that something that should have been reviewed will slip through the cracks.

An example of a good pull request

Take a look at this extremely good pull request raised by the excellent Alice Bartlett.

Alice did the work first, and then pulled out the changes one-by-one to a new branch to make it clear. While doing the work she refactored some code but in the final PR she has put the refactoring up front, to clear the way for the change she wants to make. This makes it much easier to review because the changes don’t then clutter up later commits. There is also a lot of detail in the overview.

Raising pull requests like this takes time, but it is really worth doing; it makes it clear for your reviewers, the rest of your team, and for the programmers who will be working on this codebase long after you’ve forgotten about the changes you made today.

Staying technical

29 March 2015

A few weeks ago I wrote a post about being a technical architect, and one of my main conclusions was that you need to stay technical. But how?

Phil Wills had a suggestion for me:

At QCon, I asked him more about this. Did he mean all recurring meetings? What about retrospectives? What about stand-ups? We ended up having a mini Open Space of our own, joined after a while by a couple of others, and then Phil proposed a session on ‘how to stay technical as a senior techie’ at last Friday’s Scale Summit. Here are some of the thoughts I got from that session.

Being interruptible is part of the job

Whether by “the job” we mean technical architect, manager, or senior developer, there was broad agreement that being interruptible is an important part of it. You are in that position because you have the skills to help others and it might be that interrupting you saves other people a lot of time. Twenty minutes of your time on a problem you’ve seen before could prevent someone else spending two days stuck down a rabbit hole, for example. So you need to learn how to be interrupted: how to minimise the negative effect interruption has on your own work and enjoyment of it.

Not losing context

A big problem with being interrupted is the loss of context, and we talked a bit about how to mitigate that. One suggestion that comes up frequently is blocking out time, e.g. core pairing hours. This may not work if you can’t always pick the times of meetings you need to attend but can certainly be useful when you can.

Another brilliant suggestion was ping-pong pairing. The reason this helps is that if you do get interrupted, your pair can carry on without you – this is not ideal, but if the interruption is unavoidable it does mean that the work continues and it’s relatively easy for you to reload the context when you return.

Other suggestions included working on smaller pieces of code, for example bugs in your issue tracker, rather than large features; and making sure that the code you write is such that you can keep it in your head. Finally, working on supporting tools that aren’t on the critical path can be a good way to keep coding without potentially blocking the team’s progress.

Staying current

Staying technical is not just about writing code, and we talked about some other ways of staying current. Some people found doing Coursera courses useful, though this does require a huge time commitment, usually outside of working hours, so won’t be sustainable for many people. Reading code, rather than writing it, was suggested, along with keeping up with what people are working on in the ticketing system.

One person talked about the company organising a regular Coding Dojo which seems like a great way to keep everyone current on what is being worked on. Another way to do this is mobbing, or mob programming.

The main way I stay current, which I don’t think was mentioned in the session, is via Twitter and some newsletters like Devops Weekly.

Leadership is not just about knowing the latest technologies

We finished up with a really interesting discussion around why we were all so concerned about staying technical, and it turned out that many of us were worried about losing what it was that had made us senior in the first place.

People pointed out that being senior is not necessarily being able to write perfectly in whatever cool new language recent graduates are using, it’s reasoning about problems. You are in a position to use your knowledge of similar problems to cut to the heart of the matter, and you can use your experience to ask the right questions.

Practical suggestions included using Socratic questioning to draw out problems people might not have thought about yet. “What do you think the solution is?” “Why?” “What would you do about X?”. You also should not be afraid to say when you don’t know, and “let’s work it out together”.

But for me, the take-home message here was best summarised by Robert Rees, who I quote with permission: “The fear is that you lose authority as you lose expertise, but actually that’s not true: authority comes from having delivered many projects.”

I found the discussion really useful and have some practical tips to take away. It was also nice to know that there are many others facing the same issues as me. Thanks to Phil for proposing it and to all who took part!

What is a technical architect?

01 March 2015

I have wanted to be a technical architect since I started out in IT, and last September I was delighted to achieve this long-held goal. But very quickly I realised that, while it was clear to me that the role of a technical architect is overseeing the technical aspects of large software projects, I really wasn’t sure what that meant I should be focusing on day-to-day. So I turned to some of the brilliant technical architects I know, both colleagues and elsewhere in the industry, for their advice.

Being a technical architect

In my first job as a developer, several years ago, I sat down with an architect I really respected, and asked his advice on how to get there. The diagram he drew for me in that conversation is the career plan I’ve been working off since then:

But now I am here it turns out there is a lot of variance. Different technical architects do completely different things. Some are hands-on with a project or small number of projects, writing code. Some have a consulting role. Some are inward-facing to the organisation, some outward-facing.

I know that I definitely want to stay technical, so I asked specifically about that and also for more detail of what the role involves. I focused on skills over and above my previous role as senior developer; for example, problem-solving, analytical skills and being a technical authority are all part of the developer role as well. I haven’t included people’s names as these conversations all took place in a series of private chats and emails.

A tech arch does what needs to be done

One thing that came up from almost all of the people I spoke to was that the technical architect does what needs to be done to move things forward. The role was variously described as “being a grown-up”, “the project shaman”, and “the rare responsible person”.

Of course, that could also be the description of the role of delivery manager and product owner and also, in some ways, any other member of the team. So one important thing I’ll have to think about is how to do what needs to be done to get the thing to work, without taking on too much of the non-technical aspects of that.

You need to stay technical

I definitely want to stay technical, and in any case some of the people I spoke to felt this wasn’t optional – you have to stay technical to do a good job. “You need to be coding as an architect. Try and spend as much time as you can on writing code.” This person also said “when things are going right, a tech lead and a tech arch are the same thing” and recommended Pat Kua’s series on technical leadership.

Staying technical while also doing whatever needs to be done to get the thing delivered is going to be tricky. One person advised me “the management stuff can eat up all the time if you let it, so make the time to step back and plan or engineer situations that will force you to do some ‘maker’ stuff.” One suggestion was that in a situation where I am overseeing a number of projects, I could go and visit one of them and then spend the rest of the day coding in one of their meeting rooms.

Someone also suggested that it’s good, if you can, to avoid regular, fixed commitments: “if you have a lot of steady state commitments, your ability to spend time on the most pressing issues tends to be squeezed.”

More specific than staying technical is making sure you take the time to work alongside the people actually doing the work. One person who had done some job-hopping from technical architect to developer and back said that one thing the job changes had reminded them of is “how frustrating it can be working at the developer level with decisions being made ‘elsewhere’ in ‘smoke-filled rooms’. Working at that level a little may help you to keep in touch with these sort of issues”. Though you have to take care that people don’t treat you with kid gloves.

Responsible for communicating and enabling the technical vision

There were a lot of useful general comments on what the core skills of a technical architect are. It’s about owning the overall vision, establishing a common understanding of what the team is aiming for, and communicating it.

The technical architect should be paying attention to whether we are building the right thing, the right way, and the long-term vision. This might involve requirements, and it might involve a roadmap, but ultimately, the architect – as one person put it – should be “fundamentally lazy”; an enabler. Not command and control; rather communicate the long-term vision and let the team figure it out.

And the most important skill of a technical architect is the ability to find simplicity in complexity.

Influencing people

Many people point out that a technical architect’s role includes influencing people. As in a tech lead role this may well include stakeholder management, persuasion, and maybe even convincing people to make big changes in the way they work.

One person had some very practical advice on this: “A conclusion from my initial architect days was that influencing people and coaching them is a major part of the job so I needed to work on it. The former doesn’t come naturally so I still have to make a conscious effort to take opportunities to talk to people without a specific purpose in order to form a relationship that will help us work together better later. People are far more of a challenge than computers! As someone said recently, if you’re looking for a complex adaptive system to work with, try people!”

Say no to things you really want to do

One skill that I’ve had to work hard to develop is learning to say no. Initially, you progress in your career by volunteering for a lot of things. You gain a breadth of experience, leadership skills and get to work on a lot of varied, interesting stuff. However, as people start to recognise you as someone who can get things done, you will be asked to take on more, and you need to learn to be selective so that you can give proper attention to the things you do agree to.

I thought I was quite good at that, but in my new role, I am going to have to take it to another level. I am going to have to start saying no to things that I really want to do. One person put it like this: “You will get lots of opportunities to do things that you would be really good at, that you would enjoy and you know that you would knock it out of the park; and you need to be able to say no to those things that don’t meet your goals.”

As a technical architect you are more in charge of your own time

Someone said “as a tech arch, you are largely in charge of your own time, so decide what you want to do and do it”. As noted above, a good thing to do is to carve out time for coding, design, or just thinking about a problem.

That doesn’t seem to reflect my current situation though. While it’s true I’m in charge of my time to the extent of how I help the team reach our goals, there seem to be a number of non-optional things I have to do in order to get there. A lot of that involves making sure to talk to people, and my days at the moment tend to be back-to-back meetings. Here is tomorrow:


And most of last week was the same.

In part, this is a function of the stage my project is at; we’re just starting out so still need to communicate our initial goals and build the team. But I think it also indicates that there are some skills that I need to develop as a technical architect.

Luckily I have a great team and I’m confident that we can work out how to give everyone the room to do what they need to do, so my immediate next step will be working with the team on carving out the time for me to stay involved in the technical aspects of the project. I shall report back.

How we moved vCloud Tools from Coding in the Open to Open Source

19 December 2014

At GDS, most of our code is publicly available in our alphagov organisation on GitHub. We call this “Coding in the Open” rather than “Open Source”. As James explained in a previous blog post this is because for most of our software, we are not in a position to offer the support that true Open Source requires.

However, we do also have some Open Source Software, in our GDS Operations organisation on GitHub. When we started building vCloud Tools, they were “Coded in the Open”, and we wanted to make them into Open Source Software. Here are the steps we took to do that.

Ensuring the project has a long term home

For the first several months of vCloud Tools’ development, the four of us working on vCloud Tools were a separate team with a separate backlog. However, once the tools were ready to be used, we wanted to bring them back into the GOV.UK Infrastructure Team. Mike Pountney and I rejoined the Infrastructure Team with the vCloud Tools backlog, and our colleagues who hadn’t been working on vCloud Tools started picking stories up.

Initially this was mostly pair programming in order to share the knowledge. A couple of people who were particularly interested took more ownership, but everyone in the Infrastructure Team contributed, including the Head of Infrastructure, Carl Massa, who merged the change that published vCloud Tools to RubyGems.

Releasing each tool as a Ruby gem

Packaging the code as a Ruby gem makes it easier for others to get started with using each tool, using a commonly installed packaging tool. It also ensures that the code follows a relatively consistent structure. Most people used to working with tools developed in Ruby will quickly know how to install it, how to use it, and where to start if they want to make modifications.

I explained as part of a previous blog post the steps we took to release each tool as a gem.
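As a rough sketch of what that packaging involves, a gem is described by a gemspec. Everything in this example – the name, version, files and executable – is invented for illustration, not taken from the real vCloud Tools gems:

```ruby
# Illustrative gemspec: all values here are placeholders, not the real
# vCloud Tools metadata. Assigned to a constant so it can be inspected.
SPEC = Gem::Specification.new do |s|
  s.name        = "vcloud-example-tool"
  s.version     = "0.1.0"
  s.summary     = "Example command-line tool for vCloud Director"
  s.authors     = ["Government Digital Service"]
  s.files       = Dir["lib/**/*.rb"]   # library code shipped in the gem
  s.executables = ["vcloud-example-tool"]
  s.license     = "MIT"
end
```

With a spec like this in place, “gem build” produces the package and “gem push” publishes it to RubyGems, which is what makes the familiar “gem install” workflow possible for users.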

Learning from others’ experience

We already had several Open Source projects, mostly Puppet modules and our own Open Source guidelines, but we were interested in learning from others who knew about maintaining larger projects. One of the main things I learned while talking to people with experience in this area was that when running an Open Source project, you want to optimise for bringing people into your community.

Because of that, we made some changes, one of which was to add contributing guidelines with the specific aim of being helpful to new contributors. We wanted to make it easy for people who were unfamiliar with what would be involved, so we included explicit detail about how to raise a pull request and what we are looking for. You can see from the contributing guidelines on vCloud Core that we say pull requests must contain tests, but we offer to help users write them.

To make it easier for contributors and reviewers to see the quality of submissions, we also added Travis configuration so that lint and unit tests are run against all pull requests.
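For illustration, a Travis setup of that shape looks something like the following – this is a generic sketch (the Ruby version and rake tasks are placeholders), not the project’s actual configuration:

```yaml
# Illustrative .travis.yml: versions and commands are placeholders.
language: ruby
rvm:
  - 2.1
script:
  - bundle exec rubocop   # lint
  - bundle exec rspec     # unit tests
```

Travis runs the script against every pull request and reports the result back on the PR itself, so reviewers can see at a glance whether a submission passes.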

Guidelines for communication with contributors

At the same time we also set some ground rules amongst ourselves, for example how quickly we should make sure we respond to pull requests. We drew up some guidelines for reviewing external pull requests. These were on our internal operations manual, but I have now added these to our public styleguide for reviewing PRs.

Putting vCloud Tools with our other Open Source projects

As I mentioned, we have other Open Source projects in our GDS Operations organisation, and once vCloud Tools had met all our Open Source guidelines, we wanted to move the tools to this organisation.

Since there are six gems and they are related, we discussed moving them to an organisation called vCloud Tools. This would have some advantages, for example it would make it easier to add core maintainers who do not work at GDS. However, after some discussion, we felt that it was important to keep the link to GDS so that people who wanted to find out what work government was doing would be able to find their way to vCloud Tools along with other projects. This is also in keeping with what other organisations who have a lot of Open Source do, for example Etsy and Netflix.

We also created a website to showcase our Open Source, with a lot more detail on vCloud Tools.

Moving issues to GitHub

Another thing that came out of early discussions with other Open Source practitioners was the importance of public issue tracking. We had features and bugs in our own issue tracking system, but having them on GitHub issues is better for users because it means they can find them. For example, if a user experiences a bug, they can search for it and may find that we’ve fixed it in a later version or at least that we are aware of it. It also has the advantage that people might fix our issues for us!

At this stage we also added a ‘newcomer friendly’ label, for issues that don’t involve an in-depth understanding of the tools and don’t require access to a vCloud Director environment. This is to offer people an easier way to get involved in contributing.

Encouraging early users

Because we had been coding in the open from the start, we already had some early users. This was really useful, because we found they picked up things we had missed. For example, people coming to the project for the first time will notice onboarding details that you may have forgotten, and can help you clarify your documentation.

Users of your software also help by finding issues that you might have overlooked. For example, one of our early users noticed a discrepancy between our documentation and our code which turned out to be a bug. This was really useful because the bug was in a feature that we don’t use on GOV.UK, so if they had not raised it, we might never have noticed it ourselves.

In order to encourage users, we visited and spoke to teams both inside and outside government. This early contact helped us make sure we were focusing on features that would be useful for more than just ourselves, and in some cases our users even added these features themselves.

Sharing our experience with others

We’ve worked closely with some of our colleagues in government, and currently teams at the Ministry of Justice, HMRC and DVLA as well as external companies are using vCloud Tools for some of their provisioning.

However, visiting individual teams wouldn’t scale for every team that’s interested, so in order to share more widely we started talking a bit more about what we were doing with vCloud Tools publicly. We wrote some blog posts, including one by Matt Bostock which shows how you could use vCloud Tools to provision your environment. Matt and Dan Carley gave a talk at the Ministry of Justice, and I gave two talks at conferences.

What happens next?

We still have work to do. One thing we’ve found difficult is when a contributor raises a pull request for a feature that is not something we need at the moment. There is work involved in reviewing the code and testing it, and when the feature is not a priority for GOV.UK we’ve found that it’s hard to make a case for doing that work ahead of other things we need to do.

We are really keen to support our Open Source projects, and including contributors’ code even when it’s not something we will use immediately helps the tools be more robust and useful. We’ve discussed some ideas around this, and our current plan is to allot some time every week specifically for working on Open Source, so that those kinds of features will be a priority during that time. We will see how we get on and report back.

We’re also aware that a wider range of contributors leads to other considerations and what that means about things like how we license code is another post in itself. Watch this space, and as always, your comments are very welcome.

This post originally appeared on the GDS Technology Blog.

How do I get more women to speak at my conference?

08 December 2014

Last year I wrote about how anonymous submissions affected gender balance at SPA conference. Short answer: not hugely, but it did have some other positive effects. This year, however, we did have more women speaking at the conference than previously. Here are some suggestions for how to replicate this at your conference.

The stats from SPA


8/39 = 21% of the session leaders were women.

2/14 = 14% of the solo session leaders were women.

3/8 = 38% of the people running more than one session were women.

If you compare this to the previous three years of SPA stats, this is both more women speakers and a higher proportion of the total speakers. In addition, unlike the three previous conferences I’d looked at, some women ran more than one session at the conference. In fact, three women did this.

I recently received an email from a friend organising a conference:

Hi Anna,

I'm looking for some help in getting more women speaking and participating at ${CONFERENCE}.

I've (again) promoted the Call for Speakers using (most of) the resources mentioned on the CallbackWomen site. I'm also promoting it through local women in technology groups.

I'm also using an anonymised review process and am (as always) being encouraging to first-time speakers.


These are all really good things to do, and I recommend any conference organiser does the same, but I did have one more suggestion, and that is outreach.

The best speakers may not realise it

As anyone who has organised a conference knows, there is little correlation between those who are most keen to submit and those who run the best sessions.

When we announced the call for proposals, as well as tweeting it and sending it to various mailing lists, I and my co-chair set about writing individually to people we thought would propose good sessions.

One name that was suggested to me was that of Sandi Metz. At the time, I wasn’t aware that she was the author of the extremely good Practical Object-Oriented Design in Ruby, but I looked at her website and the courses she was running sounded excellent, so I emailed her in the hope that she might be available.

She got back to me, but she was uncertain about submitting to SPA. Not because, as you might think, it wasn’t a big enough deal conference for her, but because she was uncertain that her material was appropriate for the conference. I quote from her original email (with her permission):

"I hear that SpaConf is a place for leading edge, new thought, experimental stuff, and I feel like I'm doing simple, straightforward, intuitive, well-known, obvious OOP".

I wrote her an impassioned email back explaining why I thought that topic would be brilliant at SPA (for example, it’s only obvious when you know how…) and she decided to submit.

Her session got top marks from all reviewers, was one of the best attended, and when the feedback came in, was rated the highest. I went to the session and it was really excellent. And amazingly, Sandi thanked me for persuading her to attend. Later, she said “I would never have submitted to SPA; you made this happen.”

You might think someone as successful and awesome as Sandi would realise that her session would be amazing. But that wasn’t the case. She explained, “I imagine that things that seem obvious to me are already known by everyone else”. And there are plenty of other excellent people out there for whom that is also true.

You have to encourage women to submit

Women are socialised not to put themselves forward (here is a fascinating study on one way to address this), whereas men are more likely to put themselves forward regardless of competence (Clay Shirky describes this useful skill as being an “arrogant, self-aggrandising jerk”).

This won’t translate to no women submitting; many do. But it will be worth your while approaching the women you would like to submit. And not just women but other under-represented groups. People of colour, differently-abled people, really shy people, for example. People who may have bought the hype that your conference is only for the cool newness, or who lack confidence, or who aren’t going to put themselves forward for whatever reason. Don’t optimise your process for people who think they are amazing.

This is borne out by my own experience. Ivan Moore asked me seven or eight times over a series of months “when are you going to submit to SPA?” before I finally submitted my Data Visualisations in JavaScript session. It was really successful, and I have since done several other conference talks.

Don’t just look for people who are already on the conference circuit. Ask women you know, people you work with, etc. Everyone has an interesting story to tell about what they are working on, or are researching, or their hobby, or what they are expert in.

You may have to put more work into encouraging those people to submit. It was not an insignificant amount of effort on my behalf to get Sandi to submit, but it was so worth it.

You need anonymity in selection to back this up

It’s important to note that if you are reaching out to people to ask them to submit, rather than directly to invite them to speak, you need anonymity to back this up. To quote from the article that inspired me:

If you go around encouraging people that are usually under-represented at your event to submit to the CFP and promise an unbiased selection, it ensures they don’t feel that they’ve been picked because of a particular personal feature, rather than the content of their proposals.

Our process at SPA remained the same in 2014 as it was in 2013. Read my original post for full details, but submissions remain anonymous until the point a draft programme is produced.

A not-exhaustive note on how to ask

I like your work on X. Y is relevant to our attendees. I saw your talk on Z at ${MEETUP} and think something similar/the same talk would be brilliant at our conference.

Not “our conference needs more women/people of colour/LGBT speakers”. I am already going to imagine that’s the only reason you asked me. It would be nice to know that it’s my work you’re interested in.

A shout out to Andy Allan, who organised Scotch on the Rocks. A slightly different scenario, as he invited me to speak rather than asking me to submit, but his handling of it was flawless.

If you are trying to encourage people new to speaking, these resources might help.

In summary

Seek out women and other under-represented groups who do great work in our field. They will not be hard to find. Pick out some examples of their work that show they’d be good for your conference. Contact them and ask them to submit. Be prepared to put some work into encouraging people.

This does take a bit more time and effort than just tweeting your CFP and leaving it at that. But if you want to put on a great conference – and why else are you bothering? – it is worth putting the effort in to seek out the people who have something interesting, new and exciting to say.

Build Quality In

03 September 2014

I am a published author! I have written a chapter on DevOps at GDS for the book Build Quality In. 70% of the royalties go to Code Club so do think about buying the book. There's a Q & A with me here.

Some thoughts on preparing an Ignite talk

07 July 2014

I did my first Ignite talk in November last year, based on roof bug-fixing, and these are some of the things I found useful while preparing.

Launch right in

I always start talks with some kind of anecdote or fact to get the audience interested, rather than a bio. Why should we care who you are until we know whether you’re worth listening to? So usually I introduce myself around the 2nd or 3rd slide.

But Russell Davies, who was incredibly helpful when I was preparing the talk, suggested I go one further – just keep telling the story. When he pitched it to me he said “it would be a brave move…”, which I took as a challenge I had to accept.

You will lose the beginning and the end

Scott Berkun’s Ignite talk on giving Ignite talks is very worth watching if you are considering giving an Ignite talk.

The main practical thing I took from it was that you will probably lose most of your first and last slide. Every time I did a practice run-through, I waited at least a few seconds before starting the first slide.

It was a good job I did. On the night, they had been playing music as people ascended the stage, but for some reason they started mine when I was already on stage, about to start speaking. Because I’d practiced not starting immediately, it didn’t throw me, and it didn’t put me behind from the beginning.

Don’t be too tied to your slides

More great advice from Russell: don’t be too tied to your slides. Just talk, and if they coincide then it looks like you’re a genius, and if they don’t it doesn’t matter.

Do real practice

Another incredibly useful piece of advice I got from Jason Grigsby’s post about his Ignite talk was to do real practice. This is really important. When you make a mistake in a practice, then carry on – practice recovering. You will learn how to improvise until you get back on track – which is one of the most important skills for a successful Ignite talk.

They want you to succeed

People are at Ignite talks to have fun. It’s a great crowd, like doing a Best Man’s speech (yes, I’ve done one of those too) – the audience want you to do well. That, or fall off the stage.

It’s an excellent constraint

The format of Ignite talks is a good constraint to encourage creativity and I really recommend you give it a go. I enjoyed it a lot.

When it was finished I’d made quite a few mistakes and missed out things I’d wanted to say, but on balance, I think it went pretty well. Judge for yourself.

How I keep alphagov/fog updated...

16 June 2014

I refer back to this useful email from Mike Pountney pretty frequently, even though it’s easy enough to look up, so thought I’d save it here.

git checkout master &&

git pull &&

git fetch upstream &&

git merge upstream/master &&

git push
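If you find yourself typing this sequence often, one option (my suggestion, not from the original email; the alias name is invented) is to save the whole chain as a Git alias:

```shell
# Save the sequence above as a "sync-upstream" alias (hypothetical name),
# so the whole update becomes a single command.
git config --global alias.sync-upstream \
  '!git checkout master && git pull && git fetch upstream && git merge upstream/master && git push'
```

After that, `git sync-upstream` runs the whole chain, stopping at the first step that fails.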

Using Git to refactor vCloud Tools into separate gems

04 June 2014

When I started working on vCloud Tools, we had most of our functionality in one repository. There wasn’t a clear separation of concerns – the code was tightly coupled – and it also meant that a user who only wanted to use one tool to do one job had to install the whole thing. So I pulled out each functional area into an individual Ruby gem.

Rewriting history

When creating the repository for each new gem, I didn't want to lose all the commit history; it is very useful documentation. If I had started each gem from scratch, the commit history would remain in the vcloud-tools repository, but I prefer not to have to go somewhere else to see the history of what I'm working on right now.

I also didn't want to create each directory side-by-side and then delete the unnecessary code, as this would make each repository much larger than it needed to be (as Git stores all the history) and would essentially be several duplicated projects with different bits deleted.

What I really wanted was to go back in time and to have started with this structure from the beginning.

Luckily, Git provides a mechanism to do this: git filter-branch.

Creating the new repository

To get started, I cloned the existing vcloud-tools repository locally:

git clone --no-hardlinks /path/to/vcloud-tools-repo newrepo

You need the --no-hardlinks flag because when cloning from a repo on a local machine, files under .git/objects/ are linked to the original to save space, but I wanted my new repo to be an independent copy.

I then deleted the remote in newrepo. I didn't want to push my new, pared-down repo over vcloud-tools.

git remote rm origin

Deleting irrelevant code

Having made a new repository, I then pruned away any code that was unrelated to that tool. So for example when pulling out vCloud Core, I pruned away all directories that didn't contain vCloud Core code.

For this, I used a tree-filter. This checks out each commit and runs a shell command against it, in this case rm -rf c, where c is an irrelevant directory or file.

git filter-branch --tree-filter "rm -rf c" --prune-empty HEAD

Because it's checking out each commit, it takes some time to do it this way (though it speeds up, as using --prune-empty removes commits that are left blank after the shell command does its job, so the total number of commits decreases as you progress through the task).

This command actually allows you to use any shell command you want, but I found that deleting things I didn't require one-by-one, while time-consuming, meant that I picked up some things that had been missed, for example files in the wrong place and tidy-up that needed to be done.

Tidying up

After each time you run this command and prune away files or directories, you need to do some cleanup. (I just wrote a little shell script and ran it each time.)

When you run git filter-branch, a .git/refs/original directory is created to allow for a restore. The old objects will be retained unless you remove these references:

git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d

These are usually cleaned up by Git on a scheduled basis, but because I was going on to remove other folders, I wanted to expire them immediately, and then reset to HEAD in case that had changed anything.

git reflog expire --expire=now --all

git reset --hard

Then, I forced garbage collection of all orphaned entities.

git gc --aggressive --prune=now

The final line of my shell script just output the size of the .git folder so I could see it getting smaller as I pruned away unneeded code.

du -sh .git
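Put together, the whole prune-and-cleanup cycle looks something like the following self-contained sketch. The repository and directory names here are invented for illustration, and newer versions of Git print a warning before filter-branch unless FILTER_BRANCH_SQUELCH_WARNING is set:

```shell
#!/bin/sh
# Sketch of the filter-branch + cleanup flow described above, in a
# throwaway repository (names are invented for illustration).
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1  # newer Git warns before filter-branch

cd "$(mktemp -d)"
git init -q repo && cd repo
git config user.email "demo@example.com"
git config user.name "Demo"

# A repo with one directory we want to keep and one we want to prune away
mkdir vcloud-core other-tool
echo "core" > vcloud-core/core.rb
echo "other" > other-tool/other.rb
git add . && git commit -qm "initial import"
echo "more" >> other-tool/other.rb
git add . && git commit -qm "change only to other-tool"

# Prune the irrelevant directory from every commit; commits left empty
# by the deletion are dropped entirely
git filter-branch --tree-filter "rm -rf other-tool" --prune-empty HEAD

# Cleanup: drop the backup refs, expire the reflog, garbage-collect
git for-each-ref --format="%(refname)" refs/original/ \
  | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git reset --hard
git gc --aggressive --prune=now --quiet

# Report the size of the .git folder
du -sh .git
```

Because the second commit only touched the pruned directory, --prune-empty removes it, leaving a single commit and no trace of the deleted code in the history.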

Important warning!

You need to be extremely careful when rewriting history. It is very important not to do this on a public repository unless you have a very good reason, as it makes everyone else’s copy of that repository incorrect. So I waited until it was finished and I was happy with my local version before pushing it up to a new repository.

Applying gem structure

For all tools other than vCloud Core, the first thing I had to do was redo the directory structure.

I also had to move the file that loads the dependencies, and during the pruning process it became clear that we had a lot of dependencies at the wrong level, or not required at all. Deleting code is very satisfying!

I then added the required files for a gem, for example a gemspec, a licence. At this point, I also added a CHANGELOG to help us move the tools to open source.

Some interesting things about Git

I discovered some new things. For example, on a case-insensitive filesystem (such as the OS X default), Git refuses a rename that only changes the case of a file name.

git mv spec/vcloud/fog/Service_interface_spec.rb spec/vcloud/fog/service_interface_spec.rb

told me:

fatal: destination exists, source=spec/vcloud/fog/Service_interface_spec.rb, destination=spec/vcloud/fog/service_interface_spec.rb

You need to force it with the -f flag.
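Here is a minimal self-contained demonstration in a throwaway repository (the file name mirrors the example above; everything else is invented for illustration):

```shell
#!/bin/sh
# Small demo of forcing a case-only rename with git mv -f.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p spec/vcloud/fog
echo "# spec" > spec/vcloud/fog/Service_interface_spec.rb
git add . && git commit -qm "add spec with wrong case"

# Without -f this fails on a case-insensitive filesystem;
# with -f Git records the rename on either kind of filesystem.
git mv -f spec/vcloud/fog/Service_interface_spec.rb \
          spec/vcloud/fog/service_interface_spec.rb
git commit -qm "fix case of spec file name"
git ls-files
```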

Also, you can copy commits from another repository, as if you were using git cherry-pick to copy from a branch in the same repository, by creating a patch and applying it locally.

git --git-dir=../some_other_repo/.git format-patch -k -1 --stdout | git am -3 -k

Then I published the gem

To enable our continuous integration, I added a Jenkins job and a Jenkins hook to GitHub so that a commit to master will trigger a build in Jenkins.

Once I was happy that everything was present and correct and it was ready to publish, I added a gem_publisher rake task, and then included that in Jenkins. This means that when a commit is merged to master, if the tests pass and the version has changed, the new version is automatically published to RubyGems.

Ta-dah! vCloud Core.

Finally, I made a pull request on vCloud Tools to remove it.

Pulling out all gems

Over a couple of months I pulled out all the gems and the architecture of vCloud Tools now looks like this:

hand-drawn diagram of vCloud Tools

This approach follows the UNIX philosophy of simple tools that do one thing, which together can form a toolchain to do more complex tasks. vCloud Core now takes care of the interactions with fog and the vCloud API, and the other tools depend on that. vCloud Tools is now a meta-gem that pulls in all the other gems as dependencies. This has definitely made it easier to develop on and use vCloud Tools, and I learned a lot about Git and Ruby gems along the way!

This post originally appeared on the GDS Technology Blog.


17 May 2014

At Scottish Ruby Conference earlier this week, among the many great talks I attended was Jess Eldredge’s excellent introduction to sketchnoting.

I had liked the idea of sketchnotes ever since I came across the first ones I saw, by Amanda Wright, but had never properly tried it. I do make notes in a fairly visual way – here, for example, are my notes from Kerri Miller’s excellent talk on concurrency.

concurrency notes

The main thing of course is that my handwriting is atrocious.

I was right at the front in Jess’s talk and felt self-conscious, so the notes I made were actually worse than my usual standard.

original sketchnotes notes

However, I was inspired, and also happened to have a nice pen with me, so tried to do a sketchnote in my last talk of the day, André Arko’s very interesting Development was the easy part.

tip of the iceberg notes

Not amazing. But better than my usual notes.

What I like

Jess’s advice was very practical and could be immediately applied. For example, the title is the bit that looks really good, and that’s because I got there early as she suggested and wrote it before the beginning of the talk.

Her main advice for bad handwriting was to slow down – and I did, and it really helped, in that I can actually read this, unlike my usual scrawl.

I did find it made me concentrate a bit more: for everything André said, I was trying to figure out what the key point of that section would be, so I could write that down.

What I didn’t like

However, I clearly ran out of space. Data stores in the bottom right (above request time) was the last section he covered. The bit I missed out was that redis-sentinel runs an election and while it’s running, all candidates accept writes, but then when one is elected, all the unelected ones throw away their writes.

It’s a shame that this is the one thing I was unable to write clearly, as it was the only thing he said that was completely new to me. That made me realise I was worrying too much about how the notes would look to other people and whether they’d be able to follow them, and so I was writing down things I already knew. Whereas one point Jess made several times was: these are my notes, and they don’t need to make sense to anyone else.

Also, a lot of things I note down in talks are actually items for my To Do list, either unrelated if my attention is wandering or, if I’m engaged, how I can apply these things in my projects. I probably don’t want to capture all of that for posterity.

So what did I learn?

It’s worth taking a nice pen and some plain paper to a conference, because at least I can make notes that will be easier for me to read when I’m looking back at them. I might even buy an artist’s pen, as I really like the shading Jess did.

If I slow down, my handwriting is almost legible!

It’s fun. I always fancied doing the bits of comic-book drawing that didn’t require artistic talent, like cross-hatching – this is my chance!

Another go

I decided to write up the notes from Jess’s talk again:

better sketchnotes notes


Building tools to provision our machines

07 May 2014

Over the last few months, I’ve been leading a small team building tools to automate creation and configuration of our servers.

Tools to automate provisioning

Currently, our production environment is hosted with Skyscape, and we manage it using VMware vCloud Director. Skyscape, and other providers using vCloud Director, expose a UI for managing VMs and networks.

However, a UI isn't suitable when you need to bring a lot of machines up in a repeatable, reliable way, for example when migrating the site to a new platform. In addition, if you can automate provisioning, the configuration files represent your infrastructure as code, providing documentation about your system.

VMware vCloud Director supports the vCloud API, but there was no widely-available tooling to orchestrate this kind of operation using the API. So we built vCloud Tools.

Our suppliers currently all use vCloud Director

The UK Government’s policy is cloud first and we procure services through CloudStore (which uses the G-Cloud framework). As government, it’s vital that we provide secure, resilient services that citizens can trust, and we need various assurances from our suppliers to support that.

There are a number of Infrastructure as a Service (IaaS) platforms that are able to provide these assurances, but when we procured our current suppliers last year, the vendors that met our requirements (both technical and non-functional, eg self-service provisioning, a pay-as-you go cost model) used vCloud Director.

We very much want to encourage potential new suppliers, but given the availability in the market, we can be reasonably certain we’ll be using at least one VMware-based solution for at least the next 12-18 months, and it’s likely that many other transformation projects will also be drawing on the same pool of hosting providers. So it’s definitely worth investing time to make provisioning easy.

Preventing vendor lock-in

Having an easy way to provision environments will allow us to move more easily between vendors that use vCloud Director, which will help prevent supplier lock-in. But we also don’t want to lock ourselves in to VMware, so we are taking several steps to guard against that:

Previous iterations of provisioning tooling

Automation of provisioning is something we’ve been iterating on since we launched GOV.UK 18 months ago. Prior to vCloud Tools, we used a tool we had built called vCloud Provisioner. This was really useful when moving onto our first platform, but because it was built quickly it has a lot of hard-coded detail about the environment, so it doesn’t help us be flexible when moving between suppliers. In addition these hard-coded details include sensitive information, so we can’t share it, meaning it cannot be useful to anyone outside of the GOV.UK team.

Several members of the infrastructure team worked on subsequent iterations of the provisioning tooling. These iterations included vCloud Box Spinner and vcloudtools. However, the people working on these were building these tools alongside their other business-as-usual work keeping GOV.UK up and running, so when they ran into issues, it was difficult to find time to address them. With the migration to the new platform looming, we needed to prioritise this piece of work in order to be able to bring up a new environment quickly, reliably and easily.

We think vCloud Tools will be more robust

There are several things we have done to improve the chances of producing robust, more widely useful tools in this iteration.

We have committed the time and resources to this, forming a small team who focus entirely on vCloud Tools rather than firefighting or other operations work, and we are coding in the open, so we won’t fall into the trap of including sensitive information that it is later too hard to factor out.

Not only are the GOV.UK team “eating our own dogfood” by using vCloud Tools to provision our new hosting environments, there are two other GDS teams also using the tools: the Identity Assurance Programme and the Performance Platform. Other exemplar projects have started using the tools, and we have already accepted pull requests from people we do not know, so there are the beginnings of a community around the tools. This keeps us from making the tools too GOV.UK-specific and means that we get a lot of extremely useful user feedback.

And in the meantime we are contributing back to Open Source - in one recent release of fog, a huge number of the contributions were from the GOV.UK vCloud Tools team (who were, at that time, me (Anna Shipman), Sneha Somwanshi, Mike Pountney and Dan Abel).

These factors mean that we feel confident that we are going to produce tools that will continue to be very useful to us and other teams for some time.

More will follow

There will be future posts about particular aspects of what we’ve done and how, including one soon on how we used vCloud Tools in our platform migration. If there’s anything you’d particularly like more detail on, please let us know in the comments below.

This post originally appeared on the GDS Technology Blog.

Running OSS projects

03 April 2014

At ScaleSummit last week, I proposed a session on running Open Source projects. I am hoping to move the project I’m working on, vCloud Tools, from being coded in the open to being Open Source and wanted to get the benefit of the accumulated wisdom. It was an excellent session and I really got a lot out of it.

The executive summary is that if you are trying to build a community, you need to optimise for bringing people into the community, and a lot of the discussion focused around that. Here are some of the main things that I took away from it.

Good documentation is crucial

Optimise documentation for bringing people into your community. It’s unlikely to already be good for that, as you are so familiar with the code and the project that you don’t realise what someone new to it doesn’t know.

One way to do this is to make it clear that you are happy to answer people’s questions – even hundreds of questions – as long as they follow up by submitting a patch to clarify the documentation. Encourage people to raise bugs against the documentation.

Make it easy to get involved

Make it clear how people can contact you for clarification or with ideas. An IRC channel is great if you can make it work, but you need to be aware of certain drawbacks. For example, people tend to expect a quicker response than by email, so you need to make sure people are in it. It will be very quiet at night in the UK.

In a later discussion back at GDS we decided not to do that for this project because there would be a lot of effort to make it work and it would be an added distraction. We definitely don’t want to say “Come to our vCloud Tools IRC channel” if there’s no-one there.

As well as (or instead of) an IRC channel, you should have a mailing list. You can either make it open or make it so that only a subset of people can reply. Someone advised that it isn’t a good idea to have lots of mailing lists (*announce*, *discussion*, etc). It was also pointed out that even if you’re not writing code, managing the community can easily become a full-time job.

The Perl community does communication well, and the example of Catalyst was given – this project has 450 committers and the maintainer has changed 5+ times. Also mentioned was this blog post about communication with newcomers to the project: Love your idiots.

Respond quickly to external contributions

One of the things that I especially wanted to know about was how quickly you should aim to respond to pull requests. This is particularly an issue in that we are going to be managing this in work time and there will be competing priorities. The general consensus was you need to respond within 24 hours, and it’s acceptable for this not to include weekends if you are clear in your documentation that this is the case. It’s important to note that the response doesn’t have to be a review or a merge, it can be as simple as a note saying “Thanks for this, I am going to review it in a couple of days”.

The most important PRs to pay attention to are the ones from new contributors. Again, optimise for bringing people into your community.

It’s a good idea to have CI run tests on PRs so you can see failures before merging – Travis is good for this as it integrates well with GitHub. However, it was stressed that it is important to review the contribution as well, even if it passes all the tests!

Communicate your vision

Something else I was particularly interested in is what do you do if you feel a well-meaning PR is pulling the project in the wrong direction? It was suggested that a lot of this comes back to documenting your original vision and being very explicit about what this software is aiming to do.

You need one person or a group of people to be responsible for maintenance, and they are the ones responsible for the vision. Someone gave the example of Mozilla in the early days who just accepted everything, and eventually had to rewrite the core offering.

But also, take it as it comes. If you as the maintainer don’t think a direction is correct, it’s completely fine for someone to fork the software, and you may later find that the fork is doing the job better. The example of GCC was given, where a separate fork existed for about 5 years, and the GCC maintainers eventually realised that the fork was what everyone was using and merged it back in.

Modularity also makes the process easier – having a common core that everyone can agree on is relatively easy, and divergence can be supported with plugins.

It is very important that if you close a PR, you do so with a clear explanation, whether this is due to poor quality or incompatible direction.

Don’t be too hardline

One thing we talked about was not being too strict with external contributors. Sometimes people might not have the experience to know how to write tests, either in general or in your particular test framework, so insisting that a PR must have tests before it can be merged is going to put people off who could have really valuable contributions. Some people said they are very happy to write tests for newbies. Talk to the contributor to find out why they wanted to make those changes and then show them the tests you’ve written, maybe asking them whether it looks like the test cases cover the functionality they wanted to add.

However, changes in functionality should definitely include updated documentation and it is more reasonable to reject a PR for lack of that.

Assume people have the best intentions

This is great advice for life in general. Even if what they are suggesting looks completely wrong and it’s hard to understand how they could have thought it was the right approach, assume they have the best intentions and proceed accordingly.

Issue tracking

We currently manage work on the project internally using PivotalTracker. My plan was, once we’re ready to make it OSS, we move remaining features and bugs to GitHub issues, and work from that. This was seen as a good idea – it makes it clear to people what we are planning to work on and (ideally!) prevents them from raising issues we already know about. It also has a major benefit – you can Google for the issue.

It must be an active project

You need to be using the software yourself, otherwise it’s a recipe for abandonware. And it’s good to make this activity clear – related to the above point about issue tracking, if all your activity is on your internal tracker and mailing list and private IRC channel, then it won’t be clear to potential users and contributors that the project is still active.

Contributing Guidelines

It is important to have contributing guidelines. This wasn’t discussed extensively in the session, but in an extremely helpful follow-up email from Tom Doran he pointed me at a number of great resources which included the Catalyst development guidelines. I also heard from George Brocklehurst who pointed me at the contributing guidelines for gitsh and also a really useful page that thoughtbot have on code review.

Something that was raised as we finished up the session and were leaving was that, especially as Government, we need to be very careful about who owns the code. For example, some people may be working for companies where their contract states that the company owns all OSS contributions.

What next for vCloud Tools?

These considerations and some others mean it will be a little while before we move vCloud Tools from coding in the open to OSS, but it was really useful to have this input which covered a lot of things I hadn’t thought about.

Many other things were discussed, including when you need a standalone website, and what tools are good for versioning documentation, but these were the things I found most useful and relevant. Thanks a lot to everyone who took part in the session, it was very valuable and constructive.


Note that ScaleSummit operates under the Chatham House Rule so none of the comments made in the session have been attributed, and thanks to Matt Bostock and Phil Potter whose write-ups (Matt’s, Phil’s), were a very useful addition to my own notes.

Are you thinking of speaking at a conference?

09 March 2014

As co-Programme Chair for SPA conference I often find myself trying to persuade inexperienced or completely new speakers that it is worth doing and that they have something great to say. Here are some of the resources that I send on to them.

Nicely expressed post – you don't have to wait to be an expert in something before you submit a talk: You can speak at a conference too.

More detail as to why what you have to say is worth hearing, and how to get started. So Why Should I Speak Publicly?

The list of accepted talks from last year's PyCon US is a long list of great-looking talks on a variety of subjects to spark ideas of what you might speak about.

If you are nervous about public speaking, this post could really change the way you think about it: Presentation Skills Considered Harmful. You are just the UI.

Finally, this is a really useful resource Mairead O'Connor sent me last week, containing short posts on all aspects of talk preparation:

Applying for a job at GDS

24 December 2013

Over the next few months we will be recruiting for a number of roles, including developers and web operations engineers. Since a civil service application is a bit different to other jobs you might be applying for, we thought it was worth explaining what the process involves, with some detailed guidance on how to write the most unusual part of the application - a statement of suitability.

Open and fair process

There are a number of measures in place to make it more likely that applications to join the civil service are treated fairly and on their own merit. You can read more about the civil service recruitment and selection guidance on the Civil Service website, but the main thing you should be aware of is that we try and ensure objectivity by having consistent criteria that we assess applications on.

Consistent criteria

The job advert will list some essential specialist skills and competencies. The specialist skills are likely to be what you are familiar with in job applications, for example for a WebOp, “Experience configuring and managing Linux servers”.

The competencies are a bit more unusual. They reflect important civil service behaviours, like leadership and delivering value for money, because we are recruiting not just technical people, but people to become civil servants. This helps us build a team of people who are not just great technically, but also have other crucial skills, like communication and team-working.
For more information on civil service competencies you can look at this document.

Why this matters to applicants

One of the three documents we ask for in your application is a statement of suitability, which is extremely important, as it is where you get a chance to show the panel the evidence for how you meet the essential skills and competencies described in the job advert.

If a candidate's application does not show evidence of even just one of the competencies, we are not allowed to invite that candidate for interview. This may seem a little harsh, but it's actually this kind of rule that tries to ensure the process is open and fair.

Based on this, we can see why the statement of suitability is so important in the application. Your CV is unlikely to provide quite the right sort of information on its own, as CVs tend to be a collection of achievements and responsibilities. The statement of suitability is your opportunity to fill in the gaps in your CV and explain how you have the experience to do a great job at GDS. You must remember to give examples, as that is what we are looking for in the application.

How to write a good statement of suitability

The main thing required in the statement of suitability is that you demonstrate the essential skills and competencies asked for. Here is an example.

We recently advertised for Web Operations Engineers (and will be doing so again - watch this space!). One of the essential competencies we are looking for is evidence of how you "coach and support colleagues to take responsibility for their own development (through giving accountability, varied assignments and on-going feedback)".

If you have managed a team, it should be fairly straightforward to think of an example. But if you haven't, there are other ways you can provide this evidence.

The point here is that we feel an ability to coach and support colleagues is essential to being successful in the role, and we are looking for evidence that you have done this in some context before.

A word on how to phrase your answers

It's good to give detail, but your answers don't have to be really long. The best thing to do is think of the example you want to use and give us the essential information.
It may be useful to think of the CAR approach:

  • Context: what was the situation, and what was your role in it?
  • Action: what did you actually do?
  • Result: what was the outcome, and what difference did it make?

If you've covered all these points, you will have structured your example to give us all the information we need.

How to structure the statement

You might find it helpful to list each of the essential skills and competencies as headings and give an example under each, or you may prefer to write it in a letter or essay format. The structure is not important - just make sure that you cover all of the essential competencies asked for.

The other documents

The other documents we require are your CV and CV cover sheet. There's nothing different about how you should structure your CV compared to applying for a job in the private sector.
The CV cover sheet is a form which requires you to fill in some straightforward details so that we know how to contact you. Download this from the very bottom of the job application page, and include it with your application. If you are not already a civil servant you only need to complete three parts of this form. For existing civil servants, you need to fill in more sections of the CV cover sheet - your line manager should be able to help you with this.

Interview stage

Once we have sifted all of the applications we invite some of the applicants to interview. All candidates are asked a consistent set of questions for that role, and you are likely to also have to do some kind of coding or whiteboard exercise. If you are called for interview, you will get more information about what to expect in advance.

This post originally appeared on the GDS Technology Blog.

How Anonymous Submissions Affected Gender Balance at our Conference

17 November 2013

Mainly inspired by this article, I proposed that we make submissions anonymous for SPA2013. The committee (four of us) discussed what we thought the outcome would be for some time, before deciding to just try it and see how it worked out. And here's the answer.

The conference

Firstly, a bit about SPA. It stands for Software Practice Advancement, and it's a very interactive conference – you are unlikely to be sitting in a talk. You'll be writing code in a new language, or doing a group exercise in a process track, or taking part in a goldfish bowl discussion, or actively learning in another way. It's a great conference, and if you're interested in checking it out we are holding a free one day taster with some of the best sessions from this year in Cambridge on 23rd November. MiniSPA. It's free!

Enough pitching. I've attended SPA three years running and became co-Programme Chair for the 2013 conference, and while I think it's very good, one thing that would improve it is more women leading sessions. I would also like it if we could get a greater diversity of speakers in other ways, but I thought I would focus on women for starters.


The first year I attended, SPA2011, there were 28 sessions, with a total of 43 session leaders between them. Of those 43 session leaders, 38 were men and the remaining 5 were women.

Other interesting things I noticed were that four people presented more than one session at the conference – all four were men. Only nine sessions were led by just one person, and of those nine, eight were men.


5/43 = 12% of the session leaders were women.

1/9 = 11% of the solo session leaders were women.

0/4 = No women ran more than one session.

The next year, there were 30 sessions and 42 presenters. Four were women. Again, the five presenters running more than one session were all men. Ten sessions were led by a single presenter – 8 men and 2 women (one was me!).


4/42 = 10% of the session leaders were women.

2/10 = 20% of the solo session leaders were women.

0/5 = No women ran more than one session.

How we did anonymity

At SPA we have three stages after submission of proposals.

  1. A feedback stage, where people offer advice to help the session submitter improve the submission (which can be edited up until the deadline).
  2. A review stage where at least three reviewers grade a submission according to whether they think it should be included in the programme.
  3. A programme meeting, when we pull the reviews together and decide on the programme.

In previous years, the presenters' names appeared alongside their submissions all the way through. This year, we made them anonymous right up until we had put together a draft programme, and then we did the big reveal, just to check that we hadn't scheduled one person to do several sessions at the same time, for example.


So did it make a difference?

At SPA2013, we had 27 sessions with 46 presenters, 40 men and 6 women. Of those, two were repeat session leaders (both men) and 10 sessions had single leaders, of whom one was a woman.


6/46 = 13% of the session leaders were women.

1/10 = 10% of the solo session leaders were women.

0/2 = No women ran more than one session.

So no, not really.

That is – to the number of women speaking. A lot of the conference feedback praised the variety of sessions and topics, more so than in previous years. So we did something right. I also don't know whether it made any difference to the number of women who submitted, as I don't have that data.

But there were two other very interesting outcomes.

Lack of positive bias is good for everyone else

The first unintended – but good! – outcome was that people new to speaking at the conference felt much more able to submit.

A very representative comment from the survey I conducted was:

"Since I haven't spoken at SPA before, I expect being anonymous helped me compete with regulars."

And correspondingly, the reviewers had a lot of very similar comments. Here's one that sums up most of them:

"We used to have too many reviews along the lines of 'a bit weak but I know Alice and Bob and they'll do a good job'. They didn't always do a good job and it resulted in too many repeat speakers, which I think new potential speakers find off-putting."

Many of the reviewers admitted they didn't actually like anonymity but realised it helped prevent bias. Of course, we are generally not aware of our subconscious biases *against* people, but one thing that anonymity made reviewers aware of was our tendency to be biased *towards* people we know. But of course, that kind of positive bias means that someone merely unknown will have to work harder to get their session considered.

Positive discrimination can be off-putting

The second benefit of the anonymity became apparent in a scenario that I hadn't even considered. One submitter proposed a session about probabilistic data structures, which looked excellent, but her language of choice was Haskell, and most of the feedback focused on attempting to get her to change to a more widely used language. Based on that, she concluded that the talk probably wasn't right for SPA, and assumed that she would later hear it had been rejected.

In fact the talk was very popular with the reviewers, receiving top grades from all of them, so it was a shoo-in for the conference. But when we contacted her to let her know she was in the programme, she initially said she was no longer able to give the talk. A day or so later, she got back in touch to say – to our relief – that she actually was available, and explained her initial reluctance:

"Truth is the session feedback concluded that the idea wasn't right for the conference, so I inferred that you must've accepted the session because I'm female, and had a bit of an emotional reaction; like any human I want to have opportunities based on my own merits."

In actual fact, this was not what had happened, as neither I nor any of the reviewers knew who she was. Her session had been accepted purely on the merits of the submission. And a massive advantage of the anonymity we had put in place was that we could claim, with certainty and evidence, that we were not making decisions in a biased way – either for or against her submission – based on her gender.

A possible objection

Some people surveyed felt that it might be appropriate to remove the anonymity at the review stage. Ability to present content can be at least as important as the content itself, and people felt the best way to get this information was from knowing who the speaker is. Here is a representative comment:

"I think that the knowledge of a presenter's past delivery can be a big differentiator when you get submissions of similar quality."

But really, this is just another way of saying we want to know when people are well-known – we want to put Alice and Bob in because they always do a good session.

Yes, a speaker's delivery and structure makes a huge difference to how well a session works. However, at SPA we assign shepherds to first-time presenters (or anyone who wants one) and they can be very helpful – my excellent shepherd Adam Iley, among other things, arranged a run-through at his workplace to allow me to practise my session with an unfamiliar audience.

So if we do accept a promising session where we don't know that the speaker is good, we might be able to help with that; and conversely, I tend to think that a good presenter should be able to write a good submission so that will shine through.


In any case, the committee felt that on balance it was positive, so we are continuing with anonymity throughout the whole process this year.

Why not get involved? Submit a session, or get involved in giving feedback on sessions, or both!

Check out the call for proposals, and have a look at the previous programmes I've linked to above for an idea of the kinds of sessions we like.

And if you have any questions, or want to get involved in giving feedback, please do get in touch.

Creating a Puppet module

09 September 2013

I learnt three very interesting things recently, courtesy of the brilliant Dan Carley.

  1. git filter-branch
  2. how to create a puppet module using our puppet module skeleton
  3. shopt -s dotglob

While creating the Puppet manifests for the GOV.UK mirror, we realised we needed a module that was part of the GOV.UK Puppet, so our task was to pull it out into a standalone module and have both GOV.UK and our new GOV.UK mirror use it.

The first step in creating a new module is to pull it out into its own git repository.

First, clone the repository the module is currently in. (Extra learning here: you don't need to clone it from GitHub, you can just git clone the existing repository.) You want --no-hardlinks so it's entirely separate:

git clone --no-hardlinks puppet puppet-auditd

At this stage our puppet-auditd repo still points at our puppet repo, so we want to remove the remote:

git remote rm origin

Now we want to get rid of everything that's not our module and make our module the root, and this is where we use git filter-branch (update 2014-04-12: GitHub have removed this without a redirect – tsk! – but you can find a historical copy here):

git filter-branch --subdirectory-filter modules/auditd HEAD
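If you want to see what these three steps do before running them against a real codebase, you can exercise them end to end on a throwaway repository. This is just a sketch – the repo layout, file names and module name below are invented for the demo:

```shell
#!/bin/bash
# Scratch demo: build a toy "puppet" repo, clone it, and rewrite
# history so modules/auditd becomes the repository root.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # newer gits warn (and pause) otherwise
tmp=$(mktemp -d) && cd "$tmp"

# A toy repo containing the module plus some unrelated top-level files
git init -q puppet && cd puppet
git config user.email demo@example.com && git config user.name Demo
mkdir -p modules/auditd/manifests
echo "class auditd {}" > modules/auditd/manifests/init.pp
echo "not part of the module" > README
git add . && git commit -qm "initial commit"
cd ..

git clone -q --no-hardlinks puppet puppet-auditd   # fully independent copy
cd puppet-auditd
git remote rm origin                               # detach from the parent repo
git filter-branch --subdirectory-filter modules/auditd HEAD

ls -A   # manifests/ is now at the root; README and the rest are gone
```

The history is rewritten too: `git log` in the new repo only contains commits that touched the module, which is exactly what you want for a standalone project.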

The next thing we want to do is create the files in the framework around the subdirectory to make it a Puppet module, for example the Modulefile. For this we can use our Puppet module skeleton. However, we don't want to directly follow the instructions there as we already have a lot of the classes we need. Instead of creating it from scratch as per those instructions, we generate our new module structure:

puppet module generate gdsoperations-auditd

and then copy over the generated files.

However, a normal recursive copy won't work:

cp -r A/* B

This will only copy non-dotfiles, and we want the dotfiles as well, like .gitignore.

However, attempting to copy the dotfiles as well:

cp -r A/.* B

leads to hilarious results. .. is a directory too, so the above command copies all those contents recursively as well. Try it. It's most entertaining. But it's not really what we want.

What we can do is toggle what is included in A/* for this shell. The documentation is here, but this is what we need:

shopt -s dotglob

In order to check what is going to be copied now you can:

echo A/*

and if you like it:

cp -r A/* B
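The whole dotglob dance can be tried out safely in a scratch directory first – A and B here are just throwaway names for the demo:

```shell
#!/bin/bash
# Scratch demo: * skips dotfiles by default; dotglob makes it match them.
set -e
tmp=$(mktemp -d) && cd "$tmp"
mkdir A B
touch A/visible A/.gitignore

cp -r A/* B          # dotglob off: only A/visible is copied
ls -A B              # -> visible

shopt -s dotglob     # now * matches dotfiles too (this shell only)
echo A/*             # preview what will be copied
cp -r A/* B          # this time .gitignore comes across as well
ls -A B              # -> .gitignore visible
shopt -u dotglob     # switch it back off when you're done
```

Note that dotglob never makes `*` match `.` or `..`, so this avoids the recursive-copy hilarity of `A/.*` entirely.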

The next step is to go through the files to work out which modifications to keep and which ones are not relevant.

Another tip from Dan: vimdiff is really useful for this task. I won't go into how and what we decided; you can see that here.

Some Regex in the form of a Picture!

10 August 2013

Where Do I Start?

06 July 2013

The first day at a new job is always a bit nerve-wracking. In a recent session at SPA 2013, we came up with some things a developer joining a new team can do to get up to speed quickly.

Whiteboard an architecture diagram

This is a brilliant way to get a quick overview of how the pieces of your new project fit together, and can be tailored to whatever level of understanding you (or the person you are talking to) has. It's a good way to meet people on other teams, when you're looking to get more detail on some of the parts of the diagram. And your fresh perspective can often be very useful to the person taking you through the architecture.

However, it can sometimes be hard to know who the right person is to ask, and it can also be hard to get the right level of detail.

Facilitate a retrospective

This is the fastest way to find out what's going on in a team, and because you're new, you will essentially be an external facilitator, which is really useful to the team. It's also a great way to find out what's going on in other teams once you've settled in.

However, it could be challenging if your new place of work is not used to agile working, and of course you do have to be the kind of person who has the confidence to facilitate a retrospective for a bunch of people you don't know yet.

Start putting a glossary together

...with the aim of putting it on the wiki.

This will give you a head start on understanding the concepts that the rest of the team take for granted – and has the added advantage that you may hit on areas of vagueness or disagreement that can be straightened out.

It can however be hard to get started with this, and to work out what to focus on, and in some teams it might be a challenge to find anyone who understands them all.

In Summary: Do something, document something

People have different ways of learning and sometimes the explanations you get in your first week can lead to information overload and be hard to take in. So it can be useful to ask more directed questions, for example in pursuit of added detail for an architecture diagram, or to draw out actions for a retrospective.

We had some other ideas as well:

  • Pair program as much as you can, with different people
  • You could think about shadowing someone for a morning or a day
  • Talk to other teams - the teams that particularly can give you insight into what you need to know are the testing team, the support team and the ops team
  • Find out where people go for lunch and wangle yourself an invitation!
  • We also stole some ideas from other teams, like if the team's processes are not documented you could offer to do that as a way to gain understanding. For example, how well documented are the new joiner instructions? You are in a great place to improve those, and a big win would be automating some part of that. Also, meeting the users is a fantastic way to understand the business, so if that's not part of your induction you could arrange that.

Thanks to Andy Longshaw, Eoin Woods and Nick Rozanski for organising the session.

Roof Bug-fixing

21 May 2013

I often find myself viewing life through the lens of software development, a bit like when you play too much Tetris and all your friends turn into shapes. But recent events have made me think: maybe other things actually would be better if they were more like software development?

Recently, I had a leak into my flat. By "recently" I mean it's been ongoing for four and a half months, but that's not what this post is about. This is about applying software development ideas to other fields, instead of the other way around. We talk about "software craftsmanship" – I'm thinking about the opposite, as applied to, for example, roofing.

I first noticed the leak as I was about to go on holiday before Christmas. It rained very heavily, and literally hours before I was due to leave, I saw water bubbling up between the laminate floorboards in the hall. I pulled up the vinyl on the bathroom floor and then took the side of the bath off, and found that my bathroom was about an inch deep in water.

Cut forward a fortnight to when the housing association manage to send out a surveyor to investigate. He tells me what is causing the problem: the flat above and to the right has – without permission – erected some kind of structure on their balcony, and this has caused damage to the brickwork, so when it rains water is seeping into my flat through the holes in their balcony.

At this stage, it's a bit of a mystery to me how that works. And why shouldn't it be? I'm not a roofer. Or a surveyor, or a builder, or an architect. I have only the vaguest notion of how buildings are put together, and I don't think to ask for an explanation of how the water is filling up my flat. Apparently, the Offending Neighbour has drilled a hole to put up the mystery structure, and the hole is the problem. They are going to asphalt over it. And they do so.

All well and good, and the insurance company come round to measure up the flat to fix the damage. Except the day they come to do this, it rains heavily, and once again, I see water bubbling up through the floor in the hall. They have asphalted over the hole in the balcony, but this hasn't fixed the leak!

A builder comes to my flat to discuss it with me. This is where it starts to get interesting. The builder has already looked at the Offending Neighbour's balcony and now he wants to look at my flat. But he can't see how the pieces fit together. Eventually, I have to take him outside to point at the balcony to demonstrate that water coming from the balcony at the point at which it intersects with my flat would lead to damp at exactly the spot we see it, as per the diagram above.

This is when I first start to think of this in terms of a bug. Here is the point at which you need to look at the bigger picture. We're not talking about a bit of mistaken logic in a Ruby program affecting the output of that program. We're talking about a timeout two layers away. The manifestation of the problem is a long way from the cause, and you really need a view of the bigger picture to be able to reason about it.

So the builder goes away, and finally (after calls and emails and threats and complaints) the housing association get back to me and tell me that they are going to put something called flashing on the point at which the wall of my flat meets the wall of ON's balcony. This, they tell me, will definitely fix the problem.

So this makes sense to me. At this point, I've got an idea of the bigger picture, though the details of the water seeping through the brickwork and then somehow bubbling up through the floor are somewhat hazy to me. But I do have one strong opinion: I want to be sure this time that they have really fixed it. This is when I conceive of the idea of a water test.

You see, when we find a bug in our software, we try and recreate it. Then we know when we've fixed it. And something I was starting to notice here was that this wasn't happening. It was as if I had noticed the timeout, and made a configuration change in the general area, then marked the story as done and walked away. We don't do that, with software. And yet this situation that was making my life pretty inconvenient – three months in, my flat was covered in mould, smelt of damp and all my furniture was in one room while I waited for the leak to be fixed – was being dealt with by people who seemed to be making general stabs in the direction of the problem, without any kind of theory or analysis.

Of course, I didn't quite realise that – you expect people to be professionals, whatever the job is. But I was sure I wanted them to do a water test.

But getting them to do this was pretty hard, even though it seemed completely obvious to me. What's the problem? Stand on the balcony with a hose, see if the problem is fixed! At one point, I was told it wasn't "feasible". I started to wonder if I was being too idealistic and actually a building was more like a huge legacy codebase where you may not even have access to all the libraries. Maybe I'd just have to accept a best guess instead of a rigorous approach.

Finally, four months in, I managed to persuade them to do it. The head of repairs at the housing association came round to instruct the roofer, but even as he did this he was complaining that the water test was pointless, a waste of time, as it was due to rain today anyway and that would test it. He didn't really grasp the idea that a water test is a controlled test – it's a way to control the input so as to work out whether the output is what you'd expect were your conclusions correct. Rain – apart from being unpredictable – is not controlled. For a start, it rains all over the building at once.

But then the roofer came round, and he was a true software developer.

We went up to the balcony together and he explained to me what he was going to do. Firstly, he was going to hose down on the balcony, not going over the edge. This was to test the flashing.

There is a gap between ON's fence and the wall of my flat. It seemed to me that the gap was too low and my suspicion was that rainwater was overflowing through the gap and thus soaking into my wall.

However, the roofer explained to me that this was the design of the balcony, to prevent flooding. If it rained heavily, the water would flow along the gully and into the drain, and the gap was to prevent the balcony flooding if the rain was too heavy.

However, secondly, he explained, he was going to hose down specifically through the gap and over the side of the wall of my flat so we could see what would happen if the balcony did flood. The reason he was doing these things separately, he told me, was so he could isolate the cause of the problem. If he did both at once, we wouldn't know what the specific problem was.

Yes! Exactly! This is why we don't make the arbitrary configuration change and correct the logic in the Ruby program at the same time, because then how do we know? This man was speaking my language!

The first thing took him about thirty seconds to discover. They turned the hose on the flashing, the water ran down the gully as planned and then – that was it. The drain was completely blocked. It took less than a minute for the balcony to flood and the water to start pouring through the gap and down my wall. Thus demonstrating another benefit of methodical testing – you surface things you might have assumed to be different.

Later, when unblocking it, the roofer told me it must have taken years to get to that state. One might have thought that ON would have reported it at some point in those years. But why would she? She may not even have noticed – presumably she doesn't hang around outside when it's raining. It had not occurred to any of the previous investigators of this problem to check the drain. And while it may seem an obvious thing to have checked, one often overlooks the obvious, and that is why testing is good.

The second thing took this software genius of a roofer a few more minutes to discover. After unblocking the drain, he hosed down the side of my building and at this point I found that there was water coming in under the bath again. He looked closer at the building and saw, under where the balcony joined my wall, a gap.

Having hosed the wall, he had seen that water ran into that gap and from there, the easiest escape would be into my flat, rather than back out. By attempting to recreate the problem, he identified the solution.

So he unblocked the drain, and he filled in the hole with cement. And then, as if he had not covered himself in glory enough, he told me he was only 95% certain that this would solve the problem, whereas the previous fixes, I had been assured, were the solution. He knows a building is a legacy system. But he has the software methodology approach.

There's more learning here as well – for example, to not assume that you won't understand a problem that isn't in your field – but the main thing I took from it was this: everything would be better if it was a bit more like software development. Craftsmen should be more like software developers.

Upgrading to 12.04...

06 May 2013

I needed to write some code. Thanks to George Brocklehurst, I was no longer content to do this on my personal laptop using gedit – I required Vim.

But I could not upgrade Vim!

Failed to fetch 404 Not Found [IP: 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

OK, maybe it was unrelated, but it was time to face the inevitable. I was on Ubuntu 10.10. I needed to upgrade. Skip to what I learned from this sorry saga.

Great, so all I needed to do was upgrade to the next release from the Update Manager, right?

Error during update
A problem occurred during the update. This is usually some sort of network problem, please check your network connection and retry.
, W:Failed to fetch 404 Not Found
, E:Some index files failed to download, they have been ignored, or old ones used instead.

Ah. Wrong. I left it so long that it was no longer supported.

Right, so I couldn't go to 11.04. But actually, I didn't want 11.04, I wanted 12.04. So can I not just go directly there?

Well, no, as it turns out. Or maybe you can. I still don't know. But threads like this suggested that I couldn't go directly to 12.04 without doing a fresh install and losing my files. I didn't fancy that, so I thought I'd go the long way round.

    Since I was out of time to upgrade via the Update Manager, I needed to download the 11.04 ISO, confirming that it was, as I thought, 64-bit:

    uname -m

    and then burn the ISO to a CD. Having not done this before, I found this useful. You want the CD to end up with multiple files and folders on it, not just the ISO as a single file.
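    One thing worth doing before burning – a hedged aside, since it didn't come up above – is verifying the ISO against its published checksum, so you don't waste hours burning and booting a corrupt download. The filenames and contents below are stand-ins, created locally for illustration:

```shell
# Hypothetical sketch: check a downloaded ISO against a checksum file
# before burning. Both files here are stand-ins, not real downloads.
cd "$(mktemp -d)"
iso=ubuntu-11.04-desktop-amd64.iso
printf 'pretend this is the ISO' > "$iso"   # stand-in for the real download
md5sum "$iso" > MD5SUMS                     # in reality, fetched from the mirror
md5sum -c MD5SUMS && echo "checksum OK - safe to burn"
```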

    Great. So then you stick the CD in and just follow the instructions. Like these. Fine. No problem.

    Except... problem. This bit? That hung there for HOURS. If you click the arrow you can see the command it is hanging on, but Googling that on another computer (which I had by now switched on in order to continue with my life) gave me nothing.

    I'm telling you this, even though I'm embarrassed: I killed it. It had been hanging there for hours! I couldn't take it any more!

    And funnily enough, that turned out not to have an ideal outcome... it would not boot. On the plus side, it's dual boot and Windows seemed fine, though at this stage, I was pretty sure I could "fix" that too:

    Fine, never mind; all my files etc were backed up, so it was no biggie. In fact, why not just install Ubuntu 12.04 directly at this point, since it appeared I may well have lost all my files anyway?

    Except somehow, no. The other computer I was using was Windows, and – who knew? – it wasn't so easy to get the ISO onto a CD. After some trying, I gave up.

    Next plan was to install 11.04 from the disc I actually had. And, to my joy and incredulity – THIS WORKED!

    And then, amazingly, the upgrade to 11.10, and then 12.04, could be done via the Update Manager. Finally, six hours after I'd embarked on this foolish mission I was able to report back:

    ...shortly followed by...

    So what have I learned?

    1. How to check your system is 64-bit.
    2. How to embed tweets into blog posts (click the 'More' button to the right of the 'Favourite' button and all will be revealed).
    3. Not to put off the upgrade until such time that it's no longer supported and you have to go through all this!
    4. ALWAYS be backing up your files. As Jeff Atwood said, "until you have a backup strategy of some kind, you're screwed, you just don't know it yet". This could've been much worse for me.

    Happy programming!


    24 April 2013

    Today I learned from the awesome bob how to add storage to a box, including partitioning, mounting, and other things I'd heard of but had never done myself.

    We had to add some storage to the elasticsearch boxes. We generate a lot of logs and they all go to elasticsearch. By default when we provision servers we give them 32GB but this wasn't enough. So here's how to do it:

    1. Add the extra disk through our virtualisation platform's control panel.
    2. Reboot the box so that it sees the new disk.
    3. Partition the new disk. There are many ways to partition a disk - we used cfdisk, which has a GUI.
      cfdisk /dev/sdb
      This opens the GUI and then you tab through it to select options. We just accepted all the defaults, so the only time we had to change it was at the end, to select [write]. We did consider making the type Linux LVM as this would have meant we could extend it later, but in this case we just accepted the default type of Linux.
      This means that as well as /dev/sdb we have /dev/sdb1. (Extra learning points here - disks are added alphabetically. So the existing disk was sda. If, subsequent to this, we add another disk, it will be sdc.)
    4. Before we can mount the disk we need to give it a filesystem. We gave it ext4 because "everything uses ext4 these days" (bob). If you don't have a good reason for picking one of the others, then use ext4.
      mkfs.ext4 /dev/sdb1
    5. Now we mount the new disk to a temporary directory:
      mkdir /mnt/tmp
      mount /dev/sdb1 /mnt/tmp
    6. At this point we need to stop elasticsearch and also disable puppet (otherwise it will just start elasticsearch again on its next run):
      service elasticsearch-logging stop
      puppet agent --disable
    7. Now we need to copy the elasticsearch indices to the new disk. Because the indices are just files you can rsync them, as long as elasticsearch has been stopped, which it has.
      rsync -av --stats --progress /mnt/elasticsearch/* /mnt/tmp
      The -v flag above is 'verbose', and in addition, the stats and progress options give some more noisy output while the rsync is going on. Often, when using rsync you would want to add -z, but since this is just from one disk to another and not going over the wire then there's no need here. Finally, -a is 'archive'. As the man page helpfully notes, this is "same as -rlptgoD (no -H)". I will leave that as an exercise for the reader.
      This stage takes some time.
    8. When it's finished, we did two things, just as a (slightly paranoid) check to see if it worked:
      1. Run the same command again. If it has successfully rsynched everything, there will be nothing left to transfer, so it will finish almost immediately.
      2. du -sh to check that /mnt/elasticsearch and /mnt/tmp are the same size.
    9. Instead of mounting the new disk in the same way as step 5 above, we can edit /etc/fstab. We add this line:
      /dev/sdb1 /mnt/elasticsearch ext4 defaults 0 0
      We should have thought about using Puppet here, as doing it like this means there are parts of the infrastructure that are not code.
    10. Next, we unmount sdb1 from /mnt/tmp:
      umount /mnt/tmp
    11. Then, we move the old elasticsearch directory and create a new one:
      mv /mnt/elasticsearch /mnt/elasticsearch-old
      mkdir /mnt/elasticsearch
    12. Now mount sdb1 to the new elasticsearch directory. Because of the fstab entry from step 9, we can pass just the mount point (or just the device); mount looks up the rest in /etc/fstab. This has the added advantage of testing our fstab entry.
      mount /mnt/elasticsearch
    13. All done! Now we need to:
      1. Restart elasticsearch
      2. Re-enable puppet
      3. Delete /mnt/tmp and /mnt/elasticsearch-old

    Roof Hacking

    24 March 2013

    Two weeks ago I was at the excellent /dev/fort. More on that another time; all I'll say now is that we were making a website in a fort.

    On one feature I was pairing (actually, tripling...) on some Django with the awesome George Brocklehurst and the brilliant James Coglan. James and I took a break to look at the causeway being underwater at high tide, and when we returned we found that George had relocated from our previous office (the green sofa) to the flat roof of the officers' mess. With his phone playing Ben Howard's cover of Call Me Maybe and glasses of whisky hand-delivered by the fantastic Chris Govias, it was perfect.

    Except it wasn't, because George's battery was running low, and he couldn't push his changes to my laptop because the wifi didn't reach that far. "Of course, we could always set up a local wifi on the roof," said George. Was he trolling me? No, he was not, and here's how:

    George committed to a local branch and then set up his Mac as a network hub (a computer-to-computer network).

    Then I added a git remote that pointed at the repository on George's machine:

    git remote add george [email protected]:code/devfort7/hobbit/.git

    The IP address here was his machine on the network he'd created, and it was his ssh login (so I needed him to enter his password).

    This included the pleasure of having to type:

    git fetch george

    Then I merged george's branch into mine:

    git merge george/onboarding

    and we were ready to use my computer to continue the good work.
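    The whole roof workflow can be reconstructed locally. Here's a sketch with a local path standing in for George's ssh address (the IP and login were specific to his ad-hoc network), and invented branch and file names:

```shell
# Toy reconstruction: a local path stands in for george@<roof-network-ip>.
set -e
tmp=$(mktemp -d)

# "George's" machine: work committed to a local onboarding branch
git init -q "$tmp/george"
cd "$tmp/george"
git config user.email george@example.com && git config user.name George
git commit -q --allow-empty -m "shared history"
git checkout -q -b onboarding
echo "onboarding view" > views.py
git add views.py && git commit -q -m "onboarding work"
git checkout -q -                  # back to his main branch

# "my" machine: add his repo as a remote, fetch, merge
git clone -q "$tmp/george" "$tmp/mine"
cd "$tmp/mine"
git remote add george "$tmp/george"
git fetch -q george
git merge -q george/onboarding
```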

    Photo by the aforementioned brilliant James Coglan.

    How to Estimate

    21 January 2013

    I want to share this great idea about estimating that came from the excellent Mazz Mosley. Instead of worrying about estimating in hours or days, estimate in story points as follows:

    1: This is trivial. I know exactly the code I would write if I went back to my desk right now.

    2: This is quite easy. I know roughly what I'd have to do. I might have to look one or two things up.

    3: This is a bit complex. I might have to refresh my memory on a few things and there are a couple of unknowns.

    5: This is big. I have only a rough idea of how I'd do this.

    8+: I don't know how to do this.

    I've written 8+ rather than the more standard 8, 13, 21 etc because in our team we had an additional rule – if it's 8+ then the story is too big and you need to break it down a bit more. Maybe have a timeboxed spike? Then you will have more information for the next planning meeting.

    It doesn't matter how much time each point ends up being (and this will vary from team to team); after a few sprints of estimating like this the velocity will become meaningful and you can use it for predicting how much work you'll get through in future sprints.
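    As a toy illustration of how the numbers become useful (the points below are invented), velocity is just the average of completed points per sprint, projected forward:

```shell
# Made-up sprint history: points completed in the last three sprints.
completed="21 18 24"
total=0; n=0
for p in $completed; do total=$((total + p)); n=$((n + 1)); done
velocity=$((total / n))
echo "velocity: $velocity points/sprint"           # prints 21
echo "next three sprints: ~$((velocity * 3)) points"
```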

    Testing the redirections

    10 December 2012

    Now that GOV.UK has replaced Directgov and BusinessLink, and departments are moving to Inside Government, we want to make sure that people visiting links to the old sites get where they need to be. We want them to be redirected to the correct page on GOV.UK, with no link left behind.

    This post is about the tools we built to make that possible.

    The first thing we needed was a list of everything we wanted to redirect: all the Directgov and BusinessLink URLs (web addresses). This proved to be a fairly significant task - both sites had long histories and various different providers, so a comprehensive list of these URLs did not exist.

    Instead, we collected our own lists from a variety of sources, including traffic logs, records of friendly URLs (shorter, more memorable links that redirect to longer URLs), and the results of spidering the sites.

    This gave us a total of about 8,000 Directgov URLs and about 40,000 BusinessLink URLs.

    Wrangling the URLs

    Many of the lists of URLs existed in various spreadsheets, maintained by different people. We needed a canonical source of truth. So we built the Migratorator.

    The Migratorator is a Rails app, backed by a MongoDB database. It allows multiple users to create one-to-one mappings for each URL, where the mapping consists of the source URL, status (whether it will be redirected or whether, no longer representing a user need, it is now gone) and, if applicable, the page to which it will be redirected.

    Migratorator image

    As well as the mapping information, the Migratorator allows us to capture other useful information such as who has edited a mapping, tags showing information about the type of mapping, and a status bar showing how far through the task we are.

    Migratorator detail

    Checking the mappings

    We needed to confirm that the mappings were actually correct. We wanted several people to check each mapping, so we created the Review-O-Matic.

    The Review-O-Matic is also a Rails app and uses the Migratorator API to display the source URL and the mapped URL in a side-by-side browser, with voting buttons.


    We asked everyone in GDS to help us by checking mappings when they had some spare time. However, clicking through mappings can be dull, so we ran a competition with a prominently displayed leader board. The winner, who checked over 1,000 mappings, won cake.

    Confirmation from departments

    The Review-O-Matic presents the mappings in a random order, and the way it’s set up means that links within pages cannot be clicked. This is good for getting as many mappings as possible confirmed, but our colleagues in departments needed to check content relevant to them in a more methodical and interactive way. Enter the Side-by-side Browser.

    The Side-by-side Browser displays the old and the new websites next to each other. Clicking a link on the left hand side displays what this will redirect to on the right hand side.

    Side-by-side browser image

    The Side-by-side browser is a Node.js proxy that serves itself and the site being reviewed on the same domain, so that it’s ‘live’ and not blocked by the Same-Origin policy. We joked that, in essence, the side-by-side browser was a phishing attack for the good!

    Initially it used the Migratorator API for the mappings. However, once we’d built and deployed the Redirector, we could use that instead to populate the right hand side. As well as simplifying the code, this meant we could now see what the Redirector would actually return.

    At this point, we distributed it to our colleagues in departments to check the mappings and raise any concerns before the sites were switched over.


    We used another trick to test Directgov mappings while the site was still live. We created an 'aka' domain, which was handled by the Redirector, and a bookmarklet. By replacing the 'www' in a Directgov URL with 'aka', the bookmarklet allowed us to see what an individual Directgov page would be replaced with.

    The Redirector itself

    For the actual redirection, we use the open-source Web server Nginx. The Redirector project is just the process for generating the Nginx configuration. It’s written mainly in Perl with some PHP.

    Generating the Nginx config requires logic to determine from the old URL what kind of configuration should be used.

    For example, the important part of a Directgov URL is the path, while for BusinessLink the essential information is contained in the query string. Redirecting these two types of URL requires different types of Nginx config.

    This logic, plus the mappings we gathered, make up much of the Redirector project.
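    As a hedged illustration only – these locations and targets are invented, not the project's real mappings or its actual generated output – the two kinds of config might look something like:

```nginx
# Path-based redirect, Directgov-style (made-up mapping):
location /money/tax {
    return 301 https://www.gov.uk/browse/tax;
}

# Query-string-based redirect, BusinessLink-style (made-up topicId):
location / {
    if ($arg_topicId = "1073858787") {
        return 301 https://www.gov.uk/browse/business;
    }
}
```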

    The joy of tests

    In addition, the project contains a suite of unit and integration tests, including one that runs every night at 5am. This test checks that every single URL in our source data returns a status code that is either a 410 ‘Gone’ or a 301 redirect to a 200 ‘OK’.
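    The rule that nightly test enforces can be distilled into a few lines. This is a hypothetical sketch, not the real suite – the function and its inputs are stand-ins – but it captures the pass/fail logic: a URL passes only if it is 410 Gone, or a 301 whose redirect chain ends in a 200.

```shell
# classify <status> <status-after-following-redirects>
# Stand-in for the nightly check's pass/fail rule.
classify() {
  if [ "$1" = "410" ]; then echo pass
  elif [ "$1" = "301" ] && [ "$2" = "200" ]; then echo pass
  else echo fail
  fi
}
classify 410 410   # pass: gone, and says so
classify 301 200   # pass: redirects to a live GOV.UK page
classify 301 404   # fail: redirects to a missing page
```

    Note that a plain 200 on the old domain also fails: every URL is supposed to be redirected or gone, not served.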

    For a few weeks before the launch we also ran the daily Directgov and BusinessLink logs against the Redirector to see if there were any valid URLs or behaviour we'd missed. By doing this we found that, for example, even though URLs are case-sensitive, Directgov URLs were not, and users would therefore expect differently-cased versions of the same URL to work in the same way.

    Going live!

    The final task was to point the DNS for the sites we're now hosting at the Redirector. Now users following previously bookmarked links or links from old printed publications will still end up in the right place on GOV.UK.

    redirector post-it gets lots of votes at retrospective

    The configuration now has over 83,000 URLs that we’ve saved from link rot, but if you find an old BusinessLink or Directgov link that’s broken then let us know.

    Traffic through the Redirector is easing off as GOV.UK pages are consistently higher in the Google search results, but it’s been really exciting making sure that we do our best not to break the strands of the Web.

    This post originally appeared on the GDS Blog.

    Learning the perls

    20 November 2012

    After some time muddling through with Perl, I have accepted the inevitable – it's time to actually knuckle down and learn it properly. I have acquired some books, but I also require some direction – so I asked my excellent colleague and friend Mark Norman Francis to write a brief guest post for my blog.

    His instructions were "Complete the sentence: In order to understand C you need to understand Pointer Arithmetic. In order to understand Perl you need to understand...". He went many better, and produced the following:

    In order to write perl in a modern context, I think these are the basic skills you'll need.
    1. Perl's provided documentation is extensive and worthwhile, so learn to use `perldoc` to read both the provided manuals and the documentation of the builtin perl functions.
    2. Learn the difference between variables – scalar, array, hash, and references (hint for references: learn to make a hash of hashes).
    3. Learn how to use an array as a stack.
    4. Basic regular expression usage – matching, replacing, capturing substrings, case-insensitivity, metacharacters, and writing more readable expressions by having perl ignore the whitespace.
    5. Learn to use perl on the command line as a pipe filter instead of awk/sed.
    6. Learn how to write your own functions.
    7. Learn how to use core modules such as `File::Find` (write a tool to list files in a directory structure from smallest to largest in size).
    8. Refactor the code from step 7 to use a Schwartzian Transform to sort the file sizes.
    9. Learn how to install third-party libraries from CPAN.
    10. Install Moose from CPAN, then work through `Moose::Manual` and `Moose::Cookbook` to learn to write modern OO perl.
    11. Learn how to use `Test::More` to test code and `Devel::Cover` to produce coverage reports.
    12. Find an abstract intro to OO example (such as employee payroll) and write it in Moose with full unit test coverage.
    13. Lastly, read the `perlstyle` documentation, and then write your own style guide.

    Thanks, Norm. Thorm.

    Learning More About D3

    19 July 2012

    I had a really inspiring chat with Matt Biddulph about D3 this week. He showed me some really cool things. I made some notes.

    CAP again – or should that be PACELC?

    15 July 2012

    The last thing I wrote about was the CAP Theorem. Last week, Tom Hall visited our office and gave a very interesting talk, the central thesis of which was that the CAP Theorem, as often explained, is incorrect – or at least not very meaningful.

    The first point, explained very clearly in CAP Confusion: Problems with 'partition tolerance' is that P, "Partition Tolerance", as per the proof of the CAP Theorem, is not a property of your application; rather it is a property of the network it's on. The network may partition. To quote from Henry Robinson's Cloudera article:

    "Partition tolerance means simply developing a coping strategy by choosing which of the other system properties to drop. This is the real lesson of the CAP theorem – if you have a network that may drop messages, then you cannot have both availability and consistency, you must choose one."

    The rest of the article is well worth reading.

    The second point Tom covered is from Problems with CAP, and Yahoo's little known NoSQL system by Daniel Abadi. Abadi covers the point above very clearly, and then points out that there is a missing letter from CAP – L, for latency.

    "Keeping replicas consistent over a wide area network requires at least one message to be sent over the WAN in the critical path to perform the write... Unfortunately, a message over a WAN significantly increases the latency of a transaction... Consequently, in order to reduce latency, replication must be performed asynchronously. This reduces consistency."

    He suggests CAP should be rewritten as PACELC:

    "If there is a partition (P) how does the system tradeoff between availability and consistency (A and C); else (E) when the system is running as normal in the absence of partitions, how does the system tradeoff between latency (L) and consistency (C)?"

    Tom's talk covered many other interesting topics and is worth catching if you can. One of the other many interesting things Tom talked about was Vector Clocks. I won't go into that here, but they are worth reading up on! Here is a potential place to start...

    The CAP Theorem and MongoDB

    29 April 2012

    This week I learned some things about MongoDB. One of them was about how it fits in with the CAP theorem.

    They say a picture is worth a thousand words, and I think this diagram, drawn by my excellent new colleague Mat Wall while he was explaining it to me, says everything:

    Over and out.

    OK, perhaps I can offer a tiny bit of exposition.

    The CAP Theorem is: where C is consistency, A is availability, and P is partition tolerance, you can't have a system that has all three. (It gets to be called a theorem because it has been formally proved.)

    Roughly speaking:

    If you have a web app backed by a SQL database, most likely, it is CA.

    It is C because it's transaction-based. So when you update the database, everything stops until you've finished. So anything reading from the database will get the same data.

    It can be A, but it won't be P because SQL databases tend to run on single nodes.

    If you want your application to be P, according to the CAP theorem, you have to sacrifice either A or C.

    With MongoDB, in order to gain P, you sacrifice C. There are various ways to set it up, but in our application we have one master database, which all writes go to, and several secondaries (as can be seen from the diagram: M is the master, the Rs are the secondaries – also called replicas, or slaves). Reads may come from the secondaries. So it is possible that one or more of the secondary nodes could be disconnected from the application by some kind of network failure, but the application will not fall over, because the read requests will just go to another node. Hence P.

    The reason this sacrifices C is that the writes go to the master and then take some time to filter out to all the secondaries. So C is not completely sacrificed – there is just a possibility of some delay. We are not allowing a situation where the secondaries are permanently out of sync with the master – there is "eventual consistency".

    So you might use this in applications where, for example, you are offering the latest news story. If User A gets the latest news 10 seconds earlier than User B, this doesn't really matter. Of course, if it was a day later, then that would be a problem. The failure case of C is just around the time of the write and you want to keep that window of consistency small.

    There is also a concept of durability, which you can also be flexible with.

    Take the following two lines of pseudocode:

    1. insert into table UNIVERSAL_TRUTHS (name, characteristic) values ('Anna', 'is awesome')
    2. select characteristic from UNIVERSAL_TRUTHS where name = 'Anna'

    What we're saying when we sacrifice consistency is: if I run these two lines on the same node, then when I run line 2 I can be sure it will return 'is awesome'. However, if I run line 2 on a different node, I can't be sure the write has propagated there yet. It will still be "eventually consistent", so if I run it later (and it hasn't been changed again in the interim) it will at some point return the correct data.

    However, you can also configure MongoDB to be flexible about durability. This is where, even if you run the two lines of code on the same node, it might be the case that line 1's write has not actually been persisted – and possibly never will be – so line 2 may not return the data at all. You might do this, for example, if you were storing analytics. If you are looking for general trends, it might not matter so much if 1% of the transactions fail, so you might configure it to be flexible on durability. Of course, you wouldn't do that for something as crucial as characteristics about Anna.

    What Should Your Work Log Tell You?

    15 April 2012

    We've got some great data for the SPA2012 workshop on JavaScript visualisations. Hibri has managed to obtain three years' worth of 7Digital's work-tracking data and Rob has very kindly allowed us to use it (anonymised, of course).

    We're going to use this dataset to show different information using different libraries for the visualisations.

    What I'd really like your input on is, what would you like to know? If this was your work-tracking data, what questions would you want it to answer?

    For example, we could create a Sunburst diagram using JIT to show what proportion of total development time is spent on the four different applications. We could create an interactive line graph using Raphael to show the average time a piece of work takes, and how that average has changed over the past three years.

    A little more info about the exact data is below, or feel free to skip straight to the example questions, or just go right ahead and tweet me or email me (at this domain) with what kind of information you'd want from a work log stretching back that far. You might get a great visualisation prepared for you to show just that!

    The Data

    Here is an example:

    "Feature ID": "A-321",
    "Application Name": "Project A",
    "Type": "Bug",
    "T-Shirt Size": "S",
    "Added": "15/03/2011 12:12",
    "Prioritised, Awaiting Analysis": "15/03/2011 12:12",
    "Analysis Completed": "15/03/2011 12:12",
    "Development Started": "15/03/2011 13:21",
    "Systest Ready": "15/03/2011 16:21",
    "Systest OK": "17/03/2011 10:34",
    "In UAT": "17/03/2011 10:34",
    "Ready For Release": "17/03/2011 10:34",
    "In Production": "17/03/2011 10:35",
    "Done": "22/03/2011 10:48",
    "Lead Time": "2",
    "Cycle Time": "2",
    "Development Cycle Time": "2"

    So this is a really rich dataset. We have all the dates that various stages of the release process happened, as well as some information about what application it was for, what type of work it was (e.g. Bug, Feature, Build Maintenance), what the rough estimate was (T-shirt size), and how long the whole process took, both from first raised to done (lead time) and from dev start to done (development cycle time). The log goes back to 2009.

    Example Questions

    Here are a few of the questions we've thought about answering.

    Your Chance for a Hand-crafted Visualisation Just for You!

    We'd love this workshop to be as relevant and interesting as possible, so please do let me know what kind of information you'd like to see visualised and it may well make it into the talk! Tweet me or email me at this domain. Thank you!


    20 February 2012

    My goal is to be a better developer.

    A few weeks ago, I had an appraisal. There was a lot of positive feedback and also some really useful suggested areas for development. With the help of my good friend and unofficial mentor Dolan O'Toole, I pulled together some practical suggestions as to how I could improve in those areas:

    I then drew up a rough syllabus of the next steps I should focus on:

    I'd be interested if any of you have suggestions for additions to this? What should a great developer know?

    Would Have Been Useful to Know at My First Game Jam

    14 February 2012

    I had no idea what to expect at my first game jam, and I was nervous about whether I'd be able to participate. I'm a programmer, but I'd never written a videogame, and I don't even much play videogames.

    As it turned out, everyone was really friendly and welcoming and I was able to get totally stuck in. But there were a few things that I didn't know that everyone else seemed to know, and I was too shy to admit I didn't know either. Now, following my new and brilliant strategy of "ask as many questions as you can, the stupider the better" (more on that in a later post), I have managed to learn some of these things, and I'll share them here. They are the game engine, game mechanics, and a little bit about art.

    The Game Engine

    When writing video games, people talk about the "engine". For example, "What engine are you using?" and "Don't write your own engine; this is not 'engine jam' ".

    A game engine is something that handles the nuts and bolts of a game for you, which basically comes down to displaying the graphics, dealing with input (keyboard, joystick etc), and handling collisions. For example, in a game, when your player runs into a baddie you might want them to bounce off each other at an equal pace. Using a game engine means you can just specify the details of the collision. Not using a game engine means you have to write code to describe the angles and velocities yourself.

    Unity is a game engine that handles all this stuff for you. You can copy your art into the folder structure and then drag and drop it to where you want it to be. Game objects have default collision behaviour and you can just write a script to change it.

    Flixel is another game engine. It's a level below Unity – there's no graphical user interface, and you have to do a bit more work with the graphics, but the collisions, keyboard input and various other features are there for you.

    A Note on Writing Your Own Game Engine

    I didn't understand at the time why my second game jam was so much harder than my first, but when I explained what we did to the brilliant Joe Halliwell afterwards, it all became clear. We disobeyed the advice. We wrote our own game engine.

    To be precise, we used OpenGL, which is a 2D/3D graphics API with a C interface. It gives the instructions to the graphics card – you definitely don't want to be doing that yourself, at least not in a 48-hour game jam. The code was written in Python, and for the binding we used PyOpenGL. For the physics we used Box2D. OK, so we didn't exactly write our own game engine from scratch, but we assembled one from various constituent parts.

    The alleged reasons were that we didn't know Unity, most of us knew Python (except for me – that added another layer of complexity to the weekend!) and between us we had two Macs, two Linux boxes and one loser running Windows. The real reason is that the brilliant Trevor Fountain loves this stuff and, if it existed, would sign up to 'engine jam' like a shot.

    Game Mechanics

    Another phrase I heard a lot and pretended to understand was "game mechanics". To be fair, I thought I did understand it – I thought it was jumping, shooting etc. I was wrong.

    Game mechanics are more like constructs of rules. Unlocking an achievement is a game mechanic. Going up a level when you reach a certain number of points is a game mechanic. Dying when you lose 90% of your health bar is a game mechanic. So when someone asks "What are the game mechanics?", they mean something like, what are the gamey things here? What makes this a game, rather than just graphics moving across a screen?

    I like this list of potential game mechanics.

    A Little Bit About Art

    While I'm here, just a few words on pixel art vs vector graphics, something else I heard talked about which added to the background radiation of feeling I didn't quite understand what was going on.

    There's an extremely detailed article on the differences here but in very brief summary – technically speaking, pixel (aka bitmap, raster) art is basically dot-matrix, i.e. made up of pixels individually coloured, and vector art is created using mathematical functions and is basically points and the lines joining them. The art for your game may be either, or some combination of the two. A brief rule of thumb is that old games tend to use pixel art, but the choice is up to you – different tools for different times.

    However, it's more than just the technical differences, and the conversations you hear about pixel vs vector art will probably reflect something other than the strict technical distinction. Trevor summed it up for me very clearly, so here is an edited version of what he said:

    "To me, pixel art is more of an aesthetic choice than a technical term. It's true that older games had a pixel art aesthetic, especially on 8- or 16-bit platforms. Today, though, I think people use the terms 'pixel art' and 'retro' more or less interchangeably – games like Canabalt or The Last Rocket are undeniably retro, and use the pixel art style to great effect. Games like Angry Birds are rendered using the same technology (2d sprites), yet no one would ever say that Angry Birds qualified as pixel art.

    "As for vector art, true vector images (e.g. svg) are rarely used outwith UI design and Flash."

    Next steps

    How to Make Your First Videogame by the excellent v21 is great, and the short video You Can Make Video Games is fab and really worth watching. Or why not just sign up to the next Global Game Jam?


    JavaScript Talk Takes Shape

    21 January 2012

    Just over a week to go until the final deadline for the SPA submissions, and the JavaScript visualisations talk is shaping up.

    The plan is to demonstrate the use of a JS library to produce a certain kind of visualisation, followed by a 15-20 minute exercise where participants use the library to create a similar visualisation. I've said "rinse and repeat 4-5 times", though I'm slightly dubious as to whether we're going to have time to do more than three. Hibri and I are going to do one each and demo to the other, so we'll have a better idea then of how many we can cover. I'm really excited about what we're going to produce!

    At the moment we are planning to try and do:

    Is there a type of visualisation you'd really like to see us demonstrate, or a library you think we can't miss out? Please do let me know.

    There is also still time to give feedback directly on this and all other proposals for SPA 2012 here (SPA login required).

    Preparation Begins for the JavaScript Talk...

    18 December 2011

    As part of preparing the proposal for the JavaScript visualisations talk, I had a coffee with Ian Alderson. He did an excellent talk at SPA2011, along with Adam Iley, on HTML5. The presentation is still available online (using HTML5 for the slides!) here – I really recommend you check it out.

    Although Ian doesn't have time to co-pilot on the talk, he very generously gave up his time to give me some advice. Notes below. The first suggestion of his that I plan to follow is a comparison between SVG and Canvas – as he pointed out, all JS visualisation libraries will be based on one or the other, so that will definitely be a factor in what library to choose. It's not the case that one is just better than the other – both have strengths and weaknesses. Watch this space for more on this topic...

    JavaScript Visualisations Proposal

    30 November 2011

    I have proposed a talk for SPA2012 on JavaScript visualisations.

    The basic outline of the proposal is a workshop where we will demonstrate the use of different JS libraries to visualise different kinds of data, with several 15-20 minute exercises where participants use the library to create a similar visualisation. Ideally participants would leave with an overview of how and when to use the libraries and tools we've covered, and an idea of what factors to think about when choosing a JS data visualisation tool.

    Still in the very early stages of planning it, but I am in the process of moving this site over to a host that will support WordPress, at which point I will be able to invite comments/suggestions. In the meantime, any suggestions are very welcome, by email or twitter.

    Notes from the first planning meeting, with Hibri:

    My Continuing Struggles with Linux

    17 October 2011

    Summarised excellently by this cartoon. It's great fun, but...

    Sorl-thumbnail Comes Back to Bite Me

    12 September 2011

    Well, the lack of sorl-thumbnail came back to bite me when I pulled the latest version of the Django project I've been working on. Previous releases had lulled me into a false sense of security, but then, updating a bit too late in the evening, I hit that old error:

    ImportError: No module named sorl-thumbnail

    So off I went again. I had no problem downloading sorl-thumbnail but everywhere I went wanted to put it into the wrong directory. It's not in the Synaptic Package Manager, and I couldn't find a PPA. Just to make it a bit more exciting, the dependency of the project I needed it for was something like version 3.2.5. The current version is 11.0.4.

    Ideally what I wanted to be able to do was set some kind of flag on the install that told it to install somewhere else. Surely there is a way to do that? But I couldn't find it. I had got as far as actually (again!) RTFM, and was browsing through the Core Distutils functionality page when lo and behold, I stumbled upon this. Not directly relevant you might think – but wait, item 13? What's this?

    Yes, to cut a long story short, I had discovered symlinks. New to me, old to everyone else. It was short work to set one up, helped by this, and lo and behold: it works.
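    For anyone else new to them, the shape of the fix looks like this – all the paths here are made-up stand-ins, not the real sorl-thumbnail locations:

    ```shell
    # Made-up stand-in paths: a package unpacked in one place, Python
    # expecting to find it somewhere else entirely.
    mkdir -p /tmp/symlink-demo/src/sorl /tmp/symlink-demo/dist-packages
    ln -sf /tmp/symlink-demo/src/sorl /tmp/symlink-demo/dist-packages/sorl
    # The link now behaves like the real directory:
    ls -ld /tmp/symlink-demo/dist-packages/sorl
    ```

    Nothing gets copied or moved – the link is just a pointer, so the old version the project wants can live wherever it likes.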

    Kind of begs the question as to why I didn't do that in the first place for all the packages... but never mind. Onwards and upwards!

    How to Write Efficient CSS – Evaluation of CSS

    29 August 2011

    I recently did a knowledge share at work on how to write efficient CSS, and one of my colleagues – a senior developer with a lot of experience – said "I felt the penny drop". High praise indeed, so I thought it was worth reproducing it here.

    The write-up is pretty long, so I've divided it into two posts: Basic CSS, in which I cover:

    1. CSS
    2. Selectors
    3. The Cascade
    4. Inheritance
    and Evaluation of CSS in which I cover how CSS is evaluated and some things we can do to make it more efficient.

    How CSS is Evaluated

    The key information is these two points:

    1. For each element, the CSS engine searches through style rules to find a match.
    2. The engine evaluates each rule from right to left, starting from the rightmost selector (called the "key") and moving through each selector until it finds a match or discards the rule.

    1. For each element, the CSS engine searches through style rules to find a match

    The style system breaks rules up into four categories by key selector.

    1. ID Rules – for example button#backButton { ... }
    2. Class Rules – for example button.toolbarButton { ... }
    3. Tag Rules – for example treeitem > treerow { ... }
    4. Universal Rules – all other rules

    The CSS engine then takes each html element in the document in turn. If it has an ID, then the engine searches through the style rules and checks rules that match that element's ID. If it has a class, only Class Rules for a class found on the element will be checked. Only Tag Rules that match the tag will be checked. Universal Rules will always be checked.

    2. The engine evaluates each rule from right to left...

    So in the example button#backButton { ... } the key is the id "backButton". In the example I give in Basic CSS the key is the class "blog-post".

    The engine starts with the key and then evaluates the rule from right to left. So if you have a button with an id of "backButton", the engine first matches the id and then compares the next selector – is the element a button? In the example from Basic CSS the evaluation of the second selector, ul#nav.dhtml li a.blog-post, is as follows. Does the element have a class of blog-post? If so, is it a link? If so, is there anywhere in its ancestry a list item? If so, is there anywhere in the ancestry of that list item an unordered list element with a class of dhtml and an id of nav?
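    To make the right-to-left walk concrete, here is a toy sketch in Python. The dict-based elements and the predicate functions are all made up for illustration – a real CSS engine is implemented nothing like this – but the matching order is the one described above:

    ```python
    # Toy model of right-to-left matching for descendant selectors.
    # Elements are plain dicts; predicates stand in for simple selectors.

    def matches(element, selectors):
        """selectors: predicates ordered rightmost-first, so selectors[0]
        is the key. The key must match the element itself; each further
        predicate must match some ancestor, moving outward."""
        key, *rest = selectors
        if not key(element):
            return False          # key fails: rule discarded immediately
        node = element.get("parent")
        for pred in rest:
            while node is not None and not pred(node):
                node = node.get("parent")
            if node is None:
                return False      # ran out of ancestors: no match
            node = node.get("parent")
        return True

    # Model the tree <ul id="nav" class="dhtml"><li><a class="blog-post">
    ul = {"tag": "ul", "id": "nav", "classes": {"dhtml"}, "parent": None}
    li = {"tag": "li", "classes": set(), "parent": ul}
    a = {"tag": "a", "classes": {"blog-post"}, "parent": li}

    is_blog_post_link = lambda e: e["tag"] == "a" and "blog-post" in e["classes"]
    is_li = lambda e: e["tag"] == "li"
    is_nav_ul = lambda e: (e["tag"] == "ul" and e.get("id") == "nav"
                           and "dhtml" in e["classes"])

    # ul#nav.dhtml li a.blog-post, evaluated right to left:
    print(matches(a, [is_blog_post_link, is_li, is_nav_ul]))   # True
    print(matches(li, [is_blog_post_link, is_li, is_nav_ul]))  # False: key fails
    ```

    Notice how much work the positive case does: every ancestor has to be walked and checked before the rule can be applied.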

    You may be getting a slight clue now as to why I think that selector is inefficient.


    There are some obvious ways we can improve things here.

    These are just recommendations as to how to write the CSS. If you care about your website's performance you should already be minifying and gzipping your CSS. So how much of an edge will these recommendations give you?

    CSS and Performance

    CSS and performance is a fairly hot topic right now, especially with all the cool things that you can do using CSS3. Dave Hyatt, architect for Safari and Webkit, said "The sad truth about CSS3 selectors is that they really shouldn't be used at all if you care about page performance." (The comment can be found here).

    That's certainly something I've heard, for example at conferences. However, another web giant, Steve Souders (works at Google on web performance) has a different opinion. It's worth reading the piece in full (there are charts and everything!), but the takeaway here is: "On further investigation, I'm not so sure that it's worth the time to make CSS selectors more efficient. I'll go even farther and say I don't think anyone would notice if we woke up tomorrow and every web page's CSS selectors were magically optimized."

    So why am I bothering with this? Well, a few reasons. One is I think it's always worth taking the time to find out how things work and I'm glad to be able to make a reasoned judgment on this.

    But also, there are various things to consider when thinking about writing efficient CSS. There is performance, obviously, but two other major concerns are ease of writing and (in my view more important) ease of reading. I love clean code, and I think it's crucial that code is easy for other developers to read. I'm not sure I care whether or not ul#nav.dhtml li is performant or not, it's certainly not clean, and it took me some brow furrowing to work out when it would apply. So personally, I'm going to follow the rules I've outlined above. What do you think? I'd love to hear your opinion.

    Useful Further Reading

    I gleaned most of the information about how the CSS engine works from Page Speed documentation and the Mozilla blog. I couldn't find any information about how IE evaluates CSS, please do let me know if you have any.

    To take things further with writing nice CSS, you could look at Compass/SASS, or my current favourite, OOCSS.

    How to Write Efficient CSS – Basic CSS

    06 August 2011

    I recently did a knowledge share at work on how to write efficient CSS, and one of my colleagues – a senior developer with a lot of experience – said "I felt the penny drop". High praise indeed, so I thought it was worth reproducing it here, even though other similar write-ups are available on the internet.

    The write-up is pretty long, so I've divided it into two posts: Basic CSS, in which I cover:

    1. CSS
    2. Selectors
    3. The Cascade
    4. Inheritance
    and Evaluation of CSS in which I cover how CSS is evaluated and some things we can do to make it more efficient.


    For those who have no idea at all what CSS is, I can recommend nothing better than the excellent w3schools intro. W3schools is always my first port of call for definitive answers on html, CSS and JavaScript. I will assume that you at least know that CSS stands for Cascading Style Sheets, and realise why separating style from html can make development easier.


    CSS is applied to html elements using selectors. The syntax of CSS is:

    selector { CSS property : value; }

    If you want to add comments, use /* comment */ – single-line // comments will cause your CSS to fail silently; all you'll see is your website looking wrong.

    Let's look at some more details of how CSS is written, using this example:

    a.blog-post, ul#nav.dhtml li a.blog-post {
        display: block;
        float: right;
        height: 24px;
        width: 24px;
        margin: 3px 3px 0 0;
        padding: 0 0 0 0;
        background: url(../images/film_help.png) no-repeat;
    }

    1. Padding and the box model

      If you've ever worked with CSS at all, no doubt you have fiddled around changing the padding or the margin without actually knowing why. I certainly did. Until I saw this diagram, which suddenly made everything clear:

      Have a look at the page on w3schools, but basically the margin is how far the box is from other elements, and the padding is how much space you'd like between the inside of the box and the content. Genius. And obviously, although not given in the CSS example above, you can manipulate the border too, as I have done in my code samples on this page, for example border-style: dashed;

    2. Padding and margin shortcuts

      It is possible to specify each direction of the padding, for example padding-left:3px. However you can also use a shortcut, as has been done in the example. The order is like a clock: Top, Right, Bottom, Left; so an element with the CSS in the example applied to it will have a margin on the top and the right. Another way to remember this is by remembering that if you don't get it right there will be TRouBLe. :o)

      However, you can make it even shorter than that. Three values set the top to the first, the right AND left to the second and the bottom to the third. For example:

      margin:10px 5px 15px;
      top margin is 10px
      right and left margins are 5px
      bottom margin is 15px

      Two values set the top and bottom to the first, and the right and left to the second, and one value sets them all to the same. Here if you didn't follow that.
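      The shorthand forms side by side (all the values and class names here are hypothetical, just to show the equivalences):

    ```css
    .four  { margin: 3px 3px 0 0; }    /* Top, Right, Bottom, Left */
    .three { margin: 10px 5px 15px; }  /* top 10px, right and left 5px, bottom 15px */
    .two   { margin: 10px 5px; }       /* top and bottom 10px, right and left 5px */
    .one   { margin: 10px; }           /* all four sides 10px */
    ```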

    3. Ems

      Sizes can be given in pixels, ems or percentages. Pixels you know. Percentages are as a percentage of the enclosing elements (see inheritance). Ems are also calculated as percentages, where the default for most browsers is that 1em = 16px. However (of course) there is a slight twist in that ems are not calculated correctly in IE, so .85em is smaller than 85%. The w3schools recommendation is to use a combination of percentages and ems. The developer tools are invaluable in this respect and there is a useful discussion here.

      In the meantime you might want to remember this hilarious joke: Q. What did one em say to the other em? A. Who's your daddy?
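      A typical combination, assuming the common 16px browser default (selectors hypothetical):

    ```css
    body { font-size: 100%; }     /* 100% of the browser default, usually 16px */
    p    { font-size: 0.875em; }  /* 0.875 × 16px = 14px */
    ```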

    4. No-repeat

      A brief word on the image. background: url(../images/film_help.png) no-repeat;. The url is a link to where you have stored the image. no-repeat means you want it to appear once. The default is for it to repeat both horizontally and vertically. A nice trick for filling in a background is to create a slim image with the gradiations you want and then repeat-x so it fills the div.

      An aside: another way to include images is to instead use a data url. This is where you encode the image inline. The advantage of this is that it doesn't require an extra HTTP request to get the image, and the key to a faster website is minimising the HTTP requests. However, there are reasons (other than the ubiquitous incomplete cross-browser support) why you might not want to use data urls – a good discussion can be found here. Other ways to minimise the HTTP requests associated with loading images can be found here.
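      The gradient trick mentioned above, sketched out (the image path and class name are hypothetical):

    ```css
    /* A 1px-wide strip containing the vertical gradient, tiled
       horizontally so it fills the whole width of the div */
    .banner { background: url(../images/gradient-strip.png) repeat-x; }
    ```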

    5. The optional semi-colon

      The last semi-colon is optional. In fact, the structure of a CSS rule is selector { CSS property : value } – you only need the semi-colon if you have subsequent instructions. I always put it in anyway though, there's no reason not to.


    There are loads of selectors in CSS. For now, I'll just talk about the most common:

    html element

    For example, p { color: red; } would make all text in paragraphs red.


    #id

    Giving the ID makes one specific element behave in a certain way. It is certainly the most efficient selector (see how it's evaluated), but it is also slightly pointless. The point of style is to make a website look and feel consistent – with your brand, with itself, etc. IDs are (supposed to be) unique on a page. So why would you want to style just one thing?


    .class

    The most common selector, and the one I think we should all be using most of the time.


    *

    The universal selector. Some interesting discussion here, but in summary it is the least efficient selector and also allows you to bypass inheritance. I say AVOID.


    x, y

    This means, apply the style you are outlining to all xs and all ys. For example p, .blog-intro { font-weight: bold; } makes the text in all paragraphs bold AND the text in all elements with the class "blog-intro".

    x y

    The descendant selector. This matches any y which is a descendant of any x. For example p .blog-intro { font-weight: bold; } makes bold the text of any element with the class "blog-intro" inside a paragraph – even if it is nested several layers deep. It is very inefficient – see the later post on how CSS is evaluated.
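    To make the reach of the descendant selector concrete (the markup here is hypothetical):

    ```css
    /* Given <p><span><em class="blog-intro">…</em></span></p>, this
       matches the em even though it is nested two levels inside the p */
    p .blog-intro { font-weight: bold; }
    ```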

    x > y

    The child selector. This matches any y which is a child of any x. This is also inefficient, though less inefficient than the descendant selector.

    a.blog-post, ul#nav.dhtml li a.blog-post

    So, to return to the selector in the example, which I have so far ignored.

    To spell it out, the style outlined here will be applied to any link with the class of blog-post, AND any link with the class of blog-post which is a descendant of a list item which is itself a descendant of an unordered list with the id of nav and a class of dhtml.

    Pretty complicated. Over-complicated, I will argue. But just to note – the reason you might want both of these (you might think the first would cover the second) is because you may well have one or more intermediate styles that have altered the link in the list item further. For example if you have styled ul#nav.dhtml differently, the link will appear differently unless you reset it here.

    In my view, examples like this are why people do not like CSS. Oh, and this is a real example from a live application, by the way. Names slightly changed to protect the innocent.

    The Cascade

    As hinted at above, you can have multiple styles applied to an element, and the cascade is what determines which one is actually applied. CSS styles are cascaded in the following order, with the one applying last being the one that sticks:

    1. Origin (i.e. Browser then Author)
    2. Importance (normal vs !important)
    3. Specificity
      1. Inline
      2. Highest number of id selectors
      3. Highest number of class, attribute, or pseudo classes
      4. Highest number of elements and pseudo elements
    4. If still the same, the order in which they appear.

    Here for (very dense) more.


    Selectors are considered to be specific in the following way (ordered from the least to the most specific):

    1. The type selector
    2. The descendant selector
    3. The adjacent sibling selector
    4. The child selector
    5. The class selector
    6. The attribute selector
    7. The ID selector
    Here for more.
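    As a rough illustration of how specificity comparisons can be reasoned about, here is a toy counter in Python. It only counts ids, classes and type selectors – ignoring attribute selectors, pseudo-classes and the inline/!important levels discussed above – so it is a simplification for intuition, not a reference implementation:

    ```python
    import re

    # Toy specificity counter: returns (ids, classes, elements).
    # Tuples compare left to right, so one id beats any number of classes.
    def specificity(selector):
        ids = len(re.findall(r"#[\w-]+", selector))
        classes = len(re.findall(r"\.[\w-]+", selector))
        # Type selectors: letters that begin a simple selector
        elements = len(re.findall(r"(?:^|[\s>+~])([a-zA-Z][\w-]*)", selector))
        return (ids, classes, elements)

    print(specificity("ul#nav.dhtml li a.blog-post"))   # (1, 2, 3)
    print(specificity("p") < specificity(".blog-intro"))  # True: class beats type
    ```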


    The order, listed from the least precedence to the most precedence:

    1. Browser default
    2. External style sheet
    3. Internal style sheet (i.e. written in the head)
    4. Inline style (i.e. inside an HTML element)


    An external stylesheet will override an internal stylesheet if it's linked to in the head after the internal style sheet.

    And the most common trip-up in legacy code: the same selector later in a CSS file will override one written higher up in that file.
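    For example (class name hypothetical):

    ```css
    .warning { color: red; }
    /* …hundreds of lines later in the same file… */
    .warning { color: orange; }   /* same specificity, so this one wins */
    ```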


    You will have seen that !important takes precedence in the cascade. This is what you add if you can't get the style you want using the normal rules of cascade and inheritance. Don't use it unless you really have to – to me, it says "I don't know how CSS works so I'm just going to hack it."

    Developer Tools

    All modern browsers have developer tools so you can examine elements to see what CSS is being applied, and in a surprising departure from the norm, the best one in my opinion is the one available in IE when you press F12.

    For example, in this screenshot you can see that the colour being applied here is the one from the class ul.dhtml#nav not the one specified for body or the one specified for body.mainframe.


    I could not do a better job of writing up inheritance than this company presentation. I really recommend you go and click through it. It starts very simple and you may think it's a waste of your time but if you are not 100% sure how inheritance in CSS works then this will really help.

    A brief summary though if you really can't be bothered: Elements inherit styles from the elements within which they are nested. So in the following html, if you apply a style of font-weight: bold to p, then the text inside the span will also be bold.

    <p>I'm bold. <span>And so am I!</span></p>

    Read on for how to write it efficiently!

    A Voyage of Discovery – Upgrading Firefox Part 2

    27 July 2011

    Right – it turned out my next step was not a chat with Joe Halliwell but instead with my good friend and mentor, Dolan O'Toole. And what I thought might be a voyage of discovery into how to make my installation of firefox run like a normal program turned out to be just a short trip: I hadn't actually been looking in the right places.

    After helpfully pointing out that I "sound like a linux newbie :)" (yes, yes I am), Dolan explained where I'd gone wrong: "You followed the instructions as they are. The only thing you weren't aware of is that most of the time, someone would have done a build of a program for ubuntu and put it in an apt repository, usually on launchpad. I usually search for something like "firefox 5 ubuntu 10.10" if I need to install something before I resort to manual installations."

    Good tip, and here it is:

    So – I have learned that it's not quite as straightforward as finding the instructions and then following them – you also have to know stuff. That's OK though. And at least I have firefox the way I want it now. I'm sure there'll be another opportunity to get my hands dirty...

    Upgrading Firefox – part 1...

    20 July 2011

    So. Not content with the Ubuntu version of Firefox which is at 3.6.18, I decided that I wanted Firefox 5.0. Heck, I want 5.0.1.

    So I followed these instructions.

    To no avail. I followed them to the letter (or so I thought), but when I finished up by running firefox, to my joy, Firefox popped up – and, moments later, to my horror, it was still 3.6.18.


    Well, there were lots of reasons why, as I discovered when I uninstalled Firefox 3.6.18.

    First of all, when it said "Run the following shell script" I just typed it into the console window. That did not work. Don't sigh, I'm new to this!

    So I created the shell script and saved it. Following, incidentally, instructions here – the bit I was missing was the permissions: chmod +x

    [NB. I couldn't agree more with the grateful commenter on that link: "Everybody just writes "run the shell script" but a complete beginner doesn't know, that shell scripts have to be executable and are started with ./"]
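    For other complete beginners, the whole dance looks something like this – the script name and contents are made up; the point is the chmod and the explicit path:

    ```shell
    # Save a script, make it executable, then run it with an explicit path
    cat > /tmp/demo.sh <<'EOF'
    #!/bin/sh
    echo "it runs"
    EOF
    chmod +x /tmp/demo.sh   # without this: "Permission denied"
    /tmp/demo.sh            # or ./demo.sh from inside /tmp
    ```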

    OK I've run the script woo hoo! "The firefox command in your ~/bin directory will now run Firefox with the mozilla-build profile." I have the file, the file contains another script, that script should allow me to start firefox, right?


    bash: /usr/bin/firefox: No such file or directory

    Well that, presumably, is why running the firefox command before started up the old firefox. (So – hey – I could have just run the script in the command line like I wanted!)

    So, I decided to roll back a step and delete the new ~/bin directory – I didn't like it anyway. I discovered I could actually run Firefox, from the firefox directory, by running this command: exec "/home/anna/firefox/firefox" -P mozilla-build "$@" . So I wrote a shellscript to do that.

    So now I can run firefox. It's 5.0.1. But I have to run it from the command line. Closing Firefox doesn't close the terminal window, but that terminal window is taken up with running firefox (maybe there's a way round that like with gedit &).

    A few other things. The firefox running remembers passwords. It remembers what I've set my home page to. But it doesn't start up on the home page even though I've asked it to. Agreeing to restart it does so. Behind the scenes, there are a couple of errors:

    (firefox-bin:1923): Gtk-WARNING **: Loading IM context type 'ibus' failed
    (firefox-bin:1923): Gtk-WARNING **: /usr/lib/gtk-2.0/2.10.0/immodules/ wrong ELF class: ELFCLASS64

    Ideally I want this firefox to run normally, i.e. be available on the menu, have an icon in the top panel, not to rely on a command window. I suspect my next port of call is to discuss this with Mr Joseph Halliwell. I hope there will be a part 2 where I turn out to have learned loads about this area!

    What I have learned about folder structure today

    22 April 2011

    Easy_install (as it comes) doesn't work for Debian-based systems.

    The default behaviour is to install things to /usr/local/ – this is the GNU default – whereas the synaptic package manager installs things to /usr/. I imagine there may be a way to reconfigure the default behaviour, but I didn't get that far.

    This is how I found out this rather useful piece of information. I had installed easy_install, and then used it to download all the dependencies I needed to get started with this python/Django project I was going to work on. And then – nada.

    For about three hours, no matter what I did, I had this message:

    File "/usr/local/lib/python2.6/dist-packages/Django-1.3-py2.6.egg/django/db/backends/mysql/", line 28, in
        from django.db import utils
    ImportError: cannot import name utils

    My friend who was with me, trying to get me set up to start work on the project, is a Django expert, but he uses Windows. Hey, so do I! What am I even doing using linux? (See sidebar for answer...)

    We tried a lot of things. I won't go into them all here, mainly because I can't remember many of them. We tried a lot of things that people on the internet had suggested. As a last resort, we even tried to RTFM. To no avail.

    After we'd given up and my friend had left, I went back to it to try and figure out what was going on. I even tried to resort to the beginner standard of adding log messages, but all the files I wanted to edit were read-only. Finally, this page offered me a glimmer of a clue.

    Specifically, the question "Where are my site-packages stored?" and the answer:

    python -c "from distutils.sysconfig import get_python_lib; print get_python_lib()"

    Which for me, returns /usr/lib/python2.6/dist-packages

    Hang on though, the error is in /usr/local/lib . . . eh?

    A visit to the python2.6 folder in /usr/local confirmed that all of the stuff I needed was in there. And a call to my incredibly helpful and talented friend Joe Halliwell confirmed my growing suspicion that this was Not Correct. He explained to me the GNU/Debian folder structure differences.

    So, I backtracked. I installed everything I needed (except sorl-thumbnail, which wasn't there. However, all I need to do is find a ppa. There is probably also another way to install it correctly. By the time this is all done, I may even know what it is.)

    I ran through a few errors on the way, but all of the ImportError: No module named django_extensions variety, and solved one by one, satisfyingly, by installing what I needed in the correct way.

    Finally, several hours after my friend came round to work on this, I now have a new error:

    File "/usr/lib/pymodules/python2.6/MySQLdb/", line 170, in __init__
        super(Connection, self).__init__(*args, **kwargs2)
    _mysql_exceptions.OperationalError: (1045, "Access denied for user 'anna'@'localhost' (using password: NO)")

    It's progress.