Originally posted on Hatch Blog
Year 3 for the Hatch Engineering team was a massive year and not what we expected, but then 2020 was not what anyone expected. Hatch launched the Exchange within weeks of the COVID-19 lockdowns to help stood-down workers find temporary work. We helped over 1,500 people find employment and had over 12,000 people sign up. We then took the learnings from that and brought forward our plans to help companies hire junior talent, not only students, launching a new version of Hatch.
The big highlight of 2020 was how the entire Hatch team pulled together in a herculean effort to build, launch, and operate the Labour Exchange in just a matter of weeks. It felt like the early days again and we generated a huge amount of learning from the experience.
The Exchange provided the team with an opportunity to experiment with low- and no-code solutions and alternate architectures to rapidly get a product to market. We ended up delivering the Exchange with a mix of Airtable and S3 as backend “UI” and data storage. This led us to continue using Airtable when we started building Hatch 2.0 later in the year.
Quality vs Speed Experiment
The focus on releasing the Exchange as quickly as possible gave us an unusual opportunity to build product in a very different way from how we normally do, and to compare the results! We didn’t have any CI/CD, we siloed engineers into the code areas where they were strongest, we limited the amount of collaboration on tech design, and we had minimal automated tests, although we did keep code reviews.
The result was that we delivered customer value the quickest we ever have. We also had more incidents in those few months than we had in the previous 3 years. Engineers had no opportunity to develop into other areas, focusing only on their core strengths; code was siloed and quickly became difficult to change and stressful to deploy; and the support and operational load was far higher.
Despite the stressful and untenable environment it created, it was a great experience for the engineering team and Hatch as a whole to see first hand the trade-offs involved in optimising for velocity. As we pivoted the product later in the year we took a lot of lessons from the Exchange, allowing us to build on top of low-code platforms like Airtable in a way that was more maintainable than our first attempt.
Coming into 2020, building a remote work culture was one of the engineering team’s stated goals:
With flexible work a continuing mega-trend this decade, we want to make sure remote working is built into our culture. Whilst we aren’t remote first Hatch does embrace flexible working (work from home Wednesday!) and a number of the engineering team are planning remote work trips this year. Whilst internally we are comfortable working together remotely, we want to lean into ensuring our remote practices and interactions with the wider team are equally as second nature as it is internally.
COVID-19 was a force multiplier for remote working everywhere. It helped us go deeper on this goal than we thought possible last year. Our previous flexible work arrangement (where everyone works from home on Wednesday) was mainly designed to create space for deep individual work which can be difficult in a busy office. This didn’t force us to really develop remote first tendencies as in-person collaboration was just scheduled on other days. Once thrust into full remote working we needed to develop clear agreements on sync and async communications, and effective ways of collaborating online.
The Engineering team felt far more productive remote than when in the office full-time, but there are still issues to work through. The main issue is maintaining culture and team connection. We tried a number of options, from team Zoom lunches where we all cook the same dish to Friday drinks. Unfortunately, group social video is still not a solved problem, and for the extroverts nothing beats the real thing.
Miro was one of the key contributors to our success during this period, and the wider team has spent a lot of time in it. It’s especially good at facilitating remote collaborative sessions like retrospectives. Presenting ideas and designs through Zoom can still be daunting, though: everyone is often on mute, or giving non-verbal feedback that you can’t see on your screen while you are looking at a Miro board.
After the learnings from the Labour Exchange we accelerated our roadmap into being a platform for junior roles, not just students. Launching the MVP of Hatch 2.0 required re-thinking how we capture, assess and present applicant data and provided an opportunity to fulfil some of our 2020 goals the Labour Exchange had put on hold.
Splitting Site from App
Since I joined Hatch I’ve wanted to split the marketing site from the application code base. As our marketing team grew, so did the support for improving the performance of the site. When we needed to re-launch the marketing site to account for our new offering, it was the perfect opportunity. Based on some rapid prototyping we’d done in NextJs for the Exchange, we decided to use NextJs again for its ability to easily mix statically rendered pages with server-side rendered (SSR) pages.
We use FAB to run the site on AWS CloudFront. It’s still early days for FAB and there are a lot of workarounds to have it run our site, but I love where it’s going and the feature set it opens up.
Engineering and Data Science Responsibilities
We made great progress in defining the responsibilities and boundaries between Data Science and Engineering. After a few iterations on how to integrate Node and Python and where to host the code, we settled on models as a service backed by a virtual feature store (a thin wrapper over product databases for now), with clear definitions of which parts are maintained by Data Science versus Engineering. We’ll outline this in more detail in a future post.
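As a rough illustration of that boundary, the sketch below shows a "virtual" feature store as a thin read-through wrapper over product data, with a model that depends only on the feature store contract. Everything here is an assumption made for illustration: the feature names, the `ApplicantRow` shape, the scoring weights, and the use of a TypeScript function to stand in for what would be a separately hosted model service.

```typescript
// Hypothetical sketch: names, fields, and weights are invented for illustration.
type FeatureVector = Record<string, number>;

// The contract between the two teams: models only ever see features.
interface FeatureStore {
  getFeatures(applicantId: string): FeatureVector;
}

// Stand-in for a row in a product database.
interface ApplicantRow {
  id: string;
  completedAssessments: number;
  yearsExperience: number;
}

// "Virtual" feature store: no storage of its own, it derives features
// from product data at read time.
class ProductDbFeatureStore implements FeatureStore {
  constructor(private rows: Record<string, ApplicantRow>) {}

  getFeatures(applicantId: string): FeatureVector {
    const row = this.rows[applicantId];
    if (!row) throw new Error(`Unknown applicant: ${applicantId}`);
    return {
      completed_assessments: row.completedAssessments,
      years_experience: row.yearsExperience,
    };
  }
}

// Stand-in for a model served behind an API; it knows nothing about
// the product database schema, only the feature names.
function predictMatchScore(store: FeatureStore, applicantId: string): number {
  const f = store.getFeatures(applicantId);
  return 0.5 * f.completed_assessments + 0.25 * f.years_experience;
}

const store = new ProductDbFeatureStore({
  "applicant-1": { id: "applicant-1", completedAssessments: 4, yearsExperience: 2 },
});
const matchScore = predictMatchScore(store, "applicant-1");
```

The point of the indirection is that the wrapper can later be swapped for a real feature store without the model code changing.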
“Right-sized” Services and Orchestration
Hatch 1.0 was opinionated about how a hiring journey should work and we didn’t invest heavily in edge cases or unhappy paths. This made experimenting with new recruitment processes, like 2.0 or the Exchange, difficult. It made sense at the time, as we were building for a very specific use case, but as we moved into 2.0 we wanted to be able to test a number of different ways to add value to the recruitment process. We needed a more flexible set of tools we could orchestrate into processes to serve the current hypothesis.
This allowed us to accelerate our move to a “right-sized” services (rather than microservices) architecture: building services with clear data boundaries and well-defined responsibilities that we glue together, where needed, with orchestration services that encode the particular workflow we are using. Rather than making each component service know about its neighbours, each one provides an API and a set of events, which the orchestration service uses to wire workflows together.
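A minimal sketch of that shape, under assumptions: the event names, the two component services, the pass mark, and the in-memory `EventBus` are all hypothetical and stand in for whatever services and transport are actually used. The component services never reference each other; only the orchestration class knows the workflow.

```typescript
// Hypothetical sketch: service names, events, and the pass mark are illustrative only.
type DomainEvent =
  | { type: "ApplicationSubmitted"; applicantId: string; roleId: string }
  | { type: "AssessmentScored"; applicantId: string; roleId: string; score: number };

type Handler = (event: DomainEvent) => void;

// In-memory event bus standing in for a real event transport.
class EventBus {
  private handlers: Handler[] = [];
  subscribe(handler: Handler): void {
    this.handlers.push(handler);
  }
  publish(event: DomainEvent): void {
    for (const handler of this.handlers) handler(event);
  }
}

// Component service: owns assessment scoring, knows nothing about shortlists.
class AssessmentService {
  constructor(private bus: EventBus) {}
  score(applicantId: string, roleId: string, score: number): void {
    this.bus.publish({ type: "AssessmentScored", applicantId, roleId, score });
  }
}

// Component service: owns shortlists, knows nothing about assessments.
class ShortlistService {
  private shortlists: Record<string, string[]> = {};
  add(roleId: string, applicantId: string): void {
    const list = this.shortlists[roleId] || [];
    list.push(applicantId);
    this.shortlists[roleId] = list;
  }
  forRole(roleId: string): string[] {
    return this.shortlists[roleId] || [];
  }
}

// Orchestration service: the only place that encodes this particular workflow.
class HiringWorkflow {
  constructor(bus: EventBus, private shortlist: ShortlistService, private passMark: number) {
    bus.subscribe((event) => this.handle(event));
  }
  private handle(event: DomainEvent): void {
    if (event.type === "AssessmentScored" && event.score >= this.passMark) {
      this.shortlist.add(event.roleId, event.applicantId);
    }
  }
}

const bus = new EventBus();
const assessments = new AssessmentService(bus);
const shortlists = new ShortlistService();
new HiringWorkflow(bus, shortlists, 70);

assessments.score("applicant-1", "role-a", 82); // above the pass mark
assessments.score("applicant-2", "role-a", 55); // below the pass mark
```

Trying a different process then means writing a different orchestration class against the same services, rather than changing the services themselves.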
We continued building our services with Lerna in our monorepo which has been working well. We are still looking for a better solution or tooling for deploying the changed set of services more efficiently.
We took our learnings with Airtable from the Exchange and applied them to our 2.0 MVP to rapidly spin up the operational aspects of role assessment definition and fulfilment. It allowed us to rapidly test data structures and hypotheses, and to iterate before we commit to building out bespoke product. We are even doubling down on it this year with custom Airtable apps to provide better UI over the data. We know we eventually need to replace this, but we hope to have iterated enough that we will really understand our needs once we start building.
We also started trialling Webflow to empower operations and marketing to own their own content and rapidly iterate. It’s still early days, and the limit of a single concurrent user in the designer is proving to be a blocker, so it may not be Webflow in the future, but empowering the entire team to iterate quickly without engineering is proving invaluable.
A UI challenge with service-oriented architectures is that a piece of UI always needs more data than a single service can provide. We end up writing facades, either as a service or in the UI itself, to aggregate the needed data. The promise of GraphQL alleviating this issue, and throwing in change notifications for free, has been something the team has wanted to test out for a while. This year we built our new version of the internal scoring tool, Astria, with GraphQL (AppSync) to kick the tyres in a contained way.
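To make the facade idea concrete, here is a small sketch of one function that aggregates data from two services into the single shape a piece of UI needs. The client interfaces, field names, and `ApplicantSummary` view model are assumptions for illustration, and the calls are synchronous in-memory stubs where real service calls would be asynchronous requests.

```typescript
// Hypothetical sketch: each client interface stands in for one service's API.
interface Applicant {
  id: string;
  name: string;
}
interface Score {
  applicantId: string;
  value: number;
}

interface ApplicantClient {
  getApplicant(id: string): Applicant | undefined;
}
interface ScoringClient {
  getScores(applicantId: string): Score[];
}

// The view model one piece of UI needs; no single service can provide it.
interface ApplicantSummary {
  id: string;
  name: string;
  averageScore: number | null;
}

// Facade: hides the fact that the data comes from two services.
function getApplicantSummary(
  id: string,
  applicants: ApplicantClient,
  scoring: ScoringClient
): ApplicantSummary | undefined {
  const applicant = applicants.getApplicant(id);
  if (!applicant) return undefined;

  const scores = scoring.getScores(id);
  const averageScore =
    scores.length === 0
      ? null
      : scores.reduce((sum, s) => sum + s.value, 0) / scores.length;

  return { id: applicant.id, name: applicant.name, averageScore };
}

// In-memory stubs standing in for the real services.
const applicantClient: ApplicantClient = {
  getApplicant: (id) => (id === "a1" ? { id: "a1", name: "Sam" } : undefined),
};
const scoringClient: ScoringClient = {
  getScores: (applicantId) =>
    applicantId === "a1"
      ? [
          { applicantId: "a1", value: 60 },
          { applicantId: "a1", value: 80 },
        ]
      : [],
};

const summary = getApplicantSummary("a1", applicantClient, scoringClient);
```

Every new screen tends to need another function like this, which is the overhead a single query layer such as GraphQL promises to remove.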
Personally I was underwhelmed. The tooling is still immature and the overhead of introducing another way of accessing data and building services (versus our REST APIs) wasn’t worth the value it delivered. Change notifications only work well if everything is mutated with GraphQL, and injecting mutations from other avenues is clunky. The experience left me thinking WebSockets would be just as easy for our current architecture.
Currently we are leaving it contained to this one service. However, Apollo Federated Schemas has the potential to solve the issue of data aggregation across our services. Once this is available for AppSync we may kick the tyres again.
Hatch’s wider 2021 focus is outlined here. From an engineering perspective, this is what we want to focus on:
- Growing the team – we’re searching for a talented Full stack Product Engineer to join our team
- Improving Team Knowledge Sharing – The surface area of our product has grown: we maintain Hatch 1.0, the Labour Exchange, Hatch 2.0 and a marketing site for each. Plus we are growing our team. Keeping the team aligned on our best practices and how everything fits together architecturally is a big challenge for 2021.
- Data Science Collaboration – We have grown our Data Science team and will be focusing on our Matching Science in 2021. We laid good ground work in 2020 for how engineering and science work together. This year we will be doubling down on this and building on that initial ground work to create more maintainable systems.
- Building a marketplace for junior talent – All of the above is in service of aligning our product offerings into a single marketplace that helps us execute on our mission.
Technology & Practices Radar
From this year we are also going to start tracking our Technology & Practices Radar.
Highlights of the new entrants for us include:
- Linc and FABs – an interesting approach to front-end containerisation; Linc was recently acquired by Cloudflare.
- Archium – which we hope will accelerate team understanding of the overall system and how it hangs together.
Some of the tools and tech we have put on hold include:
- MobX – in favour of pure state management. The adoption of hooks really showed us we didn’t need centralised state management for our app.
- GraphQL – detailed above