Engineer’s Codex is a publication about real-world software engineering.
LLMs have been all the rage in tech since ChatGPT’s release in November 2022. It feels crazy to think that it was released almost 2 years ago! Yet, in that same timeframe, there have been many other exciting advancements in software engineering that haven’t received the attention they deserve because LLMs have absorbed so much of the hype.
It’s true that LLMs are revolutionary, and I work with them daily, but plenty of other areas are making fascinating progress. I go over some of them below and provide a host of links for each topic for those interested in learning more. I see these topics as trends that are only growing as time goes on.
Local-first software
Local-first software has been around for decades, but it seems that the past few years have seen increased activity in making the developer experience around local-first software better. Local-first software prioritizes storing and processing data on a user’s local device rather than relying solely on remote servers.
The best intro to the local-first software paradigm is by Ink and Switch: Local-first software: You own your data, in spite of the cloud.
Companies and packages like React-Query, PouchDB, InstantDB, Legend-State, PowerSync, ElectricSQL, and more focus on making the sync process easier between client and server.
Local-first is the natural next step of user experience. Persisting data locally allows for near-zero latency on interactions and makes clients more resilient on slow or missing internet connections. I predict this will become particularly important as user expectations for software continue to rise.
Some interesting work in this space is conflict resolution. Conflict resolution is needed when there are local changes and server-side changes that conflict with each other, and thus, need resolution. I wrote up a quick summary of the most common conflict resolution methods below. (Note: Some systems combine multiple strategies, like using CRDTs for simple data types and three-way merging for more complex documents.)
CRDTs (Conflict-free Replicated Data Types)
CRDTs are data structures designed to be merged automatically and deterministically without conflicts. They use mathematical principles to ensure that no matter the order in which changes are applied, all replicas will eventually converge to the same state. CRDTs are great for collaborative applications like text editors, shared to-do lists, or any system where users make frequent updates offline.
The best articles I’ve seen for CRDTs are An Interactive Intro to CRDTs and A Gentle Introduction to CRDTs.
Another article that argues against CRDTs for certain experiences is You don't need CRDTs for collaborative experiences.
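To make the convergence property concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter (G-Counter). The class name, the replica ids, and the "phone"/"laptop" scenario are illustrative assumptions, not taken from any of the linked articles or libraries.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter: one slot per replica, merged with element-wise max."""
    replica_id: str
    counts: dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A replica only ever bumps its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # so merge order and repeated merges do not change the result.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# Two hypothetical devices make edits while offline...
phone = GCounter("phone")
laptop = GCounter("laptop")
phone.increment(2)
laptop.increment(3)

# ...then sync in both directions. Merging twice is harmless.
phone.merge(laptop)
laptop.merge(phone)
laptop.merge(phone)
```

After the sync, both replicas hold the same per-replica counts and report the same total. Real-world CRDTs for text or lists are considerably more involved, but they rely on the same kind of order-independent merge.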
Operational Transforms (OT)
Operational Transforms are also commonly used in collaborative text editors like Google Docs. OT involves transforming operations (like insertions or deletions) on shared documents to ensure that concurrent edits are applied in a consistent way. OT allows for real-time collaboration while maintaining data consistency, even when multiple users are editing simultaneously across devices.
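As a rough illustration of the "transforming operations" idea, the sketch below handles only concurrent insertions into a string. The `Insert` type, the `site` tie-breaker, and the `transform` function are assumptions made for this example; production OT systems also handle deletions, operation histories, and a central server that orders operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the text is inserted
    text: str
    site: str   # hypothetical replica id, used only to break ties


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has already been applied."""
    shifts = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if shifts:
        # `against` inserted text at or before our position, so move right.
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


base = "helo"
a = Insert(3, "l", "alice")   # alice fixes the typo
b = Insert(4, "!", "bob")     # bob appends punctuation at the same time

# Each side applies its own edit first, then the transformed remote edit.
alice_doc = apply(apply(base, a), transform(b, a))
bob_doc = apply(apply(base, b), transform(a, b))
```

Both replicas end up with the same document even though they applied the two edits in opposite orders, which is the consistency guarantee OT is after.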
Last-Write-Wins (LWW)
This is a simpler approach where, in case of conflict, the most recent change (based on timestamp) is accepted as the final version. While this strategy is straightforward and easy to implement, it may lead to data loss if conflicting changes are made almost simultaneously. It's best suited for scenarios where the latest update is usually the correct one or the preferred one.
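A minimal sketch of last-write-wins for a single value might look like the following. The `Versioned` type and the `device_id` tie-breaker are assumptions for illustration; the tie-breaker is there so that two writes with identical timestamps still resolve the same way on every device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: float   # time of the write, e.g. seconds since the epoch
    device_id: str     # hypothetical tie-breaker for equal timestamps


def lww_merge(local: Versioned, remote: Versioned) -> Versioned:
    # Keep whichever write is newer; the losing write is discarded entirely.
    return max(local, remote, key=lambda v: (v.timestamp, v.device_id))


local = Versioned("Buy milk", 100.0, "phone")
remote = Versioned("Buy oat milk", 100.5, "laptop")
winner = lww_merge(local, remote)
```

Here the laptop's write wins because it is half a second newer, and the phone's edit is silently dropped. That is the data-loss risk described above, and it gets worse when device clocks disagree.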
Three-Way Merging
Inspired by version control systems like Git, this method keeps a base version of the data along with the changes made on each device. The system auto-merges everything it can; when both sides changed the same thing in different ways, it prompts the user to resolve the conflict. This approach provides flexibility but may require user intervention, so it’s best used where asking the user for input is acceptable. There is also a “simpler” non-merge solution where conflicts are simply shown to the user, who then picks the correct side themselves.
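The idea can be sketched at the level of individual fields in a record. This is a simplified illustration, not how Git merges text: the function name is made up, and it assumes `None` is never a legitimate field value, since `None` is used here to mean "field absent".

```python
def three_way_merge(base: dict, local: dict, remote: dict):
    """Field-wise three-way merge. Returns (merged, conflicts)."""
    merged, conflicts = {}, {}
    for key in base.keys() | local.keys() | remote.keys():
        b, l, r = base.get(key), local.get(key), remote.get(key)
        if l == r:
            chosen = l            # both sides agree (or neither changed it)
        elif l == b:
            chosen = r            # only the remote side changed it
        elif r == b:
            chosen = l            # only the local side changed it
        else:
            conflicts[key] = (l, r)
            chosen = b            # keep the base value until the user decides
        if chosen is not None:
            merged[key] = chosen
    return merged, conflicts


base = {"title": "Trip", "city": "Oslo", "days": 3}
local = {"title": "Trip", "city": "Bergen", "days": 4}
remote = {"title": "Norway trip", "city": "Oslo", "days": 5}

merged, conflicts = three_way_merge(base, local, remote)
```

The title and city merge automatically because only one side touched each of them. The `days` field was changed on both sides, so it is reported as a conflict for the user to resolve.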
Event Sourcing
Instead of storing the current state, the system stores a series of events that led to that state. When conflicts occur, the system can replay events to determine the most accurate or meaningful resolution. Event sourcing provides an audit trail and allows for complex conflict resolution strategies, but it can be more complex to implement.
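A minimal sketch of the replay idea, using a to-do list: the `Event` type, the event kinds, and the global `seq` ordering are all assumptions made for this example. In a real system, agreeing on that ordering across devices is a large part of the complexity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    seq: int        # position in the log (assumed to be globally ordered)
    kind: str       # "added", "renamed", or "removed"
    item_id: str
    name: str = ""


def replay(events) -> dict:
    """Rebuild the current state by applying every event in order."""
    state = {}
    for e in sorted(events, key=lambda e: e.seq):
        if e.kind == "added":
            state[e.item_id] = e.name
        elif e.kind == "renamed" and e.item_id in state:
            state[e.item_id] = e.name
        elif e.kind == "removed":
            state.pop(e.item_id, None)
    return state


log = [
    Event(1, "added", "t1", "Buy milk"),
    Event(2, "added", "t2", "Call mom"),
    Event(3, "renamed", "t1", "Buy oat milk"),
    Event(4, "removed", "t2"),
]

current = replay(log)
earlier = replay(log[:2])   # the log doubles as an audit trail
```

Because the state is derived rather than stored, replaying a prefix of the log reconstructs what the list looked like at any earlier point.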
WebAssembly
WebAssembly is a binary instruction format that allows code to run directly in the browser with near-native speed.
The possibilities of WASM are super exciting:
Built-in code execution in the browser
WASM allows developers to run complex applications within the browser itself, bypassing the need for server-side processing. This is awesome for tools like code editors, development environments, and real-time simulations that can now operate entirely client-side.
SQLite in the browser
By running a SQLite database directly in the browser with WASM, web apps now have an alternative to IndexedDB and LocalStorage to store and query data locally. This opens up new possibilities for offline functionality and data caching. This connects really well with the local-first movement along with SQLite growing in popularity in general. (I talk about this more below, but Notion uses WASM SQLite).
Faster user experiences
WASM supports caching of compiled code, resulting in faster load times on subsequent visits. This means faster, better user experiences. It also lets applications deliver sophisticated functionality on par with native apps, without the need to repeatedly re-download code. This is a great article on this: Code caching for WebAssembly developers · V8
AI in the Browser
Yes, I know that I said this was a list of exciting non-AI innovations, but this one is very relevant and particularly promising. WASM has made it possible to run machine learning models directly in the browser, without needing a backend server. This lets AI-driven features, such as image recognition or text analysis, function quickly and privately on users’ devices without any cloud costs. Projects like TensorFlow.js with WASM backends enable running AI models with improved performance compared to pure JavaScript implementations. Google Chrome had a great talk on this: WebAssembly and WebGPU enhancements for faster Web AI, part 1.
SQLite’s Renaissance
Most developers, when thinking about what database to use, think about MySQL/Postgres (relational) or Mongo/Dynamo (NoSQL). For many production-grade applications, it’s probably a good idea to use one of these for the foundation of your database.
But it’s not always necessary. Kent Dodds argues that you could probably just use SQLite in production. So does Stephen Margheim at the Ruby Conf. Regardless of whether you agree or not, the points they make in favor of SQLite are valuable:
zero latency
simplified setup
easy multi-instance replication
can handle massive databases (more than you think)
much easier to develop and test with.
But in reality, SQLite simply has tradeoffs. My favorite quote from the founder of SQLite is simply that:
"Think of SQLite not as a replacement for Oracle but as a replacement for fopen()."
SQLite has recently been seeing a lot of renewed activity in terms of hype too. Why? It’s probably because of cycles. Dev tools go through cycles of hype and usage, and now that there’s more commercial force behind SQLite’s usage, we’re seeing a lot more public writing pushing to use it. That’s not to say SQLite was some infrequently used database before.
Hacker News has some interesting discussion on people who have used SQLite as a primary database.
While I don’t see too much movement to SQLite as a primary database in general, I do see value in SQLite for local-first software (especially combined with WebAssembly advancements).
Expo (React Native) constantly releases new versions of expo-sqlite with cool new features and upgrades. Expo v52 introduces a new Storage drop-in replacement for React Native’s AsyncStorage, with added synchronous APIs for convenience. In local-first mobile apps, SQLite is a rock-solid local store and usually a better option than AsyncStorage.
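To make the "replacement for fopen()" framing concrete, here is a small sketch of SQLite used as an app's local store. It uses Python's built-in `sqlite3` module rather than expo-sqlite, and the `notes` table and its contents are invented for the example. An in-memory database keeps the sketch self-contained; an app would pass a file path instead.

```python
import sqlite3

# ":memory:" keeps this example self-contained; a real app would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT NOT NULL, body TEXT)"
)

# `with conn` wraps the writes in a transaction: all rows land, or none do.
with conn:
    conn.executemany(
        "INSERT INTO notes (title, body) VALUES (?, ?)",
        [
            ("Groceries", "milk, eggs"),
            ("Ideas", "offline sync for the notes app"),
            ("Todo", "fix sync bug"),
        ],
    )

# Unlike a flat file, the local store can be queried directly.
titles = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM notes WHERE body LIKE ? ORDER BY id", ("%sync%",)
    )
]
conn.close()
```

Compared with writing a JSON file by hand, the app gets transactions and queries for free, which is much of the appeal for local-first clients.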
Notion implemented SQLite caching in their desktop apps in 2021, resulting in:
50% faster initial page loads
50% faster navigation between pages
Inspired by this success, Notion's team decided to bring these performance gains to their web app using WebAssembly (WASM) SQLite. This led to Notion reducing their page navigation latency on the web by 20% using SQLite client-side. (Summary, Blog Post).
They also go over some interesting problems they faced, like having to optimize WASM SQLite loading and fixing slow disk reads on slower devices.
Cross-Platform is getting better (React Native, Flutter)
Cross-platform is trudging along, making great strides in performance, dev experience, usability, and more. For example, React Native recently introduced their New Architecture, which leads to more polished experiences.
One of the coolest recent examples of this is Shopify migrating their entire mobile app to React Native. (Source) Now, 86% of their code is shared between iOS and Android. Not only that, performance improved along the way! They improved screen load times by 59%, app launches are now 44% faster, and webviews are now 63% faster.
This doesn’t really take away the need for native development of course. Mustafa Ali gives some great lessons learned from the experience too:
“Native code and native devs are crucial. There is no replacing the experience and taste that comes from having built high-quality mobile apps.”
“100% React Native should be an anti-goal. Use native wherever it is the best tool for the job (widgets, Siri shortcuts, watch app and complications, etc), or in places where there are high performance requirements.”
“Achieving good performance takes work and should be a priority from the start. Measure everything and relentlessly optimize every layer. Add automated monitoring to catch regressions.”
Automated reasoning
Automated reasoning is a bit more obscure, but also super exciting. Automated reasoning focuses on using logic and mathematical proofs to make sure that systems behave as intended. Unlike traditional testing, which validates system behavior in specific scenarios, automated reasoning allows engineers to verify correctness across all possible scenarios. Basically, it’s a shift from reactive testing to proactive verification, leading to (ideally) better reliability and security. Amazon has published a gentle intro to automated reasoning.
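The difference between "specific scenarios" and "all possible scenarios" can be illustrated with a toy example. The access rules below are invented, and brute-force enumeration only works because the input space is tiny. Real automated reasoning tools rely on SAT/SMT solvers or proof assistants to cover spaces far too large to enumerate, so treat this purely as a sketch of the idea.

```python
from itertools import product


def spec_allows(is_owner: bool, is_admin: bool, explicit_deny: bool) -> bool:
    # Reference specification (hypothetical rules), written for clarity.
    if explicit_deny:
        return False
    if is_owner:
        return True
    if is_admin:
        return True
    return False


def fast_allows(is_owner: bool, is_admin: bool, explicit_deny: bool) -> bool:
    # "Optimized" implementation we want to trust.
    return (is_owner or is_admin) and not explicit_deny


def buggy_allows(is_owner: bool, is_admin: bool, explicit_deny: bool) -> bool:
    # Subtly wrong: owners bypass an explicit deny.
    return is_owner or (is_admin and not explicit_deny)


def find_counterexample(impl, spec, arity: int):
    """Check impl against spec on every possible input, not just a sample."""
    for args in product([False, True], repeat=arity):
        if impl(*args) != spec(*args):
            return args
    return None


ok = find_counterexample(fast_allows, spec_allows, 3)
bad = find_counterexample(buggy_allows, spec_allows, 3)
```

A handful of hand-written tests could easily miss the owner-plus-deny case. Checking every input either shows the two functions agree everywhere or produces a concrete counterexample.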
AWS also recently wrote a great blog post on how they’ve been using automated reasoning techniques in their infrastructure. Over a decade of applying these techniques, AWS has found that formally verified code often outperforms the unverified code it replaces. This is because the process of formal verification helps uncover bugs early, leading to optimizations that improve runtime performance.
For example, they found that by building a formal spec of IAM, they were able to optimize code that processes over 1.2 billion requests per second, resulting in a 50% performance boost. For S3, they used automated reasoning to find hidden bugs, which let them ship updates more frequently - from every quarter down to every 1-2 months.