Not bad ideas here, but I think there are some issues that need to be addressed. Don't confuse DRY with bad design. The idea behind DRY is to abstract and encapsulate in case you need the code later, not to create bad design by trying to make the perfect function or class. The example provided actually breaks SOLID principles, so it should never happen if DRY is applied correctly. You should also never set up local servers in unit tests. I don't care what Google says. Google has hundreds of developers who do nothing but write internal tooling. Have you seen a large CI/CD pipeline with in-memory servers running? It can take up to 20 minutes for a deployment, depending on the size of the project. Especially if you're a monorepo team taking advantage of GitOps, in-memory anything will be a nightmare to handle.
The reason I stopped reading programming blogs around 2012 is that I was mad that they kept doing this: something worked well in their particular field, and they evangelized that everybody should be doing it the same way. I got even madder when I noticed that they were all from the same field, the typical "web startup" field, and didn't even notice that very different fields like embedded, game, or ERP dev exist.
Your comment reminds me of this.
I quit reading programming blogs when they evangelized automated testing. I do not use it in ERP. First, I am 100% sure no one ever calls my code; it is always the last layer, because it is kinda scripting. Second, I am 100% sure my code is correct, because it is simple. Third, my code has no units. A function can always go hunting in the database, read configuration data and master data, and then make a decision. It can look up anything in the database. How do you even unit-test that?
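For what it's worth, one way a function like that could be exercised is against a throwaway in-memory SQLite database seeded with only the rows it looks up. This is just a sketch under assumptions: the `customer` and `item` tables, the flag columns, and `should_block_order` are all made up for illustration and do not come from any real ERP.

```python
import sqlite3


def should_block_order(conn, customer_id, item_id):
    """Hypothetical rule: block orders for customers on credit hold,
    unless the item is flagged as always sellable."""
    (on_hold,) = conn.execute(
        "SELECT credit_hold FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    (always_sell,) = conn.execute(
        "SELECT always_sell FROM item WHERE id = ?", (item_id,)
    ).fetchone()
    return bool(on_hold) and not bool(always_sell)


def make_test_db():
    """Build a tiny database holding only what the rule reads."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, credit_hold INTEGER);
        CREATE TABLE item (id INTEGER PRIMARY KEY, always_sell INTEGER);
        INSERT INTO customer VALUES (1, 1), (2, 0);
        INSERT INTO item VALUES (10, 0), (11, 1);
        """
    )
    return conn
```

The "unit" here is the rule plus the handful of rows it reads, not the whole production database. Whether that is worth the effort for a one-person module is a separate question.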
PRY is more commonly known as Write Everything Twice (WET)
I am more of a business logic scripter, an ERP consultant, but FWIW:
1) yes, in my world of integration, I call this data ownership: either the ERP software owns the customer data, or the CRM software. Not both. Updating the data should only be permitted in one of them. The other gets synced automatically. This is the difference between a consultant-developer and a developer. When I put my consultant hat on, I tell managers stuff like we will completely forbid users to update the data in this or that system. Developer-only developers rarely have the gall to do so.
2) yes, generally, in the business process scripting world this is not even new, because the truth is our scripties don't even know about abstraction as such, as we get our requirements piecemeal and do no real design. If one requirement is to import customers, and then a year later there is a requirement to import vendors, we mostly just import vendors and do not try to refactor the customer import into something more reusable. I do not even copy-paste; I noticed that when I do, I always forget to change some detail. I'd rather type it out anew even when it is 80% the same.
3) I don't know about that, we do not unit test because our input unit is basically a whole database. Any function at any time may look up anything in the database. Do this except when the customer has this flag, but if the item has that flag do it anyway, etc. We have no units to speak of; our functions do not simply work with their inputs but look up all the stuff in the database. Our testing is manually trying things out, but mostly it is just knowing the whole system by heart and never making any mistakes. Seriously. Sometimes I do not even test things manually. I know exactly what my code does. There is the ritual of asking users to test, but I know full well they do not test either. It still works. The trick is that I am always a one-man team, at least for one specific module. No other person's code ever calls mine, ever. I get it that automated testing is for teams: you change something, and then someone else calls your code and it breaks.
4) I am confused about this: doesn't throwing things into a database actually minimize mutable state? I would be a fan of throwing things into a database multiple times and reading them back, if it were necessary for me, rather than keeping things in memory, because if something looks fooked in the database, you immediately know which part of the code did it. For me it is not really necessary for some basic billing software, but if I had something more complex, I would totally have tables like "50% finished invoice" and "75% finished invoice", if that is your idea? It would really help debugging and troubleshooting.
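The staged-tables idea in point 4 could be sketched roughly like this. Everything in it is an assumption for illustration: the `order_line`, `invoice_stage1` and `invoice_final` tables, the `TAX_PERCENT` value, and the two-stage split. Amounts are kept in integer cents to avoid float rounding.

```python
import sqlite3

TAX_PERCENT = 20  # hypothetical flat tax rate


def make_db():
    """Create the hypothetical schema with one order of two lines."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE order_line (order_id INTEGER, qty INTEGER, price_cents INTEGER);
        CREATE TABLE invoice_stage1 (order_id INTEGER, net_cents INTEGER);
        CREATE TABLE invoice_final (order_id INTEGER, net_cents INTEGER, tax_cents INTEGER);
        INSERT INTO order_line VALUES (1, 2, 1000), (1, 1, 500);
        """
    )
    return conn


def build_invoice(conn, order_id):
    # Stage 1: persist the "half finished" invoice (net total only).
    conn.execute(
        "INSERT INTO invoice_stage1 (order_id, net_cents) "
        "SELECT order_id, SUM(qty * price_cents) FROM order_line "
        "WHERE order_id = ? GROUP BY order_id",
        (order_id,),
    )
    # Stage 2: read stage 1 back from the database instead of from memory,
    # then persist the finished invoice with tax added.
    (net,) = conn.execute(
        "SELECT net_cents FROM invoice_stage1 WHERE order_id = ?", (order_id,)
    ).fetchone()
    conn.execute(
        "INSERT INTO invoice_final (order_id, net_cents, tax_cents) VALUES (?, ?, ?)",
        (order_id, net, net * TAX_PERCENT // 100),
    )
```

If `invoice_final` looks wrong but `invoice_stage1` looks right, the bug is in stage 2, which is the debugging benefit described above.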
I thought it was yet another post on SOLID, but thanks for presenting new views.
I'm sorry, but I'm a bit lost on your example in point 4. What is the link between cache and mutable state?
I would also add YAGNI and SOLID to this list
Thanks for the points! Agree with not overusing mocks. Best case is having a design that enables the testing pyramid with easy unit tests and integration tests for stuff with more dependencies. As you mentioned, boundaries can get blurry here.
I once accidentally followed #4 to the letter. It was an API in PHP and every interaction started with a clean controller context and /v1/entity/{id} being pulled from the DB. 99% of the time my service wasn't to blame.
I don't think you hit the nail on the head, but I believe you're onto something. Repetition, sync, single source of truth, and mutability/derivative data... They all intersect. Example: don't repeat yourself if it's, say, an upload path where one service writes and another reads from. But DO repeat yourself for one-liner ops that are as basic and common as a simple reduce on an array. It's when you need reliable consistency that DRY is important. I've seen so many hotshot devs in interviews write a "reusable React hook" just to demonstrate reliability when the operation is never, not in a million years, going to be anything more than a basic one-liner anyway. Why I took points off: they just turned 10 one-liners into ... 10 slightly different one-liners + 1 file with 5 lines that everyone has to study to determine what its purpose is (because why does it exist)... and now everyone has to understand how to use YOUR code instead of just the framework or language itself. And you may have done that on purpose, which has me thinking you are pathological. So don't do it. Overall, you just won't have many of these issues if you write small, SOLID, egoless code. Any issue you do encounter will be MUCH easier to deal with. Lastly, use a Sandbox for your apps. Just try it. It solves 80% (or more) of the problems everybody in every shop constantly, repeatedly, painstakingly stumbles and faceplants on -- security, testing, coupling, scaffolding, reliability, consistency, readability, debugging, and my favorite -- developer experience :)
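A rough illustration of that split, as a sketch only: `UPLOAD_ROOT`, `upload_path`, `writer_target` and `reader_source` are hypothetical names, and the path itself is invented.

```python
from pathlib import PurePosixPath

# Shared on purpose: the writer and the reader must agree on this location,
# so the decision lives in exactly one place.
UPLOAD_ROOT = PurePosixPath("/srv/uploads")  # hypothetical path


def upload_path(user_id: int, filename: str) -> PurePosixPath:
    return UPLOAD_ROOT / str(user_id) / filename


def writer_target(user_id: int, filename: str) -> PurePosixPath:
    # The writing service asks the shared helper where to put the file.
    return upload_path(user_id, filename)


def reader_source(user_id: int, filename: str) -> PurePosixPath:
    # The reading service asks the same helper where to find it.
    return upload_path(user_id, filename)


# Not shared on purpose: a basic one-liner that the language already covers.
# Wrapping it in a helper would only add a file for people to study.
cart = [3, 4, 5]
cart_total = sum(cart)
```

If the upload location changes, one edit keeps both services consistent. Nothing comparable is gained by hiding `sum(cart)` behind a custom function.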
To your #1 - what about operational data for analytical purposes? What about analytical data for machine learning models?
I think in some ways 1 and 2 are related. You don't want to repeat your sources of truth (or, alternatively, "places where decisions are made"). But you do need to be sure that what you're trying to unify via DRY is actually the same thing. Semantics is key over syntax. Whether code actually looks the same ends up being irrelevant - I have "++i" all over my code. :) And you can have two pieces of code that are doing the same thing that don't actually look alike.
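A small sketch of the semantics-over-syntax point, with invented names and thresholds: the first pair of functions looks identical but encodes two separate decisions, while the second pair looks different but encodes the same one.

```python
from datetime import date

# Look the same, but are two separate decisions that happen to share a value.
FREE_SHIPPING_MIN = 50     # hypothetical: decided by logistics
LOYALTY_DISCOUNT_MIN = 50  # hypothetical: decided by marketing


def free_shipping(total: int) -> bool:
    return total >= FREE_SHIPPING_MIN


def loyalty_discount(total: int) -> bool:
    return total >= LOYALTY_DISCOUNT_MIN


# Look different, but make the same decision: "is this day a weekend?"
def is_weekend_a(d: date) -> bool:
    return d.weekday() >= 5  # Monday is 0, so Saturday is 5 and Sunday is 6


def is_weekend_b(d: date) -> bool:
    return d.isoweekday() in (6, 7)  # Monday is 1, so Saturday is 6 and Sunday is 7
```

Merging the first pair would couple two thresholds that can change independently. The second pair is the real duplication, even though a text search would never find it.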
(I had a thought relating 1 to 4 - don't store state you can compute = single source of truth - but I went back and realized you had already addressed that.)
Great article, by the way!