Language and Idiom in Programming Languages

Usually we talk about programming languages in terms of their literal syntax - BNF trees, keywords, all the raw characters in the buffer. “Dialects” would be something like gcc vs Visual C++, or the “ML family”. We talk about “style” as the perennial flame-war topics like braces and variable names. Compare the Linux style guide to something like AP style or Oxford. It’s all about the elements on the page, not the content.

I’m thinking this is the wrong idea (or at least too low a level). Each programmer has their own style of writing, patterns they use, the structures they prefer, techniques they use, architecture and code organization. It’s like a writer in English or any other language: just because you know vocab and syntax doesn’t make you Shakespeare.

As with most languages, we learn primarily by imitation. You learn your idioms from your peer group - if they talk a certain way, you will too. I often feel like a fish out of water because my colleagues are speaking one dialect or broken English, and I’m speaking something more formal. I can understand them, but it’s not particularly elegant. I certainly have my own idiosyncrasies and biases in my personal style. Maybe we fight so much about braces because it’s a proxy for the real dialect differences.

Frameworks gain an additional facet when you consider them in the context of writing. The 3 paragraph essay is the strongest example I can think of. The writer is “filling in the form” when they write. I always wanted to push beyond the format. Freeform writing is quite hard for many people. Starting with a blank page or buffer is paralyzing. Something like Rails or another strongly opinionated framework fits the same concept. Hiring someone who “knows Rails” is implicitly hoping that they write the same idiom. Consider a sonnet as more of an “architecture” - constraining to a structure. More freeform writing takes more discipline and skill to prevent it from jumbling into a mess - and it can still be difficult to understand. Writing spaghetti is just some crazy dude shouting at people you can’t see. We as an industry do a lot of raving and shouting.

In the end, one person is not going to effect any significant change in the idiom or dialect of another, except through osmosis. The only way you learn is by reading good code. Any writer who claims they don’t read other writers’ work so they can maintain their own independence is a third-rate hack. Code reviews and pairing are the most direct ways to get exposure. Read others’ code, especially open source. Use terminology from the state of the art. Don’t anthropomorphise the machine. Introspection is the final key technique - critical reading of your own writing.

Setting the pace initially for an organization really makes a difference. One Shakespeare doesn’t make the whole crew writers, but it will set the pace and give people something to emulate. For rising stars, the best bet is to apprentice to a good writer so they learn the good habits and not the bad ones. Your first job is probably the most formative experience.

I do believe there are masters out there with a high level of language and idiom. Given that our discipline is highly technical, I don’t expect flowery prose. But clear, concise, well architected code is a critical aspiration.

“Expressiveness” is a word bandied about a lot. I think there are a few meanings.

  • High information content. DSLs and scripting languages, especially Python (one way to do it). Jargon and idiom are there to condense information. Higher level languages attempt to remove boilerplate and legwork like memory management so the focus is on the information. DSLs are supposed to be even more information dense. A frequent complaint is that you don’t have full ability to express anything you want if it isn’t in the vocab. Failure mode is something like Perl and regex - so informationally dense it becomes an impossible tangle.
  • Flexibility. C is the ur-example, but dynamic programming languages or macros make the cut. You can skin the crap out of that cat. Information density may be low, but granularity is high. Failure mode is basically reinventing a DSL, or losing the high level information in the boilerplate; illegibility becomes a problem. Good for telling the machine exactly what to do, or being clever.
  • Working in the wrong paradigm. Trying to write functional programming in Java, OOP Java in Scala, or any other way of talking crosswise. Usually a complaint rather than an expression of support. The language doesn’t provide structures or features to write code in a certain way, or they are more verbose. Usually someone ends up trying to write the missing functionality into the language.
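
To make the first two senses concrete, here is a small Python sketch. The log line format and the function names are invented for illustration. Both functions produce the same result; one leans on information density, the other on granularity.

```python
import re

# Hypothetical log line format, made up for this example:
# "LEVEL key=value key=value"
LINE = "WARN user=alice attempts=3"


def parse_dense(line: str) -> dict:
    """High information density: two regexes carry all the structure."""
    level, rest = re.match(r"(\w+)\s+(.*)", line).groups()
    return {"level": level, **dict(re.findall(r"(\w+)=(\S+)", rest))}


def parse_explicit(line: str) -> dict:
    """High granularity: every step is spelled out, at the cost of more text."""
    parts = line.split()
    result = {"level": parts[0]}
    for part in parts[1:]:
        key, _, value = part.partition("=")
        result[key] = value
    return result
```

Neither version is wrong. The dense one is quicker to write and slower to read cold; the explicit one is the reverse. Which you reach for first says a lot about your idiom.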

Tool vs Framework practice

In line with my results vs system thinking exploration, I think I’ve found what may be another dichotomy in software development practice. The tool over framework principle has been fundamental for my engineering practice for my whole professional career. Composable tools have served me better than large frameworks. Usually I’ve considered it a style choice with consequences, but it may represent a more fundamental attitude towards engineering. Even if your system doesn’t use an explicit 3rd party software framework, the framework way of thinking may be present. As always, this is not a black & white descriptor.

Tools

  • Unix principle: do one thing and do it well
  • Composable tools provide platform as needed.
  • Enable and empower.
  • Give developers tools to build with.
  • Intimately familiar with tools.
  • Likes new tools, actively adopts them.
  • Likes the command line.
  • Learns by reading source and documentation.
  • Vertical development.

Benefits

  • Flexibility.
  • Reusable and composable functionality. Encapsulation.
  • Application performance from choosing only the tools needed.
  • Adoption of new tools rather than building.
  • Reduce costs by leveraging the best tools.

Costs

  • Upfront investment in learning multiple tools
  • Fragmentation of implementation.
  • Emergent behavior between tools and components.
  • Complexity of system.
  • Fragility

Frameworks

  • Microsoft principle: play in our garden, it’s got everything you could need
  • Convention over configuration.
  • Insulate and focus.
  • Give developers places to slot in logic. Frameworks provide all platform details.
  • Likes IDEs.
  • Always wants familiar tools. Distrusts new tools or new uses of tools.
  • Learn tools to the level that the benefit is achieved, no further.
  • Learns by example and imitation. Likes recipes and copy/paste.
  • Web frameworks as ur-example.
  • Horizontal development.

Benefits

  • Knowledge transfer between organizations. (“Looking for a Rails developer…”)
  • Quick turnaround from idea to something presentable.
  • Commercial tools and support.
  • Reduce costs by focusing on business logic over implementation details.

Costs

  • Founders once the framework no longer provides the functionality you need.
  • Sprawl and inertia.
  • Reinventing the wheel, ineffective tool use.
  • Duplication and redundancy.
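
A rough Python sketch of the two attitudes, using an invented line-counting task. `MiniFramework` and all the function names are hypothetical, not taken from any real library. In the tool style the caller composes small functions; in the framework style the framework owns the flow and the developer fills in a slot.

```python
from typing import Callable, Iterable, Iterator, List

# --- Tool style: small functions that the caller composes ---

def read_lines(text: str) -> Iterator[str]:
    """Do one thing: split text into stripped lines."""
    return (line.strip() for line in text.splitlines())


def non_empty(lines: Iterable[str]) -> Iterator[str]:
    """Do one thing: drop blank lines."""
    return (line for line in lines if line)


def count(items: Iterable[str]) -> int:
    """Do one thing: count items."""
    return sum(1 for _ in items)


# --- Framework style: the framework owns the flow and calls your hook ---

class MiniFramework:
    """Hypothetical framework: it reads, filters, and counts; you supply a hook."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[str], bool]] = []

    def keep_if(self, handler: Callable[[str], bool]) -> Callable[[str], bool]:
        # Register developer logic in the slot the framework provides.
        self._handlers.append(handler)
        return handler

    def run(self, text: str) -> int:
        lines = (line.strip() for line in text.splitlines())
        return sum(1 for line in lines if all(h(line) for h in self._handlers))


TEXT = "alpha\n\n  beta  \n\ngamma\n"

# Tool style: the caller decides the pipeline.
tool_result = count(non_empty(read_lines(TEXT)))

# Framework style: the caller only fills in the slot.
app = MiniFramework()


@app.keep_if
def has_content(line: str) -> bool:
    return bool(line)


framework_result = app.run(TEXT)
print(tool_result, framework_result)  # 3 3
```

Both give the same answer. The difference shows up when the requirement changes: with the tools you rearrange the pipeline, with the framework you can only do what the slots allow.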

Horizontal vs vertical

Tool and framework nicely describe an attitude a developer or team has towards their work. There are two descriptors of the software projects themselves I’ve used for a while, which I think very closely map to the tool and framework thinking: I call them vertical and horizontal.

A vertical system has narrowly focused independent components, building up complexity through their combination, usually in a hierarchy. Vertical systems leverage tools appropriate to each component.

Horizontal systems are very flat, with many similar broad components at the same level. Horizontal systems leverage frameworks and use the same set of tools for each component.

The classic example of a horizontal system is one using the typical web framework (Rails, Django, Cake, etc). Ostensibly there are layers (a distinct pattern “Rails-MVC”), but it usually devolves into what I call macaroni code. Each component has a data access system bound to one database table, a routing class, and a UI template. Repeat for every table in your database. Best case, you get these macaroni tubes of functionality from the UI to the datastore. Horizontal development scales well. New functionality can be added without affecting architecture. There is a strong established pattern for devs to follow; it lends itself well to most dev shops and outsourcing firms where developers are mostly low skilled. Functionality is (hopefully) strongly grouped and easy to find. Service oriented architecture in this style means splitting the macaroni into separate processes. Failure modes include: needs beyond the basics of the horizontal layer, problems outside the scope of the framework, unmanageable sprawl, and slow moving change. Compare with an assembly line - lots of very similar operations done at a large scale.

Vertical systems are rarer, and less visible due to their general lack of reliance on frameworks. Growth of non-traditional software tools in data analysis, distributed systems, and embedded systems has driven tool development. In the web space, tools like Sinatra and akka-http represent a different approach, providing only routing functionality and expecting other tools to provide for other needs. Akka itself is a great example of providing a set of tools for solving concurrency model problems. Vertical systems are more likely to become spaghetti and fragile, due to the increasing complexity of interconnected components. If each component becomes connected to most of the others, the number of possible connections goes up quadratically. Good vertical systems are built hierarchically, with the complexity managed by abstracting away details of a component’s inner workings and subcomponents. Service oriented architecture in this style is just another way to compose the system components together. Compare with an internal combustion engine - distinct components, working together in a complex way.
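
A quick back-of-the-envelope check on how connections grow. With n components there are n(n-1)/2 possible pairwise links if anything may talk to anything, against n-1 links in a strict tree. The function names are mine, and a real system will sit somewhere between the two extremes.

```python
def full_mesh_links(n: int) -> int:
    """Possible pairwise links when any component may talk to any other."""
    return n * (n - 1) // 2


def tree_links(n: int) -> int:
    """Links in a strict hierarchy: every component except the root has one parent."""
    return max(n - 1, 0)


for n in (5, 10, 50):
    print(n, full_mesh_links(n), tree_links(n))
# 5 10 4
# 10 45 9
# 50 1225 49
```

At 50 components the mesh has 25 times as many possible links as the tree, which is roughly why hierarchy is what keeps a vertical system legible.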

Results vs System oriented thinking

I spent quite some time thinking about why I don’t understand the development practice of most programmers I’ve encountered. I realized there is a major gap in thinking between us. I’ve formulated this as two distinct camps - result oriented and system oriented. Obviously no organization or person will fit squarely in one camp, but I have found it a useful distinction.

Results oriented

Runtime behavior is your business product. Code is a means to an end.

Key indicator: someone would buy your company for your client base.

Concerned with generating income.

Now oriented.

Typical patterns

  • UI first.
  • Heavy reliance on frameworks (fallback to website always).
  • Use familiar tools beyond original purpose
  • First thing that compiles.
  • Customer driven. Very reactive.
  • Test driven development (result first).
  • Shortest path from point A to point B

Interests

  • Short term growth
  • Quick, reactive development
  • Technical debt is acceptable
  • Simpler problems solvable with a tool.

Failure modes

  • Technical debt outpacing growth
  • Intractable problems - e.g. not solvable in short time, requires new techniques, not suitable for current tools.
  • Over promising features

System oriented

Code is the product. Runtime behavior is a result of the product working.

Key indicator: someone would buy your company for your technology.

Concerned with building capital.

Future oriented.

Typical patterns

  • System and data first.
  • Architecture driven.
  • Stakeholder driven.
  • Planned features.
  • Builds tools.

Interests

  • Sustainable product.
  • Patents and Licensing.
  • Long term stability.
  • Harder problems requiring new technology

Failure modes

  • Perfectionism
  • Failure to deliver
  • Over engineering (you ain’t gonna need it).
  • Unresponsive to customer needs.

I think the split represents what I’m seeing in most software shops - they are pretty much solely focused on results rather than systems. The first code that gets the results they want is the correct solution, and then they move on. I’ve watched devs with years of experience write brand new code containing functions thousands of lines long. They were unable to describe the interactions in code they had written in the previous month. The code was not a system, but rather a large, growing collection of functionality drifting as needed.

I’m very clearly in the system oriented thinking camp. I like to say that if you do your job right, no one will notice, which is probably true. The result oriented person appears to be much more productive since they are in the thick of reacting to problems and immediate needs. They also move quickly, because future consequences are not at the forefront of their thinking.

Proactive maintenance is very system oriented. Reactive changes in response to failures or criticism represent result driven thinking. If something isn’t actively an emergency it gets ignored. When you find a major problem, they quickly become interested in fixing it because the result is wrong. If they don’t see the incorrect result, it’s not a problem.

I think that many of my struggles have come from being at odds with other people’s goals. If I’m pushing for a longer dev time on a project to add unit tests or do some refactoring, I’m adding more work, not reducing it. The result is already done, why make it harder for yourself? I haven’t figured out the best way to align long term pain with good practice, and consequences with current behavior.

On the flip side, I have struggled against the failure modes of the system oriented camp. Over engineering in particular is a struggle. My deep investment in backend and infrastructure over user interfaces and the whims of customers has frustrated people, making them think that I have no interest in their needs.

Principles of Building Software

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” Martin Fowler, Refactoring, 1999

If you can’t measure it, you can’t change it!

The Schizophrenia of Computing

Computer science is a branch of mathematics. Computer engineering is a branch of electrical engineering. Software engineering is a branch of sociology. Balance the three if you can.

Clean Code

  • Write code so the poor bastard after you doesn’t curse your name.
  • Code with intention.
  • Immutable > Mutable
  • Data > State
  • Separation of concerns > tight coupling
  • Composition > Inheritance
  • Explicit > Implicit
  • Polymorphism > Conditional
  • Polymorphism != Code sharing (OOP does not require inheritance or abstract classes)
  • Functional > Imperative
  • Declarative > imperative (i.e. what > how)
  • Abstraction > Code grouping
  • Clean code > needless optimization
  • Referential transparency (account for all inputs, all outputs including exceptions)
  • Small sharp tools (do one thing, and do it well)
  • Build composable tools.
  • High signal to noise ratio.
  • Separate structural changes from behavioral changes.
  • Limit inheritance.
  • Maintain locality of functionality.
  • Develop personal craftsmanship. Never stop learning.
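
A minimal Python sketch of three of these together: Polymorphism > Conditional, polymorphism without inheritance, and Immutable > Mutable. The shape classes are a stock illustration I picked, nothing more. `Shape` is a structural interface, so `Circle` and `Rect` never inherit from anything.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Protocol


class Shape(Protocol):
    """Structural interface: anything with an area() method qualifies."""

    def area(self) -> float: ...


@dataclass(frozen=True)
class Circle:
    radius: float

    def area(self) -> float:
        return math.pi * self.radius ** 2


@dataclass(frozen=True)
class Rect:
    width: float
    height: float

    def area(self) -> float:
        return self.width * self.height


def area_conditional(kind: str, *dims: float) -> float:
    """The conditional version: every new shape means editing this function."""
    if kind == "circle":
        return math.pi * dims[0] ** 2
    if kind == "rect":
        return dims[0] * dims[1]
    raise ValueError(f"unknown shape: {kind}")


def total_area(shapes: Iterable[Shape]) -> float:
    """The polymorphic version: new shapes need no changes here."""
    return sum(shape.area() for shape in shapes)
```

Adding a `Triangle` means writing one new class. `total_area` stays untouched, while `area_conditional` would grow another branch.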

Architecture

  • Domain > Code
  • Why -> What -> How
  • Build hierarchies of systems.
  • Asynchronous first.
  • Data > Code
  • Code > Configuration
  • Code > Convention
  • Tests > Documentation
  • Create APIs. Start with client/server
  • Establish boundaries and surfaces.
  • Choose the right tool not just the tool you have.
  • Caller > Callee. Don’t push concerns into the callee.
  • “First thing that compiles” is not an acceptable engineering decision.
  • Actors > Objects (or rather, actors encapsulate better than objects.)
  • Website is not an architecture
  • “There is nothing more permanent than a temporary fix.”
  • Encapsulate state in state machines.
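
One way to read “Encapsulate state in state machines”, sketched in Python. The `Job` class, its states, and its events are all hypothetical. The point is that the legal transitions live in one table, and anything outside the table fails loudly.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RUNNING = auto()
    DONE = auto()


class Job:
    """Hypothetical job whose lifecycle is an explicit state machine."""

    # The only legal moves; anything else is rejected.
    _TRANSITIONS = {
        (State.IDLE, "start"): State.RUNNING,
        (State.RUNNING, "finish"): State.DONE,
        (State.RUNNING, "cancel"): State.IDLE,
    }

    def __init__(self) -> None:
        self._state = State.IDLE

    @property
    def state(self) -> State:
        return self._state

    def handle(self, event: str) -> State:
        """Apply an event, or raise ValueError if it is illegal in this state."""
        key = (self._state, event)
        if key not in self._TRANSITIONS:
            raise ValueError(f"illegal event {event!r} in state {self._state.name}")
        self._state = self._TRANSITIONS[key]
        return self._state
```

Callers never set the state directly. They send events, and the table decides what happens, which keeps the whole lifecycle readable in one place.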

Testing and Error handling

  • Test around separation of concerns
  • Test for behavior.
  • Fail loud, fail fast
  • Edge case handling > Make it go away
  • Understanding how things work > sacrificing to the elder gods and hoping things happen
  • Test for stability
  • Errors are a failure in the design or implementation of code. Exceptions are output of your function.
  • Mocking is a smell that you have too many dependencies.
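
A small sketch of “Fail loud, fail fast” against “Make it go away”, assuming a made-up config parsing task. In the first function the exception is part of the documented output. In the second, bad input is quietly turned into a default.

```python
def parse_port(raw: str) -> int:
    """Parse a TCP port. Outputs: an int in 1..65535, or ValueError."""
    port = int(raw)  # int() raises ValueError on non-numeric input: fail fast
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def parse_port_quiet(raw: str) -> int:
    """The "make it go away" version: bad input silently becomes a default."""
    try:
        return int(raw)
    except ValueError:
        return 80  # the caller never learns the config was wrong
```

Testing for behavior here means asserting on both outputs of `parse_port`: the value for good input, and the `ValueError` for bad input. The quiet version has nothing to assert on except the default, which is exactly the problem.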

Data

  • Data > State. State is data that is variant over time.
  • Structured types > dynamic data
  • Explicit schema > implicit schema
  • Immutability > mutability
  • Data transforms > tickling
  • Serialize at boundaries only.
  • Cannot promote derived data.
  • Null considered harmful. Use explicit definitions to indicate “value may not be present”.
  • Objects and structures should be in a stable state once they leave the constructor.
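
A Python sketch of the last few items, with a hypothetical `User` record. It is immutable, it validates itself before it leaves the constructor, and the missing email is declared explicitly rather than being a surprise `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    """Immutable record that is fully valid once construction finishes."""

    name: str
    email: Optional[str] = None  # explicitly "value may not be present"

    def __post_init__(self) -> None:
        # Validate here so no half-built User ever escapes the constructor.
        if not self.name:
            raise ValueError("name must not be empty")


def with_email(user: User, email: str) -> User:
    """Data transform: return a new value instead of mutating the old one."""
    return User(name=user.name, email=email)


def contact_line(user: User) -> str:
    """Format a contact string, handling the absent email on purpose."""
    if user.email is None:
        return f"{user.name} (no email on file)"
    return f"{user.name} <{user.email}>"
```

Python still uses `None` underneath, so this is convention plus a type annotation, not a guarantee. Languages with a real option type enforce the same idea at compile time.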

Concurrency

  • Event-driven / Reactive > Explicit calls
  • Asynchronous > Synchronous
  • Can treat local as remote, but not vice versa.
  • Time is a continuum.
  • Communication > synchronized state
  • Separate context of execution from behavior.
  • Threads and processes are contexts of execution, not concurrency primitives.
  • Share nothing.
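
A rough illustration of “Communication > synchronized state” and “Share nothing”, using only the Python standard library. This is a stand-in for an actor, not a real actor system: one thread owns its running total, and everything goes in and out as messages on queues. The `None` sentinel is my own convention for this sketch.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Owns its running total; no other thread can touch it."""
    total = 0
    while True:
        item = inbox.get()
        if item is None:  # sentinel: no more work
            break
        total += item
    outbox.put(total)  # the result leaves as a message


inbox: queue.Queue = queue.Queue()
outbox: queue.Queue = queue.Queue()
thread = threading.Thread(target=worker, args=(inbox, outbox))
thread.start()

for n in range(1, 101):
    inbox.put(n)
inbox.put(None)

thread.join()
result = outbox.get()
print(result)  # 5050
```

There is no lock in user code and no shared counter. The thread is only the context of execution; the queue is the concurrency primitive.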

Code reviews

  1. Does this represent one change?
  2. Does it present clean system surfaces? (Does everything fit together?)
  3. Does it present clean public surfaces? (Is it usable?)
  4. Automated tests. (Does it do what it should around the surfaces?)
  5. Style (Is it clear and readable?)