Friday, June 29, 2018

Web API Status Codes and Errors

In a previous post titled I'm in the future of the web and it doesn't work I complained in general about how "the web" is so old, so primitive and so full of hacks that writing applications for it is a nightmare. One pet hate of mine is REST, a convention that has unfortunately gained worldwide popularity in recent years. It's important to be reminded that REST is not a protocol, it is a convention, and a vague one at that. You have to invent your own convention of how to encode information in the HTTP requests and responses, and how to handle the various unpredictable HTTP response status codes you may receive.

In early Web API services that I wrote, I naively attempted to coerce the business logic results into HTTP status codes. So for example, getting a database row that's not found would return a 404 (NotFound), or creating a new row would return 201 (Created), or deleting a row would return 204 (NoContent).

I soon realised that most of the HTTP status codes have meanings that are peculiar to HTTP and have nothing to do with typical business logic. For example, if someone makes a service call with a bad parameter, what do you return? Perhaps a 400 (BadRequest) seems appropriate? Well no, because the request was not "bad" according to the way a 400 is defined: it was not too large, it didn't have malformed syntax, nor did it have deceptive routing (see List of Status Codes). Your request actually worked correctly; it's just the business logic that considers it a "bad request".

Web searches will reveal endless contradictory arguments and claims about how data and status information should be round-tripped using REST. I'm so sick of the arguments and complicated code that I have thrown in the towel and adopted my own simple convention. I have in fact hijacked REST and turned it into the carrier for a simple protocol.

Only these status codes are returned by my services.

200 (OK)
The request was successfully processed by the application without any exception being thrown. Some extra information is needed to determine if the processing succeeded or failed in the "business" sense, and I have a convention for that. See the 200 section below.

500 (Internal Server Error)
All unhandled exceptions are converted into this status code. See the 500 section below.

Other
Anything else is unexpected and is regarded as a fatal problem that is probably outside of the control of the application. Things like an incorrect Uri, a misconfigured server, a security problem, and so on, can result in all sorts of weird status codes, and I just don't care as it means something is really stuffed up and needs special attention. Calling applications may decide to abort and show a "pink screen" in these cases.

Having just a few meaningful conditions greatly simplifies the internal coding to issue a request and inspect a response.

200 OK

Indicates a response was generated without any unhandled exceptions during processing; however, more information may be needed to determine what happened inside the business logic.

Deleting a database row is an example where you may need to know if anything was actually deleted. Adding a row may need to report if the row was inserted or updated. In cases like this you need more information. My .NET coding convention is to have a response base class like this skeleton:

public class ResponseBase
{
  public bool Success { get; set; }
  public int? Code { get; set; }    // could be an enum
  public string Message { get; set; }
}

Every class that can be serialized into a response body is derived from this class, so the Success, Code and Message properties (or whatever you like) can report in more detail what happened, along with any real response data. This provides consistent extra information in all responses from your application.

The extra response information could be placed out-of-band, in the headers for example, but I find it simpler to have it in the body.
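As a concrete (and purely illustrative) example of this in-band convention, a delete request whose target row didn't exist might still return a 200, with a body like this carrying the business outcome (the property values here are invented for illustration):

```json
{
  "Success": false,
  "Code": 1001,
  "Message": "No row with the requested Id was found to delete."
}
```

The HTTP layer reports that the call itself worked; the envelope reports what the business logic actually did.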

500 (Internal Server Error)

In .NET Web API projects you can override OnException in a globally registered exception filter to intercept all unhandled exceptions (see Exception Handling in Web API). I convert all exceptions into a 500 response using skeleton code like this:

// Inside the OnException override: unwrap to the root cause,
// then build a standard HttpError payload from it.
var ex = context.Exception.GetBaseException();
var error = new HttpError(ex.Message);
error.ExceptionType = ex.GetType().Name;
error.MessageDetail = ex.StackTrace;
context.Response = context.Request.CreateErrorResponse(HttpStatusCode.InternalServerError, error);

In service calling code you can now be certain that all 500 responses contain consistent helpful information about the unhandled exception. The response can be serialized into a class like this:

public sealed class InternalError
{
  public string Message { get; set; }
  public string ExceptionType { get; set; }
  public string MessageDetail { get; set; }
}

You can then decide if the response is "fatal" or not and take appropriate action.
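Pulling the convention together, here is a minimal calling-side sketch. The class name, method name and result strings are all illustrative, not part of any real service:

```csharp
using System;

// A sketch of the calling-side decision logic under this two-code convention.
// Only the status code and the Success flag from a ResponseBase body matter;
// everything else is treated as fatal.
public static class ResponseClassifier
{
    // statusCode comes from the HTTP response; bodySuccess is the Success
    // property deserialized from a ResponseBase body (null when the body
    // is not a ResponseBase, e.g. for a 500 InternalError payload).
    public static string Classify(int statusCode, bool? bodySuccess)
    {
        if (statusCode == 200)
            return bodySuccess == true ? "BusinessSuccess" : "BusinessFailure";
        if (statusCode == 500)
            return "UnhandledException"; // body holds Message, ExceptionType, MessageDetail
        return "Fatal"; // bad Uri, misconfigured server, security problem, etc.
    }
}
```

Real calling code would first deserialize the body into ResponseBase or InternalError depending on the status code, and then branch on a classification like this to decide whether to continue, report a business error, or abort.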

Summary

By reducing REST responses down to two status codes and expanding the 200 response with extra information, it could be argued that I am hijacking REST and converting it into a toy protocol. I can't argue with that, but as an application developer I really need a protocol to move data back and forth in a consistent manner. REST is just a style or a convention and the vast majority of HTTP status codes are utterly meaningless to applications, so it was basic instinct that drove me to use REST the way I've described in this blog post.

For practical purposes, I just need to send and receive serialized information, and it either works or it doesn't. I don't give a toss about idempotency, URL-based resources or the zoo of strange archaic response status codes that are defined, I just want to send and receive SOAP-like envelopes of information.

The fact that I'm doing this hints that there is something deficient or inappropriate about REST.

Wednesday, February 28, 2018

I'm in the future of the web and it doesn't work

I have seen the future and it works -- Lincoln Steffens (1931)
I have seen the future and it doesn't work -- Zardoz (1974)

We are well into the 21st century now, and "the web" is broken, it's sick, and all the endless band-aids and stitches being applied to it to keep it alive and keep it working are actually suffocating and killing it. Modern software development of web apps is a nightmare because the web is based on a primitive protocol that is being pushed to absurd lengths to create web apps that have responsive rich UIs like desktop programs.

Some History


The web that became publicly available around 1992 was based upon a very simple protocol called HTTP: you requested a document from a specified address and the response hopefully contained the data, which was usually HTML for display in a browser. The protocol is stateless; connections are short-lived and have no memory of what requests you make or what they contain. This was all you needed for a basic web browsing experience.

Then people had the idea that programs could possibly run in the web browser and you could do some useful "work" in it. The underlying framework was far too primitive for this, so an arms race began to pump it up with cookies, parameters, special HTML tags, scripts, and so on in an attempt to provide the necessary features. By the end of the 1990s, developing web apps was a nightmare. The development tools were primitive, debugging was nearly impossible, the few standards weren't adhered to and different browsers behaved and rendered differently.

From my usual .NET developer perspective, the release of the Microsoft .NET Framework around 2002 provided ASP.NET Web Forms so that server-side programs could interact with web browsers and programmatically construct HTML for the browser to render. All this did was take the basic HTTP request-response protocol and insert a ridiculously complicated pipeline of events into the middle of it. The release of ASP.NET MVC several years later, gave you more control over the rendering pipeline, but it just replaced one type of complexity with a different type.

All of the enhancements, extensions and frameworks built around the web in the first 20 years or so to support web apps were basically attempts to dress up a pig in a wedding gown. The underlying engine driving the web is still just built from HTTP, HTML and web browsers which were never designed to support rich, connected and stateful applications.

The REST Disease


For many years on the .NET platform I used SOAP web services, which is a simple and well-defined network protocol. It unfortunately wasn't in widespread use outside the Microsoft ecosystem. Then for some stupid reason, REST became really popular in recent years, which is a terrible tragedy, as it's important to realise that unlike SOAP, REST is not a protocol, it's just a convention. As a result, it's not self-documenting, the formatting of message bodies is not defined, the response codes have narrow meanings, error handling is undefined, and everyone argues about what should be conventional. Like JavaScript, REST was one guy's pet project which seems to have spread like a disease without cure.

The set of verbs (GET, PUT, POST, etc) combined with the HTTP response codes (OK, NotFound, Created, etc) are so restrictive and inflexible that it's practically impossible to fit them over a service that performs realistic business work. It's depressing that something as vague and ill-defined as REST has become so popular. Evidence of this is the fact that the web is jammed with articles arguing about every aspect of REST ... how to process errors, how to transfer in segments, what response codes to use, how to process binary, and so on. REST has poisoned the web. See my other blog post on REST where I attempt to tame it by turning it into a simple protocol.

The JavaScript Disease


I believe that the craze in recent years for developing JavaScript frameworks to drive web applications on the client-side is cancerous and an evolutionary dead-end.

The JavaScript language was created as a hobby project with a stupid misleading name that was a marketing ploy to ride on the trendy name "Java". I have used a dozen scripting languages in the last 30 years, covering a wide variety of styles and platforms, and by a long margin JavaScript is one of the worst in all respects. It's a jumble of functional ideas and procedural constructs: no standard library, inconsistent scoping, no namespaces, null/undefined confusion, no consideration of how to structure large projects, not even any concept of a "project". JavaScript lacks absolutely everything necessary to create large, sensibly structured projects. When compared to scripting languages designed and created by skilled developers, JavaScript really looks like someone's unfinished hobby project.

So here we are 22 years after JavaScript was released, and they're finally attempting to standardise the language (look up the ECMAScript standard). The JavaScript language is so crude and clumsy that people have written whole libraries and pseudo-languages on top of it to make it useful, jQuery and TypeScript for example. The mere fact that anyone would need to write such things is a strong hint that something is rotten.

In recent months I was compelled to research modern JS frameworks seriously so one could be chosen for a possible browser-based app that displayed and managed marketing data statistics. I initially chose the latest Angular because it seemed popular (it even has articles in MSDN magazine) and it was complete, in the sense that you didn't need to glue together multiple frameworks.

I watched 5 hours of a 10-hour-long Pluralsight lesson on Angular and I generated multiple starter projects using different IDEs and commands. It was around this time that I became dismayed and quite shocked by what I discovered about the JS ecosystem.

Firstly ... how can a lesson in writing JavaScript be 10 hours long?! Well now I know ... it's a gigantic framework full of conventions, templates, components, services, binding, routing, validation, pipes, filters, injection, observables, and so on. Angular looks like the result of a graduate student's indulgent thesis project. Why do you need something so monstrously large and complicated just to write a goddamn app in a web browser?! After 5 hours I realised I was watching the cogs spin on a huge over-engineered Rube Goldberg Machine that was displacing useful stuff out of my brain.

Next ... I thought I was hallucinating, but both of the Angular starter projects I created contained about 27,000 files and 1.6 million lines of code. And so I was cruelly reminded that JavaScript is a "scripting language" and does not support compilation or optimisation, or any of the features we normally associate with mainstream languages. I have used scripts for decades to glue things together, make repairs or perform utility work, and it's great that they can be managed as text files, usually of moderate size. It has been a law of programming all of my life that if you find yourself writing huge scripts, then you're digging a hole for yourself and it's time to migrate to a "real" language. So here we are with JavaScript and that old sensible rule has been shredded. JavaScript has become a kind of text-based assembly language, despite the fact that it is utterly unfit for the purpose. As an example, the desktop program Azure Storage Explorer is written using JavaScript based Electron, and the resulting installation folder ludicrously contains about 5000 script files and 3000 folders. This mess is caused by JavaScript dangerously leaking into places where it was never intended to be used.

Summary


So here we are in 2018 attempting to write professional programs for the web that are based upon 25 year old technology that was never designed for the purpose. More and more JS frameworks and tools are being released to assist client-side development, but as far as I'm concerned it's like trying to polish a giant turd, and even claiming to make it edible.

What's the answer? Imagine a parallel universe where all the big companies and standards institutes worked together to create what I call "an app player" that has nothing to do with web browsers, HTML or scripts. The principle behind Flash, Silverlight and Java Applets could be stolen to create a well-defined virtual machine and communications protocol, and a "player" could be created for different platform UIs. A team of talented students could even design and create such a thing.

WebAssembly and Blazor for .NET developers looks like a potential cure for the JavaScript disease, as you can write managed C# code to run in the browser. In early August 2018 the Blazor preview is basically working well and it gives me hope that JS and its endless frameworks will fade into the background and no longer be a primary web development language. However, one terrible problem is not solved by WebAssembly … the rendering. It looks like we will still be completely dependent upon the web browser with HTML (and probably some JavaScript) to render the UI. HTML is completely and utterly inadequate for creating UIs for business applications, as there is no concept of a viewport (screen size), no virtualization of list items, primitive layout features, no custom controls and no interactivity. HTML was invented in the early 1990s for simple drawing of text and images and is incapable of rendering rich UIs that are common in desktop applications. So although WebAssembly and Blazor may mercifully displace JavaScript as the primary client-side web development language, we are still stuck with the brain-dead web browser as the application host environment. Not much progress really.

Wednesday, January 24, 2018

Collections Database History

a.k.a. Thinking outside the relational database box


Like most software developers, I have "pet projects" that are used to try new technologies and platforms while performing some useful utility function for work or leisure. I have one such project that has been running for 36 years ... and it's finally practically finished!

The offending pet project is casually called "the collections database". It started around 1985 as a text DataSet on a Fujitsu OS/IV X8 mainframe. The DataSet originally listed my LP and EP records, but over the decades it grew to include other formats of audio and video titles as well as books, magazines and collectible items, along with pictures and media samples. The various incarnations of this project accidentally provide an interesting historical timeline of changing technology and the never-ending fads that grip the IT industry now and then.

But the main reason I'm writing this post is that, for the first time in decades, after a dozen attempts using different technologies, the project is practically completed. And it was completed in about two weeks of evening hobby time, whereas previous attempts have dragged on unfinished for years.

What happened!? It's basically due to changing from a relational database (SQL Server) to a document database (Cosmos DB). The lesson in this is that RDBs can be used almost by habit because they have been so prevalent for 20 years or so, but the world is changing and there are alternative databases that can liberate you from the confined RDB world. Read on...

The Early Years

In May 1992 I purchased my first PC running Windows 3.1 (see Computers History). I immediately transcribed a fanfold printout of the mainframe DataSet into an Excel spreadsheet. Microsoft Access was released later that year and I imported the data into joined relational tables, thereby undergoing a very useful training exercise in data normalisation and RDB design principles. Despite years of hobby work in Access I remained frustrated by the restrictive, clumsy and verbose development environment and could never produce an application that I liked.

Around the same time I ran experiments with C++, MFC and the Windows SDK to write a native desktop application, but the complexity and difficulty level was so staggeringly high that it would have burnt my life and patience away.

The .NET Years

During the first 10 years after the release of .NET the collections database was restarted many times with project suites named Opus, Jade, Topaz, Nimbus, Agate and Folio. These suites used the following technologies, platforms and kits at various times as their popularity waxed and waned.

  •  Windows Forms  A rich and stable way of creating sophisticated desktop programs, but once you get used to data binding in WPF you will never go back to Windows Forms.
  •  SQL Server  The majority of my suites have used either the express or full versions as the collections database. It's a famous product, but it has a gigantic installation and runtime footprint, you need special skills and tools to maintain it, and even creating a basic database requires schema planning, scripting and management. SQL Server is heavyweight overkill for hobby projects.
  •  SQLite  One incarnation of my suite had abstracted the underlying database away so it could be replaced with SQL Server, SQLite or any other popular relational database. This delicate technical exercise eventually had no practical value and was abandoned. Despite the popularity of SQLite, I found it to be most irritating because of 32/64-bit distribution issues, the shocking difficulty of getting Visual Studio designer support working, everything inside it is stored as text, poor Unicode support, and worst of all: you cannot alter the schema without recreating the whole database. Update Feb-2021 : See SQLite Rehabilitated (it's much better now).
  •  netTiers  This CodeSmith template was originally a fabulous tool for generating database CRUD, but it eventually became dependent upon a non-free version of CodeSmith. It also generated STEs (self-tracking entities) which fell out of favour because they tie your app too closely to the data model. The release of Entity Framework eventually made netTiers redundant.
  •  Entity Framework  EF4 was the first primitive but useful release. I fell for the early trap of using their STEs, which were dropped from later versions because they fell out of favour, and I agree they were a bad idea (except for simple utility work). EF6 has matured well and I use it in a variety of current projects. Never pass raw entities outside the data layer; create special DTO classes for that purpose.
  •  ASP.NET Web Forms  It's really hard to write non-trivial Web Forms apps. The complicated event pipeline combined with the sheer dumbness of HTML will have you clawing your face off trying to create an app that looks good and works reliably on different platforms. The modern chaos of the web itself is partly to blame. See: I'm in the future of the web and it doesn't work.
  •  ASP.NET MVC  This was originally going to be the "Web Forms killer" framework with a greatly simplified pipeline that gave you much more control over rendering. I soon found that they had replaced one heavy and complex Web framework with a different type of complex Web framework. There is so much hidden plumbing and convention inside ASP.NET MVC that it's a real struggle to find out how to do something the correct way. If you have to decide which framework is the least worst, then MVC probably wins. Update Dec-2020 : Forget ASP.NET completely and read Blazor Webassembly Notes.
  •  Silverlight  One of my suites had a beautiful rich Silverlight UI, and I still have Silverlight active in some live applications, very effectively creating web apps that would have been impossible by any other means. Sadly, in 2012 Microsoft announced end-of-life for the product. This leaves us with no rich UI toolkit for rendering business apps in the browser. The only alternative is to use huge slabs of JavaScript with SVG images and the canvas to create the illusion of a rich UI. It's a tragedy. See my related blog post Silverlight Death and Funeral.
  •  SOAP  This XML based network messaging protocol is simple to use and configure, but it has unfortunately been pushed aside by the rapid adoption of web services using the REST convention. This is also a tragedy, as SOAP is a complete standard protocol, whereas REST is a vague convention. See: Web API Status Codes and Errors.
  •  WCF  This was Microsoft's attempt to unify many communications protocols under a single set of service-oriented APIs (including SOAP). Unfortunately, the result is over-complicated and fiendishly difficult to configure and extend. WCF is also eclipsed by the rise in popularity of REST.
  •  JavaScript  Forget it. I recently researched the creation of a browser hosted app using Angular (because of its popularity) but was utterly shocked by what I discovered. The whole JavaScript ecosystem is a putrid smouldering pit. It's so appalling that I have written a separate damning blog post. Although a web search reveals there are already countless scathing anti-JavaScript and Angular articles to read!
All of the frameworks, tools and kits listed above have been used over the last 16 years to write a collections database suite. Many of them are now obsolete, unsupported, out of favour, or too difficult to use. This shows how volatile the software development industry is, riven by fads, competition, conflicts and lack of long-term focus.

Complexity and RDB Tables

All of my attempts to create a collections database suite have been hampered by complexity in two areas: (1) the user interface, and (2) processing a relational database.

Custom user interfaces are delicate and laborious to create, and I think all developers have learned to live with that. The introduction of WPF with XAML and binding improved the productivity and stability of writing desktop app UIs, but creating good quality web UIs remains an odious task.

However, I have been continually frustrated by the effort required to pull and push the required data in and out of relational databases. This is the classical problem sometimes called the impedance mismatch, where data that has been beautifully normalised and stored in RDB tables is unlikely to be in a form suitable for processing by an application. Sometimes editing a single title in the collections database would require stitching together data from several tables and then shredding it to reverse the process. You can use an ORM like Entity Framework, but you have to add packages and references, create the data model, generate files, create DTO classes, perhaps write and map stored procedures, and so on. Just using an RDB bloats your code and increases complexity.

On the Agate project Wiki page I have an archived article which describes how the collection database was reduced to 3NF (3rd normal form). The process of creating normalised RDB tables can be a rewarding experience which has a kind of mathematical elegance. It is this normalised elegance, however, that produces the previously mentioned impedance mismatch and increases project complexity.

With the recent rapidly growing popularity and availability of cloud-based schema-less databases, I suddenly realised it was time to completely rethink how to implement a collections database and app suite. Read on...

Cosmos DB

During the 2017-18 Xmas break I decided to re-familiarise myself with Microsoft's Cosmos DB, which is a rebranding of the DocumentDB product from several months earlier. I was quickly impressed. There are NuGet packages, a simple managed SQL-like API, configuration from the Azure portal and good documentation.

As an experiment, I migrated my complete current SQL Server collections database into a Cosmos DB collection. While doing so, I flattened (de-normalised) all titles into individual documents with nested properties for values like owners, genres, artists, files, tracks, etc.
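As a purely illustrative sketch (not the actual schema used by my suite), a flattened title document might look something like this:

```json
{
  "id": "lp-0042",
  "title": "Tubular Bells",
  "medium": "LP",
  "year": 1973,
  "genres": ["Progressive", "Instrumental"],
  "artists": [{ "name": "Mike Oldfield", "role": "Composer" }],
  "tracks": [
    { "number": 1, "name": "Part One", "seconds": 1560 },
    { "number": 2, "name": "Part Two", "seconds": 1404 }
  ],
  "files": [{ "kind": "CoverScan", "path": "images/lp-0042-front.jpg" }]
}
```

Rows that previously lived in joined child tables become nested arrays inside one self-contained document.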

I suddenly realised that storing my collection as "documents" instead of normalised tables is more natural, and it makes manipulating the data much easier. Each title in the collection is a self-contained document that can be loaded and saved in one go. There is no need to mess about with relational tables, joins, DTOs and similar things that I previously complained about.

To continue the experiment, I created a WPF desktop program to browse and edit the documents (see screenshot). And to my surprise, thanks to the greatly simplified processing, I had a working program over one weekend, and over two weekends it was practically finished. A similar program using an RDB would have taken several times longer.
Side Note: You may occasionally have data which is best represented as a tree structure, or more generally as a graph of connected objects. In this case neither RDBs nor simple documents are suitable, so consider using the Graph API over Cosmos DB documents, which provides a startlingly powerful new way of storing and querying data.

Summary

Relational databases are fabulous if you have data that can be normalised and rarely changes shape, and you don't mind the burden of hosting, designing, scripting and code generation. The history of my collections reveals to me that I was using RDBs out of force-of-habit, and that I had over-normalised my data, which produced elegant tables but caused me to suffer from the impedance mismatch problem mentioned above. As a result of this, my collections projects were always clogged with complex code and libraries to manipulate relational data. Once I realised that my data was better represented as documents and I moved to Cosmos DB, it was like having a huge weight lifted off my shoulders, and thanks to the reduced code and dependencies my project was finally completed after so many decades.

The new collections database management suite is called Hoarder and the full source code is available as a reference in an Azure DevOps repository.