Saturday, December 4, 2021

Miscellaneous Gotchas

Bitmap to BitmapSource

I was using CreateBitmapSourceFromHBitmap to convert a GDI Bitmap to a BitmapSource for use in a WPF app, but the resulting images were colour "flattened" like they had a reduced palette. Web searches revealed about four different ways of doing the conversion, and they all produced the same result.

It turns out all the conversion code was working, but the ImageList providing the source images had the default ColorDepth=8Bit, which was too small for the actual images. Recreating the ImageList with 16Bit depth and re-adding the images to its collection fixed the problem.

The first static method mentioned above works well, so ignore all the other ways of converting.
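
For the record, the conversion helper I ended up with looks roughly like this (a sketch; the DeleteObject P/Invoke is the usual precaution against leaking the GDI handle):

using System;
using System.Runtime.InteropServices;
using System.Windows;
using System.Windows.Interop;
using System.Windows.Media.Imaging;

static class BitmapConverter
{
  [DllImport("gdi32.dll")]
  private static extern bool DeleteObject(IntPtr hObject);

  // Convert a GDI+ Bitmap into a WPF BitmapSource.
  public static BitmapSource ToBitmapSource(System.Drawing.Bitmap bitmap)
  {
    IntPtr hBitmap = bitmap.GetHbitmap();
    try
    {
      return Imaging.CreateBitmapSourceFromHBitmap(
        hBitmap, IntPtr.Zero, Int32Rect.Empty,
        BitmapSizeOptions.FromEmptyOptions());
    }
    finally
    {
      DeleteObject(hBitmap); // release the GDI handle
    }
  }
}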

Incredible Exceptions

When starting the Web API project I began getting weird exceptions like "could not load CoreLib.XmlSerializer" and "unexpected EOF or 0 bytes from the transport stream". I thought I had a terrible bug, an HTTPS misconfiguration or a mismatch in vital libraries, and I could not locate where in my code these errors were coming from. I eventually realised I had turned off VS2022 Debugging > Just My Code and I was seeing exceptions from deep inside the .NET runtime. I don't remember changing that setting, but ticking it back on restored my previous calm debugging experience.

ASP.NET Core Out-of-Process

I deployed two ASP.NET Web API services to Azure in different virtual directories, one was .NET 5 and the other .NET 6. The second started with this error:

500.35 ANCM Multiple In-Process Applications in same Process
    The worker process can't run multiple in-process apps in the same process.

I knew this was caused by the new In-Process feature of .NET Core, but it took almost an hour to find a way of reverting to the old behaviour by putting this line in the project files:

<AspNetCoreHostingModel>OutOfProcess</AspNetCoreHostingModel>

Stopping and starting the Azure App Service then allowed both apps to run side by side.

C# Records

I was so keen on the new C# 9 record types that I created a dozen of them in a new project because they save so much code. You can create a full featured struct or class like this:

public record MyThing(int Id, string Name, string? Note = null);

The trouble was, a week later I had to use the records in the requests and responses of a Web API and I was quickly reminded that they are not serializable because they don't have parameterless constructors. I had to turn them back into normal classes.

Documentation claims that records work correctly with JSON serialization, so this issue is still being researched.
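
For what it's worth, System.Text.Json (the default in ASP.NET Core) is documented to bind JSON properties to a positional record's primary constructor from .NET 5 onwards, so a round-trip like this sketch should work:

using System;
using System.Text.Json;

// Round-trip check: the JSON properties bind to the primary constructor.
var json = JsonSerializer.Serialize(new MyThing(1, "Widget"));
MyThing? copy = JsonSerializer.Deserialize<MyThing>(json);
Console.WriteLine(copy);

public record MyThing(int Id, string Name, string? Note = null);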

Cosmos DB LINQ Expression

After updating to the latest Microsoft.Azure.Cosmos 3.22.1 package (I forget the old version) I received a BadRequest response from a simple LINQ query. The problem was this where clause in the LINQ expression:

...Where(t => t.Pictures.Length == 0)

The array length comparison could not be translated into a Cosmos SQL statement using ARRAY_LENGTH; instead it generated a function named ArrayLength() which failed. I'm not sure if this is a bug in the LINQ provider (unlikely?) or mixed versions of something. The fix was to use this condition instead:

...Where(t => t.Pictures.Count() == 0)

Minimal Web APIs

Creating a Web API project requires a huge amount of repetitive ceremonial code, so I was quite thrilled to learn about the arrival of the minimal Web API in ASP.NET Core. An attempt to migrate an older controller-based project to the new format ended poorly. The actions are now performed by methods of the App (WebApplication Type) instead of methods in controller classes. The old project used a T4 template to generate dozens of method stubs, so that needed a complete refactor to get going.
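
For contrast, a bare-bones sketch of the new shape (the route and payload are illustrative):

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Actions are lambdas mapped directly on the WebApplication,
// not methods in a controller class.
app.MapGet("/widgets/{id}", (int id) => Results.Ok(new { Id = id }));

app.Run();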

The stumbling block was attempting to put a variety of custom attributes on the minimal methods. Some Swagger documentation can't be put on the minimal methods. Some filters aren't available either, and it took a while to stumble upon some web articles that list the shortcomings of minimal APIs.

So for now I consider them an interesting and useful experiment, but their limits should have been clearly documented. I'll return to considering migration once the shortcomings are plugged.

Validation and Nullable

After updating to Visual Studio 2022 and enabling nullable reference types, I received a validation error response when attempting to POST a class in the request body. I was not using any validation, so it made no sense.

It turns out my class had a string Id property which I was saving with a null value because it would be set on the server side. So unexpected and unwanted validation was happening silently. A crude but quick workaround was to define a special string UnsetId = "_UNSET_ID" which replaced the null value.
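
In code the workaround looks roughly like this (the class name is illustrative; the UnsetId constant is the one mentioned above):

public class TitleRequest
{
  public const string UnsetId = "_UNSET_ID";

  // Non-nullable, so implicitly validated as required; the sentinel
  // value keeps the validator happy and is replaced server-side.
  public string Id { get; set; } = UnsetId;

  // Nullable properties are not subject to the implicit validation.
  public string? Note { get; set; }
}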

Minimal Web API Authorization

In a controller-based Web API project you can define an Attribute which implements IAuthorizationFilter to intercept every request and inspect the headers for the presence of a valid key. In a minimal API there are no methods to put attributes on, so you have to use a middleware class and intercept the requests in the InvokeAsync method. There are many samples available online.

If you use context.Response.WriteAsync(string) like the samples do then you will get an empty response body. I found a call to context.Response.WriteAsJsonAsync was needed to get a body.
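
Here is a sketch of the middleware approach (the header name and key check are illustrative, not from any particular sample):

public class ApiKeyMiddleware
{
  private readonly RequestDelegate _next;

  public ApiKeyMiddleware(RequestDelegate next) => _next = next;

  public async Task InvokeAsync(HttpContext context)
  {
    if (!context.Request.Headers.TryGetValue("X-Api-Key", out var key)
        || key != "expected-key")
    {
      context.Response.StatusCode = StatusCodes.Status401Unauthorized;
      // WriteAsync gave me an empty body; WriteAsJsonAsync works.
      await context.Response.WriteAsJsonAsync(new { Error = "Invalid API key" });
      return;
    }
    await _next(context);
  }
}

Register it before the endpoints with app.UseMiddleware<ApiKeyMiddleware>().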

Minimal Web API no XML

The extension method AddXmlSerializerFormatters doesn't seem to add automatic XML support as it does in controller-based projects. One web post pointing to a line of the framework source hints that only JSON is supported.


More to come ... I'm in the middle of renovating a large project suite while moving it up to .NET 6, and I'm stumbling over many weird 'gotchas' that need documenting for posterity.

Monday, November 29, 2021

C# Compiler Source Generators

T4 History

I've been a fan of the T4 Text Template facility since I stumbled upon it about 15 years ago. A .tt file in a project with the build action TextTemplatingFileGenerator blends smoothly into the project, edit experience and build process. There are plenty of good articles around on how to use T4 Text Templates.

Writing a .tt file can be a delicate process at first, but once you get the hang of coding the <# #> and <#+ #> pairs you will soon see that vast amounts of repetitive code can be reliably generated with ease. I have used T4 successfully in many large projects.

There are no serious disadvantages to using T4 as far as I'm concerned. You may want to find a Visual Studio extension for .tt file syntax colouring, otherwise the editor shows the contents as plain text. One .tt file generates one output file by default, but if you want to generate multiple split files (like old Entity Framework templates) then look for some utility classes people have published to allow that.

Roslyn Source Generators


In mid 2021 I was very interested to hear that source generation had become a Roslyn compiler feature. In a nutshell, you write an annotated class that acts like a special compiler analyzer, reference the class in a project, and it is invoked in the compile stage after the syntax tree has been built. The generator can inspect the whole tree and use it to generate code which is added to the compilation. There are several good tutorials available online.

At the time of writing this I've only had time to sanity check that source generation works as advertised. It does of course, but there are some serious worries.
  • Creating a source generator is a moderately complicated process compared to T4 templates. You have to create a class that follows strict conventions and composes the source code as a string, resulting in lots of tedious string manipulation (see the sketch after this list). The consuming project needs a special reference to use the generator.
  • I can't find any way of debugging a source generator. I've seen samples of rather cryptic code that create a mock compile environment, but it looks like hours of research.
  • The whole process seems to be silent, that is, there is no clue about what is happening inside the source generation process. Days later I found that you have to set <EmitCompilerGeneratedFiles>true</EmitCompilerGeneratedFiles> in the consuming project to preserve the generated files, which can be found deep under the obj folder. A mistake in the generator will produce typical build errors, but double-clicking an error does not jump to the generated source file.
  • Examining the compiler generated syntax tree is really tricky (see next section).
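
For orientation, the skeleton of a generator looks roughly like this (a sketch of the ISourceGenerator API as it stood in 2021; the generated text is a placeholder):

using Microsoft.CodeAnalysis;

[Generator]
public class HelloGenerator : ISourceGenerator
{
  public void Initialize(GeneratorInitializationContext context)
  {
    // A syntax receiver can be registered here (see the next section).
  }

  public void Execute(GeneratorExecutionContext context)
  {
    // Compose the source as a string and add it to the compilation.
    context.AddSource("Hello.g.cs",
      "namespace Generated { public static class Hello { public const string Greeting = \"Hi\"; } }");
  }
}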

OnVisitSyntaxNode

The generator can register a syntax receiver whose OnVisitSyntaxNode method is called for each syntax tree node the compiler builds, and the receiver typically uses the provided information to discover what code generation is required. The problem is ... figuring out what is passed to this method, how to cast it and how to use it is harder than abstract algebra. Dozens of samples perform weird casts and call mysterious methods to get information about classes and members. There are no clues about what information is available to this method or how to retrieve it.

I'm sure the Roslyn generated tree and classes are all documented somewhere, but where? I'm actually quite angry about how bewildering it is to try and do something useful in this method. Luckily I found enough bits of sample code to get me going.
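
The pattern those samples converge on is a syntax receiver that filters nodes by type, something like this sketch:

using System.Collections.Generic;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public class ClassCollector : ISyntaxReceiver
{
  public List<ClassDeclarationSyntax> Classes { get; } = new();

  public void OnVisitSyntaxNode(SyntaxNode node)
  {
    // Every node in every syntax tree passes through here;
    // pattern matching on the node type is the usual filter.
    if (node is ClassDeclarationSyntax cds)
      Classes.Add(cds);
  }
}

The receiver is registered in the generator's Initialize method with context.RegisterForSyntaxNotifications(() => new ClassCollector()).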

An article I found suggests installing the .NET Compiler Platform SDK to provide a visualiser experience inside Visual Studio (I just did, and it works). I also remembered that LINQPad has a similar visualiser feature. These tools may help you progress and understand how to write better generators.

Summary


The source generation feature is cleverly implemented at just the right level in the build process, but it's just so delicate to code, deploy and consume correctly. One little mistake and it all goes to hell.

If you are generating "code from your own code", then source generation is the most appropriate technique. The classic example is generating INotifyPropertyChanged members, as discussed in one of the links above.

If you are generating code (or anything else) from arbitrary rules you have invented, then T4 Text Templates are the best choice. For example, I often create an XML file containing a mixture of rules about what classes, interfaces, stubs, etc. are required in the projects, then T4 templates read that XML file and generate whatever is needed in different parts of the app. You don't just have to generate C# code; I've previously used T4 to generate test data and XAML files for a WPF desktop app.

Monday, November 22, 2021

Count Lines of Code

If you have written any utility which counts lines of code, then throw it away because I accidentally discovered someone has done the job ... properly! I've deleted my old scripts because they were really tricky to write and the numbers they produced could only be regarded as approximations. Parsing arbitrary file contents accurately is inherently difficult and I'm glad someone with more patience, expertise and time has done the job for me.

See ▸ Count Lines of Code on GitHub

Just download cloc.exe and put it in a folder with your other favourite utilities.

There are dozens of command line switches to control the behaviour. I originally excluded files with 'Generated' in the names and other patterns for files that are not human authored, but it turns out it wasn't necessary because it can somehow tell the difference between different types of generated files. For example, it conveniently produces separate result lines for 'C# Generated' and 'C# Designer'. I'm not sure if it's looking at the file name or special contents of the files, but I'll assume it knows what it's doing.
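
Typical usage is just pointing it at a folder, with switches as needed (I'm quoting these from memory, so check --help):

cloc MySolutionFolder
cloc --exclude-dir=bin,obj MySolutionFolder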

It's simple to use and it produces neat, concise and probably accurate statistics.

Monday, September 27, 2021

Cosmos DB Request Unit Logging

This post replaces an earlier one titled Catch Cosmos Client Statistics where I briefly explained how you can use a request handler to intercept statistics about each database query. The statistics include the Request Units (RUs) expended by each query, which is correlated to the actual cost of running the database.

A request handler class is provided with more information than I realised a couple of weeks ago. You can fully reconstruct each query and retrieve matching statistics such as RUs, elapsed time, database size and record count. All of this information can be combined to create concise logging output which can help you pinpoint performance bottlenecks or expensive queries.

Construct a CosmosClientBuilder class with the endpoint and key to the Cosmos account. Use AddCustomHandlers to add an instance of a class derived from RequestHandler. Override SendAsync and extract the following information:

request.ActivityId (a Guid string)
request.Method
request.RequestUri
request.Content (see note #1 below)
request.Headers.ContinuationToken
response.Headers["x-ms-request-charge"]
response.Headers["x-ms-resource-usage"] (see note #2 below)

Note #1 - Content

The Content Stream is null for GET requests which are generated by simple read-by-id queries, but is present for POST requests generated by more complex queries. I found the Content to actually be a MemoryStream containing the full query text. It can be extracted like this:

// Content is a MemoryStream holding the full query text
byte[] buff = new byte[request.Content.Length];
request.Content.Read(buff, 0, buff.Length);
string query = Encoding.UTF8.GetString(buff);

Note #2 - Usage

The database usage information is unfortunately embedded in a joined string of key=value pairs, but an easy way of extracting the values is like this:

  string s = response.Headers["x-ms-resource-usage"];
  Match m = Regex.Match(s, @"documentsCount=(\d+)");
  int docCount = int.Parse(m.Groups[1].Value);
  

Take care when referencing the Header collection values, as the keys present in different requests are a little bit unpredictable. It's safest to use Headers.TryGetValue("key").

Experimenting with this code in my own hobby suite called Hoarder revealed subtle bugs which were causing bursts of duplicate queries. One bug was a race between different callers reading some important records which were supposed to be cached, but multiple reads were running before any value was cached. It's interesting to watch the raw database query activity as your app runs. Bugs and performance problems can be quickly identified.

At one point I saw a set of 6 duplicate queries and I thought it was a bug. It turns out my query was over all records, and what I was seeing was one initial query followed by 5 continuation queries. The request.Headers.ContinuationToken value tells you if a continuation query is running.

You can aggregate the useful information I've mentioned above and send it back to parent apps via a callback, or send it to your preferred logging library.

Addendum: When the handler is hosted in a web app live in Azure I found that request.Headers.ActivityId was sometimes null and other information was missing. I think it's safe to just skip logging in these cases, as an identical request will occur soon after with all the expected data available.


Sunday, September 12, 2021

Cosmos DB Client V3

The latest Cosmos DB V3 client library makes coding easier for .NET developers. The class model is simpler and more sensible, and the LINQ extensions let you write concise and type-safe queries.

In the old library, to get the Id and Name of every comedy CD in my library I would need to construct a query string like the following and pass it to a long complicated database query call:

SELECT c.Id, c.Name FROM c
  WHERE c.Type=2 AND
      c.Media="CD" AND
      ARRAY_CONTAINS(c.Genres, {Name:"Comedy"}, true)

I was using nameof() to help strongly-type the resulting string (not shown), but it was rather clumsy and verbose. Now I can code a strongly-typed LINQ statement with intellisense:

var query = container.GetItemLinqQueryable<Title>()
    .Where(t => t.Type == DocumentType.Title &&
        t.Media == "CD" &&
        t.Genres.Any(g => g.Name == "Comedy"))
    .Select(t => new { t.Id, t.Name });

The LINQ query expression tree is converted into an actual database query and the results are retrieved by:

var feed = query.ToFeedIterator();
return await IterateFeedResults(feed);

The last method is a helper I made up, and it's a great practical example of how you can turn a loop of async calls into a return value which is a convenient async enumerable sequence:

async IAsyncEnumerable<T> IterateFeedResults<T>(FeedIterator<T> feed)
{
  while (feed.HasMoreResults)
  {
    foreach (var item in await feed.ReadNextAsync())
    {
      yield return item;
    }
  }
}

The logical structure of a Cosmos database is now more closely represented by the classes. The preamble to running queries is simply this:

var client = new CosmosClient(uri, dbkey);
var database = client.GetDatabase("MyDatabase");
var container = database.GetContainer("MyContainer");
var query = container.GetItemLinqQueryable(...);

For more information about how I enjoy using document databases like Cosmos DB, see my old blog post titled Collections Database History.

Monday, September 6, 2021

Catch Cosmos Client statistics

Skip this article and read the more detailed replacement above titled Cosmos DB Request Unit Logging.

This is another reminder to myself, but others may find it useful. You can add a custom handler to the CosmosClient to catch performance information and other statistics in the request headers, then save them for reporting. The technique is identical to adding a chain of handlers to the HttpClient class. You catch important information at a single function point and keep your code clean.

Construct a CosmosClientBuilder class with the endpoint and key to the Cosmos account. Use AddCustomHandlers to add an instance of a class derived from RequestHandler.

The derived class will override SendAsync, await the response and extract important information from the headers. (A LINQPad dump of some GetItemAsync request headers appeared here.)

Use a callback from the handler or some similar technique to extract the interesting values.

Thursday, September 2, 2021

Web API arbitrary response data

This is about .NET Framework, not .NET Core.

A couple of times a year I have to return a Web API response with a specific status code and a serialized object in the body. Sounds easy, but every time this happens I spend 30-60 minutes searching until my fingers bleed for the simplest answer. I always find lots of stupidly complex answers in forums, and the MSDN documentation leads you down all sorts of complicated options that involve formatters that assume a specific restrictive response type.

All you need to do is something like this example in a controller:

if (password != "XYZZY")
{
  return Content(
     HttpStatusCode.Forbidden,
     new MyErrorData() { Code = 123, Message = "Password failure" }
  );
}

You can specify any response status you like and put any object of your choice into the body and it will be serialized back to the client using the active formatter. So the object will return as XML or JSON or whatever.

There are so many confusing pre-baked responses like Ok, BadRequest, etc, along with weird variations and overloads, that it's hard to choose the one you need. It turns out that Content() is the simplest general purpose way of sending a response.

Web API in .NET Core uses different techniques for what I've discussed here.

Wednesday, August 25, 2021

SDK Project attributes

The newer SDK-style project files (*.csproj) do not support all of the Assembly attributes that could be used in older project files. A simple example is the AssemblyTrademarkAttribute. There is no <Trademark> element defined in the new projects. Arbitrary attributes can however be added like this sample:

<ItemGroup>
    <AssemblyAttribute Include="System.Reflection.AssemblyTrademarkAttribute">
      <_Parameter1>My Trademark</_Parameter1>
    </AssemblyAttribute>
</ItemGroup>

It's a bit ugly and I can't find formal documentation of the elements, but it works fine. Some people are using this technique to add InternalsVisibleTo attributes to their assemblies without the need to manually add an AssemblyInfo.cs file to their project.

An interesting trick is to create your own Attribute with a string property and constructor parameter, then use it like this (the attribute class itself is sketched after the snippet):

<ItemGroup>
    <AssemblyAttribute Include="MyCompany.AssemblyBuildTimeAttribute">
      <_Parameter1>$([System.DateTime]::UtcNow.ToString("yyyy-MM-dd HH:mm:ss"))</_Parameter1>
    </AssemblyAttribute>
</ItemGroup>
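
The attribute class behind that snippet would look something like this sketch (matching the MyCompany.AssemblyBuildTimeAttribute name used above):

using System;

namespace MyCompany
{
  [AttributeUsage(AttributeTargets.Assembly)]
  public sealed class AssemblyBuildTimeAttribute : Attribute
  {
    public string BuildTime { get; }

    public AssemblyBuildTimeAttribute(string buildTime) => BuildTime = buildTime;
  }
}

The value can be read back at runtime with Assembly.GetCustomAttribute<AssemblyBuildTimeAttribute>().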

A couple of weeks later I realised that only attributes with string constructor arguments can be used this way. If you try to add CLSCompliant for example, you'll find there's no way to pass a bool value to it. Searches hint that this behaviour is expected and suggest you put extra assembly attributes in a separate cs file in the traditional way.

October 2023 Update

You can now create non-string attributes. I found this GitHub issue: https://github.com/dotnet/msbuild/issues/2281. I also read last week in the msbuild documentation that the fix has been released, but I can't find it... It's something to do with _Parameter1_IsLiteral. To be continued...


Wednesday, August 4, 2021

Delete Azure Tenant

Some time in the past few years I was playing with the Azure B2C feature and I accidentally created two extra Tenants under my Azure Subscription. When I signed in to my Visual Studio subscription I could see 3 tenants listed in various menus and lists. I made occasional passing attempts to delete the irritating extra Tenants, but the delete page always told me there were active users and applications that had to be removed first, although those lists were empty.

Today I was so fed-up with the Tenant delete failures that I did more research and experiments and I finally succeeded in deleting the two extra useless Tenants.

In the Azure portal go to Azure Active Directory > Manage Tenants > click a useless Tenant in the list to open the details blade on the right and copy the Tenant ID.

Run PowerShell ISE as Administrator. Connect to the useless Tenant.

Connect-AzureAD -TenantId <TenantID>

List objects in the Tenant.

Get-AzureADServicePrincipal

There will be an unpredictably long list of objects. Try to remove as many as possible using this loop.

Get-AzureADServicePrincipal |
    Select-Object -ExpandProperty ObjectId |
    ForEach-Object { Remove-AzureADServicePrincipal -ObjectId $_}

This removed everything from one of my Tenants and I could then delete it in the portal.

The other Tenant had about half of the objects removed and I still couldn't delete it because the portal said I had apps registered, although the app list was empty. I went to various pages and stumbled around, signed out and in again, and after a few minutes returned to the app list page. NOW there was an unfamiliar app listed and it said I couldn't delete it. I went ahead and deleted it anyway, and it worked. Now I could delete the final Tenant.

So overall, deleting Azure AD Tenants is an unpredictable and frustrating process. You get misleading or missing information from the portal pages. Luckily, some web searches told me to use PowerShell to find and remove the objects that were blocking deletion, and even that process seemed nondeterministic.

It took me about two hours of concentrated effort to delete my two useless Tenants.

Saturday, April 17, 2021

BigInteger Factoring

On a recent Sunday afternoon I was writing some code snippets in LINQPad to play with Fermat's Little Theorem and Carmichael Numbers. As part of the experiment I wrote a simple method to factor integers, just using trial division up to the square root. Although it wasn't completely dumb trial division, as my sequence of trial divisors is {2,3,6n±1}, which produces: 2 3 5 7 11 13 17 19 23 25 29 31 etc.

Note: A wheel could be used for more efficient trial division, but for the purposes of my experiments it was overkill.

What surprised me was the overall speed of the trial divisor factoring using the BigInteger struct. I expected factoring time to become unreasonably slow at about 32-bits, but it was factoring 48-bit (15-digit) numbers in sub-second time.

As a further experiment I fed progressively larger primes into the factor method to see when it would hit the "exponential wall" and become impractical.

N                           Log10    Secs   Factors
100,003                       5.0    0.00   PRIME
1,000,003                     6.0    0.00   PRIME
10,000,019                    7.0    0.00   PRIME
1,000,000,007                 9.0    0.00   PRIME
50,000,000,021               10.7    0.01   PRIME
100,000,000,003              11.0    0.01   PRIME
500,000,000,023              11.7    0.04   PRIME
1,000,000,000,039            12.0    0.02   PRIME
5,000,000,000,053            12.7    0.05   PRIME
10,000,000,000,037           13.0    0.07   PRIME
50,000,000,000,053           13.7    0.16   PRIME
100,000,000,000,031          14.0    0.24   PRIME
500,000,000,000,057          14.7    0.52   PRIME
10,000,000,000,000,061       16.0    2.29   PRIME
20,000,000,000,000,003       16.3    3.26   PRIME
500,000,000,000,000,021      17.7   16.29   PRIME
1,000,000,000,000,000,003    18.0   22.84   PRIME
2,000,000,000,000,000,057    18.3   32.53   PRIME

As the table shows, the simple algorithm hits the 30-second mark at around 18-digit numbers and I stopped there. My PC is a few years old and not particularly powerful, and the factoring was single-threaded, so I'm astonished that the BigInteger class performed so well for so long.

The code

// Naive factoring loops
IEnumerable<BigInteger> Factor(BigInteger n)
{
  double sqr = Math.Sqrt((double)n);
  foreach (long div in Divseq())
  {
    if (n == 1)
    {
      break;
    }
    if (div > sqr)
    {
      yield return n;
      break;
    }
    while (n % div == 0)
    {
      n = n / div;
      yield return div;
    }
  }
}

// Endless sequence of test divisors: 2, 3, 6N±1 ...
IEnumerable<long> Divseq()
{
  yield return 2;
  yield return 3;
  for (long i = 6; ; i += 6)
  {
    yield return i - 1;
    yield return i + 1;
  }
}

Out of curiosity I used Visual Studio 2022's performance profiler to find out what was happening with objects and garbage collection while running tight loops that created BigInteger structs. It turns out that vast numbers of uint arrays are created to hold the internal values of the BigIntegers. You can see the GC reclaiming large amounts of memory every couple of seconds, which results in a typical sawtooth shape in the memory graph. Despite this GC stress, the performance is still surprisingly good.

Note that the BigInteger class is not really needed for the experiments shown above because the long (Int64) is large enough to hold all of the numbers involved. However, I was playing around with much larger numbers in other code not shown in this article. If you convert the Factor method above to use long instead of BigInteger then it runs consistently about 4 times faster.


Sunday, April 11, 2021

Enumerate Azure Storage Accounts

This article uses deprecated classes and libraries. For a modern code sample see: Enumerate Azure Storage Accounts (New)

Last year I pieced-together some C# code which enumerates all of the storage accounts in an Azure subscription. It took many hours of searching and frustrating experiments with bits of sample code from various sources to get the following code working. For my own sanity and in case it helps others, I've pasted a minimal sample below so it doesn't get lost (using fake Ids).

The code can be used as the starting point for a custom utility that scans the contents of all containers, tables and queues in all storage accounts within a subscription. I have created a small project that uses the code in a reusable library (see AzureUtility in DevOps).

Many hours were wasted trying to figure out what the four magic parameter values were, and if and how they were created.

Subscription Id

A guid-like string taken from the Azure portal > Subscriptions > Account blade. Clearly labelled.

Tenant Id

A guid-like string taken from the Azure portal > Active Directory blade. Clearly labelled.

Application Id

In the Azure portal > Active Directory > App Registrations you need to register an app. You can call the App whatever you want, as it doesn't have to physically exist; it's just the name of a "ghost" app that generates an Id and will be given permission to read the Azure subscription.

Then in the Subscriptions > IAM blade add a role assignment "Owner" for the ghost app name. A lesser role may be sufficient to read the subscription, but I ran out of patience for more research.

In May 2023 I found that the Azure Portal UI has changed, so the previous paragraph is not so simple any more. You click Add Role Assignment and are taken through a three-step wizard to select the Role, select the principal, then confirm. Apps are not listed in the selection list on the right; you have to manually start typing the name of the app, then it should be listed and can be selected.

Password

The password is created in the Active Directory > App registrations > Certificates & secrets blade. Add a secret for the ghost app and save the generated random string, as it's your first and last chance to see the full secret.

async Task Main()
{
  const string subscriptionId = "02a145d5-d731-40af-903f-59be6d3ef1ca";
  const string tenantId = "3254e567-75e9-4116-9ae2-d3147554faf9";
  const string applicationId = "b06375a4-1211-4368-a796-2e30066d0c27";
  const string password = "SDxB9y5JJU.7q.GJY~tZN4To8CI_4-LCJ_";

  string token = await GetAuthorizationHeader(applicationId, password, tenantId);
  var credential = new TokenCredentials(token);
  var storageMgmtClient = new StorageManagementClient(credential) { SubscriptionId = subscriptionId };
  foreach (var s in await storageMgmtClient.StorageAccounts.ListAsync())
  {
    string rg = s.Id.Split('/')[4];
    string key = storageMgmtClient.StorageAccounts.ListKeys(rg, s.Name).Keys[0].Value;
    WriteLine(s.Name);
    WriteLine($"- Id = {s.Id}");
    WriteLine($"- Kind = {s.Kind}");
    WriteLine($"- Primary Location = {s.PrimaryLocation}");
    WriteLine($"- Key = {key}");
    WriteLine($"- Endpoint Blob = {s.PrimaryEndpoints.Blob}");
    WriteLine($"- Endpoint Table = {s.PrimaryEndpoints.Table}");
    WriteLine($"- Endpoint Queue = {s.PrimaryEndpoints.Queue}");
    WriteLine($"- Endpoint File = {s.PrimaryEndpoints.File}");
  }
}

private static async Task<string> GetAuthorizationHeader(string applicationId, string password, string tenantId)
{
  ClientCredential cc = new ClientCredential(applicationId, password);
  var context = new AuthenticationContext("https://login.windows.net/" + tenantId);
  var result = await context.AcquireTokenAsync("https://management.azure.com/", cc);
  if (result == null)
  {
    throw new InvalidOperationException("Failed to obtain the JWT token");
  }
  return result.AccessToken;
}

Saturday, March 27, 2021

Blazor ignore rewrite rule

The root folder of a Blazor app contains a Web.config file with the following rewrite rule to send all requests into the WebAssembly runtime:

<rewrite>
  <rules>
    <rule name="Serve subdir">
      <match url=".*" />
      <action type="Rewrite" url="wwwroot\{R:0}" />
    </rule>
    :
  </rules>
</rewrite>

Note that it intercepts every request without exception. But you may want exceptions, like my case where the Blazor app was deployed into the root of a web site which had apps in existing virtual directories. The Blazor rule was "hiding" the virtual directories so all requests for them produced the standard Blazor "nothing at this address" message.

To make a path exempt from Blazor routing you can insert a rule like the following before the one above:

    <rule name="Ignore myapp" stopProcessing="true">
      <match url="myapp" />
      <action type="None" />
    </rule>

Now the request for path myapp will be ignored and processed normally without reaching Blazor.

Hours of web searches failed to reveal the trick above, so I'm wondering if I'm the only person in the world who hit this routing problem. After reading the MSDN documentation on the rewrite module I eventually stumbled upon the correct rule. You can combine multiple paths into one large rule using the <conditions> element, which is explained in the documentation.
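
I haven't needed the combined form myself, but it should look something like this sketch (the pattern values are illustrative):

    <rule name="Ignore apps" stopProcessing="true">
      <match url=".*" />
      <conditions logicalGrouping="MatchAny">
        <add input="{REQUEST_URI}" pattern="^/myapp" />
        <add input="{REQUEST_URI}" pattern="^/otherapp" />
      </conditions>
      <action type="None" />
    </rule>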

One remaining problem is how to deploy a Blazor app with the Web.config already edited to contain the extra ignore rule. Another hour of searching fails to reveal an answer. It seems that a Blazor deploy will overwrite any existing Web.config and I can't find any pre-deploy point where the file can be adjusted.

Wednesday, February 24, 2021

Generating Excel documents in C#

From a .NET project I needed to generate an Excel document with a single sheet containing titles, numbers and dates with simple but pleasant formatting. I found there are two ways of doing this without using 3rd party products.

Open XML

This is the "low level" option. Add NuGet package DocumentFormat.OpenXml to your project and following the instructions in the Open XML SDK documentation pages.

The advantage of this option is that you can create and manipulate modern Office documents without any external dependencies.

The disadvantage of this option is the staggering complexity of manipulating document elements at the lowest level. It's like writing Office assembly language.

I tried to create an Excel document with simple formatting and I almost died of exhaustion from the effort. I was going reasonably well until I had to create styles, and all of the examples I could find on how to build sets of styles were so verbose, fragile and complicated that I threw in the towel.

Using Open XML is an option for specialists who need to manipulate Office documents without dependencies and have the staff or time to research how to do it properly. Web searches hint that there are so many quirks and "gotchas" that there is danger of madness.

COM Interop


April 2023 Note -- In the following section I describe how to create a new sheet and format and fill cells. It turned out in my particular case that it was much simpler to use Excel to manually format a skeleton document in exactly the way I desired, setting cell formats, columns widths, fonts, etc, then load the document and simply put values into the cells. This technique is highly recommended if you have sheets with a regular and structured layout, such as a report with columns. For other more complex cases though, you'll have to fall back to verbose code for cell formatting.

After abandoning the Open XML option I attempted to use classic COM Interop to manipulate an Excel document. I used COM to read Excel sheets about 10 years ago and I remembered that it was only moderately tricky.

Add a project reference to the Microsoft.Office.Interop.Excel PIA to get access to the COM interfaces and classes for working with Excel documents.

This option is certainly much easier than using Open XML, but there's an initially steep learning curve before you "get the feel of it" and become productive. It took me 6 hours of solid work and intense web searching to create a simple but nicely formatted document. Some of the COM interfaces are strongly-typed and you get helpful intellisense in Visual Studio, but many properties are of type object and there is no way of figuring out what to do without running many web searches in the hope that you will find a working sample. Seemingly simple tasks like setting cell borders, background colours or column widths require obscure bits of code that would be nearly impossible to discover without the help of samples from other poor people who have cracked the secrets.

📌 Important Note — COM references must be released or copies of Excel will remain running in your user Process. In the following code you will see careful use of ReleaseComObject to release every COM object. Take special note of the symmetry of the startup and shutdown code. For more information see Release Excel COM Objects.

Following are snippets of useful COM code that took me hours to discover.

Create a new customised Excel document

_Application app = new Application();
Workbooks books = app.Workbooks;
_Workbook book = books.Add();
Sheets sheets = book.Sheets;
_Worksheet sheet = book.ActiveSheet as _Worksheet;
sheet.Name = "Product Versions";
sheet.PageSetup.Orientation = XlPageOrientation.xlLandscape;
app.StandardFontSize = 10;

Create a style

Style headstyle = book.Styles.Add("HeadStyle");
headstyle.HorizontalAlignment = XlHAlign.xlHAlignCenter;
headstyle.Font.Bold = true;
headstyle.Interior.Color = System.Drawing.ColorTranslator.ToOle(System.Drawing.Color.PowderBlue);
// --repeat--

Create a row of 4 heading text cells

Range range = sheet.Range["A1", "D1"];
range.Style = headstyle;
range = sheet.Range["A1"];
range.Cells.NumberFormatLocal = "@";   // This means Text
range.ColumnWidth = 24;    // First column is wider
Marshal.ReleaseComObject(range);
sheet.Cells[1,1] = "Heading A1";
sheet.Cells[1,2] = "B1";
// --etc--

Set some cells as dates

range = sheet.Range["B3", "H3"];
range.Style = footstyle;
Marshal.ReleaseComObject(range);
range.Cells.NumberFormat = "dd-MMM-yyyy";
sheet.Cells[3, 2] = UpdatedTime.ToOADate();

Setting cell borders

Setting cell background colours causes the borders to be overwritten. If you want cell borders around background coloured cell then you have to set the borders manually.

range = sheet.Range["C1", "D12"];
range.Cells.Borders.Color = System.Drawing.ColorTranslator.ToOle(System.Drawing.Color.Gainsboro);
range.Cells.Borders.LineStyle = XlLineStyle.xlContinuous;
Marshal.ReleaseComObject(range);

Graceful shutdown

Finally, the workbook is saved to a file and everything is closed and released (even sheets, which wasn't used). I prefer not to set app.Visible = true, but simply use Process.Start to "launch" the file afterwards. Start Task Manager or procexp.exe and ensure that an instance of Excel can be seen starting and then ending when your Interop code runs.

string savename = @"C:\temp\Sample.xlsx";
if (File.Exists(savename)) File.Delete(savename);
book.SaveAs(savename);
book.Close();
app.Quit();
Marshal.ReleaseComObject(sheet);
Marshal.ReleaseComObject(sheets);
Marshal.ReleaseComObject(book);
Marshal.ReleaseComObject(books);
Marshal.ReleaseComObject(app);
Process.Start(savename);

By piecing together bits of the above code it should be possible to create an Excel document that has custom formatting for font size, cell background colours, cell content category types and cell borders.

Handling of cells and ranges is really clumsy and I'm wondering if there's a better way than referencing single cells with the Cells[row,col] syntax and ranges with Range["A1","B2"] format. Mixing the formats is quite irritating.

There are many other appearances and techniques I haven't tried, like text colour, merging cells, font styles (italic, bold, etc). It's pretty likely that just about anything that's possible in Excel can be done via COM Interop as well, but finding out how will surely require meticulous reading of the documentation or hours of web searching for working samples.

Be careful of the order of code. I found that some code would be frustratingly ineffective until I carefully reordered a couple of lines. I was often just shuffling code around until it worked.

I haven't figured out exactly how to mimic the formatting of cells as you do in Excel with the Ctrl+1 dialog. The number formats make little sense at the moment.

Good luck. It's the least worst way of making Excel documents without 3rd party libraries.

Sunday, February 21, 2021

SQLite rehabilitated

About 10 years ago I attempted to use SQLite in some projects, but getting it integrated and working nicely in a .NET project was a fragile and frustrating process. I remember a cavalcade of compile and runtime errors caused by 32/64 bit confusion and making bad choices due to scatter-brained documentation.

As an experiment this weekend I tried to process one of my old SQLite databases in a modern .NET Core project via Entity Framework, and I'm very pleased to report that it was a really smooth process. I now consider SQLite rehabilitated and a good candidate for anyone who needs an in-process relational database that requires little to no configuration and has no external dependencies.

The SQLite documentation points out that you don't have to use it for relational data, it can be used as a mini file-system for storing arbitrary key-value data. Their simplest example is a table with a string primary key and another BLOB column for anything you like. A decade ago I would have considered using an ESENT database for that sort of thing, but now I consider SQLite to be a far superior choice.
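
A sketch of that key-value pattern, here using the Microsoft.Data.Sqlite ADO.NET provider (an assumption on my part; any SQLite client would do):

using Microsoft.Data.Sqlite;

// A string key plus a BLOB column is the whole "file system".
using var conn = new SqliteConnection(@"Data Source=C:\temp\kv.db3");
conn.Open();
var cmd = conn.CreateCommand();
cmd.CommandText = "CREATE TABLE IF NOT EXISTS Files (Name TEXT PRIMARY KEY, Content BLOB)";
cmd.ExecuteNonQuery();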

What follows is a reminder to myself about how to process a SQLite database in a .NET Core project.

Add the NuGet package Microsoft.EntityFrameworkCore.Sqlite.

Create the classes for the tables. Here are skeletons of two tables that had a one-many relationship.

public sealed class Media
{
  public long Id { get; set; }
  public string Name { get; set; }
  :
  public List<Volume> Volumes { get; set; }
}
public sealed class Volume
{
  public long Id { get; set; }
  public string Title { get; set; }
  public long MedId { get; set; }
  :
  public Media Media { get; set; }
}

The relationship is one Media to many Volumes. Note also that the foreign key column is named MedId, which caused a confusing runtime error. When I ran a query like this:

var source = context.Volume.Include(v => v.Media);

I got the runtime error column MediaId is not defined, which is quite true: that name is not defined anywhere in my code or the database. It turns out that Entity Framework assumes a foreign key is named after the principal table with an Id suffix (MediaId in this case). Had I chosen the name MediaId instead of MedId I wouldn't have had this problem. To work around it I had to override OnModelCreating in the following code that defines the context.

public class OpusContext : DbContext
{
  public DbSet<Media> Media { get; set; }
  public DbSet<Volume> Volume { get; set; }
  protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
  {
    optionsBuilder.UseSqlite(@"Data Source=C:\temp\opus.db3");
  }
  protected override void OnModelCreating(ModelBuilder modelBuilder)
  {
    modelBuilder.Entity<Volume>()
      .HasOne(v => v.Media)
      .WithMany(m => m.Volumes)
      .HasForeignKey(nameof(EFLiteTest.Volume.MedId));
  }
}

This manually defines the relationship and which column is the foreign key.
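
With that in place, queries read like any other EF Core code; a small usage sketch (the ordering is illustrative):

using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;

using var context = new OpusContext();
var volumes = context.Volume
  .Include(v => v.Media)   // eager-load the parent Media row
  .OrderBy(v => v.Title)
  .ToList();
foreach (var v in volumes)
{
  Console.WriteLine($"{v.Media.Name} / {v.Title}");
}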

In summary: by adding the package and defining the row and context classes in the traditional way I had a complete way of processing the SQLite database with all of the strongly-typed convenience provided by Entity Framework.

Note - Types

Remember that SQLite has a lot of quirks compared to other big name RDBs. The weirdest thing is that it's "weakly typed" (in my words) or "flexibly typed" (in their words). You can create a table with columns that all have strictly defined types and lengths as you would in SQL Server, but SQLite ignores the column definitions and will let you put anything of any length into any column at runtime. This is all described on their quirks documentation page. If you're using Entity Framework and not using raw INSERT statements, then this quirk is irrelevant because all data manipulation will be strongly-typed.