Saturday, April 17, 2021

BigInteger Factoring

On a recent Sunday afternoon I was writing some code snippets in LINQPad to play with Fermat's Little Theorem and Carmichael Numbers. As part of the experiment I wrote a simple method to factor integers, just using trial division up to the square root. Although it wasn't completely dumb trial division, as my sequence of trial divisors is {2,3,6n±1}, which produces: 2 3 5 7 11 13 17 19 23 25 29 31 etc.

Note: A wheel could be used for more efficient trial division, but for the purposes of my experiments it was overkill.

What surprised me was the overall speed of the trial divisor factoring using the BigInteger struct. I expected factoring time to become unreasonably slow at about 32-bits, but it was factoring 48-bit (15-digit) numbers in sub-second time.

As a further experiment I fed progressively larger primes into the factor method to see when it would hit the "exponential wall" and become impractical.

NLog10SecsFactors
100,0035.00.00PRIME
1,000,0036.00.00PRIME
10,000,0197.00.00PRIME
1,000,000,0079.00.00PRIME
50,000,000,02110.70.01PRIME
100,000,000,00311.00.01PRIME
500,000,000,02311.70.04PRIME
1,000,000,000,03912.00.02PRIME
5,000,000,000,05312.70.05PRIME
10,000,000,000,03713.00.07PRIME
50,000,000,000,05313.70.16PRIME
100,000,000,000,03114.00.24PRIME
500,000,000,000,05714.70.52PRIME
10,000,000,000,000,06116.02.29PRIME
20,000,000,000,000,00316.33.26PRIME
500,000,000,000,000,02117.716.29PRIME
1,000,000,000,000,000,00318.022.84PRIME
2,000,000,000,000,000,05718.332.53PRIME

As the table shows, the simple algorithm hits the 30-second mark at around 18-digit numbers and I stopped there. My PC is a few years old and not particularly powerful, and the factoring was single-threaded, so I'm astonished that the BigInteger class performed so well for so long.

The code

// Naive factoring loops
IEnumerable<BigInteger> Factor(BigInteger n)
{
  double sqr = Math.Sqrt((double)n);
  foreach (long div in Divseq())
  {
    if (n == 1)
    {
      break;
    }
    if (div > sqr)
    {
      yield return n;
      break;
    }
    while (n % div == 0)
    {
      n = n / div;
      yield return div;
    }
  }
}

// Endless sequence of test divisors: 2, 3, 6N±1 ...
IEnumerable<long> Divseq()
{
  yield return 2;
  yield return 3;
  for (long i = 6; ; i += 6)
  {
    yield return i - 1;
    yield return i + 1;
  }
}

Out of curiosity I used Visual Studio 2022's performance profiler to find out what was happening with objects and garbage collection while running tight loops that created BigInteger structs. It turns out that vast numbers of uint arrays are created to hold the internal values of the BigIntegers. You can see the GC reclaiming large amounts of memory every couple of seconds, which results in a typical sawtooth shape in the memory graph. Despite this GC stress, the performance is still surprisingly good.

Note that the BigInteger class is not really needed for the experiments shown above because the long (Int64) is large enough to hold all of the numbers involved. However, I was playing around with much larger numbers in other code not shown in this article. If you convert the Factor method above to use long instead of BigInteger then it runs consistently about 4 times faster.


Sunday, April 11, 2021

Enumerate Azure Storage Accounts

This article uses deprecated classes and libraries. For a modern code sample see: Enumerate Azure Storage Accounts (New)

Last year I pieced-together some C# code which enumerates all of the storage accounts in an Azure subscription. It took many hours of searching and frustrating experiments with bits of sample code from various sources to get the following code working. For my own sanity and in case it helps others, I've pasted a minimal sample below so it doesn't get lost (using fake Ids).

The code can be used as the starting point for a custom utility that scans the contents of all containers, tables and queues in all storage accounts within a subscription. I have created a small project that uses the code in a reusable library (see AzureUtility in DevOps).

Many hours were wasted trying to figure out what the four magic parameter values where, and if and how they were created.

Subscription Id

A guid-like string taken from the Azure portal > Subscriptions > Account blade. Clearly labelled.

Tenant Id

A guid-like string taken from the Azure portal > Active Directory blade. Clearly labelled.

Application Id

In the Azure portal > Active Directory > App Registrations you need to register an app. You can call the App whatever you want, as it doesn't have to physically exist, it's just the name of a "ghost" app that generates an Id and will be given permission to read the Azure subscription.
Then in the Subscriptions > IAM blade add a role assignment "Owner" for the ghost app name. A lesser role may be sufficient to read the subscription, but I ran out of patience for more research.
In May 2023 I found that the Azure Portal UI has changed, so the previous paragraph is not so simple any more. You click Add Role Assigment and are taken through a three step wizard to select the Role, select the principal, then confirm. Apps are not listed in the selection list on the right and you have to manually start typing the name of the app, then it should be listed and can be selected.

Password

The password is created in the Active Directory > App registrations > Certificates & secrets blade. Add a secret for the ghost app and save the generated random string, as it's your first and last chance to see the full secret.
async Task Main()
{
  const string subscriptionId = "02a145d5-d731-40af-903f-59be6d3ef1ca";
  const string tenantId = "3254e567-75e9-4116-9ae2-d3147554faf9";
  const string applicationId = "b06375a4-1211-4368-a796-2e30066d0c27";
  const string password = "SDxB9y5JJU.7q.GJY~tZN4To8CI_4-LCJ_";

  string token = GetAuthorizationHeader(applicationId, password, tenantId).Result;
  var credential = new TokenCredentials(token);
  var storageMgmtClient = new StorageManagementClient(credential) { SubscriptionId = subscriptionId };
  foreach (var s in await storageMgmtClient.StorageAccounts.ListAsync())
  {
    string rg = s.Id.Split('/')[4];
    string key = storageMgmtClient.StorageAccounts.ListKeys(rg, s.Name).Keys[0].Value;
    WriteLine(s.Name);
    WriteLine($"- Id = {s.Id}");
    WriteLine($"- Kind = {s.Kind}");
    WriteLine($"- Primary Location = {s.PrimaryLocation}");
    WriteLine($"- Key = {key}");
    WriteLine($"- Endpoint Blob = {s.PrimaryEndpoints.Blob}");
    WriteLine($"- Endpoint Table = {s.PrimaryEndpoints.Table}");
    WriteLine($"- Endpoint Queue = {s.PrimaryEndpoints.Queue}");
    WriteLine($"- Endpoint File = {s.PrimaryEndpoints.File}");
  }
}

private static async Task<string>> GetAuthorizationHeader(string applicationId, string password, string tenantId)
{
  ClientCredential cc = new ClientCredential(applicationId, password);
  var context = new AuthenticationContext("https://login.windows.net/" + tenantId);
  var result = await context.AcquireTokenAsync("https://management.azure.com/", cc);
  if (result == null)
  {
    throw new InvalidOperationException("Failed to obtain the JWT token");
  }
  return result.AccessToken;
}