A place for me to put reminders, tips, tricks and 'gotchas' about software development. It's all public in case others might find it useful.
Wednesday, October 15, 2014
Hashing Short Strings
In mid 2014 I resurrected the program, modernised it a bit and added a pick list of 22 different hashing and checksum algorithms to see how their behaviour compared to String GetHashCode. The results were so interesting that I have placed the program's C# Source Code on Bitbucket and I composed the following article to discuss the results in detail:
Hashing Short Strings
The hashing article is buried deep inside my personal web site with one obscure link to it, so I thought it was wise to mention it in this blog to boost its publicity. Fellow geeks who like maths, crypto and hashing should enjoy the article.
Friday, October 3, 2014
sc.exe syntax trick
sc create FooService binPath="C:\blah\foo.exe" start=auto
After staring and squinting at the documentation for some time I eventually noticed that the documentation has a space after the = sign on the options. I have never seen a command on any platform or operating system (including Windows) in the last 30 years that used a syntax like this:
option= SomeValue
(note the space) But it's true! Putting a space after the = signs in my create command makes it work. As the self-described criminal genius Vizzini said, "Inconceivable!".
Tuesday, September 30, 2014
IIS REST Verbs give 404 and 401
404 Not Found
My GET and POST requests all work, but PUT and DELETE return status 404 from IIS and my code is never reached. I found it was necessary to go to IIS Manager > Web Site (or app) > Handler Mappings > double-click SimpleHandlerFactory-Integrated-4.0. In the dialog click Request Restrictions > Verbs tab > add PUT and DELETE and any other verbs you desire to the comma joined list.
401.3 Access is denied
I've overcome the 404 but now I get a security violation. Mercifully the default body of the response tells me it's a 401.3 which is an ACL violation on the ashx file. Procmon unexpectedly does not show me any Access Denied events to help me diagnose the violation. After a few experiments I conclude that you have to give Authenticated Users Read and Write permission to the ashx file.
And there goes another hour of my life I'll never get back.
Addendum March 29th 2016
A brand new small REST service was giving 404 for all DELETE verbs on my live server, but not in development. I followed the instructions above but it made no difference. Then I noticed the Web.config file had a <system.webServer> <handlers> section which was commented out. Inside it, the ExtensionlessUrlHandler-Integrated-4.0 handler was removed and added again with all verbs allowed. I uncommented the section and the deletes started working.
I presume I could also have edited the corresponding handler in the IIS configuration dialogs, but not this time.
Thursday, September 18, 2014
Post Build ILMerge vs LibZ
"%ProgramFiles(x86)%\Microsoft\ILMerge\ILMerge.exe" /out:\\shared\utility\$(TargetFileName) /wildcards $(TargetFileName) *.dll /targetplatform:v4,%windir%\Microsoft.NET\Framework\v4.0.30319
I haven't tried using ILMerge on Framework 4.5 projects, but web searches hint that there are a few hurdles to getting it working.
It is well known that ILMerge does not work on WPF assemblies. The author says:
ILMerge is not able to merge WPF assemblies. They contain resources with encoded assembly identities. ILMerge is unable to deserialize the resources, modify the assembly identities, and then re-serialize them. Sorry!
In 2013 I stumbled upon an ILMerge replacement utility called LibZ (see Codeplex). The author explains the motivation for writing LibZ on the home page and has a nice technical discussion of how it works. Most importantly for me, LibZ has no problem with WPF assemblies. I have replaced all of my ILMerge post build commands like the one above with something like this:
xcopy $(TargetFileName) \\shared\utility /Y /D
libz inject-dll -a \\shared\utility\$(TargetFileName) -i *.dll
Notice that I copy the target file to the shared utility folder where it will finally live, then I process that file. I prefer to do that so the original build output file remains untouched.
Wednesday, September 3, 2014
pkzipc extract to subfolder
pkzipc -ext -dir \\shared\archive\140325.zip * tempfiles
But no matter what I did it kept extracting the files into the current folder. My command looked similar to the sample in the official PDF online documentation (except they used ".." as the output folder).
It turns out you have to make the subfolder first, otherwise pkzipc just silently ignores the output folder name you specify and puts the files in the current folder.
So the sample used ".." which always exists and therefore always works. The weirdly bad example wasted 15 minutes of my time because I thought I had the syntax subtly wrong.
Monday, September 1, 2014
Real Random Numbers
July 2022 Update — The random.org and ANU Quantum web services are now behind paywalls. You have to register with both of them, even for free tier access, and the quota limits for free access are so cripplingly small that the services are now beyond the reach of hobby consumers. I hope that alternative similar services with more generous quotas may become available in the future. Some quick web searches reveal that there are hardware devices available that generate true random numbers, some cheap and simple, some very expensive. YouTube search results also suggest that this topic is clearly of interest to programmers and engineers. Links to sample code below originally written in 2014 have been removed.
November 2022 Update — The random.org Integer Generator does provide a quota that is generous enough for hobby use.
I was pleased to discover that there are many public web services available that provide real random numbers. You can easily generate pseudo random numbers in every modern programming language and platform, but those numbers are generated by deterministic algorithms and are not truly random.
Popular algorithms such as combined LCGs, the Mersenne Twister and the Subtractive Generator produce astronomically long sequences of pseudo random numbers that pass the toughest batteries of tests for randomness. So long as these algorithms are seeded and used cautiously they will satisfy most normal requirements. Be aware though that the internal state of these algorithms can be deduced by watching a certain number of sequential outputs, after which the sequence can be predicted forever. This predictability makes such pseudo random sequences unsuitable for use in cryptography.
When randomness is required in cryptography you should use a cryptographically secure pseudo-random number generator. Developers on the .NET platform can use the RNGCryptoServiceProvider class. Secure random numbers are slower to generate; my RandPlot application shows that a combined LCG can generate 2,300,000 numbers per second whereas the crypto secure class generates 165,000 per second. In practice this 14x speed difference probably won't be an issue because secure random numbers are usually used in small quantities for seeds or keys.
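As a minimal sketch of the secure option, the following helper fills a buffer from the platform's crypto-secure generator. The class and method names (SecureRandomDemo, GetSecureBytes, GetSecureUInt32) are invented for illustration. It uses the RandomNumberGenerator.Create() factory rather than constructing RNGCryptoServiceProvider directly; as far as I know the factory returns that class by default on the .NET Framework.

```csharp
using System;
using System.Security.Cryptography;

public static class SecureRandomDemo
{
    // Returns 'count' random bytes from the crypto-secure generator.
    public static byte[] GetSecureBytes(int count)
    {
        var buffer = new byte[count];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(buffer);
        }
        return buffer;
    }

    // Builds a 32-bit unsigned integer from four secure random bytes.
    public static uint GetSecureUInt32()
    {
        return BitConverter.ToUInt32(GetSecureBytes(4), 0);
    }
}
```

A call such as SecureRandomDemo.GetSecureBytes(32) is typically all that is needed to produce a seed or key.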
Thanks to online services we now have an exciting new alternative to pseudo or secure random numbers: real random numbers generated by natural processes. I found the following services attractive because they are backed by robust theory and they have friendly APIs to allow client applications to consume them.
ANU Quantum Random Numbers Server
The Australian National University generates random data in real-time at 5.7 GBits/sec by measuring quantum fluctuations of the vacuum. They provide a Web API that returns random data as JSON in three selectable formats.
random.org
Randomness is distilled out of atmospheric noise generated by radio receivers tuned between stations. There is an old Web API that returns data as text or XML and there is a new JSON API.
I've personally become quite attracted to the ANU's Quantum generator because of its futuristic flavour, the tantalising and trustworthy theory behind it, the blazing fast speed of the generator and the simple API.
The wonderful thing about using a service like the Quantum generator is that you never have to worry about even the tiniest flaws that may theoretically appear in random numbers generated by algorithms. You will never need to re-seed the generator. The previous century of detailed research and measurement of random number algorithms becomes a historical curiosity when you have access to real random numbers.
Legacy code samples and outdated commentary have been removed [July 2022].
Sunday, August 31, 2014
Setup project user logging
[RunInstaller(true)]
public class MyCustomActions : Installer
{
    public override void Install(IDictionary stateSaver)
    {
        // Custom install processing here
    }

    protected override void OnBeforeUninstall(IDictionary savedState)
    {
        // Custom uninstall processing here
    }

    // Other overrides are available
}
In the Setup project you open the Custom Actions Editor window and add the assembly containing the custom class to the event nodes. The methods will then be called at the appropriate points in installer processing.
It's possible to add custom dialogs to the install sequence, name the input fields and associate them with properties passed to the custom action events. This simple pre-packaged process is described in other web articles and is not the subject of this post.
An Installer derived class inherits the LogMessage method, but the output from this method does not seem to appear in the installer log. If you run msiexec.exe over an MSI file with the /L*v switch to produce the maximum possible verbose output, you will not see the LogMessage output.
For a long time I wondered where the LogMessage output went. Early last year I found an obscure hint in a web page (that I have lost) that says you can add something like this to the CustomActionData in the custom actions designer:
/logfile="C:\temp\custom-log.txt"
Saturday, July 5, 2014
Dynamic C# code compilation
I have created a small example Visual Studio 2015 solution that shows how to dynamically compile a file of C# source code into an in-memory assembly, create an instance of a Type in that assembly and call one of its methods. The full project and source are available in this repository:
https://dev.azure.com/orthogonal/CsCompile
An interesting trick in the project is to define the symbol ORDINARY which causes the project and source to compile in the simple traditional way. Without the symbol dynamic compilation takes place.
The important part of the example code is worth extracting and displaying here.
// Requires: using System; using System.CodeDom.Compiler; using System.IO; using Microsoft.CSharp;
var provider = new CSharpCodeProvider();
var parameters = new CompilerParameters();
parameters.GenerateInMemory = true;
parameters.ReferencedAssemblies.Add("System.Core.dll");
parameters.GenerateExecutable = false;
parameters.CompilerOptions = "/define:ORDINARY";
string codeToCompile = File.ReadAllText(@"..\..\SourceCode\Worker.cs");
var results = provider.CompileAssemblyFromSource(parameters, codeToCompile);
if (results.Errors.HasErrors || results.Errors.HasWarnings)
{
    // Display the results.Errors collection
    return;
}
// Find the compiled type by name and call it through late binding
Type t = results.CompiledAssembly.GetType("cscompile.Worker");
dynamic worker = Activator.CreateInstance(t);
worker.SayHello();
Don't forget though that there are other ways of generating dynamic code. See the MSDN articles titled Using Reflection Emit and Using the CodeDOM. These techniques are much more difficult but they are strongly-typed and more robust than simply feeding free form text into a compiler.
Friday, July 4, 2014
AppDomains from libraries in subfolders
Early 2022 note: The AppDomain class has been basically deprecated after the introduction of .NET Core and later frameworks. This article is therefore only meaningful for the traditional .NET Framework. For more information see .NET Framework technologies unavailable on .NET Core and .NET 5+.
It's a nice deployment pattern to isolate "plug-in" code into library files in subfolders under the application folder and run these plug-ins in a separate AppDomain. By loading libraries and the assemblies they contain into a separate AppDomain it's possible to apply a restrictive security policy to them and it's possible to unload them.
The Assembly Load, LoadFile and LoadFrom methods load libraries into one of the contexts in the AppDomain of the caller, and if this is the initial AppDomain of the Process then it cannot be unloaded until the Process terminates.
Organise your projects like this skeleton:
MyApp
MyApp.Plugin.Common
MyApp.Plugin.PluginOne
MyApp.Plugin.PluginTwo
The application and the plugin projects reference the common library, never each other. All of the plugins should implement an interface defined in the common library which defines their public contract. The plugin classes must be derived from MarshalByRefObject to allow strongly-typed communication via Remoting between the AppDomains.
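The contract described above can be sketched as follows. This is only an illustration: the names IPlugin, SayHello and ClassName1 match the loading code shown in this article, but the method body and the choice to put both types in one listing are my assumptions. In a real solution the interface lives in the common library and the class lives in a plugin library.

```csharp
using System;

// Belongs in the common library (MyApp.Plugin.Common).
// It is the only type the application needs to know about.
public interface IPlugin
{
    string SayHello();
}

// Belongs in a plugin library such as MyApp.Plugin.PluginOne.
// Deriving from MarshalByRefObject lets callers in another AppDomain
// receive a proxy instead of a serialized copy.
public class ClassName1 : MarshalByRefObject, IPlugin
{
    public string SayHello()
    {
        // Illustrative body: report which AppDomain the code is running in.
        return "Hello from " + AppDomain.CurrentDomain.FriendlyName;
    }
}
```

Returning the AppDomain's friendly name is a handy way to confirm that the call really executed in the plugin's domain and not in the caller's.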
When the application is deployed, arrange the folders like this:
Application Folder
theapp.exe
MyApp.Plugin.Common.dll
Application Folder\Subfolder One
MyApp.Plugin.PluginOne.dll
MyApp.Plugin.Common.dll
Application Folder\Subfolder Two
MyApp.Plugin.PluginTwo.dll
MyApp.Plugin.Common.dll
The application can search subfolders at runtime to find plugin libraries. You might use a folder naming convention or place a special XML file of instructions beside the plugins to identify and describe them. When a plugin is identified it can be loaded, called and unloaded like this:
var ads1 = new AppDomainSetup();
ads1.ApplicationBase = folder1.FullName;
var dom1 = AppDomain.CreateDomain("Domain-1", null, ads1);
string file1 = Path.Combine(folder1.FullName, "MyApp.Plugin.PluginOne.dll");
string class1 = "MyApp.Plugin.PluginOne.ClassName1";
var plug1 = (IPlugin)dom1.CreateInstanceFromAndUnwrap(file1, class1);
Console.WriteLine(plug1.SayHello());
AppDomain.Unload(dom1);
This code creates an AppDomain with its base folder set to the subfolder where the plugin was found. It then uses CreateInstanceFromAndUnwrap, passing the full path of the plugin's library file and the name of the class to instantiate in the AppDomain. The returned value (actually a proxy) is cast to the common interface and it can be called like a normal class method. The AppDomainSetup class has many other properties that help configure the AppDomain, such as specifying a config file.
The application and the plugin projects do not reference each other and at runtime they only communicate using an interface over Remoting between the AppDomains.
While writing the real application that uses the technique described above I accidentally used the CreateInstanceAndUnwrap method (without the From) and I received misleading "Assembly not found" errors. It took me hours of suffering before I realised my dyslexic mistake caused by the similar names. The MSDN documentation describes the subtle differences between the two methods.
The technique described in this article is the manual way of implementing plugin isolation via AppDomains and is suitable for simple scenarios. The Managed Extensibility Framework (MEF) and the Managed Add-In Framework (MAF) are worth exploring for more complex scenarios.
Friday, April 18, 2014
Implementing SymmetricAlgorithm
Then I wondered how difficult it was to implement SymmetricAlgorithm and create a class that behaves just like the standard classes (Aes, DES, etc). It turns out to be more tricky and 'fiddly' because there are many interrelated properties and virtual members that are difficult to understand clearly even after reading the MSDN documentation. You also need to implement three classes: the provider and the pair of encrypt/decrypt transform classes.
I set myself the challenge of writing a simple but completely functional set of symmetric algorithm classes that behave just like the standard ones. The internal algorithm is the childish technique of encrypting and decrypting 8-byte blocks by XORing them with a sequence of pseudo-random blocks in either ECB or CBC modes (this algorithm is otherwise known as Snake Oil or Craptography).
The resulting project, code and test data can be downloaded from the repository. The code is well documented, but there are some points worth clarifying.
CanTransformMultipleBlocks
If you set this property to true, then the transform methods may be passed long buffers which are a multiple of the block size. It is your responsibility to process the blocks sensibly. You may set this property true if you know you can process many blocks more efficiently. For the demonstration code I set this property false and only transform single blocks.
Decrypting the final block
During decryption we need to know when the final block is being transformed so the padding can be stripped from it. Unfortunately, the TransformBlock method does not distinguish the blocks and there is no way of telling which block is the final one. However, TransformFinalBlock is always the last method to run and it's always passed input length zero because there are no trailing bytes. A workaround is to delay writing each decrypted block until the next one is read, so writing "lags" one call behind reading. When we hit the TransformFinalBlock call we know that the "lagged" block is the one that must have its padding stripped. This "lag" workaround is a nuisance as it clutters the decryption code a bit.
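The "lag" idea can be sketched in isolation like this. It is not the code from the repository: LaggedDecryptor, Push and Finish are hypothetical names, the cipher is a stand-in that XORs each byte with a constant, and PKCS7-style padding is assumed. A real transform class would do the same bookkeeping inside TransformBlock and TransformFinalBlock.

```csharp
using System;

// Hypothetical helper that shows only the one-block lag, not a full ICryptoTransform.
public sealed class LaggedDecryptor
{
    private const int BlockSize = 8;
    private byte[] lagged;   // most recent decrypted block, not yet released

    // Stand-in for the real block cipher: XOR every byte with a constant.
    private static byte[] DecryptBlock(byte[] block)
    {
        var plain = new byte[BlockSize];
        for (int i = 0; i < BlockSize; i++)
        {
            plain[i] = (byte)(block[i] ^ 0x5A);
        }
        return plain;
    }

    // Called once per cipher block. Releases the previous block,
    // or an empty array on the first call, and holds back the current one.
    public byte[] Push(byte[] cipherBlock)
    {
        if (cipherBlock == null || cipherBlock.Length != BlockSize)
        {
            throw new ArgumentException("Expected exactly one block");
        }
        byte[] ready = lagged ?? new byte[0];
        lagged = DecryptBlock(cipherBlock);
        return ready;
    }

    // Called once at the end. The held-back block must be the final one,
    // so this is the only place where padding is stripped.
    public byte[] Finish()
    {
        if (lagged == null)
        {
            return new byte[0];
        }
        int pad = lagged[BlockSize - 1];
        if (pad < 1 || pad > BlockSize)
        {
            throw new InvalidOperationException("Bad padding");
        }
        var result = new byte[BlockSize - pad];
        Array.Copy(lagged, result, result.Length);
        lagged = null;
        return result;
    }
}
```

The cost of the workaround is visible here: the first call produces no output at all, and every caller has to cope with that.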
ADDENDUM #1
A few days after finishing the sample code I thought I'd replace the childish pseudo-random encryption with something also simple but more realistic. I searched for TEA (Tiny Encryption Algorithm) and discovered that someone had practically duplicated my code using the XTEA algorithm, except their version lacked C# style, lacked some Dispose calls, and they didn't support CBC mode. See: eTutorials 14.4 Extending the .NET Framework.
ADDENDUM #2
Rather than use one of the TEA variants as the encryption algorithm I decided to use a pseudo-random sequence of 64-bit numbers. Such sequences produce nice random looking ciphertext for my demonstration, but it is well documented that they are worthless for serious cryptography. I didn't actually have a 64-bit random number generator handy, so I took this incredibly simple and effective unsigned 32-bit one-liner from Numerical Recipes in C, Chapter 7:
uint s = seed value;
s = s * 1664525u + 1013904223u;
And I turned it into this:
ulong s = seed value;
s = s * 2770643476691ul + 4354685564936844689ul;
The new 64-bit multiplier is a prime that is near the square of the 32-bit multiplier. The new 64-bit addend is a prime near (Sqrt[5]-2) * 2^64 as suggested by Knuth. I have no theoretical proof that these numbers are good, but running 5 billion iterations through the RandPlot application shows no lattice structure or recycling. Based on this heuristic evidence only, I feel that this is a great choice when speed, reliability and simplicity are desirable for random number generation. The low-order bits of the sequence are of course highly correlated, which could be partly cured by a Bays-Durham shuffle, but I didn't bother with it for this sample.
I found the prime numbers thanks to Mathematica's RandomPrime function.
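Wrapped in a class, the 64-bit one-liner might look like the sketch below. The constants are the ones given above; the class name Lcg64 and the NextHigh32 helper are my own additions. Returning only the high 32 bits is one simple way to sidestep the weak low-order bits and is not something the sample project necessarily does.

```csharp
using System;

// Sketch of the 64-bit linear congruential generator described above.
public sealed class Lcg64
{
    private ulong state;

    public Lcg64(ulong seed)
    {
        state = seed;
    }

    // Advances one step and returns the full 64-bit state.
    // Overflow wraps modulo 2^64, which is exactly what the algorithm needs.
    public ulong Next()
    {
        unchecked
        {
            state = state * 2770643476691UL + 4354685564936844689UL;
        }
        return state;
    }

    // Returns only the high 32 bits of the next state.
    public uint NextHigh32()
    {
        return (uint)(Next() >> 32);
    }
}
```

The unchecked block makes the wrap-around explicit, so the code behaves the same even if the project is compiled with overflow checking turned on.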
Sunday, April 13, 2014
Implementing HashAlgorithm
See the 2022 article titled CRC vs Hash Algorithms for a news update about the CRC and XXHash classes that are now part of the .NET Core class libraries. There is no longer any need to use borrowed code for CRC or SHA3.
I often use the classes MD5 and SHA1 for creating data "fingerprints" or secure hashes. I was pleased to discover recently that authors of the SHA-3 hash algorithm have created a C# implementation and published it as a Nuget package. This means .NET developers can easily use this new NIST certified algorithm.
For years I wondered how hard it was to implement your own hash algorithm and make it look and behave like the standard ones. I vaguely guessed it was tricky due to the large numbers of virtual and abstract base class members. However, after running an experiment one morning I found it's actually quite easy. For my experiment I tried to create a HashAlgorithm implementation that produced a 32-bit output using the CRC-32 algorithm. The resulting class is just this:
public class HashCrc32 : HashAlgorithm
{
    private Crc32 crc;

    public HashCrc32()
    {
        crc = new Crc32();
    }

    protected override void HashCore(byte[] array, int ibStart, int cbSize)
    {
        crc.Update(array, ibStart, cbSize);
    }

    protected override byte[] HashFinal()
    {
        byte[] buff = BitConverter.GetBytes((uint)crc.Value);
        Array.Reverse(buff);
        return buff;
    }

    public override void Initialize()
    {
        crc.Reset();
    }

    public override int HashSize
    {
        get { return 32; }
    }
}
The code for the Crc32 class I use above can be found all over the Internet in various slightly different forms which all seem to work correctly. My own copy can be found in the Orthogonal.Common repository; look under the Basic folder.
As you can see, only a handful of members need to be overridden. The virtual HashCore and HashFinal methods are called by all of the familiar public methods that hash buffers and streams in various ways, so by implementing those virtual methods the class will work correctly in all scenarios.
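To show the same pattern without depending on an external Crc32 class, here is a self-contained toy. HashSum32 is a made-up algorithm that simply adds up the input bytes, so it is useless as a real hash; the point is only that overriding the same few members is enough for the inherited ComputeHash method to work.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Toy algorithm for illustration only: a 32-bit sum of all input bytes.
public class HashSum32 : HashAlgorithm
{
    private uint sum;

    public override void Initialize()
    {
        sum = 0;
    }

    protected override void HashCore(byte[] array, int ibStart, int cbSize)
    {
        for (int i = ibStart; i < ibStart + cbSize; i++)
        {
            sum += array[i];
        }
    }

    protected override byte[] HashFinal()
    {
        // Emit the sum most significant byte first, independent of CPU endianness.
        return new byte[]
        {
            (byte)(sum >> 24), (byte)(sum >> 16), (byte)(sum >> 8), (byte)sum
        };
    }

    public override int HashSize
    {
        get { return 32; }
    }
}

public static class HashSum32Demo
{
    // Hashes a string through the inherited public ComputeHash method.
    public static string HashText(string text)
    {
        using (var hasher = new HashSum32())
        {
            byte[] hash = hasher.ComputeHash(Encoding.UTF8.GetBytes(text));
            return BitConverter.ToString(hash);
        }
    }
}
```

Calling HashSum32Demo.HashText("abc") returns "00-00-01-26", because the byte values 97, 98 and 99 add up to 294, which is 0x126.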
Saturday, March 29, 2014
Copying NTFS security
Preamble tale:
A few days ago I was shutting down Windows 7 and it suddenly produced a BSoD telling me it had a SERVICE_EXCEPTION. I always reboot after something like this happens to be sure there's no permanent damage. Sadly, I found the following weird problems:
- Internet Explorer would silently not run.
- Double clicking iexplore.exe did nothing.
- Double clicking iexplore.exe 32-bit produced "File or Directory is corrupt".
- Most of the Start Menu icons had vanished.
- Most of the Administrative Tools Menu icons had turned to white plain ones and did nothing.
After running
chkdsk C: /R
and rebooting I saw a dozen repair messages (including one for iexplore.exe) and Windows seemed to be running mostly normally. The Start Menu had returned, but a few of the Administrative Tools icons were still plain and dead. So the chkdsk had given me enough time to run backups and save settings and configs to prepare for a fresh install.
Ever since I last installed Windows 7 on a new SanDisk 256GB SSD during Xmas 2013 I had been receiving occasional random boot failures which required a power off-on to overcome. I figured this was some quirk of the combination of hardware I had and just ignored it as a minor irritation. The partial failure of the SSD now combined with the random boot failures to provide sufficient evidence that the SanDisk SSD was faulty and failing. So the BSoD and the strange symptoms were all caused by a failing SSD.
Copying NTFS Security
While reinstalling applications on a fresh Windows 7 on a new Kingston brand SSD I decided that the 34GB of Sibelius 7 sound files were a waste of space on the SSD, so instead of sending them to the default C:\Program Files (x86) folder I changed the C: to a D: to send them to a large mostly empty HDD.
However, the resulting D:\Program Files (x86) folder did not have the same special security permissions as its sibling on the C: drive. As a matter of neatness and principle I wanted to give the D: folder the same permissions as the original. You can copy the permissions like this from an elevated command prompt:
icacls "C:\Program Files (x86)" /save "%TEMP%\perms.txt"
icacls D:\ /restore "%TEMP%\perms.txt"
This saves and restores a folder DACL using an intermediate text format. Note that the second command does not specify the target folder name, as it's specified inside the text file, so you specify the target as the parent folder. Because the folder names are identical, there was no need to edit the text file. If the DACL were to be applied to a folder or file of a different name then you would need to manually edit the text file and change the name which is present in the first line of the file. My text file looked like this:
Program Files (x86) D:PAI(A;;FA;;;S-1-5-80-956008885-3418522649-18310 [cut]
SNK, projects, users and key containers
It would be nice to put a single snk file in a well-known location and have all projects reference that file, but the Visual Studio project properties do not seem to support this (if I'm wrong, please explain the trick to me!). To share a snk file it seems necessary to edit the csproj files manually and add something like this:
<PropertyGroup>
  <SignAssembly>true</SignAssembly>
  <AssemblyOriginatorKeyFile>..\Common Files\MyCompany.snk</AssemblyOriginatorKeyFile>
</PropertyGroup>
Another way of sharing a strong name is to put it into a named key container like this:
sn.exe -i MyCompany.snk MyCompany
This puts the snk file data into a protected part of the registry and avoids the need to have the snk file lying around at all. The Visual Studio project properties do not seem to support this either, so you'll have to edit the csproj file and add something like this:
<PropertyGroup>
  <KeyContainerName>MyCompany</KeyContainerName>
</PropertyGroup>
This was working well, but a day later I ran Visual Studio as Administrator and compiles failed telling me that the key container was not found. I eventually found I had to run this command to make the key container name available to all users:
sn -m n
The documentation on this switch is a bit misleading. This means that keys are not user specific (as I thought), but they are available for all users.
Friday, January 24, 2014
Parallel.For with a disposable class
----------------
Chaps, I think I have found the formally correct way of giving each Parallel ForEach thread its own copy of a disposable and unsharable class. There is an overload of ForEach that lets you do something at the start and end of each worker thread. Below is a skeleton of my code. It's interesting to put trace displays in the 3 callback blocks and look at the lifetime of the threads and the order in which things happen.
The following skeleton sample code runs a parallel loop over an enumerable set of file names to calculate a hash of the recursive contents of all files under a folder. A small helper class lives for the duration of each worker thread. Each worker accumulates a subtotal hash which is later safely xor'd into the total hash with a safety lock.
The important thing about the parallel loop is that it has 3 callbacks. The first runs when a worker thread is created, and we simply return a new work data object. The second runs when a worker must process an item, and we hash a file and xor it into the work data sub-hash. The third runs when the worker is finished, and we safely xor the sub-hash into the total.
This pattern is very effective, as a "chunk" of work is done on each worker thread, then each chunk is added to the total. A lock only exists for a short time when each worker ends and adds its sub-hash to the total hash.
While the parallel loop is running, calling cts.Cancel() will asynchronously cause the loop to gracefully cancel all threads and the ForEach will throw an OperationCanceledException that you can catch and report to the caller.
Remember that the Parallel.ForEach call blocks. In more realistic scenarios you would place the ForEach call inside a Task and await it, but I haven't bothered here because it clutters the sample code.
private sealed class WorkData
{
    public HashAlgorithm Hasher = MD5.Create();
    public byte[] SubHash = new byte[16];
}

private byte[] totalHash = new byte[16];

var cts = new CancellationTokenSource();
var po = new ParallelOptions() { CancellationToken = cts.Token };
var filesource = SafeEnumerateFiles(@"C:\temp\testdata");
try
{
    Parallel.ForEach<string, WorkData>(filesource, po,
        // 1. Runs when a worker thread is created
        () =>
        {
            return new WorkData();
        },
        // 2. Runs when a worker must process an item
        (filename, pls, index, data) =>
        {
            po.CancellationToken.ThrowIfCancellationRequested();
            using (var stream = new FileStream(filename, FileMode.Open, FileAccess.Read))
            {
                byte[] hash = data.Hasher.ComputeHash(stream);
                XorBuffer(data.SubHash, hash);
            }
            return data;
        },
        // 3. Runs when the worker is finished
        (data) =>
        {
            lock (totalHash)
            {
                XorBuffer(totalHash, data.SubHash);
                data.Hasher.Dispose();
            }
        });
    Trace("Total hash = {0}", BitConverter.ToString(totalHash));
}
catch (OperationCanceledException)
{
    Trace("The parallel loop was cancelled");
}
ftp.exe and passive mode
We discovered that a C++ app running in an Amazon instance could not communicate with the outside world to manipulate files via FTP. While trying to debug this terrible problem I used the built-in ftp.exe Windows app to try and simulate and solve the problem. I expected that the "vanilla" ftp program would be a good base-level way of testing the situation. It turns out that two wrongs had made an even worse wrong. I think I knew this years ago but forgot: ftp.exe does not support passive mode. By using a different FTP client such as Windows Explorer or FileZilla I found that FTP processing worked fine. It turns out that the C++ app needed to flip a property to use passive mode and it would work, and all the testing with ftp.exe was a dreadful waste of time because instead of testing the problem I was actually reproducing it. Part of the awful waste of time was sending PASV commands to ftp.exe, which seemed to be accepted, but actually did nothing.
Manage IIS with AppCmd.exe
C:\Windows\System32\inetsrv\appcmd.exe
Voluminous help is available on the iis.net Getting Started page. Useful commands are:
appcmd list site
appcmd list app
appcmd list vdir
appcmd list apppool
The output from these commands can help you find orphaned directories and applications.
ROBOCOPY in post build
robocopy source destination switches /NJS /NJH /NP /NDL
if %ERRORLEVEL% LEQ 7 exit /B 0