After an hour of searching and fiddling around, I came to the conclusion that none of the System.IO classes correctly report a file's encoding, which means you can't simply and automatically round-trip a file's encoding when it's processed as a text file. Other comments on the web seem to support this, but if anyone knows otherwise, please let me know.
I reluctantly wrote the following code to detect a file's text encoding.
private static Encoding CalcEncoding(string filename)
{
    // Collect every encoding that declares a preamble (BOM), longest
    // preamble first so a longer BOM (e.g. UTF-32 LE: FF FE 00 00) isn't
    // shadowed by an encoding whose BOM is a prefix of it (UTF-16 LE: FF FE).
    var prencs = Encoding.GetEncodings()
        .Select(e => e.GetEncoding())
        .Select(e => new { Enc = e, Pre = e.GetPreamble() })
        .Where(e => e.Pre.Length > 0)
        .OrderByDescending(e => e.Pre.Length)
        .ToArray();
    using (var reader = File.OpenRead(filename))
    {
        var lead = new byte[prencs.Max(p => p.Pre.Length)];
        // Read may return fewer bytes than requested for a short file,
        // so remember how many we actually got.
        int got = reader.Read(lead, 0, lead.Length);
        var match = prencs.FirstOrDefault(p =>
            got >= p.Pre.Length &&
            p.Pre.SequenceEqual(lead.Take(p.Pre.Length)));
        return match == null ? null : match.Enc;
    }
}
This method 'sniffs' the file and returns the first encoding whose preamble bytes match the start of the file. It's clumsy to have to do this. If you get null back you have to choose a suitable default encoding, and new UTF8Encoding(false) is a good choice on Windows, where UTF-8 without a BOM is the default for most text file processing.
Once you have the original encoding (or a suitable default), pass it into the StreamWriter's constructor and you can be sure that the original encoding and BOM will be preserved.
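To make the round trip concrete, here is a minimal, self-contained sketch. The file path and sample text are invented for the demo, and CalcEncoding is repeated inside the snippet (made public, with the longest-preamble-first ordering) purely so it compiles on its own:

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class RoundTripDemo
{
    // Same sniffing approach as above, repeated so this snippet is standalone.
    public static Encoding CalcEncoding(string filename)
    {
        var prencs = Encoding.GetEncodings()
            .Select(e => e.GetEncoding())
            .Select(e => new { Enc = e, Pre = e.GetPreamble() })
            .Where(e => e.Pre.Length > 0)
            .OrderByDescending(e => e.Pre.Length) // longest BOM wins
            .ToArray();
        using (var reader = File.OpenRead(filename))
        {
            var lead = new byte[prencs.Max(p => p.Pre.Length)];
            int got = reader.Read(lead, 0, lead.Length);
            var match = prencs.FirstOrDefault(p =>
                got >= p.Pre.Length &&
                p.Pre.SequenceEqual(lead.Take(p.Pre.Length)));
            return match == null ? null : match.Enc;
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();

        // Simulate an existing file: UTF-16 LE with a BOM (FF FE).
        File.WriteAllText(path, "hello", Encoding.Unicode);

        // Detect the encoding, falling back to BOM-less UTF-8 when no
        // preamble is found.
        var enc = CalcEncoding(path) ?? new UTF8Encoding(false);
        Console.WriteLine(enc.WebName); // utf-16 for this file

        // Rewrite with the detected encoding: StreamWriter emits the same
        // BOM, so the original encoding is preserved.
        using (var writer = new StreamWriter(path, false, enc))
            writer.Write("hello again");

        var lead = File.ReadAllBytes(path);
        Console.WriteLine(lead[0] == 0xFF && lead[1] == 0xFE); // True

        File.Delete(path);
    }
}
```

Note that BOM sniffing can only ever identify encodings that write a preamble; a UTF-8 file saved without a BOM is indistinguishable from ANSI text by this test, which is exactly why the fallback matters.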