Recently I have been working on an API and ran into an issue where a third-party application would happily accept "invalid" XML and carry on. The main culprits were out-of-spec boolean and date fields.

The reason this came up is that we are wrapping a third-party API and want to point users at our new API through a DNS change, so the URL resolves to our new server and client applications do not notice any difference.

TL;DR; Go to the Code

Here is a very quick example of the main issues:

C# Object


[XmlRoot(ElementName = "XmlObject")]
public class XmlObject
{
    [XmlElement(ElementName = "AString")]
    public string AString { get; set; }

    [XmlElement(ElementName = "ABool", DataType = "boolean")]
    public bool ABool { get; set; }

    [XmlElement(ElementName = "ADate", DataType = "dateTime")]
    public DateTime ADate { get; set; }
}

XML

<XmlObject>
    <AString>Some Random String</AString>
    <ABool>False</ABool>
    <ADate>2019-04-1</ADate>
</XmlObject>

Unless you have dealt with XML a lot, the first question is: what is wrong with that? There are two errors according to the XML Schema datatype rules. First, the boolean needs to be lower case (false). Second, the date needs a two-digit day (2019-04-01).
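
To see the strictness for yourself, here is a minimal sketch that feeds both the original and the corrected values to XmlConvert, the strict converter in System.Xml. The class name StrictParsingDemo and the helper Accepts are made up for this illustration and are not part of the API project.

```csharp
using System;
using System.Xml;

public static class StrictParsingDemo
{
    // Returns true when the strict parse succeeds, false when it throws FormatException.
    public static bool Accepts(Action parse)
    {
        try
        {
            parse();
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Booleans: only the lower-case form passes.
        Console.WriteLine(Accepts(() => XmlConvert.ToBoolean("False"))); // False
        Console.WriteLine(Accepts(() => XmlConvert.ToBoolean("false"))); // True

        // Dates: the single-digit day is rejected, the two-digit day passes.
        Console.WriteLine(Accepts(() =>
            XmlConvert.ToDateTime("2019-04-1", XmlDateTimeSerializationMode.RoundtripKind)));  // False
        Console.WriteLine(Accepts(() =>
            XmlConvert.ToDateTime("2019-04-01", XmlDateTimeSerializationMode.RoundtripKind))); // True
    }
}
```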

.NET Core Web API is great for adding content negotiation to accept XML as well as JSON. Add a NuGet reference to Microsoft.AspNetCore.Mvc.Formatters.Xml, call .AddXmlSerializerFormatters() chained from .AddMvc(), and it's set up and working with a to-spec XML parser dropped right into the pipeline.

On the other hand, if we need custom deserialization, we need to have a custom implementation of XmlSerializerInputFormatter returning a custom XmlReader that manipulates the strings in the XML to prevent boolean or date parsing from failing.


Code

So on to the code, because that's why you all read this.

The first thing to add is our custom XmlReader, by way of XmlTextReader.

We need to override two methods and a constructor. XmlSerializerInputFormatter only uses the Stream constructor, so that's all I have here. YAGNI FTW.

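The two methods cover different kinds of content: ReadElementContentAsString returns the text of an element, while ReadContentAsString returns text content at the reader's current position, such as an attribute value.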

using System;
using System.IO;
using System.Xml;

public class CustomXmlTextReader : XmlTextReader
{
    public CustomXmlTextReader(Stream stream) : base(stream)
    {
    }

    public override string ReadContentAsString()
    {
        return Normalize(base.ReadContentAsString());
    }

    public override string ReadElementContentAsString()
    {
        return Normalize(base.ReadElementString());
    }

    // Rewrites lenient boolean and date text into the form XmlConvert expects;
    // anything else is returned unchanged.
    private static string Normalize(string text)
    {
        if (bool.TryParse(text, out var boolParseResult))
        {
            return XmlConvert.ToString(boolParseResult);
        }

        if (DateTime.TryParse(text, out var dtParseResult))
        {
            return XmlConvert.ToString(dtParseResult, XmlDateTimeSerializationMode.RoundtripKind);
        }

        return text;
    }
}
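
To get a feel for what the rewrite does to individual values, here is a minimal sketch that runs the same bool/date normalization on its own, outside of any reader. The class LenientXmlText and its Normalize method are hypothetical names for this illustration. It also passes CultureInfo.InvariantCulture to DateTime.TryParse, which is an assumption on my part to keep the output independent of the server's locale; the reader above uses the current culture.

```csharp
using System;
using System.Globalization;
using System.Xml;

public static class LenientXmlText
{
    // Hypothetical helper mirroring the rewrite done in the reader overrides.
    public static string Normalize(string text)
    {
        // bool.TryParse ignores case, so "False" and "TRUE" are both accepted.
        if (bool.TryParse(text, out var flag))
        {
            return XmlConvert.ToString(flag);
        }

        // Assumption: invariant culture, so the result does not depend on locale.
        if (DateTime.TryParse(text, CultureInfo.InvariantCulture, DateTimeStyles.None, out var date))
        {
            return XmlConvert.ToString(date, XmlDateTimeSerializationMode.RoundtripKind);
        }

        // Anything that is neither a bool nor a date passes through untouched.
        return text;
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("False"));              // false
        Console.WriteLine(Normalize("2019-04-1"));          // 2019-04-01T00:00:00
        Console.WriteLine(Normalize("Some Random String")); // Some Random String
    }
}
```

Keep in mind that the rewrite does not know which field it is looking at: a plain string value such as "True" would come back as "true" as well.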

Next is to create a custom XmlSerializerInputFormatter and override CreateXmlReader so it returns our CustomXmlTextReader.

This isn't the greatest way to do things, since every bit of XML that comes in will get parsed with our custom reader, but if you dig into the source code of XmlSerializerInputFormatter it always returns a text reader anyway, so it's not too terrible.


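using System.IO;
using System.Text;
using System.Xml;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Mvc.Formatters;

public class CustomXmlSerializerInputFormatter : XmlSerializerInputFormatter
{
    // Only care about this one since the others are going away.
    public CustomXmlSerializerInputFormatter(MvcOptions options) : base(options)
    {
    }

    // Hand the serializer our lenient reader instead of the default one.
    protected override XmlReader CreateXmlReader(Stream readStream, Encoding encoding)
    {
        return new CustomXmlTextReader(readStream);
    }
}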

And finally we get to the content negotiation. If we just needed the defaults, we could do the following. If you don't need crazy custom stuff, do this. It's the new way and we don't have to set things up manually.

public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc().AddXmlSerializerFormatters();
}

However, we are doing custom stuff, so we have to do it old skool:


public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc(config =>
    {
        // Add XML content negotiation
        config.InputFormatters.Add(new CustomXmlSerializerInputFormatter(config));
        config.OutputFormatters.Add(new XmlSerializerOutputFormatter());
        // Map the "xml" format name to the application/xml media type
        config.FormatterMappings.SetMediaTypeMappingForFormat("xml", "application/xml");
    });
}

So now we are up and running with non-standard XML parsing, able to take a bit more of the muck people throw at us.


Done but not finished

This post was made possible by Stack Overflow and diving into the source code. The two most helpful answers on Stack Overflow were Implementing the custom XmlReader and Using a custom XmlSerializer. Of course I didn't see anyone put them together to get the functionality I wanted, so after getting everything working I typed this up.