Performance oriented .NET XML and JSON serialization

The Microsoft .NET Framework ships with several out-of-the-box serializers for transforming data. The best known, available since .NET 1.0, is XmlSerializer, while DataContractSerializer has grown more popular since .NET 3.0.

They are not the only two serializers the framework offers, however. So in this essay, let’s look at the different serializers in the .NET Framework and how they differ from each other.

So here’s the list of serializers:

  • XmlSerializer: The most commonly used XML serializer.
  • JavaScriptSerializer: Introduced with the ASP.NET AJAX Extensions for .NET 2.0 and now marked as obsolete; it primarily provides JSON serialization.
  • DataContractSerializer: Introduced in .NET 3.0 with Windows Communication Foundation (WCF); it is the default serializer in WCF.
  • NetDataContractSerializer: A lesser-known serializer that includes CLR type information in the serialized XML, which DataContractSerializer does not.
  • DataContractJsonSerializer: Introduced in .NET 3.5; handy for generating JSON output for an entity.

Next, let’s define an Employee class and a serializer factory, and try out XmlSerializer, DataContractSerializer, NetDataContractSerializer and DataContractJsonSerializer with examples.

Step 1 – Defining the Employee class

The Employee class has to be shaped differently for different serializers. XmlSerializer requires a public parameterless (default) constructor, while the other serializers do not. The serializers based on DataContractSerializer rely on the DataContract and DataMember attributes on the class and its members, while XmlSerializer needs no such attributes: it works with a public class and serializes its public read/write properties and fields.

    using System.Runtime.Serialization; // provides [DataContract] and [DataMember]

    ///<summary>
    /// Employee class for the DataContract-based serializers
    ///</summary>
    [DataContract]
    public class Employee
    {
        [DataMember]
        public string Name { get; set; }
        [DataMember]
        public int EmployeeId { get; set; }
        ///<summary>
        /// Note: Default constructor is not mandatory
        ///</summary>
        public Employee(string name, int employeeId)
        {
            this.Name = name;
            this.EmployeeId = employeeId;
        }
    }
    ///<summary>
    /// Employee class for XmlSerializer
    ///</summary>
    public class Employee
    {
        public string Name { get; set; }
        public int EmployeeId { get; set; }
        ///<summary>
        /// Parameter-less constructor is mandatory
        ///</summary>
        public Employee() { }
        public Employee(string name, int employeeId)
        {
            this.Name = name;
            this.EmployeeId = employeeId;
        }
    }
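
The constructor requirement can be checked up front. The following is a minimal sketch, not part of the original sample: the types WithDefaultCtor and WithoutDefaultCtor and the helper CanCreateSerializer are hypothetical names introduced only for illustration. It relies on XmlSerializer rejecting a type without a parameterless constructor with an InvalidOperationException when the serializer is constructed.

```csharp
using System;
using System.Xml.Serialization;

// Hypothetical types used only for this illustration.
public class WithDefaultCtor
{
    public string Name { get; set; }
    public WithDefaultCtor() { }
}

public class WithoutDefaultCtor
{
    public string Name { get; set; }
    public WithoutDefaultCtor(string name) { Name = name; }
}

public static class XmlSerializerCheck
{
    // Returns true when XmlSerializer accepts the type, false when it
    // rejects it (for example, because no parameterless constructor exists).
    public static bool CanCreateSerializer(Type type)
    {
        try
        {
            new XmlSerializer(type);
            return true;
        }
        catch (InvalidOperationException)
        {
            return false;
        }
    }
}
```

Calling XmlSerializerCheck.CanCreateSerializer(typeof(WithoutDefaultCtor)) is expected to return false, while the same call with typeof(WithDefaultCtor) should return true.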

Step 2 – Defining the Serialization Factory

To build the serialization factory, we define an enum SerializerType and a factory class SerializerFactory, and add a reference to System.Runtime.Serialization using the “Add Reference” option.

    public enum SerializerType
    {
        ///<summary>
        /// XmlSerializer
        ///</summary>
        Xml,
        ///<summary>
        /// DataContractJsonSerializer
        ///</summary>
        JSON,
        ///<summary>
        /// DataContractSerializer
        ///</summary>
        WCF,
        ///<summary>
        /// NetDataContractSerializer
        ///</summary>
        CLR
    }

The factory class could simply create a new object for each enum (SerializerType) value. However, creating a serializer object is expensive, so we would like to cache it in memory for reuse. The factory class has therefore been optimized with a Dictionary of serializers, keyed by the type being serialized and the serializer type.

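    // Requires: using System; using System.Collections.Generic;
    // Note: the plain Dictionary cache below is not thread-safe.
    public static class SerializerFactory
        {
            private static Dictionary<Type, Dictionary<SerializerType, object>> _knownObjects;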
            static SerializerFactory()
            {
                _knownObjects = new Dictionary<Type, Dictionary<SerializerType, object>>();
            }
            internal static ISerializer<T1> Create<T1>(SerializerType serializerType)
            {
                Type type = typeof(T1);
                if (_knownObjects.ContainsKey(type))
                {
                    if (_knownObjects[type].ContainsKey(serializerType))
                        return ((ISerializer<T1>)_knownObjects[type][serializerType]);
                }
                ISerializer<T1> returnValue = null;
                switch (serializerType)
                {
                    case SerializerType.Xml:
                        returnValue = new XmlSerializer<T1>();
                        break;
                    case SerializerType.JSON:
                        returnValue = new JsonSerializer<T1>();
                        break;
                    case SerializerType.WCF:
                        returnValue = new WcfSerializer<T1>();
                        break;
                    case SerializerType.CLR:
                        returnValue = new ClrSerializer<T1>();
                        break;
                    default:
                        throw new NotSupportedException("Unknown serializer type");
                }
                if (_knownObjects.ContainsKey(type) == false)
                    _knownObjects.Add(type, new Dictionary<SerializerType, object>());
                _knownObjects[type].Add(serializerType, returnValue);
                return returnValue;
            }
        }
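
The Dictionary-based cache above is not safe to use from several threads at once. As a hedged alternative, here is a minimal sketch of the same caching idea built on ConcurrentDictionary. The class name SerializerCache and the GetOrCreate method are hypothetical and not part of the original source; the SerializerType enum is repeated so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Concurrent;

// Repeated from the article so this sketch is self-contained.
public enum SerializerType { Xml, JSON, WCF, CLR }

// Hypothetical thread-safe variant of the serializer cache.
public static class SerializerCache
{
    // Key: (type being serialized, serializer kind); value: cached serializer.
    private static readonly ConcurrentDictionary<Tuple<Type, SerializerType>, object> Cache =
        new ConcurrentDictionary<Tuple<Type, SerializerType>, object>();

    // Returns the cached serializer for T and the given kind,
    // creating it with the supplied factory on first use.
    public static TSerializer GetOrCreate<T, TSerializer>(
        SerializerType serializerType, Func<TSerializer> factory)
        where TSerializer : class
    {
        var key = Tuple.Create(typeof(T), serializerType);
        return (TSerializer)Cache.GetOrAdd(key, _ => factory());
    }
}
```

Under contention GetOrAdd may invoke the factory more than once, but only one instance is stored and returned to all callers, which is usually acceptable for a cache of serializers.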

Step 3 – The Main Program (consuming application)

Our main program should be able to serialize a single Employee or a list of Employee objects, as shown below:

   class Program
        {
            static void Main(string[] args)
            {
                List<Employee> employees = new List<Employee>()
                {
                    new Employee("Tim", 1392902),
                    new Employee("Shawn", 156902),
                };
                ISerializer<List<Employee>> xmlSerializer = SerializerFactory.Create<List<Employee>>(SerializerType.Xml);
                string xml = xmlSerializer.Serialize(employees);
                ISerializer<List<Employee>> jsonSerializer = SerializerFactory.Create<List<Employee>>(SerializerType.JSON);
                string json = jsonSerializer.Serialize(employees);
                ISerializer<List<Employee>> clrSerializer = SerializerFactory.Create<List<Employee>>(SerializerType.CLR);
                string clr = clrSerializer.Serialize(employees);
                ISerializer<List<Employee>> wcfSerializer = SerializerFactory.Create<List<Employee>>(SerializerType.WCF);
                string wcf = wcfSerializer.Serialize(employees);
                Console.ReadKey();
            }
        }

Step 4 – The Serializer implementations

To keep this essay short and easy to follow, only two implementations are shown here: the XML and JSON serializers. The other two are included in the source code.
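
Both implementations depend on an ISerializer<T> interface and on two helper methods, ToStringValue() and ToByteArray(), none of which are listed in this essay. The sketch below is a reconstruction inferred from how they are called, not the original code: the class name EncodingExtensions and the choice of UTF-8 for the conversions are assumptions.

```csharp
using System.Text;

// Assumed contract, inferred from the Serialize/Deserialize calls in this essay.
public interface ISerializer<T>
{
    string Serialize(T value);
    T Deserialize(string value);
}

// Assumed helpers: the essay calls these without showing them,
// so the UTF-8 conversion here is a guess.
public static class EncodingExtensions
{
    // Decodes a UTF-8 byte array into a string.
    public static string ToStringValue(this byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Encodes a string as UTF-8 bytes.
    public static byte[] ToByteArray(this string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }
}
```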

Implementing XmlSerializer

As mentioned earlier, XmlSerializer requires a default constructor; without one, the program throws a runtime exception.

    // Requires: using System.IO; using System.Text; using System.Xml;
    // ToStringValue() and ToByteArray() are not framework methods; they are
    // assumed to be extension methods defined elsewhere in the project.
    public class XmlSerializer<T> : ISerializer<T>
        {
            System.Xml.Serialization.XmlSerializer _xmlSerializer =
                new System.Xml.Serialization.XmlSerializer(typeof(T));
            public string Serialize(T value)
            {
                using (MemoryStream memoryStream = new MemoryStream())
                using (XmlTextWriter xmlTextWriter = new XmlTextWriter(memoryStream, Encoding.UTF8))
                {
                    _xmlSerializer.Serialize(xmlTextWriter, value);
                    xmlTextWriter.Flush(); // make sure everything reaches the stream
                    return memoryStream.ToArray().ToStringValue();
                }
            }
            public T Deserialize(string value)
            {
                // No writer is needed for reading: deserialize straight from the stream
                using (MemoryStream memoryStream = new MemoryStream(value.ToByteArray()))
                {
                    return (T)_xmlSerializer.Deserialize(memoryStream);
                }
            }
        }

Implementing JSON Serializer

A JSON serializer is very handy, especially when dealing with REST services, JavaScript, or cross-platform messaging applications. JSON has recently gained wider adoption because its serialized output is clean and easy to understand.

    public class JsonSerializer<T> : ISerializer<T>
        {
            DataContractJsonSerializer _jsonSerializer = new DataContractJsonSerializer(typeof(T));
            public string Serialize(T value)
            {
                // using disposes the stream even if WriteObject throws
                using (MemoryStream ms = new MemoryStream())
                {
                    _jsonSerializer.WriteObject(ms, value);
                    return ms.ToArray().ToStringValue();
                }
            }
            public T Deserialize(string value)
            {
                using (MemoryStream ms = new MemoryStream(value.ToByteArray()))
                {
                    return (T)_jsonSerializer.ReadObject(ms);
                }
            }
        }
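
The DataContractSerializer implementation is only available in the downloadable source. For readers who want to see its general shape, here is a minimal sketch that follows the same pattern as the JSON serializer. It is not the author's code: the class name WcfSerializerSketch is hypothetical, and it converts between bytes and strings with Encoding.UTF8 directly instead of the helper methods used above.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Hypothetical sketch of a DataContractSerializer-based wrapper.
public class WcfSerializerSketch<T>
{
    // The serializer is created once per T and reused across calls.
    private readonly DataContractSerializer _serializer =
        new DataContractSerializer(typeof(T));

    public string Serialize(T value)
    {
        using (var ms = new MemoryStream())
        {
            _serializer.WriteObject(ms, value);
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }

    public T Deserialize(string value)
    {
        using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(value)))
        {
            return (T)_serializer.ReadObject(ms);
        }
    }
}
```

In the factory, such a class would be returned for SerializerType.WCF once it also implements ISerializer<T>.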

Step 5 – Comparing the serialization results

XmlSerializer

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfEmployee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 <Employee>
  <Name>Tim</Name>
  <EmployeeId>1392902</EmployeeId>
 </Employee>
 <Employee>
  <Name>Shawn</Name>
  <EmployeeId>156902</EmployeeId>
 </Employee>
</ArrayOfEmployee>

Default schema namespaces defined by w3.org are added to the root node, and the collection is named ArrayOfEmployee. The output is always valid XML.

DataContractJsonSerializer

[
{"EmployeeId":1392902,"Name":"Tim"},
{"EmployeeId":156902,"Name":"Shawn"}
]

No schema is added to the serialized string, which makes it cleaner and more readable. Each item is enclosed in curly braces { } and the collection is enclosed in square brackets [ ].

DataContractSerializer

<ArrayOfEmployee xmlns="http://schemas.datacontract.org/2004/07/Serializers" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
 <Employee>
  <EmployeeId>1392902</EmployeeId>
  <Name>Tim</Name>
 </Employee>
 <Employee>
  <EmployeeId>156902</EmployeeId>
  <Name>Shawn</Name>
 </Employee>
</ArrayOfEmployee>

Default namespaces defined by Microsoft and w3.org are added to the root node, and the collection is named ArrayOfEmployee. The output is always valid XML.

NetDataContractSerializer

<ArrayOfEmployee z:Id="1" z:Type="System.Collections.Generic.List`1[[Serializers.Employee, Serializers, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]]" z:Assembly="0" xmlns="http://schemas.datacontract.org/2004/07/Serializers" xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/">
 <_items z:Id="2" z:Size="4">
 <Employee z:Id="3">
 <EmployeeId>1392902</EmployeeId>
 <Name z:Id="4">Tim</Name>
 </Employee>
 <Employee z:Id="5">
 <EmployeeId>156902</EmployeeId>
 <Name z:Id="6">Shawn</Name>
 </Employee>
 <Employee i:nil="true"/>
 <Employee i:nil="true"/>
 </_items>
 <_size>2</_size>
 <_version>2</_version>
</ArrayOfEmployee>

Default namespaces defined by Microsoft and w3.org are added to the root node, and the collection is named ArrayOfEmployee. The output is always valid XML; however, the XML nodes also carry CLR metadata such as Type, Size, Version and Id.

Performance benchmarks

I modified the example to add 200K employees to the collection and benchmarked the results. The first serialization took longer because the serializer object was not yet cached; on the second run the timings below improved by roughly 12-44%.

XmlSerializer (1): Time to executed 1142.0654 mSec
XmlSerializer (2): Time to executed 635.0364 mSec

DataContractJsonSerializer (1): Time to executed 847.0484 mSec
DataContractJsonSerializer (2): Time to executed 611.0349 mSec

CLR (1): Time to executed 2179.1246 mSec
CLR (2): Time to executed 1914.1095 mSec

DataContractSerializer (1): Time to executed 539.0308 mSec
DataContractSerializer (2): Time to executed 413.0236 mSec

It is worth noticing that DataContractSerializer is the fastest serializer, followed by DataContractJsonSerializer and XmlSerializer. Unless absolutely required, NetDataContractSerializer should not be used.
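
The benchmark code itself is not listed here. The following is a minimal sketch of how such timings could be collected with Stopwatch and how the improvement between two runs can be computed. The class SerializerBenchmark and its methods are hypothetical names, and this is not the harness that produced the numbers above.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper for comparing a first (cold) and later (warm) run.
public static class SerializerBenchmark
{
    // Runs the action `runs` times and returns the elapsed milliseconds of each run.
    public static double[] Measure(Action action, int runs)
    {
        var timings = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            timings[i] = stopwatch.Elapsed.TotalMilliseconds;
        }
        return timings;
    }

    // Percentage by which a later run is faster than the first run.
    public static double ImprovementPercent(double first, double later)
    {
        return (first - later) / first * 100.0;
    }
}
```

For example, SerializerBenchmark.ImprovementPercent(1142.0654, 635.0364) gives about 44% for XmlSerializer. A call such as SerializerBenchmark.Measure(() => xmlSerializer.Serialize(employees), 2) would time a first and a second run of the serializer from Step 3.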

I hope this essay helps in understanding serializers better!

Download the source code [serializers.zip] from SkyDrive 

Reference: Performance oriented Xml and JSON serialization in .NET from our NCG partner Punit Ganshani at the Punit Ganshani blog.
