Different objects have different memory addresses
using System.Diagnostics;

namespace OverridingGetHashCode
{
    class Program
    {
        static void Main(string[] args)
        {
            var a = new Number() { Value = 1 };
            var b = new Number() { Value = 1 };
            // a and b hold semantically equal values, but the default Equals returns
            // false because a and b are different instances (reference equality).
            Debug.Assert(!a.Equals(b));
        }
    }

    public class Number
    {
        public double Value { get; set; }
    }
}
Overriding the Object.Equals method
using System.Diagnostics;

namespace OverridingGetHashCode
{
    class Program
    {
        static void Main(string[] args)
        {
            var a = new Number() { Value = 1 };
            var b = new Number() { Value = 1 };
            // a and b compare equal because the overridden Equals method
            // compares the semantic values of the two objects.
            Debug.Assert(a.Equals(b));
        }
    }

    public class Number
    {
        public double Value { get; set; }

        // Note: overriding Equals without GetHashCode raises compiler warning CS0659.
        public override bool Equals(object obj)
        {
            // The pattern match also rejects null and non-Number arguments.
            return obj is Number other && Value == other.Value;
        }
    }
}
Object.Equals may not be enough for collections: use of GetHashCode
using System.Collections.Generic;
using System.Diagnostics;

namespace OverridingGetHashCode
{
    class Program
    {
        static void Main(string[] args)
        {
            var hs = new HashSet<Number>();
            var a = new Number() { Value = 1 };
            var b = new Number() { Value = 1 };
            hs.Add(a);
            // the hash set contains a, but not the semantically equal object b,
            // because the default GetHashCode is instance-based, so the hash code
            // of b (almost certainly) differs from that of a
            Debug.Assert(hs.Contains(a));
            Debug.Assert(!hs.Contains(b));
        }
    }

    public class Number
    {
        public double Value { get; set; }

        public override bool Equals(object obj)
        {
            return obj is Number other && Value == other.Value;
        }
    }
}
With GetHashCode overridden, two different object instances get a chance to be compared with Object.Equals
using System.Collections.Generic;
using System.Diagnostics;

namespace OverridingGetHashCode
{
    class Program
    {
        static void Main(string[] args)
        {
            var hs = new HashSet<Number>();
            var a = new Number() { Value = 1 };
            var b = new Number() { Value = 1 };
            hs.Add(a);
            // the hash set contains b from the semantic point of view, because b equals a
            Debug.Assert(hs.Contains(a));
            Debug.Assert(hs.Contains(b));
        }
    }

    public class Number
    {
        public double Value { get; set; }

        public override bool Equals(object obj)
        {
            return obj is Number other && Value == other.Value;
        }

        public override int GetHashCode()
        {
            return Value.GetHashCode();
        }
    }
}
Features and requirements for overriding the GetHashCode method
1) two semantically equal objects must have the same hash code, otherwise hash-based collections definitively consider them different regardless of the Equals result.
2) two semantically different objects can have the same hash code (collision), but
3) avoid giving two semantically different objects the same hash code, because
4) more collisions result in more calls to the Equals method
5) keep the CPU overhead of GetHashCode minimal
6) when composing the xor expression of fields in GetHashCode, include only those fields that Equals compares with a simple equality check ( == )
If two semantically different objects share the same hash code (2), an additional, detailed check using Equals decides whether the two objects are really the same. (3) This makes the lookup slower, because the detailed Equals check has to run against more objects, so it is best to avoid hash code collisions between objects that are really different.
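The cost of collisions described above can be made visible by counting Equals calls. The following is a minimal sketch, not part of the original examples: the names CountingNumber and CollisionDemo are hypothetical, and the static switch that forces every hash code to 0 exists only for the purpose of the demonstration.

```csharp
using System;
using System.Collections.Generic;

namespace OverridingGetHashCode
{
    // Hypothetical helper class, introduced only for this illustration.
    public class CountingNumber
    {
        public static int EqualsCalls;       // number of Equals invocations so far
        public static bool UseConstantHash;  // true forces every hash code to collide

        public double Value { get; set; }

        public override bool Equals(object obj)
        {
            EqualsCalls++;
            return obj is CountingNumber other && Value == other.Value;
        }

        public override int GetHashCode()
        {
            return UseConstantHash ? 0 : Value.GetHashCode();
        }
    }

    public static class CollisionDemo
    {
        // Fills a set with `count` distinct values, looks up a value that is
        // not in the set, and returns how many times Equals was called overall.
        public static int CountEqualsCalls(bool constantHash, int count)
        {
            CountingNumber.UseConstantHash = constantHash;
            CountingNumber.EqualsCalls = 0;
            var hs = new HashSet<CountingNumber>();
            for (var i = 0; i < count; i++)
                hs.Add(new CountingNumber { Value = i });
            hs.Contains(new CountingNumber { Value = -1 });
            return CountingNumber.EqualsCalls;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("distinct hash codes: " + CollisionDemo.CountEqualsCalls(false, 100));
            Console.WriteLine("constant hash code:  " + CollisionDemo.CountEqualsCalls(true, 100));
        }
    }
}
```

The exact counts depend on the HashSet implementation, but the run with the constant hash code should report far more Equals calls than the run with distinct hash codes.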
Xor of GetHashCode members
Computing the xor of several integers is a fast way to combine them into a single number.
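Xor is fast, but two of its properties are worth keeping in mind: it is commutative, so swapping two field values yields the same hash code, and the xor of two equal hash codes is always 0. The sketch below illustrates both. XorDemo is a hypothetical helper, and the CombinedHash method assumes a target framework that ships System.HashCode (for example .NET Core 2.1 or later); it is shown only as one possible order-sensitive alternative.

```csharp
using System;

namespace OverridingGetHashCode
{
    // Hypothetical helper, for illustration only.
    public static class XorDemo
    {
        // Combines two hash codes with xor, as the examples in this article do.
        public static int XorHash(double x, double y)
        {
            return x.GetHashCode() ^ y.GetHashCode();
        }

        // Order-sensitive alternative; assumes System.HashCode is available.
        public static int CombinedHash(double x, double y)
        {
            return HashCode.Combine(x, y);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // xor is commutative: swapping the values gives the same hash code.
            Console.WriteLine(XorDemo.XorHash(1, 2) == XorDemo.XorHash(2, 1));
            // the xor of two equal hash codes is always 0.
            Console.WriteLine(XorDemo.XorHash(3, 3));
            Console.WriteLine(XorDemo.XorHash(7, 7));
            // HashCode.Combine takes the order of its arguments into account,
            // so swapped values are expected (not guaranteed) to hash differently.
            Console.WriteLine(XorDemo.CombinedHash(1, 2) == XorDemo.CombinedHash(2, 1));
        }
    }
}
```

Neither property breaks requirement 1; they only increase the number of collisions for classes whose fields often hold swapped or identical values.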
When xor works
using System.Collections.Generic;
using System.Diagnostics;

namespace OverridingGetHashCode
{
    class Program
    {
        static void Main(string[] args)
        {
            var hs = new HashSet<Number>();
            var a = new Number() { Value = 1, ValueB = 2 };
            var b = new Number() { Value = 1, ValueB = 2 };
            hs.Add(a);
            // the hash set contains b from the semantic point of view, because b equals a
            Debug.Assert(hs.Contains(a));
            Debug.Assert(hs.Contains(b));
        }
    }

    public class Number
    {
        public double Value { get; set; }
        public double ValueB { get; set; }

        public override bool Equals(object obj)
        {
            return obj is Number other && Value == other.Value && ValueB == other.ValueB;
        }

        public override int GetHashCode()
        {
            return Value.GetHashCode() ^ ValueB.GetHashCode();
        }
    }
}
When xor does not work
using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace OverridingGetHashCode
{
    class Program
    {
        static void Main(string[] args)
        {
            var hs = new HashSet<Number>();
            var a = new Number() { Value = 1, Tolerance = .1 };
            var b = new Number() { Value = 1.2, Tolerance = .5 };
            // a and b are semantically equal objects
            Debug.Assert(a.Equals(b));
            hs.Add(a);
            Debug.Assert(hs.Contains(a));
            Debug.Assert(!hs.Contains(b)); // the hash set does not find b!
        }
    }

    public class Number
    {
        public double Value { get; set; }
        public double Tolerance { get; set; }

        public override bool Equals(object obj)
        {
            if (!(obj is Number other)) return false;
            var diff = Math.Abs(Value - other.Value);
            var maxtol = Math.Max(Tolerance, other.Tolerance);
            return diff <= maxtol;
        }

        public override int GetHashCode()
        {
            return Value.GetHashCode() ^ Tolerance.GetHashCode(); // wrong
        }
    }
}
The example above shows that xor-ing the fields without reasoning about them is not good practice: overriding GetHashCode is not an obvious task.
Combining the Value and Tolerance fields with xor in GetHashCode does not comply with the first requirement: "two semantically equal objects must have the same hash code, or they are definitively considered different regardless of the Equals result".
In fact, it is easy to see that the objects
var a = new Number() { Value = 1, Tolerance = .1 };
var b = new Number() { Value = 1.2, Tolerance = .5 };
look different at first sight (their Value and Tolerance differ), so their xor-based hash codes will almost certainly differ even though Equals considers them equal.
Moreover, it does not comply with rule 6) "when composing the xor expression of fields in the hash code, include only those that Equals compares with a simple equality check ( == )"; in fact, Equals performs complex operations on the Value and Tolerance fields:
public override bool Equals(object obj)
{
    if (!(obj is Number other)) return false;
    var diff = Math.Abs(Value - other.Value);
    var maxtol = Math.Max(Tolerance, other.Tolerance);
    return diff <= maxtol;
}
A basic solution that lets the HashSet work as expected would be the following:
...
public override int GetHashCode()
{
return 0;
}
...
In other words, we lose the behavior of a hash set and are effectively using a List<Number> collection, because in the worst case a search has to check all N objects in the collection using the Equals method. After all, not all classes implement complex Equals methods.
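For completeness, here is a sketch of the tolerance example with the constant hash code applied. It reuses the Number class from the failing example. The only assumptions added here are the null-safe type check in Equals and the final assertion on Add, which are not in the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace OverridingGetHashCode
{
    class Program
    {
        static void Main(string[] args)
        {
            var hs = new HashSet<Number>();
            var a = new Number() { Value = 1, Tolerance = .1 };
            var b = new Number() { Value = 1.2, Tolerance = .5 };
            hs.Add(a);
            // every object has the same hash code, so Equals alone decides
            Debug.Assert(hs.Contains(a));
            Debug.Assert(hs.Contains(b));
            // adding b is rejected because the set already holds an equal element
            Debug.Assert(!hs.Add(b));
        }
    }

    public class Number
    {
        public double Value { get; set; }
        public double Tolerance { get; set; }

        public override bool Equals(object obj)
        {
            if (!(obj is Number other)) return false;
            var diff = Math.Abs(Value - other.Value);
            var maxtol = Math.Max(Tolerance, other.Tolerance);
            return diff <= maxtol;
        }

        public override int GetHashCode()
        {
            return 0; // constant: correct, but every lookup degrades to a linear scan
        }
    }
}
```

This trades speed for correctness: requirement 1 holds trivially, while requirement 3 is given up entirely.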
Implement IEquatable<T>
When overriding the Equals(Object) method, it is good practice to mark the class as IEquatable<T> and to implement Equals(T) too (see C# objects equality).
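A minimal sketch of this practice, applied to the single-field Number class used earlier, could look as follows. Forwarding Equals(object) to Equals(Number) is one common arrangement, not the only possible one.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace OverridingGetHashCode
{
    class Program
    {
        static void Main(string[] args)
        {
            var hs = new HashSet<Number>();
            var a = new Number() { Value = 1 };
            var b = new Number() { Value = 1 };
            hs.Add(a);
            Debug.Assert(a.Equals(b));          // strongly typed Equals(Number)
            Debug.Assert(a.Equals((object)b));  // Equals(object) forwards to it
            Debug.Assert(hs.Contains(b));
        }
    }

    public class Number : IEquatable<Number>
    {
        public double Value { get; set; }

        // Strongly typed comparison: no cast needed.
        public bool Equals(Number other)
        {
            return other != null && Value == other.Value;
        }

        // Keeps Equals(object) consistent with Equals(Number).
        public override bool Equals(object obj)
        {
            return Equals(obj as Number);
        }

        public override int GetHashCode()
        {
            return Value.GetHashCode();
        }
    }
}
```

Generic collections such as HashSet<Number> pick up Equals(Number) through the default equality comparer, so GetHashCode must stay consistent with both Equals overloads.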
Overriding Object.GetHashCode by Lorenzo Delana is licensed under a Creative Commons Attribution 4.0 International License.