Interesting String.GetHashCode() Issue Between ASP.NET 1.1 and 2.0

Today a co-worker, David Penton, ran into an interesting issue: a background ASP.NET thread was running under ASP.NET 2.0 instead of ASP.NET 1.1, even though the website itself was set to run under 1.1.

First, a little background. The internal staging server runs Windows 2003 R2. In the past, Windows 2003 would throw an exception at the web application level when two or more websites shared the same application pool but were set to different versions of ASP.NET. The R2 release seems to have resolved this, which is why our IT administrator runs most sites under the common Default Application Pool.

We have a client that requires an ASP.NET 1.1 build of the website, so the developer was working in VS2k3. The background process in question runs the search algorithm we use for an out-of-the-box CommunityServer install (nicknamed the SearchBarrel). We have an Enterprise Search addon available that uses Lucene.NET, but this client is using the SearchBarrel. The SearchBarrel breaks a post up into individual words, then calls ToLower() and then GetHashCode() on the string for each word. We store this Int32 hash in the database because integers are faster to index and match than strings.
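The SearchBarrel steps described above (split a post into words, lower-case each word, hash it to a 32-bit integer) can be sketched roughly as follows. This is an illustration only, not CommunityServer's code: Python stands in for the .NET implementation, whitespace splitting is an assumed tokenizer, and FNV-1a is a swapped-in hash used in place of String.GetHashCode(), whose values it does not reproduce. The names stable_hash32 and word_hashes are hypothetical.

```python
from typing import List

FNV_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME = 0x01000193         # standard 32-bit FNV prime


def stable_hash32(word: str) -> int:
    """32-bit FNV-1a over the word's UTF-8 bytes, returned as a signed Int32.

    Assumption: this replaces String.GetHashCode() for illustration only.
    """
    h = FNV_OFFSET_BASIS
    for byte in word.encode("utf-8"):
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    # Map the unsigned value into the signed Int32 range used for storage.
    return h - 0x100000000 if h >= 0x80000000 else h


def word_hashes(post_text: str) -> List[int]:
    """Split on whitespace (assumed tokenizer), lower-case, hash each word."""
    return [stable_hash32(word.lower()) for word in post_text.split()]


if __name__ == "__main__":
    # Without a space the mixed English/Chinese text is one word (one hash);
    # with a space it becomes two words (two hashes).
    print(word_hashes("ps3对蓝光技术的采用也是令大家称道的原因之一"))
    print(word_hashes("ps3 对蓝光技术的采用也是令大家称道的原因之一"))
```

With this assumed tokenizer, the mixed English and Chinese text produces a single stored hash when written without a space and two stored hashes when a space separates the languages.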

The implementation of String.GetHashCode() differs between .NET 1.1 and .NET 2.0, so the same string can produce a different hash code under each runtime. If you are upgrading an application from 1.1 to 2.0 and you store string hash codes somewhere, you'll have to regenerate those hash codes under .NET 2.0.
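To see why stored hashes have to be regenerated, here is a minimal sketch of a hash-based index queried with a different hash function than the one that built it. Everything in it is assumed for illustration: hash_old and hash_new are two toy hash functions standing in for "the old runtime" and "the new runtime" (they are not the .NET 1.1 or 2.0 algorithms), and build_barrel and search are hypothetical helpers, written in Python rather than .NET.

```python
from typing import Callable, Dict, List, Set

HashFn = Callable[[str], int]


def hash_old(word: str) -> int:
    """Toy stand-in for the old runtime's hash: h = h * 31 + code point."""
    h = 0
    for ch in word:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h


def hash_new(word: str) -> int:
    """Toy stand-in for the new runtime's hash: h = (h * 33) XOR code point."""
    h = 5381
    for ch in word:
        h = ((h * 33) ^ ord(ch)) & 0xFFFFFFFF
    return h


def build_barrel(posts: Dict[int, str], hash_fn: HashFn) -> Dict[int, Set[int]]:
    """Store, per post id, the set of hashes of its lower-cased words."""
    return {
        post_id: {hash_fn(word.lower()) for word in text.split()}
        for post_id, text in posts.items()
    }


def search(barrel: Dict[int, Set[int]], term: str, hash_fn: HashFn) -> List[int]:
    """Return ids of posts whose stored hashes contain the term's hash."""
    wanted = hash_fn(term.lower())
    return sorted(pid for pid, hashes in barrel.items() if wanted in hashes)


if __name__ == "__main__":
    posts = {1: "the ps3", 2: "the wii"}
    barrel = build_barrel(posts, hash_old)
    print(search(barrel, "ps3", hash_old))   # [1]: same hash on both sides
    print(search(barrel, "ps3", hash_new))   # []: query hash no longer matches
    rebuilt = build_barrel(posts, hash_new)
    print(search(rebuilt, "ps3", hash_new))  # [1]: works again after a rebuild
```

The index itself is untouched in the failing case; the lookups miss only because the query side and the stored side no longer agree on the hash function.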

The issue the developer ran into was very odd.  The post was using a mix of English and Chinese, so we are dealing with an extended character set as well.

(The word we are trying to hash)
ps3对蓝光技术的采用也是令大家称道的原因之一 

When a single word mixed English and Chinese (non-ASCII) characters without spaces, as in the Simplified Chinese example above, the background SearchBarrel CSJob (the thread) generated a .NET 2.0 hash code for that word!

(.NET 2.0 HashCode of the word above)
-309760669

(.NET 1.1 HashCode of the word above)
1104497610
 

Yes, the website was set to ASP.NET 1.1. Yes, the binaries were built under .NET 1.1. And yet we were getting a .NET 2.0 hash code. The only thing that came to my mind was that the site was using the Default Application Pool, which was shared with many other staging websites. I'm guessing most of those are ASP.NET 2.0 sites, since most of our clients have moved to ASP.NET 2.0.

It gets even odder. If the developer edited the post to insert a space between the English and Chinese characters in that one word, then cleared the SearchBarrel and let the background thread re-hash the post, it would use ASP.NET 1.1 to generate the hash code!

Very, very odd. And we could reproduce it consistently by adding and removing that space and forcing the SearchBarrel to rebuild.

The only thing we could guess is that .NET 1.1 choked on the English + Chinese character mix when computing the hash code and somehow fell back to the .NET 2.0 implementation of GetHashCode().

The fix?  Move the website to its own dedicated Application Pool.
