Unexpected Bots and reCAPTCHA defense

A few weeks ago I soft-launched direct sign-ups from the public website for my product, Process PA. We’ve been running for a few months with just our foundation customers, making sure things work well before letting anyone register without our involvement. I updated the website without announcing it, just to test the process a bit before (hopefully) driving more traffic to the site.

Interestingly, random sign-ups started appearing in the database. Many had even verified their email address. I hadn’t been directing people to the site yet, and at this stage we don’t get many unknown visitors. Sure enough, they appear to be bot accounts. It is quite surprising how quickly bots find new sites and start filling out registration forms. I’m not sure what they expect to get out of it.

CAPTCHA Required

This is kind of a pain. While building a startup I have many things to do, and I didn’t think stopping bots would be required this early on. Fortunately, putting a CAPTCHA in place is pretty quick. I started with a very popular NuGet package, BotDetect CAPTCHA. Although it was easy to implement, it results in the kind of horrible user experience that everyone hates. I did not want to add any friction to legitimate sign-ups.

Although BotDetect CAPTCHA claims “not one confirmed case of automated CAPTCHA breaking by spammers”, I’m sceptical. I did a thesis on vision processing over 10 years ago, and the field has gotten much better since then. Spammers may not be breaking them, but Google states, “it can decipher the hardest distorted text puzzles from reCAPTCHA with over 99% accuracy”.

Google No CAPTCHA reCAPTCHA to the rescue

You’ve seen it across many websites now. Launched December 2014, this provides the simple ability for the user to check the box that says, “I’m not a robot”. And they are, most of the time, done. So much better for the user. So much harder for the bot.

Implementation is very easy if you follow the instructions on the admin site, which also holds your keys. Get started at Google reCAPTCHA. The client side is a script include and a div. The server side is a web request. It is even simpler when wrapped up as an attribute by the NuGet package reCAPTCH.MVC, which has clear instructions on its project site.
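As a rough sketch of what that server-side web request can look like when done by hand (this is not the code inside the reCAPTCH.MVC package), the example below posts the secret key and the token from the `g-recaptcha-response` form field to Google's `siteverify` endpoint and reads the `success` flag from the JSON reply. The class and method names are hypothetical, and it assumes `System.Text.Json` is available to the project.

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class RecaptchaVerifier
{
    private const string VerifyUrl = "https://www.google.com/recaptcha/api/siteverify";

    // Reads the "success" flag from the JSON body returned by siteverify.
    // A missing or non-true flag is treated as a failed check.
    public static bool IsSuccess(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            return doc.RootElement.TryGetProperty("success", out JsonElement success)
                && success.ValueKind == JsonValueKind.True;
        }
    }

    // Posts the secret key and the user's token as form fields, then parses the reply.
    public static async Task<bool> VerifyAsync(HttpClient http, string secretKey, string token)
    {
        var form = new FormUrlEncodedContent(new Dictionary<string, string>
        {
            { "secret", secretKey },
            { "response", token }
        });

        using (HttpResponseMessage reply = await http.PostAsync(VerifyUrl, form))
        {
            reply.EnsureSuccessStatusCode();
            return IsSuccess(await reply.Content.ReadAsStringAsync());
        }
    }
}
```

In a registration action you would call `VerifyAsync` with the posted `g-recaptcha-response` value and reject the sign-up when it returns false.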

With it all in place, it looks like I’ll need human customers to keep up the sign-up rate now that the bots aren’t allowed in. If you are tired of doing minutes and governance manually for your association, club or board, come and try out Process PA.

Database unable to be recreated after delete with Entity Framework

A nice little gotcha when getting started with Entity Framework using the built-in ASP.NET MVC templates. If you want to start from a fresh database on MSSQLLocalDB, there are a few little things to do that aren’t necessarily obvious:

1. Delete the mdf and ldf

Go to the App_Data directory and delete both files.

If the delete fails, make sure you stop IIS Express which is holding a handle to the files.

If the delete still fails, make sure you have closed connections in the server explorer in Visual Studio.

Still failing to delete? Go to Task Manager and kill the SQL Server Windows NT [sqlservr.exe] process.

2. Run Update-Database from the Package Manager Console

Update-Database

Fails with “The EntityFramework package is not installed on project ‘####’”.
Check that the default project is set to the one that contains the Migrations folder. Mine keeps defaulting back to the Tests project.

Fails with “Cannot attach the file *.mdf as database”.
I found the answer on StackOverflow. The default connection string contains the Initial Catalog property, which is fine for a full SQL Server database but not for LocalDB. Remove that property and it all works.
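The fix itself is a one-line edit in web.config, but the same change can be sketched in code, for example as a sanity check in a test. This is only an illustration: the class name, file name and catalog name are made up, and it uses the generic `DbConnectionStringBuilder` from `System.Data.Common` rather than anything specific to Entity Framework.

```csharp
using System;
using System.Data.Common;

// Hypothetical helper, for illustration only.
public static class LocalDbConnectionString
{
    // Returns the connection string with any Initial Catalog entry removed.
    public static string WithoutInitialCatalog(string connectionString)
    {
        var builder = new DbConnectionStringBuilder { ConnectionString = connectionString };
        builder.Remove("Initial Catalog"); // does nothing when the key is absent
        return builder.ConnectionString;
    }

    public static void Main()
    {
        // Made-up example value in the shape the MVC template generates.
        string original = @"Data Source=(LocalDb)\MSSQLLocalDB;AttachDbFilename=|DataDirectory|\app.mdf;Initial Catalog=app;Integrated Security=True";
        Console.WriteLine(WithoutInitialCatalog(original));
    }
}
```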

Summary

Entity Framework seems quite good with Code First Migrations. I find the documentation really hard to get up to speed with if you are new to it. Much of it seems to focus on the differences between versions, and piecing together what to use for the latest version wasn’t simple. There are a bunch of little things you need to be aware of, and it takes practice with Add-Migration to make sure everything upgrades properly. Persisting with it [pun intended] has been worthwhile so far…

Capture IIS Network Traffic in Fiddler

I have an IIS application that queries the Azure Active Directory Graph API from the server. I wanted to capture the requests that the client API makes. By default Fiddler does not capture these requests: it inserts itself into the WinINET layer as a proxy, which outgoing traffic from IIS bypasses.

To capture requests coming from an IIS application pool, add the following to your web.config after the <configSections> element:

  <system.net>
    <defaultProxy enabled="true">
      <proxy proxyaddress="http://127.0.0.1:8888" bypassonlocal="False"/>
    </defaultProxy>
  </system.net>

where 8888 is the Fiddler listening port, found in Tools > Fiddler Options.


This can be found in the Fiddler documentation.
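If you would rather not touch web.config, a similar effect can probably be had from code by setting the process-wide default proxy at application start. The sketch below assumes a .NET Framework application whose outgoing calls honour `WebRequest.DefaultWebProxy`; the class name is hypothetical, and 8888 is simply Fiddler's usual listening port.

```csharp
using System;
using System.Net;

// Hypothetical helper, for illustration only.
public static class FiddlerProxy
{
    // Builds a proxy pointing at the local Fiddler listener.
    // The second argument (false) means local addresses are not bypassed.
    public static WebProxy Create(int port = 8888)
    {
        return new WebProxy(new Uri("http://127.0.0.1:" + port), false);
    }

    // Routes requests that use the default proxy through Fiddler.
    // Call once at startup, and only in a debugging build.
    public static void Enable(int port = 8888)
    {
        WebRequest.DefaultWebProxy = Create(port);
    }
}
```

Either way, remember to take the proxy out again afterwards, since requests will fail when Fiddler is not running.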

Happy web traffic debugging!

No need to use DateTime UTC again!

We were always taught that whenever you store or compare dates, you should store them as UTC. That’s how I remember it at least. But that is not actually the correct answer. According to the detailed Coding Best Practices Using DateTime in the .NET Framework, “a developer is responsible for keeping track of time-zone information associated with a DateTime value via some external mechanism”. This leads to the recommended Storage Strategies Best Practice #1:

When coding, store the time-zone information associated with a DateTime type in an adjunct variable.

I don’t think I have ever seen this actually done. But no worries: since .NET 3.5 and SQL Server 2008 there is a new type to use. Today I was introduced to DateTimeOffset. It solves the storage and calculation issues by ensuring that the time zone offset is always stored with the date.

Represents a point in time, typically expressed as a date and time of day, relative to Coordinated Universal Time (UTC).

I think the code sample on the documentation page shows the difference and usefulness quite well.

using System;

public class DateArithmetic
{
   public static void Main()
   {
      DateTime date1, date2;
      DateTimeOffset dateOffset1, dateOffset2;
      TimeSpan difference;

      // Find difference between Date.Now and Date.UtcNow
      date1 = DateTime.Now;
      date2 = DateTime.UtcNow;
      difference = date1 - date2;
      Console.WriteLine("{0} - {1} = {2}", date1, date2, difference);

      // Find difference between Now and UtcNow using DateTimeOffset
      dateOffset1 = DateTimeOffset.Now;
      dateOffset2 = DateTimeOffset.UtcNow;
      difference = dateOffset1 - dateOffset2;
      Console.WriteLine("{0} - {1} = {2}", 
                        dateOffset1, dateOffset2, difference);
      // If run in the Pacific Standard time zone on 4/2/2007, the example 
      // displays the following output to the console: 
      //    4/2/2007 7:23:57 PM - 4/3/2007 2:23:57 AM = -07:00:00 
      //    4/2/2007 7:23:57 PM -07:00 - 4/3/2007 2:23:57 AM +00:00 = 00:00:00                        
   }
}
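A smaller sketch of the same point, using fixed values instead of the current clock so the result does not depend on where it runs. The two values are the ones from the sample output above, and the class name is made up.

```csharp
using System;

public static class SameInstantDemo
{
    public static void Main()
    {
        // The same moment written with two different offsets.
        var pacific = new DateTimeOffset(2007, 4, 2, 19, 23, 57, TimeSpan.FromHours(-7));
        var utc = new DateTimeOffset(2007, 4, 3, 2, 23, 57, TimeSpan.Zero);

        // Comparison and subtraction work on the underlying UTC instant.
        Console.WriteLine(pacific == utc);            // True
        Console.WriteLine(pacific - utc);             // 00:00:00

        // EqualsExact also requires the offsets to match.
        Console.WriteLine(pacific.EqualsExact(utc));  // False
    }
}
```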

When would you prefer DateTime over DateTimeOffset? The BCL team’s introduction of the type details when you may want to use DateTime instead. In summary, I would say it is when you are doing interop with OLE or when you care only about the date and not the time, like birthdays.

DateTimeOffset is the new preferred type to use for the most common date time scenarios.

The main shortcoming is the loss of time zone information, since only the offset is stored. If you are really serious about dates and times and use them heavily, you may consider Noda Time, started by Jon Skeet. My only issue with using DateTimeOffset virtually all the time in place of DateTime is that I only found out about it years later.
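To make the offset-versus-time-zone limitation concrete, here is a minimal sketch with made-up values. A DateTimeOffset created in January with the Pacific Standard offset of -08:00 keeps that offset when six months are added, even though a Pacific wall clock in July would be on daylight time at -07:00.

```csharp
using System;

public static class OffsetIsNotZoneDemo
{
    public static void Main()
    {
        // 9:00 AM on 15 January 2007 at UTC-8 (Pacific Standard Time's offset).
        var winter = new DateTimeOffset(2007, 1, 15, 9, 0, 0, TimeSpan.FromHours(-8));

        // AddMonths moves the date but carries the original offset along.
        DateTimeOffset summer = winter.AddMonths(6);

        Console.WriteLine(summer.Offset); // -08:00:00, not the -07:00 a Pacific clock shows in July
    }
}
```

This is the kind of case where a library with real time zone support, such as Noda Time, earns its keep.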
