Tuesday, July 9, 2013

Jumping into Micro ORM, Ctd

As you may remember from our last installment, I'm going to experiment with cobbling together a small proof of concept project using C# .NET, a Micro ORM (PetaPoco), and a lightweight database (Sqlite).

I've already got a Visual Studio 2010 project (targeted at the .NET Framework 4.0 runtime), so it's time to get PetaPoco.

Getting PetaPoco

At this point it appears that I have two options for getting PetaPoco - making it a project dependency with NuGet, or cloning the source from Git. For now, I'll go with the latter. (For a nice quick 'n easy into to Git, you can't get much simpler than this.)

After cloning to a local repository, I see a new solution and associated files in the new PetaPoco directory. I open the solution file in Visual Studio 2010, and it builds a .dll on the first try with a couple of (hopefully minor) warnings.

------ Build started: Project: PetaPoco, Configuration: Debug Any CPU ------  PetaPoco -> C:\Users\bricej\repos\PetaPoco\PetaPoco\bin\Debug\PetaPoco.dll------ Build started: Project: PetaPoco.Tests, Configuration: Debug Any CPU ------c:\Windows\Microsoft.NET\Framework\v4.0.30319\Microsoft.Common.targets(1605,5): warning MSB3245: Could not resolve this reference. Could not locate the assembly "Npgsql". Check to make sure the assembly exists on disk. If this reference is required by your code, you may get compilation errors.  PetaPoco.Tests -> C:\Users\bricej\repos\PetaPoco\PetaPoco.Tests\bin\Debug\PetaPoco.Tests.exe------ Build started: Project: PetaPoco.DevBed, Configuration: Debug Any CPU ------  PetaPoco.DevBed -> C:\Users\bricej\repos\PetaPoco\PetaPoco.DevBed\bin\Debug\PetaPoco.DevBed.exe========== Build: 3 succeeded or up-to-date, 0 failed, 0 skipped ==========

Setting up PetaPoco with Sqlite

At this point I don't have Sqlite downloaded/installed/configured/etc.

PetaPoco doesn't seem to say anything about Sqlite. But, reading from the comments in PetaPoco.Core.ttinclude, we see some examples of setting up DbProviders, and this helpful nugget:
Also, the providers and their dependencies need to be installed to GAC
That's good info, because it doesn't look like Sqlite is intended to live in the GAC  (the Global Assembly Cache, where  the .NET runtime may go looking for referenced dlls) by default:
All the "bundle" packages contain the "System.Data.SQLite.dll" mixed-mode assembly. These packages should only be used in cases where the assembly binary must be deployed to the Global Assembly Cache for some reason (e.g. to support some legacy application on customer machines). 
So let's download the bundle version (If we already knew what we were doing, we might have gotten away with downloading just the binaries and using the gacutil...)

Next, we're going to need a .NET wrapper for Sqlite so that we can set it up as a datasource in Visual Studio. There appear to be multiple options, but let's try ADO.NET 2.0 Provider for SQLite.

After installing the Sqlite wrapper, I can now set up a connection to an existing database, or create a new Sqlite one. Let's create one:





Next let's add a table (using the sqlite command line tool) to see if our database is working:



Getting somewhere...





Running the built-in unit tests for PetaPoco didn't work out so hot: Permission to Rock - DENIED.

Hmm. Lots of invalid cast exceptions. It *seems* like the unit tests are able to connect to the database and are performing the setup/teardown steps. But clearly something fundamental is wrong.

Let's try and rule out the unit tests themselves as the problem. Since we already created a table 'foo' with two columns (int/varchar), let's create a Poco and see if we can query it:

   public class foo
    {
       public int id{get;set;}
       public string name { getset; }
    }

Testing the query:

   // Create a PetaPoco database object
   var db = new PetaPoco.Database("sqlite");
 
   try
   {
        // Show all foo    
        foreach (var a in db.Query<foo>("SELECT * FROM foo"))
        {
             Console.WriteLine("{0} - {1}", a.id, a.name);
        }
            }
    catch (Exception ex) {
        Console.WriteLine(ex.Message + Environment.NewLine + ex.StackTrace);
    }

Outputs the following:



So our stack (C# .NET 4.0 / PetaPoco / Sqlite) clearly works, but there's something unsettling about those unit tests failing. I'm not committed to PetaPoco yet. It should be easy to try out another Micro ORM framework since 1) in theory ORM frameworks try to be as implementation agnostic as they can get away with, and 2) we haven't really written any code yet - so why not try something else?

Trying a different Micro ORM

Let's look for something that's actively maintained, and has a bit more friendly support/documentation for Sqlite.

MicroLite ORM? Actively maintained, lots of documentation, Sqlite examples...

Getting MicroLite ORM

Although you can clone the project from Github, the docs say to install from NuGet. Ok then. For those of you new to NuGet, start here. (If you've ever used a package manager on a *nix distro, this will be a pretty familiar concept.)

After you install MicroLite with NuGet (depending on which packages you choose) you should see something like this when you right-click on your solution and choose "Manage NuGet Packages for Solution."



Ok, now let's see if we can reproduce our simple SELECT query.

Our new foo, with MicroLite attributes:

    [Table("foo")]
    public class foo
    {
        [Column("id")]
        [Identifier(IdentifierStrategy.DbGenerated)]
        public int id { getset; }
 
        [Column("name")]
        public string name { getset; }
    }

And the code to do a simple query:

var sessionFactory = Configure
     .Fluently()
     .ForConnection(connectionName: "sqlite", 
         sqlDialect: "MicroLite.Dialect.SQLiteDialect")
     .CreateSessionFactory();
    using (var session = sessionFactory.OpenSession())
    {
        var query = new SqlQuery("SELECT * from foo");
        var foos = session.Fetch<foo>(query); // foos will be an IList<foo>
        foreach (foo a in foos) {
            Console.WriteLine("{0} - {1}", a.id, a.name);
        }
    }

That's a bit more coding than the simpler PetaPoco version. In fact it's starting to look a lot like Hibernate! (Well, that's unfair, but one of the original design goals was a much simpler code base than a full-ORM would require.)

Let's see what happens if we use MicroLite's built-in conventions (vs. configurations for table/column mappings). Refactoring the above code we get:

    //[Table("foo")]
    public class foo
    {
        //[Column("id")]
        //[Identifier(IdentifierStrategy.DbGenerated)]
        public int Id { getset; }
 
        //[Column("name")]
        public string name { getset; }
    }

And :

   #region new_code!
 
    Configure.Extensions() // If used, load any logging extension first.
        .WithConventionBasedMapping(new ConventionMappingSettings
            {
// default is DbGenerated if not specified.
                IdentifierStrategy = IdentifierStrategy.DbGenerated, 
                // default is true if not specified.
                UsePluralClassNameForTableName = false 
            });
            
    #endregion
 
    var sessionFactory = Configure
      .Fluently()
      .ForConnection(connectionName: "sqlite", 
            sqlDialect: "MicroLite.Dialect.SQLiteDialect")
      .CreateSessionFactory();

    using (var session = sessionFactory.OpenSession())
    {
        var query = new SqlQuery("SELECT * from foo");
        var foos = session.Fetch<foo>(query); // foos will be an IList<foo>
        foreach (foo a in foos)
        {
            Console.WriteLine("{0} - {1}", a.Id, a.name);
        }
    }

There's one small change I made that might be hard to notice. I made the foo.Id property camel case (it was foo.id previously). Why? Because MicroLite's convention-based mapping strategy requires that
"The class must have a property which is either class name + Id or just Id (e.g. class Customer should have a property called CustomerId or Id)." 
(Side note - I also had to rename the column in the foo table to "Id" - yuck! I assume this is because MicroLite's convention-based mapping is case-sensitive on both ends - code and database objects.)

This post is starting to get long... so I'll snip out the part where I try full CRUD operations using both MicroLite and PetaPoco - but both work fine as expected.

We need a tie-breaker

Both Micro MicroLite and PetaPoco will serve as nice smallish ORM frameworks. MicroLite appears to be more recently maintained and has lots of features, but PetaPoco is super simple to use and requires (at least in simple CRUD examples) a lot less code.

Let's have a race and see what happens. How fast can each of these insert 10,000 rows? (Note: times are in milliseconds)

First, MicroLite:

  DateTime start = System.DateTime.Now;
            
            using (var session = GetFactory().OpenSession())
            {
                using (var transaction = session.BeginTransaction())
                {
                    for (int i = 0; i < 10000; i++) {
                        var foo = new foo();
                        foo.name = "MicroLite Insert Test " + i;
                        session.Insert(foo);
                }
                    transaction.Commit();
                }
            }
            Console.WriteLine("Elapsed: " + (System.DateTime.Now - start).TotalMilliseconds);

Outputs: Elapsed: 1811.1036

Next up, PetaPoco:
  DateTime start = System.DateTime.Now;
 
            // Create a PetaPoco database object
            var db = new PetaPoco.Database("sqlite");
 
            for (int i = 0; i < 10000; i++)
            {
                foo foo = new foo();
                foo.name = "PetaPoco Insert Test " + i;
                db.Insert("foo""Id", foo);
            }
            Console.WriteLine("Elapsed: " + (System.DateTime.Now - start).TotalMilliseconds);

Outputs: Elapsed: 115555.6095

WOW! PetaPoco took nearly 2 minutes to insert what MicroLite did in under 2 seconds. (NOTE: See updates below regarding performance improvements)

What about read operations? I created similar code that reads the 10,000 rows created in the bulk insert test and adds them to a Dictionary<int, string>.

PetaPoco:
Run #1 Elapsed: 219.0125
Run #2 Elapsed: 125.0072
Run #3 Elapsed: 101.0057
Run #4 Elapsed: 96.0055
Run #5 Elapsed: 98.0056
Run #6 Elapsed: 121.0069
Run #7 Elapsed: 118.0068

MicroLite:
Run #1 Elapsed: 926.0529
Run #2 Elapsed: 355.0203
Run #3 Elapsed: 398.0228
Run #4 Elapsed: 351.0201
Run #5 Elapsed: 483.0276
Run #6 Elapsed: 347.0199
Run #7 Elapsed: 357.0204

Hmm. PetaPoco is actually quite a bit faster than MicroLite. Both appear to be doing some caching since the first run is significantly slower than subsequent runs.

Are these fair tests? Is there a way to make them more Apples-to-Apples? Let me know in the comments...

Summary and Conclusions


I started out looking for a stack that would let me run a.NET Winforms app on top of a Micro ORM for simple CRUD operations on a Sqlite database.

I found at least two that would be reasonably serviceable: PetaPoco (the original choice), and MicroLite ORM.

PetaPoco just "feels" simpler to use and less like the clunky Spring/Hibernate syntax I wanted to avoid. But it doesn't appear to be actively maintained. Then again, there are 500+ hits on Stackoverflow (always a useful measure ;-) for PetaPoco, and 6 for MicroLite.

Which one to use? Originally, I was leaning MicroLite due to PetaPoco's apparent performance issues - until I realized they weren't issues at all. So for now, I'm leaning towards PetaPoco. EDIT: see updates below, including adding Dapper to the mix.

Downloads

If you'd like to steal the tech in this example, a Visual Studio 2010 solution is here.

UPDATE - Now with Dapper

Well leave it to me to forgot the Granddaddy of them all, Dapper. If you don't know what Dapper is, it's the Micro ORM that StackOverflow uses. If you don't know what StackOverflow is, welcome to Earth, and please enjoy the show.

I didn't originally include Dapper because although I had heard of it, I didn't think of it as a Micro ORM (not for any particular reason - just my unfamiliarity with it).

I've added Dapper to the solution you can download (see Downloads), and you can judge for yourself how it compares. I, for one, was hoping for the magical unicorn combination of MicroLite speed and PetaPoco simplicity. Alas, not so (caveat: I am sure that I could be "doing it wrong," but as far as out-of-the-box and less than 1 hour of Googling goes... Edit: Yes, I was doing it wrong).

Dapper took over 77 seconds to insert 10,000 rows. Also, compare the syntax for inserting a new record.

Here's PetaPoco:
 foo foo = new foo();
 foo.name = "Created by PetaPoco";
 db.Insert("foo""Id", foo);

Here's Dapper:
 string sqlQuery = "INSERT INTO foo(name) VALUES (@name)";
 conn.Execute(sqlQuery, new{ name="Created by Dapper" });

I don't say this to start a Holy War, but how would you like to be the gal who has to refactor the all the strings in the queries that use the 'foo' class when the business requirements come along to separate the 'name' property into 'FirstName' and 'LastName'? Or when the foo class grows to be a few dozen (hundred?) properties long? (Yes I know it's probably not good design if a class has that many properties but we've all been there.)

I am sure Dapper has many pluses going for it, but my expectations were (probably too) high.

Maybe in a future project I'll write a PetaPoco-like wrapper for Dapper, and finally be happy :-).

UPDATE #2 - Massive Performance Improvements

I just couldn't believe that StackOverflow's MicroORM was so slow, and was convinced I was doing it wrong. Well, I was. For doing the kind of work I was doing (lots of inserts), it turns out you are supposed to wrap the work in a TransactionScope.

So, adding the System.Transactions library to my project, and wrapping the inserts with the following:
      using (var transactionScope = new TransactionScope())
            {
              // do lots of transactions
             transactionScope.Complete();
            }
Dapper was able to insert 10,000 rows in a stunning  493 milliseconds. Folks, we have a clear winner with performance. Now if only the Dapper had cleaner insert/update syntax!

Likewise, PetaPoco has a similar mechanism:
 var db = new PetaPoco.Database("sqlite");
 using (var transaction = db.GetTransaction())
 {
    // do lots of transactions
    transaction.Complete();
  }

Which brings the PetaPoco bulk insert time down to right around 1 second (1087 millis).

So now we're back where we started. PetaPoco's speed is comparable to the others, and it's got very nice, simple syntax. And if I hit a gnarly roadblock, I know I at least have a couple of decent alternatives to fall back on.

No comments:

Post a Comment