MongoDB Query Performance Gotchas

At my current job, we use mongodb as our main database and as you can imagine we query our database a lot. The followings are what I found while investigating some slow performance. Hopefully it will prevent you from making the same mistake or pinpoint the source of problem quickly. We are using Rails 4 with Mongoid.

1. Don’t use regular expression with /i in queries.

One of the offending query looked pretty innocent. Customer is embedded document with email attribute and email field is indexed.

def self.with_email(email)
  self.where("customer.email" => /\Aemail\z/i).count
end

When I ran Benchmark.bmbm where the method was repeated 100 times, it returned the following result.

       user     system      total        real
   0.090000   0.010000   0.100000 ( 27.656723)

Compare that with when you call the same without regexp.

def self.with_email(email)
  self.where("customer.email" => email).count
end

Performance result is day and night.

       user     system      total        real
   0.080000   0.000000   0.080000 (  0.122888)

So, why the difference? I couldn’t quite figure it out myself. I knew that Mongodb uses B-Tree for index and index for email was already created. What about the regular expression was making it slow? I thought perhaps the anchor tags (\A and \z) were causing it, but that wasn’t it. I tried using substring of email and anchoring the beginning of email string (using ^), but that didn’t yield difference either.

I finally turned to Stack Overflow for assistance, and within a few minutes, someone provided exact answer I was looking for. They key to the answer was in output of explain method.

Look at the nscanned and indexBounds in the output of explain method for the first method. It scanned all index because the upper and lower bounds are not defined ("" and {}).

{
                     "cursor" => "BtreeCursor customer.email_1",
                 "isMultiKey" => false,
                          "n" => 781,
            "nscannedObjects" => 781,
                   "nscanned" => 500000,
    "nscannedObjectsAllPlans" => 781,
           "nscannedAllPlans" => 500000,
               "scanAndOrder" => false,
                  "indexOnly" => false,
                    "nYields" => 1397,
                "nChunkSkips" => 0,
                     "millis" => 406,
            "indexBounds" => {
    "customer.email" => [
        [0] [
            [0] "",
            [1] {}
        ],
        [1] [
            [0] /test/i,
            [1] /test/i
        ]
    ]
  }
}

Compare that to the output of explain method for the second method. Exactly the same upper and lower bounds, and it only scanned extremely small number of objects. This explains the big performance difference between the two methods.

{
                     "cursor" => "BtreeCursor customer.email_1",
                 "isMultiKey" => false,
                          "n" => 230,
            "nscannedObjects" => 230,
                   "nscanned" => 230,
    "nscannedObjectsAllPlans" => 230,
           "nscannedAllPlans" => 230,
               "scanAndOrder" => false,
                  "indexOnly" => false,
                    "nYields" => 1,
                "nChunkSkips" => 0,
                     "millis" => 0,
            "indexBounds" => {
    "customer.email" => [
        [0] [
            [0] "test@email.com",
            [1] "test@email.com"
        ]
    ]
  }
}

2. Don’t convert collection to array

Another slow performance was due to the following line of code. ProductA collection had about 20 documents and ProductB collection had about 2 documents. Another seemingly innocent code, but why?

products = ProductA.all + ProductB.all

Back to measuring performance with Benchmark (this time I just did it with Benchmark.bm).

Benchmark.bm do |x|
  x.report {n.times {|n| products = ProductA.all + ProductB.all}}
end
       user     system      total        real
  99.810000   2.630000 102.440000 (114.282440)

Let’s confirm the performance of retrieval.

Benchmark.bm do |x|
  x.report {n.times {|n| ProductA.all.count}}
end
       user     system      total        real
   0.430000   0.000000   0.430000 (  0.437339)

Benchmark.bm do |x|
  x.report {n.times {|n| ProductB.all.count}}
end
       user     system      total        real
   0.440000   0.000000   0.440000 (  0.456449)

That’s pretty reasonable performance. So, it must be something in the addition of two collections. When you add two collections using Mongoid, it returns an array, so it might be something related to converting to an array. Let’s check the performance on that.

Benchmark.bm do |x|
  x.report {n.times {|n| ProductA.all.to_a}}
end
       user     system      total        real
 110.070000   1.640000 111.710000 (117.081572)

Ah-ha! Looks like we found the problem. Converting to an array is quite slow. In order to use single collection, without converting to an array, I created a parent class – ProductParent, and ProductA and ProductB would inherit from that class. In such way, one collection is used and subclasses are distinguished by just type field. Actually, it should have been parent/subclass structure since ProductA and ProductB are kinds of a Product, and both share many of the same attributes.

After the re-arrangement, the following is the performance.

Benchmark.bm do |x|
  x.report {n.times {|n| ProductParent.all.count}}
end
       user     system      total        real
   0.480000   0.010000   0.490000 (  0.490489)

What a difference it made!

Finding these two performance gotchas have been a fun journey, and I hope they are useful to you. Do you have any gotchas you have found? Let me know, and I will post a link here!

Post to Twitter

(org.mongodb.mogod) Exited with exit code: 100

For some reason, I couldn’t run mongo shell and the console was overrun by the same error message like the following.

8/1/11 6:49:15 PM com.apple.launchd[1] (org.mongodb.mongod) Throttling respawn: Will start in 10 seconds
8/1/11 6:49:25 PM com.apple.launchd[1] (org.mongodb.mongod[394]) Exited with exit code: 100
8/1/11 6:49:15 PM com.apple.launchd[1] (org.mongodb.mongod) Throttling respawn: Will start in 10 seconds
8/1/11 6:49:25 PM com.apple.launchd[1] (org.mongodb.mongod[394]) Exited with exit code: 100

And searching on Google didn’t turn up any interesting results.

IT PAYS TO LOOK AT YOUR LOG FILE! Well, this shouldn’t be new, but it somehow skipped my mind until I saw the config file.

In the log file was the following helpful error message.

**************
old lock file: /usr/local/mongodb_data/mongod.lock.  probably means unclean shutdown
recommend removing file and running --repair
see: http://dochub.mongodb.org/core/repair for more information
*************

Mon Aug  1 18:49:35 [initandlisten] exception in initAndListen std::exception: old lock file, terminatingMon Aug  1 18:49:35 dbexit:
Mon Aug  1 18:49:35 [initandlisten] shutdown: going to close listening sockets...
Mon Aug  1 18:49:35 [initandlisten] shutdown: going to flush diaglog...
Mon Aug  1 18:49:35 [initandlisten] shutdown: going to close sockets...
Mon Aug  1 18:49:35 [initandlisten] shutdown: waiting for fs preallocator...
Mon Aug  1 18:49:35 [initandlisten] shutdown: closing all files...
Mon Aug  1 18:49:35 closeAllFiles() finished
Mon Aug  1 18:49:35 dbexit: really exiting now
Well, after deleting the lock file, everything was back to normal.

When in doubt, always look in your log file!

Post to Twitter