There is a great deal of confusion about the WITH NOLOCK query hint. The following is a transcription of a conversation that I had with Carlos Chacon on the SQL Data Partners Podcast #57 about the WITH NOLOCK hint, and some of the misconceptions about it. If you haven’t heard the SQL Data Partners Podcast you might want to check it out.
WITH NOLOCK Hint
Carlos: So the next on is the WITH NOLOCK hint and I think this is on the list, ultimately, because there’s a lot of misinformation out there.
Steve: Absolutely, yes. The NOLOCK hint is one of my peeves on SQL Server, actually. I see it used a lot and really 99 percent of the time I see it used, people think that it’s doing something different than what it does. So, I heard the statement, “But I want to run a query in the production system but not impact or block anyone else. Shouldn’t I just use NOLOCK?” And the answer there is NO. The NOLOCK hint tells SQL Server to ignore other people’s query locks. Basically, to read dirty or uncommitted data at that point, which can lead to missing rows or phantom rows or data showing up in results. But it doesn’t do anything to stop the locking or the blocking in any way on the query that’s calling it. So it’s the equivalent of saying read uncommitted on a specific table that it’s referencing.
Carlos: Now, would that be your advice to people who want to move away from that? Is to transition the isolation level to READ UNCOMMITED?
Steve: Well, effectively whether you’re using NOLOCK on every table in your query it or you’re just using the isolation level of READ UNCOMMITED, the result is the exact same thing. But what I would do, if you’re running into this, and you can’t fix –
Carlos: Let me go back. Ultimately the idea is that it’s not giving you what you want. To your point, people think, “Oh, NOLOCK means no impact.”
Carlos: So that isolation level is not quite the same thing either, right? It’s not preventing impact.
Steve: Right. We’ll get to alternatives here in a second. But let’s take the example of your bank account. and you have the balance in your bank account, probably wherever you bank is storing that in a database somewhere. Let’s say that it’s SQL Server and on payday you happen to get your paycheck deposited and you see a nice lump of money there. And then you run a report or somebody runs a query while an index is being defragmented or a page split is occurring, it could show that you have two paychecks on that query. Then a moment later you run that query again and you have no paycheck on that payday.
Steve: Now if it was me and that was my bank, and they’re telling me that I have twice my normal paycheck or no money in my account, I would be really concerned and perhaps consider changing banks at that point.
Carlos: Yeah, it wouldn’t take too many times for that to happen.
Steve: Yep. Exactly. So there’s some solution, some way around this. If you’re finding that you need to use NOLOCK, either with NOLOCK hint or READ UNCOMMITED, because you have queries that are running and not returning, which is one of the almost legitimate reasons that people do it quite often. The thing you really need to look at, at that point, is what’s the root of the locking or blocking and figure out what’s causing that. For instance, you might have a really big update statement that’s being run and it’s locking it for everyone. If that’s something you can break up and run smaller or figure out how to run faster, you can eliminate the locking so you don’t have to use NOLOCK when you’re running the queries.
Steve: Another alternative is that you could take a look at is the READ COMMITTED SNAPSHOP Isolation level. If you put your entire SQL instance into read committed snapshot isolation (RCSI), which has less locking going on at that point, and reduces the need for NOLOCK.
Steve: So one of the things that people argue on this, is, “You’re never going to see the phantom reads or the duplicate rows when you’re using NOLOCK.” Well, there is an example I did where I had a table with just a million rows in it and it had a clustered index on a GUID (unique identifier), not a best practice but it was a great way to show fragmentation and page splits and things. And I went through and I was just updating that unique identifier to a new value which caused rows to be shifted around. I had that running in one window and I had another session open in SQL Server Management Studio and all I did was select count(*) from that table. And I hit refresh on it and refresh on it. Sometimes it would say I had a million rows on that table; sometimes it would say I had a million and ten. Other times it would say I had 990,000 rows in there and sometimes it would say I had 995,000.
Carlos: And it was the same number of rows, you were just updating?
Steve: Exactly, because what happens is when you’re updating a row to cause it to go from one page to another, and then you’re using NOLOCK, you might scan that row on the first page and then it shifts to the second page before you get to it, and you count that row again. So it’s a very legitimate case of, if you’re having blocking issues, if you turn NOLOCK on you might get those missing rows or duplicate rows getting returned.
Carlos: Wow, very interesting. I was one of those who think, and it does run in a lot of circles, in fact it was just yesterday somebody was like, “Yeah I’m running these big queries but I’m using NOLOCK so it should be good.” And it’s just that misinformation there about what the expected result is.
Steve: Yep. I use to work with a guy who always referred to this jokingly as the, “NOLOCK is the go faster switch”. Anytime you’ve got a slow query you throw a NOLOCK on it and you get faster results. Well, you might not get the results you’re looking for.
Additional Notes on WITH NOLOCK
One thing that I didn’t mention in the podcast that I probably should have is that there are some really great reason to use WITH NOLOCK. These usually involve writing queries that check specific things on your SQL Server where the query may never return. For instance to query what queries are running when you have deadlock occurring. The cases where NOLOCK is appropriate are certainly few and far between.
Another case where people will often use WITH NOLOCK, that I don’t always agree with is for data warehousing where your data is rebuilt on a daily or weekly basis, and it is not changing. WITH NOLOCK would be alright in this situation if you know for sure that nothing is changing, including index rebuilds or reorganization. It may be a better option to use RCSI rather than WITH NOLOCK in this situation.
This was an excerpt from Episode Episode 57 of the SQL Data Partners Podcast that I co-host with Carlos Chacon. The podcast is a great way to learn something new, or improve your knowledge on a SQL Server topic every week.