NoSQL vs SQL
Mark Woodward
markw-FJ05HQ0HCKaWd6l5hS35sQ at public.gmane.org
Tue Dec 7 00:07:13 EST 2010
On 12/06/2010 06:15 PM, Edward Ned Harvey wrote:
>> From: discuss-bounces-mNDKBlG2WHs at public.gmane.org [mailto:discuss-bounces-mNDKBlG2WHs at public.gmane.org] On Behalf
>> Of Mark Woodward
>>
>> Any rants/raves/comments either pro/con about NoSQL or SQL?
>>
> I completely agree with you, in that (a) that video is the funniest sh*t I
> think I've seen all year... (Also this one:
> http://www.youtube.com/watch?v=FL7yD-0pqZg )
>
> And (b) There is missing context in nearly all of the conversations people
> have about SQL "not scaling." To put it in context: No, google apps can't
> use SQL in the backend because SQL just can't scale to the massive numbers
> of servers and simultaneous clients serves, and the massive number of points
> of entry that satisfy the inbound web requests. In order to exceed the
> workload capacity of a single SQL server, you're talking about several Gb
> per second (or at least several hundred Mb). When you reach that level of
> usage, then you start needing a more scalable solution. You can't possibly
> reach these levels if you have a single 100Mb connection to the Internet,
> which is much larger than 99% of businesses or home users presently have.
> In order to get such a large internet connection, typically companies spend
> thousands of dollars per month, if not tens of thousands per month, and a
> complete staff of IT people, with a fully managed, high performance, highly
> redundant network infrastructure...
>
> If you are serving users, it will require several hundred simultaneous power
> users before you approach the scaling limits of a single SQL server. More
> typically, several thousand simultaneous, because they won't all be "power"
> users constantly generating maximum work load.
>
> Or you have a cluster of compute-heavy servers mining data in a data farm...
>
>
The most important aspect of these scalability discussions, and one
which I frequently find lacking in the "NoSQL" camp, is an estimate of
just how many full-scale transactions a SQL database can sustain.
Assume one system sitting on one SATA disk with an average seek time of
10 ms and a rotational speed of 10,000 RPM. I/O is your bottleneck; CPU
speed is more or less infinite in comparison. The disk makes about 166
revolutions per second, and reaching a random sector costs, on average,
half a revolution (about 3 ms) on top of the seek time. So we have 10 ms
(seek) plus 3 ms (rotation), an average head positioning time of 13 ms.
That gives us a worst-case average of about 77 random writes per second.
The actual IOPS figure is higher these days because seek times have
improved, but let's use the worst-case scenario. (We could also use a
RAID system and multiply performance.)
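
For anyone who wants to check the arithmetic, here is a rough sketch in
Python. It assumes only the figures above (10 ms seek, 10,000 RPM); the
function name is made up for illustration, and the model ignores caching,
write combining and everything else a real disk and OS would do.

```python
# Back-of-the-envelope model of one spinning disk, using the figures
# from the paragraph above. Not a benchmark, just the arithmetic.
SEEK_MS = 10.0   # average seek time assumed in the text
RPM = 10_000     # rotational speed assumed in the text


def random_writes_per_second(seek_ms: float, rpm: float) -> float:
    """Estimate random writes/second as 1 / (seek + half a revolution)."""
    revs_per_second = rpm / 60.0                  # about 166 rev/s
    half_rev_ms = 0.5 * 1000.0 / revs_per_second  # about 3 ms
    positioning_ms = seek_ms + half_rev_ms        # about 13 ms
    return 1000.0 / positioning_ms


iops = random_writes_per_second(SEEK_MS, RPM)
print(f"{iops:.0f} random writes per second")
```

Plugging in a shorter seek time shows how quickly the number climbs on
newer drives.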
In an ACID database configuration, assume roughly half of that maximum,
say 36 writes per second on the database. That's 36 transactions
(read/write) per second. Assuming 1 transaction per page view, that
amounts to about 90 million page views a month (on average) as a
sustainable number.
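
The monthly figure can be checked the same way. This is a minimal
sketch assuming a 30-day month and a perfectly even load around the
clock (both simplifications of mine); the function name is hypothetical.

```python
# Convert a sustained transaction rate into page views per month.
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: a flat 30-day month


def page_views_per_month(tx_per_second: float, tx_per_view: float = 1.0) -> float:
    """Page views per month for a sustained rate, at tx_per_view each."""
    return tx_per_second * SECONDS_PER_DAY * DAYS_PER_MONTH / tx_per_view


views = page_views_per_month(36)
print(f"{views:,.0f} page views per month")
```

At 36 transactions per second this comes out a little over 93 million,
which is where the "about 90 million" figure comes from. Real traffic
is bursty, so peak load matters more than the monthly average.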
So, a good SQL database with no scaling tricks on a bog-standard modern
PC-based server will serve a web site as busy as all but the most
popular sites on the web. Someone, please tell me, what are the NoSQL
guys going on about with regard to scalability?