[Discuss] Rob Conery's critique of MySQL?
Richard Pieri
richard.pieri at gmail.com
Thu Aug 2 20:44:56 EDT 2012
On 8/2/2012 4:36 PM, Mark Woodward wrote:
> Not to be snide, but 8 million is not a big number.
That's 8 million patients. Multiply that by everything that the VA has
on each and every one of them and you get a very large data set.
It's not the largest data set that I'm aware of. The largest is the
data out of the LHC which is around 200 petabytes. CERN went the other
way. They started with an object database but eventually dropped it
due to poor market development of OODBMSs. They currently use relational
databases for storing and retrieving metadata. Bulk data is stored in
flat files.
> Well, "billions" of transactions per day should be doable in a cluster.
That's what Ameritrade and Oracle thought but they couldn't make it work.
> If your oracle database is crashing, it is misconfigured.
The Oracle techs working with Ameritrade couldn't keep the cluster
going. They eventually gave up when Ameritrade wouldn't commit to
replacing the entire cluster with bigger servers.
> Financial
> transactions are a dangerous thing, you really do need ACID for
> fiduciary responsibility.
Cache' delivers a full ACID guarantee. I told you I wasn't talking about
NoSQL/MongoDB.
> You are avoiding the topic, the "storage system," is separate from the
> implementation of the objects. The objects know how to serialize and
> restore themselves as well as upgrade. The storage and location of
> objects is not involved.
Of course I am. It's not relevant to the topic, which is the technical
merits of object vs relational databases.
> That is not a "how," it is an adjective and a plural noun. One does not
> need to use relations in a database, but one has them if they need them.
> An RDBMS is a tool not some kind of mandate.
Then why bother with a relational database at all? The singular
strength of a relational database is the relations between data. If you
don't use relations then the relational database is the wrong tool for
the job.
> Yes, ok, that is done with the XML/JSON class description. What's the
> problem?
The problem is that you're stuck with tables. You don't have an object.
You have an object stored in a table. Even if it is a table with a
single column and a single row it's still a table.
> If I said the XML was stored in a binary polymorphic object file and it
> could be retrieved by its ID, would that make a difference? Because,
> that is exactly what is happening. For convenience, we call the
> polymorphic object file a "table."
Sure, that works. Again, why bother with a relational database if you
want to short-circuit all of the relational functions? Which was my
original point: why bother with inferior tools like relational databases
when superior tools like object databases are available?
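To make the pattern under discussion concrete, here is a minimal sketch of
"an object stored in a table," assuming Python's sqlite3 and json modules
stand in for the database and the XML/JSON serializer. The table name, the
column names, and the save/load helpers are made up for illustration and are
not taken from anyone's actual system. Note that nothing relational is used:
the table is only a key-to-blob store.

```python
import json
import sqlite3

# Hypothetical schema: one table, each object stored as opaque JSON text by ID.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")


def save(obj_id, obj):
    """Serialize the object and store it as a single opaque value."""
    conn.execute(
        "INSERT OR REPLACE INTO objects (id, body) VALUES (?, ?)",
        (obj_id, json.dumps(obj)),
    )


def load(obj_id):
    """Fetch by ID and deserialize; returns None if the ID is unknown."""
    row = conn.execute(
        "SELECT body FROM objects WHERE id = ?", (obj_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# Illustrative data only.
save(1, {"name": "Example Patient", "visits": [{"date": "2012-08-02"}]})
patient = load(1)
```

The database never sees inside the object here. No joins, no constraints
across rows, no queries on the object's fields: only lookup by ID.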
> Sorry, no. It is either a hash table, or they are hiding the index from
> you. Either way, it doesn't matter because databases have hash indexes.
Nope. Binary trees or multidimensional arrays. Typically, an object
database has no separate index data to cache. It caches objects.
> And if you say that objects don't need that kind of indexing, then you
> miss the real power of database. If you have 8 million objects, say
> patients in a database. How do you find them by social security numbers?
> How about by last name? How about by symptoms?
You walk a balanced B-tree. The worst case for a balanced tree search is
O(log n). Then the patient object is loaded into cache and data access
times drop to O(1).
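A rough sketch of that lookup pattern, under stated assumptions: a sorted
list searched with Python's bisect module stands in for the balanced tree
(both need O(log n) comparisons), and a plain dict stands in for the object
cache. The ObjectStore class and its fields are hypothetical and do not
describe how any particular object database is implemented.

```python
import bisect


class ObjectStore:
    """Toy store: ordered key search on first access, cache hit afterwards."""

    def __init__(self, records):
        # records maps key -> object; the sorted keys play the role of the tree.
        self._keys = sorted(records)
        self._storage = records   # stand-in for slow backing storage
        self._cache = {}          # loaded objects, O(1) average lookup
        self.index_searches = 0   # counts how often the ordered search runs

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        self.index_searches += 1
        i = bisect.bisect_left(self._keys, key)  # O(log n) comparisons
        if i == len(self._keys) or self._keys[i] != key:
            return None
        obj = self._storage[key]
        self._cache[key] = obj
        return obj


# Illustrative data: 100,000 zero-padded numeric keys.
store = ObjectStore({f"{n:09d}": {"id": n} for n in range(100_000)})
first = store.get("000012345")   # ordered search, then cached
second = store.get("000012345")  # served from the cache
```

For scale, log2 of 8 million is about 23, so a search over 8 million keys
takes on the order of 23 comparisons before the object lands in the cache.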
>> Better performance,
> How? Prove it.
O(log n) typical worst case for object searches vs. O(log n) typical
best case for relational searches. In real applications object searches
are 2-20 times faster than comparable relational searches.
>> greater scalability,
> How? Prove it.
Ameritrade.
>> faster deployment,
> How? Prove it
The VA Hospital's ahead-of-schedule and under-budget deployment.
>> easier
>> maintenance,
> How? Prove it
Admittedly it is company propaganda, but case studies from InterSystems'
customers show that Cache' is easier and faster than Oracle for
application development and support.
>> and typically at a lower cost for all of it.
> PostgreSQL is free. It doesn't get much lower in cost.
Hardware, sysadmins, DBAs, application developers, test teams. All
these cost money. If you can deliver an application on leaner hardware
then you reduce cost. If you can deliver it in less time then you
reduce cost.
> No, I needed a DNS system that could replicate, allow user access,
> managed rights and privileges, etc. I could cobble something together, or
> use a package that worked out of the package. It was a no brainer.
I implemented something similar at a previous gig using shell scripts.
It worked perfectly. It was a no-brainer. And that's still my own
confirmation bias speaking.
> I do have some expertise in PostgreSQL, sure, but I always try to find
> the best tool for the task. I have used SQLite and I have done a fair
> amount of storage systems where an RDBMS is not appropriate.
Consider this for your next project: a relational database is never
appropriate. Work from that. I'm certain that you will be surprised,
in a good way, at what you discover.
--
Rich P.