One of the most interesting presentations at Architecture & Design
World was the eBay Architecture presentation by Randy Shoup and
Dan Pritchett.
The presentation was only one hour long, so Randy and Dan didn't cover
all the topics in the slides. Here are some of the insights I took from
this presentation.
Architecture evolution - eBay actually went through several architecture
revolutions.
Their initial architecture could not even begin to scale to their current
loads. It was, however, a very good fit for their initial quality
attributes - specifically, the emphasis on time to market and costs.
This shows the importance of balancing quality attributes. Sure, an
architectural change is painful, but if they had future-proofed too much I
doubt they would ever have gotten something working.
V2 demonstrated that a traditional 3-tier architecture would only scale
so far. It was nice to see how it evolved, though. Also, with the move
from version 2.4 to 2.5 and later to 3, we see eBay learning about CAP -
the hard way. In its final (current) incarnation, eBay's data
architecture prefers partition tolerance and availability over
consistency. This doesn't mean they forgo consistency altogether - just
that they trade the comfort zone of ACID transactions for the BASE
approach, where BASE stands for Basically Available, Scalable/Soft state &
Eventually Consistent.
eBay partitions their data on two levels: the first is an SOA-like
division by business areas (users, items etc.) and the second is a
horizontal partitioning based on access paths. This BASE approach to
data was dubbed sharding by Dathan Pattishall (from Flickr and
Friendster) (via HighScalability).
This approach means things like heavy partitioning, no distributed
transactions (also see below), denormalization etc. (you might also
want to read the item I wrote on
denormalization in InfoQ yesterday).
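To make the two-level partitioning more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the host names, the `SHARDS` table and the `route` function are hypothetical, and a simple hash of the key stands in for eBay's real access-path based rules, which the presentation did not spell out.

```python
import zlib

# Hypothetical layout: every business area owns its own group of database hosts.
SHARDS = {
    "users": ["users-db-0", "users-db-1", "users-db-2"],
    "items": ["items-db-0", "items-db-1"],
}


def route(area: str, key: str) -> str:
    """Pick a database host: first by business area, then by a hash of the key."""
    hosts = SHARDS[area]  # level 1: split by business area
    # Level 2: horizontal split. crc32 is used because it is stable across runs,
    # unlike Python's built-in hash() for strings.
    index = zlib.crc32(key.encode("utf-8")) % len(hosts)
    return hosts[index]
```

A call such as `route("users", "alice")` always returns the same host from the users group. Note that plain modulo hashing makes adding hosts painful, since most keys move; a real system would more likely use a lookup table or range map.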
The bigger implication here is that when it comes to internet scale, the database loses its importance - or as
Bill de hÓra nicely puts it:
"The use of RDBMSes as data backbones have to be rethought under these
volumes; as a result system designs and programming toolchains will be
altered. When the likes of Adam Bosworth, Mike Stonebraker, Pat Helland and Werner Vogels are saying as much, it behooves us to listen."
As I said, the data architecture of eBay is SOAish - they partitioned
their components and data along business lines, and they apply many SOA
principles. They don't, however, unite data and components to create a
service, and they don't seem to have the same contract boundaries
that SOA promotes (Randy told me that they are currently contemplating
SOA).
Returning to the data architecture: eBay do not use transactions.
"No transactions" seems very controversial - but if we just
consider some of the points I made on transactions between services in
previous posts - it is the only logical way to ensure scaling. By the
way, as can be expected, they do use transactions when they are local
(e.g. if the users table is spread over a couple of tables, both will be
updated together).
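As an illustration of what a local transaction means here, the sketch below updates two tables that live in the same database, so one ordinary transaction covers both. Everything in it is my own assumption for demonstration purposes: the table names, the columns and the use of SQLite are hypothetical and say nothing about eBay's actual schema or database.

```python
import sqlite3


def make_users_db() -> sqlite3.Connection:
    """Create an in-memory database with user data split over two tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE user_profiles ("
        "user_id INTEGER PRIMARY KEY, "
        "display_name TEXT NOT NULL CHECK (length(display_name) > 0))"
    )
    conn.execute("INSERT INTO users VALUES (1, 'alice')")
    conn.execute("INSERT INTO user_profiles VALUES (1, 'alice')")
    conn.commit()
    return conn


def rename_user(conn: sqlite3.Connection, user_id: int, new_name: str) -> None:
    """Update both tables together, or neither of them."""
    # Used as a context manager, the connection commits when the block succeeds
    # and rolls back when an exception escapes from it.
    with conn:
        conn.execute("UPDATE users SET name = ? WHERE id = ?", (new_name, user_id))
        conn.execute(
            "UPDATE user_profiles SET display_name = ? WHERE user_id = ?",
            (new_name, user_id),
        )
```

If the second update fails (for example, an empty name violates the CHECK constraint), the first one is rolled back as well. The point is that this only works because both tables sit behind one connection; once the data lives on different hosts, this cheap guarantee is gone.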
The application layers also follow the
segmentation by business areas. eBay caches metadata/immutable data as
much as possible. They keep the application stateless (i.e. state comes
from the client/db), e.g. they don't use sessions. The DAL virtualizes the
horizontal partitioning mentioned above for the rest of the code.
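A rough sketch of these two ideas, a DAL that hides the partitioning and a cache for immutable metadata, could look like the following. The class and function names are hypothetical, plain dictionaries stand in for separate databases, and the modulo rule is just a placeholder for whatever partitioning scheme is really used.

```python
from functools import lru_cache
from typing import Optional


class UserDal:
    """Hypothetical data access layer: callers never see which shard is used."""

    def __init__(self, shard_count: int) -> None:
        # Plain dicts stand in for separate physical databases.
        self._shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, user_id: int) -> dict:
        # Placeholder partitioning rule; only the DAL knows about it.
        return self._shards[user_id % len(self._shards)]

    def save(self, user_id: int, record: dict) -> None:
        self._shard_for(user_id)[user_id] = record

    def load(self, user_id: int) -> Optional[dict]:
        return self._shard_for(user_id).get(user_id)


# Hypothetical immutable metadata that would normally come from the database.
_CATEGORY_TABLE = {1: "Books", 2: "Music"}


@lru_cache(maxsize=None)
def category_name(category_id: int) -> str:
    """Look up immutable metadata once, then serve it from memory."""
    return _CATEGORY_TABLE[category_id]
```

Application code calls `dal.save(42, {...})` and `dal.load(42)` without knowing how many shards exist, so the partitioning can change without touching the callers. Caching in process memory is only safe here because the data never changes; it does not make the application stateful in the session sense.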
It was also interesting to learn that eBay developed its own messaging
infrastructure - though Randy and Dan did not provide a lot of details
on that.
Development process - It seems that eBay is using some
hybrid of feature driven development with waterfall (i.e. the
development is feature by feature - but the development of a feature is
waterfallish). They do have a constant delivery rate which they
synchronize using the concept of a train. If you have a feature, it
will be added to the train which is scheduled to arrive around
the time your feature will be ready. Several features are delivered as
a package, which gives a predictable (weekly) delivery rate. I guess it
also gives them some nice metaphors to use, such as a feature that
doesn't make it - misses the train, or the train leaves on time etc.
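The train idea boils down to a small scheduling rule: a feature rides the first train that leaves on or after the day it is ready. Here is a minimal sketch of that rule; the function name, the fixed seven-day interval and the notion of a known first departure date are my own assumptions, not something taken from the presentation.

```python
from datetime import date, timedelta


def next_train(ready_on: date, first_train: date, interval_days: int = 7) -> date:
    """Return the first train departure on or after the day the feature is ready."""
    if ready_on <= first_train:
        return first_train
    days_late = (ready_on - first_train).days
    # Ceiling division: a feature that is even one day late waits for the next train.
    trains_missed = -(-days_late // interval_days)
    return first_train + timedelta(days=trains_missed * interval_days)
```

The train never waits: a feature that is ready one day after a departure simply ships a week later, which is what keeps the delivery rate predictable.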
The slides of the presentation can be
downloaded from Dan Pritchett's site (they are not from the same event, but they are pretty much the same slides). You can also read Elliotte Rusty Harold's
account of the presentation.