Questions About DRBD
Background: We need an HA server in a small office environment, and
DRBD looks like it has some synchronization issues at times (such as during
updates). We only have about 100GB that needs to be on the HA server and
server load will be extremely low. The data will probably increase about
10%-25% per year if we archive older office data, and 50%-75% each year if
we don't.
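To put rough numbers on those growth rates, here is a minimal sketch that compounds the 100GB starting point annually. The five-year horizon and the function name are my own assumptions for illustration; only the starting size and the percentages come from the figures above.

```python
def project_size(start_gb, annual_growth, years):
    """Return projected size in GB for year 0..years, compounding annually."""
    return [start_gb * (1 + annual_growth) ** year for year in range(years + 1)]


START_GB = 100   # current data set, from the estimate above
YEARS = 5        # assumed planning horizon (not a measured figure)

# Growth-rate bounds quoted above: with and without archiving old data.
SCENARIOS = {
    "archiving, low (10%)": 0.10,
    "archiving, high (25%)": 0.25,
    "no archiving, low (50%)": 0.50,
    "no archiving, high (75%)": 0.75,
}

if __name__ == "__main__":
    for label, rate in SCENARIOS.items():
        sizes = project_size(START_GB, rate, YEARS)
        print(f"{label}: {sizes[-1]:.0f} GB after {YEARS} years")
```

Under these assumptions the spread is wide: roughly 160GB at the low end versus about 1.6TB at the high end after five years, so the archiving decision matters more for drive sizing than anything else.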
The point is that we use a mix of consumer-grade and used enterprise-grade
hardware, which WILL be a problem if we don't plan for it in advance. Even
pre-built quality servers DO fail, so redundant servers seem like the way
to go.
The Plan: We are thinking it would be good to find (2) of the best
bang-for-our-buck used servers and synchronize them. We simply need
SATA/SAS and space for as many drives as can be had for the price. These
servers seem like they can be had for $100-$200 (+some parts and
additional drives) if you catch a deal.
This would theoretically mean a server could fail and if we took days to
get to it, as long as we didn't have another coincidental failure (servers
don't break that often when you have two or three), things would still hum
along until our IT department (me) could get to it. We would use Debian as
the OS.
Some Questions
(A) How does DRBD handle controller failure? That is, the DRBD diagram
shows DRBD before the storage driver, so what happens when the controller
fails and writes dirty data? Is that data mirrored to the other server or
not, and is there a risk of data corruption across servers in cases like
these?
(B) What are the failure points for DRBD? Theoretically, as long as one
server is up and running there are no issues EVER. But we know that there
are issues, so what are the failure modes when using DRBD, since most of
them should theoretically be software? (For RAID you mainly have drive
failure, controller failure, software errors, and user errors; how about
DRBD?)
If we are going to have two servers for this, would it be reasonable to
run VMs on each with MySQL and Apache for database and web server
replication? (I am assuming so.)
Is DRBD reliable enough? If not, is the unreliability isolated to certain
tasks, or is it more random? Searching turned up people with various
issues, but this IS the internet, with seemingly more bad info than good.
If data is being synchronized over the LAN, does DRBD use double the
bandwidth? That is, should we double up on NICs and do some link
aggregation and trunking? Then maybe put them on separate routers, on
separate circuits and UPSes in separate rooms, and now you really have
some redundancy!
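A back-of-envelope estimate may help frame the bandwidth question. My understanding is that DRBD sends each write to the peer while reads are served locally, so replication traffic should track write volume rather than doubling all traffic. The sketch below only estimates how long a full initial sync of our roughly 100GB would take; the link speeds and the 90% efficiency factor are assumptions, not measurements, and the function name is made up for illustration.

```python
def transfer_seconds(data_gb, link_gbps, efficiency=0.9):
    """Estimate seconds needed to copy data_gb (decimal GB) over a link.

    efficiency is an assumed fraction of line rate that is actually usable
    after protocol overhead; it is a guess, not a DRBD-specific figure.
    """
    bits_to_send = data_gb * 8e9
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits_to_send / usable_bits_per_second


if __name__ == "__main__":
    DATA_GB = 100  # current data set, from the estimate earlier in the post
    for link_gbps in (0.1, 1.0):  # assumed 100 Mbit/s and 1 Gbit/s links
        minutes = transfer_seconds(DATA_GB, link_gbps) / 60
        print(f"{link_gbps:g} Gbit/s link: about {minutes:.0f} minutes")
```

If these assumptions are anywhere near right, a single gigabit link handles a full resync in well under an hour, so a second NIC would be more about redundancy than throughput at our load.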
Is this too crazy for an office in terms of server management? Is there a
simpler REALTIME alternative? (Granted, DRBD seems simple in theory.)
We already have a server. So it seems to me a second server with a
dedicated drive for DRBD could easily be had for around $150-$250 with
some smart shopping. Add a second router, more drives, more NICs (used),
and (2) UPSes and we're talking $1,000 +/-. That is relatively cheap! And I
am hoping this would mainly buy us time during a server fault. Drive
failures don't worry me at all with RAID these days. It's other hardware
failures like controllers, memory, or power supplies that might require
downtime to diagnose and fix that are the concern.
For us, redundant servers mean used hardware becomes more viable, with
more uptime and more flexibility for me to fix things when my schedule
allows, versus having to stop everything to repair the server or incur the
cost of calling an IT tech. (I'm fast enough with our system that it is
cheaper for me to perform diagnosis and repairs.)
Anyone out there who knows this stuff?
Hopefully I didn't miss that these questions have easily searchable answers.
I did a quick search and didn't find what I was looking for.