FireWire Real Application Cluster
Combine RAC, Linux, and a FireWire disk for a low-cost development environment.
by Brian Carr (bcarrdba@neo.rr.com)
and William Garrett (wgarrett@neo.rr.com)
Toolbox:
Oracle 9i (9.0.1.4.0) Database Server with RAC option; Red Hat Linux
7.1.
Topics include:
Oracle database installation, Linux kernel versions, and kernel settings.
You should have a good knowledge of Oracle database administration
and Linux (or UNIX) operating system management, but whether you're
an established DBA and UNIX administrator or merely a DBA and Linux "newbie," the
basic advice and techniques here will save you lots of time and aggravation.
With the demands of a 24/7 marketplace, a highly available and scalable
database is increasingly important. In the past, you had to choose between
two kinds of cluster. High-availability clusters were used to protect your
database from hardware failure. Load-balanced clusters with many nodes were
used to support a much larger volume of traffic than single multiprocessor
implementations. Redundant components such as additional nodes, multiple
interconnects, and arrays of disks helped provide high availability.
Such redundant hardware architectures avoid single points of failure and
provide exceptional fault resilience. RAC takes the cluster architecture even
further, providing improved fault resilience and incremental system growth by
offering connection failover and load balancing in the same cluster. In the
event of a system failure, RAC ensures your database will still be available.
In the event of a large spike in traffic, RAC can distribute the load over
many nodes. RAC makes good sense not only from a data availability and
performance point of view but also on cost: with large SMP servers selling at
a premium, a pair of two-processor servers can cost half as much as a single
four-processor server. RAC gives you the availability and scalability that
enterprises demand.
RAC provides the following key benefits to e-business application deployments:
Flexible and effortless scalability, so that adding nodes to the database is
easy and manual intervention is not required to re-partition data when
processor nodes are added.
A high availability solution that masks node and component failures from
end-users.
A single management interface for DBAs to perform installation, configuration,
backup, upgrade, and monitoring functions once. Oracle automatically
distributes the management functions to the appropriate nodes. This means the
DBA manages one virtual server.
An alternative to using a Storage Area Network (SAN) or Network Attached
Storage (NAS) is an external FireWire hard drive enclosure. This gives you a
low-cost way to test your systems in a RAC environment before rolling them out
on your production RAC. By using commodity hardware you can build your
development environment for a fraction of the cost.
This article walks through the process of configuring RAC for your development
environment using a low-cost alternative to a SAN, with some helpful tips
along the way. After reading this article you should be able to set up your
RAC environment more quickly and with fewer headaches.
Background / Overview
Our environment was set up to provide us with a low-cost development
environment to see how RAC fit into our application environment. We had two
Compaq 700 MHz Pentium III desktops, each with 512MB RAM and a 10GB internal
drive. We also had a spare switch to use for the interconnect. The only
hardware we needed to purchase was an external FireWire enclosure, an IDE hard
drive, and two FireWire adapters. These additional hardware components came to
less than $400.
The focus of this article is using a FireWire drive with RAC on Linux. Therefore
we will not rehash the installation of RAC on Linux since this is well
documented in Note Id 184821.1 available on Metalink. You should also read
the Oracle RAC installation documentation. We will focus on configuring
Linux for FireWire support and how to test your RAC configuration for failover.
By the end of this article you will understand the steps necessary to set up
your own RAC environment for testing clustered applications. For those
familiar with Linux and Oracle, we estimate approximately one to two days'
worth of work to set up your development RAC environment.
1: Ensure your configuration is certified.
The setup we used was Red Hat 7.1 with RAC 9.0.1. If you’re interested in
using another distribution of Linux or 9iR2, be sure to check the certified
configurations:
Click on the "Certify and Availability" button on the menu frame
Click on the View Certifications by Product hyperlink
Choose "Real Application Clusters"
Choose the correct platform.
2: Obtain proper FireWire chipset and adapters.
Some FireWire chipsets are better than others at handling multiple logins.
In order for RAC to work properly, both nodes need to be logged in to the
external FireWire hard drive simultaneously.
We used an external hard drive enclosure that contained an Oxford
Semiconductor OXFW911 sbp2 chipset, which supports up to four concurrent
logins. The hardware savings are quite noticeable here, as two FireWire
adapters and a shared disk can be bought for less than a single Fibre Channel
controller (let alone the cost of a full SAN implementation).
3: Kernel configuration
In order for FireWire to be recognized by Linux, it is recommended that you
use an updated 2.4.19 kernel. We downloaded and unpacked the updated kernel
from http://otn.oracle.com/tech/linux/open_source.html
4: Driver modification
As described in the sbp2.c file of the new kernel source, the driver logs in
to the disk exclusively by default. To allow multiple logins, edit sbp2.c and
change the line
static int sbp2_exclusive_login=1;
to:
static int sbp2_exclusive_login=0;
This modification is well documented in the file and will allow both nodes to
have access to the external FireWire hard drive simultaneously. You will want
to read through the rest of the source code in this file, as there are several
tuning parameters which can be set here.
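The edit described above can also be scripted, which helps when the same change has to be made on both nodes. The following is a minimal sketch, not part of the original procedure: the function name is hypothetical, and it assumes the declaration appears exactly once in the file and looks like the line quoted above (spacing may vary).

```python
import re


def allow_multiple_logins(source: str) -> str:
    """Return sbp2.c source text with sbp2_exclusive_login switched from 1 to 0.

    Hypothetical helper: it assumes the declaration occurs exactly once and
    refuses to guess if it does not.
    """
    pattern = re.compile(
        r"^(\s*static\s+int\s+sbp2_exclusive_login\s*=\s*)1\b",
        re.MULTILINE,
    )
    patched, count = pattern.subn(r"\g<1>0", source)
    if count != 1:
        raise ValueError(
            f"expected exactly one exclusive-login declaration, found {count}"
        )
    return patched


if __name__ == "__main__":
    sample = "static int sbp2_exclusive_login=1;\n"
    print(allow_multiple_logins(sample), end="")
```

To apply it, read the file, pass its contents to the function, and write the result back. The file is assumed to live under drivers/ieee1394/ in the kernel source tree; confirm the location in your own tree and keep a backup copy first.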
5: Compile the kernel
Prior to compiling the kernel, be sure to run “make xconfig” from the
/usr/src/linux directory. Choose the IEEE 1394 (FireWire) support
(EXPERIMENTAL) menu option and set the following options to “y”:
IEEE 1394 (FireWire) support (EXPERIMENTAL)
OHCI-1394 support
SBP-2 support (Hard disks, etc)
Enable Phys DMA support for SBP2 (Debug)
Raw IEEE1394 I/O support
IEC61883-1 Plug support
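Before starting a long build, it can be worth confirming that the options listed above actually ended up in the saved configuration. The sketch below is illustrative only: the CONFIG_ symbol names are an assumption about which symbols sit behind those menu entries in a 2.4-series kernel, so verify them against your own .config before relying on the result.

```python
# Assumed symbol names for the six menu entries; check them in your .config.
REQUIRED_OPTIONS = [
    "CONFIG_IEEE1394",
    "CONFIG_IEEE1394_OHCI1394",
    "CONFIG_IEEE1394_SBP2",
    "CONFIG_IEEE1394_SBP2_PHYS_DMA",
    "CONFIG_IEEE1394_RAWIO",
    "CONFIG_IEEE1394_CMP",
]


def missing_options(config_text, required=REQUIRED_OPTIONS):
    """Return the required options that are not set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [option for option in required if option not in enabled]


if __name__ == "__main__":
    with open("/usr/src/linux/.config") as config_file:
        for option in missing_options(config_file.read()):
            print("not built in:", option)
```

The check deliberately treats "m" (module) as missing, because the step above asks for “y”.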
Next, build the kernel according to your distribution’s instructions.
You may run across the error “nodemgr.c: 1307: parse error before ‘else’”
while the kernel is compiling. The code involved belongs to a verbose debug
option. We commented out line 1304 and recompiled the kernel.
Another error we ran into, this time during the Oracle installation, was
“Error invoking target install of makefile /op/oracle/product/9.0.1/plsql/lib/ins_plsql.mk”.
To resolve this problem you’ll need to edit the file $ORACLE_HOME/bin/genclntsh
and change the following line:
LD_SELF_CONTAINED="-z defs"
to read:
LD_SELF_CONTAINED=""
6: Detect FireWire devices
The easiest way to add/detect new FireWire devices is to run the shell script
rescan-scsi-bus.sh. The script may be found at: http://www.garloff.de/kurt/linux/rescan-scsi-bus.sh
When you run this script you should see the following type of response:
Host adapter 0 (ide-scsi) found.
Host adapter 1 (sbp2) found.
Scanning for device 1 0 0 0 ...
NEW: Host: scsi1 Channel: 00 Id: 00 Lun: 00
Vendor: WDC WD12 Model: 00JB-75CRA0 Rev:
Type: Direct-Access ANSI SCSI revision: 06
1 new device(s) found.
0 device(s) removed.
This indicates your FireWire drive was detected by Linux.
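If you want to run this check on both nodes from a script, the output can be inspected programmatically. This is a rough sketch that assumes the script prints the same summary lines as the sample above; other versions of rescan-scsi-bus.sh may word their output differently, and the function names are hypothetical.

```python
import re


def sbp2_adapter_found(output: str) -> bool:
    """True if the scan output reports a host adapter driven by sbp2."""
    return re.search(
        r"^\s*Host adapter \d+ \(sbp2\) found", output, re.MULTILINE
    ) is not None


def count_new_devices(output: str) -> int:
    """Return the number from the 'N new device(s) found.' summary line."""
    match = re.search(r"^\s*(\d+) new device\(s\) found", output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'new device(s) found' summary line in output")
    return int(match.group(1))


if __name__ == "__main__":
    import sys

    text = sys.stdin.read()  # e.g. sh rescan-scsi-bus.sh | python check_scan.py
    print("sbp2 adapter:", sbp2_adapter_found(text))
    print("new devices:", count_new_devices(text))
```

A node that reports an sbp2 adapter but zero new devices may simply have seen the drive on an earlier scan, so treat the count as a hint rather than proof.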
7: Partitioning your drive
If you decide to use only one external disk, you’ll need to be aware of a
couple of things. If you do not use Logical Volume Manager (LVM) or Oracle
Cluster File System (OCFS), you’ll most likely have to use fdisk to partition
the disk into raw devices. You will be limited to 3 primary partitions, 1
extended partition, and 11 logical partitions. This means you will not have
room for all the default tablespaces the Database Configuration Assistant
(DBCA) uses.
For this reason we decided to drop the USERS and TOOLS tablespaces during the
DBCA setup. This obviously doesn’t follow the Optimal Flexible Architecture
(OFA), but since this is a development system it will work just fine. You
could also use multiple disks to allow you to set up additional partitions.
It’s recommended that you use LVM or OCFS.
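The partition limit above is easy to budget for before you run fdisk. The sketch below only illustrates the arithmetic: the extended partition is a container and holds no data itself, so 3 primary plus 11 logical partitions leave 14 usable raw devices. The list of 16 planned devices is made up for illustration and is not DBCA’s actual default layout.

```python
def usable_partitions(primaries: int = 3, logicals: int = 11) -> int:
    """Partitions that can hold data; the extended partition is only a container."""
    return primaries + logicals


def fits_on_one_disk(raw_devices, primaries: int = 3, logicals: int = 11) -> bool:
    """True if every planned raw device can get its own partition."""
    return len(raw_devices) <= usable_partitions(primaries, logicals)


if __name__ == "__main__":
    # Hypothetical plan: 16 raw devices, two of them for USERS and TOOLS.
    planned = [f"raw{i:02d}" for i in range(1, 15)] + ["users", "tools"]
    print(usable_partitions())            # 14
    print(fits_on_one_disk(planned))      # False
    trimmed = [d for d in planned if d not in ("users", "tools")]
    print(fits_on_one_disk(trimmed))      # True
```

List the raw devices your own DBCA template asks for and pass that list in; if it does not fit, the choices are the ones named above: drop tablespaces, add a second disk, or use LVM or OCFS.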
8: Oracle patch
Install Oracle 9i Database server with the RAC option according to Note
ID 184821.1, available on Metalink. A known issue causes shutdown immediate to
take more than 60 seconds to unregister. It was filed as Bug 1841387 and fixed
in the 9.0.1.1 patch set. However, Oracle Support recommended that we apply
the 9.0.1.4.0 patch set.
You
can download this from http://metalink.oracle.com and
simply follow the installation instructions. Once
you’ve created and started your database,
you’re ready to connect to the RAC from a
client workstation.
9: Testing SESSION Failover
You may use SESSION or SELECT type failover. SESSION is the simplest type.
When the connection to an instance is lost, SESSION failover results only in
the establishment of a new connection to a backup instance. Any work in
progress is lost.
SELECT failover is implemented by transparently re-executing the SELECT
statement and then bringing the cursor up to the same point as it was before
the failure.
There's no automatic recovery mechanism built into SELECT failover to handle
DML statements, such as INSERTs and UPDATEs, that are in progress when a
failover occurs. Your application will still need to use error checking
routines and transactions, but now if a failure occurs you can try the
transaction again on the same connection. If it was a node failure, the
connection has already reestablished itself to another node.
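The retry-on-the-same-connection pattern described above might look like the following sketch. It is not tied to any particular Oracle driver: conn is assumed to be any object with commit() and rollback() methods, and work and is_failover_error are hypothetical callables that you supply.

```python
import time


def run_with_retry(conn, work, is_failover_error, attempts=3, delay=1.0):
    """Run work(conn) as one transaction, retrying it after a failover error.

    conn              -- connection object with commit() and rollback()
    work              -- callable that issues the DML for one transaction
    is_failover_error -- callable deciding whether an exception is retryable
    """
    for attempt in range(1, attempts + 1):
        try:
            result = work(conn)
            conn.commit()
            return result
        except Exception as exc:
            # Give up on non-failover errors or when attempts are exhausted.
            if not is_failover_error(exc) or attempt == attempts:
                raise
            # The interrupted transaction is lost; roll back and start over.
            conn.rollback()
            time.sleep(delay)
```

What counts as a failover error depends on your driver. One plausible choice is to match the Oracle error code in the exception text, for example ORA-25402 (transaction must roll back), but check your driver's documentation for how it reports such errors.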
The METHOD parameter defines whether Oracle pre-establishes a connection to
the backup node.
BASIC: The connection to the backup node is established only at failover
time, so the backup node isn't used until the active node fails.
PRECONNECT: A connection to the backup node is also established up front, so
that the switch from one instance to the other is quick.
Now comes the exciting part: testing your Real Application Cluster for
failover. The following text describes the connect string in the client's
TNSNAMES.ORA file for SELECT failover.
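As an illustration of what such an entry can look like, the sketch below renders a TNSNAMES.ORA alias with a FAILOVER_MODE clause using the TYPE and METHOD values discussed above. The alias, the host names node1 and node2, and port 1521 are assumptions for the example, not values taken from our environment.

```python
def taf_entry(alias, service_name, hosts, port=1521,
              failover_type="SELECT", method="BASIC"):
    """Render a tnsnames.ora entry with a FAILOVER_MODE clause.

    Hypothetical helper: hosts and port are placeholders for your own nodes.
    """
    if failover_type not in ("SESSION", "SELECT"):
        raise ValueError(f"unsupported failover type: {failover_type}")
    if method not in ("BASIC", "PRECONNECT"):
        raise ValueError(f"unsupported failover method: {method}")

    lines = [f"{alias} =", "  (DESCRIPTION =", "    (ADDRESS_LIST ="]
    for host in hosts:
        lines.append(
            f"      (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))"
        )
    lines += [
        "    )",
        "    (CONNECT_DATA =",
        f"      (SERVICE_NAME = {service_name})",
        "      (FAILOVER_MODE =",
        f"        (TYPE = {failover_type})",
        f"        (METHOD = {method})",
        "      )",
        "    )",
        "  )",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(taf_entry("CLUST", "clust", ["node1", "node2"]), end="")
```

Paste the printed entry into the client's TNSNAMES.ORA and adjust the hosts, port, and service name to match your cluster. Compare the result with the Oracle Net documentation for your release before relying on it.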
From a client SQL*Plus session, connect to your RAC database. To determine
which instance you connected to, you can run the following:
SQL> select instance_number,instance_name from v$instance;
In
our case we were connected to instance “clust2”.
So if we were to run an SQL statement and take
down this instance, then we should be failed over
to instance “clust1” on the other node.
The next step is to run an SQL SELECT statement. Make sure it will run long
enough for you to shut down the instance you’re currently connected to. If you
need a table large enough for this, you could import one from your production
system via the imp/exp utilities. While the SQL is processing, type the
following on your server to shut down the instance where your SQL is
processing:
where
clust is the name of your cluster database and
clust2 is the specific instance you want to take
down. If everything is working properly you should
see your SQL results pause for a moment and then
pick back up. Once the SQL has completed verify
that you are connected to the other instance by
running:
SQL> select instance_number,instance_name from v$instance;
Conclusion
Now you’ve seen the steps necessary to configure RAC on Linux using FireWire
drives. Oracle Real Application Clusters with FireWire on Linux enable you
to build a robust clustered system on a shared disk using inexpensive hardware.
This will allow you to test your clustered applications and get experience
managing a cluster.
Authors:
Brian Carr (bcarrdba@neo.rr.com) is a Senior Database Administrator and Oracle
Certified Professional at a manufacturing company in Akron, Ohio.
William Garrett (wgarrett@neo.rr.com) is a Senior Application Developer/Web
Technologies at a manufacturing company in Akron, Ohio.