Why Site Recovery Manager Update 1 is so welcome
Like a breath of fresh air, like a long-lost friend from college, like the smell of Toll House cookies on a lazy summer afternoon: that is how I feel about SRM Update 1. Although the list of new features could have been longer in this release, I am hoping that Update 1 was merely a stopgap (to appease the masses) ahead of a more feature-rich Update X or SRM X.0, whichever route VMware chooses to take. Anyhow, here are the new features in this release…
- New Permission Required to Run a Recovery Plan
SRM now distinguishes between permission to test a recovery plan and permission to run a recovery plan. After an SRM server is updated to this release, existing users of that server who had permission to run a recovery plan no longer have that permission. You must grant Run permission to these users after the update is complete. Until you do, no user can run a recovery plan. (Permission to test a recovery plan is unaffected by the update.)
- Full Support for RDM devices
SRM now provides full support for virtual machines that use raw disk mapping (RDM) devices. This enables support of several new configurations, including Microsoft Cluster Server. (Virtual machine templates cannot use RDM devices.)
This is a big deal. The lack of it was discouraging to a lot of our customers, and I am glad to see it was pushed to the forefront.
- Batch IP Property Customization
This release of SRM includes a tool that allows you to specify IP properties (network settings) for any or all of the virtual machines in a recovery plan by editing a comma-separated-value (csv) file that the tool generates.
Heaven help you if you ever had to change the IP properties for all your VMs at the recovery site. Another great time saver…
- Limits Checking and Enforcement
A single SRM server can support up to 500 protected virtual machines and 150 protection groups. This release of SRM prevents you from exceeding those limits when you create a new protection group. If a configuration created in an earlier release of SRM exceeds these limits, SRM displays a warning, but allows the configuration to operate.
- Improved Support for Virtual Machines that Span Multiple Datastores
This release provides improved support for virtual machines whose disks reside on multiple datastores.
I am seeing this more and more at client sites. It's disconcerting that customers are doing this, as the practice only hurts you in the long run. Nonetheless, the need has been fed.
- Single Action to Reconfigure Protection for Multiple Virtual Machines
This release introduces a Configure All button that applies existing inventory mappings to all virtual machines that have a status of Not Configured.
Sweet baby Jesus, this is a great time saver as well. What shall I do with all the spare time I am getting back? You tell me.
- Simplified Log Collection
This release introduces new utilities that retrieve log and configuration files from the server and collect them in a compressed (zipped) folder on your desktop.
- Improved Acceptance of Non-ASCII Characters
Non-ASCII characters are now allowed in many fields during installation and operation.
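To make the limits checking item concrete, here is a minimal sketch of the kind of check it describes. This is not SRM's code or API: the function name and parameters are made up for illustration, and only the two limits (500 protected VMs, 150 protection groups per SRM server) come from the release notes quoted above.

```python
# Limits per SRM server, taken from the release notes quoted above.
MAX_PROTECTED_VMS = 500
MAX_PROTECTION_GROUPS = 150


def can_add_protection_group(current_vms, current_groups, new_group_vms):
    """Hypothetical helper: True if one more protection group containing
    new_group_vms virtual machines keeps the server within both limits."""
    within_group_limit = current_groups + 1 <= MAX_PROTECTION_GROUPS
    within_vm_limit = current_vms + new_group_vms <= MAX_PROTECTED_VMS
    return within_group_limit and within_vm_limit


if __name__ == "__main__":
    # 480 VMs in 10 groups today; can we add a group of 20 more VMs?
    print(can_add_protection_group(480, 10, 20))
    # And a group of 21?
    print(can_add_protection_group(480, 10, 21))
```

Handy for a quick sanity check when planning how many protection groups a single SRM server will end up carrying.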
EMC RecoverPoint-RPA Volumes (Bit II)
Wow…there is so much to RecoverPoint. The overwhelming amount of content has sparked a fire in me and gotten me extremely jazzed about learning this product and CDP in general. So let's continue…
The three volumes I am about to mention are everything within RecoverPoint. Not only are they required as part of the installation, they also set the stage for how RP will perform and function. Remember, it's all about planning with RP: do it right up front, size it properly up front, and you will experience quite a remarkable product.
The three classifications of volumes that exist in a RecoverPoint implementation are user volumes, journals and the repository. What follows is a breakdown of each:
- Journal-This is nothing more than a container for snapshot images of a particular replicated user LUN. But let's not stop there, as it certainly serves other purposes. What follows are the percentages of the journal allocated to each function.
- 75% (variable) is dedicated to snapshot images and holds as many as its capacity allows. Assuming the images have been distributed to the remote copy (in the case of CRR and CLR), this area follows a FIFO (First In, First Out) process: as new images come in, the oldest ones are removed to make room.
- 20% (variable) is reserved for logged image access (physical and virtual). Image access is what the name implies: accessing a point in time (PIT) in order to read from or write to that replica volume, with all changes logged to this area of the journal. This area can be enlarged by taking space from the snapshot image area (the 75%), but doing so requires an outage and loses the PITs within the volume. Keep in mind that image access is temporary, and prolonged access could cause replication to cease.
- 5% (fixed) is system space partitioned for RP. It holds the virtual pointers needed to stitch together physical and virtual image access.
How many journal volumes you need depends on the type of replication you are using: CDP, CLR, or CRR.
- Repository-This volume is key: it houses the configuration for all clustered RPAs, including consistency groups, replication sets, policy settings and group sets, among other things. The idea is that by keeping all config info on the array, replication activities on a failing RPA can seamlessly fail over to another RPA. Only one volume per cluster (on both the local and the remote side) is needed, and both RPAs must be able to see it. The minimum size for the repository is 4 GB; however, it SHOULD be 124 GB, which is what is realistic. Here's what I mean:
- The first 4 GB is used for the aforementioned cluster configuration information. Beyond that, each consistency group created is earmarked 2 GB of the remaining 120 GB. Simple math (120 / 2) tells us that the limit is 60 CGs per cluster, or 30 per RPA in a two-node cluster. This 2 GB is used within the replication process when the WAN flaps or drops, as a temporary caching location of sorts.
- Furthermore, replication marking data is stored here, which lays the groundwork for more efficient resynchronization of replicated volumes.
Don't skimp on this volume. It should be sized up front (remember, 124 GB), as resizing on the fly is not practical: a resize requires disabling all CGs, clears all journals, forces a full sweep of your environment, and needs a new activation license. Not only that, be prudent and give this volume the juice: it should live on fast spinning disk, as its role is quite ponderous. Takeaway: give it 124 GB up front and save yourself a lot of pain on the back end.
- User-This refers to the production source, the local copy (CDP) and the remote copy (CRR). Every production volume has a copy volume, defined in what is called a replication set. A replication set defines the mapping between the prod volume and its local or remote copy. Every write to the production volume is replicated to the local or remote journal and then copied to the local or remote copy; consistency is the name of the game here. The replica volumes should be the same size as the production source: any bigger and you are wasting space, any smaller and errors will occur.
Alright, enough of that. What's next? Design considerations, or maybe clarification of some of the terminology I used… It's getting interesting, agreed?
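The sizing arithmetic above can be sketched in a few lines of Python. This is a back-of-the-envelope calculator, not an EMC tool: the function names are made up for illustration, and the numbers (a 75/20/5 journal split with a fixed 5% system area, 4 GB of repository config space plus 2 GB per consistency group) are simply the ones quoted in this post.

```python
def journal_layout(journal_gb, image_access_pct=20.0):
    """Split a journal into the three areas described above.

    The 5% system area is fixed; whatever the image access area gains
    is taken away from the snapshot area (75% by default).
    """
    system_pct = 5.0
    if not 0 < image_access_pct < 100 - system_pct:
        raise ValueError("image_access_pct must be between 0 and 95")
    snapshot_pct = 100.0 - system_pct - image_access_pct
    return {
        "snapshots_gb": journal_gb * snapshot_pct / 100,
        "image_access_gb": journal_gb * image_access_pct / 100,
        "system_gb": journal_gb * system_pct / 100,
    }


def repository_size_gb(consistency_groups, config_gb=4, per_cg_gb=2):
    """Repository size: cluster config space plus 2 GB per consistency group."""
    if consistency_groups < 0:
        raise ValueError("consistency_groups cannot be negative")
    return config_gb + per_cg_gb * consistency_groups


if __name__ == "__main__":
    print(journal_layout(100))     # default split of a 100 GB journal
    print(repository_size_gb(60))  # 60 CGs is where the 124 GB figure comes from
```

Running the repository function with 60 consistency groups reproduces the 124 GB recommendation, which is why sizing for the maximum up front costs so little compared with a resize later.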
EMC RecoverPoint-Hardware Awareness (Bit I)
Here is a quick rundown of what makes up an RPA, or RecoverPoint appliance. Under the covers it's nothing more than a Dell 1950 (BTW, EOL'd Q2 2008). Corresponding to the 2.4 and 3.0 releases, there are two generations of RPA, easily recognizable by the HBAs in use: Gen 1 has 2Gb HBAs, Gen 3 has 4Gb HBAs. Here is the breakdown:
- Gen 1:
- Phase 2 1950
- PCI-X bus architecture
- 2 x 2Gb QLogic 2342, dual-ported
- 2 sockets Dual Core, Intel Xeon Woodcrest
- Compatible with RP V2.4 or V3.0 (V2.4 will not work with Gen 3 hardware)
- Gen 3 (No I didn’t skip 2, huh?):
- Phase 3 1950
- PCI-E bus architecture
- 2 x 4Gb QLogic 2462, dual-ported
- 2 sockets Quad Core, Intel Xeon Harpertown
- Compatible with RP V3.0 and above only
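If you want the compatibility rules in a form you can drop into an inventory script, here is a small sketch. It encodes only what the two lists above say (Gen 1 with V2.4 or V3.0, Gen 3 with V3.0 and above); the function name is hypothetical, and whether Gen 1 runs anything newer than 3.0 is not covered here, so the sketch treats it as unsupported.

```python
def is_supported(generation, rp_version):
    """Return True if the RPA hardware generation can run the given
    RecoverPoint version, per the breakdown in this post.

    rp_version is a string such as "2.4" or "3.0".
    """
    parts = rp_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    version = (major, minor)

    if generation == 1:
        # Gen 1: listed as compatible with V2.4 or V3.0 only.
        return version in {(2, 4), (3, 0)}
    if generation == 3:
        # Gen 3: V3.0 and above; V2.4 will not work.
        return version >= (3, 0)
    raise ValueError("unknown RPA generation: %r" % (generation,))


if __name__ == "__main__":
    for gen, ver in [(1, "2.4"), (1, "3.0"), (3, "2.4"), (3, "3.0")]:
        print(gen, ver, is_supported(gen, ver))
```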
Note: the sheer fact that these are commodity servers implies that, with the RP software and a Linux 2.6 kernel in hand, you could effectively build your own RP appliance for a lab environment. More on this in another post…
The HBAs on the Gen 3 hardware are auto-sensing, dual-mode adapters, meaning they act as either an initiator or a target depending on what they are connected to. This, I imagine, is a welcome change from the Gen 1 HBAs, where only ports 0 and 2 were designated as targets and ports 1 and 3 as initiators.
In addition, there are two GigE ports per node: one for management and LAN replication, and one for WAN replication traffic. Each RPA needs an IP address, plus one VIP for floating IP management.
The Linux kernel and RP software are installed locally on the RPAs. Each appliance has two 73 GB drives configured as a mirror set. The local identity of the RPA (name, IP address, etc.) is also stored here. All consistency groups, replication sets, bit mapping, etc. are stored within the repository (SAN based; more on this in future posts).
Something to keep in mind: EMC sells RP in a minimum 2-node configuration. This doesn't mean that a single node will not work, only that HA is not possible with one. If CRR or CLR is in play, your destination site hardware must mirror your source site hardware: if you have a 2-node cluster at Site A, you must have a 2-node cluster at Site B. Collectively this is known as a System.
How about that for a quickie? Next post: an explanation of the volume type functions within RecoverPoint, and perhaps a tad more. Stay with me.