blog.virtualtacit.com

Root Down in a Virtualized World

Sunshine with INQ…


You know, this may be common knowledge, and there may be a thousand other utilities that serve this function, but I thought I would further propagate this utility’s cause, especially in the virtual world. INQ is the tool I speak of, and it brings a whole lot of goodness. OK, it’s not that great, but it does allow you, from a host perspective, to map the LUN ID of a specific volume. Useful? I think so.

Here is what you need: the binaries, located at ftp://ftp.emc.com/pub/symm3000/inquiry/latest. Run the utility from the command line on any OS for which a build is available. Below is what you will see…

 

image

If multiple arrays are attached to the host, verify that you are looking at the correct vendor. Here we are focused on the DGC value of VEND, which as you know is an EMC CLARiiON array. What we want to focus on is the first 2 digits, which represent the LUN ID in hex. Simply convert that number to decimal and you have your LUN ID.
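As a minimal sketch of that hex-to-decimal step, assuming you have already read the two leading hex digits off the INQ output by hand (the sample values below are made up for illustration and the function name is my own):

```python
def lun_id_from_hex(hex_digits: str) -> int:
    """Convert the two leading hex digits (e.g. "1F") to a decimal LUN ID."""
    return int(hex_digits, 16)


# Hypothetical sample values, not taken from a real array.
for digits in ("0A", "1F", "FF"):
    print(digits, "->", lun_id_from_hex(digits))
```

Anything you can express in two hex digits tops out at FF, or 255.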

Now here’s the downside: I haven’t figured out how to recognize LUN IDs beyond 255 (FF), as the last two LUNs represent LUN IDs 257 and 258. See what I mean? They are not so recognizable from this utility anymore, but I assume I am missing something, as most Symm LUN IDs run well past 255. Maybe this is just a limitation of arrays other than Symms. Anybody know?

Written by Joe Kelly

November 15, 2008 at 12:58 pm

Posted in clariion

EMC RecoverPoint-RPA Volumes (Bit II)


Wow…there is so much to RecoverPoint. The overwhelming amount of content has sparked a fire in me and gotten me extremely jazzed about learning this product and CDP in general. So let’s continue…

The three volume types I am about to mention are everything within RecoverPoint. Not only are they required as part of the installation, but they set the stage for how RP will perform and function. Remember, it’s all about planning with RP: do it right up front, size it properly up front, and you will experience quite a remarkable product.

So the three classifications of volumes that exist in a RecoverPoint implementation are user volumes, journals and the repository. What follows is a breakdown of each:

  • Journal-This is nothing more than a container for snapshot images of a particular replicated user LUN. But let’s not stop there, as it most certainly serves other purposes. What follows are the percentages of the journal allocated to each function.
    • 75% (variable) is dedicated to snapshot images and will hold as many as its capacity allows. Assuming the images have been distributed to the remote copy (in the case of CRR and CLR), it follows a FIFO process, or First In First Out: as new images come in, the oldest ones are removed to make room.
    • 20% (variable) is used for the sole purpose of logged image access (physical and virtual). Image access is as it implies: accessing a PIT in order to read from or write to that replica volume, with all changes logged to this area of the journal. This area can be enlarged with space taken from the snapshot image area (the 75%), but doing so requires an outage and loses the PITs within the volume. Keep in mind that image access is temporary and prolonged access could cause replication to cease.
    • 5% (fixed) is system-partitioned space for RP. It holds the virtual pointers needed to bring physical and virtual image stitched access to fruition.

The type of replication you are using, whether it’s CDP, CLR, or CRR, determines how many journal volumes are needed.
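To make the journal percentages above concrete, here is a rough sketch of the arithmetic. It assumes the 5% system area is fixed and that any growth in the image access area comes straight out of the snapshot area, as described above; the function and key names are my own and not anything RecoverPoint exposes.

```python
SYSTEM_PCT = 5.0  # fixed system area, per the breakdown above


def journal_split(journal_gb, image_access_pct=20.0):
    """Return the GB given to snapshots, image access and the system area."""
    # Whatever is not system space or image access is left for snapshots.
    snapshot_pct = 100.0 - SYSTEM_PCT - image_access_pct
    if image_access_pct < 0 or snapshot_pct < 0:
        raise ValueError("image access share must be between 0 and 95 percent")
    return {
        "snapshots_gb": journal_gb * snapshot_pct / 100.0,
        "image_access_gb": journal_gb * image_access_pct / 100.0,
        "system_gb": journal_gb * SYSTEM_PCT / 100.0,
    }


print(journal_split(100))                         # default 75/20/5 split
print(journal_split(200, image_access_pct=30.0))  # image access grown to 30%
```

With the default split, a 100 GB journal leaves 75 GB for snapshot images; bumping image access to 30% on a 200 GB journal leaves 130 GB.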


  • Repository-This volume is key to housing the configuration for all clustered RPAs, as well as consistency group, replication set, policy and group set settings, among other things. The idea here is that by maintaining all config info on the array, you allow all replication activities on a failing RPA to fail over seamlessly to another RPA(s). Only one volume per cluster (on both the local and the remote side) is needed, and both RPAs must be aware of it. The minimum size for the repository is 4 GB; however, it SHOULD be 124 GB, which is what is realistic. Here’s what I mean..
    • The first 4 GB is used for the aforementioned cluster configuration information. Beyond that, each consistency group created is earmarked 2 GB of the remaining 120 GB. Simple math (120 / 2) tells us that the limit is 60 CGs per cluster, or 30 per RPA. This 2 GB is used within the replication process during WAN flaps or drops, as a temporary caching location of sorts.
    • Furthermore, replication marking data is stored here, which lays the groundwork for more efficient resynchronization of replicated volumes.

    Don’t skimp on this volume. It should be sized up front (remember, 124 GB), as resizing on the fly is not practical: a resize requires disabling all CGs, clearing all journals, a full sweep of your environment, and a new activation license. Not only that, be prudent and give this volume the juice: it should live on fast spinning disk, as its role is quite demanding. The takeaway…give it 124 GB up front and save yourself a lot of pain on the back end.

  • User-This refers to the production source, the local copy (CDP) and the remote copy (CRR). Every production volume has a copy volume, and the pairing is defined in what is called a replication set: a mapping between the prod volume and its local or remote copy. Every write on the production volume is replicated to the remote or local journal and then copied to the remote or local copy; consistency is the name of the game here. The replica volumes should be the same size as the production source: bigger and you are wasting space, smaller and errors will occur.
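The repository sizing math from the breakdown above (4 GB of cluster configuration plus 2 GB per consistency group) can be sketched like this. The constants come straight from the numbers in this post; the function names are hypothetical and just for illustration.

```python
BASE_GB = 4    # cluster configuration area
PER_CG_GB = 2  # earmarked per consistency group


def repository_size_gb(consistency_groups: int) -> int:
    """Repository size needed to hold the given number of CGs."""
    return BASE_GB + PER_CG_GB * consistency_groups


def max_consistency_groups(repository_gb: int) -> int:
    """How many CGs fit once the 4 GB configuration area is set aside."""
    return max(0, (repository_gb - BASE_GB) // PER_CG_GB)


print(repository_size_gb(60))       # size for the 60-CG limit
print(max_consistency_groups(124))  # CGs that fit in a 124 GB repository
print(max_consistency_groups(4))    # the bare minimum leaves no CG space
```

By this arithmetic the 4 GB minimum covers configuration only, which is why 124 GB is the number to remember.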

Alright, so enough of that. What’s next? Design considerations, or maybe clarification of some of the terminology I used…it’s getting interesting, agreed?

Written by Joe Kelly

November 6, 2008 at 2:47 am

Posted in Uncategorized

EMC RecoverPoint-Hardware Awareness (Bit I)


Here is a quick rundown of what makes up an RPA, or RecoverPoint appliance. Under the covers it’s nothing more than a Dell 1950 (BTW, EOL’d Q2 2008). Corresponding to the 2.4 and 3.0 releases, there exist two generations of RPA, easily recognizable by the HBAs in use: Gen 1 has 2Gb HBAs, Gen 3 has 4Gb HBAs. Here is the breakdown..

  • Gen 1:
    • Phase 2 1950
    • PCI-X bus architecture
    • 2 2Gb Qlogic 2342-Dual ported
    • 2 sockets Dual Core, Intel Xeon Woodcrest
    • Compatible with RP V2.4 or V3.0 (V2.4 will not work with Gen 3 hardware)
  • Gen 3 (No I didn’t skip 2, huh?):
    • Phase 3 1950
    • PCI-E bus architecture
    • 2 4Gb Qlogic 2462-Dual ported
    • 2 sockets Quad Core, Intel Xeon Harpertown
    • Compatible with RP V3.0 and above only

Note: the sheer fact that these are commodity servers implies that, with the RP software and a Linux 2.6 kernel in hand, you could effectively build your own RP appliance for a lab environment. More on this in another post…

The HBAs on the Gen 3 hardware are auto-sensing, dual-mode adapters, meaning they act as either an initiator or a target depending on what they are connected to. This, I imagine, would be a welcome change from the Gen 1 HBAs, where only ports 0 and 2 were designated as targets and ports 1 and 3 as initiators.

In addition, there are two GigE ports per node: one for management and LAN replication, and one for WAN replication traffic. Each RPA needs an IP address, plus one VIP for floating IP management.

The Linux kernel and RP software are installed locally on the RPAs. Each appliance has two 73 GB drives configured as a mirror set. The local identity of the RPA (name, IP address, etc.) is also stored here. All consistency groups, replication sets, bit mapping, etc. are stored within the repository (SAN based; more on this in future posts).

Something to keep in mind: EMC sells RP in a minimum 2 node configuration. This doesn’t imply that a single node will not work, only that HA is no longer possible. If CRR or CLR is in play, your destination site hardware must mirror your source site hardware: if you have a 2 node cluster at Site A, you must have a 2 node cluster at Site B. Collectively this is known as a System.

How about that for a quickie? Next post…an explanation of the volume type functions within RecoverPoint and perhaps a tad bit more..stay with me.

Written by Joe Kelly

November 4, 2008 at 5:50 am

Posted in Uncategorized

Email Xtender, Large Ingest Woes


Normally I wouldn’t comment on such topics as archiving, as that subset of the IT world tends to be quite spiritless. But to lessen the head banging across the globe, I thought I would spell out some fundamental tweaks that ease the large ingest woes within EX. A basic understanding of what is involved in an EX implementation is assumed. Here is what we’ve got: when you are moving from a competitive archive solution to EX, a certain amount of finesse needs to be massaged into the EX configuration. Assuming that all the archived data has been puked from the other archive system onto a temp Exchange server for ingestion into EX, please take note of the following considerations…

 


  • Your message center volume sizes must be small enough to force closure by EX in a reasonable amount of time. In most situations other than this, 90MB is appropriate. But unless your message center drive is large enough, EX will fill up that drive during the initial bulk load. So what are your options?
    • Move the message center to the container drive, or resize the message center drive to accommodate the initial ingest. The container drive is usually sized large enough, as this is ultimately where your archived emails end up.
    • Change the following registry value to force closure of your container volumes after 2 hours. The default behavior is to close the volume once it reaches its allotted capacity (in this case 20MB) or after 5 days of inactivity.

        [HKEY_LOCAL_MACHINE\SOFTWARE\OTG\EmailXtender\RecordParms]

        "DirIdleTimeOut"=dword:00001c20
        (hex 1c20 = 7200 seconds, i.e. 2 hours)

    • Other options noted here, detailed below…
      • Navigate in the registry to HKLM\Software\OTG\EmailXtender and change the following values under that path..
        • MaxNumIndexProcesses (original value 4; change the hexadecimal value to 6 or 8, no more than 8). This will speed up indexing.
        • MsgVaultDays (originally set to 1; change the hexadecimal value to 0)
        • MsgVaultHours (original value 0; can change the hexadecimal value to 1)
        • MsgVaultMins (originally set to 0; can change the hexadecimal value to 15, 30, etc.)
        • If you use MsgVaultMins then you should keep MsgVaultDays at 0 and MsgVaultHours at 0. All the MsgVault values control how often a folder is created in EmailXtract. The more often folders are created, the fewer objects end up in each folder, which increases performance.
        • MaxRetryUpd (originally set to 60; change the decimal value to 15, 10, or lower. Do not change to 0)
        • IndexIterationTime (originally set to 600; change the decimal value to something like 300, 200, 100, etc. Do not change to 0)
        • The purpose of IndexIterationTime is to set how often EX goes through the IndexDropDir to see if there is anything it needs to process
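To double check the hex in that DirIdleTimeOut value, here is a small sketch of the seconds-to-dword conversion. It only builds the string you would paste into a .reg file and does not touch the registry; the helper names are my own invention for illustration.

```python
def seconds_to_reg_dword(seconds: int) -> str:
    """Format a number of seconds the way a .reg file writes a DWORD."""
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("value does not fit in a 32-bit DWORD")
    return f"dword:{seconds:08x}"


def reg_dword_to_seconds(dword: str) -> int:
    """Reverse the conversion, e.g. for checking an existing setting."""
    return int(dword.removeprefix("dword:"), 16)


two_hours = 2 * 60 * 60
print(seconds_to_reg_dword(two_hours))
print(reg_dword_to_seconds("dword:00001c20"))
```

Two hours is 7200 seconds, which is where the 00001c20 comes from. Note that `str.removeprefix` needs Python 3.9 or later.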

 

  • What about throughput? Besides your obvious quick wins here, i.e. more CPU, memory, and separate LUNs/spindles for each operation, here are a few other options to try.
    • Separate your initial archive tasks by distribution lists within Exchange. Create, say, 4 DLs with your site mailboxes evenly distributed among them. Point one archive task within EmailXtract to one DL, another archive task to another DL, and so on.
    • Other options noted here, detailed below..

        For optimum performance it is recommended that MSMQ (e.g. C:\WINDOWS\system32\msmq), Operating System (e.g. C:\WINDOWS), Index Directory (e.g. E:\Program Files\OTG\EmailXtender\EmailVault_Index), Message Center (e.g. E:\Program Files\OTG\EmailXtender\MsgCenter), Payload if you’re running big Extract tasks (e.g. E:\Program Files\OTG\EmailXtender\payload), Container (e.g. E:\EmailXtender), Mail Root Drop directory if you process Bloomberg or IM (e.g. C:\Inetpub\mailroot\Drop), and possibly SQL Server (e.g. C:\Program Files\Microsoft SQL Server) be located on separate SPINDLES.

        You can use this step to locate or move the Index drive
        ————————————————————————————–
        Stop all services. Copy the EmailVault_Index directory to the new drive. In regedit, create a new string value called Indexdir under HKEY_LOCAL_MACHINE\Software\OTG\EmailXtender and put the new drive letter in it. Restart services and indexing will start to place files on the new drive.
        A full path can also be entered in this string (Indexdir) to reflect the new directory location. If a directory path is listed, note that the EmailVault_Index directory will be created at service startup, along with the 3 working subdirectories: indexdir, baddir, and dropdir. Once these are created, the data from the previous location should be copied into each newly created directory of the same name.

        You can use this process to relocate the Payload directory
        ——————————————————————————————-
        Make sure that there are no active Extract tasks, close Extract completely (make sure it is not running on the system tray)
        Use regedit to edit the Payload value under HKLM\Software\OTG\Emailxtender: enter the location of the Payload directory.

        You can use this process to relocate the Message Center
        ——————————————————————————————
        Disconnect connection mailbox and make sure that there are no active Extract tasks running.
        Close all open volumes and ensure that all Qs are zero
        Stop all EmailXtender services
        Use regedit to edit the MsgVaultDir value under HKLM\Software\OTG\Emailxtender: enter the location of the Message Center directory.
        Restart all EmailXtender services

        You can use this process to relocate the MailRoot\Drop directory
        ——————————————————————————————————-
        Reinstall IIS on the desired directory
        Use regedit to edit the SMTP_DropDir value under HKLM\Software\OTG\Emailxtender: enter the location of the SMTP Drop directory.
        You can move MSMQ using the applet in Control Panel; you will have to reboot the system. Please make sure that all volumes are closed and all Qs are empty.
        You can select the new Containers location using the EmailXtender administrator GUI.
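Going back to the distribution list idea in the throughput section above, here is a rough sketch of how you might split a mailbox list evenly across 4 DLs before creating them in Exchange. It is plain list handling and does not talk to Exchange or EX; the mailbox names are made up and the function name is hypothetical.

```python
def split_evenly(mailboxes, groups=4):
    """Deal mailboxes round-robin into the given number of buckets."""
    if groups < 1:
        raise ValueError("need at least one group")
    buckets = [[] for _ in range(groups)]
    # Sorting first just makes the result repeatable between runs.
    for index, mailbox in enumerate(sorted(mailboxes)):
        buckets[index % groups].append(mailbox)
    return buckets


# Made-up mailbox names for illustration only.
sample = [f"user{n:02d}" for n in range(10)]
for number, members in enumerate(split_evenly(sample), start=1):
    print(f"DL{number}: {members}")
```

Round-robin keeps the bucket sizes within one mailbox of each other. It does not account for mailbox size, so if a handful of mailboxes are huge you may want to balance on size instead of count.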

Written by Joe Kelly

October 28, 2008 at 6:13 pm

Posted in Uncategorized

Data De-dupe on Primary storage not so Peachy


In case you guys missed it, NetApp offered a shakeup back in July with the announcement that their V-series line can de-dupe their competitors’ primary storage, noted here. I would have to agree with the general consensus that deduplication has its place, and its place ain’t on primary storage. Applying de-dupe to secondary and tertiary storage and to backup operations is really the meat behind this feature’s punch.

The goal here is to provide your production data with the means to achieve high throughput and low latency, right? Yet what you are doing is permeating your critical business infrastructure with an operation that is known to degrade performance. Not to mention this doesn’t in any way, shape or form provide the customer with an end to end, low maintenance, SUPPORTED solution. As a customer you must decide whether or not your investment in your array (whether EMC, HP, HDS, IBM, etc.) is all for naught.

Fronting your array with the V-series essentially strips away all of the array’s management capabilities and reduces it to JBOD. Your investment is no longer an investment. E-labs, EMC’s own interoperability entity, works with associated vendors to resolve issues for qualified configurations. These support agreements run deep with various vendors, but NetApp, particularly the V-series, is not one of them.


So here is what NetApp suggests:

· Support for de-duplication of primary data on third party storage arrays from EMC, HDS, HP, and others when connected to NetApp V-Series Virtualization systems.

· NetApp de-duplication, a feature of the Data ONTAP operating system provided at no cost on FAS systems, is now also offered free with the V-Series.

· End-to-end de-duplication including primary data, as opposed to other vendors’ de-duplication of only backup or archive environments.

· Improved business efficiency and reduced data management complexity using V-Series with non-NetApp storage arrays.

· De-duplication helps improve space efficiency and reduce raw storage requirements.

· By using V-Series with de-duplication, customers are able to better control their heterogeneous data growth while reducing costs and simplifying data management.

· More than 10,000 NetApp systems and 2,500 customers running NetApp de-duplication technology.

· All NetApp storage technologies will include de-duplication by the end of 2008.

Ok so here’s the rub when it comes to data de-dupe on NetApp Filers:

· Active snapshots? Sorry, you can’t de-dupe

· Severe volume limitations are imposed as part of the 3D process

· No de-dupe across FlexVols, aggregates, or filers.

· Backup to tape inflates data to pre-dedupe size

· Since de-duplication is a post-process operation, NetApp offers no reduction of capacity requirements for initial purchase of new systems.

· No reclamation of space in block based storage (FC and iSCSI)

· Scheduling complications are now a reality. Avoiding periods of snapshotting, replication, archiving and general heavy work loads can be difficult.

· NetApp says “If there is very little new data, run de-duplication infrequently, because it doesn’t make sense to unnecessarily consume CPU resources.” http://www.netapp.com/us/library/technical-reports/tr-3505.html

· De-duplication itself is free, but are SnapVault and SnapMirror? Should I remind you that nothing in life is free?

De-dupe, like every other storage feature, whether it’s from EMC, NetApp, DataDomain, etc., has its positives and negatives. Just make sure you as a customer look beyond the marketing wooglie booglie and understand the technology you are depending on.

One last thought: if you do decide to turn on NetApp 3D, please take 10 minutes to fill out an “I told you so” form that releases NetApp from any wrongdoing when your performance dips below the equator, http://www.crn.com/storage/209901632. I kid you not….

Written by Joe Kelly

October 5, 2008 at 4:13 pm

Posted in storage