SYSADMINTUTORIALS IT TECHNOLOGY BLOG

Netapp and Veeam – Combining Snapmirror and Snapvault


I have been trying out a few different configurations in my lab with a Netapp and Veeam setup running snapmirror and snapvault on the same volume.

Traditionally, if you wished to keep offsite backups you would use Netapp Snapvault, and if you wanted disaster recovery you would use Netapp Snapmirror.

There is nothing wrong with doing it that way, however the challenge we saw was the doubling up of storage: 1 destination volume for snapvault and 1 destination volume for snapmirror.

The configuration I’m going to outline below utilizes 1 destination volume for both snapmirror and snapvault, which could save you a heap of storage and money.

Combining Snapmirror and Snapvault

Let’s set the scenario. We have a VMware environment with 1 datastore called nfsdatastore1. By the name you can tell this is an NFS mount. This datastore contains a handful of virtual machines.

What we want to achieve is disaster recovery and offsite backups, using as little space as possible in our offsite location.

To do this we will create a snapvault relationship.

Ensure that the 2 Netapp clusters are peered together:

SourceFiler::> cluster peer create -peer-address 1.1.1.1 (change the IP to suit your environment)

SourceFiler::> vserver peer create -peer-vserver destinationvfiler -peer-cluster destinationcluster -applications snapmirror

Create a Snapmirror relationship with type XDP

DestinationFiler::> snapmirror create -source-path sourcevfiler:nfsdatastore1 -destination-path destinationvfiler:nfsdatastore1_sv -type XDP

DestinationFiler::> snapmirror initialize -source-path sourcevfiler:nfsdatastore1 -destination-path destinationvfiler:nfsdatastore1_sv -type XDP
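Once the initialize has kicked off, it is worth confirming the relationship is healthy before handing control over to Veeam. A minimal check from the destination cluster, using this example’s paths, might look like this:

DestinationFiler::> snapmirror show -destination-path destinationvfiler:nfsdatastore1_sv -fields state,status,type,policy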

Veeam Backup & Replication to Drive Snapvault

We are going to use Veeam Backup & Replication to drive the Netapp Snapvault component. Within the backup job, under Storage, select Configure secondary destinations for this job.

Veeam Secondary Destination

You will now have an option to select Secondary Target. Within here click Add and select Netapp Snapvault. Click Edit and select the number of offsite backups you would like to retain.

Veeam Snapvault Retention

Now that we have Veeam set up to control our Netapp Snapvault (offsite backups), we can start on the configuration for disaster recovery.

Netapp Disaster Recovery Setup

We will create a schedule for when we would like the source volume snapshots taken:

SourceFiler::> schedule cron create -name nfsdatastore1_dr -minute 05

Next, create a snapshot policy.

SourceFiler::> snapshot policy create -vserver SourceVfiler -policy nfsdatastore1_dr -enabled true -schedule1 nfsdatastore1_dr -count1 2 -prefix1 nfsdatastore1_dr -snapmirror-label1 nfsdatastore1_dr

Now we will apply this snapshot policy to the volume

SourceFiler::> volume modify -vserver SourceVfiler -volume nfsdatastore1 -snapshot-policy nfsdatastore1_dr

What we have now is a Snapshot policy that will create a snapshot every hour on the volume nfsdatastore1 and retain 2 snapshot copies.

We’ll now get started on the config at the destination side.

First up, let’s create a schedule. The schedule will be applied to the snapmirror relationship later on. One IMPORTANT note with the schedule is that we must leave out the hour at which the Veeam backup runs. If Veeam detects that there is an existing snapmirror transfer going on, it will skip transferring the snapvault snapshot and throw up a warning at the end of the backup job. Currently there are no retry settings for snapmirror/snapvault within Veeam (including version 9).

DestinationFiler::> schedule cron create -name nfsdatastore1_dr -hour 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 -minute 15 (notice that hour 23 is left out, because the Veeam backups run at 11pm from the source side)

Next up is to create a snapmirror policy

DestinationFiler::> snapmirror policy create -vserver DestinationVserver -policy nfsdatastore1_dr

The snapmirror policy rule below transfers the snapshot with label nfsdatastore1_dr (with the appended time and date) from source to destination and keeps the last 2 snapshots.

DestinationFiler::> snapmirror policy add-rule -vserver DestinationVserver -policy nfsdatastore1_dr -snapmirror-label nfsdatastore1_dr -keep 2

Now we can tie all this together in the existing snapmirror relationship

DestinationFiler::> snapmirror modify -source-path sourcevfiler:nfsdatastore1 -destination-path destinationvfiler:nfsdatastore1_sv -policy nfsdatastore1_dr -schedule nfsdatastore1_dr
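With the policy and schedule attached, you can verify that both the Veeam-driven snapvault snapshots and the hourly DR snapshots are landing on the destination volume. Something along these lines, again using this example’s names (the fully qualified form of the second command is volume snapshot show), will list them:

DestinationFiler::> snapmirror show -destination-path destinationvfiler:nfsdatastore1_sv -fields policy,schedule
DestinationFiler::> snapshot show -vserver destinationvfiler -volume nfsdatastore1_sv -fields snapmirror-label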

CAVEAT: There is 1 caveat with this setup: the replicated snapshots are crash consistent, so your VMs will come up in a crash-consistent state on the disaster recovery side.

If you have any questions, feel free to leave comments below.


VMware vCenter 5.5 Update 2 upgrade to Update 3b Woes


I came across a few issues upgrading a site from vCenter 5.5 update 2 up to update 3b. I’m writing this article in the hope that if you run into the same problem it will save you hours of troubleshooting and support calls with VMware.

vCenter 5.5 Update 3b Upgrade

I had already gone through updating a few instances of vCenter 5.5 to Update 3b without any issue at all. From backing up the database and taking a few VM and storage snapshots through to completion, it took a little under an hour. However, this last site experienced a bizarre issue that had some VMware techs stumped until the issue was escalated far enough up the ranks.

First Upgrade Attempt

On the first attempt at upgrading this site’s vCenter to 5.5 Update 3b, the upgrade itself completed successfully, but the vCenter service started and, not long after, crashed. It repeated this numerous times, creating endless amounts of vpxd.log files. We ended up reverting to the backups and back to the original Update 2.

A VMware support ticket was raised and the logs from the attempted upgrade were submitted. We were advised that the vCenter server contained corrupt virtual machines that needed to be removed from the back-end SQL database. The KB article (KB 1028750) for this issue can be found here:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1028750

Since we had already rolled back to Update 2, we could simply identify the virtual machine, shut it down and remove it from inventory. Following this we would attempt to upgrade the vCenter server to Update 3b again.

Second Upgrade Attempt

The second upgrade attempt did not crash the vCenter service and we thought the upgrade was complete, until a few hours later the service began to crash again. As the vCenter server needed to be online for people to use, we once again rolled back to Update 2.

I repeated the same process of uploading the logs to VMware; they analyzed them and the result was another corrupted virtual machine that needed to be removed from the inventory. Once again we removed the questionable virtual machine.

Third Upgrade Attempt

We could not continue rolling back, uploading logs, removing 1 virtual machine and then performing the upgrade again; it was way too time consuming. This time we scheduled a WebEx with the VMware engineer to be online while we performed the upgrade, which meant he could troubleshoot the issue while it was happening.

On the third upgrade attempt the vCenter service did not start crashing until 24 hours later. From that point it continually started and crashed.

Of course when it crashed it had to be after hours, however I managed to get hold of a new VMware engineer who helped me troubleshoot the issue for 9 hours non-stop (yes, it was a long night). What we did was clear out old tasks and events from the vCenter database, followed by a shrink task. He then changed the thread stack size to 1024, then 2048, within the vpxd.cfg file. We checked for the known bug of a stale LDAP entry within ADSI Edit via VMware KB2044680. We did a lot of other little things here and there, and even went as far as uninstalling and reinstalling vCenter Server. Still the service would continue to crash.

More logs were taken offline and analyzed. It wasn’t until the morning, somewhere between 9am and 10am, that we heard back from an escalation engineer in the US who found the following problem.

The Problem

Within the vpxd.log file, there was an issue with a virtual machine’s vswap file. Basically, when the vCenter service starts up, it verifies that the vswap file for each virtual machine exists in the vswap location, which in this instance is set to a specific vswap datastore. This verification step is apparently new to Update 3b, which explains why the service started successfully and kept running when we reverted back to Update 2.

VPXD.LOG Entry

Resolved localPath /vmfs/volumes/9625aa71-a57f224f/Company-Web01-1d614347.vswp to URL ds:///vmfs/volumes/9625aa71-a57f224f/Company-Web01-1d614347.vswp

mem> 2016-02-05T13:05:42.928+11:00 [10860 panic ‘Default’ opID=HB-host-7902@618-32a4dff9] (Log recursion level 2) Win32 exception: Access Violation (0xc0000005)

vCenter-Update3b-Crash
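If you suspect you are hitting the same issue, a quick way to identify the offending virtual machine is to filter the vpxd logs for the vswap resolution entries around the crash time. A rough sketch (log file names and paths will differ in your environment):

grep -i "Resolved localPath" vpxd.log | grep -i "vswp"

or, on a Windows vCenter server:

findstr /i "vswp" vpxd*.log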

Next was to find the host this virtual machine was running on; I had to log into each ESXi host until I found the running virtual machine. We enabled SSH and established a CLI session, then stopped the vpxa service on the ESXi host by issuing the command /etc/init.d/vpxa stop

Once the vpxa service had stopped, we attempted to restart the vCenter service. The service started and continued to run.

To double check if there was any swap file for this questionable virtual machine, we switched over to our ESXi CLI session and typed:

ls -lrth /vmfs/volumes/9625aa71-a57f224f | grep Company-Web01

This command did not produce any vswap files in this location.

We headed back into the vSphere client attached directly to the ESXi host and shut down the virtual machine. Once it was shut down we powered it on again. Now it was time to re-check whether the vswap file existed in the vswap datastore location. Issuing the same command again:

ls -lrth /vmfs/volumes/9625aa71-a57f224f | grep Company-Web01

Displayed the vswap file for the Company-Web01 virtual machine

Still within the ESXi CLI session, we started the vpxa service:

/etc/init.d/vpxa start

After a few seconds, the host re-established a connection to vCenter and re-populated all virtual machines (previously they were displayed as orphaned while the ESXi host was disconnected).

 

Netapp Unified Manager Upgrade 6.3 RC1 to 6.3 GA Invalid ISO


There is currently a small issue when trying to upgrade from Netapp Unified Manager 6.3 RC1 to 6.3 GA, whereby at the beginning of the upgrade the wizard cannot find the CDROM and displays:

Invalid ISO

Make sure that a valid ISO is mounted on the virtual CD/DVD device

This is documented in Netapp’s KB Article https://kb.netapp.com/support/index?page=content&id=S%3A2025520&actp=LIST

Netapp Unified Manager 6.3 RC1 – Create a new Symbolic Link for the CDROM

The fix is to create a new symbolic link for the CDROM drive:

  1. Log in as your maintenance user
  2. Select option 4
  3. Type erds
  4. Create a new UNIX password for the diag user
  5. Now create a new SSH session to the Unified Manager server
  6. Log in as diag with the password you created in step 4 above
  7. Type: sudo ln -sf /cdrom /media/cdrom

You can then exit the diag SSH session and attempt the upgrade again.

 

Netapp Announces General Availability of Data Ontap 8.3.2


Netapp

This week Netapp announced the general availability of Data Ontap 8.3.2.

Some of the added features of Data Ontap 8.3.2 are:

Netapp Data Ontap 8.3.2 Features

  • Inline Deduplication
  • MetroCluster distance increase now supporting up to 300km between sites
  • Simplified setup and performance for Oracle NFS workloads on All Flash FAS (AFF), with a quick start guide and OnCommand System Manager
  • Copy Free Transition (CFT) – leverage existing disk shelves to transition from 7-mode to clustered Data Ontap

Netapp Data Ontap 8.3.2 Release Notes

Here is the direct link to the Data Ontap 8.3.2 Release Notes, where you can find a complete description of all the features included in this release:

https://library.netapp.com/ecm/ecm_get_file/ECMLP2348067

Netapp Unified Manager 6.4RC1 and Performance Manager 2.1RC1


Last week Netapp released the latest Unified Manager and Performance Manager. The latest release ties the 2 products closer than ever with the new Full Integration option (single pane of glass), which allows you to view both Unified Manager health information and Performance Manager information within the same window.

Over the weekend I had a chance to install this in the lab, and after a few days of gathering information and stats, I’m very impressed with these releases.

One component that I was hoping would be fixed in this release was the ability to configure threshold settings on various objects, such as aggregates, volumes, etc. However, it looks like it is still not working in this release.

Below are the links to the release notes and installation guides, along with some screenshots of each product.

Netapp Unified Manager 6.4RC1

Unified Manager 6.4RC1 Release Notes
Unified Manager 6.4RC1 Installation and Setup Guide

Unified Manager 6.4RC1 – Adding a new cluster


Netapp Unified Manager 6.4RC1

Unified Manager 6.4RC1 – Main Health Dashboard

Netapp Unified Manager 6.4RC1

Unified Manager 6.4RC1 – Aggregate Disk View

Netapp Unified Manager 6.4RC1

Netapp Performance Manager 2.1RC1

Performance Manager 2.1RC1 Release Notes
Performance Manager 2.1RC1 Installation and Administration Guide

Performance Manager 2.1RC1 – Unified Manager Integration Options

Netapp Performance Manager 2.1RC1

Performance Manager 2.1RC1 – Performance Dashboard

Netapp Performance Manager 2.1RC1

Performance Manager 2.1RC1 – Cluster Details

Notice in the window below we are looking at Performance Manager within Unified Manager

Netapp Performance Manager 2.1RC1

Performance Manager 2.1RC1 – Volume Performance Information

Netapp Performance Manager 2.1RC1

Netapp Performance Manager 2.1RC1 and vSphere 5.5 install error


There is an issue installing Netapp Performance Manager 2.1RC1 on VMware vSphere 5.5 whereby the network settings do not get applied during the install wizard.

The error you will see is:

  • Common Agent: failed
  • Failure occurred while applying changes. Restoring back to previous configuration

Netapp Performance Manager 2.1RC1 install error

Netapp OnCommand Performance Manager does install correctly within VMware vSphere 6.

Below is the workaround if you are experiencing this issue.

Netapp Performance Manager 2.1RC1 Network Fix

1. After seeing the error above, let the server boot until you see the login prompt. Log in with the username admin and the password you set during the OVA deployment.

Netapp Performance Manager 2.1RC1 install error

2. Select option 4 and type erds where it asks you to Enter your choice. Press y to enable remote diagnostic access.

Netapp Performance Manager 2.1RC1 install error

3. Enter a password for the diag user

Netapp Performance Manager 2.1RC1 install error

4. Press x to exit the menu system and return to the login prompt.

Netapp Performance Manager 2.1RC1 install error

5. Log in as diag with the password you set in step 3, then type sudo bash --login and press Enter. Next we want to edit the interfaces file with vi, as you can see in the screenshot below.

Netapp Performance Manager 2.1RC1 install error

6. We need to change iface eth0 inet dhcp to iface eth0 inet static and add the network details, as in the screenshot below (a sketch of the resulting file follows this step). Some commands you can use in vi to manipulate the contents are:

  • i – enter insert mode
  • Esc – exit insert mode
  • x – deletes a single character
  • :wq! – write and quit (saves the file)

Netapp Performance Manager 2.1RC1 install error
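Since the screenshot carries the detail for this step, here is a rough sketch of what the static configuration ends up looking like; the addressing is placeholder and the file is typically /etc/network/interfaces on this appliance, so adjust for your environment:

auto eth0
iface eth0 inet static
    # placeholder addressing - substitute your own IP, mask and gateway
    address 192.168.1.50
    netmask 255.255.255.0
    gateway 192.168.1.1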

7. Next we will run the VMware Tools configuration script. Please read step 8 before pressing Enter or accepting any defaults.

Netapp Performance Manager 2.1RC1 install error

8. We can accept all defaults EXCEPT enabling vgauth. The default is yes, however we want to set it to no, exactly as in the screenshot below.

Netapp Performance Manager 2.1RC1 install error

9. Once the script has finished, go ahead and reboot the server. OnCommand Performance Manager will now have its network settings applied correctly.

Netapp Performance Manager 2.1RC1 install error

Netapp Tech Ontap Podcast


Netapp Tech Ontap Podcast

If you haven’t listened or subscribed to the Netapp Tech Ontap Podcast, I highly recommend it. It’s a great way to stay on top of the latest technology and innovation from Netapp.

The hosts are:

The latest episode (42) talks about Data Ontap 9’s new and improved features.

You can subscribe and listen to the podcast via iTunes or Soundcloud: 

Tech Ontap iTunes

Tech Ontap SoundCloud

Netapp Disk Bad Label Version


If you are taking non-zeroed disks from a later version of Netapp Data Ontap and placing them into a system running an earlier version of Data Ontap you may experience the following warning, error and critical messages in the log:

WARNING monitor.globalStatus.nonCritical: Disk on adapter 0b, shelf 28, bay 1, label version. Disk on adapter 0b, shelf 28, bay 0, label version. Disk on adapter 0b, shelf 27, bay 0, label version.

ERROR raid.config.disk.bad.label.version: Disk 0b.28.1 Shelf 28 Bay 1 [NETAPP   X308_WKOJN03TSSM NA00] S/N [WD-WCC132173962] has an unsupported label version

ERROR raid.assim.disk.badlabelversion: Disk 0b.28.1 Shelf 28 Bay 1 [NETAPP   X308_WKOJN03TSSM NA00] S/N [WD-WCC132173962] has raid label with version (13), which is not within the currently supported range (5 – 12). Please contact NetApp Global Services

CRITICAL callhome.dsk.label.v: Call home for DISK BAD LABEL VERSION

Issuing a disk show -broken command displays the following output:

::> disk show -broken

Disk            Outage Reason   HA  Shelf  Bay  Chan  Pool   Type  RPM   Usable Size  Physical Size
NODE1:0b.27.0   label version   0b  27     0    A     Pool0  BSAS  7200  2.42TB       2.43TB

The following table illustrates the Data Ontap Version and its corresponding RAID Label Version

Data Ontap Version Associated RAID Label Version
6.2x – 6.4x       Raid Label Version 5
6.5x       Raid Label Version 6
7.0x – 7.1.x       Raid Label Version 7
7.2       Raid Label Version 8
7.3       Raid Label Version 9
8.0x       Raid Label Version 10
8.1       Raid Label Version 11
8.2       Raid Label Version 12
8.3       Raid Label Version 13

Solution

Data Ontap versions up to and including 8.0.1 – the disk must be moved back to a system running the same or a later version of Data Ontap than the source system. Mark it as a spare and zero the disk. Once that is complete you can then move the disk to other versions of Data Ontap.

Data Ontap versions 8.0.2 and later – issue ::> disk unfail -s followed by the disk name. The disk will become an unzeroed spare.
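For the 8.0.2-and-later case on a 7-mode system, the sequence looks roughly like the following; the disk name is taken from the log output above, disk unfail requires advanced privilege, and exact options can vary by release, so treat this as a sketch:

filer> priv set advanced
filer*> disk unfail -s 0b.28.1
filer*> disk zero spares (optional - zeroes the spare so it is immediately usable)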

For full details, the KB article for this workaround is:

https://kb.netapp.com/support/index?page=content&id=2011366&actp=LIST

 


Cisco Nexus 9K are not FCoE NPIV core switches


The Cisco Nexus 9Ks have been out for a little while now; however, it is only recently that support for FCoE has come to these switches.

However, the support for FCoE is only as an NPV edge switch, meaning that it basically passes FCoE traffic through to another FCoE switch. The other FCoE switch could be a pair of Nexus 5Ks or Cisco UCS Fabric Interconnects.

FCoE NPV-edge-only mode on the Nexus 9K has been confirmed with Cisco and is true as of NX-OS release 7.0(3)I4(1).

Hopefully they release NPIV core switch mode in the future.

Cisco Nexus NPV Edge Example

In the example below, we can see that the Nexus 9K switches are in NPV mode and simply allow the Netapp Storage to connect via FCoE. The FCoE traffic is then passed to 2 Cisco UCS Fabric Interconnects (FC Switch Mode, not in End-Host Mode) where all the zoning is done.

Cisco Nexus 9K FCoE
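For completeness, putting a Nexus 9K into FCoE NPV mode starts with enabling the feature set, roughly as below. This is an assumption from memory rather than a verified config, so check the Cisco NX-OS 7.0(3)I4(1) FCoE NPV configuration guide for the full procedure (VSANs, vFC interfaces and uplinks still need to be defined):

N9K(config)# install feature-set fcoe-npv
N9K(config)# feature-set fcoe-npv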

Netapp Ontap 9 Volume Rehost

$
0
0

 

If you have been working with Netapp Clustered Data Ontap for a while now, you would no doubt have performed quite a few volume moves. However, one limitation of the volume move command was that you couldn’t move a volume between storage virtual machines (SVMs).

Let’s Welcome the Volume Rehost Feature of Ontap 9

Netapp Ontap 9 Welcome

Starting with Ontap 9 we can now move a volume between storage virtual machines (SVMs). But… unlike the volume move command, this is an offline operation that requires the volume to be unmounted from the namespace and any LUNs to be unmapped.

Let’s walk through an example.

Ontap 9 Volume Rehost

In this example I have 2 SVMs, one called SVM3 and the other called SVM4. Within SVM3 I have a volume called NFS2, which I will be moving (or, in Ontap terms, rehosting) to SVM4.

Before we begin, if this volume is in use by virtual machines, CIFS shares, LUN attachments, etc., you will want to shut down and disconnect all sessions, as we will be unmounting the volume from the namespace. If there is a LUN within this volume, the LUN will be unmapped from the initiator.

You can see the list of volumes I have in both SVM’s from the screen shot below:

Netapp Ontap 9 Volume Rehost

My NFS2 volume is mounted within the SVM namespace with the junction path /NFS2. Next we will unmount it with the volume unmount command.

Netapp Ontap 9 Volume Rehost

We are now ready to run the volume rehost command. The command is quite simple, taking the source vserver, the destination vserver and the volume we would like to rehost.

Netapp Ontap 9 Volume Rehost

Now that the volume has been moved to our destination SVM successfully, we can see it appear in the list of volumes with the volume show command. The volume does not get mounted automatically within the namespace on the destination SVM, so to mount it we can simply use the volume mount command, as seen in the example screenshot below. Once the volume has been mounted you can verify the junction path with the volume show command, displaying the field junction-path.

Netapp Ontap 9 Volume Rehost
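Pulling the screenshots together, the whole operation boils down to a handful of commands. The sketch below uses this example’s names (SVM3, SVM4, NFS2) and the /NFS2 junction path; exact parameter names may differ slightly between Ontap 9 releases:

::> volume unmount -vserver SVM3 -volume NFS2
::> volume rehost -vserver SVM3 -volume NFS2 -destination-vserver SVM4
::> volume mount -vserver SVM4 -volume NFS2 -junction-path /NFS2
::> volume show -vserver SVM4 -volume NFS2 -fields junction-path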

Netapp Ontap 9 Performance Comparison


Ontap 9 is still relatively new, having been released only a few weeks ago, and with all the hype around the new features and performance gains over previous versions, I really wanted to see this for myself.

Netapp Ontap 9 Performance Lab Testing

Netapp Speed Zone

I have taken to the lab with 3 different versions of the operating system:

  • Data Ontap 8.2.3
  • Data Ontap 8.3.2
  • Ontap 9

My lab is set up using the Netapp simulator, and on the back-end I made sure each simulator had dedicated access to 1 SSD drive on my NAS during each Data Ontap version test. This ensured full access to disk resources without any interference from other servers, virtual apps, etc.

In addition to the above, each version of Data Ontap had the following volume efficiency features enabled:

  • In-line Dedupe
  • In-line Compression

The test VMware virtual machine is Windows 2012 R2.

Each of the 8 tests was conducted with Netapp’s SIO tool and consists of the following:

  1. 100% Read, 100% Random, 4K block, access 25MB of the file, run for 60 seconds, execute 4 threads
  2. 100% Write, 100% Random, 4K block, access 25MB of the file, run for 60 seconds, execute 4 threads
  3. 75% Read, 25% Write, 100% Random, 4K block, access 25MB of the file, run for 60 seconds, execute 4 threads
  4. 50% Read, 50% Write, 100% Random, 4K block, access 25MB of the file, run for 60 seconds, execute 4 threads
  5. 25% Read, 75% Write, 100% Random, 4K block, access 25MB of the file, run for 60 seconds, execute 4 threads
  6. 100% Read, 100% Sequential, 4K block, access 25MB of the file, run for 60 seconds, execute 4 threads
  7. 100% Write, 100% Sequential, 4K block, access 25MB of the file, run for 60 seconds, execute 4 threads
  8. 75% Read, 25% Write, 75% random, 25% Sequential, 4K block, access 25MB of the file, run for 60 seconds, execute 4 threads
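For anyone wanting to reproduce the runs, an SIO invocation for test 1 would look roughly like the line below. The argument order (read%, random%, block size, file size, run time, threads, target file) is my recollection of the SIO readme, so treat it as an assumption and check the documentation that ships with the tool:

sio 100 100 4k 25m 60 4 E:\sio_testfile.dat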

While you are not going to see lightning speeds in my lab, you are going to see the percentage performance increases between earlier versions of Data Ontap and Ontap 9.

Let’s begin

Netapp Data Ontap 8.2.3 Performance

Netapp Data Ontap 8.2.3 Performance
100% Read, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.2.3 Performance
100% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.2.3 Performance
75% Read, 25% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.2.3 Performance
50% Read, 50% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.2.3 Performance
25% Read, 75% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, execute 4 threads

 

Netapp Data Ontap 8.2.3 Performance
100% Read, 100% Sequential, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.2.3 Performance
100% Write, 100% Sequential, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.2.3 Performance
75% Read, 25% Write, 75% random, 25% Sequential, 4K block, access 25MB of the file, run 60 seconds, 4 threads

Netapp Data Ontap 8.3.2 Performance

Netapp Data Ontap 8.3.2 Performance
100% Read, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.3.2 Performance
100% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.3.2 Performance
75% Read, 25% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.3.2 Performance
50% Read, 50% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.3.2 Performance
25% Read, 75% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.3.2 Performance
100% Read, 100% Sequential, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.3.2 Performance
100% Write, 100% Sequential, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Data Ontap 8.3.2 Performance
75% Read, 25% Write, 75% random, 25% Sequential, 4K block, access 25MB of the file, run 60 seconds, 4 threads

Netapp Ontap 9 Performance

Netapp Ontap 9 Performance
100% Read, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Ontap 9 Performance
100% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Ontap 9 Performance
75% Read, 25% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Ontap 9 Performance
50% Read, 50% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Ontap 9 Performance
25% Read, 75% Write, 100% Random, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Ontap 9 Performance
100% Read, 100% Sequential, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Ontap 9 Performance
100% Write, 100% Sequential, 4K block, access 25MB of the file, run 60 seconds, 4 threads

 

Netapp Ontap 9 Performance
75% Read, 25% Write, 75% random, 25% Sequential, 4K block, access 25MB of the file, run 60 seconds, 4 threads

Netapp Performance Summary

Now that we have all the data from each of the Netapp Data Ontap versions above, we can now put them into a table and calculate the performance increase between versions.

As you can see below, the largest percentage performance increase was seen between Data Ontap 8.2.3 and 8.3.2; however, the gains between Data Ontap 8.3.2 and Ontap 9 are pretty impressive as well.

Between major releases there are usually quite a few new features introduced. It’s excellent to see that, along with the benefit of additional features, you can also expect quite a significant performance increase.

Netapp Performance Test Summary

Netapp Ontap 9 Data Compaction


One of the buzzwords around Netapp Ontap 9 is the new data compaction feature. Working alongside other storage efficiency mechanisms such as dedupe and compression, data compaction is able to further reduce your data footprint.

How does Netapp Data Compaction Work

Netapp Ontap 9 Compaction

Data compaction is an inline process which takes I/Os that would normally consume multiple 4K blocks and tries to fit them into one physical 4K block. This makes a lot of sense when you break down a 4K block and see how much data is actually consuming it. The actual data might only be consuming 25% of the 4K block, so the remaining 75% of the block is essentially wasted.

If we take an example of the following three 4K blocks containing a certain percentage of data:

  1. 4K block – 25% data
  2. 4K block – 5% data
  3. 4K block – 60% data

With data compaction, we can compact the three blocks above into one 4K block that contains 90% data

The order in which the storage efficiency processes are executed are:

  1. Inline zero-block deduplication
  2. Inline adaptive compression
  3. Inline deduplication
  4. Inline adaptive Data Compaction

The most beneficial use cases for data compaction are very small I/Os and files (<2KB), and larger I/Os with lots of white space.

It is supported on Netapp AFF (enabled by default), Flash Pool and HDD aggregates.

I’ve been able to test this in the lab with the Netapp Ontap 9 sim running an NFS datastore mounted to VMware 5.5.

Netapp Ontap 9 Data Compaction VMware Example

  1. I created a new NFS volume called SVM1_NFS1 and mounted it to my VMware hosts
  2. I enabled dedupe and inline compression on the SVM1_NFS1 volume
  3. Next I needed to enable data compaction on the SVM1_NFS1 volume. To do this I typed:
    1. ::> volume efficiency modify -vserver SVM1 -volume SVM1_NFS1 -data-compaction true
  4. I then storage vmotioned 2 Windows 2012 R2 virtual machines (total size 42GB) to the NFS datastore SVM1_NFS1
  5. During the storage vmotion I watched the compaction savings from within the CLI. To do this you type:
    1. ::> storage aggregate show -fields data-compaction-space-saved
  6. Once the storage vmotion completed, I ran a dedupe scan to get the rest of the space savings.
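For reference, here are the CLI pieces from the steps above collected in one place. The SVM and volume names are from this lab, and the manual dedupe scan in step 6 is assumed to be a volume efficiency start with -scan-old-data, so check the syntax on your release:

::> volume efficiency modify -vserver SVM1 -volume SVM1_NFS1 -data-compaction true
::> storage aggregate show -fields data-compaction-space-saved
::> volume efficiency start -vserver SVM1 -volume SVM1_NFS1 -scan-old-data true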

VMware Volume Efficiency Savings

Netapp Ontap 9 Volume Efficiency Savings

VMware Aggregate Data Compaction Savings

Netapp Ontap 9 Data Compaction Savings

Netapp Ontap 9 Data Compaction SQL Example

Next we’ll look at the space savings on a different kind of workload, this time being Microsoft SQL. I have gone ahead and created a new aggregate called aggr2 and a new volume called SVM1_NFS2. This new volume has dedupe and inline compression enabled.

Within VMware I used storage vmotion to move the data disk containing all the SQL databases and log file to the NFS datastore called SVM1_NFS2. Once the storage vmotion completed, I then went ahead and ran a manual dedupe on the volume.

Below we can see the space savings on the SQL database workload

SQL Volume Efficiency Savings

Netapp Ontap 9 SQL Storage Efficiency

SQL Aggregate Data Compaction Savings

Netapp Ontap 9 SQL Data Compaction Savings

As we can see in the SQL aggregate data compaction savings, we are looking at nearly a 50% increase in compaction savings on SQL databases compared to VMware virtual machines. Why, you ask? SQL databases can contain a fair amount of white space, which is an excellent candidate for data compaction.

Technical Reference 4476 – Netapp Data Compression, Deduplication and Data Compaction

Netapp 7MTT iSCSI LUN Migration Example


The Netapp 7MTT software helps you migrate from Data Ontap 7-Mode to Clustered Data Ontap via an easy-to-use, step-by-step wizard.

In this tutorial we will walk through migrating an iSCSI LUN, attached to a Windows 2012 R2 server and containing an SQL database, from 7-Mode to Clustered Data Ontap 8.3.2.

The latest version of the Netapp 7MTT software is version 3 and can be found here:

Netapp 7MTT v3

If you are familiar with previous versions, where you were required to install either Copy-Based Transition or Copy-Free Transition separately, you will be happy to know that in version 3 this is no longer the case: the one installation contains both CBT and CFT.

Netapp 7-Mode Transition Tool (7MTT)

After installing the 7-Mode Transition Tool, click on the icon to launch it.

Netapp 7-Mode Transition Tool 7MTT

Click on Storage Systems at the top and enter your 7-mode node IP followed by the username and password. Click Add.

Netapp 7-Mode Transition Tool 7MTT

Enter in your Cluster IP, username and password for the c-mode system and click Add.

Netapp 7-Mode Transition Tool 7MTT

Click on the Home tab at the top left and select Copy-Based Transition. Click on the Start Planning button.

Netapp 7-Mode Transition Tool 7MTT

Your 7-mode and c-mode controllers are already populated. Click Next.

Netapp 7-Mode Transition Tool 7MTT

Select the source 7-mode volume by clicking on the box under the Transition as stand-alone column. Click Next.

Netapp 7-Mode Transition Tool 7MTT

You will receive a warning; as we are migrating to a c-mode system running 8.3.2, we can click OK.

Netapp 7-Mode Transition Tool 7MTT

Give the project a name and you can either create a new group or use the default. This is useful to group projects together. Click Save.

Netapp 7-Mode Transition Tool 7MTT

Your 7-mode management IP will already be pre-populated here. Double check to make sure it is the correct IP. Click Next.

Netapp 7-Mode Transition Tool 7MTT

In this window we will select the target SVM on the Netapp c-mode system. In my example the SVM is called SVM1. Ensure that the box is ticked next to your destination SVM. Click Next.

Netapp 7-Mode Transition Tool 7MTT

We can now create the destination volume on the Netapp c-mode system. Select the destination aggregate and type in a Target Volume Name. I like to use the Clustered Data Ontap mount policy and recommend you select the same; otherwise it will append /vol to your namespace mount points. In the example below, I show you what the Target Volume Path would look like if you select Preserve 7-Mode mount paths.

Netapp 7-Mode Transition Tool 7MTT

In this example, I show you what it looks like with the Clustered Data Ontap volume name mount policy. As you can see it’s much cleaner. Click Next.

Netapp 7-Mode Transition Tool 7MTT

Now I will create a new iSCSI lif on the Clustered Data Ontap system. Click Save once you are done.

Netapp 7-Mode Transition Tool 7MTT

We are now going to configure the replication schedule. This is important as it gives us the flexibility to either cut over straight away or continue replicating the data until we are ready to cut over. In my lab I have accepted the defaults. Click Create.

Netapp 7-Mode Transition Tool 7MTT

Here are a bunch of options that we can migrate across from 7-mode to c-mode. I have unticked most of them as I’m only interested in SAN and Snapshot schedules. When you are finished click Save and go to Dashboard.

Netapp 7-Mode Transition Tool 7MTT

We can now run a pre-check before any configuration is done. Click Run Precheck.

Netapp 7-Mode Transition Tool 7MTT

I have a few errors that have popped up that I need to fix before moving on. Untick Warning and Informational at the bottom to only see Errors.

Netapp 7-Mode Transition Tool 7MTT

First up we’ll create the Intercluster LIFs on the Clustered Data Ontap system.

Netapp 7-Mode Transition Tool 7MTT

Next we’ll allow SVM1 to snapmirror from our Data Ontap 7-mode system.

Netapp 7-Mode Transition Tool 7MTT

Lastly, we’ll turn the Netapp 7-mode volume option no_i2p to off on our volume iscsi.

Netapp 7-Mode Transition Tool 7MTT
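For reference, the three fixes above map to commands along these lines. The names, ports and IPs are placeholders from this lab, and the exact options vary between Ontap releases, so treat this as a sketch rather than a copy-paste recipe:

DestinationCluster::> network interface create -vserver DestinationCluster -lif ic01 -role intercluster -home-node node01 -home-port e0c -address 10.0.0.50 -netmask 255.255.255.0
DestinationCluster::> vserver peer transition create -local-vserver SVM1 -src-filer-name 7modefiler
7modefiler> options snapmirror.access host=10.0.0.50
7modefiler> vol options iscsi no_i2p off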

We can now re-run the precheck, and as you can see it is now successful.

Netapp 7-Mode Transition Tool 7MTT

The Run Precheck circle is now green and we can move on to starting the baseline.

Netapp 7-Mode Transition Tool 7MTT

Click Start Baseline. We have fixed up all the warnings so we can safely click Yes.

Netapp 7-Mode Transition Tool 7MTT

The baseline snapmirror is now transferring.

Netapp 7-Mode Transition Tool 7MTT

We can also see this via the Netapp Clustered Data Ontap command line by typing in snapmirror show.

Netapp 7-Mode Transition Tool 7MTT

Now it comes time to apply the configuration. We have an option to Test the destination and ensure everything is working. The Test Mode involves a volume clone process in the back-end. In my lab I will move straight onto clicking Apply Configuration.

Netapp 7-Mode Transition Tool 7MTT

After clicking Apply Configuration, the Apply Configuration (Precutover) window pops up. I will untick the Apply configuration in test mode option and click Continue.

Netapp 7-Mode Transition Tool 7MTT

The next pop-up window lists all the steps that were involved in the pre-cutover phase.

Netapp 7-Mode Transition Tool 7MTT

Snapmirror has completed another update.

Netapp 7-Mode Transition Tool 7MTT

We can now click on Complete Transition. A warning window pops up asking you to ensure you have addressed all warnings before continuing. Click Yes.

Netapp 7-Mode Transition Tool 7MTT

One last window pops up giving you the option to take the source volume offline once the transition is complete. I will leave this selected. Don’t click Continue yet.

Netapp 7-Mode Transition Tool 7MTT

I have connected via remote desktop to my SQL server. Shut down all the SQL services and disconnect from the Netapp 7-mode iSCSI lun.

Netapp 7-Mode Transition Tool 7MTT

We can now click Continue on the Complete Transition window. Once the transition is complete you will see a summary of steps along with an Operation Status of Successful.

Netapp 7-Mode Transition Tool 7MTT

All phases of the transition are now green. We can sign out of the Netapp 7MTT tool.

Netapp 7-Mode Transition Tool 7MTT

Back over to our SQL server, I will go back into the iSCSI initiator and add the target IP address of the Netapp Clustered Data Ontap LIF.

Netapp 7-Mode Transition Tool 7MTT

Within Disk Manager we can see the disk appear as offline. Right click the disk and select Online.

Netapp 7-Mode Transition Tool 7MTT

The disk is now online as E: and we can see our SQL database files within the drive.

Netapp 7-Mode Transition Tool 7MTT

One last check within Microsoft SQL Management Studio. The vum database is online and healthy. Thank you Netapp 7MTT.

Netapp 7-Mode Transition Tool 7MTT

 

FileManager.makeDirectory VMware vCenter Error


The other day I was working on a customer’s VMware vCenter server along with some Netapp storage, which was NFS mounted to each ESXi server.

I noticed that the datastore appeared to be read-only, i.e. when I browsed the datastore within either the Web Client or vSphere Client, I could not create any new folder or file. However, if I logged into an ESXi host via the CLI and tried to create a file or folder within the datastore, I could.

Another test I ran was to create a new virtual machine, which was successful, so surely the NFS permissions were correct on the storage.

I ran through all the storage settings just to double check the NFS permissions and they were all correct.

I then proceeded to have a look at the vpxd.log file and saw the following:

8C2B4C36-000000A8-c5] [VpxLRO] -- ERROR task-internal-1676 --  -- vim.FileManager.makeDirectory: vim.fault.CannotCreateFile:

--> Result:
--> (vim.fault.CannotCreateFile) {
-->    dynamicType = ,
-->    faultCause = (vmodl.MethodFault) null,
-->    file = "[datastore1] New Folder",
-->    msg = "Received SOAP response fault from []: MakeDirectory
--> ",
--> }
--> Args:
-->

2016-11-18T15:28:18.457+11:00 [05552 info 'commonvpxLro' opID=4125134] [VpxLRO] -- BEGIN task-internal-1677 --  -- vmodl.query.PropertyCollector.retrieveContents -- ff59bd5c-aee1-7ce0-09e8-19b49ddb1d81(5272807c-a4dc-b3f0-ae5c-1c5ac91f2578

It appears the error is specific to vCenter. I ran one last test, which was to connect the vSphere Client directly to the ESXi host and attempt to create a new file or folder within the datastore. This test was successful.

Looking through the customer’s setup, I found that the same datastores had been mounted twice: once in one VMware cluster and again in another cluster, both running under the same vCenter.

The fix was quite easy: once I unmounted the datastores from the second cluster, I could successfully write to the datastore.

Netapp csm.sessionFailed:warning


I was looking through a bunch of syslog messages for our c-mode systems the other day and came across a repeated message which was:

csm.sessionFailed:warning

It turns out this is Bug ID 987327, which has the following description and explains that this message can be ignored:

Cluster Session Manager (CSM) sessions that are created for SnapMirror updates remain idle and are deleted after the timeout with the following warning: CsmMpAgentThread: csm.sessionFailed:warning]: Cluster interconnect session ….. failed with record state ACTIVE and error CSM_CONNABORTED. This warning does not convey any functional impact and can be ignored.

The fix to stop the message from logging is to upgrade to Ontap 9.1RC1.
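If you want to confirm you are seeing the same message before planning the upgrade, filtering the event log by the message name from the bug description (a quick sketch) will list the occurrences:

::> event log show -message-name csm.sessionFailed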


How to migrate an SQL Cluster from Netapp 7-mode to c-mode


In this video tutorial, I dive into my lab and guide you through the steps involved to successfully migrate a Microsoft SQL cluster from a Netapp 7-Mode system to Clustered Data Ontap, utilizing the Netapp 7-Mode Transition Tool.

I have 2 Microsoft Windows 2012 R2 servers configured with Microsoft SQL clustering. Each server is connected to the Netapp 7-mode system via iSCSI, on which there are 4 LUNs:

  • Databases
  • Logs
  • Quorum
  • Install

The objective of this lab is to migrate all 4 luns above from 7-mode to Clustered Data Ontap using the Netapp 7MTT tool.

Below is the Network Diagram illustrating how this lab is setup:

Netapp 7MTT SQL Lab

 

Netapp 7MTT: How to Migrate a Microsoft SQL Cluster from 7-mode to C-mode

 

Netapp Altavault Data Throughput Reduces Randomly


We have come across a very bizarre issue whereby front-end data throughput to a Netapp Altavault appliance randomly drops until the services are restarted.

It took about a month of troubleshooting with Netapp Support to pinpoint the issue, which finally came down to disabling TOE (TCP Offload Engine) settings on the Altavault.

The Netapp Altavault is connected to two Cisco Nexus 5548 switches, and while TOE is supposed to help increase speed, in this case it does quite the opposite.

The behavior of the Altavault front-end throughput looks like the following:

Netapp Altavault

The reduced rate will continue until the Altavault services are restarted.

If you are experiencing similar behaviour you will need to log a support case with Netapp and ask them to assist you in disabling the TOE settings for your data interfaces. Data interfaces include stand-alone interfaces and interfaces that are members of a vif (port-channel).

Disabling TOE settings on Altavault Data Interfaces

SSH to the Altavault device and enter enable mode:

# cli challenge generate (Generate a challenge code for Netapp)
# cli challenge response (Enter in Netapp challenge response code)
@> _shell (enter the Altavault shell)
@> ethtool -k e0a (check tso, gso, gro, lro settings on interface e0a)
@> ethtool -K e0a tso off gso off gro off lro off (disables offload settings for e0a)
@> mount (check if /dev/loop- on / is ro)
@> mount -o remount,rw / (make / rw)
@> mount | grep loop0 (check / is now rw)
@> vi /etc/rc.local (edit rc.local and enter the 2 commands below, which disable TOE on e0a and e0c)
/usr/sbin/ethtool -K e0a tso off gso off gro off lro off

/usr/sbin/ethtool -K e0c tso off gso off gro off lro off

Add the 2 lines above into rc.local below the line touch /var/lock/subsys/local but above the line /sbin/kernel_crashdump.sh, as shown in the sketch below.
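Putting that together, the relevant portion of /etc/rc.local ends up looking roughly like this; only the two ethtool lines are added, and the surrounding lines are the existing ones referred to above:

touch /var/lock/subsys/local
/usr/sbin/ethtool -K e0a tso off gso off gro off lro off
/usr/sbin/ethtool -K e0c tso off gso off gro off lro off
/sbin/kernel_crashdump.sh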

@> mount -o remount,ro / (make / ro)

No restart is required.

After disabling TOE on the data interfaces you will instantly see the front-end throughput return to normal.

Netapp Altavault

I’d be interested to know if anyone else is experiencing this issue on switches other than Cisco; please leave a comment.

Veeam the given key was not present in the dictionary


I came across this error message the other day when configuring a Veeam replication job with separate virtual networks: “The given key was not present in the dictionary”.

The error appeared within the Replication Job after selecting Separate Virtual Networks, and clicking Next until I got to the point where you select Source and Destination Networks.

Let’s take a look with a few screen shots.

Veeam Replication Job – The given key was not present in the dictionary

Within the Veeam Replication Job, I’ve selected “Separate virtual networks”

Veeam Replication

At step number 4 (Network), I have selected the source network successfully

Veeam Replication

When I click on Browse for the Target Network, and try to expand into the VMware ESXi host, the error message pops up “The given key was not present in the dictionary”

Veeam Replication

The issue is not actually with Veeam but with permissions on the distributed switch. A dedicated Veeam user called veeamrepl had been created (used specifically for Veeam Replication) with the necessary role permissions, but applied only to the port group called Internal.

Veeam Replication

If we take a look at the permissions that have been applied to the distributed switch, we can see that our veeamrepl user has no permissions applied, which is why we are getting the “The given key was not present in the dictionary” error.

Veeam Replication

To add the veeamrepl user with the appropriate permissions to the vmlab-dswitch, we either have to apply the permission at the datacenter level with the propagate setting enabled, or create a new network folder and move the distributed switch into it. After moving the switch into the folder, we can then apply the permission at the folder level (with the propagate setting enabled), which will propagate down to the distributed switch.

Right click on the datacenter and select All vCenter Actions – New Network Folder

Veeam Replication

Now that we’ve created a folder called switches, we can select the Manage tab followed by the Permissions button. Click on the green plus sign and add the user and role you wish to assign to your distributed switch.

Veeam Replication

 

As you can see by clicking on the vmlab-dswitch, the veeamrepl account and its permissions have propagated down.

Veeam Replication

 

Now when we go back into the replication job and browse the target network, we can see the list of networks available.

Veeam Replication

The vSphere server with UUID was not found on this cloud instance


I was creating some PowerCLI scripts to help import virtual machines from vCenter into vCloud Director the other day, when I came across this error with the Import-CIVApp cmdlet:

PowerCLI import-civapp error

The vSphere server with UUID ‘D618CD0A-D7B7-45FO-ABDS-7BA1E9EF9EFC’ was not found on this cloud instance.

vmware-powercli-import-civapp

I’m using PowerCLI 6.5 RC1 for my scripting and was able to confirm via logging a VMware support ticket that this issue is currently known and will be resolved in the next PowerCLI release.

PowerCLI error cause

The cause of the problem is that PowerCLI sees the UUID in upper case, whereas the UUID is actually stored in lower case within the vCenter database.

I’ll update the post once the new release of PowerCLI is available and confirm whether the issue is resolved or not.

