
NFS soft mount vs hard mount

Started by Gerald Brandt, 25 September 2010 - 04:04 AM
16 replies to this topic

Gerald Brandt Members

Gerald Brandt
  • 284 posts

Posted 25 September 2010 - 04:04 AM

Hi,

I notice XenServer mounts the NFS drives with the soft option. Is there a reason for using soft instead of hard?

I've read that using NFSv3 with a hard mount and async drops the chance of silent data corruption to zero.
http://forums11.itrc.hp.com/service/forums/questionanswer.do?admit=109447626+1265529583347+28353475&threadId=1139502

Gerald



Tobias Kreidl CTP Member

Tobias Kreidl
  • 18,179 posts

Posted 25 September 2010 - 01:58 PM

It's a trade-off, of course. See for example http://tldp.org/HOWTO/NFS-HOWTO/client.html paragraph 4.3.1. Having control over the process vs. having better data integrity is the balance. If the server or either end crashes entirely, any cached data will still be lost; even a hard mount isn't 100% reliable under dire circumstances.
--Tobias



Gerald Brandt Members

Gerald Brandt
  • 284 posts

Posted 25 September 2010 - 02:41 PM

My understanding, based on the link I gave in my first message, is that NFSv3 has a 'commit' reply, and the client is not supposed to clear its cache until it gets the reply:

With NFS version 2 this was a concern, but not with NFS version 3.

NFS v3 was specifically designed to eliminate this problem. NFS v3 has something called "Safe Asynchronous Writes" where the client and server exchange verifiers with each write request and the client keeps a copy of the data in its cache until it receives confirmation from the server that the data has been posted to stable storage. If the server crashes in the middle of a write, once the server reboots it will advertise a different verifier to the client and the client will know it needs to re-send its cached copy of the data to the server. This will continue until the server replies that all data is safely on stable storage.
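To make that concrete, here is a minimal Python sketch of the client-side logic, purely as an illustration: the class and method names are invented, and in reality this lives inside the kernel NFS client, not in user space.

# Illustrative sketch of NFSv3 "safe asynchronous writes" from the client's
# point of view. All names are made up for clarity.
class Nfs3ClientCache:
    def __init__(self, server):
        self.server = server
        self.pending = []            # UNSTABLE writes still held in the client cache
        self.write_verifier = None   # verifier returned with each WRITE reply

    def write(self, offset, data):
        # UNSTABLE write: the server may only have the data in memory so far.
        reply = self.server.write_unstable(offset, data)
        self.write_verifier = reply.verifier
        self.pending.append((offset, data))

    def commit(self):
        # COMMIT asks the server to flush to stable storage and return its
        # current verifier.
        while self.pending:
            reply = self.server.commit()
            if reply.verifier != self.write_verifier:
                # The server rebooted (and lost its cache) between the WRITEs
                # and this COMMIT: re-send everything we still have cached.
                self.write_verifier = reply.verifier
                for offset, data in self.pending:
                    self.server.write_unstable(offset, data)
            else:
                # Verifier matches: the data is on stable storage, so the
                # cached copies can finally be dropped.
                self.pending.clear()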

Gerald



Tobias Kreidl CTP Member

Tobias Kreidl
  • 18,179 posts

Posted 25 September 2010 - 03:03 PM

I'd want to see this really work under controlled circumstances. We ran mail servers with hard mounts and still saw corruption occasionally (under NFS 3). Seriously, I'd make a little test environment and see what really happens if a server crashes (including a full reboot) and whether or not things recover. Needless to say, I'm not the biggest NFS fan, but it supposedly is a lot better now than it used to be (we ran Solaris 8 and there were issues there with how the TCP drivers were written, so that may have been part of it).

The other thing is that with a hard mount, if the process hangs, it's hard or impossible to kill it.

Edited by: Tobias K on Sep 25, 2010 11:04 AM



Colin Hutcheson Members

Colin Hutcheson
  • 4 posts

Posted 26 October 2010 - 12:54 AM

I have been scratching my head over this as well; I can't find anything to substantiate the use of soft mounts for virtual disks over NFS.

With an NFS soft mount the error is returned to the application/process. What does the XenServer PBD layer do when it sees an error returned to it from NFS, directly or indirectly (TCP timeout)?

-----------------
|   XenServer   |
|      VDI      |
|      SR       |
|      VBD      |
-----------------
|    Kernel     |
|      NFS      |
|      TCP      |
-----------------

http://docstore.mik.ua/orelly/networking_2ndEd/nfs/ch18_02.htm

"If write operations fail, data consistency on the server cannot be guaranteed. The write error is reported to the application during some later call to write() or close(), which is consistent with the behavior of a local filesystem residing on a failing or overflowing disk. When the actual write to disk is attempted by the kernel device driver, the failure is reported to the application as an error during the next similar or related system call. A well-conditioned application should exit abnormally after a failed write, or retry the write if possible. If the application ignores the return code from write() or close(), then it is possible to corrupt data on a soft-mounted filesystem. Some write operations may fail and never be retried, leaving holes in the open file."
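In practice that means whatever sits above a soft mount has to check every return code on the write path. A small sketch in Python, with a made-up path on a soft-mounted share:

import os

# Hypothetical file on a soft-mounted NFS share.
path = "/mnt/nfs-soft/some-file"

fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
try:
    os.write(fd, b"important data\n")
    os.fsync(fd)   # push the data to the server before trusting it is there
except OSError as e:
    # On a soft mount, a timed-out request surfaces here (EIO/ETIMEDOUT).
    # A well-conditioned application retries or aborts instead of carrying
    # on as if the write succeeded.
    print("NFS write failed:", e)
    raise
finally:
    os.close(fd)   # note that close() itself can also report a deferred error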



Tobias Kreidl CTP Member

Tobias Kreidl
  • 18,179 posts

Posted 26 October 2010 - 01:12 AM

Also with a hard mount -- at least with older NFS versions -- if you wanted to umount the device and it was hung, it was nearly impossible without either rebooting the client or at least stopping NFS. Even pkill or fuser didn't work. I've heard that hard mounts at least used to be considered essential for things like mail storage areas where there can be a huge amount of frequent I/O and chances for file corruption are great. For most other applications, I personally prefer soft mounts and have not seen data corruption issues in a very long time. YMWV.
--Tobias



Colin Hutcheson Members

Colin Hutcheson
  • 4 posts

Posted 26 October 2010 - 07:36 PM

Yeah, it depends on your environment: how well groomed it is, components, revisions, error rates.

But "soft" is at the bottom of the list for NFS, if we could get the reasoning behind why its set as default then it could be understood, as we cant find that info we can only use external NFS collateral to make decisions on.

My "quick and dirty" defaults would be:

single Xen node
rw,tcp,bg,hard,intr,sync,nolock,rsize=32768,wsize=32768,timeo=600,retrans=2,_netdev

n+ nodes Xen pool
rw,tcp,bg,hard,intr,sync,noac,rsize=32768,wsize=32768,timeo=600,retrans=2,_netdev

ISOs
rw,tcp,fg,hard,intr,async,rsize=32768,wsize=32768,timeo=600,retrans=3

These are based on the IO type and data type:
sync - I need to have NFS FSYNC writes as it's a virtual disk
TCP - more resilient to a wider scope of errors, speed not a problem with 10g
Hard - I (the stack) need to know about the NFS errors immediately
_netdev - to prevent "hung on boot" syndrome

** Interestingly the ISO defaults on another thread were UDP, again the reverse of what we would expect.

It looks to me like these are arbitrary defaults derived from a specific environment; some notes in the admin guide would have gone a long way ;)
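For what it's worth, here is a rough Python sketch of applying the "single Xen node" option set above by hand. The server name, export and mount point are made up, and this is only an illustration of the options, not how XenServer itself attaches an SR:

import subprocess

# Hypothetical server, export and mount point, for illustration only.
SERVER = "nfs-filer.example.com"
EXPORT = "/vol/xenserver_sr"
MOUNTPOINT = "/mnt/sr-test"

# The "single Xen node" option set from the list above.
OPTIONS = ("rw,tcp,bg,hard,intr,sync,nolock,"
           "rsize=32768,wsize=32768,timeo=600,retrans=2,_netdev")

# Equivalent to: mount -t nfs -o <options> server:/export /mountpoint
subprocess.run(
    ["mount", "-t", "nfs", "-o", OPTIONS,
     "%s:%s" % (SERVER, EXPORT), MOUNTPOINT],
    check=True,   # raise if the mount fails
)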



Tobias Kreidl CTP Member

Tobias Kreidl
  • 18,179 posts

Posted 26 October 2010 - 07:59 PM

Exactly, it's so configurable, depending on what level of paranoia or tolerance you're willing to deal with.
Do you get pretty optimal I/O with the buffer sizes set to 32768?
--Tobias



Colin Hutcheson Members

Colin Hutcheson
  • 4 posts

Posted 28 October 2010 - 10:37 PM

Well, not really, as it's the default value and the "advanced params" when mounting the SR are not documented.
If I could validate that I would not break something higher up the stack, I would change it; as I can't, I won't.

There's nothing paranoid about feeling concerned about soft mounts; it's well documented that if that host fails, you can lose the writes.

Buffer sizes..

An NFS server with enough write cache will ack that write as soon as it hits the cache**; I don't have to wait for a media-level commit. I would have to wait for the reads to be serviced, as it may be a cold cache or a full VM reboot, so 32K may give me a better overall response time (TBD).

"**" A 4K FSYNC write will be sent as-is, so the 32K is a no-op.



Tobias Kreidl CTP Member
  • #10

Tobias Kreidl
  • 18,179 posts

Posted 28 October 2010 - 11:30 PM

Interesting -- I generally use 4K buffers, as beyond that point it seems like the law of diminishing returns sets in.
It depends, of course, on the nature of the data and also on how your disk device is blocked.
Thanks,
--Tobias



Martín Lorente Members
  • #11

Martín Lorente
  • 13 posts

Posted 14 March 2011 - 04:49 PM

Hello everybody. I am looking for a way to set NFS mount options on my XenServer 5.5. A cat of /proc/mounts shows me rsize and wsize = 65536. How can I set them to 32768? I cannot find the correct "xe" command for it; maybe I must modify a file? I have found /opt/xensource/sm/nfs.py, but I'm not sure if modifying line 64, where it puts some options, is the right way to do this.

By the way, do you know if the relatime option is available on version 5.5?

Thanks in advance.



Tobias Kreidl CTP Member
  • #12

Tobias Kreidl
  • 18,179 posts

Posted 14 March 2011 - 05:01 PM

If you specify these values on the client mount command, it should negotiate the connection down to those values on its own; no need to change anything on the server side:
mount -o rsize=32768,wsize=32768 etc. ....
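One quick way to confirm what actually got negotiated is to read /proc/mounts afterwards; a small sketch (nothing XenServer-specific about it):

# Print the rsize/wsize actually in effect for each NFS mount, straight
# from /proc/mounts. Deliberately simple parsing; sketch only.
with open("/proc/mounts") as f:
    for line in f:
        device, mountpoint, fstype, options = line.split()[:4]
        if fstype.startswith("nfs"):
            opts = dict(o.split("=", 1) if "=" in o else (o, "")
                        for o in options.split(","))
            print(mountpoint, "rsize =", opts.get("rsize"),
                  "wsize =", opts.get("wsize"))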
--Tobias



Martín Lorente Members
  • #13

Martín Lorente
  • 13 posts

Posted 14 March 2011 - 06:43 PM

What I am doing is to SSH to each server when it is empty, umount the unit and mount it again with the same parameters, adding rsize and wsize fixed to 32K. By default they are at 64K. As I can see, the CPU load of my NetApp filer, which is exporting the affected volumes, is quite a bit lower (this was the main problem). What I need now is to fix these values so that XenServer uses them when it reboots. Where must I write them?



Tobias Kreidl CTP Member
  • #14

Tobias Kreidl
  • 18,179 posts

Posted 14 March 2011 - 07:08 PM

I believe it's hard-coded into the kernel, defined by the parameter NFSSVC_MAXBLKSIZE, which is typically found in /usr/include/linux/nfsd/const.h, but I don't see such a configuration file on XenServer 5.5. So I don't think this can be done unless you recompile the kernel.
--Tobias



Martín Lorente Members
  • #15

Martín Lorente
  • 13 posts

Posted 14 March 2011 - 07:44 PM

I was thinking about the /opt/xensource/sm/nfs.py file, where it says

options = "soft,timeo=%d,retrains=%d,proto=%s,noac" % (SOFTMOUNT_TIMEOUT,

adding the rsize and wsize parameters. I will try and report results.
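Presumably something along these lines; this is only a guess at the edited line, and everything other than the option string and SOFTMOUNT_TIMEOUT quoted above is a placeholder so the sketch runs on its own:

# Placeholder values purely so this runs standalone; in nfs.py they come
# from the module itself, and the names besides SOFTMOUNT_TIMEOUT are
# guesses at how the truncated line above continues.
SOFTMOUNT_TIMEOUT = 100
SOFTMOUNT_RETRANS = 3
PROTO = "tcp"

options = ("soft,timeo=%d,retrans=%d,proto=%s,noac,"
           "rsize=32768,wsize=32768") % (SOFTMOUNT_TIMEOUT,
                                         SOFTMOUNT_RETRANS, PROTO)
print(options)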



Tobias Kreidl CTP Member
  • #16

Tobias Kreidl
  • 18,179 posts

Posted 14 March 2011 - 08:11 PM

That may work, but it's best to try it and see whether it overrides the kernel settings or not; see also this earlier thread along the same lines: http://forums.citrix.com/message.jspa?messageID=1407330
--Tobias



Amarnath Reddy Citrix Employees
  • #17

Amarnath Reddy
  • 71 posts

Posted 11 February 2013 - 11:47 PM

I found an article regarding NFS soft mount and hard mount: http://www.expertslogin.com/general/difference-soft-hard-nfs-mount/