Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#269 closed defect (fixed)

When no valid ticket exists (or the server is unavailable), NetDrive hangs waiting for response

Reported by: Lewis Rosenthal Owned by: Paul Smedley
Priority: major Milestone:
Component: Samba Client Version: Client 3.0.x
Keywords: Cc:

Description

Acquire Kerberos ticket & test access to mapped share.

Shut down share server. Access the share (dir). CPU goes high (NDCTL.EXE) and stays high for several seconds after Ctrl-C is hit in the command window.

Attachments (1)

log.smbc (8.5 KB ) - added by Lewis Rosenthal 8 years ago.
log.smbc when NT1 login fails

Download all attachments as: .zip

Change History (31)

comment:1 by Paul Smedley, 8 years ago

logs please

comment:2 by Lewis Rosenthal, 8 years ago

Surely (got called away):

2016/06/24 15:38:08.56: 9 2: NdpRsrcQueryInfo in
2016/06/24 15:38:08.56: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/24 15:38:08.56: 9 2: NdpRsrcQueryInfo in
2016/06/24 15:38:08.56: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/24 15:38:08.56: 9 2: NdpRsrcQueryInfo in
2016/06/24 15:38:08.56: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/24 15:38:08.57: 9 2: NdpRsrcQueryFSAllocate 0285fdd0
2016/06/24 15:38:08.57: 9 2: checkMountResource in tid#2
2016/06/24 15:38:08.57: 1 2: Connecting to \\fred:*********@:kerberos.samba.arcanoae\shared. Master WORKGROUP:1
2016/06/24 15:38:08.57: 1 2: Kerberos Support enabled
2016/06/24 15:39:23.48: 4 2: Connection to kerberos.samba.arcanoae failed (Error NT_STATUS_INVALID_PARAMETER)
2016/06/24 15:39:23.48: 9 2: NdpCreateConnection failed rc=3

15:39:23 may have been the time of Ctrl-C, but CPU did not relax until 30-45 seconds later. I can run some more precise tests with cpu.exe running or something to set a timestamp, if need be.

comment:3 by Lewis Rosenthal, 8 years ago

I did not press Ctrl-C this time. The same condition occurs when we destroy the ticket and then attempt to access the share (shall modify the summary):

2016/06/24 19:42:23.69: 9 2: NdpRsrcQueryInfo in
2016/06/24 19:42:23.69: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/24 19:42:23.69: 9 2: NdpRsrcQueryInfo in
2016/06/24 19:42:23.69: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/24 19:42:23.70: 9 2: NdpRsrcQueryInfo in
2016/06/24 19:42:23.70: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/24 19:42:23.70: 9 2: NdpRsrcQueryFSAllocate 0285fdd0
2016/06/24 19:42:23.70: 9 2: checkMountResource in tid#2
2016/06/24 19:42:23.70: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/24 19:43:38.28: 4 2: Connection to kerberos.samba.arcanoae failed (Error NT_STATUS_INVALID_PARAMETER)
2016/06/24 19:43:38.28: 9 2: smbwrp_connect failed rc=3
2016/06/24 19:43:38.28: 9 2: Calling NdpFreeResource() to clear connection for security purposes
2016/06/24 19:43:38.28: 9 2: dircache_delete: 0x626720
2016/06/24 19:43:38.28: 9 2: NdpFreeResource 0
2016/06/24 19:43:38.29: 9 2: NdpRsrcQueryInfo in
2016/06/24 19:43:38.29: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\\shared@fred
2016/06/24 19:43:38.29: 9 2: NdpRsrcQueryInfo in
2016/06/24 19:43:38.29: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\\shared@fred
2016/06/24 19:43:38.29: 9 2: NdpRsrcQueryInfo in
2016/06/24 19:43:38.29: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\\shared@fred
2016/06/24 19:43:38.29: 9 2: NdpRsrcQueryInfo in
2016/06/24 19:43:38.29: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\\shared@fred
2016/06/24 19:43:38.29: 9 2: NdpQueryPathInfo in [0x626920] <*>
2016/06/24 19:43:38.30: 9 2: NdpFindStart in [0x626920]
2016/06/24 19:43:38.30: 9 2: NdpFindStart: dir [\], dir_mask [*], mask [\*], szPath [*]
2016/06/24 19:43:38.30: 4 2:  smbwrp_echo
2016/06/24 19:43:40.32: 4 2: NdpFindStart, smbwrp_echo rc = -1, reconnecting.....
2016/06/24 19:43:40.32: 1 2: Connecting to \\:\IPC$ using kerberos authentication. Master :0
2016/06/24 19:43:41.41: 4 2: Connection to  failed (Error NT_STATUS_UNSUCCESSFUL)

comment:4 by Lewis Rosenthal, 8 years ago

Summary: When server goes away, and a valid ticket exists, NetDrive hangs waiting for responseWhen no valid ticket exists (or the server is unavailable), NetDrive hangs waiting for response

comment:5 by Lewis Rosenthal, 8 years ago

I no longer see this hang with:

6-25-16 2:58 13,954,000 124 ndpsmb.dll

however, there are lingering effects as described in ticket #268. Also, after a time - whether acquiring a new ticket or not - , NDCTL.EXE will crash when commands are attempted against the unavailable but still mapped share:

[j:\download\os2\samba]md o:\tempdir
SYS0003: The system cannot find the path specified. "O:\tempdir"

[j:\download\os2\samba]dir o:\desktop

 Volume in drive O is NETDRIVE

No new ticket was requested, unlike the recent discussion in ticket #268. POPUPLOG.OS2 says:

06-25-2016  15:42:12  SYS3175  PID 00b5  TID 0002  Slot 00f8
C:\NDFS\NDCTL.EXE
c0000005
1453732a
P1=00000001  P2=6990150a  P3=XXXXXXXX  P4=XXXXXXXX
EAX=00000000  EBX=00000003  ECX=1bff0002  EDX=191bdaa0
ESI=0285fdd0  EDI=69901506
DS=0053  DSACC=f0f3  DSLIM=ffffffff
ES=0053  ESACC=f0f3  ESLIM=ffffffff
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:1453732a  CSACC=f0df  CSLIM=ffffffff
SS:ESP=0053:0285fd48  SSACC=f0f3  SSLIM=ffffffff
EBP=0285fd70  FLG=00012206

NDPSMB.DLL 0001:0000732a

log.ndpsmb to the point of NDCTL.EXE crash says:

2016/06/25 15:40:32.96: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:40:32.96: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/25 15:40:32.96: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:40:32.96: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/25 15:40:32.97: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:40:32.97: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/25 15:40:32.97: 9 2: NdpRsrcQueryFSAllocate 0285fdd0
2016/06/25 15:40:32.97: 9 2: checkMountResource in tid#2
2016/06/25 15:40:32.97: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/25 15:40:32.97: 4 2: Server supports SMB2_10 protocol
2016/06/25 15:40:32.98: 4 2:  session setuping for <fred>/<********> in <>
2016/06/25 15:40:34.59: 1 2: session setup failed: NT_STATUS_INTERNAL_ERROR
2016/06/25 15:40:34.59: 4 2: Anonymous login failed
2016/06/25 15:40:34.59: 9 2: smbwrp_connect failed rc=6
2016/06/25 15:40:34.59: 9 2: Calling NdpFreeResource() to clear connection for security purposes
2016/06/25 15:40:34.59: 9 2: dircache_delete: 0x626720
2016/06/25 15:40:34.59: 9 2: NdpFreeResource 0
2016/06/25 15:40:34.59: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:40:34.59: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\\shared@fred
2016/06/25 15:40:34.59: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:40:34.59: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\\shared@fred
2016/06/25 15:40:34.59: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:40:34.59: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\\shared@fred
2016/06/25 15:40:34.59: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:40:34.59: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\\shared@fred
2016/06/25 15:40:34.59: 9 2: NdpQueryPathInfo in [0x675e00] <*>
2016/06/25 15:40:34.59: 9 2: NdpFindStart in [0x675e00]
2016/06/25 15:40:34.59: 9 2: NdpFindStart: dir [\], dir_mask [*], mask [\*], szPath [*]
2016/06/25 15:40:34.59: 4 2:  smbwrp_echo
2016/06/25 15:40:34.60: 4 2:  smbwrp_filelist
2016/06/25 15:40:34.60: 1 2: Filelist <\*> on master <> wgrp < `b> server <X&g> share <> clidev <(null)>
2016/06/25 15:40:34.60: 4 2:  smbwrp_special_add
2016/06/25 15:40:34.60: 4 2:  smbwrp_dir_add
2016/06/25 15:40:34.60: 8 2: adding <.> 289 395346944 261
2016/06/25 15:40:34.60: 9 2: fname \.
2016/06/25 15:40:34.60: 9 2: size 0
2016/06/25 15:40:34.60: 9 2: mtime 0 Wed Dec 31 19:00:00 1969
2016/06/25 15:40:34.60: 9 2: ftimeLastAccess 19:00:00
2016/06/25 15:40:34.60: 4 2:  smbwrp_special_add
2016/06/25 15:40:34.60: 4 2:  smbwrp_dir_add
2016/06/25 15:40:34.60: 8 2: adding <..> 289 395346944 261
2016/06/25 15:40:34.60: 9 2: fname \..
2016/06/25 15:40:34.60: 9 2: size 0
2016/06/25 15:40:34.60: 9 2: mtime 0 Wed Dec 31 19:00:00 1969
2016/06/25 15:40:34.60: 9 2: ftimeLastAccess 19:00:00
2016/06/25 15:40:34.60: 4 2: NdpFindStart, smbwrp_filelist rc = 87
2016/06/25 15:40:34.60: 9 2: NdpFindStart <*> (\*) cnt 0 rc=87
2016/06/25 15:42:03.27: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:42:03.27: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\@fred
2016/06/25 15:42:03.27: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:42:03.27: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\@fred
2016/06/25 15:42:03.27: 9 2: NdpQueryPathInfo in [0x675e00] <tempdir>
2016/06/25 15:42:03.27: 9 2: dircache_find_path [tempdir]
2016/06/25 15:42:03.27: 9 2: dcFindPath: [][tempdir]
2016/06/25 15:42:03.27: 9 2: dircache: FindPath tempdir not found
2016/06/25 15:42:03.27: 1 2: Connecting to \\tempdir:\IPC$ using kerberos authentication. Master :0
2016/06/25 15:42:04.30: 4 2: Connection to  failed (Error NT_STATUS_UNSUCCESSFUL)
2016/06/25 15:42:04.30: 9 2: NdpQueryPathInfo pathparser for <\> rc=3
2016/06/25 15:42:04.30: 9 2: NdpQueryPathInfo <tempdir> (\) rc=3
2016/06/25 15:42:11.45: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:42:11.45: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\@
2016/06/25 15:42:11.45: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:42:11.45: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\@
2016/06/25 15:42:11.45: 9 2: NdpRsrcQueryInfo in
2016/06/25 15:42:11.45: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\@
2016/06/25 15:42:11.45: 9 2: NdpRsrcQueryFSAllocate 0285fdd0
2016/06/25 15:42:11.45: 9 2: checkMountResource in tid#2
2016/06/25 15:42:11.46: 1 2: Connecting to \\:X&g\IPC$ using kerberos authentication. Master :352736120
2016/06/25 15:42:11.46: 1 2: NTLMv1 Support enabled
2016/06/25 15:42:12.50: 4 2: Connection to X&g failed (Error NT_STATUS_UNSUCCESSFUL)
2016/06/25 15:42:12.50: 9 2: smbwrp_connect failed rc=3
2016/06/25 15:42:12.50: 9 2: Calling NdpFreeResource() to clear connection for security purposes
2016/06/25 15:42:12.50: 9 2: dircache_delete: 0x69901506

(Note the reference to "NTLMv1 Support enabled" - NTLMv1 is specifically *not* enabled on the server, so I'm not sure why we're even trying it, unless this is just noise.)

comment:6 by Paul Smedley, 8 years ago

hmmmmm looks like I still need to find a more elegant way to say the share is unavailable than using NdpFreeResource()

comment:7 by Paul Smedley, 8 years ago

Owner: set to Paul Smedley
Status: newaccepted

I don't see a hang here with http://smedley.id.au/tmp/ndpsmb.zip

Additionally, if we can't connect to a kerberos share, we now report:

debuglocal(4,"did you forget to run kinit?\n");

instead of

debuglocal(4,"Anonymous login failed\n");

This was always the intent, however, there were 2 issues.

1) a bug in ndpsmb 2) - Win2k reports NT_STATUS_INTERNAL_ERROR, whereas the Samba code expects NT_STATUS_MORE_PROCESSING_REQUIRED

Incidentally, http://smedley.id.au/tmp/ndpsmb.zip will now always reflect my latest test build - my update script now zips a copy automatically :)

comment:8 by Lewis Rosenthal, 8 years ago

I see a long pause with high CPU with:

6-26-16 1:20 13,953,664 124 ndpsmb.dll

Log:

2016/06/26 18:05:35.13: 9 2: NdpRsrcQueryInfo in
2016/06/26 18:05:35.13: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/26 18:05:35.13: 9 2: NdpRsrcQueryInfo in
2016/06/26 18:05:35.13: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/26 18:05:35.14: 9 2: NdpRsrcQueryInfo in
2016/06/26 18:05:35.14: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/26 18:05:35.14: 9 2: NdpRsrcQueryFSAllocate 0285fdd0
2016/06/26 18:05:35.14: 9 2: checkMountResource in tid#2
2016/06/26 18:05:35.16: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/26 18:06:50.9: 4 2: Connection to kerberos.samba.arcanoae failed (Error NT_STATUS_INVALID_PARAMETER)
2016/06/26 18:06:50.9: 9 2: smbwrp_connect failed rc=3
2016/06/26 18:06:50.9: 9 2: NdpRsrcQueryInfo in
2016/06/26 18:06:50.9: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/26 18:06:50.9: 9 2: NdpRsrcQueryInfo in
2016/06/26 18:06:50.10: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/26 18:06:50.10: 9 2: NdpRsrcQueryInfo in
2016/06/26 18:06:50.17: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/26 18:06:50.17: 9 2: NdpRsrcQueryInfo in
2016/06/26 18:06:50.17: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/26 18:06:50.20: 9 2: NdpCreateConnection in
2016/06/26 18:06:50.21: 9 2: NdpCreateConnection send CONNECT
2016/06/26 18:06:50.21: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/26 18:08:05.10: 4 2: Connection to kerberos.samba.arcanoae failed (Error NT_STATUS_INVALID_PARAMETER)
2016/06/26 18:08:05.10: 9 2: NdpCreateConnection [0] rc=87

Conditions:

  1. No ticket acquired.
  2. Stopped NDCTL.EXE
  3. Updated ndpsmb.dll to version above.
  4. Started NDCTL.EXE
  5. Ran dir against a Kerberos-authenticated share.

comment:9 by Lewis Rosenthal, 8 years ago

However, if we have a valid TGT and the server has gone away (no way to get a TGS, as we have not requested one between the time we obtained the TGT and the time the Samba server has disappeared), there is no delay:

2016/06/27 01:46:45.20: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:45.22: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:45.22: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:45.22: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:45.22: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:45.22: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:45.23: 9 2: NdpRsrcQueryFSAllocate 0285fdd0
2016/06/27 01:46:45.23: 9 2: checkMountResource in tid#2
2016/06/27 01:46:45.23: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/27 01:46:45.24: 4 2: Server supports SMB2_10 protocol
2016/06/27 01:46:47.0: 1 2: session setup failed: NT_STATUS_INTERNAL_ERROR
2016/06/27 01:46:47.0: 4 2: did you forget to run kinit?
2016/06/27 01:46:47.0: 9 2: smbwrp_connect failed rc=6
2016/06/27 01:46:47.0: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:47.0: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:47.0: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:47.0: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:47.0: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:47.0: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:47.0: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:47.0: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:47.0: 9 2: NdpCreateConnection in
2016/06/27 01:46:47.0: 9 2: NdpCreateConnection send CONNECT
2016/06/27 01:46:47.0: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/27 01:46:47.1: 4 2: Server supports SMB2_10 protocol
2016/06/27 01:46:48.60: 1 2: session setup failed: NT_STATUS_INTERNAL_ERROR
2016/06/27 01:46:48.60: 4 2: did you forget to run kinit?
2016/06/27 01:46:48.60: 9 2: NdpCreateConnection [0] rc=87
2016/06/27 01:46:51.68: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:51.68: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:51.68: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:51.68: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:51.68: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:51.68: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:51.69: 9 2: NdpRsrcQueryFSAllocate 0285fdd0
2016/06/27 01:46:51.69: 9 2: checkMountResource in tid#2
2016/06/27 01:46:51.69: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/27 01:46:51.69: 4 2: Server supports SMB2_10 protocol
2016/06/27 01:46:53.28: 1 2: session setup failed: NT_STATUS_INTERNAL_ERROR
2016/06/27 01:46:53.28: 4 2: did you forget to run kinit?
2016/06/27 01:46:53.28: 9 2: smbwrp_connect failed rc=6
2016/06/27 01:46:53.28: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:53.28: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:53.28: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:53.28: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:53.28: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:53.28: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:53.28: 9 2: NdpRsrcQueryInfo in
2016/06/27 01:46:53.28: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 01:46:53.29: 9 2: NdpCreateConnection in
2016/06/27 01:46:53.29: 9 2: NdpCreateConnection send CONNECT
2016/06/27 01:46:53.29: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/27 01:46:53.29: 4 2: Server supports SMB2_10 protocol
2016/06/27 01:46:55.0: 1 2: session setup failed: NT_STATUS_INTERNAL_ERROR
2016/06/27 01:46:55.0: 4 2: did you forget to run kinit?
2016/06/27 01:46:55.0: 9 2: NdpCreateConnection [0] rc=87

and no crash of NDCTL.EXE, either.

I'm not certain how we can know that the server support SMB2_10, either, as indicated in the log (I do see the new log messages, now, BTW).

comment:10 by Paul Smedley, 8 years ago

TGT/TGS? WTF? :)

smb protocol is determined before logon. cli_connect is called to establish a server connection, then a protocol is negotiated, then logon is attempted - either with credentials or via a kerberos ticket.

OK looks like I can reproduce with: No ticket acquired. Stopped NDCTL.EXE Updated ndpsmb.dll to version above. Started NDCTL.EXE Ran dir against a Kerberos-authenticated share.

Here though, after approx 90 seconds, cpu dropped back to normal and the command prompt returned:

2016/06/27 19:29:08.64: 1 2: Connecting to \\:KERBEROS\IPC$ using kerberos authentication. Master WORKGROUP:1
2016/06/27 19:30:25.86: 4 2: Connection to KERBEROS failed (Error NT_STATUS_INVALID_PARAMETER)
2016/06/27 19:30:25.86: 9 2: smbwrp_connect failed rc=3
2016/06/27 19:30:25.86: 9 2: NdpRsrcQueryInfo in
2016/06/27 19:30:25.86: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\KERBEROS@psmedley
2016/06/27 19:30:25.87: 9 2: NdpRsrcQueryInfo in
2016/06/27 19:30:25.87: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\KERBEROS@psmedley
2016/06/27 19:30:25.87: 9 2: NdpRsrcQueryInfo in
2016/06/27 19:30:25.87: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\KERBEROS@psmedley
2016/06/27 19:30:25.87: 9 2: NdpRsrcQueryInfo in
2016/06/27 19:30:25.87: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\KERBEROS@psmedley
2016/06/27 19:30:25.87: 9 2: NdpCreateConnection in
2016/06/27 19:30:25.87: 9 2: NdpCreateConnection send CONNECT
2016/06/27 19:30:25.87: 1 2: Connecting to \\:KERBEROS\IPC$ using kerberos authentication. Master WORKGROUP:1
2016/06/27 19:31:41.87: 4 2: Connection to KERBEROS failed (Error NT_STATUS_INVALID_PARAMETER)
2016/06/27 19:31:41.87: 9 2: NdpCreateConnection [0] rc=87
Last edited 8 years ago by Paul Smedley (previous) (diff)

comment:11 by Lewis Rosenthal, 8 years ago

Sorry, I meant to say that I was puzzled that we could know the SMB protocol when the server was not there (it was, in fact, offline completely). Cached data? Again, it was not the fact that we did not have a ticket, but rather that we could not contact the server at all.

Okay, so (latest 20160627 dll):

  1. Ticket acquired, NDCTL.EXE started, dir, OK.
  2. Ticket destroyed, NDCTL.EXE still running, dir, OK.
  3. No ticket (see previous), NDCTL.EXE still running, wait a few minutes, dir, SYS0002 (quickly), OK.
  4. No ticket (see previous), NDCTL.EXE stopped, started, dir, OK.

In all cases, CPU was normal. Logs look as expected, e.g., for case 3:

2016/06/27 11:13:22.73: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:13:22.73: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:13:22.73: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:13:22.73: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:13:22.73: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:13:22.73: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:13:22.73: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:13:22.73: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:13:22.74: 9 2: NdpQueryPathInfo in [0x626980] <*>
2016/06/27 11:13:22.74: 9 2: NdpFindStart in [0x626980]
2016/06/27 11:13:22.74: 9 2: NdpFindStart: dir [\], dir_mask [*], mask [\*], szPath [*]
2016/06/27 11:13:22.74: 4 2:  smbwrp_echo
2016/06/27 11:13:22.74: 4 2: NdpFindStart, smbwrp_echo rc = -1, reconnecting.....
2016/06/27 11:13:22.74: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/27 11:13:22.74: 4 2: Server supports SMB2_10 protocol
2016/06/27 11:13:24.34: 1 2: session setup failed: NT_STATUS_INTERNAL_ERROR
2016/06/27 11:13:24.34: 4 2: did you forget to run kinit?

and then for case 4 (where no previous connection had been made for this run of NDCTL.EXE):

2016/06/27 11:15:31.37: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:15:31.37: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:15:31.38: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:15:31.38: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:15:31.38: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:15:31.38: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:15:31.38: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:15:31.38: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:15:31.38: 9 2: NdpCreateConnection in
2016/06/27 11:15:31.38: 9 2: NdpCreateConnection send CONNECT
2016/06/27 11:15:31.38: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/27 11:15:31.39: 4 2: Server supports SMB2_10 protocol
2016/06/27 11:15:32.98: 1 2: session setup failed: NT_STATUS_INTERNAL_ERROR
2016/06/27 11:15:32.98: 4 2: did you forget to run kinit?
2016/06/27 11:15:32.98: 9 2: NdpCreateConnection [0] rc=87

Likewise, a failed copy command yields expected results:

[c:\]copy CONFIG.001 o:\
C:\CONFIG.001 => O:\CONFIG.001
SYS0065: Network access is denied. "O:\CONFIG.001"
     0 files copied

I tested the above multiple times, also directing to an inaccessible remote directory, resulting in SYS0065. NDCTL.EXE did not crash, CPU did not go high, and the prompt returned in a reasonable amount of time (2-3 seconds). I then waited several seconds and tried again. This time, I got the hang, which continued until I stopped NDCTL.EXE (I could not regain control of the command line with Ctrl-C).

Log:

2016/06/27 11:26:51.49: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:26:51.49: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:26:51.49: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:26:51.49: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:26:51.49: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:26:51.49: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:26:51.49: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:26:51.50: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:26:51.50: 9 2: NdpRsrcQueryInfo in
2016/06/27 11:26:51.50: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\kerberos.samba.arcanoae\shared@fred
2016/06/27 11:26:51.50: 9 2: NdpCreateConnection in
2016/06/27 11:26:51.50: 9 2: NdpCreateConnection send CONNECT
2016/06/27 11:26:51.50: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/27 11:28:06.13: 4 2: Connection to kerberos.samba.arcanoae failed (Error NT_STATUS_INVALID_PARAMETER)
2016/06/27 11:28:06.13: 9 2: NdpCreateConnection [0] rc=87
2016/06/27 11:28:06.14: 9 2: NdpCreateConnection in
2016/06/27 11:28:06.14: 9 2: NdpCreateConnection send CONNECT
2016/06/27 11:28:06.14: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/27 11:29:21.14: 4 2: Connection to kerberos.samba.arcanoae failed (Error NT_STATUS_INVALID_PARAMETER)
2016/06/27 11:29:21.14: 9 2: NdpCreateConnection [0] rc=87
2016/06/27 11:29:21.14: 9 2: NdpCreateConnection in
2016/06/27 11:29:21.14: 9 2: NdpCreateConnection send CONNECT
2016/06/27 11:29:21.14: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/27 11:30:36.15: 4 2: Connection to kerberos.samba.arcanoae failed (Error NT_STATUS_INVALID_PARAMETER)
2016/06/27 11:30:36.15: 9 2: NdpCreateConnection [0] rc=87

NDCTL.EXE stopped.

A subsequent retry of the command also got NDCTL.EXE stuck with high CPU, but Ctrl-C within several seconds did regain control of the session (I still had to kill NDCTL.EXE).

comment:12 by Lewis Rosenthal, 8 years ago

This issue is actually worse than originally expected. It took me a while (and a couple dozen boot cycles) to figure out just what was happening, but every (and I mean *every*) build since:

6-22-16 3:51 13,953,656 124 ndpsmb.dll

create a major delay in desktop init (complete stall in builds beginning 20160626) when the following conditions exist:

  1. Drive mapping is auto-loaded at startup which points to a Kerberos-authenticated share; and
  2. No valid ticket exists (likely optional); and
  3. Server is unavailable (likely culprit)

When NDCTL.EXE is started from STARTUP.CMD, the desktop stall/hang occurs just as XWP is instantiating the trash can (whether or not there is anything in the trash).

If one waits out the delay with the versions which will ultimately allow the desktop to come up, the first access of the Drives object stalls the desktop. At this point, when one is able to access TOP, NDCTL.EXE is hogging CPU. It can be killed, and things go back to normal.

I believe this coincides with the ping/reconnect code. Here's a complete log of a recent hang, if I've chosen well:

Samba client 3.0.x build beta-date based on 4.4.4
This build is maintained by netlabs
2016/06/28 00:02:35.39: 9 1: Working with 64 bit fileio NDFS
2016/06/28 00:02:37.47: 9 1: NdpIOCTL init thread
2016/06/28 00:02:37.47: 9 1: NdpMountResource in
2016/06/28 00:02:38.91: 9 1: dircache_create: 10 seconds, 32 entries
2016/06/28 00:02:38.91: 9 1: dircache_create: 0x6257a0, rc = 0
2016/06/28 00:02:38.91: 9 1: NdpMountResource rc=0
2016/06/28 00:02:38.95: 9 1: NdpMountResource in
2016/06/28 00:02:38.95: 9 1: dircache_create: 10 seconds, 32 entries
2016/06/28 00:02:38.95: 9 1: dircache_create: 0x625f60, rc = 0
2016/06/28 00:02:38.95: 9 1: NdpMountResource rc=0
2016/06/28 00:02:38.95: 9 1: NdpMountResource in
2016/06/28 00:02:38.95: 9 1: dircache_create: 10 seconds, 32 entries
2016/06/28 00:02:38.95: 9 1: dircache_create: 0x626720, rc = 0
2016/06/28 00:02:38.95: 9 1: NdpMountResource rc=0
2016/06/28 00:02:41.31: 9 2: NdpIOCTL init thread
2016/06/28 00:02:41.31: 9 2: NdpRsrcQueryFSAllocate 0285fdd0
2016/06/28 00:02:41.31: 9 2: checkMountResource in tid#2
2016/06/28 00:02:41.31: 1 2: Connecting to \\Lewis:*********@workgroup:othello\data. Master WORKGROUP:1
2016/06/28 00:02:41.31: 1 2: NTLMv1 Support enabled
2016/06/28 00:02:41.98: 4 2: Server supports NT1 protocol
2016/06/28 00:02:41.98: 4 2:  session setuping for <Lewis>/<********> in <workgroup>
2016/06/28 00:02:42.27: 4 2:  session setup ok. Sending tconx <data> <********>
2016/06/28 00:02:42.30: 4 2:  tconx ok
2016/06/28 00:02:42.30: 4 2:  smbwrp_dskattr
2016/06/28 00:02:42.36: 4 2: Error in cli_disk_size: NT_STATUS_INVALID_NETWORK_RESPONSE
2016/06/28 00:02:42.37: 4 2:  smbwrp_disconnect
2016/06/28 00:02:42.40: 9 2: NdpRsrcQueryFSAllocate rc=87/rc1=0 (cUnit = 123456/cUnitAvail = 123456/cbSector = 2048)
2016/06/28 00:02:42.40: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:02:42.40: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:othello\data@Lewis
2016/06/28 00:02:42.40: 9 2: NdpRsrcQueryFSAllocate 0285fdd0
2016/06/28 00:02:42.40: 9 2: checkMountResource in tid#2
2016/06/28 00:02:42.40: 1 2: Connecting to \\lewis:*********@workgroup:dp45sg\lewis. Master WORKGROUP:1
2016/06/28 00:02:42.44: 4 2: Server supports SMB3_11 protocol
2016/06/28 00:02:42.44: 4 2:  session setuping for <lewis>/<********> in <workgroup>
2016/06/28 00:02:42.65: 4 2:  session setup ok. Sending tconx <lewis> <********>
2016/06/28 00:02:42.67: 4 2:  tconx ok
2016/06/28 00:02:42.67: 4 2:  smbwrp_dskattr
2016/06/28 00:02:42.71: 4 2: 
		262016000 blocks of size 1024. 138645204 blocks available
2016/06/28 00:02:42.71: 4 2:  smbwrp_disconnect
2016/06/28 00:02:42.71: 9 2: NdpRsrcQueryFSAllocate rc=0/rc1=0 (cUnit = 262016000/cUnitAvail = 138645204/cbSector = 1024)
2016/06/28 00:02:42.71: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:02:42.71: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:dp45sg\lewis@lewis
2016/06/28 00:02:42.71: 9 2: NdpRsrcQueryFSAllocate 0285fdd0
2016/06/28 00:02:42.71: 9 2: checkMountResource in tid#2
2016/06/28 00:02:42.71: 1 2: Connecting to \\:kerberos.samba.arcanoae\shared using kerberos authentication. Master WORKGROUP:1
2016/06/28 00:03:58.63: 4 2: Connection to kerberos.samba.arcanoae failed (Error NT_STATUS_INVALID_PARAMETER)
2016/06/28 00:03:58.64: 9 2: smbwrp_connect failed rc=3
2016/06/28 00:03:58.64: 9 2: Calling NdpFreeResource() to clear connection for security purposes
2016/06/28 00:03:58.64: 9 2: dircache_delete: 0x626720
2016/06/28 00:03:58.64: 9 2: NdpFreeResource 0
2016/06/28 00:03:58.64: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:03:58.65: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\@fred
2016/06/28 00:04:02.47: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:04:02.47: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:othello\data@Lewis
2016/06/28 00:04:02.47: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:04:02.48: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:othello\data@Lewis
2016/06/28 00:04:02.48: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:04:02.48: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:dp45sg\lewis@lewis
2016/06/28 00:04:02.48: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:04:02.48: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:dp45sg\lewis@lewis
2016/06/28 00:04:02.50: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:04:02.50: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\@fred
2016/06/28 00:04:02.50: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:04:02.50: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\@fred
2016/06/28 00:04:10.65: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:04:10.65: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:othello\data@Lewis
2016/06/28 00:04:10.65: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:04:10.65: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:dp45sg\lewis@lewis
2016/06/28 00:04:10.65: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:04:10.65: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\@fred
2016/06/28 00:05:51.98: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:05:51.98: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:othello\data@Lewis
2016/06/28 00:05:51.98: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:05:51.98: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:dp45sg\lewis@lewis
2016/06/28 00:05:51.98: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:05:51.98: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\@fred
2016/06/28 00:05:51.98: 9 2: NdpRsrcQueryInfo in
2016/06/28 00:05:51.98: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:othello\data@Lewis

Legend:

  • othello - remote NetWare 6.5 SP8 server (NTLM, SMB1)
  • dp45sg - local openSUSE Leap 42.1 (NTLMv2, SMB3_11)
  • kerberos - (@fred) Windows Server 2008 R2 (KRB5, SMB2_10)

comment:13 by Paul Smedley, 8 years ago

I see the same ~2 minute hang & 100% cpu with a dir on a kerberos share where the server is not active with all 4.4.4 builds, not just since 0622....

In fact, I see the same issue with 4.3.x builds as well....

...and smbclient -k KERBEROS/Shared gives the problem too under OS/2.

Will investigate more tomorrow/over the weekend. A quick review of a logging libc log shows a truckload of fcntl() calls - clearly something isn't working correctly for us....

Last edited 8 years ago by Paul Smedley (previous) (diff)

comment:14 by Lewis Rosenthal, 8 years ago

Indeed, I actually did see it on this last reboot with the 0622 build.

comment:15 by Paul Smedley, 8 years ago

trying to build a debug version of smbclient.... might help track down where the 100% cpu is coming from. Linux smbclient returns (Error NT_STATUS_HOST_UNREACHABLE) almost immediately....

comment:16 by Paul Smedley, 8 years ago

OK - tevent_req_poll_ntstatus() from cli_connect_nb() is hogging cpu.

This function calls tevent_req_poll() which is an event loop waiting for a response from a request. Just need to figure out why it's hogging cpu on OS/2 now....

comment:17 by Paul Smedley, 8 years ago

I can fix the 100% cpu with a usleep() call but this adds delays to working servers, and doesn't prevent the ~2minute delays....

comment:18 by Paul Smedley, 8 years ago

Resolution: fixed
Status: acceptedclosed

Potential fix (works for me) in http://smedley.id.au/tmp/ndpsmb.zip - works around some idiosyncracies in our select implementation

comment:19 by Lewis Rosenthal, 8 years ago

Latest build breaks compatibility with SMB1:

 7-02-16  19:27      16,090,195    124  smbcln44.dll
 7-02-16  19:29         673,222    124  smbcln44.map
 7-02-16  19:29          74,832    124  ndpsmb.dll
 7-02-16  19:29          17,551    124  ndpsmb.map

log.ndpsmb:

2016/07/06 00:42:12.26: 9 2: NdpRsrcQueryInfo in
2016/07/06 00:42:12.26: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:othello\data@Lewis
2016/07/06 00:42:12.26: 9 2: NdpRsrcQueryInfo in
2016/07/06 00:42:12.27: 9 2: NdpRsrcQueryInfo rc=0, s = SMBFS64 \\workgroup:othello\data@Lewis
2016/07/06 00:42:12.27: 9 2: NdpCreateConnection in
2016/07/06 00:42:12.27: 9 2: NdpCreateConnection send CONNECT
2016/07/06 00:42:12.27: 1 2: Connecting to \\Lewis:*********@workgroup:othello\data. Master WORKGROUP:1
2016/07/06 00:42:12.27: 1 2: NTLMv1 Support enabled
2016/07/06 00:42:12.28: 4 2: Connection to othello failed (Error NT_STATUS_MORE_PROCESSING_REQUIRED)
2016/07/06 00:42:12.28: 9 2: NdpCreateConnection [0] rc=87

log.smbc:

[2016/07/06 00:42:12.213000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=364 rw_type=1 len=1 waitflag=1 (rc=0) pid=464
[2016/07/06 00:42:12.213000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=364 rw_type=1 len=1 (rc=0) pid=464
[2016/07/06 00:42:12.213000,  5, pid=464, effective(0, 0), real(0, 0)] ../source3/libads/sitename_cache.c:102(sitename_fetch)
  sitename_fetch: No stored sitename for 
[2016/07/06 00:42:12.213000, 10, pid=464, effective(0, 0), real(0, 0)] ../source3/libsmb/namequery.c:2596(internal_resolve_name)
  internal_resolve_name: looking up othello#20 (sitename (null))
[2016/07/06 00:42:12.213000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=436 rw_type=1 len=1 waitflag=1 (rc=0) pid=464
[2016/07/06 00:42:12.215000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=436 rw_type=1 len=1 (rc=0) pid=464
[2016/07/06 00:42:12.215000,  5, pid=464, effective(0, 0), real(0, 0)] ../source3/libsmb/namecache.c:160(namecache_fetch)
  no entry for othello#20 found.
[2016/07/06 00:42:12.215000,  3, pid=464, effective(0, 0), real(0, 0)] ../libcli/nbt/lmhosts.c:185(resolve_lmhosts_file_as_sockaddr)
  resolve_lmhosts: Attempting lmhosts lookup for name othello<0x20>
[2016/07/06 00:42:12.215000,  3, pid=464, effective(0, 0), real(0, 0)] ../source3/libsmb/namequery.c:2099(resolve_wins_send)
  resolve_wins: WINS server resolution selected and no WINS servers listed.
[2016/07/06 00:42:12.215000,  3, pid=464, effective(0, 0), real(0, 0)] ../source3/libsmb/namequery.c:2306(resolve_hosts)
  resolve_hosts: Attempting host lookup for name othello<0x20>
[2016/07/06 00:42:12.215000, 10, pid=464, effective(0, 0), real(0, 0)] ../source3/libsmb/namequery.c:1141(remove_duplicate_addrs2)
  remove_duplicate_addrs2: looking for duplicate address/port pairs
[2016/07/06 00:42:12.217000,  5, pid=464, effective(0, 0), real(0, 0)] ../source3/libsmb/namecache.c:78(namecache_store)
  namecache_store: storing 1 address for othello#20: 192.168.200.1
[2016/07/06 00:42:12.217000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=436 rw_type=1 len=1 waitflag=1 (rc=0) pid=464
[2016/07/06 00:42:12.217000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=436 rw_type=1 len=1 (rc=0) pid=464
[2016/07/06 00:42:12.217000, 10, pid=464, effective(0, 0), real(0, 0), class=tdb] ../source3/lib/gencache.c:333(gencache_set_data_blob)
  Adding cache entry with key=[NBT/OTHELLO#20] and timeout=[Wed Jul  6 12:53:12 am 2016 EDT] (660 seconds ahead)
[2016/07/06 00:42:12.217000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=436 rw_type=3 len=1 waitflag=1 (rc=0) pid=464
[2016/07/06 00:42:12.217000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=164 rw_type=3 len=1 waitflag=1 (rc=0) pid=464
[2016/07/06 00:42:12.217000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=697 rw_type=3 len=24 waitflag=0 (rc=0) pid=464
[2016/07/06 00:42:12.217000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=697 rw_type=3 len=24 (rc=0) pid=464
[2016/07/06 00:42:12.219000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=12116 rw_type=3 len=4 waitflag=0 (rc=0) pid=464
[2016/07/06 00:42:12.219000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=12116 rw_type=3 len=4 (rc=0) pid=464
[2016/07/06 00:42:12.219000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=12120 rw_type=3 len=24 waitflag=0 (rc=0) pid=464
[2016/07/06 00:42:12.219000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=12120 rw_type=3 len=24 (rc=0) pid=464
[2016/07/06 00:42:12.219000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=12200 rw_type=3 len=4 waitflag=0 (rc=0) pid=464
[2016/07/06 00:42:12.219000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=12200 rw_type=3 len=4 (rc=0) pid=464
[2016/07/06 00:42:12.219000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=164 rw_type=3 len=1 (rc=0) pid=464
[2016/07/06 00:42:12.219000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=12120 rw_type=3 len=24 waitflag=0 (rc=0) pid=464
[2016/07/06 00:42:12.219000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=12120 rw_type=3 len=24 (rc=0) pid=464
[2016/07/06 00:42:12.221000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=12144 rw_type=3 len=15 waitflag=0 (rc=0) pid=464
[2016/07/06 00:42:12.221000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=12144 rw_type=3 len=15 (rc=0) pid=464
[2016/07/06 00:42:12.221000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=12159 rw_type=3 len=29 waitflag=0 (rc=0) pid=464
[2016/07/06 00:42:12.221000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=12159 rw_type=3 len=29 (rc=0) pid=464
[2016/07/06 00:42:12.221000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=440 rw_type=3 len=4 waitflag=0 (rc=0) pid=464
[2016/07/06 00:42:12.221000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=440 rw_type=3 len=4 (rc=0) pid=464
[2016/07/06 00:42:12.221000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=48 rw_type=3 len=1 waitflag=1 (rc=0) pid=464
[2016/07/06 00:42:12.221000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=48 rw_type=3 len=4 waitflag=0 (rc=-1) pid=464
[2016/07/06 00:42:12.221000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): tdb_brlock failed (fd=12) at offset 48 rw_type=3 flags=0 len=4
[2016/07/06 00:42:12.223000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=48 rw_type=1 len=1 (rc=0) pid=464
[2016/07/06 00:42:12.223000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=48 rw_type=3 len=4 waitflag=0 (rc=0) pid=464
[2016/07/06 00:42:12.223000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=48 rw_type=3 len=4 (rc=0) pid=464
[2016/07/06 00:42:12.223000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=48 rw_type=1 len=1 waitflag=1 (rc=0) pid=464
[2016/07/06 00:42:12.223000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=48 rw_type=3 len=1 (rc=0) pid=464
[2016/07/06 00:42:12.223000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=436 rw_type=3 len=1 (rc=0) pid=464
[2016/07/06 00:42:12.223000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=608 rw_type=1 len=1 waitflag=1 (rc=0) pid=464
[2016/07/06 00:42:12.223000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=608 rw_type=1 len=1 (rc=0) pid=464
[2016/07/06 00:42:12.223000,  1, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache.tdb): tdb_transaction_start: cannot start a transaction with locks held
[2016/07/06 00:42:12.223000, 10, pid=464, effective(0, 0), real(0, 0), class=tdb] ../source3/lib/gencache.c:669(gencache_stabilize)
  Could not start transaction on gencache.tdb: Locking error
[2016/07/06 00:42:12.223000, 10, pid=464, effective(0, 0), real(0, 0)] ../source3/libsmb/namequery.c:2789(internal_resolve_name)
  internal_resolve_name: returning 1 addresses: 192.168.200.1:0 
[2016/07/06 00:42:12.225000,  3, pid=464, effective(0, 0), real(0, 0)] ../source3/lib/util_sock.c:515(open_socket_out_send)
  Connecting to 192.168.200.1 at port 445
[2016/07/06 00:42:12.274000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=364 rw_type=1 len=1 waitflag=1 (rc=0) pid=464
[2016/07/06 00:42:12.274000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=364 rw_type=1 len=1 (rc=0) pid=464
[2016/07/06 00:42:12.274000,  5, pid=464, effective(0, 0), real(0, 0)] ../source3/libads/sitename_cache.c:102(sitename_fetch)
  sitename_fetch: No stored sitename for 
[2016/07/06 00:42:12.274000, 10, pid=464, effective(0, 0), real(0, 0)] ../source3/libsmb/namequery.c:2596(internal_resolve_name)
  internal_resolve_name: looking up othello#20 (sitename (null))
[2016/07/06 00:42:12.274000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_lock: (fd=12) offset=436 rw_type=1 len=1 waitflag=1 (rc=0) pid=464
[2016/07/06 00:42:12.274000,  5, pid=464, effective(0, 0), real(0, 0)] ../lib/tdb_wrap/tdb_wrap.c:64(tdb_wrap_log)
  tdb(C:\MPTN\ETC/samba/lock/gencache_notrans.tdb): fcntl_unlock: (fd=12) offset=436 rw_type=1 len=1 (rc=0) pid=464
[2016/07/06 00:42:12.274000,  5, pid=464, effective(0, 0), real(0, 0)] ../source3/libsmb/namecache.c:165(namecache_fetch)
  name othello#20 found.
[2016/07/06 00:42:12.276000, 10, pid=464, effective(0, 0), real(0, 0)] ../source3/libsmb/namequery.c:1141(remove_duplicate_addrs2)
  remove_duplicate_addrs2: looking for duplicate address/port pairs
[2016/07/06 00:42:12.276000,  3, pid=464, effective(0, 0), real(0, 0)] ../source3/lib/util_sock.c:515(open_socket_out_send)
  Connecting to 192.168.200.1 at port 445

SMB3_11 seems to be okay. I haven't tested against SMB2, yet.

comment:20 by Lewis Rosenthal, 8 years ago

FWIW:

 6-27-16   5:18       7,269,718    124  smbcln44.dll
 6-27-16   5:24          17,640    124  ndpsmb.map
 6-27-16   5:34          74,936    124  ndpsmb.dll

doesn't seem to have a problem with SMB1.

comment:21 by Paul Smedley, 8 years ago

NT1 connection to an OS/2 Samba server is ok here with latest builds....

comment:22 by Paul Smedley, 8 years ago

I see lots of fcntl errors above - I'll rebuilt smbcln44.dll to use libcx0 and you can test and see if it helps. I'll add a copy of libcx0.dll just incase you don't have it from rpm.... http://smedley.id.au/tmp/ndpsmb.zip

Last edited 8 years ago by Paul Smedley (previous) (diff)

by Lewis Rosenthal, 8 years ago

Attachment: log.smbc added

log.smbc when NT1 login fails

comment:23 by Lewis Rosenthal, 8 years ago

This is still not working for me with NT1:

2016/07/07 13:58:48.81: 1 2: NTLMv1 Support enabled
2016/07/07 13:58:48.85: 4 2: Connection to othello failed (Error NT_STATUS_MORE_PROCESSING_REQUIRED)
2016/07/07 13:58:48.85: 9 2: smbwrp_connect failed rc=3

Sorry to post to a closed ticket. I should test the actual case of this ticket, and open a new one for NT1, if necessary.

comment:24 by Paul Smedley, 8 years ago

Please test with http://smedley.id.au/tmp/smbcln44.zip

This reverts the fix in comment 18, and wil help confirm if this fix is causing your issues with Othello

comment:25 by Lewis Rosenthal, 8 years ago

Yes, fixed. Thanks. Of course, that doesn't address the issue of this ticket (two steps forward and one back), unfortunately.

comment:26 by Paul Smedley, 8 years ago

Resolution: fixed
Status: closedreopened

comment:27 by Lewis Rosenthal, 8 years ago

Resolution: fixed
Status: reopenedclosed

This is working for me, now. I am no longer seeing the fcntl errors we saw before when accessing the CIFS volume across the WAN.

comment:28 by Paul Smedley, 8 years ago

FWIW - I reverted the poll/select hacks, and with dmik's latest libcx things now work correctly, and I get: [U:\DEV\samba-4.4.6]smbclient -k KERBEROS/Shared WARNING: The "null passwords" option is deprecated Connection to KERBEROS failed (Error NT_STATUS_IO_TIMEOUT)

within a very reasonable time and with no high cpu load....

comment:29 by Lewis Rosenthal, 8 years ago

Closing this may have been premature, as we are still unable to access such shares with the latest release (which supposedly fixes this issue, if I read the preceding comment correctly).

comment:30 by Paul Smedley, 8 years ago

The fix in comment 28 isn't in any released binary....

Note: See TracTickets for help on using tickets.