Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#65 closed defect (fixed)

SVN rev 325: completely hangs the WPS on system start

Reported by: erdmann Owned by:
Priority: major Milestone: Future
Component: IFS Version:
Severity: high Keywords:
Cc:

Description

SVN rev 325 just completely blocks the WPS on system start. Everything, including my USB mouse becomes unresponsive and I have to reset the system with the reset switch.

SVN rev 322 does not exhibit this problem.

Change History (16)

comment:1 by Valery V. Sedletski, 7 years ago

Very strange. I use USB mouse, USB keyboard and many USBMSD devices without problems. Are you sure the problem is with fat32.ifs? Could it be an USB problem? (I have IBM's USB stack, plus usbmsd.add from your 10.216 USB stack and see no problems)

comment:2 by erdmann, 7 years ago

It's not a USB problem. If I revert to rev 322 everything is fine.
I would be happy with rev 322 if the "trace enable" problem were fixed ...

Can't you just back out the SVN rev 324 "read FAT in 32k blocks" change and see if that fixes the problem (send me a test version) ? At least that's the most suspicious candidate.

Last edited 7 years ago by erdmann (previous) (diff)

comment:3 by Valery V. Sedletski, 7 years ago

USB problem can be triggered by something in the new IFS version. So, it's not clear indeed, what's the cause. Ok, I disabled reading FAT by 32 KB pieces (in non-FAT12 case. For FAT12, still 3 sectors at a time). The test version is here: ftp://osfree.org/upload/fat32/test/test_lars1.zip, please test.

PS: The trace problem was (hopefully) fixed in r326.

comment:4 by Valery V. Sedletski, 7 years ago

Oops, please test ftp://osfree.org/upload/fat32/test/test_lars2.zip (I forgot to comment out one line). Now it is the same as r326 with disabled FAT reading/writing by 32 KB blocks patch.

comment:5 by erdmann, 7 years ago

I tested with ftp://osfree.org/upload/fat32/test/test_lars2.zip and yes, it fixes the problem.
No more hangs of the WPS and everything is back to normal.

Therefore something is wrong with the "read 32k at a time" optimization.

I suggest to leave this ticket open until we have again a working version.

comment:6 by Valery V. Sedletski, 7 years ago

Ok, I reverted this patch for the time being. But its effect is very strange. Indeed, I had some WPS hangs today too, but not permanently -- only sometimes. Indeed, the intent was to enhance performance via reading by larger FAT portions at once. Still don't know why it hanged the WPS. You can look at r324 changeset, I see nothing suspicious here.

comment:7 by erdmann, 7 years ago

I do see suspicious issues:

1) "pVolInfo->pbFatSector" is malloced but the pointer never checked. That of course can lead to a system crash if the pointer is used if it is invalid/NULL.
2) pVolInfo is allocated via "gdtAlloc" but freed by EITHER "freeseg" OR "free", depending on the code path taken. So what is correct now and how do these 2 routines differ ?
3) At least to me it is totally unclear if "malloc" and "free" are custom implementations (they better be if they are supposed to work in a driver) or if they are the default RTL implementations of the Watcom library. In the latter case, using "malloc" and "free" will likely lead to a system crash.

Version 0, edited 7 years ago by erdmann (next)

comment:8 by Valery V. Sedletski, 7 years ago

malloc/free is, of course, a custom implementation based on FSH_SEGALLOC/FSH_SEGFREE (see ifsmem.c). gdtAlloc is a wrapper around FSH_ALLOC and allocates a single selector. malloc, in turn, calls gdtAlloc, to implement heap. free calls gdtFree, so freeing pVolInfo via free, while allocating it via gdtAlloc, is ok. Regarding pVolInfo->pbFatSector, you are right, I should check it for != 0 (but it is commented out now).

comment:9 by erdmann, 7 years ago

IN that case I would suggest to ONLY use "malloc" and "free" in order to avoid confusion.
Does EVERY invocation of "malloc" and "free" use FSH_SEGALLOC (you can also point me to the relevant code ...) ? In that case you will quickly run out of GDT selectors. The consequence will be a NULL pointer returned.

comment:10 by Valery V. Sedletski, 7 years ago

You can look into ifsmem.c, where malloc() is implemented. It looks that MAX_SELECTORS == 100 and this number is used as a maximum number of used GDT selectors. This is much, of course. But malloc() calls gdtAlloc to allocate heap by portions of size HEAP_SIZE == 0xfff0 bytes. So, each selector covers an area of about 64 KB, i.e., almost maximum number. Then malloc() calls FindFreeSpace() to get a free block of needed size inside the big 64 KB block. So, it preallocates a heap with initial big blocks and partitions them to smaller blocks, which are returned. So, it appears that it will not occupy more than 100 selectors.

comment:11 by erdmann, 7 years ago

I still suggest to replace the call to "gdtAlloc" with "malloc".
But we can now close this ticket.

comment:12 by erdmann, 7 years ago

Resolution: fixed
Status: newclosed

Backing out SVN rev 324 ("read FAT in 32k blocks") fixed the hang.

Last edited 7 years ago by erdmann (previous) (diff)

comment:13 by Valery V. Sedletski, 7 years ago

Ok.

Regarding gdtAlloc() vs malloc(): malloc calls FSH_SEGALLOC with an option to create swappable memory (the second parameter of gdtAlloc. If this parameter is TRUE, the allocated memory is swappable. But the VOLINFO structure should be non-swappable. So, we just use gdtAlloc with second parameter equal to FALSE. The swappable/non-swappable parameter applies to a whole selector, so we cannot just use malloc (which allocates swappable memory), then call FSH_FORCENOSWAP, because all memory with the same selector will become non-swappable (not only VOLINFO structure). So, it is better to use gdtAlloc here).

comment:14 by erdmann, 7 years ago

If you use "gdtAlloc" for allocation you will need to use "freeseg" for deallocation.
These two routines do not work with a heap but directly allocate/free a segment.

However, in "ifsmount.c" on mount you are using "gdtAlloc" for allocating pVolInfo but on unmount you are using "free" on that very same address, that is, you allocate a segment but you attempt to free a piece of memory from the heap.

This is asking for desaster. You will need to change deallocation of pVolInfo to using "freeseg".

See "ifsmount.c", routines "FS_MOUNT" (at MOUNT_RELEASE), and also "Mount" (at MOUNT_RELEASE). For routine "mount" it is doing correctly.

Last edited 7 years ago by erdmann (previous) (diff)

comment:15 by Valery V. Sedletski, 7 years ago

free() is calling freeseg() under the hood, so, there is no difference in calling free() or freeseg() if the seg is allocated via gdtAlloc(). But yes, probably it will look more clear to use freeseg(), as opposed to gdtAlloc(), so I agree here. I doubt that is somehow related with #67, though.

comment:16 by erdmann, 7 years ago

"free" only calls "freeseg" on this occasion:

283	   /*
284	      free selector if no longer needed
285	   */
286	   if (usSel > 0 &&
287	      BlockSize(rgpSegment[usSel]) == (HEAP_SIZE - sizeof (ULONG)) &&
288	      IsBlockFree(rgpSegment[usSel]))
289	      {
290	      PBYTE p = rgpSegment[usSel];
291	      rgpSegment[usSel] = NULL;
292	      freeseg(p);
293	      }

You CANNOT use "free" as a general replacement for "freeseg" when you have used "gdtAlloc" to allocate a piece of memory.
That's because "gdtAlloc" does NOT allocate a piece of memory from the heap. The memory allocated by gdtAlloc is NOT controlled by the heap management.
It is the other way around:
"malloc" uses "gdtAlloc" as the basic means to allocate memory for the heap when the heap is running out of memory.

Last edited 7 years ago by erdmann (previous) (diff)
Note: See TracTickets for help on using tickets.