Opened 7 years ago

Last modified 7 years ago

#74 new defect

VBoxDrv.sys causes system to fail to boot

Reported by: losepete
Owned by:
Priority: blocker
Milestone:
Component: Common Tasks
Keywords:
Cc:

Description

Running ArcaOS 5.0.0
I added the line
DEVICE=J:\PROGRAMS\VBox\VBoxDrv.sys
to my config.sys file and rebooted.

Boot stopped with the message that IFX is checking/cleaning INI files.

I booted an eCS2.2b2 installation on the same system, REM'd the ifx line in the ArcaOS config.sys file and tried booting ArcaOS again.

Result: Boot stops with pmshell.exe showing as the last line on screen.

I booted into eCS2.2b2 and REM'd the DEVICE=J:\PROGRAMS\VBox\VBoxDrv.sys line.

I then tried booting ArcaOS and had no problems.

This is very repeatable: If DEVICE=J:\PROGRAMS\VBox\VBoxDrv.sys is active in the config.sys file then ArcaOS cannot boot. With that line REM'd there is no problem booting ArcaOS.

Change History (12)

comment:1 by Valery V. Sedletski, 7 years ago

Which VBox version? Any logs? What is your hardware/software?

PS: Please try the latest driver: ftp://osfree.org/upload/vbox/vboxdrv.zip and see if it boots ok. Do you have a COM port and a null-modem cable?

Last edited 7 years ago by Valery V. Sedletski

comment:2 by losepete, 7 years ago

The readme states Version 5.0.6_OSE r140 but it is r141

Logs? - obviously no vbox logs are available.

Not sure what hardware details you require...
Gigabyte 990FXA-UD3 mainboard
AMD FX-4300 4 core cpu
4Gb RAM
ATI X550 graphics card

Software: ArcaOS5.0.0 with a few updates applied

I unzipped ftp://osfree.org/upload/vbox/vboxdrv.zip and rebooted with the vboxdrv.sys line active in config.sys
Result: Boot stops with the IFX checking/cleaning message.

No COM port or null modem cable

I suspect some sort of bad interaction between the ArcaOS SNAP and vboxdrv.sys as I have had vbox running in the eCS2.2b2 installation on this system which uses the earlier not-SMP-safe SNAP.

Last edited 7 years ago by losepete

comment:3 by Valery V. Sedletski, 7 years ago

> The readme states Version 5.0.6_OSE r140 but it is r141

I see r141 in the README, so I wonder where you got your VBox build.

> Logs? - obviously no vbox logs are available.

I mean COM port logs. I need VBoxDrv.sys logs; VBox logs are useless here. If your system did not boot successfully and you have no COM port, then the only option you have for getting logs is to install QSINIT and the OS/4 kernel, add DBPORT=0 to os2ldr.ini and see the log redirected to the screen (in yellow). You can take a snapshot of your screen and share it with me. With QSINIT, it's easy to switch between different kernels, so you can return to your original kernel once everything is resolved.

Also, it would probably be best for us to go to IRC and talk in real time. I'm on the #osFree channel @ EFnet, or on #netlabs @ FreeNode; we can talk in private.

> Not sure what hardware details you require...
> Gigabyte 990FXA-UD3 mainboard
> AMD FX-4300 4 core cpu
> 4Gb RAM
> ATI X550 graphics card

So, it's a modern AMD system.

> I suspect some sort of bad interaction between the ArcaOS SNAP and vboxdrv.sys as I have had vbox running in the eCS2.2b2 installation on this system which uses the earlier not-SMP-safe SNAP.

I doubt that this has anything to do with SNAP. I have SNAP too, and it works fine for me. Though, can you check whether your old eCS system boots OK with the same vboxdrv.sys and VBox binaries?

I suspect something is wrong with the newer PSD from Arca Noae. I ran VBox successfully with earlier PSDs from Arca, though.

comment:4 by losepete, 7 years ago

I have checked updates applied since installation which include kernel and acpi updates.

I am currently using:-

[M:\]bldlevel os2krnl
Build Level Display Facility Version 6.12.675 Sep 25 200
(C) Copyright IBM Corporation 1993-2001
Signature: @#IBM:14.201#@_SMP IBM OS/2 Kernel
Vendor: IBM
Revision: 14.201
File Version: 14.201
Description: _SMP IBM OS/2 Kernel

[M:\os2\boot]bldlevel acpi.psd
Build Level Display Facility Version 6.12.675 Sep 25 2001
(C) Copyright IBM Corporation 1993-2001
Signature: @#Arca Noae LLC:3.23.07#@##1## 1 Sep 2017 19:17:22 DAZAR1

::::07::@@ACPI based PSD for OS/2 (c) 2017 Arca Noae LLC

Vendor: Arca Noae LLC
Revision: 3.23.07
Date/Time: 1 Sep 2017 19:17:22
Build Machine: DAZAR1
File Version: 3.23.7
Description: ACPI based PSD for OS/2 (c) 2017 Arca Noae LLC

I have only just updated acpi from 3.23.06, which was in use when I first ran into this problem. Sadly the problem persists with 3.23.07, which is the latest I have access to.

If you think either/both of the kernel/acpi packages may be involved do you have access to these to try yourself? My thinking is that ArcaOS is fairly successful and development of those packages is not going to go backwards...

If you need copies for testing further development of vboxdrv.sys I could send as an email attachment. I know I should not make the offer but I am sure you will uninstall them when you have finished with them.

Alternatively I guess I will have to find the time to muck around with qsinit and os/4 kernel but that could be a week or more...

comment:5 by Valery V. Sedletski, 7 years ago

> If you think either/both of the kernel/acpi packages may be involved do you have access to these to try yourself? My thinking is that ArcaOS is fairly successful and development of those packages is not going to go backwards...

No, I have no access to AN updates, and I have no ArcaOS. But anything is possible; a regression in the kernel or PSD is possible too.

> If you need copies for testing further development of vboxdrv.sys I could send as an email attachment. I know I should not make the offer but I am sure you will uninstall them when you have finished with them.

Yes, maybe I'll try them later, if required. Ok, I'll uninstall them if having them installed without AN permission is such a horrible problem. No problem.

But first I'll check another hypothesis. Unfortunately, SMP support requires a function that returns the current CPU number, and the IBM kernel lacks such a function. So, getting the CPU number is emulated by reading the local APIC ID. However, LAPIC IDs are not guaranteed to be consecutive, i.e., to be in the range 0..MaxCPUNumber. So, it is possible to get a CPU number outside the range 0..MaxCPUNumber that is erroneously considered to be valid. Trying to do something on an invalid CPU number can result in a hang. But if your CPUs all have numbers in the 0..MaxCPUNumber range, it will work for you. If not, then you're unlucky. However, you can install the OS/4 kernel, which has a special KEE for getting the current CPU number reliably, so this will work for you too.

Additionally, on OS/4 kernels, support for hardware virtualization is almost working now. So, you could try it, if you wish -- it has its own benefits. It will also make debugging easier for me. OS/4 has its own internal log, which can be large (several megabytes). VBoxDrv.sys prints messages there. The log contents can be obtained with a

copy kernlog$ kernlog.txt

command, so the log is copied to the kernlog.txt file. This works if your system has booted OK. If not, the log is also sent to the COM port, so you can capture it with a terminal emulator program. If a COM port is unavailable, you can still see the last messages on the screen, if you redirect the log to the screen.

So, we can try to check my hypothesis about non-consecutive CPU IDs tomorrow, if you are free. Or later, I can wait. I need to finish my VBox build by tomorrow first. It takes some hours to compile on my machine (6-8 hours), so I need to wait until VBox is compiled.
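
To illustrate the hypothesis, here is a minimal sketch in Python. It is not the VBoxDrv.sys code; the function names and the LAPIC ID values are hypothetical and chosen only to show why using the LAPIC ID directly as a CPU number breaks when the IDs are not consecutive, and how a lookup table could map them to a dense 0..N-1 range.

```python
def naive_cpu_number(lapic_id, cpu_count):
    """Use the LAPIC ID directly as the CPU number.

    Returns the ID if it lies in 0..cpu_count-1, otherwise None
    (an ID that cannot be used as a CPU number).
    """
    return lapic_id if 0 <= lapic_id < cpu_count else None


def build_dense_map(lapic_ids):
    """Map each LAPIC ID to a consecutive index 0..N-1 (hypothetical remedy)."""
    return {lapic_id: index for index, lapic_id in enumerate(sorted(lapic_ids))}


# Hypothetical LAPIC IDs for two 4-core machines; real values differ per system.
consecutive_ids = [0, 1, 2, 3]
sparse_ids = [16, 17, 18, 19]

for ids in (consecutive_ids, sparse_ids):
    dense = build_dense_map(ids)
    for lapic_id in ids:
        # Columns: LAPIC ID, naive CPU number, dense CPU number
        print(lapic_id, naive_cpu_number(lapic_id, len(ids)), dense[lapic_id])
```

With the consecutive IDs both columns agree; with the sparse IDs the naive approach yields no usable CPU number, while the table still gives 0..3.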

comment:6 by Valery V. Sedletski, 7 years ago

I re-uploaded a newer VBoxDrv to ftp://osfree.org/upload/vbox/vboxdrv.zip. Please test whether it boots OK now.

Last edited 7 years ago by Valery V. Sedletski

comment:7 by losepete, 7 years ago

I just tested the latest vboxdrv.sys

Result: ArcaOS stops with the IFX ini check on screen.

I rebooted ArcaOS - requires press of system box Reset button - and just as IFX was about to run I pressed and held the Shift key.

This presented a list of IFX options to change to an earlier set of ini file backups or press 0 to continue booting without change.

I selected 0, no change, and the system booted successfully to a working Desktop. Could this be a timing issue?

Sadly, VirtualBox still does not run; it displays this error:

Failed to create the VirtualBoxClient COM object.


The application will now terminate.

Callee RC: NS_ERROR_ABORT (0x80004004)

Regarding the AN updates: Should I upload them here?

comment:8 by Valery V. Sedletski, 7 years ago

What is IFX? There was an INI checking program in eCS 2.2, called DMT. Did they rename it? What does "0" do? Skip checking? So, it finally booted? Then it is very strange... So, now it does not stop at pmshell.exe after skipping "IFX"? I cannot imagine why it would interfere with vboxdrv.sys. I thought it was vboxdrv.sys that hung, and that it just happened at the moment IFX runs. So, I thought it was unrelated to IFX. I tested VBoxDrv.sys successfully in eCS 2.2 with DMT on a Core2Duo system; it boots without any problems. Maybe this is some SMP-related problem. VBoxDrv.sys can run code on different CPUs during the boot process. This may interfere with IFX somehow.

I'd say that if it is IFX that hangs, then you should ask Arca Noae to fix IFX, as I don't know what special things it could be doing during boot. Arca Noae should know better (you could try creating a ticket at their bugtracker). But since you said that you've finally booted, could you try taking a log after it has booted? ArcaOS has a stripped-down QSINIT version, which should still have the QSINIT log present. Could you try doing

copy oemhlp$ somefile.txt

or, if it fails,

copy ___hlp$ somefile.txt

? This will capture the messages. But I'd also prefer to see the same with OS/4 kernel booted. This shows more useful info. And, you'd better install a full-featured QSINIT, as Arca Noae's version seems to be unable to boot multiple kernels and pass them options. You can create os2ldr.ini like this:

[config]
; kernel from kernel list section to be loaded by default
default=12
; waiting time after which kernel mentioned in "default" will be loaded
timeout=30
;
;showmem=1
; debug level
;dbflags=0x19
dbflags=0
; com port address (0 for vio)
;dbport=0
dbport=0x3F8 
;dbflags=0x11

[kernel]
os2ldr1. = Old os2ldr, restart
os2ldr.os4 = os4ldr, restart
preldr0.mdl = FreeLDR, restart
tetris. = Tetris, restart
os2krnl. = 14.104a UNI
os2krnl.smp = 14.104a SMP,LOGSIZE=4096,LOADSYM
os2krnl2.smp = 14.106 SMP,LOGSIZE=4096,LOADSYM
os2krnl3.smp = 14.200 SMP,LOGSIZE=4096,LOADSYM
os2krnl.os4 =  OS/4 new KEE,VALIMIT=3072,CTRLC=1,LOADSYM,LOGSIZE=4096,PRELOAD=1
os2krnl2.os4 =  OS/4 test,VALIMIT=3072,CTRLC=1,LOADSYM,LOGSIZE=4096,PRELOAD=1
os4krnl =  OS/4 kernel (512 MB),VALIMIT=3072,CTRLC=1,LOADSYM,LOGSIZE=4096,PRELOAD=1,memlimit=512
os4krnl =  OS/4 kernel,VALIMIT=3072,CTRLC=1,LOADSYM,LOGSIZE=4096,PRELOAD=1

At the end is the list of menu items; "default=12" means that item #12 is the default one. The timeout is 30 sec. Each menu item starts with a kernel file name (e.g. "os4krnl"), which should be in 8.3 format; then, after "=", comes the menu item name, followed by a comma-separated option list. You can also rename your older "os2ldr" to "os2ldr.ibm" and start it from this one, with a

os2ldr.ibm = IBM's os2ldr, restart

QSINIT should boot your kernel fine, though, without the need to boot your older os2ldr.
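
The menu line layout described above can be sketched in Python as follows. This is only an illustration of the "file = title,option,option" shape, not QSINIT's actual parser; the function names are hypothetical, and treating "default" as a 1-based index into the [kernel] list is an assumption based on the explanation above.

```python
def parse_kernel_line(line):
    """Split 'file = title,opt1,opt2' into (file, title, [options])."""
    # Split on the first "=" only, because options such as LOGSIZE=4096 contain "=".
    kernel_file, _, rest = line.partition("=")
    fields = [field.strip() for field in rest.split(",")]
    return kernel_file.strip(), fields[0], fields[1:]


def parse_kernel_section(text):
    """Collect the menu entries found under the [kernel] section."""
    entries = []
    in_kernel = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        if line.startswith("["):
            in_kernel = line.lower() == "[kernel]"
            continue
        if in_kernel and "=" in line:
            entries.append(parse_kernel_line(line))
    return entries


SAMPLE = """\
[config]
default=2
timeout=30

[kernel]
os2krnl. = 14.104a UNI
os2krnl.smp = 14.104a SMP,LOGSIZE=4096,LOADSYM
"""

entries = parse_kernel_section(SAMPLE)
default_item = 2  # assumed to be 1-based, as in "item #12" above
print(entries[default_item - 1])
```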

You can also try to download and install the OS/4 kernel. It can be added to QSINIT menu as in the above example. You only need to extract the drivers from the kernel archive to \os2\boot (except for screen03.sys, which should be put into \os2). Everything should be easy. You can switch between many kernels easily (you can read README's in QSINIT and OS/4 kernel archives, for more details, but what I said above should be sufficient).

Regarding the NS_ERROR_ABORT error -- this means that some DLL's are missing from your system. Please (re)install the ones listed here: http://trac.netlabs.org/vbox/. Also, please update VBox from there: ftp://osfree.org/upload/vbox/vbox-os2-i386-v5.0.51.zip.

comment:9 by Valery V. Sedletski, 7 years ago

> Regarding the AN updates: Should I upload them here?

You can send them by email: _valerius (dog) mail (dot) ru. I probably need the kernel and ACPI.PSD (better with their .sym files)

comment:10 by losepete, 7 years ago

Check your email

I think you will be pleased to hear that I have managed to resolve both the boot problem and the VBox startup problem.

I had simply added the vboxdrv.sys to the end of my config.sys [ Device Drivers ] section - my config.sys has been sorted into the same order with same section headings as eCS used - without re-reading the VBox readme.

As the problem seemed to be "timing related" I moved the vboxdrv.sys higher up the Device Drivers section, just below the IFS lines.

No problem rebooting, no stop at IFX/pmshell.exe, but I still could not start VirtualBox.

Having a quick read of the readme file I see that the top of the config.sys is recommended. A bit of a case of "Operator Error: Replace Operator"...

I posted about the problems on os2world to see if anyone had any thoughts - see https://www.os2world.com/forum/index.php?topic=1709.msg

I am glad I did as my attention was drawn to http://trac.netlabs.org/vbox/ticket/73 and the libcx problem.

I added SET LIBCX_HIGHMEM=2 to my config.sys file and rebooted.

I can now start VirtualBox.
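
For anyone wanting to double-check the same two config.sys settings, here is a minimal sketch in Python. The helper name is hypothetical and the matching rules are assumptions (REM'd lines are ignored, and the driver line is recognised by its VBoxDrv.sys file name); it only reports where the active DEVICE line sits and what LIBCX_HIGHMEM is set to.

```python
def check_config_sys(text):
    """Return (line number of the active VBoxDrv.sys DEVICE line, LIBCX_HIGHMEM value).

    Either element is None when the corresponding statement is absent or REM'd.
    """
    driver_line = None
    highmem = None
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        upper = line.upper()
        words = upper.split()
        if not words or words[0] == "REM":
            continue  # blank or REM'd lines are inactive
        if upper.startswith("DEVICE=") and upper.endswith("VBOXDRV.SYS"):
            if driver_line is None:
                driver_line = number
        elif upper.startswith("SET LIBCX_HIGHMEM="):
            highmem = line.split("=", 1)[1].strip()
    return driver_line, highmem


# Hypothetical excerpt; a real config.sys is much longer.
SAMPLE = r"""REM DEVICE=J:\PROGRAMS\VBox\VBoxDrv.sys
DEVICE=J:\PROGRAMS\VBox\VBoxDrv.sys
SET LIBCX_HIGHMEM=2
"""

print(check_config_sys(SAMPLE))
```

A small line number means the driver is near the top of the file, which is where the readme recommends placing it.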

I have just downloaded vbox 5.0.51 so will install shortly and give it a try at creating a linux guest.

Thanks for your help


comment:11 by Valery V. Sedletski, 7 years ago

> Check your email

Ok, got it. Thank you!

> Having a quick read of the readme file I see that the top of the config.sys is recommended.

Hm, indeed. But I always had VBoxDrv.sys near the bottom of my config.sys, and had no problems with it.

> A bit of a case of "Operator Error: Replace Operator"...

Hm, what do you mean? Which operator? Do you mean "User failure, please replace user"? :)

> I am glad I did as my attention was drawn to http://trac.netlabs.org/vbox/ticket/73 and the libcx problem.

> I added SET LIBCX_HIGHMEM=2 to my config.sys file and rebooted.

> I can now start VirtualBox.

This is some problem with SDL and high memory, still unresolved. Setting LIBCX_HIGHMEM to 2 is the current workaround. It is required only for recent versions of libcx. There is no need to have several versions of libcx with LIBPATHSTRICT=T -- you only need the latest one, with LIBCX_HIGHMEM=2. Note that it is required for VBoxSDL only. VirtualBox.exe works fine without it.

Please still complain about IFX to Arca Noae. Ideally, it should not depend on the position of VBoxDrv.sys in config.sys.

> I have just downloaded vbox 5.0.51 so will install shortly and give it a try at creating a linux guest.

You'd better try something lightweight. You can use the Knoppix Live CD, for example. Ubuntu eats up too many resources with its "Unity" desktop (it is very slow with 1.2 GB of RAM given to a VM; at least 2 GB is needed, but that is too much for a 32-bit host, which OS/2 is). But you can use Xubuntu or Fluxbuntu instead, which have lightweight desktops by default. With Ubuntu + the LXDE desktop I was able to get acceptable performance with only 768 MB of RAM. Also, I enabled hardware virtualization. (OS/2 VBox has support for AMD-V/VT-x, but on the OS/4 kernel only for now. So, if you want to use hardware virtualization, you'll currently need to install the OS/4 kernel.)

PS: If someone needs hardware virtualization support on IBM kernels, please ask Arca Noae to finance the development. Otherwise, we have no motivation to do so. (The OS/4 kernel has proper mechanisms for that; the IBM kernel lacks the required functions. So, it requires more effort, but currently we don't see any interest in HW virtualization support on their kernels from Arca Noae. Moreover, there is currently no interest in financing VBox at all. Sponsorship from XEU/Mensys has almost ended too.)

Last edited 7 years ago by Valery V. Sedletski

comment:12 by losepete, 7 years ago

I find that the LIBCX_HIGHMEM=2 setting is required for virtualbox.exe - have not yet tried the vboxsdl.exe
