Posts for year 2007 (old posts, page 3)

6539777 Cannot disable mpxio on x86/x64 platforms without serious pain

Yesterday I logged `6539777 Cannot disable mpxio on x86/x64 platforms without serious pain`_, which is the real root cause of `6539612 Run “stmsboot -d” on x86 platform will cause system boot failure`_.

Unfortunately, whatever process is used to push bug data from internal (bugster) to external (`b.o.o.`_) completely screwed up the workaround entry. This makes me mad not just because I spent a heap of time writing it carefully, but because the information that it presents to you via `b.o.o.`_ is useless!

AAAARRRRRGHHHHHH

So herewith is the workaround field, unadulterated and (hopefully!) useful if you ever find yourself in this situation:

ALWAYS exclude your root fibre-channel controller from the mpxio-disabled list. A modification to the /kernel/drv/fp.conf file is required. Determine your bootpath and then make the appropriate modifications:

# /usr/sbin/eeprom boot-path
bootpath=/pci@1d,0/pci1022,7450@4/pci1077,132@1/fp@0,0/sd@w266000c0ffe92245,8:a

!! we want the piece from /pci@1d,0 up to, but not including, /fp@0,0

# cat >> /kernel/drv/fp.conf
name="fp" parent="/pci@1d,0/pci1022,7450@4/pci1077,132@1" port=0 mpxio-disable="no";
^D

!! hit control-D here

# /sbin/bootadm update-archive

If you find yourself in the situation where your system has not come back up and is stuck trying to fsck /, then login as root when prompted and run:

# mount | grep "/ on"
/ on /pci@1d,0/pci1022,7450@4/pci1077,132@1/fp@0,0/sd@w266000c0ffe92245,8:a read/write/setuid/devices/dev=780240
# mount -o remount,rw,logging /devices/pci@1d,0/pci1022,7450@4/pci1077,132@1/fp@0,0/disk@w266000c0ffe92245,8:a /

Note the change from "sd" to "disk" - this is due to the way that fibre-channel luns are presented by the device tree on the x86 architecture.
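The two path rewrites in the workaround are plain string edits, so they can be sketched as a couple of shell functions. This is a minimal sketch only: the function names (fc_parent, fp_conf_entry, devices_path) are made up for illustration, the port number is passed in by hand (defaulting to 0, as in the example above) rather than derived from anything, and nothing here reads or writes fp.conf.

```shell
#!/bin/sh
# Sketch only: pure string handling, nothing here touches the system.
# All function names are hypothetical helpers, not part of Solaris.

# Print the controller path: the bootpath with any leading "bootpath="
# removed and everything from "/fp@" onwards cut off.
fc_parent() {
    printf '%s\n' "$1" | sed -e 's|^bootpath=||' -e 's|/fp@.*$||'
}

# Print an fp.conf line that keeps mpxio enabled for that controller.
# $2 is the port number; it defaults to 0 (an assumption, check your HBA).
fp_conf_entry() {
    printf 'name="fp" parent="%s" port=%s mpxio-disable="no";\n' \
        "$(fc_parent "$1")" "${2:-0}"
}

# Turn the device shown by mount into the /devices path used for the
# remount: prefix with /devices and change the "sd" node to "disk".
devices_path() {
    printf '/devices%s\n' "$1" | sed -e 's|/sd@|/disk@|'
}
```

Feeding the bootpath line from eeprom to fp_conf_entry prints the line to append to /kernel/drv/fp.conf, and devices_path applied to the device reported by mount prints the argument for the remount. Check both by eye before using them.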

Now I need to log a bug against `b.o.o.`_ itself. grrrrrrrr

.. _6539777 Cannot disable mpxio on x86/x64 platforms without serious pain: http://bugs.opensolaris.org/view_bug.do?bug_id=6539777
.. _b.o.o.: http://bugs.opensolaris.org
.. _6539612 Run “stmsboot -d” on x86 platform will cause system boot failure: http://bugs.opensolaris.org/view_bug.do?bug_id=6539612





Congratulations to the new OpenSolaris Governing Board members

The polls have now closed in the OGB election for 2007, and I would like to congratulate our new OGB Overlords :-)

James D. Carlson, Alan Coopersmith, Casper Dik, Glynn Foster, Stephen Lau, Rich Teer and Keith M. Wesolowski.

I’d like to be amongst that group – perhaps next year?

Thank you to the previous members of the OGB, too – we know you’ve put in a lot of effort to make things work.

Good luck guys, you’ve got a lot of work to do and don’t forget that there are plenty of people who are willing to help if you ask.

Of just as much if not more importance was the question about ratifying our Constitution, which passed.

Technorati tags: OpenSolaris, OpenSolaris Governing Board, OGB




I’ve now got a 64bit nVidia driver with Xorg on snv_60

When I upgraded to snv_60 I noticed that the X server wouldn’t start. This is due to there being no 64bit nVidia driver integrated into that build. Alan Coopersmith is working with nVidia to resolve it, but in the meantime somebody on #opensolaris pointed me to http://www.nvidia.com/object/solaris_display_1.0-9755.html which has the 64bit driver ready for use.

The workaround for running Xorg in 32bit mode is

# svccfg -s x11-server setprop options/server=/usr/X11/bin/i386/Xorg

And if you want to run Xorg in 64bit mode, use

# svccfg -s x11-server setprop options/server=/usr/X11/bin/amd64/Xorg

Don’t forget to re-run

# /sbin/bootadm update-archive ; reboot

after you’ve installed the new package. Once you’ve rebooted and logged in again, you can make use of the nVidia control panel:

nVidia control panel
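The two svccfg commands above differ only in the ISA directory, so the choice can be sketched as a small shell helper. This is only a sketch: the function names are hypothetical, and the helper prints the command instead of running it so that it can be checked first.

```shell
#!/bin/sh
# Sketch only: function names are made up for illustration.

# Print the Xorg binary path for an ISA name ("i386" or "amd64").
xorg_server_path() {
    case "$1" in
        i386|amd64) printf '/usr/X11/bin/%s/Xorg\n' "$1" ;;
        *) echo "unsupported ISA: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the svccfg command that selects that binary.
xorg_svccfg_cmd() {
    path=$(xorg_server_path "$1") || return 1
    printf 'svccfg -s x11-server setprop options/server=%s\n' "$path"
}
```

On a Solaris box, something like xorg_svccfg_cmd "$(isainfo -k)" should pick the variant matching the running kernel, assuming isainfo reports either i386 or amd64 there.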



An in-your-face lesson about power draw

On Tuesday I got my act together, wandered down to Dick Smith Electronics at North Sydney and purchased the internal drive power-splitter cable that I’d been meaning to get for weeks. This was all part of the grand plan to install another 2 SATA disks inside my Ultra20-M2… along with the 4 36Gb scsi disks I’ve got attached in a multipack.

Great idea, but with insufficient regard for the limitations of my hardware.

First off, the sata cables that I used were “standard” pc cables, so their plug length was about 2x the plug length that Sun uses when building these boxes. Not a problem on the motherboard end, but definitely a problem when you want to close the case if you don’t rotate your additional disks by 90 degrees.

Secondly, current and power draw. The Ultra20 and Ultra20-M2 come with a 400W psu, which as far as I’m aware is plentiful enough to run the box with 2 disks and each PCI and PCI-Express slot filled, but not if you want to add extra disks. That’s what I hadn’t bothered to think about. My disk0 and disk1 are 320Gb Seagate ST3320620AS SATA disks which run just fine. The two disks I added were a 200Gb Seagate ST3200822AS and a 300Gb Maxtor 6V300F0.

Earlier this evening I noticed that a lot of processes were starting to hang – firefox, thunderbird, gaim, xchat … apache, postgres, tomcat …. and shortly thereafter everything decided to not respond. I was able to run a reboot -dq though, so I got some data.

On getting back to the grub screen, I received the worrying message that the system thought I had no slice 0 on my boot disk. Eeeeek! Power-off and power-on … boot up …. login …. hang

Hard hang. No chance of using F1+A to get out of this. Power-off it was.

While I’d been waiting to see whether the hang was really hard I did a bit of thinking. What was the last change I made to the system? [added new disks internally] Was it disk-related? [yes] Am I an idiot? [quite possibly]

Power-off, unscrew the case, remove the extra disks and re-set the original drive power cable, power-on, boot up with no problems whatsoever.

I figure I’m now on the lookout for an appropriate hba with external connections along with an external enclosure to house the drives. Could be a while. In the meantime I think I’ve learnt my lesson.




Perhaps I jumped to the wrong conclusion

Yesterday I wrote an entry `wherein I blamed power draw`_ for causing the hard hangs I’ve experienced since I LU’d to snv_60. After a bit more downtime overnight and serious worrying about the safety of my data (photos…. I can’t lose them!) I’ve done a bit more analysis and come up with this explanation: it’s a dodgy disk.

Not my preferred explanation, but one which fits the evidence better than the power draw theory.

After I removed the two extra disks, I still had the problem. Ergo, the power draw theory is unlikely to be the cause.

I tried copying files off my `camera’s`_ CF card into my photo storage area three times. Each time I did, the cp process and then every other process making use of somewhere under /scratch would hang, with a stack trace like this:

> fffffffed2910340::print proc_t p_tlist|::findstack -v
stack pointer for thread fffffffed0ca5280: ffffff0005673b60
[ ffffff0005673b60 _resume_from_idle+0xf8() ]
ffffff0005673ba0 swtch+0x17f()
ffffff0005673bd0 cv_wait+0x61(fffffffec7dd2b16, fffffffec7dd2ad0)
ffffff0005673c20 txg_wait_open+0x7f(fffffffec7dd2a00, 2a3968)
ffffff0005673c60 dmu_tx_wait+0x92(ffffffff23b55800)
ffffff0005673d60 zfs_write+0x2de(fffffffef1bc9680, ffffff0005673e20, 0, fffffffee7401e88, 0)
ffffff0005673dd0 fop_write+0x3f(fffffffef1bc9680, ffffff0005673e20, 0, fffffffee7401e88, 0)
ffffff0005673e90 write+0x2ad(4, fe400000, 800000)
ffffff0005673ec0 write32+0x1e(4, fe400000, 800000)
ffffff0005673f10 sys_syscall32+0x101()

After managing to pull my head out and think about this for a moment, I realised that the problem had not occurred before I LU’d to snv_60. When I did the LU, I activated my alternate BE (on my second disk) and made it the logical lefthand side of the mirror. Whenever I hit a specific part of the filesystem in multiuser/64bit mode, all IOs would hang.

A failsafe boot to 32bit followed by a zpool scrub didn’t find anything wrong with the pool or its filesystems, but when I rebooted again I saw the dreaded GRUB error message that it couldn’t find my root partition. Another failsafe boot and “format…label” later and once more, a hard hang while doing heavy IO to and from the pool.

I removed the disk, attached the jumper to the end of it which forces 1.5Gbps (SATA-I) speeds, re-inserted and rebooted. I’ve now been up and running for nearly 45 minutes and doing some fairly heavy IO …. looks ok for the moment.

I’m more confident that I’ve nailed the source of the problem, but we’ll just have to wait and see.

Update: Another possibility, given that I’ve stumbled across 6536905 biosdev 1.4,1.5 changes render SATA disks under old framework invisible to LU, is that there’s a bios bug which prevents the second onboard SATA channel from operating at full SATA-II speeds. Not exactly sure how I’m going to investigate this idea though.

.. _camera’s: http://www.jmcpdotcom.com/roller/jmcp/entry/why_are_black_eos400d_bodies
.. _wherein I blamed power draw: http://www.jmcpdotcom.com/roller/jmcp/entry/20070325





Weirdness in Sovietistan

While browsing my blog’s referrers list today, I saw hits from Planet SLUG, which features James Purser, who recently interviewed me on an OpenSolaris Round Table.

That led me here, which has photos of Soviet-era bus shelters. Weird and wonderful all at the same time. For a nation that was so top-down driven and controlled, the variety of designs and styles is amazing.

Technorati tags: Soviet Union, Bus shelter, SLUG, Open Source On The Air, k-sit.com




There’s a bug in my Ultra20-M2 bios :(

Today I spent a bit of time making a live-upgrade from snv_57 to snv_60 work. I’m not sure that I did things quite the right way, but ….

I had to manually pkgadd the PatchPro packages to my alternate BE (SUNWppror SUNWpprou SUNWppro-plugin-sunos-base) before LU would allow me to continue. This is not what I wanted, because I don’t use PatchPro at all. I deliberately uninstalled it as soon as I possibly could.

The alternate BE seemed to still have StarOffice8 installed, so of course LU upgraded that too. Again, I don’t want StarOffice8, I want OpenOffice.org 2.x instead.

My non-global zones (which have their zonepaths on zfs) were copied to the new BE’s / so on reboot I had to move the contents out of the way before zfs mount -a would succeed.

And, most annoying of all, /sbin/biosdev stumbled across a bug (6536905 biosdev 1.4,1.5 changes render SATA disks under old framework invisible to LU, which is not on b.o.o and will most probably have its synopsis changed) which meant that /sbin/biosdev couldn’t tell LU what a valid bios-registered boot device was.

After running /sbin/biosdev -d and having a chat with the RE for the bug, I came up with the following hackaround – replace /sbin/biosdev with a shell script which outputs the correct information. In my case, with a dual-channel glm card installed and 6 scsi disks attached to it along with 4 SATA disks hanging off the motherboard, I need this:

#!/bin/sh
echo "0x80 /pci@0,0/pci-ide@5/ide@0/cmdk@0,0"
echo "0x81 /pci@0,0/pci10de,370@6/pci1000,1000@9/sd@2,0"
echo "0x82 /pci@0,0/pci10de,370@6/pci1000,1000@9/sd@4,0"
echo "0x83 /pci@0,0/pci10de,370@6/pci1000,1000@9/sd@5,0"
echo "0x84 /pci@0,0/pci10de,370@6/pci1000,1000@9/sd@1,0"
echo "0x85 /pci@0,0/pci-ide@5/ide@1/cmdk@0,0"
echo "0x86 /pci@0,0/pci-ide@5,1/ide@0/cmdk@0,0"
echo "0x87 /pci@0,0/pci-ide@5,1/ide@1/cmdk@0,0"
exit 0
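If the list of devices changes, the echo lines above could be generated rather than typed by hand. The following is only a sketch under a couple of assumptions: the function name biosdev_lines is made up, the device paths are fed in one per line in BIOS boot order, and numbering starts at 0x80 as in the script above.

```shell
#!/bin/sh
# Sketch only: biosdev_lines is a hypothetical helper, not a Solaris tool.
# Reads device paths on stdin, one per line, in BIOS boot order, and
# prints them as "0xNN /device/path", numbering upwards from 0x80.
biosdev_lines() {
    n=128                            # 128 decimal is 0x80
    while IFS= read -r path; do
        [ -n "$path" ] || continue   # skip blank lines
        printf '0x%x %s\n' "$n" "$path"
        n=$((n + 1))
    done
}
```

For example, piping a file holding the eight device paths above through biosdev_lines should reproduce the output of the eight echo commands. The order of the input still has to come from what the bios actually reports.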

Technorati tags: Solaris, OpenSolaris, LiveUpgrade, biosdev, bios bug, Sun Ultra20 M2




Are you going to exercise your right to vote?

We’re in the middle of the election period for the OpenSolaris Governing Board, and as of now, 63 core contributors have bothered to vote.

Stephen Hahn has sent polite emails, and Gman has begged us all to think of the kittens.

If you are eligible to vote in this poll, please get your ssh session to poll.opensolaris.org going and vote. This poll matters. Really matters.

Don’t disenfranchise yourself by failing to vote.

Technorati tags: OpenSolaris, OpenSolaris Governing Board, OGB, Vote, Enfranchise




Experimenting with Macro – is this it?

We’re really lucky in our current abode to not only have a beautiful Frangipani tree, but to also have some Bird of Paradise (Strelitzia) flowers in the front yard. From time to time I wander out onto the porch and muck around with taking photos of them, working on exposure and aperture settings.

I like to think that I’ve collected some macro shots in my Training Photos album:

Frangipani (1): http://www.jmcpdotcom.com/gallery2/main.php/v/jmcp/Photo_Training/20070112_161410__MG_2221.jpg.html

Frangipani (2): http://www.jmcpdotcom.com/gallery2/main.php/v/jmcp/Photo_Training/20070317_154123__MG_3064.jpg.html

Bird of Paradise (1): http://www.jmcpdotcom.com/gallery2/main.php/v/jmcp/Photo_Training/20070316_174728__MG_2987_c.jpg.html

Bird of Paradise (2): http://www.jmcpdotcom.com/gallery2/main.php/v/jmcp/Photo_Training/20070317_153710__MG_3052.jpg.html

What do you think? Are these photos truly macro or am I misunderstanding the definition? If you don’t think I’m getting the concept, please leave a comment so I can be enlightened.

While I’m thinking about it, I discovered this evening that ImageMagick (a tool which I’ve been using for an awfully long time) can convert Adobe Photoshop PSD files directly to JPEGs. I’m using Photoshop because it has integrated support for the Canon RAW format and while The Gimp can make use of the DC-Raw utility…. it just doesn’t feel quite as seamless and easy to use.

So, Adobe – if you’re listening to your customers – I would happily pay you the full retail price for Photoshop Creative Suite (2, 3 or later) if I could have it run natively on Solaris/x64. I doubt that I am the only person who would front up with the dough either.

Technorati tags: Macro Photography, Canon, EOS 400D, Frangipani, Strelitzia




Requirements – they’re not always expressed well

As part of my BEng(CSE) degree @ UTS, I’m doing some formal study in software engineering. I know, I know … it’s been a long time coming. At least there’s no coding required for this subject, so I can concentrate on that for work instead :-)

We got our first assignment late last week and have to translate a set of customer requirements into a Requirements Specification document. Which is all well and good except that some of them aren’t really expressed very well. I’m actually really glad to be doing this subject now, having spent time as a sysadmin, a troubleshooter, a systems architect and a kernel developer, because I can bring all those years of experience to bear in looking at the questions and assessing their suitability for inclusion in a formal requirements document.

I do have to be careful that I don’t drown out my other group members with “I’ve seen that fail before, let’s not try it now” or similarly negative and smothering comments. It’s not just me doing this assignment and I do want to maximise the result we get for it. If I take over, though, the other guys won’t learn as much as they probably should. It’s going to be an interesting semester.