Tuesday, December 29, 2009

Hardware FakeRAID + LVM == A Big Mess

Every time I install a new distribution of Linux on newer hardware, I wonder just what new "features" will start brewing together in a little digital tempest until the universe of malicious luck senses that ideal, worst possible moment and twists the software around the hardware in a big convoluted knot with glue poured on it.

This time it was CentOS 5.x, a VIA brand hardware RAID controller, and the Linux Volume Manager. Not knowing much about how Linux interacts with hardware RAID, I just set everything up at install time using the installer's suggested defaults. What could possibly go wrong? Doh! Never ask that question, even silently in your thoughts. You're just asking to get smacked in the head with the answer.

Then it came, while I was out of town and was depending on this machine to grant me access to the rest of the computers in the house via ssh tunnels. RAID1 mirrored volumes got out of sync, the kernel exhibited its most endearing neurosis and reported its panic, leaving me locked out until I could get home and alleviate the poor kernel's anxiety attack. It often isn't a very good idea to tell a kernel that everything's okay until you have fixed something, so at least you can believe that you're telling it the truth. So, on to the harder part of fixing something: figuring out what needs to be fixed.

In the days that followed, I managed to disable one of the drives in the mirrored pair and get things running again without the benefit of RAID. However, in the process, the Linux Volume Manager came to believe that since there were two physical volumes with the same UUID, it should mount and use one, and hide the other one. LVM hid the /dev/sdb2 partition in a secret house of mirrors in order to thwart every one of my attempts to obtain access to it, no matter how clever. Looking back, I wonder if it could have anticipated that I'd open the case and boot up with only one drive's SATA cable plugged in, but I somehow managed to do that before LVM decided to delete all my data out of spite. The plan was to use pvchange --uuid on one of the drives so that LVM would know the difference, but that doesn't work while the drive is in use. Rebooting in rescue mode from a CD/DVD (the CentOS 5.1 install disc) allowed the drive to be left unmounted, and allowed the pvchange --uuid command to do its thing.

After many unfruitful searches on Google with every permutation of command names, error message text fragments, and configuration file names I could muster, I tripped over a description of something called fake-RAID. All the symptoms of what happened to my Linux machine seemed to be telling me that that's the disease it had contracted. LVM reported a duplicate volume. The BIOS RAID1 mirrored pair still showed up to the OS as two individual devices. The rear fan made a sneezing noise. Okay, that last one probably wasn't related to the fake-RAID thing. The same article that described fake-RAID recommended against using hardware RAID on a disk controller that works that way because the Linux drivers are an afterthought from the vendor, at best, and most likely guesswork from the Linux kernel developers. So: reboot, enter BIOS RAID utility, delete RAID1 array, verify that POST shows 2 standalone drives in normal non-RAID SATA mode, done.

I have to say it was hard to suppress the feeling of accomplishment that came from making the hardware RAID controller do nothing instead of doing half of something, but my euphoria was short lived. I still had a river crossing ahead of me on the slippery stones that are Linux configuration files, and I didn't want to cross without at least the illusion of a safety rope. The rope couldn't be a worthy illusion if it were tied to a log floating in the river where the hazards lay so I plugged in an external USB hard drive and used the dump command to make my "rope," which took about 8 hours at 5MB/sec to transfer the 144GB root filesystem that I would soon attempt to destroy, multiple times.
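
That 8 hour figure is easy to sanity check before committing to a long dump. Here is a minimal sketch, assuming decimal units (1 GB = 1000 MB) and a steady transfer rate. The estimate_hours name is a hypothetical helper, not a standard command.

```shell
# Estimate the whole hours needed to copy SIZE_GB gigabytes at RATE_MB
# megabytes per second. Assumes 1 GB = 1000 MB and rounds up to the next hour.
estimate_hours() {
    size_gb=$1
    rate_mb=$2
    seconds=$(( size_gb * 1000 / rate_mb ))
    echo $(( (seconds + 3599) / 3600 ))
}

estimate_hours 144 5   # prints 8
```

Real throughput over USB tends to wander, so treat the result as a lower bound on patience required.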

With the backup made, it was time to shift data, reconfigure, shift data again, and reconfigure again. It would take time and hassle if I slipped from one of these stones but my safety rope gave me the confidence to step and leap towards the far bank.
  1. fdisk -l -- to show the partitions and types at the start. There were 2 drives, with 2 partitions each. /dev/sda1 and /dev/sdb1 were the previously mirrored /boot partitions, so for now, they'll be left alone. /dev/sda2 and /dev/sdb2 were the previously mirrored PV partitions mapped to /VolGroup00/LogVol00 (root) and LogVol01 (swap). Only /dev/sda2 was now active as an LVM physical volume (PV). /dev/sdb2 was still a "Linux LVM" partition type, but was no longer mounted or included in the LVM Volume Group VolGroup00.
  2. fdisk /dev/sdb -- to change the type id for the larger partition (i.e. sdb2) to 'fd' ("Linux raid autodetect"). The fdisk inputs are: t, 2, fd, w -- to change the type id, on partition 2, to 'fd', and then write.
  3. mdadm --zero-superblock /dev/sdb2 -- Since this wasn't part of a Linux software RAID array before, this step wasn't really necessary, but it's handy to know that the superblock is how Linux software RAID recognizes that a physical partition is part of a defined array. Zeroing the superblock is essentially the permanent removal of the partition from any defined array, so it avoids a warning message that would appear if the partition is specified as one of the devices when creating a new array.
  4. mdadm --create /dev/md0 --verbose --level=1 --raid-devices=2 missing /dev/sdb2 -- This creates a new array with a device name of /dev/md0 that is a mirrored pair and specifies that one of the devices will be added later ('missing') and the other device should be /dev/sdb2 (which just had its superblock zeroed so it could be added to this array without warnings).
  5. pvcreate /dev/md0 -- This tells LVM that it can use /dev/md0 as a physical volume (PV) which can be added to a volume group. Note: LVM manages the space provided in a PV as a set of "extents" so there is no need to create a file system like ext3 on this or the underlying /dev/md0.
  6. vgdisplay -- Show information about the LVM volume groups on the system. This allows verification of the VolGroup00 name to which the new 1/2 mirror array will be added.
  7. vgextend -v VolGroup00 /dev/md0 -- This adds the PV /dev/md0 (an LVM physical volume which is referenced by its physical device name) to the volume group named VolGroup00. Now any data that is on a logical volume in the VolGroup00 volume group can be moved to the extents that are available on the /dev/md0 PV.
  8. pvdisplay -- Shows that there are now 2 physical volumes (PVs), /dev/sda2 and /dev/md0, that are both allocated to the VolGroup00 volume group.
  9. pvmove -v /dev/sda2 /dev/md0 -- This gives LVM instructions to migrate the logical volumes (e.g. LogVol00 and LogVol01) from the PV where all their extents are currently stored (/dev/sda2) over to the new PV (/dev/md0) Note: This may run for a while but the -v switch makes it report progress until it's done.
  10. pvdisplay -- Shows the details of the two LVM PVs in order to verify that /dev/md0 has all of its PEs (extents) allocated now (Free PE = 0), and /dev/sda2 has all of its PEs free (Allocated PE = 0).
  11. vgreduce VolGroup00 /dev/sda2 -- This removes the PV (/dev/sda2) from the LVM volume group VolGroup00. Until this is done, the /dev/sda2 OS device cannot be reclaimed from LVM control.
  12. pvremove /dev/sda2 -- Since the /dev/sda2 PV is now no longer allocated to any volume group, this actually removes the corresponding /dev/sda2 partition from LVM control and frees it for use outside LVM
  13. fdisk /dev/sda -- to change the type id for the now freed partition (sda2) to 'fd' ("Linux raid autodetect"). The fdisk inputs are: t, 2, fd, w -- to change the type id, on partition 2, to 'fd', and then write.
  14. mdadm --add /dev/md0 /dev/sda2 -- This completes the 2 volume RAID1 mirrored pair named /dev/md0 by adding back the partition (/dev/sda2) from which data was just migrated onto the array via pvmove. Now mdadm will automatically sync the data that was in the array (physically stored on /dev/sdb2) back over to the device that was added to the array (/dev/sda2).
  15. watch cat /proc/mdstat -- Monitors the sync up progress. This will probably take just about as long as pvmove took to migrate the data to the /dev/md0 PV.
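
If you'd rather have a yes/no answer than stare at the progress bar, the status field in /proc/mdstat can be checked mechanically. This is a minimal sketch, assuming the usual mdstat layout in which a mirror shows [UU] when both members are present and an underscore (e.g. [_U]) when one is missing or still being rebuilt. The md_health name is hypothetical.

```shell
# Report whether any md array in mdstat-formatted text is running degraded.
# Reads the text on stdin, so it works on /proc/mdstat or on a saved copy.
# Only bracketed fields made up purely of U and _ are considered, which
# skips the member counts like [2/1] and the recovery progress bar.
md_health() {
    if grep -Eq '\[[U_]*_[U_]*\]'; then
        echo degraded
    else
        echo healthy
    fi
}

# Typical use on the live system:
#   md_health < /proc/mdstat
```

It's worth waiting for "healthy" before rebooting, since a reboot in the middle of the sync restarts or at least interrupts the rebuild.
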
A few VERY Important Final Steps
The question that kept bugging me as I did all this was, "Will the boot process work correctly after these changes next time I reboot?" The answer to that required a few trips back to rescue mode booted from the CentOS install DVD. I suspect I would not have had to make those trips if I had known the following things.
  1. It appeared that the configuration file for mdadm (/etc/mdadm.conf in CentOS 5.x) needed to be correctly configured to reference the mount points and device uuid of the /dev/md0 array. I was still a little fuzzy on how this config file could affect things at boot time. After all, the file is stored in the filesystem that is managed by LVM in a Logical Volume that in turn stores its data as extents in a physical volume (PV) that IS the software RAID array. I couldn't find anything in the /boot/grub setup that defined anything about the raid array or LVM, so how could /etc/mdadm.conf affect the initialization of the software RAID array at boot time, BEFORE the array or LVM is even activated? It doesn't. The boot process happens in stages. The stage where the boot and root filesystems are re-mounted as read/write is where /etc/mdadm.conf matters. So it must be correct. These commands will update it to reflect the actual, active Software RAID arrays.
    mv /etc/mdadm.conf /etc/mdadm.conf.old
    echo "DEVICE partitions" > /etc/mdadm.conf
    echo "MAILADDR root" >> /etc/mdadm.conf
    mdadm --examine --scan >> /etc/mdadm.conf

  2. The init ram drive (initrd) boot image MUST be updated to include support for the software RAID (mdadm/dm-*) modules, so one of the things that I needed to do was build another initrd.?????.img file in the /boot directory (saving the old one just in case). The reason nothing in the grub config files offers a clue to why booting fails is that the relevant stuff is actually embedded within the init ram drive file /boot/initrd-{version}.img. Building a new initrd can be done using mkinitrd to generate a new init script that starts up Software RAID (md) and LVM based on the current, active system configuration. The generated init script embedded inside the initrd-{version}.img file will then have a section like this that bootstraps Software RAID and LVM enough to continue the boot process using resources on a root filesystem that is accessible only with RAID and LVM active:
    echo Scanning and configuring dmraid supported devices

    raidautorun /dev/md0
    echo Scanning logical volumes
    lvm vgscan --ignorelockingfailure
    echo Activating logical volumes
    lvm vgchange -ay --ignorelockingfailure VolGroup00
    resume /dev/VolGroup00/LogVol01
    echo Creating root device.
    mkrootdev -t ext3 -o defaults,ro /dev/VolGroup00/LogVol00
    echo Mounting root filesystem.
    mount /sysroot
    mkinitrd builds that script from the running system: it detects the RAID devices (/dev/md*) and the LVM configuration (/dev/mapper/*), includes any required boot-time kernel modules, and generates a workable init script. So, with Software RAID and LVM both up and running, here's what I should have done BEFORE trying to reboot...
    mv /boot/initrd-$(uname -r).img /boot/initrd-$(uname -r).img.bak
    mkinitrd -v /boot/initrd-$(uname -r).img $(uname -r)
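
Both of those final steps can be double-checked before the reboot instead of after it. The following is a minimal sketch, not a definitive test: the helper names are hypothetical, and the module names raid1 and dm-mod are my assumption about what a RAID1 + LVM root needs in the image. Both helpers read text on stdin so they can be tried against saved copies first.

```shell
# Succeeds if the mdadm.conf text on stdin has an ARRAY line for the device
# given as the first argument (e.g. /dev/md0).
conf_has_array() {
    grep -Eq "^ARRAY[[:space:]]+$1([[:space:]]|\$)"
}

# Succeeds if the initrd file listing on stdin names every module passed as
# an argument, e.g. "raid1" matches a listed file such as lib/raid1.ko.
listing_has_modules() {
    listing=$(cat)
    for module in "$@"; do
        printf '%s\n' "$listing" | grep -q "/${module}\.ko\$" || return 1
    done
}

# Typical use on the live system (the image is a gzipped cpio archive):
#   conf_has_array /dev/md0 < /etc/mdadm.conf && echo "mdadm.conf OK"
#   zcat /boot/initrd-$(uname -r).img | cpio -it | listing_has_modules raid1 dm-mod \
#       && echo "initrd OK"
```

A passing check only says the pieces are present. It doesn't prove the init script inside the image references the right devices, so keep the .bak copy around.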

This blog post is partially a way to capture some of the solution and share it with someone else who is perplexed by the same issue, and the remainder of it is me taking an opportunity to write something on this blog that isn't totally useless. I hope you have enjoyed it as much as I have.

Addendum on Fixing initrd from a Rescue Environment
As you may have gathered, I tried to reboot before I had initrd-{version}.img updated to reflect the correct boot-time system configuration. I'm sure someone knows a faster, easier way to get initrd fixed from a rescue CD/DVD boot environment, but as is often the case, trudging through it helped me understand better how things work. The process for manually repairing this part of the boot environment consisted of mounting the boot partition, unpacking the initrd file, adding missing files, editing the init script, and repacking the initrd file for the next reboot. If you're in a similar spot, for this or some other reason, the following steps and references may help.

Manually repairing the init script and/or contents in the initrd (init ram drive) image.
REFERENCE: http://www.ibm.com/developerworks/linux/library/l-initrd.html
REFERENCE: http://wiki.openvz.org/Modifying_initrd_image
  1. boot in rescue mode from the install CD/DVD but skip mounting the root filesystem
  2. mount the boot partition
    mkdir /boot
    mount /dev/sda1 /boot
  3. Make a copy of the existing initrd-{version}.img file with a .gz extension and uncompress it
    cd /boot
    cp initrd-2.6.18-53.1.6.el5.img initrd_for_unpack.img.gz
    gunzip initrd_for_unpack.img.gz
  4. Create a temp directory and unpack the initrd.img contents into it
    mkdir initrd_unpacked
    cd initrd_unpacked
    cpio -i --make-directories < ../initrd_for_unpack.img
  5. Find and fix what's wrong with the init script or initrd contents (e.g. add missing modules, correct device name references in the init script, etc.) For example:
    mkdir /mnt/sysimage
    mount {actual-path-to-root-fs} /mnt/sysimage
    cp /mnt/sysimage/lib/modules/{vers}/kernel/drivers/md/dm-zero.ko lib/
    cp /mnt/sysimage/lib/modules/{vers}/kernel/drivers/md/dm-mod.ko lib/
    cp /mnt/sysimage/lib/modules/{vers}/kernel/drivers/md/dm-mirror.ko lib/
    cp /mnt/sysimage/lib/modules/{vers}/kernel/drivers/md/dm-snapshot.ko lib/
    vi init
  6. Re-pack a new initrd.img file and compress it
    find . | cpio -H newc -o > ../initrd_fixed.img
    cd ..
    gzip initrd_fixed.img
  7. Replace the existing initrd file with the fixed one.
    mv initrd-2.6.18-53.1.6.el5.img initrd-2.6.18-53.1.6.el5.img.bak
    mv initrd_fixed.img.gz initrd-2.6.18-53.1.6.el5.img
  8. Unmount whatever was manually mounted and reboot
    cd /
    umount /boot
    umount /mnt/sysimage
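
Before that final unmount and reboot, it's cheap to confirm that the replacement image at least decompresses cleanly. A minimal sketch, assuming gzip is available in the rescue shell (it was used in the steps above). The initrd_is_gzip name is hypothetical.

```shell
# Succeeds only if the file exists and is a readable gzip stream.
# This does not prove the cpio payload or the init script inside is correct.
initrd_is_gzip() {
    [ -f "$1" ] && gzip -t "$1" 2>/dev/null
}

# Typical use from the rescue shell, while /boot is still mounted:
#   initrd_is_gzip /boot/initrd-2.6.18-53.1.6.el5.img && echo "compressed image looks intact"
```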

Saturday, December 19, 2009

Tech Note: Sendmail MASQUERADE_AS

Some email service providers, including the one I use (1and1.com), have recently clamped down on spam by refusing email with an invalid sender domain in the header or the from-address. While I'm glad to see less garbage in my inbox, I had to dive into the Linux / sendmail configuration rabbit hole to get simple notification emails to show up again. The issue is that only the actual registered domain name is likely to be considered valid by these clamped-down email recipients' hosts. In other words, an email from backupsystem@machineabc.myregistereddomain.com will be refused, but backupsystem@myregistereddomain.com would work just fine.

So, after digging through web search results and documentation for a while, I found that sendmail (the service on Linux/Unix/*nix that forwards email messages) can be configured to "masquerade" local email addresses so that they appear to be sent from the base registered domain. So, according to the documentation at http://www.sendmail.org/m4/masquerading.html I set the following configuration options in /etc/mail/sendmail.mc
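
A typical set of masquerading options looks something like the sketch below. These are standard sendmail m4 macros, but treat the selection as an assumption and not necessarily the exact lines used on this machine. The domain is the placeholder used throughout this note, and the snippet only writes to a scratch file for review; it does not touch /etc/mail/sendmail.mc.

```shell
# Write a hypothetical set of masquerading options to a scratch file.
# Assumption: which of these macros a given setup needs will vary.
scratch=${TMPDIR:-/tmp}/masquerade-options.mc
cat > "$scratch" <<'EOF'
MASQUERADE_AS(`myregistereddomain.com')dnl
FEATURE(`masquerade_envelope')dnl
FEATURE(`masquerade_entire_domain')dnl
MASQUERADE_DOMAIN(`myregistereddomain.com')dnl
EOF
cat "$scratch"
```

Note the m4 quoting: each argument opens with a backtick and closes with a straight single quote, which is why the heredoc delimiter is quoted to keep the shell from interpreting them.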


Then, a few commands to activate the configuration...
m4 /etc/mail/sendmail.mc > /etc/mail/sendmail.cf
service sendmail restart

Of course this all has to be done as the root account since the modifications are being made to system configuration files. So, the next step was to send a simple test email to see if it worked, but here's where I stepped on one of those semi-hidden (i.e. inconspicuously documented) Linux software configuration land mines. Still working as the root account, I used the mail command to send a message to a GMail account so I could examine the newly masqueraded headers, from-address, etc.
mail -s "Test Sendmail Masquerading" mytestaccount@gmail.com
type in some text for the message body
type ctrl-d to finish the body, close and send

Note: As of December 2009, GMail still accepts email even if it can't resolve the sender domain, and the GMail message view has a "Show Original" option that displays the original, raw email text including headers.

The Linux aggravation meter went up a notch when I noticed that the headers in the email had NOT been masqueraded as advertised. They were all passed through in the form of root@machineabc.myregistereddomain.com. The land mine I mentioned was found in the generated /etc/mail/sendmail.cf file as the following few lines:
# class E: names that should be exposed as from this host, even if we masquerade

That means the root account is excluded from the masquerading rules. This makes sense because most of the email originated using the root account would probably pertain to the specific machine and it would usually be helpful if the machine name were included in the sender header or from-address. However, the recently clamped-down email recipients' hosts simply refuse to accept such emails. The documentation does mention that the root account will always be "exposed" but I hadn't noticed that before. Grrrr.

If you're not a sendmail administration expert, it may not be that obvious how to include the root account in the masquerading. Notification emails delivered to a clamped-down email host that originate from a non-root user on the Linux machine work fine. To confirm this, enter the same mail command as another user on the machine; it does exactly what the configuration options would lead you to expect. If you want to keep the root account from being excluded from the masquerading, you need to GET RID OF a configuration option, not add another one. Elsewhere in the sendmail.mc file, there is a line that explicitly excludes root as an EXPOSED USER account:

EXPOSED_USER(`root')dnl

Comment it out by changing it to:

dnl # EXPOSED_USER(`root')dnl

And remember to run m4 to regenerate the sendmail.cf file and restart the sendmail service again as described above. Now emails sent from the processes that run as the root account should also get the masquerade treatment.
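
If you'd rather not hand-edit the file, the same change can be made with sed. This is a minimal sketch, assuming the line appears exactly as EXPOSED_USER(`root')dnl at the start of a line. The unexpose_root name is hypothetical, and it filters stdin to stdout so you can inspect the result before overwriting anything.

```shell
# Comment out EXPOSED_USER(`root') in sendmail.mc text read from stdin.
# The match is anchored to the start of the line, so a line that is
# already commented out with dnl passes through unchanged.
unexpose_root() {
    sed "s/^EXPOSED_USER(\`root')dnl/dnl # &/"
}

# Typical use, keeping a backup of the original file:
#   cp /etc/mail/sendmail.mc /etc/mail/sendmail.mc.bak
#   unexpose_root < /etc/mail/sendmail.mc.bak > /etc/mail/sendmail.mc
```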

The other option...
...if you have the ability to manage subdomains, and you only have one or a few machines for which you need the emails to be delivered, then you can add a subdomain to DNS for each of the machines. For instance, if you had machineabc set up in domain myregistereddomain.com, you could add a subdomain to your DNS configuration for machineabc.myregistereddomain.com. Then the clamped-down email servers would look up the sender domain for an email from root@machineabc.myregistereddomain.com, find it, and accept the email as usual.

I'm posting this to save someone (maybe even myself) a little time later on. I searched for a while to find out why the masquerade options for sendmail were just not working. As usual, they were working; I was experiencing the effects of the built-in exception case and didn't even have a good starting point for a search that would yield an answer.

Sunday, February 15, 2009


As you'd guess from the title, I'm talking about the 2-mindedness of purchasing decisions. This has become worse for me lately with the spend-happy democrats in control of our national finances, making all the same stupid mistakes made by FDR and the U.S.S.R., and bringing Katrina-like dark skies into our economic forecast. As if the specter of ever-leapfrogging cheaper technology weren't reluctance-inducing enough, now I have another reason to wait and see on so many things I'd like to buy.

I need a new car, but next year's model might improve upon this year's price vs. fuel economy vs. power features, and I could lose my job and need to keep that money for mortgage payments, food, and medical care. My wife wants a new dishwasher, but they're still figuring out how to make them quieter, and that competition may push the cost lower. Although, inflation could drive prices WAY up so maybe we need to go ahead and buy now. I get another marketing letter from the phone company every week saying I can switch to a different wireless plan and get all sorts of features I'm pretty sure I don't need, but all those features sound cool and I might use them if I had them, but then there's the "better technology tomorrow" and I might lose my job things again. ARRRRRGGGGGGGH!!!!!

At least there's one bright spot in this picture. Almost all TV content SUCKS. The cable company is raising the rates we're paying for the same old stream of amoral, liberal slanted garbage. Broadcasters have just about completed the switch to digital and we happen to live in an area where the signal strength quality is actually pretty good with a respectable antenna. So, we can get the garbage anyway without paying monthly for it. This decision to "un-purchase" our cable service is actually getting easier. I guess, technically, that doesn't count as a purchase decision since, once again, we would move back to the "should we buy" status, but I guess I should be thankful that the basis is clearer for the choice we need to make today.

Aside from when something is clearly not worth buying, like TV and most anything "As Seen on TV," even if the economic outlook and technology / price trends are ignored for a moment, I consistently find that no one manufacturer has chosen a product design that has _the_ feature set I'd choose. I feel like I am forced to select one subset or another of what I want in a product, no matter what the product is. I'm always thinking to myself, "Well, product A has 3 things I want, and product B has 3 other things I want, and they both have at least a dozen things I care nothing about. If only product A or B had its rival's 3 features I want, I'd be ready to lay down the money and take it home." I know that patents and copyrights prevent some of that from ever being possible or practical, and that ultimately it benefits consumers to provide a company a chance to recover the investment in R & D. That doesn't stop me from wishing I could have my cake and eat it too though. Now there's an idea someone should develop and patent, cake that can be eaten, and kept for another day too.

Friday, February 6, 2009

How Do You Decide Whether To Spend?

Hearing our new president go on about how the federal government now has the right to control how private industry operates, and how the federal government is the only entity that can forcibly spend the country back to economic health, makes me wonder how so many people could still be lapping up the reeking, sour kool-aid he's pouring in their bowls. The U.S. federal government, with their notoriously wise decision making capabilities, has used our tax dollars to buy control over those private companies. Today they're also on the brink of spending even more of our tax dollars, and our kids' tax dollars, and their kids' tax dollars to buy more stuff the citizens of this country _should_ have a vested interest in _not_ being publicly subsidized, never mind publicly owned and run.

So, the only theory about why all that government pork spending would prop up and revive the economy is that if money is flowing, jobs will be created, and the economy will recover. Having worked for a while as a contractor in a federally run organization, I have observed first hand how that ridiculously expensive bandaid will utterly fail to have any positive effect on the economy. The bottom line problem with government spending is that the government is incapable of choosing things on which to spend our tax money that will cause an overall increase in productive activity. And, it's the productive activity that improves the economy, and the standard of living, not the wasteful spending. Two people standing on a beach exchanging a stack of $100 bills all day but doing nothing would represent a _lot_ of spending but it would have _ZERO_ effect on the wealth of either person.

Spending to stimulate an economy depends on a multiplier effect. Entity A spends money with entity B, and they spend money with entity C, and so on. In exchange, B does something useful for A, C does something useful for B, and so on. When the federal government spends money with entity A, it pays entity A a great deal more than what anyone with any sense would pay (not efficient), requires entity A to comply with boatloads of regulatory nonsense (not productive), and ties up entity A's accounts receivable for an absurdly long time (not effective). By the time entity A has a chance to spend anything with entity B, entity B has lost hope that entity A will spend, entity C has abandoned pursuit of spending from entity B, and so on. With the government overhead superimposed on their spending, less and less useful stuff gets done. That means fewer and fewer people can be engaged to do useful stuff, so fewer and fewer people have income which would have allowed them to _be_ entity A, B, or C. The "government" spending multiplier effect is essentially the opposite of what occurs when money is spent efficiently to get useful things done.

If you have a different opinion about how things work, just think about how you decide when and how much to spend. Is it based on what you got last year? Maybe it would be if nothing else had changed, but we've got "change" now whether we like it or not. Is it based on what you expect to happen this year? I suspect that has quite a bit more weight in your decision process. Do you think your prospects of keeping your job and your income _improve_ as more people make the same kind of financial decisions you're making based on what you expect? I suspect not. Finally, do you think it is better to pay 150% of the price for something even if you don't want or need it just to be sure the person offering the good or service continues to have an income, or do you buy only what you really want or need, at the lowest possible price, so your money does as much for you as possible? Unless you're a little slow, I'd venture you'd choose the latter.

If so many people have such high expectations of the newly elected administration, and those people are supposed to have adopted the nirvana of new "hope" in what is to come, why then must the federal government spend us out of the retreating economy? Does nearly 1 trillion dollars of government spending on things the public does not need give you more confidence to go spend your money, or does it throw a giant, spooky, wet blanket on that "hope?" Do you think the prospect of government bureaucratic control over the largest and most troubled players in the economic picture will make things better, or are you willing to face reality and admit that that would surely be a soviet style trade from bad to horrible?

If you voted for BO, you must think all we need is "hope" and "change" so call your senator and explain to them why you want them to vote against this detrimental redistribution of funds towards inefficient and ineffective spending.

If you voted against BO, you most likely knew that what the country really needed was a pragmatic, experienced leader, but since we don't have anything of the sort, call your senator and ask them to please vote against this unwise expansion of the size and control of our federal government.

Saturday, January 31, 2009

Who Gets The Stuff

Even young children on the playground can understand the economics that apply to the new administration's so called "stimulus package." It is simply the transfer of reward from those who earn it to those who do not. And even the kids on the playground can understand that it isn't right when someone claims that "everyone is the winner" when there was clearly one of the kids (or teams) that ran faster, scored more goals, practiced more, or maybe just had the natural talent to be the winner. The outcome on the playground is that one side gets the trophy or the cheers, and the other side realizes that they need to work harder if they want the trophy or the cheers.

When the government takes what one side works for and gives it to the other side, they DESTROY the benefit of competition for BOTH sides. The guy that works hard, finds opportunities to use his talents, and succeeds is essentially told, "keep it up because you need to produce enough for yourself, and the people who have not practiced, or trained, or even worked hard enough to have THE SAME THINGS YOU OBTAINED." The guy that didn't work hard, or find training, or practice a trade is essentially told, "don't worry about doing anything in exchange for what you get, we'll take it from other people and give it to you ANYWAY."

The net result, for all you bleeding heart, liberal, anti-capitalist types, is that NEITHER side has any reason to exert any extra effort, NOTHING more gets done, and NOTHING more gets made. THAT MEANS THERE IS LESS FOR EVERYONE TO HAVE!!!! Less FOOD! Less HOUSING! Fewer CARS! Less MEDICAL CARE PROVIDED! Less EVERYTHING. And when there is less of something, it costs more because the same number of people still want or need it. And when things cost more, money is simply worth less.

Jimmy Carter showed us how this all works 3 decades ago and it appears, based on how many people younger than 30 voted for Obama and his socialist friends, that the public schools and universities have effectively filtered that lesson out of what is taught. If we manage to survive the damage that Obama does to the overall standard of living in the United States, and in turn the rest of the world, we'll certainly have him to thank for at least one thing. He will have taught the next generation, by bringing back massive inflation and government abuse of power, what the schools have so miserably failed to teach them, that socialism DOES NOT WORK, and that EVERYBODY PAYS FOR IT.