Re: [Sks-devel] recon outage on zimmermann.mayfirst.org

From:	John Clizbe
Subject:	Re: [Sks-devel] recon outage on zimmermann.mayfirst.org
Date:	Thu, 26 Jul 2012 23:59:56 -0500
User-agent:	Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.20pre) Gecko/20110606 Mnenhy/0.8.5 SeaMonkey/2.0.15pre

Daniel Kahn Gillmor wrote:
> hey folks--
> 
> it looks like the sks recon process on zimmermann.mayfirst.org
> (a.k.a. keys.mayfirst.org) stopped about 10 days ago:
> 
> 2012-07-16 05:28:34 Raising Sys.Break -- PTree may be corrupted:
> Bdb.DBError("unable to allocate memory for mutex; resize mutex region")
<snip>
> 
> If anyone has thoughts about how i could have handled this
> differently, i'd be happy to hear it.

Adjust the paths accordingly and run the following script.
(/opt/local/bin is where MacPorts puts the BDB utilities.)

============================================
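#!/bin/sh
# Recover and checkpoint the SKS Berkeley DB environments (KDB and PTree).

top="$(pwd)"
PATH="/opt/local/bin:$PATH"

cd /var/sks || exit 1

echo "KDB --"
cd KDB || exit 1
sudo db53_recover -ev      # run recovery, keep the environment, verbose
sudo db53_checkpoint -1    # force a single checkpoint
sudo db53_archive -dv      # remove log files that are no longer needed
sudo db53_recover -ev
cd ..

echo "PTree --"
cd PTree || exit 1
sudo db53_recover -ev
sudo db53_checkpoint -1
sudo db53_archive -dv
sudo db53_recover -ev
cd ..

cd "$top"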
============================================

Then adjust the values in PTree/DB_CONFIG; given the error above, the mutex
settings are the ones to look at.
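
As a rough sketch of what such a change could look like, assuming the
failure really is the exhausted mutex region named in the log: the helper
below only prints a candidate line for review. mutex_set_max is a standard
Berkeley DB DB_CONFIG parameter; the function name ptree_mutex_config and
the default of 65536 are illustrative assumptions, not values recommended
anywhere in this thread.

```shell
#!/bin/sh
# Sketch: print a candidate mutex setting for PTree/DB_CONFIG.
# ptree_mutex_config is a hypothetical helper; 65536 is an assumed
# placeholder value, to be tuned for the actual dataset.
ptree_mutex_config() {
    max="${1:-65536}"
    printf 'mutex_set_max %s\n' "$max"
}

# Print the line so it can be reviewed before it goes anywhere near
# the real DB_CONFIG file.
ptree_mutex_config
```

If the line looks right, it could be appended by hand, for example with
`ptree_mutex_config 65536 | sudo tee -a /var/sks/PTree/DB_CONFIG`. Region
sizes are normally fixed when the environment is created, so expect to
rerun the recovery script afterwards.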

Alternatively, since the key database itself was fine, you could have rebuilt
only the PTree. SKS 1.1.3 and earlier use a default PTree pagesize of 512
bytes (ptree_pagesize: 1), which tends to require a large number of mutexes
for locks. You could have set ptree_pagesize to 8 or 16 and done a clean
pbuild. You'd still have a lot of resyncing to do, but you wouldn't have lost
any key information.
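
To make the pagesize numbers concrete: ptree_pagesize counts 512-byte
units, so the suggested settings work out as below. The helper name
pagesize_bytes is made up for this illustration.

```shell
#!/bin/sh
# ptree_pagesize is given in units of 512 bytes (1 -> 512 bytes, as above).
# pagesize_bytes is a hypothetical helper for the arithmetic only.
pagesize_bytes() {
    echo $(( $1 * 512 ))
}

# Show the default and the two larger settings suggested in the message.
for units in 1 8 16; do
    echo "ptree_pagesize: $units -> $(pagesize_bytes "$units") bytes"
done
```

So 8 gives 4096-byte pages and 16 gives 8192-byte pages. The setting goes
in sksconf in the same "ptree_pagesize: 8" form, and it would only take
effect for a PTree built after the change, hence the clean pbuild.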

You might have gotten away with just running the db_recover script above and
restarting the sks processes. It might have locked up again, but that would
probably have had more to do with a massive recon set than anything else.
