
Re: [Sks-devel] SKS scaling configuration


From: Jeremy T. Bouse
Subject: Re: [Sks-devel] SKS scaling configuration
Date: Mon, 4 Mar 2019 00:57:48 -0500
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1

I'll preface this with a caveat: I know a couple of the recipient
mail servers are having issues with my DMARC/DKIM/SPF settings, so I
don't know whether everyone is receiving my posts.

I've updated my configuration on sks.undergrid.net, using NGINX to
load-balance 4 SKS nodes. Here are my configs:

Under snippets/sks.conf:

    location /pks/hashquery {
        proxy_ignore_client_abort on;
        proxy_pass http://$sks_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_pass_header Server;
        add_header Via "1.1 $host:$server_port (nginx)";
        add_header X-Robots-Tag 'nofollow, notranslate' always;
    }

    location /pks/add {
        proxy_ignore_client_abort on;
        proxy_pass http://$sks_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_pass_header Server;
        add_header Via "1.1 $host:$server_port (nginx)";
        add_header X-Robots-Tag 'nofollow, notranslate' always;
        client_max_body_size 8m;
    }

    location /pks {
        proxy_ignore_client_abort on;
        proxy_pass http://$sks_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_pass_header Server;
        add_header Via "1.1 $host:$server_port (nginx)";
        add_header X-Robots-Tag 'nofollow, notranslate' always;
    }
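Since the three location blocks above repeat the same proxy directives, one way to keep them in sync is to factor the shared lines into their own include. This is just a sketch of that refactoring; the snippets/sks-proxy.conf filename is my own invention, not part of the setup above:

```nginx
# snippets/sks-proxy.conf -- shared proxy directives (hypothetical file)
proxy_ignore_client_abort on;
proxy_pass http://$sks_server;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Forwarded-Port $server_port;
proxy_pass_header Server;
add_header Via "1.1 $host:$server_port (nginx)";
add_header X-Robots-Tag 'nofollow, notranslate' always;

# Each location then reduces to the include plus any per-location
# directives, e.g.:
#
#     location /pks/add {
#         include snippets/sks-proxy.conf;
#         client_max_body_size 8m;
#     }
```

Because include is textual, the add_header and proxy_set_header lines still end up inside each location block, so inheritance behaves the same as writing them out longhand.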

Under conf.d/upstream.conf:

upstream sks_secondary {
    server 127.0.0.1:11371 weight=5;
    server 172.16.20.52:11371 weight=25;
    server 172.16.20.53:11371 weight=25;
    server 172.16.20.54:11371 weight=25;
}

upstream sks_primary {
    server 127.0.0.1:11371;
    server 172.16.20.52:11371 backup;
    server 172.16.20.53:11371 backup;
    server 172.16.20.54:11371 backup;
}

map $arg_op $sks_server {
    "stats" sks_primary;
    default sks_secondary;
}

    Of note is the use of "backup" rather than weights for the secondary
nodes under sks_primary. This allows a secondary node to answer stats
queries when the primary is unable to respond, which in my experience
typically happens when IO wait exceeds 30%.
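If the failover to a backup feels slow, the upstream failure detection and proxy timeouts can be tuned. The following is only a sketch; the directives are stock NGINX, but the specific values are illustrative guesses I have not tested against this setup:

```nginx
# Hypothetical tuning: mark the primary as failed after two quick
# failures so the backups pick up stats requests sooner.
upstream sks_primary {
    server 127.0.0.1:11371 max_fails=2 fail_timeout=30s;
    server 172.16.20.52:11371 backup;
    server 172.16.20.53:11371 backup;
    server 172.16.20.54:11371 backup;
}

# In the proxied locations, bound how long a slow primary can hold a
# request before NGINX moves on to the next (backup) server:
#
#     proxy_connect_timeout 5s;
#     proxy_next_upstream error timeout http_502 http_503;
```

Without something like proxy_next_upstream, a primary that accepts the connection but stalls under IO wait can still hold the stats request until the client gives up.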

Under sites-enabled/sks-default:

server {
    listen 172.16.20.51:80 default_server;
    listen [::]:80 ipv6only=on default_server;

    access_log off;
    server_tokens off;
    root   /var/www/html;
    index  index.html index.htm;

    location / {
        return 301 https://sks.undergrid.net$request_uri;
    }

    include snippets/sks.conf;
}

server {
    listen 172.16.20.51:11371 default_server;
    listen [::]:11371 ipv6only=on default_server;

    access_log off;
    server_tokens off;

    location / {
        proxy_ignore_client_abort on;
        proxy_pass http://$sks_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_pass_header Server;
        add_header Via "1.1 $host:$server_port (nginx)";
        add_header X-Robots-Tag 'nofollow, notranslate' always;
        client_max_body_size 8m;
    }
}

Under sites-enabled/sks-ugns-ssl:

server {
    listen 443 http2 ssl;
    listen [::]:443 http2 ipv6only=on ssl;
    server_name sks.undergrid.net;

    access_log off;
    server_tokens off;
    root   /var/www/html;
    index  index.html index.htm;

    ssl_certificate /etc/letsencrypt/live/sks.undergrid.net/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/sks.undergrid.net/privkey.pem;
    ssl_trusted_certificate /etc/letsencrypt/live/sks.undergrid.net/chain.pem;
    ssl_stapling on;
    ssl_stapling_verify on;

    include snippets/sks.conf;
}

I also have a sites-available/sks-default-ssl ready to be enabled:

server {
    listen 443 http2 ssl default_server;
    listen [::]:443 http2 ipv6only=on ssl default_server;
    server_name pool.sks-keyservers.net *.pool.sks-keyservers.net;

    access_log off;
    server_tokens off;
    root   /var/www/html;
    index  index.html index.htm;

    ssl_certificate /path/to/sks-pool-certificate.pem;
    ssl_certificate_key /path/to/sks-pool-certificate-key.pem;
    ssl_trusted_certificate /usr/share/gnupg/sks-keyservers.netCA.pem;
    ssl_session_cache off;
    ssl_session_tickets off;

    location / {
        return 301 https://sks.undergrid.net$request_uri;
    }

    include snippets/sks.conf;
}

    Of note here is that the SSL session cache and session tickets are
disabled to prevent SSL session resumption on the pool hostnames.

Other than that, in nginx.conf itself I have the following SSL
settings:

        ssl_protocols TLSv1.2;
        ssl_prefer_server_ciphers on;
        ssl_dhparam /etc/nginx/dhparam.pem;
        ssl_ciphers EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH;
        ssl_ecdh_curve secp384r1;
        ssl_session_timeout  10m;
        ssl_session_cache shared:SSL:10m;
        ssl_session_tickets on;
        resolver 172.16.20.2 192.168.1.2 valid=300s;
        resolver_timeout 5s;

    Testing on Qualys [1] gives my server an A grade, supporting TLS 1.2
with Forward Secrecy. KF, I'll get a CSR sent to you later this week,
assuming the server would be accepted back into the HKPS pool.

    I'm still dealing with high IO wait times on my primary node, which
cause my server to fall out of the pool quite often, although I'm not
sure why the backup stats response from one of the secondary nodes
isn't being picked up; possibly still a timeout issue.

    In trying to test and validate my configuration I ran several
queries against the pool hostnames and got an awful lot of 50[234]
errors, along with several HSTS headers.

    I still have a delta of about 300 keys or fewer between my
secondary nodes and the primary, but everything seems stable aside
from the IO wait issues I'm experiencing. I have trimmed my membership
file down to my own nodes and the ones I could confirm I was still
cross-peered with (currently only 5 by my count). Many of them also
appear to be bouncing in and out of the pool quite frequently, so I'm
wondering whether the IO wait issue slowing down SKS's handling of
stats requests is not isolated to my setup.

    As soon as I get my new firewall in place I'll work on adding IPv6
support, as my current firewall device is not properly obtaining the
route from my provider. I'm also working on the Tor configuration; if
someone else has a config they could share, that would be helpful.

1.
https://www.ssllabs.com/ssltest/analyze.html?d=sks.undergrid.net&hideResults=on

On 2/18/2019 5:12 AM, Michiel van Baak wrote:
> On Sun, Feb 17, 2019 at 09:18:11AM -0800, Todd Fleisher wrote:
>> The setup uses a caching NGINX server to reduce load on the backend nodes 
>> running SKS.
>> His recommendation is to run at least 3 SKS instances in the backend (I’m 
>> running 4).
>> Only one of the backend SKS nodes is configured to gossip with the outside 
>> world on the WAN, along with the other backend SKS nodes on the LAN.
>> The NGINX proxy is configured to prefer that node (the one gossiping with 
>> the outside world - let’s call it the "primary") for stats requests with a 
>> much higher weight.
>> As a quick aside, I’ve observed issues in my setup where the stats requests 
>> are often directed to the other, internal SKS backend nodes - presumably due 
>> to the primary node timing out due to higher load when gossiping.
>> This then gets cached by the NGINX proxy and continues to get served so my 
>> stats page reports only the internal gossip peer’s IP address vs. all of my 
>> external peers.
>> If Kristian or anyone else has ideas on how to mitigate/minimize this, 
>> please do share.
>> Whenever I check his SKS node @ 
>> http://keys2.kfwebs.net:11371/pks/lookup?op=stats 
>> <http://keys2.kfwebs.net:11371/pks/lookup?op=stats> I always find it 
>> reporting his primary node eta_sks1 with external & internal peers listed.
>>
>> Here are the relevant NGINX configuration options. Obviously you need to 
>> change the server IP addresses & the hostname returned in the headers:
>>
>> upstream sks_servers
>> {
>>        server 192.168.0.55:11372 weight=5;
>>        server 192.168.0.61:11371 weight=10;
>>        server 192.168.0.36:11371 weight=10;
>> }
>>
>> upstream sks_servers_primary
>> {
>>        server 192.168.0.55:11372 weight=9999;
>>        server 192.168.0.61:11371 weight=1;
>>        server 192.168.0.36:11371 weight=1;
>> }
> I would only put the 55 server in the 'upstream sks_servers_primary' so
> it does not know about the others.
> That way the stats call will only go to the primary. 
> Downside is that it won't fail over when it times out. But maybe that is
> exactly what you want for this specific call
>


