IPsec
Todo
needs complete rewrite to become nice
This document should become a comprehensive documentation on the ipsec
setup. I will feed it my notes and experiences along the way. I created
the role ipsec-vpn
that is part of stage2, i.e., strongswan will be
rolled out on the gateway nodes. This setup ought to be pretty similar
to wireguard. I am not sure yet what of the settings around wireguard
will be helpful / sufficient for ipsec (forwarding, mtu). Also I don’t
know yet if there will be a conflict between ipsec and wireguard.
wireguard should remain the default VPN solution.
Tools over which I stumbled: - strongswan - strongswan-starter -
charon(-cmd) - swanctl - Pluto
seems to refer to the old IKEv1
daemon. Charon
is a fresh re-implementation that aims to be
compatible to Pluto and implements to IKEv1 and IKEv2.
I followed the instructions on digitalocean.com for a basic setup. The author proposes eap-mschapv2 as authentication (?) scheme. IPsec should not be tunneled via wireguard. Instead we need to use the public floating ip.
IPsec itself is implemented inside the kernel. The userspace tooling is responsible to configure the network and handle the authentication. IPsec use IKE, the Internet Key Exchange protocol. “Charon” is the ferryman of Hades who carries souls across the Styx to the world of the dead.
strongswan
is a systemd service that invokes swanctl. It invokes
charon-systemd
. swanctl
replaces the starter
, ipsec
and
stroke
tools.
swanctl
works independantly from ipsec.conf.
Does that mean I configured the wrong configuration file
(/etc/ipsec.d/ipsec.conf
)?
On Arch, strongswan
invokes swanctl
. On Debian it uses
starter
and charon
. That means the tooling is different :O and
no surprise that I didn’t see any errors yet.
VICI
is the “Versatile IKE Configuration Interface”. It’s an
interface to configure, control and monitor the IKE daemon charon
.
An example that uses strongswan-swanctl
can be found
here.
Here
you can find more examples.
/etc/swanctl/
contains the configuration of
swanctl.
What do I put in local_addrs
, what in remote_addrs
? What is
remote_ts
? My laptop is behind a NAT, the gateway does not know
about its public endpoint. TS
is traffic selector
. Split
tunneling is the name of a scenario in which a client only sends traffic
for specific destinations to the gateway. The initiator of an IPsec
tunnel can request an additional IP address from the responder to use as
an inner tunnel address. This address is called the Virtual IP
local_addrs
defaults to %any
if not set and is hence optional.
It’s useful if one wants to limit traffic to a particular interface.
The responder (server/gateway) does not receive a virtual ip. On the other hand, why should it? VirtualIP is cool in a setup in which the initiator is behind a NAT and cannot be reached.
How do I see that a tunnel is established? Where do I see any error
messages? Where can I turn up the volume? strongswan.conf
contains
the configuration of charon-systemd
.
/etc/strongswan.d/charon-systemd.conf
looks like a promising spot.
Wiki.
default = 2
sets the verbosity of all services that are not
explicitly listed to 2. The level goes from -1 (silent) to 4 (all the
things).
I see quite a lot of modules that I will not need. How do I configure / disable them? -> Static plugin configuration is discouraged unless you know what you’re doing. (Which I obviously don’t, yet).
Quoting the introduction to strongswan strongswan is a keyring daemon that uses IKE to establish a security association (SA) between two peers. IKE provides strong authentication of both peers and derives unique cryptographic session keys. An IKE session is often called IKE_SA in the docs. Besides authentication and key material IKE also provides the means to exchange configuration information (e.g., virtual IP addresses) and to negotiate IPsec SAs (often called CHILD_SA). IPsec SAs (CHILD_SA) defines which network traffic is to be secured and how it has to be encrypted and authenticated. A CHILD_SA consists of two components:
the actual IPsec SAs (two for each direction) that describe algorithms to encrypt/authenticate traffic
policies that define which network traffic shall use such an SA
Policies are derived from the traffic selectors (TS) negotiated via IKE when establishing a CHILD_SA. strongswan installs the negotiated IPsec SAs and SPs into the kernel by using a platform dependent kernel API.
Connections and CHILD_SAs defined in swanctl.conf can be started on three different occasions:
on traffic (if start_action=trap)
on startup (if start_action=start). The CHILD_SAs will not be restarted automatically when they go down. Other configuration settings are needed. This is not recommended. The user is encouraged to use trap policies instead (see above)
Manually (no start_action is provided). Use swanctl –initiate –child to start a connection
Possible values for (CHILD) SA proposals can be found here.
Forward traffic
remote_ts
on Initiator sides basically claims which subnet ranges
can be reached via this IPsec connection. The gateway is already
configured to forward wireguard traffic (NAT). Wireguard traffic arrives
over the wg
interface which makes it easy to identify. A quickfix to
also forward IPsec traffic is to add a rule that looks something like
this:
iifname $wan ip saddr 10.3.0.0/24 oifname $wan ct state new counter accept;
10.3.0.0/24
is the virtual IP pool out of which an IP was assigned
to the Initiator. This doesn’t cut, though, because coming from that
subnet is only a symptom. What I actually want is to forward decrypted
payloads.
nft >= 0.9.1 knows meta ipsec exists
, nft >= 0.8.2 knows
meta secpath exists
. Both apparently detect IPsec traffic.
Consequently the rule above becomes:
meta secpath exists iifname $wan ip saddr 10.3.0.0/24 oifname $wan ct state new counter accept;
Failover
Disclaimer: I have no idea about the details here and if what I’m doing is sensible.
strongswan comes with builtin feature for load-balancing and failover.
However, they sounded a bit limiting so we try to stick to our
established keepalived/gateway setup. Note: DPD means “Dead Peer
Detection”. strongswan offers the actions clear
, restart
and
trap
for dpd_action
. restart
, well, restarts the connection
for the CHILD_SA on a timeout. dpd_delay
must not be equal to 0 to
trigger the sending of “Informational” packets that check the
reachability of the peer.
The test scenario was as follows: the roadwarrior continuosly pings a node in the private network. By starting or stopping keepalived (depending on the node and its priority) we simulate the loss of a gateway. It as expected that the pings stop reaching their targets for a period of time. The failover is considered to be successful if the IPsec tunnel is re-established automatically and traffic passes again between roadwarrior and the node. After the first failover, there should be another failover to the “original” gateway.
Debugging
By default, strongswan uses policy based routing. Entries are injected into table 220.
To see all routing tables:
ip rule list
To see all routing rule:
ip route show table all
To see a specific routing table:
ip route show table <N>
Inside the kernel, the IP framework xfrm
is used to transform
packets (e.g., encrypting payloads). ip xfrm *
to see things.
I don’t see any traffic when I initiate the connection with
swanctl --initiate --child home
. The systemd log shows an error
message:
sending packet: from 192.168.188.139[500] to 185.128.119.223[500] (240 bytes)
error writing to socket: Network is unreachable
tcpdump -venni any udp port 500 or port 4500
does not show any
packets.
-> Issue was that 39 != 139. The whole thing was avoidable from the
start by not setting local_addrs
to a fixed value (it defaults to
%any
)
BGP
The problem statement: With the current setup a roadwarrior can reach the gateway and all nodes beyond the gateway in their private net. The gateway can contact the roadwarrior. But what if a node or a pod in the private network wants to reach the roadwarrior (probably more interesting in a site-to-site scenario)?
Bird should import routes from routing table 220
(ipsecs) into its
internal routing table. It should export those routes via BGP. The BGP
server of the CNI plugin should import them and expose them on the
nodes.
FIB is short for Forwarding Information Base.
Kernel protocol is to exchange routes between a BIRD routing table and a kernel routing table (FIB). Instances of kernel protocol cannot share BIRD routing tables or FIBs. Use Pipe for that purpose.
Example:
# Table to collect all IPv4 routes
ipv4 table bgp_v4tab;
# In bgp_v4tab, import all routes from master4
protocol pipe {
table bgp_v4tab;
peer table master4;
Random notes from Jonas
On debian, I had to install
strongswan-pki
andstrongswan-swanctl
.You can use
swanctl --load-all
to (re-)load config etcEAP_IDENTITY not supported, sending EAP_NAK
-> You also need lib{charon,strongswan}-extra-plugins installed, may need restart of strongswan-starter to applyipsec_subnet
should be set to the k8s subnet methinks?? or at least not defaulted to some strange range…
Additional info on the usage in the LCM
IPsec is a customer-specific solution and should remain so, i.e., we only support what was required.
One example of that is the authentication means. We only support PSK here
Upload an IPSec EAP PSK to Vault
If you want to use IPSec, you have to upload the pre-shared key to vault before triggering the LCM. This can be done with the following command:
Note
A root token is needed!
The orchestrator
role does not have enough privileges.
# Please substitute CLUSTERNAME & YOUR_IPSEC_EAP_PSK
vault kv put yaook/CLUSTERNAME/kv/ipsec-eap-psk ipsec_eap_psk=YOUR_IPSEC_EAP_PSK