Network Routing Still Stuck
I've been able to get some things behaving, but others don't want to play well. I can't see what's different, but I feel like it's going to poke me in the eye when I find it!
I've spent all the free time I had today trying to get my new routing scheme going.
I'm trying to get three things going. I need to be able to connect the web and other servers to the new LAN, so I can remove the old LAN. I need to be able to connect my WiFi router to the new LAN, so I can continue to serve the NAT/firewalled and mobile devices. And I need to be able to connect a workstation directly to the LAN to administer all the things, and have more direct network access on occasion.
I started trying to get all three working in a multi-tasking mess, but after focusing on one at a time, I've reached better solutions.
I was able to get computers connected to the new network to get through without NAT enabled, which is a huge win. I made a one-node DHCP server entry so I can switch to its WiFi to test directly, and I've plugged into the LAN ports and tested that way. I can also manually assign an IP to a directly connected device, by Ethernet or WiFi. Directly connected computers can reach the Internet with their LAN IPs and no NAT! Sweet basic routing, and all it takes is disabling the NAT in the router's UI.
I had spent many hours, because of my spin and the next few failures, tinkering with iptables and route commands to try to get it to work, but it seems that was all because of the trouble I'm still having with the WiFi router. The addresses are obfuscated, but basically the following setup can hit the Internet, and I can see the traffic coming directly from the (example) 10.0.1.5 node; with the NAT on, it looks like the traffic comes from the router's 10.0.0.2 WAN address instead.
{ Internet } - ISP 10.0.0.1/30 - [ ROUTER WAN 10.0.0.2/30
LAN 10.0.1.1/29 ] - [ Computer 10.0.1.5/29 ]
I spent hours on it, but I cannot get the WiFi router, attached to the same new LAN with its own unique IP but otherwise configured like the computers, to pass data through. I'm a little hampered in my experiments as the house uses this for all its things, so I might be out of luck trying to bang hard on this until Wednesday (when I'm home alone...).
I'm trying essentially this, similar to above, but extended with the WiFi/NAT in the way:
{ Internet } - ISP 10.0.0.1/30 - [ ROUTER WAN 10.0.0.2/30
LAN 10.0.1.1/29 ] - [WiFi WAN 10.0.1.5/29
LAN (NAT) 192.168.1.1/24 ] - [ Computer 192.168.1.100/24 ]
The expectation is that the computer could hit a resource on the Internet because of the WiFi router NAT. Because it's a NAT connection, I'd expect it to appear to the world as if all the traffic is coming from 10.0.1.5. I'd also expect the NAT computer to be able to access the other computer, from the other example, or the ROUTER, but none of that works. If I SSH into the WiFi router, it cannot access the Internet, but it can reach nodes on the 10.0.1.0/29 LAN. It's so weird that the router can reach the LAN but its clients cannot.
I also have a weirdness with a server I've connected to the new LAN, but I believe this is a multi-home and multiple gateway problem on that server. It can connect to and interact with other nodes on the new LAN, and it can reach the Internet, and other nodes on its other network, via the default gateway on that other network. But I can't get it to connect to Internet resources through the new LAN.
This is mired in multi-home madness, but looks something like this:
{ Internet } - ISP 10.0.0.1/30 - [ NEW_ROUTER WAN 10.0.0.2/30
LAN 10.0.1.1/29 ] - [ SERVER NIC1 10.0.1.5/29
NIC2 10.2.0.100/29 ] - [ OLD_ROUTER LAN 10.2.0.97/29
WAN ... ] - { Internet }
The outgoing ISP handles the subnet on their end. I just get a router with an assigned address in my subnet, and I can add other nodes to its LAN with the rest of the subnet. Neater still, and more than my own router can do, it also offers a private WiFi/NAT network. It seems I could do that with the slightly newer PRO version of my WiFi router, which wasn't available when I bought it, and is a $400 swap to make it go now; if I swap out the router, I'm going to leap to 10Gb, so not now.
This server stuff is weirdness that I've worked out before. I'm sure I'm simply missing some nuance between how I did it in the old network configuration, and how the new Netplan method defines these things. Really the answer lies in ensuring the traffic coming into a NIC on the server goes out on the same NIC, respecting the appropriate network configuration for that, instead of what seems to be happening now where all the outbound traffic is trying to go out the default gateway. It should be that anything originated by the server will use a global default, unless it's been configured to bind to one of the NICs, in which case it should use that NIC's configuration.
So, said differently, I expect that if I configure the HTTP server to listen on 10.0.1.5, it should send any responses through the 10.0.1.1 gateway. Even on the same server, if an HTTP server is listening on 10.2.0.100, those responses should go through the 10.2.0.97 gateway. But what's happening now is everything goes outbound to the default gateway, so the responses to requests coming into 10.0.1.5 are trying to leave the server via 10.2.0.100, and that messes everything up.
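For reference, the usual Linux mechanism for "reply out the NIC the request came in on" is source-based policy routing: one extra routing table per NIC, plus a rule that selects the table by source address. What follows is only a minimal sketch, assuming iproute2 is available. The function name and the table numbers 101 and 102 are arbitrary picks for illustration, the interface names are the ones from the ping tests in this post, and 10.2.0.96/29 is simply the network that 10.2.0.100/29 falls in. The script only prints the commands so they can be reviewed before anything touches the routing tables.

```shell
#!/bin/sh
# Sketch: print the iproute2 commands for source-based routing on one NIC.
# Nothing is executed here; the output is meant to be reviewed first.
policy_route_cmds() {
    # $1 = source address   $2 = connected subnet   $3 = gateway
    # $4 = interface        $5 = table number (arbitrary, must be unique per NIC)

    # Keep on-link traffic local instead of bouncing it off the gateway
    echo "ip route add $2 dev $4 src $1 table $5"
    # Default route for this table goes to this NIC's own gateway
    echo "ip route add default via $3 dev $4 table $5"
    # Anything sourced from this address consults this table
    echo "ip rule add from $1 lookup $5"
}

# New LAN side (assumed table 101) and old LAN side (assumed table 102)
policy_route_cmds 10.0.1.5   10.0.1.0/29  10.0.1.1  enp7s0 101
policy_route_cmds 10.2.0.100 10.2.0.96/29 10.2.0.97 enp8s0 102
```

If the printed commands look right, they can be piped into a root shell to apply them until the next reboot. As far as I know, Netplan can express the same idea persistently with `routes:` entries that carry a `table:` key and a `routing-policy:` list on each interface, but the exact stanza should be checked against the Netplan reference for the installed version.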
I've been trying to test it with things like this:
$ ping -I enp8s0 -c 2 1.1.1.1
PING 1.1.1.1 (1.1.1.1) from 10.2.0.100 enp8s0: 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=59 time=21.6 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=59 time=21.8 ms
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 21.633/21.739/21.846/0.106 ms
$ ping -I enp7s0 -c 2 1.1.1.1
PING 1.1.1.1 (1.1.1.1) from 10.0.1.5 enp7s0: 56(84) bytes of data.
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1009ms
(Sorry, I had to choose between fonts and highlighting...) There you can see that, with the -I interface option of the ping command, the request binds to an interface and the "from" shows the expected address. But traffic on the new network isn't getting out or past the router, even though it has a correctly configured gateway, and the same (albeit single NIC) configuration on the computer does work.
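Lost pings only show that something is wrong, not where the packet was sent. One way to ask the kernel directly is `ip route get` with a `from` address, alongside a listing of the policy rules. This is a rough sketch, assuming iproute2; the `RUN` variable and the function name are my own inventions, and the script is a dry run by default so it only prints what it would execute.

```shell
#!/bin/sh
# Dry run by default: each command is printed rather than executed.
# Invoke with RUN set to an empty string (RUN= sh script.sh) to run them for real.
RUN=${RUN-echo}

show_route_checks() {
    # $1 = destination address to test against
    for src in 10.0.1.5 10.2.0.100; do
        # Which gateway and device would a packet sourced from $src use?
        $RUN ip route get "$1" from "$src"
    done
    # Policy rules and the main table, for comparison with the answers above
    $RUN ip rule show
    $RUN ip route show table main
}

show_route_checks 1.1.1.1
```

If the answer for 10.0.1.5 names the 10.2.0.97 gateway or the enp8s0 device, the packet is being handed to the wrong side of the server, which would match the 100% loss in the second ping.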
I'll get it. I always do.