
In an LXC container, the website displays a ‘load average’ that is very different from top #5565

Closed
juancarlosromerogarcia opened this issue Feb 1, 2024 · 10 comments

@juancarlosromerogarcia

Versions

Pi-hole version is v5.17.3 (Latest: v5.17.3)
web version is v5.21 (Latest: v5.21)
FTL version is v5.24 (Latest: v5.24)

Platform

The platform is an LXC container based on a Debian 12 template, running on Proxmox VE 7.4.

Expected behavior

I expected to see the same load average values on the web as shown by the top command (around 1).

Actual behavior / bug

Instead, values around 5 (which belong to the host) are displayed.

Steps to reproduce

Steps to reproduce the behavior:

  • Create an LXC container in Proxmox with the following configuration file (104.conf):
arch: amd64
cores: 2
hostname: pihole
memory: 1024
net0: name=eth0,bridge=vmbr1,firewall=1,gw=192.168.10.10,hwaddr=5A:2F:72:71:00:76,ip=192.168.10.40/24,type=veth
ostype: debian
rootfs: backup:104/vm-104-disk-0.raw,size=8G
swap: 1024
unprivileged: 1

apt update && apt upgrade
cd ~
wget -O basic-install.sh https://install.pi-hole.net
bash basic-install.sh
apt install unbound -y
nano /etc/unbound/unbound.conf.d/pi-hole.conf

server:
    # If no logfile is specified, syslog is used
    # logfile: "/var/log/unbound/unbound.log"
    verbosity: 0

    interface: 127.0.0.1
    port: 5335
    do-ip4: yes
    do-udp: yes
    do-tcp: yes

    # May be set to yes if you have IPv6 connectivity
    do-ip6: no

    # You want to leave this to no unless you have *native* IPv6. With 6to4 and
    # Teredo tunnels your web browser should favor IPv4 for the same reasons
    prefer-ip6: no

    # Use this only when you downloaded the list of primary root servers!
    # If you use the default dns-root-data package, unbound will find it automatically
    #root-hints: "/var/lib/unbound/root.hints"

    # Trust glue only if it is within the server's authority
    harden-glue: yes

    # Require DNSSEC data for trust-anchored zones, if such data is absent, the zone becomes BOGUS
    harden-dnssec-stripped: yes

    # Don't use Capitalization randomization as it is known to cause DNSSEC issues sometimes
    # see https://discourse.pi-hole.net/t/unbound-stubby-or-dnscrypt-proxy/9378 for further details
    use-caps-for-id: no

    # Reduce EDNS reassembly buffer size.
    # IP fragmentation is unreliable on the Internet today, and can cause
    # transmission failures when large DNS messages are sent via UDP. Even
    # when fragmentation does work, it may not be secure; it is theoretically
    # possible to spoof parts of a fragmented DNS message, without easy
    # detection at the receiving end. Recently, there was an excellent study
    # >>> Defragmenting DNS - Determining the optimal maximum UDP response size for DNS <<<
    # by Axel Koolhaas, and Tjeerd Slokker (https://indico.dns-oarc.net/event/36/contributions/776/)
    # in collaboration with NLnet Labs explored DNS using real world data from the
    # RIPE Atlas probes and the researchers suggested different values for
    # IPv4 and IPv6 and in different scenarios. They advise that servers should
    # be configured to limit DNS messages sent over UDP to a size that will not
    # trigger fragmentation on typical network links. DNS servers can switch
    # from UDP to TCP when a DNS response is too big to fit in this limited
    # buffer size. This value has also been suggested in DNS Flag Day 2020.
    edns-buffer-size: 1232

    # Perform prefetching of close to expired message cache entries
    # This only applies to domains that have been frequently queried
    prefetch: yes

    # One thread should be sufficient, can be increased on beefy machines.
    # In reality for most users running on small networks or on a single machine, it should
    # be unnecessary to seek performance enhancement by increasing num-threads above 1.
    num-threads: 1

    # Ensure kernel buffer is large enough to not lose messages in traffic spikes
    so-rcvbuf: 1m

    # Ensure privacy of local IP ranges
    private-address: 192.168.0.0/16
    private-address: 169.254.0.0/16
    private-address: 172.16.0.0/12
    private-address: 10.0.0.0/8
    private-address: fd00::/8
    private-address: fe80::/10

nano /etc/dnsmasq.d/99-edns.conf

edns-packet-max=1232

In the web interface, uncheck all upstream DNS servers and add 127.0.0.1#5335 as a custom upstream.
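To confirm the new upstream is answering, a quick test from inside the container (the usual check from the Pi-hole unbound guide) should return an A record for pi-hole.net:

dig pi-hole.net @127.0.0.1 -p 5335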

Debug Token

Screenshots

top

web

Additional context

@rdwebdesign
Member

However, instead, values around 5 (belonging to the host) are displayed.

  1. Pi-hole reads the system load using the PHP function sys_getloadavg().
    This function gets the load for the whole machine.
    As far as I know, PHP can't get a per-container value, because Proxmox/LXC containers don't report the load per container.

  2. Pi-hole also reads the number of processors using nproc.
    Inside the container this command shows the value set by the Proxmox CPU limit.

  3. The warning is calculated from the two values above.
    When the system load is greater than the number of processors, the warning is triggered, but (as said above) inside a container only the second value is per-container, which causes false warnings.

You can disable the Pi-hole warning by adding CHECK_LOAD=false to /etc/pihole/pihole-FTL.conf.
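In rough terms, the comparison looks like this (a simplified sketch of the logic described above, not the actual Pi-hole code; reading nproc via a shell call is only for illustration):

// Simplified sketch: compare whole-machine load against the container's core count
$loaddata = sys_getloadavg();        // whole-machine load, even inside an LXC container
$nproc = (int) shell_exec('nproc');  // container-scoped when a Proxmox CPU limit is set
if ($loaddata[0] > $nproc) {
    // host load compared against the container's core count => false warning
    echo "Load warning: {$loaddata[0]} > {$nproc} cores\n";
}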

@juancarlosromerogarcia
Author

I understand what you’re saying. Thank you for your quick response.
I don’t know if it’s related, but I’ve noticed that with the pihole -c command, the load average is displayed correctly.

chronometer

@rdwebdesign
Member

As a test, please post the output of the following commands inside the container and on the host:

php -r 'echo implode(" ", sys_getloadavg())."\n";' && cat /proc/loadavg && uptime

(the php part will only work on the host if PHP is installed there)
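Optionally, also this one-liner inside the container, to see whether PHP reading /proc/loadavg directly returns the lxcfs value:

php -r 'echo file_get_contents("/proc/loadavg");'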

@juancarlosromerogarcia
Author

host:

php -r 'echo implode(" ", sys_getloadavg())."\n";' && cat /proc/loadavg && uptime
5.46 5.37 5.11
5.46 5.37 5.11 9/1029 4194223
 14:52:20 up 6 days,  4:35,  2 users,  load average: 5.46, 5.37, 5.11

container:

php -r 'echo implode(" ", sys_getloadavg())."\n";' && cat /proc/loadavg && uptime
5.42041015625 5.36767578125 5.10888671875
0.00 0.00 0.00 0/39 4171612
 13:52:28 up 3 days,  1:06,  1 user,  load average: 0.00, 0.00, 0.00

@nevrrmind

nevrrmind commented Feb 8, 2024

Same problem here.

Quick n dirty fix for me:
Edit /lib/systemd/system/lxcfs.service on the Proxmox node.
Change ExecStart=/usr/bin/lxcfs /var/lib/lxcfs to ExecStart=/usr/bin/lxcfs -l /var/lib/lxcfs.
Run systemctl daemon-reload and restart the service with service lxcfs restart.
Restart the container.
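In shell form, roughly (run on the Proxmox node; container ID 104 is taken from the config above, adjust for your setup):

nano /lib/systemd/system/lxcfs.service   # add -l to the existing ExecStart line
systemctl daemon-reload
service lxcfs restart
pct reboot 104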

Edit header_authenticated.php

Search for:

// Get CPU load
$loaddata = sys_getloadavg();
foreach ($loaddata as $key => $value) {
    $loaddata[$key] = round($value, 2);
}

Change to

function loadavg()
{
	// Read the container-scoped /proc/loadavg provided by lxcfs -l,
	// instead of sys_getloadavg(), which returns the host values here
	exec("cat /proc/loadavg", $loadavg);
	$loaddata = explode(" ", $loadavg[0]);
	// Drop the "running/total processes" and "last PID" fields
	unset($loaddata[3]);
	unset($loaddata[4]);
	foreach ($loaddata as $key => $value) {
		$loaddata[$key] = round($value, 2);
	}
	return array($loaddata[0], $loaddata[1], $loaddata[2]);
}
$loaddata = loadavg();
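If you prefer not to shell out to cat, an equivalent sketch (untested here, same assumption that lxcfs -l makes /proc/loadavg container-scoped):

function loadavg()
{
	// Read the lxcfs-virtualized /proc/loadavg directly
	$fields = explode(' ', trim(file_get_contents('/proc/loadavg')));
	// Keep only the 1, 5 and 15 minute averages
	return array_map(function ($v) {
		return round((float) $v, 2);
	}, array_slice($fields, 0, 3));
}
$loaddata = loadavg();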

Done :)
pihole-before
pihole-after

@juancarlosromerogarcia
Author

That solution works for me. Thanks @nevrrmind. If this still works after future Pi-hole updates, I'm satisfied.

@keywal

keywal commented Mar 10, 2024

Same situation here too.
Before I go changing stuff, is this being reviewed or not?
Also, where is header_authenticated.php? Edit: ignore - I found it.

@daNutzzzzz

It would be good for everyone to post here in support of LXC fixing the underlying issue.

@kobuki

kobuki commented Apr 6, 2024

I'm the OP of the issue mentioned in this post. It looks like the LXC devs see it as a problem somewhere else, not in LXC - which is, of course, entirely possible, but I'm not fully convinced yet. I think that to move this issue forward a bit, we need to collect some more info to help with triaging. If you find evidence that points to LXC 5 being at fault (the major version is important, as the problem first appeared there), please post it in that LXC issue. For anything else, we'd need to find the appropriate upstream.


github-actions bot commented May 7, 2024

This issue is stale because it has been open 30 days with no activity. Please comment or update this issue or it will be closed in 5 days.

@github-actions github-actions bot added the stale label May 7, 2024
@github-actions github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) May 13, 2024