Troubleshooting
Inventory problem
When you see error reports about the inventory component or when your last inventory date in node details is too old (i.e. more than one day).
Run rudder agent inventory -i
:
-
If it gives an Invalid inventory error, this is likely a bug, please open an issue with the provided error message.
-
If it fails with a curl error:
-
Check if the IP seen by the server (or relay) is in the allowed network
-
Check network connectivity to port 443 on server (or relay), for example using
telnet rudder.example.com 443
-
Run the given curl command after removing the
-s
or--silent
argument to have a complete error
-
-
If it succeeds then the problem is probably on the root server. Check the web application logs in
/var/log/rudder/webapp/TODAY.stderrout.log
for lines about your node’s inventory:-
If you have an Invalid signature message
-
It could be because of a change of node id or agent key. Compare the certificate fingerprint and node id from the web interface and the
rudder agent info
output. -
There was a problem with the signature process on the node. Particularly, when there is a problem with "openssl" command, the signature may be empty. Check the content of signature file for that inventory in
/var/rudder/inventories/failed/xxx-yyy-zzz.ocs.sig-DATE
-
The node sending the inventory is not the same as the one registered in Rudder. This can show a security problem, but it may also be due to two nodes having the same nodeID for some reason (a VM cloned followed with an update of the agent key of one the cloned node, but no change of the node ID). In this case
rudder agent factory-reset
should be run after VM clone.
-
-
Compliance problem
This section assumes you are using HTTPS reporting. |
Run rudder agent update -i
:
-
If it gives a Cannot find suitable server error:
-
Check if the IP seen by the server (or relay) is in the allowed network
-
Check network connectivity to port 5309 on server (or relay), for example using
telnet rudder.example.com 5309
-
-
If it gives an Unspecified server refusal error:
-
Check if the node has actually been accepted on the server and that a first policy generation finished after acceptation
-
Check the policy server logs on the server or relay for more details
-
-
If it succeeds then the problem is probably in the reporting process. Run
rudder agent run -i
and check the messages about reporting at the end:-
If it fails with a curl error:
-
Check network connectivity to port 443 on server (or relay), for example using
telnet rudder.example.com 443
-
Run the given curl command after removing the
-s
or--silent
argument to have a complete error
-
-
If it succeeds, check compliance view for the node in the web interface:
-
If everything is red, check time synchronization between node and server, check for broken update on relays between node and server, and read the compliance message in the compliance tab.
-
If mixed red and green, check if it’s always the same red components. If you have constant isolated red components it is probably a bug, please open an issue with the details. If the end of the run is missing, then the agent probably stops in the middle of the run. In this case run it manually on the node to see the error message. You should also check if agent run does not take longer than 5 minutes (for example in
rudder agent history
output), as it can cause missing reports around the end of the run. -
If it’s sometimes grey check webapp logs for messages indicating "[Store Agent Run Times] Task frequency is set too low!" that indicate the server cannot handle the report flow. In this case you can try to tune PostgreSQL performance or to increase the period between two save jobs by modifying the
rudder.batch.storeAgentRunTimes.updateInterval
property in/opt/rudder/etc/rudder-web.properties
-
-
Agent problems
The copy of the file failed: the destination (/path/to/the/file) is not stored in a valid directory
You may encounter the following error while trying to edit or manage some files when the path contains a symbolic link:
The copy of the file failed: the destination (/path/to/the/file) is not stored in a valid directory
This is due to a security policy, to avoid privilege escalation using the Rudder agent. You can edit links that are either owned by root, by the user currently running the agent (often root) or if the owner and group of the symbolic link are the same as the target.
curl: (90) SSL: public key does not match pinned public key
ESET endpoint security replacing issuers
If the node is in the allowed network, the problem can come from ESET Endpoint Security. It contains a certificate checker that modifies certificates (on the fly) that are consider invalids, by replacing the issuer field with this exact string:
CN=The original certificate provided by the server is untrusted
This changes the hash of the certificate, and prevents the Rudder agent from contacting its policy server over HTTPS. To identify precisely the problem follow these instructions:
-
On the node run
grep POLICY_SERVER_KEY_HASH /var/rudder/cfengine-community/inputs/rudder.json@
You will get a hash formatted like that sha256//<hash>
-
Then run this curl command to get information on certificate
curl -v -k --pin sha256//<hash> https://<policy-server-hostname-or-ip>
sha256//<hash>
should be replaced by the value from the previous command
-
In the output, check the issuer line if it contains this following line, then it is linked to ESET Endpoint Security software
CN=The original certificate provided by the server is untrusted
To fix this problem you should add the root server’s certificate to the Endpoint Detection and Response (EDR) configuration and explicitly allow it: https://help.eset.com/ees/9/en-US/idh_config_epfw_ssl_known.html
← Change server ports Manage plugins →