Server Troubleshoot
Users can experience slowness for many reasons. Understanding the root cause of the slowness lets us troubleshoot effectively, so it is extremely important that users provide the details of the problem when they report it.
- CHECKING SERVER SLOW SCRIPT
This script is usually already on the server. To run the script, as root:
./checking_server_slow.sh
The script checks for errors on the server that may be causing the slowness.
Note: If the script is not on the customer's server, please add it to the server.
Steps:
vim checking_server_slow.sh
Press i to enter insert mode
Copy and paste the content of the script, then press Esc
Type :wq to save and exit
chmod a+x ./checking_server_slow.sh
As shown above, the server is currently having an OutOfMemory error. This error usually leaves some users unable to access the server while others are not affected. The recommended solution for this error is to restart the server.
This is an example of an Exception error. Users might experience slowness when it occurs.
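The two error types described above can also be counted directly in a log file. The following is a minimal sketch, not the content of checking_server_slow.sh: the function name count_log_errors is hypothetical, and it assumes the errors appear in the log as the literal strings OutOfMemoryError and Exception.

```shell
#!/bin/sh
# Hypothetical helper; this is NOT the real checking_server_slow.sh.
# Counts the lines in a log file that mention each error type.
count_log_errors() {
    log_file=$1
    # grep -c prints the number of matching lines (0 if there are none).
    # "|| true" hides grep's non-zero exit status when nothing matches.
    oom=$(grep -c 'OutOfMemoryError' "$log_file" || true)
    exc=$(grep -c 'Exception' "$log_file" || true)
    echo "OutOfMemoryError lines: $oom"
    echo "Exception lines: $exc"
}
```

To use it, pass the path of the server log as the only argument, for example count_log_errors server.log from inside the log folder.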
Checking the running queries can reveal a stuck query that may be causing the server slowness.
The query above shows that a user is generating a report with a long duration, which consumes server resources and causes server slowness.
- VIRUS
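However, there could be another problem that causes server slowness. One possible cause is a virus on the server PC. To check for this, as root, type top and look for a process whose CPU usage is above 100%.
As shown above, the CPU is currently at 332.6%, which may indicate the presence of a virus on the server PC. A legitimate process under heavy load can also exceed 100% on a multi-core machine, so check which process it is before drawing a conclusion.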
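The same check can be done without watching top interactively. This is a minimal sketch under assumptions: the function name filter_high_cpu is hypothetical, and it expects input lines in the form "PID %CPU COMMAND". Note that ps reports CPU usage averaged over the lifetime of the process, so its figures can differ from the live values in top.

```shell
#!/bin/sh
# Hypothetical helper: reads "PID %CPU COMMAND" lines on stdin and prints
# the ones whose %CPU is above the limit given as the first argument.
# Only the first word of the command name is printed.
filter_high_cpu() {
    awk -v limit="$1" '$2 + 0 > limit + 0 { print $1, $2, $3 }'
}
```

One way to feed it, as root, is ps -eo pid=,pcpu=,comm= | filter_high_cpu 100, which lists every process using more than one full core.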
If this happens, please refer to the wiki: /wiki/spaces/WM/pages/1002276120
- HARD DISK FULL
For on-premise customers, a full hard disk can also lead to server slowness.
The above screenshot is a sign that the hard disk is currently too full to check the running queries.
To check the hard disk, enter df -h as root.
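The df check can also be filtered so that only nearly full filesystems are listed. This is a minimal sketch under assumptions: the function name find_full_filesystems is hypothetical, it reads the POSIX-format output of df -P rather than df -h, and it does not handle mount points that contain spaces.

```shell
#!/bin/sh
# Hypothetical helper: reads `df -P` output on stdin and prints the mount
# point and use percentage of every filesystem at or above the limit
# (a whole number such as 90) given as the first argument.
find_full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        use = $5              # column 5 is the capacity, e.g. "100%"
        sub(/%/, "", use)     # drop the percent sign before comparing
        if (use + 0 >= limit + 0) print $6, $5
    }'
}
```

For example, df -P | find_full_filesystems 90 prints only the filesystems that are at least 90% full.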
The screenshot for this check shows that the hard disk is currently at 100% capacity, which causes the server slowness. If this happens, you can remove some of the files that are no longer used, for example old server logs. To find the server log, enter locate server.log
The output shows the directory that holds the server log. cd to that folder and enter ls -lhrt
This lists the items inside the folder, including the size of each server log. To make room on the hard disk, you can remove some of the older server logs as a temporary measure.
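Before removing anything, it can help to list only the logs that are old enough to be candidates. This is a minimal sketch under assumptions: the function name list_old_server_logs is hypothetical, and it assumes the log files in the folder have names that start with server.log. It only lists files and deletes nothing.

```shell
#!/bin/sh
# Hypothetical helper: prints the files under a directory whose names start
# with "server.log" and that were last modified more than the given number
# of whole days ago. Nothing is removed.
list_old_server_logs() {
    log_dir=$1
    days=$2
    find "$log_dir" -type f -name 'server.log*' -mtime +"$days" | sort
}
```

For example, list_old_server_logs . 30 run from inside the log folder shows the logs that have not changed for more than 30 days.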
You can also find large files using this command:
find / -type f -size +20M -exec ls -lh {} \; | awk '{ print $NF ": " $5 }'
This finds files larger than 20 MB. You can change the size to suit your needs.
If you are unsure whether a file can be safely removed, please check with your manager.
- RUNNING QUERIES
A stuck query can also cause server slowness. To check for this, log in to postgres from tealive test or keyopswork:
PGPASSWORD=4v1c3nn4s4msung psql postgres --host=my-samsung-hq-rds-new-emp.cejxvpigvz8w.ap-southeast-1.rds.amazonaws.com --port=5432 --username=janet
To select all the running queries using postgres:
SELECT (now() - xact_start) as period, pid,datname,state, substring(query, 0, 80) FROM pg_stat_activity where (now() - xact_start) is not null order by (now() - xact_start) desc;
If you want to check a specific server (ALLIT):
SELECT (now() - xact_start), pid,datname,state, substring(query, 0, 80), application_name FROM pg_stat_activity where datname ='allit' order by (now() - xact_start) desc;
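When many sessions are listed, it can help to show only the transactions that have been open longer than a threshold. This is a minimal sketch under assumptions: the function name long_running_sql and the threshold are hypothetical, and the query is a variation of the ones above rather than a command taken from this page.

```shell
#!/bin/sh
# Hypothetical helper: prints a query that lists only the transactions that
# have been open for longer than the given number of minutes.
long_running_sql() {
    minutes=$1
    # Accept whole numbers only, so nothing unexpected ends up in the SQL.
    case $minutes in
        ''|*[!0-9]*) echo "minutes must be a whole number" >&2; return 1 ;;
    esac
    cat <<EOF
SELECT (now() - xact_start) AS period, pid, datname, state, substring(query, 0, 80)
FROM pg_stat_activity
WHERE (now() - xact_start) > interval '$minutes minutes'
ORDER BY period DESC;
EOF
}
```

The printed query can be pasted into the psql session, or piped into the same psql command used to log in, for example long_running_sql 5 | psql followed by the usual connection options.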
To see the full query, replacing XXXX with the pid:
SELECT query from pg_stat_activity where pid =XXXX;
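To kill / terminate the query, replacing PID with the pid of the query:
SELECT pg_terminate_backend(PID);
select pg_cancel_backend(PID);
pg_cancel_backend cancels only the query that is running, while pg_terminate_backend ends the whole connection.
- RESTARTING SERVER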
Restarting the server usually solves most problems. To restart the server, you first have to stop JBoss using jboss-stop.
This could take a while depending on the server usage at the moment. If it takes too long, use ps aux|grep jboss to find the JBoss process and kill -9 [PID] to stop it. However, killing JBoss could cause errors in transactions that are in progress on the server.
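To avoid using kill -9 too early, the wait for JBoss to stop can be given a time limit. This is a minimal sketch under assumptions: the function name wait_for_exit and the timeout value are hypothetical, and the PID has to be looked up first, for example with the ps aux|grep jboss command above.

```shell
#!/bin/sh
# Hypothetical helper: waits up to timeout_s seconds for the process with
# the given PID to exit. Returns 0 if it exited, or 1 if it is still
# running when the time is up.
wait_for_exit() {
    pid=$1
    timeout_s=$2
    elapsed=0
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

For example, after running jboss-stop, wait_for_exit [PID] 120 returns 1 only if JBoss is still running after two minutes, which is the point at which kill -9 becomes the fallback.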
For on-premise customers, please restart postgres as well.
As root: /etc/init.d/postgresql-9.2.4 restart
Private & Confidential