共計 6228 個字符,預(yù)計需要花費 16 分鐘才能閱讀完成。
這期內(nèi)容當中丸趣 TV 小編將會給大家?guī)碛嘘P(guān) DB2 搜集數(shù)據(jù)腳本怎樣用于數(shù)據(jù)庫 hang 或性能問題的分析,文章內(nèi)容豐富且以專業(yè)的角度為大家分析和敘述,閱讀完這篇文章希望大家可以有所收獲。
由于產(chǎn)品的 bug,經(jīng)常會發(fā)生一些無法解釋的問題。在這種情況下我們就需要搜集相關(guān)的數(shù)據(jù)給產(chǎn)品部門分析原因并改進。
DB2 也不例外,IBM 把這些 bug 稱為 APAR。IBM 官方提供的搜集數(shù)據(jù)的相關(guān)命令有:
db2support,用于各種場景,特別是數(shù)據(jù)庫實例 crash 的情況,用法:
db2support . -d dbname -c -s
db2fodc,用于數(shù)據(jù)庫 hang 住的情況,用法:
db2fodc -hang -alldbs
或:
db2fodc -hang full
但在金融系統(tǒng)實際運維過程中,由于業(yè)務(wù)非常重要,發(fā)生故障時的首要任務(wù)是盡快恢復(fù)業(yè)務(wù)。而以上命令都會耗費比較長的時間,特別是 db2fodc 命令,必須在數(shù)據(jù)庫 hang 的情況下跑(事后跑無效),且需要 20 分鐘左右,客戶一般來說無法接受。而沒有相關(guān)數(shù)據(jù),IBM 實驗室無法分析出原因,也就無法改進。這也是我們經(jīng)常被客戶詬病的地方。
為了解決此問題,實驗室提供了輕量級的搜集數(shù)據(jù)命令,我把這些命令封裝成腳本,經(jīng)過實際測試,在客戶總連接數(shù) 3000,活躍連接數(shù) 200 左右的系統(tǒng)跑完需要不超過 1 分半鐘,而且不會導(dǎo)致更嚴重的性能問題。
腳本文件名為:gather_basic_data.sh,內(nèi)容如下:
#!/bin/ksh
# David Shen 2018/05/24 V1.0
# Used to gather necessary information(Stack,Trace) when database hang or have performance issues
#
# Functions
Usage ()
{
echo
echo Usage: $0 [-d dbname], [-t 0|1] [-f 0|1] [-p outputpath]
-d parameter is optional, if not specified, gather all db info under current db2 instance
-t parameter indicate if db2 trace info is needed, default is 0 – not needed
-f parameter indicate if db2 trace output need to be formated, default is 0 – not needed
-p parameter indicate where the output data should locate,default is instance db2diag path
Example: gather_basic_data.sh;gather_basic_data.sh -d testdb;gather_basic_data.sh -d testdb -t 1;
gather_basic_data.sh -d testdb -t 1 -f 1;gather_basic_data.sh -d testdb -t 1 -p /db/dbdata/
echo
exit 1
}
# Main function
# initial parameter
# The NeedDB2Trace variable indicates whether we need DB2 trace data!
# The NeedFmtTrace variable indicates whether we need to format trace data in the script!
# The NeedStack variable indicates whether we need DB2 stack files data!
TraceSleepTime=2
StackTraceInterval=5
NeedDB2Stack=1
OS=`uname -s|tr [a-z] [A-Z]`
#Read parameter
while getopts :d:t:f:p: opt
do
case ${opt} in
d ) DBName=${OPTARG} ;;
t ) NeedDB2Trace=$OPTARG ;;
f ) NeedFmtTrace=${OPTARG} ;;
p ) ParentDir=${OPTARG} ;;
esac
done
if [[-z $DBName]]; then
DBName=
fi
if [[-z $NeedDB2Trace]]; then
NeedDB2Trace=0
fi
if [[-z $NeedFmtTrace]]; then
NeedFmtTrace=0
fi
if [[-z $ParentDir]]; then
ParentDir=$(db2 get dbm cfg | grep Diagnostic data directory path | awk {print $NF} )
fi
#—————————————————————————
# Parm Checks
#—————————————————————————
if [$DBName !=] ; then
#DBName is valid and db is acitve?
if ((`db2 list active databases|grep -i $DBName|wc -l` == 0)) ; then
echo No active db named $DBName under this instance!
Usage
fi
fi
if [$NeedDB2Trace != 0] [$NeedDB2Trace != 1] ; then
echo -t Parameter not correct!
Usage
fi
if [$NeedDB2Trace == 1] ; then
if [$NeedFmtTrace != 0] [$NeedFmtTrace != 1] ; then
echo -f Parameter not correct!
Usage
fi
fi
if [! -d $ParentDir] ; then
echo Error: $ParentDir,no this directory!
Usage
else
#if there is enough space(1G)?
Freespace=$(df -m $ParentDir|tail -1|awk {print $3} )
if [${Freespace} -lt 1024 ] ; then
echo There is no enough space under $ParentDir,at least 1G space needed!
exit -1
fi
fi
##### Prepare for the directory that output files will be generated to.
CurrentTime=`date +%Y-%m-%d-%H.%M.%S`
DataPath= ${ParentDir}/${0}.${CurrentTime}
mkdir ${DataPath}
if [$? -ne 0]
then
echo `date` —– Failed to make directory ${DataPath}! Exiting …\n
exit -1
else
echo `date` —– Current working directory is $PWD \n
cd ${DataPath}
echo `date` —– Changed working directory to $DataPath \n
mkdir StackFiles
if [$? -ne 0]
then
echo `date` —– Failed to make directory ./StackFiles! Exiting …\n
exit -1
else
StackFilePath= ${PWD}/StackFiles
echo `date` —– Stack files are going to be put in $StackFilePath!\n
fi
LogFile= ${0}.log
echo `date` —– Starting script ${0}, log file is ${LogFile}\n\n | tee ${LogFile}
fi
##### Start gathering data
# Start gathering some OS data in background
echo `date` —– Start gathering some OS data in background …\n | tee -a ${LogFile}
nohup vmstat -tw 1 180 vmstat.txt
if [[$OS == AIX]]; then
nohup iostat -D -l -T 1 180 iostat.txt
else #Linux
nohup iostat -xtk 1 180 iostat.txt
fi
echo `date` —– OS data was submitted to background!\n\n | tee -a ${LogFile}
for i in 1 2 3
do
if [$NeedDB2Stack -eq 1]
then
# Stack files by db2pd -stack command
echo `date` —– Start generating stack files for the $i time …\n | tee -a ${LogFile}
# db2pd -stack all db2pd_stack_all_$i.txt
db2pd -stack all dumpdir=${StackFilePath} db2pd_stack_all_$i.txt
sleep $StackTraceInterval
echo `date` —– Stack files for the $i time is done!\n\n | tee -a ${LogFile}
else
echo `date` —– Not going to collect stack files!\n\n | tee -a ${LogFile}
fi
# db2pd data
echo `date` —– Start gathering db2pd data for the $i time …\n | tee -a ${LogFile}
if [$DBName ==]
then
db2pd -alldbs -appl -trans -apinfo db2pd_appl_$i.txt
db2pd -alldbs -locks wait db2pd_locks_$i.txt
db2pd -alldbs -logs db2pd_logs_$i.txt
db2pd -edu -agent db2pd_edu_$i.txt
db2pd -latch db2pd_latch_$i.txt
db2pd -dbptnmem -memset -mempool -inst -alldbs db2pd_mem_$i.txt
db2pd -alldbs -tcb db2pd_tcb_$i.txt
else
db2pd -db ${DBName} -appl -trans -apinfo db2pd_appl_$i.txt
db2pd -db ${DBName} -locks wait db2pd_locks_$i.txt
db2pd -db ${DBName} -logs db2pd_logs_$i.txt
db2pd -edu -agent db2pd_edu_$i.txt
db2pd -latch db2pd_latch_$i.txt
db2pd -dbptnmem -memset -mempool -inst -db ${DBName} db2pd_mem_$i.txt
db2pd -db ${DBName} -tcb db2pd_tcb_$i.txt
fi
echo `date` —– db2pd data for the $i time is done!\n\n | tee -a ${LogFile}
# DB2 trace (db2trc) data
if [$i -eq 2 -a $NeedDB2Trace -eq 1]
then
echo `date` —– Start gathering DB2 trace data, which will take at least $TraceSleepTime seconds …\n | tee -a ${LogFile}
db2trc on -t -i 128M
db2trc info db2trc_info.out
echo `date` —– DB2 trace data is turned on!\n | tee -a ${LogFile}
sleep $TraceSleepTime
db2trc stop ${LogFile} 2 1
db2trc dump db2trc.dmp ${LogFile} 2 1
db2trc off ${LogFile} 2 1
echo `date` —– Binary DB2 trace data is dumped out!\n\n | tee -a ${LogFile}
if [$NeedFmtTrace -eq 1]
then
db2trc flw -t db2trc.dmp db2trc.flw
db2trc fmt db2trc.dmp db2trc.fmt
else
echo `date` —– Not going to format binary trace data, please format the data manually after the script is finished!\n\n | tee -a ${LogFile}
fi
fi
# Some OS data
echo `date` —– Start gathering some OS data …\n | tee -a ${LogFile}
ps auxw ps_auxw_$i.txt
ipcs -a ipcs_$i.txt
if [[$OS == AIX]]; then
ps -kefl ps_kefl_$i.txt
svmon -G svmon_G_$i.txt
else #Linux
ps -elf ps_elf_$i.txt
netstat -v netstat_v_$i.txt
swapon -s swapon_$i.txt
free free_$i.txt
top -b -n 1 top_$i.txt
fi
echo `date` —– OS data is done!\n\n | tee -a ${LogFile}
done
# Other DB2 data
echo `date` —– Start gathering some other DB2 data …\n | tee -a ${LogFile}
db2set -all db2set.txt
db2pd -dbmcfg -dbcfg -alldbs db2pd_cfg.txt
echo `date` —– The other DB2 data is done!\n\n | tee -a ${LogFile}
echo `date` —– All data gathered, exiting …\n\n | tee -a ${LogFile}
exit 0
上述就是丸趣 TV 小編為大家分享的 DB2 搜集數(shù)據(jù)腳本怎樣用于數(shù)據(jù)庫 hang 或性能問題的分析了,如果剛好有類似的疑惑,不妨參照上述分析進行理解。如果想知道更多相關(guān)知識,歡迎關(guān)注丸趣 TV 行業(yè)資訊頻道。