PostgreSQL中create

137次閱讀

共計 7141 個字符，預計需要花費 18 分鐘才能閱讀完成。

本篇內容主要講解“PostgreSQL 中 create_index_path 函數有什么作用”，感興趣的朋友不妨來看看。本文介紹的方法操作簡單快捷，實用性強。下面就讓丸趣 TV 小編來帶大家學習“PostgreSQL 中 create_index_path 函數有什么作用”吧!

函數 build_index_paths 中的子函數 create_index_path 實現了索引掃描成本的估算主邏輯。

一、數據結構

IndexOptInfo
回顧 IndexOptInfo 索引信息結構體

 typedef struct IndexOptInfo
 {
 NodeTag type;
 
 Oid indexoid; /* Index 的 OID,OID of the index relation */
 Oid reltablespace; /* Index 的表空間,tablespace of index (not table) */
 RelOptInfo *rel; /*  指向 Relation 的指針,back-link to index s table */
 
 /* index-size statistics (from pg_class and elsewhere) */
 BlockNumber pages; /* Index 的 pages,number of disk pages in index */
 double tuples; /* Index 的元組數,number of index tuples in index */
 int tree_height; /*  索引高度,index tree height, or -1 if unknown */
 
 /* index descriptor information */
 int ncolumns; /*  索引的列數,number of columns in index */
 int nkeycolumns; /*  索引的關鍵列數,number of key columns in index */
 int *indexkeys; /* column numbers of index s attributes both
 * key and included columns, or 0 */
 Oid *indexcollations; /* OIDs of collations of index columns */
 Oid *opfamily; /* OIDs of operator families for columns */
 Oid *opcintype; /* OIDs of opclass declared input data types */
 Oid *sortopfamily; /* OIDs of btree opfamilies, if orderable */
 bool *reverse_sort; /*  倒序?is sort order descending? */
 bool *nulls_first; /* NULLs 值優先?do NULLs come first in the sort order? */
 bool *canreturn; /*  索引列可通過 Index-Only Scan 返回?which index cols can be returned in an
 * index-only scan? */
 Oid relam; /*  訪問方法 OID,OID of the access method (in pg_am) */
 
 List *indexprs; /*  非簡單索引列表達式鏈表, 如函數索引,expressions for non-simple index columns */
 List *indpred; /*  部分索引的謂詞鏈表,predicate if a partial index, else NIL */
 
 List *indextlist; /*  索引列(TargetEntry 結構體鏈表),targetlist representing index columns */
 
 List *indrestrictinfo; /*  父關系的 baserestrictinfo 列表， *  不包含索引謂詞隱含的所有條件
 * (除非是目標 rel，請參閱 check_index_predicates()中的注釋),
 * parent relation s baserestrictinfo
 * list, less any conditions implied by
 * the index s predicate (unless it s a
 * target rel, see comments in
 * check_index_predicates()) */
 
 bool predOK; /* True, 如索引謂詞滿足查詢要求,true if index predicate matches query */
 bool unique; /*  是否唯一索引,true if a unique index */
 bool immediate; /*  唯一性校驗是否立即生效,is uniqueness enforced immediately? */
 bool hypothetical; /*  是否虛擬索引,true if index doesn t really exist */
 
 /* Remaining fields are copied from the index AM s API struct: */
 // 從 Index Relation 拷貝過來的 AM(訪問方法)API 信息
 bool amcanorderbyop; /* does AM support order by operator result? */
 bool amoptionalkey; /* can query omit key for the first column? */
 bool amsearcharray; /* can AM handle ScalarArrayOpExpr quals? */
 bool amsearchnulls; /* can AM search for NULL/NOT NULL entries? */
 bool amhasgettuple; /* does AM have amgettuple interface? */
 bool amhasgetbitmap; /* does AM have amgetbitmap interface? */
 bool amcanparallel; /* does AM support parallel scan? */
 /* Rather than include amapi.h here, we declare amcostestimate like this */
 void (*amcostestimate) (); /*  訪問方法的估算函數,AM s cost estimator */
 } IndexOptInfo;

Cost 相關
注意: 實際使用的參數值通過系統配置文件定義, 而不是這里的常量定義!

 typedef double Cost; /* execution cost (in page-access units) */
 /* defaults for costsize.c s Cost parameters */
 /* NB: cost-estimation code should use the variables, not these constants! */
 /*  注意: 實際值通過系統配置文件定義, 而不是這里的常量定義! */
 /* If you change these, update backend/utils/misc/postgresql.sample.conf */
 #define DEFAULT_SEQ_PAGE_COST 1.0 // 順序掃描 page 的成本
 #define DEFAULT_RANDOM_PAGE_COST 4.0 // 隨機掃描 page 的成本
 #define DEFAULT_CPU_TUPLE_COST 0.01 // 處理一個元組的 CPU 成本
 #define DEFAULT_CPU_INDEX_TUPLE_COST 0.005 // 處理一個索引元組的 CPU 成本
 #define DEFAULT_CPU_OPERATOR_COST 0.0025 // 執行一次操作或函數的 CPU 成本
 #define DEFAULT_PARALLEL_TUPLE_COST 0.1 // 并行執行, 從一個 worker 傳輸一個元組到另一個 worker 的成本
 #define DEFAULT_PARALLEL_SETUP_COST 1000.0 // 構建并行執行環境的成本
 
 #define DEFAULT_EFFECTIVE_CACHE_SIZE 524288 /* 先前已有介紹, measured in pages */
 double seq_page_cost = DEFAULT_SEQ_PAGE_COST;
 double random_page_cost = DEFAULT_RANDOM_PAGE_COST;
 double cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST;
 double cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST;
 double cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST;
 double parallel_tuple_cost = DEFAULT_PARALLEL_TUPLE_COST;
 double parallel_setup_cost = DEFAULT_PARALLEL_SETUP_COST;
 
 int effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
 Cost disable_cost = 1.0e10;// 1 后面 10 個 0, 通過設置一個巨大的成本, 讓優化器自動放棄此路徑
 
 int max_parallel_workers_per_gather = 2;// 每次 gather 使用的 worker 數

二、源碼解讀

create_index_path
該函數創建索引掃描路徑節點, 其中調用函數 cost_index 計算索引掃描成本.

//----------------------------------------------- create_index_path
/*
* create_index_path
* Creates a path node for an index scan.
* 創建索引掃描路徑節點
*
* index is a usable index.
* indexclauses is a list of RestrictInfo nodes representing clauses
* to be used as index qual conditions in the scan.
* indexclausecols is an integer list of index column numbers (zero based)
* the indexclauses can be used with.
* indexorderbys is a list of bare expressions (no RestrictInfos)
* to be used as index ordering operators in the scan.
* indexorderbycols is an integer list of index column numbers (zero based)
* the ordering operators can be used with.
* pathkeys describes the ordering of the path.
* indexscandir is ForwardScanDirection or BackwardScanDirection
* for an ordered index, or NoMovementScanDirection for
* an unordered index.
* indexonly is true if an index-only scan is wanted.
* required_outer is the set of outer relids for a parameterized path.
* loop_count is the number of repetitions of the indexscan to factor into
* estimates of caching behavior.
* partial_path is true if constructing a parallel index scan path.
*
* Returns the new path node.
*/
IndexPath *
create_index_path(PlannerInfo *root,// 優化器信息
IndexOptInfo *index,// 索引信息
List *indexclauses,// 索引約束條件鏈表
List *indexclausecols,// 索引約束條件列編號鏈表, 與 indexclauses 一一對應
List *indexorderbys,//ORDER BY 原始表達式鏈表
List *indexorderbycols,//ORDER BY 列編號鏈表
List *pathkeys,// 排序路徑鍵
ScanDirection indexscandir,// 掃描方向
bool indexonly,// 純索引掃描?
Relids required_outer,// 需依賴的外部 Relids
double loop_count,// 用于估計緩存的重復次數
bool partial_path)// 是否并行索引掃描
{
IndexPath *pathnode = makeNode(IndexPath);// 構建節點
RelOptInfo *rel = index- // 索引對應的 Rel
List *indexquals,
*indexqualcols;

pathnode- path.pathtype = indexonly ? T_IndexOnlyScan : T_IndexScan;// 路徑類型
pathnode- path.parent = rel;//Relation
pathnode- path.pathtarget = rel- reltarget;// 路徑最終的投影列
pathnode- path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);// 參數化信息
pathnode- path.parallel_aware = false;//
pathnode- path.parallel_safe = rel- consider_parallel;// 是否并行
pathnode- path.parallel_workers = 0;//worker 數目
pathnode- path.pathkeys = pathkeys;// 排序路徑鍵

/* Convert clauses to the executor can handle */
// 轉換條件子句 (clauses) 為執行器可處理的索引表達式(indexquals)
expand_indexqual_conditions(index, indexclauses, indexclausecols,
indexquals, indexqualcols);

/* 填充路徑節點信息,Fill in the pathnode */
pathnode- indexinfo = index;
pathnode- indexclauses = indexclauses;
pathnode- indexquals = indexquals;
pathnode- indexqualcols = indexqualcols;
pathnode- indexorderbys = indexorderbys;
pathnode- indexorderbycols = indexorderbycols;
pathnode- indexscandir = indexscandir;

cost_index(pathnode, root, loop_count, partial_path);// 估算成本

return pathnode;
}

正文完