
Cherry Pick Fix orphaned temp namespace catalog entry left on coordinator #882

Merged
merged 17 commits into from
Jan 22, 2025

Conversation

yjhjstz
Member

@yjhjstz yjhjstz commented Jan 21, 2025

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


dreamedcheng and others added 16 commits January 20, 2025 15:41
Note that commit e88ceb8 fixed the orphaned temp table left on the
coordinator, but it didn't handle the orphaned namespace catalog
entry left on the coordinator. This commit fixes that.
Backported from Postgres REL_13_BETA1
postgres/postgres@8961355

Original commit message:

Buildfarm member chipmunk has failed twice due to taking >30s, and
twenty-four runs of other members have used >5s.  The test is new in
v13, so no back-patch.
**Issue:**
Currently `gpexpand` errors out whenever it is run using a user-created input file (not created using the `gpexpand` interview process) on a cluster that has `custom tablespaces` created with the following error -
```
$ cat gpexpand_inputfile_20230914_201220
jnihal3MD6M.vmware.com|jnihal3MD6M.vmware.com|7005|/tmp/demoDataDir3|5|3|p

$ gpexpand -i gpexpand_inputfile_20230914_201220
20230914:20:13:04:066896 gpexpand:jnihal3MD6M:jnihal-[ERROR]:-gpexpand failed: [Errno 2] No such file or directory: 'gpexpand_inputfile_20230914_201220.ts'
```

**RCA:**
This is happening due to commit 9b70ba8698f656c40ee62e5519314f7db1e4655e, which introduced a change requiring `gpexpand` to have a separate tablespace input configuration file (`<input_file>.ts`) whenever there are `custom tablespaces` in the database. However, this file is only created when the user uses the `gpexpand` interview process to create the input file.

In cases where the user manually creates the input file, the tablespace file is missing which causes the above error.

**Fix:**
**Fix:**
Add a check in the `read_tablespace_file()` function to verify whether the file is present. If it is not, create the file automatically and exit from the process, giving users a chance to review it (in case they want to change the `tablespace` location), and prompt them to re-run `gpexpand`.

The call to `read_tablespace_file()` is also moved before we start the expansion process, so that we exit before the expansion begins and the user does not have to `rollback` when they re-run `gpexpand`.
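The described check can be sketched roughly as follows. This is a minimal illustration, not the actual gpexpand code: the function and helper names, the template contents, and the exact messages are assumptions.

```python
import os
import sys


def write_default_tablespace_file(ts_file):
    """Write a template tablespace input file (illustrative contents)."""
    with open(ts_file, "w") as f:
        f.write("# tablespace_name|location\n")


def read_tablespace_file(input_file):
    """Check for the tablespace input configuration file before expansion.

    If '<input_file>.ts' is missing, write a template and exit so the
    user can review it and re-run gpexpand (sketch only).
    """
    ts_file = input_file + ".ts"
    if not os.path.exists(ts_file):
        write_default_tablespace_file(ts_file)
        print("[WARNING] Could not locate tablespace input configuration "
              "file '%s'. A new one has been written. Please review it and "
              "re-run with: gpexpand -i %s" % (ts_file, input_file))
        # Exit before the expansion starts, so no rollback is needed.
        sys.exit(1)
    with open(ts_file) as f:
        return f.read().splitlines()
```

Calling this before the expansion begins is what makes the exit safe: the cluster state has not yet been modified, so re-running with the reviewed file is all the user has to do.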

```
$ gpexpand -i gpexpand_inputfile_20230914_201220
20230914:20:24:00:014186 gpexpand:jnihal3MD6M:jnihal-[WARNING]:-Could not locate tablespace input configuration file 'gpexpand_inputfile_20230914_201220.ts'. A new tablespace input configuration file is written to 'gpexpand_inputfile_20230914_201220.ts'. Please review the file and re-run with: gpexpand -i gpexpand_inputfile_20230914_201220
20230914:20:24:00:014186 gpexpand:jnihal3MD6M:jnihal-[INFO]:-Exiting...

$ gpexpand -i gpexpand_inputfile_20230914_201220   --> re-run with the same input file
```
…Y is not set. (#16433)

* Fix: utilities do not honor -d flag when COORDINATOR_DATA_DIRECTORY is not set.

Issue: The following gpMgmt utilities do not honor the -d flag when
COORDINATOR_DATA_DIRECTORY is not set:
1. gpstart
2. gpstop
3. gpstatus
4. gprecoverseg
5. gpaddmirror

RCA: To get the coordinator data directory, the gp.getcoordinator_datadir()
function is called from the above-listed utilities. The function has no
provision to return a coordinator data directory provided with the -d flag;
currently it only looks for the COORDINATOR_DATA_DIRECTORY and
MASTER_DATA_DIRECTORY environment variables.

Also, some of the utilities created lock files before parsing the
provided options, a design flaw that caused the utilities to crash
while looking for the coordinator data directory.

Fix: Added a global flag that holds the data directory provided with the -d
option. When a utility parses its options, it sets the flag with the
provided data directory, and that value is returned when
gp.getcoordinator_datadir() is called.

Test:
Added behave test cases to use the provided data directory if COORDINATOR_DATA_DIRECTORY
is not set.
Added a behave test case to check that the provided coordinator data directory
is preferred over an already set COORDINATOR_DATA_DIRECTORY environment
variable. It is tested by setting a wrong COORDINATOR_DATA_DIRECTORY;
when the utility is run with the correct data directory via the -d option,
it should execute successfully.
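The global-flag fix can be sketched like this. It is a simplified stand-in for the real `gp.getcoordinator_datadir()`; the module-level variable and the setter name are illustrative assumptions, not the actual gpMgmt code.

```python
import os

# Module-level flag set during option parsing; holds the -d value if given
# (illustrative name, not the real gpMgmt variable).
_option_datadir = None


def set_coordinator_datadir(datadir):
    """Record the data directory supplied with the -d option."""
    global _option_datadir
    _option_datadir = datadir


def getcoordinator_datadir():
    """Prefer the -d option value over the environment variables."""
    if _option_datadir:
        return _option_datadir
    datadir = (os.environ.get("COORDINATOR_DATA_DIRECTORY")
               or os.environ.get("MASTER_DATA_DIRECTORY"))
    if not datadir:
        raise EnvironmentError(
            "COORDINATOR_DATA_DIRECTORY not set and no -d option given")
    return datadir
```

This matches the behavior the behave tests verify: a -d value wins even when a (possibly wrong) COORDINATOR_DATA_DIRECTORY is set.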
This should have been done with #16428, but we need to disable autovacuum when
running the gp_check_files regress test. Otherwise we might see errors like:

```
@@ -53,12 +53,8 @@
 -- check orphaned files, note that this forces a checkpoint internally.
 set client_min_messages = ERROR;
 select gp_segment_id, filename from run_orphaned_files_view();
- gp_segment_id | filename
----------------+----------
-             1 | 987654
-             1 | 987654.3
-(2 rows)
-
+ERROR:  failed to retrieve orphaned files after 10 minutes of retries.
+CONTEXT:  PL/pgSQL function run_orphaned_files_view() line 19 at RAISE
 reset client_min_messages;
```

In the log we have:
```
2023-09-20 15:33:00.766420 UTC,"gpadmin","regression",p148081,th-589358976,"[local]",,2023-09-20 15:31:39 UTC,0,con19,cmd65,seg-1,,dx38585,,sx1,"LOG","00000","attempt failed 17 with error: There is a client session running on one or more segment. Aborting...",,,,,"PL/pgSQL function run_orphaned_files_view() line 11 at RAISE","select gp_segment_id, filename from run_orphaned_files_view();",0,,"pl_exec.c",3857,

```

It is possible that some background jobs have created backends that should be
avoided when taking the gp_check_orphaned_files view. As we have decided to
make the view conservative (disallowing any backends that could cause false
positives in the view results), fixing the test is what we need.

In the test we have a safeguard that loops for 10 minutes and takes the view
repeatedly (function run_orphaned_files_view()). But that didn't solve the
issue, because the function saw only one snapshot of pg_stat_activity during
its entire execution. Now we explicitly call pg_stat_clear_snapshot() to solve
that issue.
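The retry logic can be sketched as a Python analogue of the PL/pgSQL helper. `run_query` is a hypothetical wrapper around a database connection, and the retry counts are illustrative; the point is that the stats snapshot must be cleared on every attempt, or each retry sees the same stale `pg_stat_activity` contents.

```python
import time


def run_orphaned_files_view(run_query, retries=60, delay=10):
    """Retry taking the orphaned-files view until no client sessions
    interfere, clearing the stats snapshot between attempts (sketch).
    """
    for attempt in range(retries):
        # Without this, every attempt would see the same (stale) snapshot
        # of pg_stat_activity and keep failing for the full loop duration.
        run_query("SELECT pg_stat_clear_snapshot()")
        try:
            return run_query(
                "SELECT gp_segment_id, filename FROM gp_check_orphaned_files")
        except RuntimeError:
            time.sleep(delay)
    raise RuntimeError(
        "failed to retrieve orphaned files after %d attempts" % retries)
```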

Co-authored-by: Ashwin Agrawal [email protected]
The comments for create_split_update_path:
```
GPDB_96_MERGE_FIXME: the below comment is obsolete. Nowadays, SplitUpdate
computes the new row's hash, and the corresponding. target segment. The
old segment comes from the gp_segment_id junk column. But ORCA still
does it the old way!

Third, to support deletion, and hash delete operation to correct segment,
we need to get attributes of OLD tuple. The old attributes must therefore
be present in the subplan's target list. That is handled earlier in the
planner, in expand_targetlist().
```
For the planner we use an Explicit Redistribute Motion instead of a Redistribute Motion above the Split node,
so the old attributes are useless for computing the target segment id of the Redistribute Motion.
Remove the obsolete comments.
… exceeded." (#16388)

When users try to select a large number of rows via psql, they may encounter the following error:
ERROR: "Cannot add cell to table content: total cell count of XXX exceeded."

The root cause is an overflow of the int type in printTableAddCell().
This commit fixes it.

(This issue also exists upstream; I have reported it:
https://www.postgresql.org/message-id/flat/TYBP286MB0351B057B101C90D7C1239E6B4E2A%40TYBP286MB0351.JPNP286.PROD.OUTLOOK.COM)
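The wraparound can be illustrated by emulating 32-bit signed arithmetic in Python. This is only a sketch of the failure mode; the actual fix is in psql's C code, and the row/column counts below are made-up illustrative numbers.

```python
def to_int32(n):
    """Wrap an integer to the range of a 32-bit signed int, as C int would."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


# A result set of ~430 million rows with 5 columns pushes the total cell
# count past INT_MAX (2,147,483,647), so a 32-bit counter goes negative.
rows, cols = 430_000_000, 5
cells_64bit = rows * cols
cells_32bit = to_int32(rows * cols)

assert cells_64bit == 2_150_000_000
assert cells_32bit < 0  # wrapped around
```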
Add the duration to the "canceling statement due to user request" message to help
users debug canceled queries.

The log message will be

2023-10-10 03:32:55.325528 UTC,"smart","postgres",p463714,th1746153216,"[local]",,2023-10-10 03:32:01 UTC,
0,con32,cmd9,seg-1,,,,sx1,"ERROR","57014","canceling statement due to user request, duration:4256.617",,,,,,
"select * from t, t as t1, t as t2;",0,,"postgres.c",4053,

The log schema is not changed.
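The message formatting can be sketched as follows. The function name and millisecond arithmetic are assumptions for illustration; the actual change lives in postgres.c.

```python
def cancel_message(start_ms, now_ms):
    """Format the cancel error with the statement's elapsed time in ms
    (illustrative sketch of the message shown in the log above)."""
    duration = now_ms - start_ms
    return ("canceling statement due to user request, duration:%.3f"
            % duration)
```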
…5673)

The coordinator and all segments are required to have the same set of collations.
This function was tuned to support the MPP structure via DispatchCollationCreate,
but only for importing collations from libc; once ICU is enabled, it fails.

This commit updates DispatchCollationCreate to support dispatching ICU
collations. A new argument `provider` is added and the unused `encoding`
argument is removed.
When a session exits within a transaction block, temp tables created
in the session are left behind. This is because ResetAllGangs resets
myTempNamespace, so the normal temp table catalog cleanup won't be
called. More details can be found at
https://github.com/greenplum-db/gpdb/issues/16506

Note that when a session is about to exit on the coordinator, there
is no need to drop temp tables by calling GpDropTempTables; the
callback function will do that at the end.
The FIXME in create_ctescan_path is as below:
```
// GPDB_96_MERGE_FIXME: Why do we set pathkeys in GPDB, but not in Postgres?
// pathnode->pathkeys = NIL;    /* XXX for now, result is always unordered */
pathnode->pathkeys = pathkeys;
```
We cannot find the commit explaining why GPDB sets pathkeys here while Postgres
does not; the line "pathnode->pathkeys = pathkeys;" was already present after
commit 6b0e52b, and the historical repositories offer no further information.

Sometimes we can avoid an extra Sort path when the pathkeys of the CTE node and
the pathkeys of its subpath are the same, so we should keep this code.
Just remove the GPDB_96_MERGE_FIXME.
The test was flaky with a diff like:

```
@@ -397,9 +397,9 @@
 (8 rows)

 select list_tablespace_dbid_dirs(:expected_number_of_tablespaces, :'tablespace_location');
-                                 list_tablespace_dbid_dirs
+                                                 list_tablespace_dbid_dirs
 ------------
- {Success,"expected 1 tablespaces","found 1 tablespaces",/tmp/my_tablespace_for_testing/2}
+ {Failed,"expected 1 tablespaces","found 2 tablespaces",/tmp/my_tablespace_for_testing/2,/tmp/my_tablespace_for_testing/5}
 (1 row)
```

The reason is that there's a chance the mirror has already replayed the WAL to
create the tablespace, so we can see two orphaned tablespace directories instead
of one. Now make the test panic even before we write the WAL that creates the
tablespace, so it is no longer flaky.
Add some help information for the gpfdist flag 'ssl_verify_peer'.
Once a logical slot has acquired a catalog_xmin, it doesn't let go of
it, even when invalidated by exceeding max_slot_wal_keep_size, which
means that dead catalog tuples are not removed by vacuum anymore from
the point of invalidation until the slot is dropped. This could be
catastrophic if catalog churn is high.

Change the computation of Xmin to ignore invalidated slots,
to prevent dead rows from accumulating.
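The changed computation can be sketched in Python. This is an illustration only: the real logic is C in the slot machinery, and treating a cleared restart_lsn as the invalidation marker is an assumption drawn from how slot invalidation is described here.

```python
def compute_required_catalog_xmin(slots):
    """Return the oldest catalog_xmin across valid slots, or None.

    Slots invalidated by exceeding max_slot_wal_keep_size (modeled here
    as restart_lsn being cleared to None) are skipped, so they no longer
    hold back catalog vacuum (sketch).
    """
    xmins = [s["catalog_xmin"] for s in slots
             if s["catalog_xmin"] is not None
             and s["restart_lsn"] is not None]  # skip invalidated slots
    return min(xmins) if xmins else None
```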

Backpatch to 13, where slot invalidation appeared.

Author: Sirisha Chamarthi <[email protected]>
Reviewed-by: Ashutosh Bapat <[email protected]>
Discussion: https://postgr.es/m/CAKrAKeUEDeqquN9vwzNeG-CN8wuVsfRYbeOUV9qKO_RHok=j+g@mail.gmail.com
(cherry picked from commit 36eeb37cd611c0a0bfb5743d9ddbef8f04fc87f3)
Although WAL replication is used as the HA mechanism by all segments
in a Greenplum cluster, there are some differences in how it is used
by coordinator/standby and by primary/mirror segments.  One such
difference is the use of replication slots.  Primary/mirror WAL
replication makes use of replication slots to ensure that a primary
retains WAL that may still be needed by its mirror.

Replication slots are not used for coordinator/standby.  To ensure
that coordinator retains WAL that is not yet received by the standby,
possibly due to replication lag, a special mechanism is used by
coordinator to track the flush location of standby.

This patch confines the use of this special mechanism to the
coordinator; the mechanism is no longer used by primary segments.
Ideally, coordinator/standby could also make use of replication
slots.  The GUC max_slot_wal_keep_size can be set to the lowest value
of (checkpoint_segments * XLOG_BLCKSZ) on the coordinator to ensure
that WAL doesn't fill up disk space when the standby goes down.  That
could be done in a followup patch.

Reviewed by: Ashwin Agrawal <[email protected]>
Reviewed by: Paul Guo <[email protected]>

(cherry picked from commit 0f2d65f497245eb00b5fc544cf609e79ddd75226)
The current WAL record pointer passed by the test to KeepLogSeg was
older than _logSegNo.  This cannot happen, so fix it by increasing it
beyond _logSegNo.  The test needs to be refactored heavily to make use
of the new WAL format, but let's do that as a follow-up.

Reviewed by: Ashwin Agrawal <[email protected]>
Reviewed by: Paul Guo <[email protected]>

(cherry picked from commit e4ec851b151267fe3209af6804e438611836a875)
@yjhjstz yjhjstz added the cherry-pick cherry-pick upstream commits label Jan 21, 2025
…ch) (#14978)

This PR fixes the bug reported in [issue 14644](https://github.com/greenplum-db/gpdb/issues/14644)

**Bug Detail:**
After analyzing a table with a dropped subclass, the value on the coordinator may differ from the segments. See the SQL result with bug below:
```
gpadmin=# CREATE TYPE test_type2 AS (a int, b text);
CREATE TYPE
gpadmin=# CREATE TABLE test_tbl2 OF test_type2;
CREATE TABLE
gpadmin=# CREATE TABLE test_tbl2_subclass () INHERITS (test_tbl2);
CREATE TABLE
gpadmin=# DROP TABLE test_tbl2_subclass;
DROP TABLE
gpadmin=# analyze;
ANALYZE
gpadmin=# select gp_segment_id, relhassubclass from pg_class where relname = 'test_tbl2';
gp_segment_id | relhassubclass
---------------+----------------
            -1 | t
(1 row)

gpadmin=# select gp_segment_id, relhassubclass from gp_dist_random('pg_class') where relname = 'test_tbl2';
 gp_segment_id | relhassubclass
---------------+----------------
             2 | f
             1 | f
             0 | f
(3 rows)
```

**Correct Behavior:**
The value of `relhassubclass` on the coordinator should be the same as on the segments.
```
gpadmin=# DROP TABLE test_tbl2_subclass;
DROP TABLE
gpadmin=# analyze;
ANALYZE
gpadmin=# select gp_segment_id, relhassubclass from pg_class where relname = 'test_tbl2';
gp_segment_id | relhassubclass
---------------+----------------
            -1 | f
(1 row)

gpadmin=# select gp_segment_id, relhassubclass from gp_dist_random('pg_class') where relname = 'test_tbl2';
 gp_segment_id | relhassubclass
---------------+----------------
             2 | f
             1 | f
             0 | f
(3 rows)
```
Signed-off-by: Yongtao Huang <[email protected]>
@yjhjstz yjhjstz marked this pull request as ready for review January 21, 2025 08:22
@yjhjstz yjhjstz merged commit a96a9df into apache:main Jan 22, 2025
22 checks passed