71 changes: 70 additions & 1 deletion features/search-replace.feature
@@ -292,7 +292,7 @@ Feature: Do global search/replace
When I run `wp search-replace 'http://example.com' 'http://newdomain.com' wp_posts --include-columns=post_content`
Then STDOUT should be a table containing rows:
| Table | Column | Replacements | Type |
- | wp_posts | post_content | 1 | SQL |
+ | wp_posts | post_content | 1 | PHP |

When I run `wp post get {POST_ID} --field=post_content`
Then STDOUT should contain:
@@ -1637,3 +1637,72 @@ Feature: Do global search/replace
"""
Table is read-only
"""

@require-mysql
Scenario: Search and replace handles JSON-encoded URLs in custom tables
Given a WP install
And I run `wp db query "CREATE TABLE wp_json_test ( id int(11) unsigned NOT NULL AUTO_INCREMENT, meta TEXT, PRIMARY KEY (id) ) ENGINE=InnoDB;"`
And I run `wp db query "INSERT INTO wp_json_test (meta) VALUES ('{\"confirmations\":{\"1\":{\"url\":\"https:\\/\\/oldsite.com\\/confirmation-page\",\"type\":\"redirect\"}}}');"`

When I run `wp search-replace 'https://oldsite.com' 'https://newsite.com' wp_json_test --all-tables-with-prefix`
Then STDOUT should be a table containing rows:
| Table | Column | Replacements | Type |
| wp_json_test | meta | 1 | PHP |

When I run `wp db query "SELECT meta FROM wp_json_test WHERE id = 1" --skip-column-names`
Then STDOUT should contain:
"""
https:\/\/newsite.com\/confirmation-page
"""
And STDOUT should not contain:
"""
https:\/\/oldsite.com
"""

@require-mysql
Scenario: Search and replace handles nested JSON (JSON within serialized data)
Given a WP install
And a setup-nested-json.php file:
"""
<?php
$data = array(
'config' => json_encode( array(
'url' => 'https://oldsite.com/page',
'name' => 'Test',
) ),
);
update_option( 'nested_json_test', $data );
"""
And I run `wp eval-file setup-nested-json.php`

When I run `wp search-replace 'https://oldsite.com' 'https://newsite.com' wp_options --include-columns=option_value`
Then STDOUT should be a table containing rows:
| Table | Column | Replacements | Type |
| wp_options | option_value | 1 | PHP |

When I run `wp option get nested_json_test --format=json`
Then STDOUT should contain:
"""
newsite.com
"""
And STDOUT should not contain:
"""
oldsite.com
"""

@require-mysql
Scenario: Search and replace detects JSON columns for PHP mode automatically
Given a WP install
And I run `wp db query "CREATE TABLE wp_json_detect ( id int(11) unsigned NOT NULL AUTO_INCREMENT, data TEXT, PRIMARY KEY (id) ) ENGINE=InnoDB;"`
And I run `wp db query "INSERT INTO wp_json_detect (data) VALUES ('{\"site_url\":\"https:\\/\\/old.example.com\\/path\"}');"`

When I run `wp search-replace 'https://old.example.com' 'https://new.example.com' wp_json_detect --all-tables-with-prefix`
Then STDOUT should be a table containing rows:
| Table | Column | Replacements | Type |
| wp_json_detect | data | 1 | PHP |

When I run `wp db query "SELECT data FROM wp_json_detect WHERE id = 1" --skip-column-names`
Then STDOUT should contain:
"""
new.example.com
"""
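The scenarios above rely on the fact that PHP's `json_encode()` escapes forward slashes by default, so the stored text contains `https:\/\/...` rather than `https://...`. The following is a minimal sketch, not the command's implementation, of why a literal string replacement finds nothing while a decode, replace, re-encode round trip does. The variable names are illustrative only.

```php
<?php
// Build the stored value the same way a plugin calling json_encode() might.
$stored = json_encode( array( 'site_url' => 'https://old.example.com/path' ) );
echo $stored, "\n"; // {"site_url":"https:\/\/old.example.com\/path"}

// A literal replacement on the encoded text misses the escaped URL.
$naive = str_replace( 'https://old.example.com', 'https://new.example.com', $stored, $naive_count );
echo $naive_count, "\n"; // 0

// Decoding first exposes the unescaped URL to the replacement.
$decoded             = json_decode( $stored, true );
$decoded['site_url'] = str_replace( 'https://old.example.com', 'https://new.example.com', $decoded['site_url'], $decoded_count );
$fixed               = json_encode( $decoded );
echo $decoded_count, "\n"; // 1
echo $fixed, "\n";         // {"site_url":"https:\/\/new.example.com\/path"}
```

Re-encoding with the default flags restores the `\/` escaping, which is the form the `STDOUT should contain` assertions in the first scenario look for.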
8 changes: 8 additions & 0 deletions src/Search_Replace_Command.php
@@ -562,6 +562,14 @@ public function __invoke( $args, $assoc_args ) {
if ( false !== strpos( $wpdb->last_error, 'ERROR 1139' ) ) {
$serial_row = true;
}

// Also detect JSON objects/arrays so the PHP path can decode,
// recurse into, and re-encode them — handling nested escaped
// URLs that a simple SQL REPLACE cannot reach.
if ( null === $serial_row ) {
// phpcs:ignore WordPress.DB.PreparedSQL.InterpolatedNotPrepared -- escaped through self::esc_sql_ident
$serial_row = $wpdb->get_row( "SELECT * FROM $table_sql WHERE $col_sql REGEXP '^[\\\\[{]' LIMIT 1" );
high

The regular expression `^[\\[{]` appears to be malformed as it opens a character class `[` but does not close it with a corresponding `]`. This will likely result in a database error or fail to match any rows, preventing the automatic detection of JSON columns. A more reliable and readable approach for detecting strings starting with `{` or `[` is to use an alternation.

						$serial_row = $wpdb->get_row( "SELECT * FROM $table_sql WHERE $col_sql REGEXP '^(\\\\[|{)' LIMIT 1" );
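
One way to check what pattern actually reaches the regex engine is to trace the two unescaping layers the literal passes through. This is a sketch under stated assumptions: MySQL is assumed to parse the SQL string with backslash escapes enabled (its default), and PCRE via `preg_match()` is used only as a stand-in for the server's regex engine, which may treat a backslash inside a character class differently depending on the MySQL version.

```php
<?php
// The regex literal exactly as written inside the double-quoted PHP string in the diff.
$after_php = "^[\\\\[{]";

// Assumption: MySQL collapses each \\ in the SQL string literal to a single \
// (i.e. the NO_BACKSLASH_ESCAPES SQL mode is off).
$after_mysql = str_replace( '\\\\', '\\', $after_php );

echo $after_php, "\n";   // ^[\\[{]
echo $after_mysql, "\n"; // ^[\[{]

// PCRE stands in for the database regex engine here; this is an approximation.
$samples = array( '{"a":1}', '[1,2]', 'a:1:{i:0;i:1;}', 'plain text' );
$matches = array();
foreach ( $samples as $sample ) {
	$matches[ $sample ] = preg_match( '/' . $after_mysql . '/', $sample );
	printf( "%-16s => %d\n", $sample, $matches[ $sample ] );
}
```

Under these assumptions the pattern that is evaluated is `^[\[{]`, where the final `]` closes the class, and the two JSON-looking samples match while the serialized and plain-text samples do not. Running the original query against the target MySQL or MariaDB version remains the authoritative check.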

}
}

if ( $php_only || $this->regex || null !== $serial_row ) {
15 changes: 14 additions & 1 deletion src/WP_CLI/SearchReplacer.php
@@ -200,7 +200,20 @@ private function run_recursively( $data, $serialised, $recursion_level = 0, $vis
if ( $this->logging ) {
$old_data = $data;
}
- if ( $this->regex ) {

// Try to decode as a JSON object or array and recurse into the
// decoded structure. This properly handles URLs stored inside
// JSON-encoded columns (e.g. Gravity Forms confirmations, block
// editor font data), including nested JSON where slashes are
// double-escaped.
$json_decoded = json_decode( $data, true );
if ( null !== $json_decoded && is_array( $json_decoded ) ) {
$json_decoded = $this->run_recursively( $json_decoded, false, $recursion_level + 1, $visited_data );
$json_result = json_encode( $json_decoded );
if ( false !== $json_result ) {
$data = $json_result;
}
} elseif ( $this->regex ) {
Comment on lines +204 to +216
medium

Attempting to `json_decode()` every string in a column detected as JSON can lead to significant performance degradation, especially for large text fields like `post_content`. Since valid JSON objects and arrays must start with `{` or `[`, adding a quick character check before calling `json_decode()` will avoid unnecessary parsing for the majority of non-JSON strings.

				// Try to decode as a JSON object or array and recurse into the
				// decoded structure. This properly handles URLs stored inside
				// JSON-encoded columns (e.g. Gravity Forms confirmations, block
				// editor font data), including nested JSON where slashes are
				// double-escaped. Only strings that start with `{` or `[` are
				// decoded; anything else, including such strings that turn out
				// not to be valid JSON, still reaches the branches below.
				$json_decoded = null;
				if ( '' !== $data && ( '{' === $data[0] || '[' === $data[0] ) ) {
					$json_decoded = json_decode( $data, true );
				}
				if ( is_array( $json_decoded ) ) {
					$json_decoded = $this->run_recursively( $json_decoded, false, $recursion_level + 1, $visited_data );
					$json_result  = json_encode( $json_decoded );
					if ( false !== $json_result ) {
						$data = $json_result;
					}
				} elseif ( $this->regex ) {
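
The leading-character check described above can be sketched in isolation. The two functions below are hypothetical helpers written for illustration, not part of WP-CLI, and they use plain `str_replace()` in place of the class's regex and serialization handling. The point they try to show is that a string which starts with `{` or `[` but fails to decode should still fall through to the ordinary replacement instead of being skipped.

```php
<?php
// Hypothetical helper: walk a decoded JSON structure and replace inside every string.
function sketch_replace_recursive( $value, $from, $to ) {
	if ( is_array( $value ) ) {
		foreach ( $value as $key => $item ) {
			$value[ $key ] = sketch_replace_recursive( $item, $from, $to );
		}
		return $value;
	}
	// Strings go back through the string handler so nested JSON is decoded too.
	return is_string( $value ) ? sketch_replace_in_string( $value, $from, $to ) : $value;
}

// Hypothetical helper: replace inside a single string, decoding JSON when present.
function sketch_replace_in_string( $data, $from, $to ) {
	$decoded = null;
	// Cheap pre-check: only strings starting with { or [ are handed to json_decode().
	if ( '' !== $data && ( '{' === $data[0] || '[' === $data[0] ) ) {
		$decoded = json_decode( $data, true );
	}
	if ( is_array( $decoded ) ) {
		$encoded = json_encode( sketch_replace_recursive( $decoded, $from, $to ) );
		if ( false !== $encoded ) {
			return $encoded;
		}
	}
	// Not JSON, invalid JSON, or re-encoding failed: plain replacement.
	return str_replace( $from, $to, $data );
}

$from = 'https://oldsite.com';
$to   = 'https://newsite.com';

echo sketch_replace_in_string( '{"url":"https:\/\/oldsite.com\/page"}', $from, $to ), "\n";
echo sketch_replace_in_string( '{not json https://oldsite.com', $from, $to ), "\n";
echo sketch_replace_in_string( 'plain https://oldsite.com', $from, $to ), "\n";
```

The second call is the case worth watching: the input passes the character check, `json_decode()` returns `null`, and the text is still rewritten by the final `str_replace()`.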

$search_regex = $this->regex_delimiter;
$search_regex .= $this->from;
$search_regex .= $this->regex_delimiter;