WordPress Development Recipe: Efficient binary storage and retrieval in custom tables using Match expressions
Database Schema Design for Binary Data
When developing custom WordPress plugins that require storing and retrieving binary data (e.g., images, PDFs, serialized objects), directly embedding this data within the WordPress `wp_posts` or `wp_options` tables is generally inefficient and can lead to performance degradation. A more robust approach involves creating dedicated custom database tables. For optimal performance, especially when dealing with large binary blobs, consider using the `LONGBLOB` or `MEDIUMBLOB` data types in MySQL. However, for this recipe, we’ll focus on a common scenario where we store metadata about binary files and their associated identifiers, deferring the actual binary storage to the filesystem or a dedicated object store. This approach keeps the database lean and fast.
Let’s define a custom table structure. We’ll need a table to store information about our binary assets. This table will include a unique identifier, a reference to the WordPress object it’s associated with (e.g., a post ID), a filename, MIME type, and a storage path or identifier. We’ll also include timestamps for tracking.
Creating the Custom Table
The standard WordPress activation hook is the ideal place to create custom database tables. This ensures the table is present when the plugin is first activated. We’ll use the `$wpdb` global object for all database interactions.
/**
* Plugin activation hook.
* Creates the custom table for binary asset metadata.
*/
function my_plugin_activate() {
global $wpdb;
$table_name = $wpdb->prefix . 'binary_assets';
$charset_collate = $wpdb->get_charset_collate();
$sql = "CREATE TABLE $table_name (
id mediumint(9) NOT NULL AUTO_INCREMENT,
object_id bigint(20) unsigned DEFAULT NULL COMMENT 'ID of the associated WordPress object (e.g., post_id)',
object_type varchar(50) DEFAULT '' COMMENT 'Type of the associated object (e.g., post, user)',
file_name varchar(255) NOT NULL DEFAULT '' COMMENT 'Original filename',
mime_type varchar(100) NOT NULL DEFAULT '' COMMENT 'MIME type of the file',
storage_path varchar(1024) NOT NULL DEFAULT '' COMMENT 'Path or identifier in storage (filesystem, S3, etc.)',
created_at datetime DEFAULT CURRENT_TIMESTAMP,
updated_at datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (id),
KEY object_index (object_id, object_type)
) $charset_collate;
To register this activation hook, you would typically do this in your main plugin file:
register_activation_hook( __FILE__, 'my_plugin_activate' );
Storing Binary Asset Metadata
When a new binary asset is uploaded or associated with a WordPress object, we need to insert its metadata into our custom table. For this example, let’s assume we have a function that handles file uploads and returns the necessary information.
/**
* Stores metadata for a binary asset.
*
* @param int $object_id The ID of the associated WordPress object.
* @param string $object_type The type of the associated object (e.g., 'post').
* @param string $file_name The original filename.
* @param string $mime_type The MIME type of the file.
* @param string $storage_path The path or identifier in storage.
* @return int|false The ID of the newly inserted row, or false on failure.
*/
function store_binary_asset_metadata( $object_id, $object_type, $file_name, $mime_type, $storage_path ) {
global $wpdb;
$table_name = $wpdb->prefix . 'binary_assets';
$data = array(
'object_id' => absint( $object_id ),
'object_type' => sanitize_text_field( $object_type ),
'file_name' => sanitize_file_name( $file_name ),
'mime_type' => sanitize_mime_type( $mime_type ),
'storage_path' => esc_url_raw( $storage_path ), // Or sanitize_text_field if not a URL
);
$format = array(
'%d',
'%s',
'%s',
'%s',
'%s',
);
$inserted = $wpdb->insert( $table_name, $data, $format );
if ( $inserted ) {
return $wpdb->insert_id;
}
return false;
}
// Example usage:
// $asset_id = store_binary_asset_metadata( 123, 'post', 'document.pdf', 'application/pdf', '/uploads/docs/document.pdf' );
// if ( $asset_id ) {
// error_log( "Binary asset metadata stored with ID: " . $asset_id );
// } else {
// error_log( "Failed to store binary asset metadata." );
// }
Retrieving Binary Asset Metadata Using MATCH()
The `MATCH()` function in MySQL is powerful for full-text searching, but it’s not directly applicable here as we’re dealing with structured metadata, not free-form text content. Instead, we’ll leverage standard SQL `WHERE` clauses for efficient retrieval. However, if our `storage_path` or `file_name` fields were to contain more complex, searchable patterns, we might consider using `LIKE` or regular expressions. For this recipe, we’ll focus on retrieving assets associated with a specific WordPress object.
To retrieve all binary assets associated with a particular WordPress object (e.g., a post), we can query our custom table using the `object_id` and `object_type` columns. This is where indexing becomes crucial for performance.
/**
* Retrieves binary asset metadata for a given WordPress object.
*
* @param int $object_id The ID of the associated WordPress object.
* @param string $object_type The type of the associated object (e.g., 'post').
* @return array|null An array of asset metadata, or null if none found or on error.
*/
function get_binary_assets_for_object( $object_id, $object_type ) {
global $wpdb;
$table_name = $wpdb->prefix . 'binary_assets';
// Ensure inputs are sanitized
$object_id = absint( $object_id );
$object_type = sanitize_text_field( $object_type );
// Prepare the query
$query = $wpdb->prepare(
"SELECT * FROM $table_name WHERE object_id = %d AND object_type = %s ORDER BY created_at DESC",
$object_id,
$object_type
);
// Execute the query
$results = $wpdb->get_results( $query, ARRAY_A ); // ARRAY_A returns associative arrays
if ( empty( $results ) ) {
return null; // Or return an empty array: return array();
}
return $results;
}
// Example usage:
// $post_id = 123;
// $assets = get_binary_assets_for_object( $post_id, 'post' );
//
// if ( $assets ) {
// echo "<h3>Assets for Post ID {$post_id}:</h3>";
// echo "<ul>";
// foreach ( $assets as $asset ) {
// echo "<li>";
// echo "Filename: " . esc_html( $asset['file_name'] ) . ", ";
// echo "MIME Type: " . esc_html( $asset['mime_type'] ) . ", ";
// echo "Storage Path: " . esc_url( $asset['storage_path'] ) . " (";
// // Here you would construct a link or action to retrieve the actual file
// echo "<a href='" . esc_url( site_url( '/download-asset/?id=' . $asset['id'] ) ) . "'>Download</a>";
// echo ")";
// echo "</li>";
// }
// echo "</ul>";
// } else {
// echo "<p>No assets found for Post ID {$post_id}.</p>";
// }
Advanced Retrieval: Filtering by Multiple Criteria
For more complex retrieval scenarios, you might need to filter by filename patterns, MIME types, or date ranges. This is where `LIKE` operators and date functions come into play. While not using `MATCH()`, these techniques provide robust filtering capabilities.
/**
* Retrieves binary asset metadata with advanced filtering.
*
* @param array $args {
* Optional. An array of arguments to filter assets.
* @type int $object_id Filter by object ID.
* @type string $object_type Filter by object type.
* @type string $filename_pattern Filter by filename pattern (e.g., '%.pdf').
* @type string $mime_type Filter by MIME type.
* @type string $date_after Filter by creation date (YYYY-MM-DD).
* }
* @return array|null An array of asset metadata, or null if none found or on error.
*/
function get_binary_assets_advanced( $args = array() ) {
global $wpdb;
$table_name = $wpdb->prefix . 'binary_assets';
$defaults = array(
'object_id' => null,
'object_type' => '',
'filename_pattern' => '',
'mime_type' => '',
'date_after' => '',
);
$args = wp_parse_args( $args, $defaults );
$where_clauses = array();
$query_params = array();
$query_formats = array();
if ( ! empty( $args['object_id'] ) ) {
$where_clauses[] = "object_id = %d";
$query_params[] = absint( $args['object_id'] );
$query_formats[] = '%d';
}
if ( ! empty( $args['object_type'] ) ) {
$where_clauses[] = "object_type = %s";
$query_params[] = sanitize_text_field( $args['object_type'] );
$query_formats[] = '%s';
}
if ( ! empty( $args['filename_pattern'] ) ) {
$where_clauses[] = "file_name LIKE %s";
$query_params[] = '%' . sanitize_text_field( $args['filename_pattern'] ); // Assuming pattern is suffix, e.g., 'pdf'
$query_formats[] = '%s';
}
if ( ! empty( $args['mime_type'] ) ) {
$where_clauses[] = "mime_type = %s";
$query_params[] = sanitize_mime_type( $args['mime_type'] );
$query_formats[] = '%s';
}
if ( ! empty( $args['date_after'] ) ) {
// Basic date validation, could be more robust
if ( preg_match( '/^\d{4}-\d{2}-\d{2}$/', $args['date_after'] ) ) {
$where_clauses[] = "created_at >= %s";
$query_params[] = sanitize_text_field( $args['date_after'] ) . ' 00:00:00';
$query_formats[] = '%s';
}
}
$where_sql = '';
if ( ! empty( $where_clauses ) ) {
$where_sql = "WHERE " . implode( ' AND ', $where_clauses );
}
// Prepare the query
$query = "SELECT * FROM $table_name $where_sql ORDER BY created_at DESC";
// Use $wpdb->prepare with dynamic parameters
if ( ! empty( $query_params ) ) {
$query = $wpdb->prepare( $query, array_merge( $query_formats, array() ) ); // $wpdb->prepare expects an array of formats
}
// Execute the query
$results = $wpdb->get_results( $query, ARRAY_A );
if ( empty( $results ) ) {
return null;
}
return $results;
}
// Example usage:
// Find all PDF documents uploaded for post 123 after January 1st, 2023
// $filtered_assets = get_binary_assets_advanced( array(
// 'object_id' => 123,
// 'object_type' => 'post',
// 'filename_pattern' => 'pdf', // Will search for file_name LIKE '%.pdf'
// 'date_after' => '2023-01-01',
// ) );
//
// if ( $filtered_assets ) {
// error_log( "Found " . count( $filtered_assets ) . " filtered assets." );
// } else {
// error_log( "No filtered assets found." );
// }
Handling Actual Binary Data Retrieval
The metadata retrieval functions above return the `storage_path`. The actual mechanism for serving the binary file depends entirely on where and how you’ve stored it. Common strategies include:
- Filesystem: The `storage_path` could be a relative or absolute path on the server. You’d use PHP functions like `readfile()` or `file_get_contents()` to fetch the data and set appropriate HTTP headers (e.g., `Content-Type`, `Content-Disposition`) before outputting the file.
- Object Storage (e.g., AWS S3, Google Cloud Storage): The `storage_path` would be an object key or URL. You’d use the respective SDK to generate a pre-signed URL for direct download or stream the object content.
- WordPress Media Library (less common for custom tables): If you’ve also registered the file with the WordPress Media Library, you might retrieve its attachment ID and use `wp_get_attachment_url()` or `wp_get_attachment_image_src()`.
For security and efficiency, it’s crucial to implement a dedicated download endpoint (e.g., a custom AJAX handler or a rewrite rule pointing to a PHP script) that verifies user permissions before serving the file. Never directly expose file paths or allow unauthenticated access to sensitive binary data.
Considerations for `MATCH()` Expressions
While this recipe focuses on structured metadata retrieval, it’s worth noting when `MATCH()` *would* be appropriate. If your custom table stored a large text field (e.g., `description` or `content`) and you needed to perform full-text searches on that content, you would:
- Add a `FULLTEXT` index to the relevant column(s) in your `CREATE TABLE` statement.
- Use `MATCH(column1, column2) AGAINST(‘search terms’ IN BOOLEAN MODE)` in your `WHERE` clause.
For example, if `binary_assets` had a `description` column:
-- Adding FULLTEXT index
ALTER TABLE wp_binary_assets ADD FULLTEXT(file_name, mime_type, storage_path);
-- Example query using MATCH
SELECT id, file_name
FROM wp_binary_assets
WHERE object_id = 123
AND MATCH(file_name, storage_path) AGAINST('report pdf' IN BOOLEAN MODE);
This allows for natural language searching or boolean searches within the specified text fields, which is distinct from the exact matching and pattern matching used for metadata retrieval in this recipe.