
Dynamic Date Partitioning for Cloud Storage Destinations

The Problem:
Currently, when setting up a Cloud Storage destination (such as Google Cloud Storage), it is difficult to organize data into a standard partitioned folder structure (e.g., folder/YYYY/MM/DD/file.parquet), because the "Upload Path" field often treats date tags as literal strings or lacks support for granular date variables.

The Solution:
I would like to suggest native support for date variables within the Upload Path field. Ideally, this would include:

- Standardized Tags: Support for tags like {YYYY}, {MM}, and {DD} that resolve based on the data's date range or the execution date.
- Dynamic Subfolder Creation: The ability to combine these tags with the forward slash (/) character to automatically generate the directory structure in the bucket.
- Hive Partitioning Format: The ability to define paths like year={YYYY}/month={MM}/day={DD}/ to enable seamless integration with data lake tools such as BigQuery external tables, AWS Athena, or Spark.

Why this is important:

- Data Organization: Manually managing massive amounts of data in a single root folder is not scalable.
- Query Performance: Partitioning is essential for optimizing query cost and speed in BigQuery/Athena.
- Automation: It eliminates the need for intermediary scripts (such as Cloud Functions or Glue jobs) whose only purpose is to move files into the correct date-based folders.

Use Case Example:
A user wants to export Google Ads data daily. With this feature, a template such as google_ads/{YYYY}/{MM}/{DD}/data.parquet would automatically resolve to google_ads/2026/01/28/data.parquet without manual intervention.
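To make the requested behavior concrete, here is a minimal sketch of how the destination connector could resolve the proposed tags. The function name `resolve_upload_path` is hypothetical (this is not an existing product API); it simply substitutes {YYYY}, {MM}, and {DD} with zero-padded components of a given date, which covers both the plain-folder and Hive-style layouts described above:

```python
from datetime import date

def resolve_upload_path(template: str, d: date) -> str:
    """Resolve {YYYY}/{MM}/{DD} date tags in an upload-path template.

    Hypothetical sketch of the requested feature: a real connector
    would pick `d` from the data's date range or the execution date.
    """
    return (template
            .replace("{YYYY}", f"{d.year:04d}")
            .replace("{MM}", f"{d.month:02d}")
            .replace("{DD}", f"{d.day:02d}"))

# Plain date-partitioned layout
print(resolve_upload_path("google_ads/{YYYY}/{MM}/{DD}/data.parquet",
                          date(2026, 1, 28)))
# -> google_ads/2026/01/28/data.parquet

# Hive-style layout, consumable by BigQuery external tables, Athena, or Spark
print(resolve_upload_path("google_ads/year={YYYY}/month={MM}/day={DD}/data.parquet",
                          date(2026, 1, 28)))
# -> google_ads/year=2026/month=01/day=28/data.parquet
```

Because the tags resolve to zero-padded numeric strings, object keys sort lexicographically in date order, and the Hive-style variant can be registered directly as a partitioned external table without any intermediary file-moving script.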