How to Get the List of all Files with Size, Modified and Path from GCS Bucket and Load into BigQuery


In this post, you will learn how to get the list of all files with their size, modified date, and path from a Google Cloud Storage (GCS) bucket and load them into BigQuery.

We will guide you through the process step by step, starting with setting up the necessary permissions and credentials for accessing GCS and BigQuery.

By the end of this post, you will clearly understand how to extract file metadata from a GCS bucket, load it into BigQuery, and use BigQuery's querying capabilities for further analysis.

If you’re a data engineer, data analyst, or anyone working with large datasets in Google Cloud Platform, this post is for you! 

Script: 

function listFolderContents() {
  // Provide the name of the folder from which you want to get the list of files.
  var foldername = 'Final Logos';
  var ListOfFiles = 'ListOfFiles_' + foldername;

  // Note: DriveApp reads from Google Drive, so as written this script lists
  // the files in a Drive folder and writes them to a new Google Sheet.
  var folders = DriveApp.getFoldersByName(foldername);
  var folder = folders.next();
  var contents = folder.getFiles();

  // Create the output spreadsheet and write the header row.
  var ss = SpreadsheetApp.create(ListOfFiles);
  var sheet = ss.getActiveSheet();
  sheet.appendRow(['name', 'link', 'sizeInMB']);

  var var_file;
  var var_name;
  var var_link;
  var var_size;

  // Append one row per file: name, URL, and size converted from bytes to MB.
  while (contents.hasNext()) {
    var_file = contents.next();
    var_name = var_file.getName();
    var_link = var_file.getUrl();
    var_size = var_file.getSize() / 1024.0 / 1024.0;
    sheet.appendRow([var_name, var_link, var_size]);
  }
}
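
The script above uses DriveApp and SpreadsheetApp, so it works with a Google Drive folder and a Google Sheet. For the GCS-to-BigQuery flow this post describes, the sketch below shows one possible way to shape file metadata into rows that BigQuery can load. It assumes you already have the object metadata in the form the Cloud Storage JSON API returns it (fields `bucket`, `name`, `size`, `updated`). The function names `toRow` and `toNdjson`, the column names, and the sample values are hypothetical and only for illustration.

```javascript
// Sketch only: the column names (file_name, file_path, size_bytes, modified)
// are hypothetical; rename them to match your own BigQuery table schema.

// Convert one GCS object resource into a flat row.
function toRow(obj) {
  return {
    file_name: obj.name.split('/').pop(),             // last segment of the object name
    file_path: 'gs://' + obj.bucket + '/' + obj.name, // full gs:// path
    size_bytes: Number(obj.size),                     // the JSON API returns size as a string
    modified: obj.updated                             // RFC 3339 timestamp of the last update
  };
}

// Serialize rows as newline-delimited JSON, a format BigQuery load jobs accept.
function toNdjson(objects) {
  return objects.map(function (obj) {
    return JSON.stringify(toRow(obj));
  }).join('\n');
}

// Example input shaped like GCS object metadata (all values are made up).
var sampleObjects = [
  { bucket: 'my-bucket', name: 'logs/2023/app.log', size: '2048', updated: '2023-05-01T10:15:30.000Z' },
  { bucket: 'my-bucket', name: 'readme.txt', size: '17', updated: '2023-04-20T08:00:00.000Z' }
];

console.log(toNdjson(sampleObjects));
```

Each output line is one JSON object per file. You could save that output to a file and load it into a BigQuery table whose columns match the keys used here.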

That’s it! You should now have a BigQuery table with all the files from your GCS bucket along with their size, modified date, and path.

 

Video Demo: How to Get the List of all Files with Size, Modified and Path from GCS Bucket and Load into BigQuery.
