Mastering Object Versioning in Google Cloud Storage: A Comprehensive Guide

Use case
Suppose we have a lot of data in a Cloud Storage bucket and someone mistakenly runs
gsutil rm gs://my_bucket/*
or a script or application deletes critical data because of a bug or a missing argument.
Without Object Versioning, that data is lost and may be difficult or even impossible to recover.
How does GCS Object Versioning help?
By design, every object (file) in Cloud Storage is assigned 2 sequence numbers:
- generation number
- meta-generation number
We will talk about them in detail later.
In a nutshell, an object gets a new generation number each time it is replaced (overwritten). Similarly, its meta-generation number is incremented each time its metadata is updated.
By default, Object Versioning is disabled because it costs more: every older version kept under its own generation number still takes up billable storage. If we need the ability to recover overwritten or deleted data, though, we can turn it on.
Enabling Object Versioning
- We have just created a new bucket, ashish_vtest, containing 2 files: log.txt and Main.java.
- We can check whether Object Versioning is enabled on a bucket. It is a bucket-level setting, so it applies to every folder in the bucket.
The status is either Suspended or Enabled.
gsutil versioning get gs://ashish_vtest

Now let’s enable versioning:
gsutil versioning set on gs://ashish_vtest
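The check-then-enable steps above can be combined in a small script. This is only a sketch: it assumes that gsutil versioning get prints a line of the form "gs://ashish_vtest: Enabled" (or "Suspended"), and the helper names parse_versioning_status and ensure_versioning are made up for illustration.

```shell
#!/usr/bin/env bash
# GSUTIL can be overridden (for example GSUTIL=echo) to preview commands.
GSUTIL="${GSUTIL:-gsutil}"

# Extract the status word from a line such as "gs://ashish_vtest: Enabled".
# Assumption: the status is whatever follows the last ": " on the line.
parse_versioning_status() {
  printf '%s\n' "${1##*: }"
}

# Turn versioning on for a bucket only if it is not already enabled.
ensure_versioning() {
  status="$(parse_versioning_status "$($GSUTIL versioning get "$1")")"
  if [ "$status" != "Enabled" ]; then
    $GSUTIL versioning set on "$1"
  fi
}

# Example call (commented out so the script changes nothing by default):
# ensure_versioning gs://ashish_vtest
```

Calling ensure_versioning gs://ashish_vtest leaves an already versioned bucket untouched, which makes the script safe to run repeatedly.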

Checking Object Versions
- We can use gsutil ls -a gs://<path> to list all object versions (both the current files and their older, non-current versions).

- The number after each file name, following the #, is the generation number.
- We can access any non-current file (an old version) by using its full name (name + generation number).
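Because a full name is simply name + # + generation number, it is easy to take apart in a script. The sketch below shows one way to do it with plain shell parameter expansion; the function names are hypothetical and the generation number is a made-up placeholder, not real output from the bucket.

```shell
#!/usr/bin/env bash
# Split a versioned object URL, as listed by `gsutil ls -a`, into its parts.

# Print everything before the last '#': the object name.
object_name() {
  printf '%s\n' "${1%'#'*}"
}

# Print everything after the last '#': the generation number.
# Prints an empty line if the URL has no generation suffix.
object_generation() {
  case "$1" in
    *'#'*) printf '%s\n' "${1##*'#'}" ;;
    *)     printf '\n' ;;
  esac
}

# Placeholder URL: the generation value here is invented for illustration.
url='gs://ashish_vtest/log.txt#1111111111111111'
echo "name:       $(object_name "$url")"
echo "generation: $(object_generation "$url")"
```

The same two helpers can be fed each line of a gsutil ls -a listing to group versions by file name.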
Deleting and recovering a file
- Now, let’s do some real work: first we will delete log.txt, and then recover it using its generation number.

- As we can see, there is now only one current file, which we can confirm using gsutil ls gs://<path>
- But if we list all the object versions, we still see 2 files, because the deleted log.txt is kept as a non-current version.

- Now, let’s recover log.txt and put it back in the same location.

- As you can see, recovering a file is just a copy back into the same directory. Note that to refer to a non-current file we have to use the file name and generation number together.
- Now there are 2 current files again, but 3 versions in total, because the copy creates a new version of log.txt.
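The delete-and-recover steps above can be sketched as commands. The restore_version helper is a hypothetical wrapper around gsutil cp, and <generation> stands for whatever number gsutil ls -a reports for the deleted log.txt; it is left as a placeholder because the real value depends on the bucket.

```shell
#!/usr/bin/env bash
# GSUTIL can be overridden (for example GSUTIL=echo) to preview commands.
GSUTIL="${GSUTIL:-gsutil}"

# restore_version BUCKET_URL OBJECT GENERATION
# Copy a non-current version (name#generation) back over the live name.
restore_version() {
  bucket="$1"
  object="$2"
  generation="$3"
  $GSUTIL cp "${bucket}/${object}#${generation}" "${bucket}/${object}"
}

# Example session, commented out so nothing is deleted by accident:
# gsutil rm gs://ashish_vtest/log.txt    # log.txt becomes non-current
# gsutil ls gs://ashish_vtest            # lists only Main.java
# gsutil ls -a gs://ashish_vtest         # lists both files with #<generation>
# restore_version gs://ashish_vtest log.txt <generation>
```

Setting GSUTIL=echo before calling restore_version prints the copy command instead of running it, which is a cheap way to double-check the source and destination first.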

Generation and Meta-Generation Numbers
- Even without Object Versioning enabled, all Cloud Storage objects have generation numbers and meta-generation numbers. The generation number changes each time the object is replaced, and the meta-generation number changes each time the object’s metadata is updated.
- Buckets also maintain a meta-generation number, which lets users uniquely identify a state of the bucket’s metadata.
- We can check meta-generation numbers by combining the -l and -a flags: gsutil ls -la gs://<path>

- The meta-generation number starts at 1 for each new generation of an object and increases each time we update that object’s metadata.
- Let’s update the metadata for log.txt and check the meta-generation number.
- We can edit metadata directly in the UI or with the CLI; see https://cloud.google.com/storage/docs/viewing-editing-metadata#view
- In the GCP UI, the three-dot menu on the right side of an object lets us edit its metadata.


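- Now, if we check again, we will see 2 different meta-generation numbers for log.txt: the current version, whose metadata we just edited, is at 2, while the older non-current version is still at 1.
Note that updating metadata does not create a new object version; it only increments the meta-generation number of the version being edited. Each generation carries its own metadata, so to read the metadata of a version that is not the current one, we have to use the file name together with its generation number.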
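The same metadata update can also be done from the CLI with gsutil setmeta. The sketch below is an illustration only: the custom metadata key reviewed and its value are invented, the helper name metageneration_of is hypothetical, and the sample listing line only imitates the shape of gsutil ls -la output (size, timestamp, versioned name, metageneration=N) with made-up values.

```shell
#!/usr/bin/env bash
# Pull the N out of the "metageneration=N" field of one `gsutil ls -la` line.
# Prints nothing if the line has no such field.
metageneration_of() {
  printf '%s\n' "$1" | sed -n 's/.*metageneration=\([0-9][0-9]*\).*/\1/p'
}

# Example session, commented out because it needs access to the bucket:
# gsutil setmeta -h "x-goog-meta-reviewed:yes" gs://ashish_vtest/log.txt
# gsutil ls -la gs://ashish_vtest

# Made-up sample line in the same shape as the listing output.
line='   120  2023-01-01T00:00:00Z  gs://ashish_vtest/log.txt#1111111111111111  metageneration=2'
echo "meta-generation: $(metageneration_of "$line")"
```

Running the listing before and after the setmeta call should show the number for the current log.txt going up by one while its generation number stays the same.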

Disabling Object versioning
- We can disable Object Versioning using the gsutil command:
gsutil versioning set off gs://ashish_vtest
- Disabling Object Versioning does not delete the versions that already exist; Cloud Storage simply stops creating new ones.
- Documentation: https://cloud.google.com/storage/docs/using-object-versioning