Understanding which columns are used in which kinds of joins helps Data Operators manage their data better. Visibility is key.
Frequently updated Hive tables lead to the Small Files Problem, which slows queries down. Manually added partitions, which should be available in the metadata management system, can be identified. Ghost partitions can be identified as well, so that these non-existent partitions can be removed.
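The partition and small-file checks above can be sketched as simple set and size comparisons. This is a minimal illustration, not the tool's actual implementation: the function names, the input shapes (a list of partition names from the metastore, a list of partition directories found on storage, and a mapping of partition to file sizes), and the 32 MB threshold are all assumptions made for the example.

```python
def reconcile_partitions(metastore_partitions, storage_partitions):
    """Compare partitions registered in the metastore with those found on storage.

    Both arguments are hypothetical inputs: iterables of partition names
    such as "dt=2024-01-01", collected by whatever means is available.
    """
    registered = set(metastore_partitions)
    on_storage = set(storage_partitions)
    return {
        # Registered in the metastore, but no data exists on storage.
        "ghost": sorted(registered - on_storage),
        # Present on storage (e.g. added manually), but not registered.
        "unregistered": sorted(on_storage - registered),
    }


def small_file_partitions(file_sizes_by_partition,
                          small_file_bytes=32 * 1024 * 1024,
                          min_small_files=3):
    """Return {partition: count} for partitions holding many small files.

    The size threshold and minimum count are illustrative defaults only.
    """
    flagged = {}
    for partition, sizes in file_sizes_by_partition.items():
        small_count = sum(1 for size in sizes if size < small_file_bytes)
        if small_count >= min_small_files:
            flagged[partition] = small_count
    return flagged


report = reconcile_partitions(
    ["dt=2024-01-01", "dt=2024-01-02", "dt=2024-01-03"],
    ["dt=2024-01-01", "dt=2024-01-02", "dt=2024-01-04"],
)
flagged = small_file_partitions({
    "dt=2024-01-01": [4_000, 9_000, 12_000, 500_000_000],
    "dt=2024-01-02": [600_000_000],
})
```

Here `report` separates ghost partitions from unregistered ones, and `flagged` lists only the partition whose small-file count reaches the chosen minimum.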
Large fact tables that are frequently queried in combination with other fact or dimension tables can be quickly identified. Once identified, such tables can be replicated further, or new partitions can be created based on the columns being used.
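One way to surface this kind of join visibility is to aggregate join records extracted from query logs. The sketch below assumes the joins have already been parsed into dictionaries with hypothetical `join_type`, `left`, and `right` fields; the table and column names are made up for illustration, and the parsing step itself is out of scope here.

```python
from collections import Counter


def summarize_joins(join_records):
    """Count join-column usage and table co-occurrence.

    Each record is a hypothetical dict of the form:
      {"join_type": "inner",
       "left": ("table_a", "col"), "right": ("table_b", "col")}
    """
    column_usage = Counter()  # (table, column, JOIN_TYPE) -> occurrences
    table_pairs = Counter()   # (table_x, table_y), sorted -> occurrences
    for record in join_records:
        join_type = record["join_type"].upper()
        for table, column in (record["left"], record["right"]):
            column_usage[(table, column, join_type)] += 1
        # Sort the pair so that A-B and B-A are counted together.
        pair = tuple(sorted((record["left"][0], record["right"][0])))
        table_pairs[pair] += 1
    return column_usage, table_pairs


records = [
    {"join_type": "inner",
     "left": ("sales_fact", "customer_id"),
     "right": ("customer_dim", "customer_id")},
    {"join_type": "inner",
     "left": ("sales_fact", "customer_id"),
     "right": ("customer_dim", "customer_id")},
    {"join_type": "left",
     "left": ("sales_fact", "product_id"),
     "right": ("product_dim", "product_id")},
]
column_usage, table_pairs = summarize_joins(records)
```

Calling `table_pairs.most_common()` ranks the table combinations that are joined most often, and `column_usage` shows which columns drive those joins, which is the kind of signal that could inform replication or partitioning decisions.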