1. The resource should have knowledge on Data Warehouse and Data Lake 2. Should aware of building data pipelines using Pyspark 3. Should be strong in SQL skills 4. Should have exposure to AWS environment and services like S3, EC2, EMR, Athena, Redshift etc 5. Good to have programming skills in Python