Poshith Raja Sikha
Data Engineer
Hyderabad, India
About
Data Engineer with 4+ years of experience in building scalable data pipelines and transformations. Specializing in PySpark, Databricks, and big data technologies across financial services, banking, and sportswear industries. Focused on delivering robust solutions that ensure data integrity and drive business value.
Technical Skills
Big Data
- PySpark
- Databricks
- Hadoop
- HDFS
- Hive
- Spark SQL
Programming
- Python
- SQL
- Shell Scripts
- Sqoop
Tools & Platforms
- Databricks
- CDSW
- Bitbucket
- Git
- Oozie
- Hue
- Impala
Data Systems
- EDL
- EDW
- AWS S3
- Linux
- Windows
Experience
Infosys
Senior Associate Consultant
Databricks
PySpark
SQL
- Resolved critical bugs in existing scripts, eliminating data discrepancies and ensuring data integrity across tables
- Developed and implemented an automation script for daily workflow monitoring, significantly improving efficiency and reducing manual oversight
- Performed in-depth root cause analysis on workflow failures, resulting in the identification and remediation of critical system issues
Scalability Engineers
Data Engineer
Databricks
PySpark
SQL
AWS S3
- Worked on silver-to-gold layer transformations using PySpark in Databricks, applying business logic to extract and transform data
- Loaded transformed data into the gold layer to build KPIs that support the internal audit team's compliance with RBI regulatory guidelines
- Developed workflows to schedule batch data loads into tables, ensuring timely data processing
- Wrote scripts to export data as CSV files to S3 buckets for UAT and validation
- Set up auto mailers to send alerts for workflow triggers and KPI CSVs to relevant teams for review
Accenture
Data Engineering, Management & Governance Analyst
PySpark
Hadoop
Shell Scripting
Hive
Hue
Impala
HDFS
- Actively participated in the development process, adhering to agile methodologies to ensure efficient code development
- Conducted data profiling, built functions to read CSV files or tables into DataFrames, and performed complex transformations using PySpark
- Focused on code quality by writing test cases, reviewing code, and engaging in end-to-end regression testing, validating PySpark outputs with Hive queries
- Stored project output in Hive tables and CSV files, and transferred these files to the landing zone via SFTP for downstream systems
- Contributed to Soft-Go Live, Main-Go Live, and deployment activities, while providing ongoing production support to ensure smooth operations and resolve issues promptly
Education
Bachelor of Technology
Computer Science and Engineering
Geethanjali College of Engineering and Technology
Contact
Email
poshithraja@gmail.com