Poshith Raja Sikha — Data Engineer | PySpark & Databricks Specialist

About

Data Engineer with 4+ years of experience building scalable data pipelines and transformations. Specializes in PySpark, Databricks, and big data technologies, with project experience across the financial services, banking, and sportswear industries. Focused on delivering robust solutions that ensure data integrity and drive business value.

Technical Skills

Big Data

PySpark
Databricks
Hadoop
HDFS
Hive
Spark SQL

Programming

Python
SQL
Shell Scripts
Sqoop

Tools & Platforms

Databricks
CDSW
Bitbucket
Git
Oozie
Hue
Impala

Data Systems

EDL
EDW
AWS S3
Linux
Windows

Experience

Infosys

Senior Associate Consultant

Apr 2025 — Present

Sportswear & Apparel

Databricks, PySpark, SQL

Resolved critical bugs in existing scripts, eliminating data discrepancies and ensuring data integrity across tables
Developed an automation script for daily workflow monitoring, reducing the manual effort needed to check workflow status
Performed root cause analysis on workflow failures, identifying and remediating critical system issues

Scalability Engineers

Data Engineer

Nov 2024 — Mar 2025

Financial Services

Databricks, PySpark, SQL, AWS S3

Worked on silver-to-gold layer transformations using PySpark in Databricks, applying business logic to extract and transform data
Loaded transformed data into the gold layer to build KPIs supporting the internal audit team's compliance with RBI regulatory guidelines
Developed workflows to schedule batch data loads into tables, ensuring timely data processing
Wrote scripts to export data as CSV files to S3 buckets for UAT and validation
Set up automated mailers to send workflow-trigger alerts and KPI CSV files to relevant teams for review

Accenture

Data Engineering, Management & Governance Analyst

May 2021 — Oct 2024

Banking & Financial Services

PySpark, Hadoop, Shell Scripting, Hive, Hue, Impala, HDFS

Participated in the development process, following agile methodologies to deliver code efficiently
Conducted data profiling, built functions to read CSV files or tables into DataFrames, and performed complex transformations using PySpark
Maintained code quality by writing test cases, reviewing code, and running end-to-end regression tests, validating PySpark outputs against Hive queries
Stored project output in Hive tables and CSV files, and transferred these files to the landing zone via SFTP for downstream systems
Contributed to soft go-live, main go-live, and deployment activities, and provided ongoing production support to keep operations running smoothly and resolve issues promptly

Education

Bachelor of Technology

Computer Science and Engineering

Geethanjali College of Engineering and Technology

Contact

Email: poshithraja@gmail.com

Poshith RajaSikha