Matin Zarei

Data Detective

Coffee-Fueled Coder

Insight Hunter

Cloud-Ready Analyst

Storyteller with Data

Big Data Whisperer

Refactor Survivor

Credit Card Fraud Detection

Problem

Credit card fraud is a major challenge for banks and financial institutions. Detecting fraudulent transactions in real time is difficult because the data is highly imbalanced: fraud cases are extremely rare compared to legitimate transactions.

Approach

I developed a machine learning solution to classify transactions as fraudulent or legitimate. The workflow included extensive data preprocessing, feature engineering, and handling class imbalance. I compared multiple models, including Random Forest and XGBoost, and tuned them using cross-validation and early stopping. I also experimented with different decision thresholds to maximize the F1-score, which balances precision against recall.
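The threshold experiment can be illustrated with a minimal sketch. This is not the project's actual code: the function names, the toy labels and scores, and the candidate grid are all assumptions made for illustration, and it uses plain Python rather than scikit-learn so it runs without dependencies.

```python
def f1_at_threshold(y_true, scores, threshold):
    """Return (precision, recall, f1) when scores >= threshold are flagged as fraud."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1          # fraud correctly flagged
        elif flagged and label == 0:
            fp += 1          # legitimate transaction flagged (false alarm)
        elif not flagged and label == 1:
            fn += 1          # fraud missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def best_threshold(y_true, scores, candidates):
    """Pick the candidate with the highest F1 (first candidate wins ties)."""
    return max(candidates, key=lambda t: f1_at_threshold(y_true, scores, t)[2])


# Toy data, invented for illustration: 1 = fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.40, 0.60, 0.70, 0.90]
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

chosen = best_threshold(y_true, scores, candidates)
precision, recall, f1 = f1_at_threshold(y_true, scores, chosen)
print(f"threshold={chosen} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

On this toy data the sweep selects 0.7 rather than the default 0.5, trading some recall for higher precision. In practice the threshold should be chosen on a validation split, not on the test set used for the final report.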

Results

The final model reduced false positives while maintaining high recall, so more fraudulent transactions were detected without overwhelming the system with false alerts. I evaluated performance with a confusion matrix, precision, recall, and F1-score, metrics that suit imbalanced data:

  • Random Forest: Precision 94.8%, Recall 74.5%, F1 ≈ 0.835

  • XGBoost: Precision 96.7%, Recall 79.7%, F1 ≈ 0.873

XGBoost outperformed Random Forest on both measures, with higher precision (96.7% vs. 94.8%) and higher recall (79.7% vs. 74.5%). The higher recall means fewer false negatives, which matters most in fraud detection, where missing fraud is costlier than a false alarm.
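The F1 figures are the harmonic mean of precision and recall. As a quick check, the sketch below recomputes them from the rounded percentages listed above; the function name is my own, and because the inputs are already rounded, the results can differ from the reported values in the third decimal place.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the results list (rounded percentages).
reported = {
    "Random Forest": (0.948, 0.745),
    "XGBoost": (0.967, 0.797),
}

f1_by_model = {name: f1_score(p, r) for name, (p, r) in reported.items()}
for name, f1 in f1_by_model.items():
    print(f"{name}: F1 = {f1:.3f}")
# Random Forest: F1 = 0.834
# XGBoost: F1 = 0.874
```

Both recomputed values sit within about 0.001 of the reported scores, which is consistent with rounding in the published precision and recall.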

Tools & Techniques

  • Languages & Libraries: Python, Pandas, NumPy, scikit-learn, XGBoost, Matplotlib/Seaborn

  • Models: Random Forest, XGBoost

  • Techniques: Data preprocessing, feature engineering, model tuning, precision-recall optimization

  • Evaluation: Confusion matrix, precision, recall, F1-score
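For readers unfamiliar with the evaluation metrics, here is a small sketch of how a binary confusion matrix is built and how precision and recall are read from it. The labels are invented for illustration and the helper name is hypothetical. The layout (rows are true classes, columns are predicted classes) follows the convention scikit-learn uses.

```python
from collections import Counter


def confusion_matrix_2x2(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels where 1 = fraud."""
    counts = Counter(zip(y_true, y_pred))  # keys are (true, predicted) pairs
    return [
        [counts[(0, 0)], counts[(0, 1)]],
        [counts[(1, 0)], counts[(1, 1)]],
    ]


# Toy labels, invented for illustration: 7 legitimate and 3 fraudulent transactions.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]

matrix = confusion_matrix_2x2(y_true, y_pred)
(tn, fp), (fn, tp) = matrix

precision = tp / (tp + fp)  # share of flagged transactions that are fraud
recall = tp / (tp + fn)     # share of fraud that was flagged

print(matrix)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Precision penalizes false alarms (FP) and recall penalizes missed fraud (FN), which is why both are reported alongside F1.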