Data Science — Internship Tasks

Objective, features, technologies and tasks for learning data science fundamentals and practical skills.

Objective

Provide interns with end-to-end data science experience: from data cleaning and exploratory analysis to model building, evaluation and deployment. Projects are directly applicable to real business problems.


Features


Technologies & Tools

  • Python 3.x
  • Pandas / NumPy
  • scikit-learn
  • Matplotlib / Seaborn
  • Jupyter / Colab

Beginner Level Tasks


Note: Complete any 3 of the 4 main tasks below.

Tasks (4)

Task 1

Dataset

https://www.kaggle.com/datasets/bhanupratapbiswas/zomato

Goal

Analyze restaurant and review data to extract insights on ratings, cuisines, location preferences and factors affecting ratings.

Requirements
  • Data cleaning (handle text fields, missing values, currency conversions if needed)
  • Explore relationships: cuisine vs rating, location hotspots, price vs rating
  • Build visualizations: heatmaps, word clouds for reviews or popular cuisines
  • Provide 5 recommendations for an Alfido Tech-style platform (e.g., partnerships, content ideas)
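
As a starting point for the cleaning and cuisine-vs-rating requirements above, here is a minimal sketch in Pandas. The column names (`cuisines`, `rate`, `approx_cost`) and the sample rows are assumptions made for illustration; check the actual Kaggle file and rename accordingly.

```python
import pandas as pd

# Hypothetical sample rows; the real Zomato CSV may use different column names.
raw = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "cuisines": ["North Indian, Chinese", "Chinese", None, "Italian, Chinese"],
    "rate": ["4.1/5", "NEW", "3.5/5", "4.5/5"],
    "approx_cost": ["800", "1,200", "300", None],
})


def clean(df):
    """Convert text fields to numbers and drop rows unusable for rating analysis."""
    out = df.copy()
    # "4.1/5" -> 4.1; placeholders such as "NEW" become NaN
    out["rating"] = pd.to_numeric(out["rate"].str.split("/").str[0], errors="coerce")
    # "1,200" -> 1200.0
    out["cost"] = pd.to_numeric(
        out["approx_cost"].str.replace(",", "", regex=False), errors="coerce"
    )
    # Keep only rows that have both a numeric rating and a cuisine list
    return out.dropna(subset=["rating", "cuisines"])


def rating_by_cuisine(df):
    """Mean rating and restaurant count per individual cuisine."""
    # One row per (restaurant, cuisine) pair
    exploded = df.assign(cuisine=df["cuisines"].str.split(",")).explode(
        "cuisine", ignore_index=True
    )
    exploded["cuisine"] = exploded["cuisine"].str.strip()
    return (
        exploded.groupby("cuisine")["rating"]
        .agg(["mean", "count"])
        .sort_values("mean", ascending=False)
    )


cleaned = clean(raw)
summary = rating_by_cuisine(cleaned)
print(summary)
```

Reporting the count next to the mean helps avoid over-reading cuisines that appear in only a handful of restaurants.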
Deliverables
  1. Notebook (Jupyter/Colab) with cleaned data & visualizations
  2. PDF report with key findings and recommendations

Task 2

Dataset

https://www.kaggle.com/datasets/bhanupratapbiswas/loan-approval-prediction-case-study

Goal

Build a supervised model to predict loan approval using borrower features. Focus on preprocessing, handling imbalance and evaluation.

Requirements
  • Data preprocessing: missing values, encoding categorical variables, scaling
  • Handle class imbalance (SMOTE, undersampling or class weights)
  • Compare models (logistic regression, tree-based models) and report precision, recall, F1 and ROC-AUC
  • Provide business-oriented interpretation of model outputs
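
The model comparison above could be sketched as follows with scikit-learn. This is only an outline under stated assumptions: the data is synthetic (generated with `make_classification` as a stand-in for the loan dataset), and class weights are used as one of the three imbalance options listed; SMOTE or undersampling would need a different setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the loan data (assumption): roughly 20% positive class.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    # Scaling matters for logistic regression, so it lives inside the pipeline
    "logistic_regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000),
    ),
    "random_forest": RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=0
    ),
}


def evaluate(model):
    """Fit on the training split and score on the held-out test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]
    return {
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
        "roc_auc": roc_auc_score(y_test, proba),
    }


results = {name: evaluate(model) for name, model in models.items()}
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Keeping the scaler inside the pipeline means it is fitted on the training split only, which avoids leaking test-set statistics into preprocessing.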
Deliverables
  1. Notebook with modeling pipeline and metrics
  2. Short report discussing model trade-offs and suggested threshold for deployment
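
One way to arrive at a suggested threshold is to sweep candidate cut-offs and keep the one with the best recall that still meets a minimum precision. The sketch below uses plain Python so the arithmetic is visible; the labels, scores, candidate thresholds and the 0.75 precision floor are all made up for illustration, and the helper names are hypothetical.

```python
def precision_recall_at(y_true, scores, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for t, s in pairs if s >= threshold and t == 1)
    fp = sum(1 for t, s in pairs if s >= threshold and t == 0)
    fn = sum(1 for t, s in pairs if s < threshold and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pick_threshold(y_true, scores, min_precision, candidates):
    """Return (threshold, precision, recall) with the highest recall among
    candidates whose precision is at least min_precision, or None."""
    best = None
    for threshold in candidates:
        precision, recall = precision_recall_at(y_true, scores, threshold)
        if precision >= min_precision and (best is None or recall > best[2]):
            best = (threshold, precision, recall)
    return best


# Made-up labels (1 = approved, an assumption) and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

best = pick_threshold(y_true, scores, min_precision=0.75, candidates=[0.3, 0.5, 0.7])
print(best)  # (0.5, 0.75, 0.75)
```

In the report, the precision floor should come from the business cost of a wrong approval versus a wrong rejection, not from the model itself.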

Task 3

Dataset

https://www.kaggle.com/datasets/bhanupratapbiswas/instgram

Goal

Analyze Instagram post and engagement data to identify the best posting times, high-engagement content types and follower-growth signals.

Requirements
  • Parse dates/times and compute engagement metrics (likes and comments per follower)
  • Analyze posting schedule, hashtags, and content types
  • Recommend an optimal content calendar and 5 strategies to increase engagement for Alfido Tech
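
The date parsing and engagement metrics above might look like this in Pandas. The column names (`posted_at`, `content_type`, `likes`, `comments`, `followers`) and the sample rows are assumptions, and engagement rate is defined here as (likes + comments) / followers, which is one common choice rather than the only one.

```python
import pandas as pd

# Hypothetical sample posts; the Kaggle file may name these columns differently.
posts = pd.DataFrame({
    "posted_at": [
        "2023-01-02 09:15",
        "2023-01-02 18:30",
        "2023-01-03 09:45",
        "2023-01-04 18:05",
    ],
    "content_type": ["reel", "photo", "reel", "photo"],
    "likes": [120, 40, 180, 60],
    "comments": [30, 10, 20, 20],
    "followers": [1000, 1000, 1000, 1000],
})

# Parse timestamps and derive scheduling features
posts["posted_at"] = pd.to_datetime(posts["posted_at"])
posts["hour"] = posts["posted_at"].dt.hour
posts["weekday"] = posts["posted_at"].dt.day_name()

# Assumed definition: interactions per follower at posting time
posts["engagement_rate"] = (posts["likes"] + posts["comments"]) / posts["followers"]

# Average engagement by posting hour and by content type, best first
by_hour = posts.groupby("hour")["engagement_rate"].mean().sort_values(ascending=False)
by_type = (
    posts.groupby("content_type")["engagement_rate"].mean().sort_values(ascending=False)
)

best_hour = int(by_hour.index[0])
print(by_hour)
print(by_type)
```

The same grouping works for `weekday`, or for hour and weekday together, when building the content calendar.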
Deliverables
  1. Notebook with analysis and visuals
  2. One-page strategy document with recommended posting plan

Task 4

Goal

Choose a dataset or combine datasets to propose an actionable analytics solution for Alfido Tech (e.g., internship analytics, service demand forecasting).

Deliverables
  1. Notebook + README
  2. PPT or PDF summarizing findings and recommended actions

How to Submit Your Tasks

  1. For each task:
    • Create a separate document (DOC or PDF) that includes notebook links, screenshots, charts and a one-page executive summary.
  2. Upload artifacts:
    • Push code & notebooks to GitHub and share repository links. Upload large files/models to Google Drive if needed.
  3. Submit links:
    • Go to the Task Submission page and paste your links, clearly mentioning task numbers.

Tip: Keep notebooks tidy. Use sections, comments, and a small README so reviewers can reproduce your results quickly.