BelajarKoding

An Indonesian platform for learning web development. Articles, cheat sheets, roadmaps, and code challenges for Indonesian developers.

© 2026 BelajarKoding. All rights reserved.

Part of the Galih Pratama ecosystem

Data Scientist

A complete roadmap to becoming a professional Data Scientist. Learn statistics, data analysis, machine learning, visualization, and big data to extract insights from data.

7 Phases, 31 Topics, 13 Required, 8 Resources

1

Fundamental Skills

Foundations you must master before getting into data science

Python for Data Science

required

The primary language of data science. Master the syntax, OOP, and the core libraries (NumPy, Pandas, Matplotlib)

Statistics & Probability

required

Descriptive statistics, distributions (normal, binomial, Poisson), hypothesis testing, p-values, confidence intervals
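
As a rough illustration of these ideas, the sketch below computes descriptive statistics, a 95% confidence interval, and a two-sided z-test using only the Python standard library. The sample values and the null-hypothesis mean `mu0` are made up for the example, and the normal approximation is an assumption; for a sample this small a t-distribution would usually be preferred.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample: daily order counts (made-up numbers for illustration).
sample = [12, 15, 11, 14, 18, 13, 16, 12, 17, 14]

n = len(sample)
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# 95% confidence interval for the mean, using the normal approximation.
z = NormalDist().inv_cdf(0.975)
margin = z * stdev / math.sqrt(n)
ci = (mean - margin, mean + margin)

# Two-sided z-test of H0: the population mean equals mu0 (assumed value).
mu0 = 12
z_stat = (mean - mu0) / (stdev / math.sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"mean={mean:.2f} median={median} stdev={stdev:.2f}")
print(f"95% CI=({ci[0]:.2f}, {ci[1]:.2f}) z={z_stat:.2f} p={p_value:.4f}")
```

A small p-value here would suggest that the sample mean differs from `mu0` by more than chance alone would explain.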

Linear Algebra & Calculus

required

Matrices, vectors, derivatives. The foundation for understanding ML algorithms and optimization
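
To show how derivatives connect to optimization, here is a minimal sketch of gradient descent fitting a line `y = w * x + b` by minimizing mean squared error. The data, learning rate, and iteration count are illustrative assumptions, not recommended settings.

```python
# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]


def gradients(w, b):
    """Partial derivatives of mean squared error with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


w, b = 0.0, 0.0
learning_rate = 0.02  # assumed value, small enough to be stable on this data

for _ in range(5000):
    dw, db = gradients(w, b)
    # Step against the gradient to reduce the error.
    w -= learning_rate * dw
    b -= learning_rate * db

print(f"w={w:.4f} b={b:.4f}")
```

The same update rule, applied to vectors and matrices of parameters instead of two scalars, underlies the training of most ML models.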

Excel & SQL Fundamentals

required

Excel for quick analysis, SQL for querying databases. Two of the most fundamental skills in a data career

Resources:
SQL Cheat Sheet, SQL Basics untuk Pemula

Git & Version Control

recommended

Version control for notebooks, analysis scripts, and data pipelines

Resources:
Git Cheat Sheet, Git untuk Pemula
2

Data Analysis & Manipulation

Tools and techniques for processing and analyzing data

Pandas

required

The main library for data manipulation in Python. DataFrame, groupby, merge, time series

NumPy

required

Numerical computing library. Array operations, linear algebra, random sampling

Data Cleaning & Preprocessing

required

Handle missing values, outliers, data type conversion, normalization, encoding categorical data
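
The steps named above can be sketched in plain Python. This is only an illustration under stated assumptions: the records are hypothetical, median imputation and min-max scaling are one choice among several, and in practice these operations are typically done with Pandas rather than by hand.

```python
import statistics

# Hypothetical raw records; None marks a missing value.
rows = [
    {"age": 25, "city": "Jakarta"},
    {"age": None, "city": "Bandung"},
    {"age": 40, "city": "Jakarta"},
    {"age": 31, "city": "Surabaya"},
]

# 1. Impute missing ages with the median of the observed values.
observed = [r["age"] for r in rows if r["age"] is not None]
median_age = statistics.median(observed)
ages = [r["age"] if r["age"] is not None else median_age for r in rows]

# 2. Min-max normalization: rescale ages to the range [0, 1].
lo, hi = min(ages), max(ages)
ages_scaled = [(a - lo) / (hi - lo) for a in ages]

# 3. One-hot encode the categorical column, one 0/1 flag per category.
categories = sorted({r["city"] for r in rows})
one_hot = [[1 if r["city"] == c else 0 for c in categories] for r in rows]

print(ages, ages_scaled)
print(categories, one_hot)
```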

Exploratory Data Analysis (EDA)

required

Exploratory analysis to understand patterns, correlations, and anomalies in the data before modeling

Advanced Database Querying

recommended

CTEs, window functions, query optimization, and indexing for analyzing data at scale

Resources:
PostgreSQL Cheat Sheet, PostgreSQL untuk Pemula, Database Indexing Cheat Sheet
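
A CTE combined with a window function might look like the sketch below. It uses Python's built-in `sqlite3` module with an in-memory database purely so the example is self-contained; the `orders` table, its columns, and the index name are hypothetical, and window functions require SQLite 3.25 or newer. The same query shape should carry over to PostgreSQL with minor syntax differences.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ani", "2024-01-05", 100.0),
        ("ani", "2024-02-10", 150.0),
        ("budi", "2024-01-20", 80.0),
        ("budi", "2024-03-02", 120.0),
        ("budi", "2024-03-15", 50.0),
    ],
)
# An index on the columns used for grouping and ordering.
conn.execute("CREATE INDEX idx_orders_customer_date ON orders (customer, order_date)")

query = """
WITH monthly AS (              -- CTE: total spend per customer per month
    SELECT customer,
           substr(order_date, 1, 7) AS order_month,
           SUM(amount) AS total
    FROM orders
    GROUP BY customer, substr(order_date, 1, 7)
)
SELECT customer, order_month, total,
       SUM(total) OVER (       -- window function: running total per customer
           PARTITION BY customer ORDER BY order_month
       ) AS running_total
FROM monthly
ORDER BY customer, order_month
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```
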
3

Data Visualization

Communicating insights through effective visuals

Matplotlib & Seaborn

required

Static visualization library. Bar, line, scatter, heatmap, distribution plots

Plotly & Interactive Visualization

recommended

Interactive charts, dashboards, and 3D visualizations for exploratory analysis

BI Tools (Tableau / Metabase)

recommended

Dashboard creation, data storytelling, and self-service analytics for business stakeholders

Data Storytelling

recommended

Principles of effective data communication, chart selection, audience-aware presentation

4

Machine Learning

ML algorithms for predictive modeling

Supervised Learning

required

Regression (linear, logistic, polynomial), classification (decision tree, random forest, XGBoost)

Unsupervised Learning

recommended

Clustering (K-Means, DBSCAN), dimensionality reduction (PCA, t-SNE), association rules

Scikit-learn

required

A Python library for classical ML. Covers everything from preprocessing to model training and evaluation

Feature Engineering

recommended

Feature selection, scaling, normalization, encoding, interaction features. This is what separates an average model from an excellent one

Model Validation & Cross-Validation

required

Train/test split, k-fold CV, stratified sampling, hyperparameter tuning (GridSearch, RandomSearch)
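
To make k-fold cross-validation concrete, here is a minimal sketch that generates the train and test index sets by hand. The helper name `k_fold_indices` and its parameters are hypothetical; in practice a library such as scikit-learn provides ready-made splitters.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, test_idx in folds:
    print(f"train={len(train_idx)} test={len(test_idx)}")
```

Each sample lands in exactly one test fold, so every observation is used for evaluation once and for training in the remaining folds.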

5

Deep Learning (Optional)

Neural networks for complex tasks

Neural Network Basics

recommended

MLP, forward/backward propagation, activation functions, gradient descent

NLP & Text Analysis

recommended

Text preprocessing, TF-IDF, word embeddings, sentiment analysis, topic modeling
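
As one example from this list, TF-IDF can be sketched in a few lines of standard-library Python. The documents are made up, tokenization is a plain whitespace split, and the unsmoothed formula `tf * log(N / df)` is an assumption made for clarity; libraries such as scikit-learn use a smoothed variant, so their numbers will differ.

```python
import math
from collections import Counter

# Hypothetical review texts for illustration.
docs = [
    "the delivery was fast and the product is great",
    "the product arrived broken",
    "great service and fast delivery",
]

# Minimal preprocessing: lowercase and split on whitespace.
tokenized = [doc.lower().split() for doc in docs]
n_docs = len(tokenized)

# Document frequency: number of documents that contain each term.
df = Counter(term for tokens in tokenized for term in set(tokens))


def tf_idf(tokens):
    """Return a {term: weight} dict using tf * log(N / df)."""
    counts = Counter(tokens)
    return {
        term: (count / len(tokens)) * math.log(n_docs / df[term])
        for term, count in counts.items()
    }


weights = [tf_idf(tokens) for tokens in tokenized]
for w in weights:
    print(sorted(w.items(), key=lambda item: -item[1])[:3])
```

Terms that appear in only one document, such as "broken", get the highest weights, while common words like "the" are pushed down.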

Computer Vision Basics

optional

CNNs for image classification and object detection. Image preprocessing and augmentation

LLMs for Data Science

recommended

Use LLMs for data analysis automation, code generation, text classification, and data enrichment

Resources:
AI Engineering Cheat Sheet
6

Big Data & Cloud

Handling data at scale

Data Pipelines (ETL)

recommended

Apache Airflow, Prefect, and dbt for reliable extract-transform-load pipelines

Apache Spark

optional

Distributed computing untuk big data processing. PySpark, Spark SQL, MLlib

Data Warehousing

recommended

Snowflake, BigQuery, Redshift, and Databricks for enterprise-scale analytics

Cloud ML Platforms

optional

AWS SageMaker, GCP Vertex AI, and Azure ML for training and deploying models in the cloud

7

Career & Communication

Soft skills and tools for succeeding as a Data Scientist

Jupyter & Notebooks

required

Jupyter Lab, Google Colab, and VS Code Notebooks for interactive analysis and documentation

Experiment Tracking

recommended

MLflow and W&B (Weights & Biases) for tracking model experiments, metrics, and artifacts

Business Domain Knowledge

recommended

Understand the business context, KPIs, and the metrics that matter. Translate business questions into data problems

Communication & Presentation

recommended

Present findings to non-technical stakeholders, build decks, write data reports

Ready to get started?

This roadmap will take you from the basics to being a skilled Data Scientist. Work through each topic step by step, then put it into practice right away by building projects.
