Tutorials Series: Python Data Analysis (1/1)

Python Data Analysis Series: Getting Your Environment Ready

Part 1 of a comprehensive series on Python data analysis. Set up your environment with the right tools for success.

Devin Brand 3 min read

Welcome to my Python Data Analysis series! Whether you’re just starting out or looking to level up your skills, this series will take you from setup to advanced analysis techniques.

What We’ll Cover in This Series

  1. Part 1 (this post): Environment setup
  2. Part 2: DataFrames and data manipulation
  3. Part 3: Data cleaning techniques
  4. Part 4: Exploratory data analysis
  5. Part 5: Visualization best practices

Setting Up Your Environment

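Option 1: Anaconda

Anaconda bundles Python with the conda package manager and the most popular data science libraries. Once it's installed, create a dedicated environment for this series: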

# Download from anaconda.com, then:
conda create -n data-analysis python=3.11
conda activate data-analysis
conda install pandas numpy matplotlib seaborn jupyter

Option 2: pip + venv (More Control)

For those who prefer a leaner setup:

# Create virtual environment
python -m venv data-env

# Activate (Windows)
data-env\Scripts\activate

# Activate (Mac/Linux)
source data-env/bin/activate

# Install packages
pip install pandas numpy matplotlib seaborn jupyterlab
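Whichever option you pick, it's worth confirming that Python is actually running from the environment you just activated. Here is a minimal sketch using only the standard library. The function name `describe_environment` is my own, and the conda check assumes conda has set the `CONDA_DEFAULT_ENV` variable on activation.

```python
import os
import sys


def describe_environment():
    """Return basic facts about the interpreter that is currently running."""
    return {
        # Full path of the Python executable in use
        "executable": sys.executable,
        # Python version as "major.minor.micro"
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        # True inside a venv: sys.prefix then differs from sys.base_prefix
        "in_venv": sys.prefix != sys.base_prefix,
        # conda sets CONDA_DEFAULT_ENV on activation; None if it is not set
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If you followed Option 2, `in_venv` should be `True` and `executable` should point inside `data-env`. If you followed Option 1, `conda_env` should read `data-analysis`.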

Essential Packages

Package        Purpose
pandas         Data manipulation and analysis
numpy          Numerical computing
matplotlib     Basic plotting
seaborn        Statistical visualization
jupyterlab     Interactive notebooks
scikit-learn   Machine learning
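To see which of these packages are present in the active environment, something like the sketch below can help. It relies on `importlib.util.find_spec` from the standard library, which looks a module up without importing it. The `PACKAGES` mapping and the `find_missing` helper are names I made up for illustration. Note that scikit-learn is imported as `sklearn`, and that it isn't part of either install command above, so expect it to show up as missing until you install it. With the conda command, which installs `jupyter` rather than `jupyterlab`, `jupyterlab` may be reported as missing too.

```python
from importlib.util import find_spec

# Install name -> import name. They differ for scikit-learn (imported as sklearn).
PACKAGES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "jupyterlab": "jupyterlab",
    "scikit-learn": "sklearn",
}


def find_missing(packages):
    """Return the install names whose import name cannot be found."""
    return [
        install_name
        for install_name, import_name in packages.items()
        if find_spec(import_name) is None
    ]


if __name__ == "__main__":
    missing = find_missing(PACKAGES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All packages from the table are installed.")
```

Anything reported as missing can be added with `pip install` or `conda install` inside the activated environment.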

Your First Notebook

Let’s verify everything works:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Check versions
print(f"pandas: {pd.__version__}")
print(f"numpy: {np.__version__}")

# Quick test
df = pd.DataFrame({
    'x': np.random.randn(100),
    'y': np.random.randn(100)
})

plt.figure(figsize=(8, 6))
sns.scatterplot(data=df, x='x', y='y')
plt.title('Environment Test')
plt.show()

print("✅ Environment ready!")

Organizing Your Projects

Good project structure saves headaches later:

my-analysis-project/
├── data/
│   ├── raw/           # Original, untouched data
│   └── processed/     # Cleaned data
├── notebooks/
│   ├── 01-exploration.ipynb
│   └── 02-analysis.ipynb
├── src/               # Reusable code
│   └── utils.py
├── outputs/           # Generated figures, reports
├── requirements.txt
└── README.md
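If you'd rather not create these folders by hand, a short script can build the skeleton. This is a sketch under a few assumptions: the `scaffold` function and the `FOLDERS` and `FILES` lists are hypothetical names of mine, and the starter files are created empty. It uses only `pathlib` from the standard library.

```python
from pathlib import Path

# Folders and starter files taken from the layout above
FOLDERS = ["data/raw", "data/processed", "notebooks", "src", "outputs"]
FILES = ["src/utils.py", "requirements.txt", "README.md"]


def scaffold(root):
    """Create the project skeleton under root and return its path."""
    root = Path(root)
    for folder in FOLDERS:
        # parents=True builds data/ before data/raw; exist_ok=True makes reruns safe
        (root / folder).mkdir(parents=True, exist_ok=True)
    for name in FILES:
        # touch() creates an empty file and leaves existing content untouched
        (root / name).touch(exist_ok=True)
    return root


if __name__ == "__main__":
    project = scaffold("my-analysis-project")
    print(f"Created project skeleton at {project.resolve()}")
```

The notebooks themselves are left out on purpose: an empty `.ipynb` file isn't a valid notebook, so it's easier to create those from JupyterLab or VS Code.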

VS Code Extensions I Recommend

  • Python - Core Python support
  • Jupyter - Run notebooks in VS Code
  • Pylance - Better IntelliSense
  • Data Wrangler - Visual data exploration

What’s Next

In Part 2, we’ll dive into pandas DataFrames—the workhorse of Python data analysis. You’ll learn how to:

  • Load data from various sources
  • Select and filter data
  • Transform and reshape datasets
  • Handle missing values

Questions about setup? Drop them in the feedback form and I’ll address them in the series!

Devin Brand

Data explorer, web builder, and eternally curious human. Always asking "why?" and digging for answers.
