Python Data Analysis Series: Getting Your Environment Ready
Part 1 of a comprehensive series on Python data analysis. Set up your environment with the right tools for success.
Welcome to my Python Data Analysis series! Whether you’re just starting out or looking to level up your skills, this series will take you from setup to advanced analysis techniques.
What We’ll Cover in This Series
- Part 1 (this post): Environment setup
- Part 2: DataFrames and data manipulation
- Part 3: Data cleaning techniques
- Part 4: Exploratory data analysis
- Part 5: Visualization best practices
Setting Up Your Environment
Option 1: Anaconda (Recommended for Beginners)
Anaconda bundles Python with the most popular data science libraries:
# Download from anaconda.com, then:
conda create -n data-analysis python=3.11
conda activate data-analysis
conda install pandas numpy matplotlib seaborn scikit-learn jupyterlab
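Before going further, it helps to confirm the new environment is the one actually running. A quick sanity-check sketch from Python; the exact path will vary by operating system and install location:

```python
# Sanity check: the interpreter path should point inside the "data-analysis" env.
import sys

print(sys.executable)  # e.g. .../anaconda3/envs/data-analysis/...
print(sys.version)     # should report Python 3.11.x
```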
Option 2: pip + venv (More Control)
For those who prefer a leaner setup:
# Create virtual environment
python -m venv data-env
# Activate (Windows)
data-env\Scripts\activate
# Activate (Mac/Linux)
source data-env/bin/activate
# Install packages
pip install pandas numpy matplotlib seaborn scikit-learn jupyterlab
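Before installing anything, it's worth confirming the virtual environment is really active. A minimal check using only the standard library (works for venv-created environments):

```python
# True when running inside a venv: the prefix differs from the base interpreter's.
import sys

print("Inside a virtual environment:", sys.prefix != sys.base_prefix)
```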
Essential Packages
| Package | Purpose |
|---|---|
| pandas | Data manipulation and analysis |
| numpy | Numerical computing |
| matplotlib | Basic plotting |
| seaborn | Statistical visualization |
| jupyterlab | Interactive notebooks |
| scikit-learn | Machine learning |
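If you ever need to double-check which of these are installed and at what versions, the standard library can report it directly. A small sketch, assuming Python 3.8+ where importlib.metadata is available:

```python
# Report the installed version of each package from the table above.
from importlib.metadata import version, PackageNotFoundError

for name in ["pandas", "numpy", "matplotlib", "seaborn", "jupyterlab", "scikit-learn"]:
    try:
        print(f"{name}: {version(name)}")
    except PackageNotFoundError:
        print(f"{name}: not installed")
```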
Your First Notebook
Let’s verify everything works:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Check versions
print(f"pandas: {pd.__version__}")
print(f"numpy: {np.__version__}")
# Quick test
df = pd.DataFrame({
'x': np.random.randn(100),
'y': np.random.randn(100)
})
plt.figure(figsize=(8, 6))
sns.scatterplot(data=df, x='x', y='y')
plt.title('Environment Test')
plt.show()
print("✅ Environment ready!")
Organizing Your Projects
Good project structure saves headaches later:
my-analysis-project/
├── data/
│ ├── raw/ # Original, untouched data
│ └── processed/ # Cleaned data
├── notebooks/
│ ├── 01-exploration.ipynb
│ └── 02-analysis.ipynb
├── src/ # Reusable code
│ └── utils.py
├── outputs/ # Generated figures, reports
├── requirements.txt
└── README.md
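If you'd rather not click these folders into existence by hand, the same layout can be scaffolded with the standard library. A convenience sketch using the names from the tree above:

```python
# Create the project skeleton shown above (safe to re-run; existing items are kept).
from pathlib import Path

for folder in ["data/raw", "data/processed", "notebooks", "src", "outputs"]:
    Path(folder).mkdir(parents=True, exist_ok=True)

for file in ["src/utils.py", "requirements.txt", "README.md"]:
    Path(file).touch(exist_ok=True)
```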
VS Code Extensions I Recommend
- Python - Core Python support
- Jupyter - Run notebooks in VS Code
- Pylance - Better IntelliSense
- Data Wrangler - Visual data exploration
What’s Next
In Part 2, we’ll dive into pandas DataFrames, the workhorse of Python data analysis. There’s a small preview right after this list. You’ll learn how to:
- Load data from various sources
- Select and filter data
- Transform and reshape datasets
- Handle missing values
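As a small taste of Part 2, loading and filtering looks like this. The file name and column here are hypothetical placeholders, so treat it purely as a preview:

```python
import pandas as pd

# Hypothetical example: read a CSV and keep only rows above a threshold.
df = pd.read_csv("data/raw/sales.csv")
high_value = df[df["revenue"] > 1000]
print(high_value.head())
```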
Questions about setup? Drop them in the feedback form and I’ll address them in the series!