Essential Tools Setup & Unix/Linux Continuation
This lesson builds on Week 0's introduction by diving deeper into Unix/Linux commands, introducing Git version control, and setting up Python development environments on the Nuvolos platform.
Learning Objectives
By the end of this lesson, you will:
- Master advanced Unix/Linux commands for file manipulation and system navigation
- Use Git for version control of your research code
- Set up Python development environments on Nuvolos
- Create reproducible research workflows using cloud tools
- Integrate Unix/Linux, Git, and Python for efficient data science workflows
Advanced Unix/Linux Commands
Building on Week 0's basics, let's explore more powerful Unix/Linux commands essential for data science work.
Advanced File Operations
# Find files and directories
find . -name "*.csv" # Find all CSV files
find . -type f -size +1M # Find files larger than 1MB
locate filename # Fast lookup by name in a prebuilt index (may miss recently created files)
# Text processing
grep "pattern" file.txt # Search for patterns in files
grep -r "function" . # Recursively search in directories
sed 's/old/new/g' file.txt # Replace text and print the result (the file is unchanged; GNU sed edits in place with -i)
awk '{print $1}' data.txt # Print the first whitespace-separated column
# File compression and archives
tar -czf archive.tar.gz dir/ # Create compressed archive
tar -xzf archive.tar.gz # Extract archive
zip -r backup.zip project/ # Create zip archive
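The commands above can be tried safely in a scratch directory. The following is a minimal sketch, assuming a POSIX shell with `mktemp`, `find`, `grep`, `sed`, `awk`, and `tar` available; the file names (`prices.csv`, `notes.txt`) and their contents are hypothetical sample data, not course files.

```shell
set -eu

# Scratch workspace so nothing outside it is touched (hypothetical file names).
work=$(mktemp -d)
mkdir -p "$work/project/data"
printf 'id,price\n1,10\n2,20\n' > "$work/project/data/prices.csv"
printf 'old value\nanother old line\n' > "$work/project/notes.txt"

# find: count CSV files under the project directory.
csv_count=$(find "$work/project" -name "*.csv" | wc -l | tr -d ' ')

# grep -rl: list the files that contain the word "old".
matches=$(grep -rl "old" "$work/project")

# sed without -i writes the result to stdout; the file itself is unchanged.
replaced=$(sed 's/old/new/g' "$work/project/notes.txt")

# awk: -F, sets the field separator; NR > 1 skips the header row.
total=$(awk -F, 'NR > 1 { sum += $2 } END { print sum }' "$work/project/data/prices.csv")

# tar: -C changes directory first so the archive stores relative paths.
tar -czf "$work/archive.tar.gz" -C "$work" project
archived=$(tar -tzf "$work/archive.tar.gz" | grep -c 'prices.csv')

echo "csv files: $csv_count, total price: $total, archived copies: $archived"
```

Capturing each result in a variable makes it easy to inspect one step at a time, for example with `echo "$replaced"`.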
Process and System Management
# Process management
ps aux # List all running processes
top # Monitor system resources
htop # Enhanced process monitor
kill PID # Terminate process by ID
# Disk usage and system info
df -h # Disk space usage
du -sh * # Size of each file and directory in the current directory
free -h # Memory usage (Linux only)
uname -a # System information
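To see how `ps` and `kill` fit together, the sketch below starts a harmless background process, inspects it, and stops it. It assumes a POSIX shell and a `ps` that supports `-p` and `-o` (true of the standard Linux and macOS versions); `sleep 300` is only a stand-in for a real long-running job.

```shell
set -eu

# Start a harmless long-running process in the background.
sleep 300 &
pid=$!    # $! holds the PID of the most recent background job

# ps -p PID inspects one process; -o picks columns, and "comm=" drops the header.
name=$(ps -p "$pid" -o comm=)
echo "started $name with PID $pid"

# kill sends SIGTERM by default, which asks the process to exit cleanly.
kill "$pid"

# wait reaps the job; it returns non-zero because the job was ended by a signal.
wait "$pid" 2>/dev/null || true

# kill -0 sends no signal; it only tests whether the PID still exists.
if kill -0 "$pid" 2>/dev/null; then
    status="running"
else
    status="stopped"
fi
echo "process is now $status"
```

If a process ignores the default signal, `kill -9 PID` forces termination, at the cost of giving the process no chance to clean up.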
Advanced Navigation and Shortcuts
# Command history
history # View command history
!! # Repeat last command
!n # Repeat command number n
Ctrl+R # Search command history
# File permissions
chmod 755 script.py # Owner: read/write/execute; group and others: read/execute
chmod +x script.py # Add execute permission
chown user:group file.txt # Change ownership
# Pipes and redirection
ls -la | grep '\.py$' # Show only entries ending in .py
python script.py > output.txt # Redirect output to file
python script.py 2>&1 | tee log.txt # Log both output and errors
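The permission and redirection commands above can be combined in one short session. This is a sketch under stated assumptions: `report.sh` is a hypothetical script created only for the demonstration, and a shell script stands in for `script.py` so the example runs without Python.

```shell
set -eu

work=$(mktemp -d)
cd "$work"

# A tiny script (hypothetical name) that writes to both stdout and stderr.
cat > report.sh <<'EOF'
#!/bin/sh
echo "result: ok"
echo "warning: sample message" >&2
EOF

# chmod +x adds execute permission so the script can be run as ./report.sh.
chmod +x report.sh

# > captures stdout only; 2> sends stderr to a separate file.
./report.sh > output.txt 2> errors.txt

# 2>&1 merges stderr into stdout, and tee writes the stream to log.txt
# while still passing it along (discarded here to keep the terminal quiet).
./report.sh 2>&1 | tee log.txt > /dev/null

# grep -c counts matching lines; '\.sh$' matches names that end in ".sh".
script_count=$(ls | grep -c '\.sh$')
log_lines=$(wc -l < log.txt | tr -d ' ')
echo "scripts: $script_count, lines logged: $log_lines"
```

Keeping stdout and stderr in separate files is useful when the output feeds another program, while the `tee` form suits a single human-readable log.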
Git Version Control
Git tracks changes in your code and enables collaboration.
Basic Workflow:
# Initialize repository
git init
# Track changes
git add .
git commit -m "Descriptive message"
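# Sync with remote (assumes a remote named origin and a branch named main)
git pull origin main # Fetch and merge remote changes first
git push origin main # Then publish your commits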
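The local part of this workflow can be practised in a throwaway repository. The sketch below assumes Git is installed; the file name `analysis.py`, the commit messages, and the placeholder identity are all illustrative.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"

# -q keeps the output short; the identity below is a placeholder set for this repo only.
git init -q
git config user.name "Example Student"
git config user.email "student@example.com"

echo 'print("hello")' > analysis.py

# Stage one named file rather than everything, then commit it.
git add analysis.py
git commit -q -m "Add analysis script"

# Change the file and list what differs from the last commit.
echo 'print("goodbye")' >> analysis.py
changed=$(git diff --name-only)

# -a stages all modified tracked files as part of the commit.
git commit -q -am "Print a second message"

# One line per commit, newest first.
git log --oneline
commit_count=$(git rev-list --count HEAD)
```

Staging files by name, as here, gives more control than `git add .`, which also picks up any untracked files in the directory. `git push` and `git pull` are left out because they need a configured remote.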
Python Environment Management
Professional Python development requires proper environment management.
Using uv (recommended):
# Create virtual environment
uv venv .venv
# Activate environment
source .venv/bin/activate # macOS/Linux
# or
.venv\Scripts\activate # Windows
# Install packages
uv pip install package-name
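One way to confirm that activation worked is to ask Python where it is running from. The following is a minimal sketch for macOS/Linux, assuming `python3` is installed; it uses `uv` when available and otherwise falls back to the standard-library `venv` module, which is an assumption made here for portability rather than part of the course setup.

```shell
set -e

work=$(mktemp -d)
cd "$work"

# Prefer uv when it is on PATH; otherwise fall back to the standard-library venv module.
# --without-pip keeps the fallback independent of ensurepip (a portability assumption).
if command -v uv >/dev/null 2>&1; then
    uv venv .venv
else
    python3 -m venv --without-pip .venv
fi

# Activation prepends .venv/bin to PATH and sets VIRTUAL_ENV for this shell.
# "." is the POSIX spelling of "source".
. .venv/bin/activate

# sys.prefix should now point inside the new environment.
env_prefix=$(python -c 'import sys; print(sys.prefix)')
echo "active environment: $env_prefix"

# deactivate restores the previous PATH.
deactivate
```

If the printed path does not end in `.venv`, the shell is still using a different interpreter, which is a common cause of "package not found" errors after installation.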
Hands-on Activities
Activity 1: Command Line Navigation
- Open your terminal
- Navigate to your Documents folder
- Create a new directory called research-projects
- List the contents to verify creation
Activity 2: First Git Repository
- Initialize a Git repository in your project folder
- Create a simple Python file
- Add and commit your changes
- Check the commit history
Activity 3: Nuvolos Workspace Setup
- Create a project workspace in Nuvolos
- Set up Python environment with required packages
- Test Git integration within Nuvolos
- Create your first Jupyter notebook
Integrated Nuvolos Workflow
Working with Nuvolos Platform
- Project Organization: Use Nuvolos workspace structure
- Version Control: Integrate Git within Nuvolos environment
- Environment Management: Leverage pre-configured Python environments
- Collaboration: Share workspaces with team members
- Resource Scaling: Use cloud computing resources as needed
Research Programming Workflow
- Project Setup: Create organized directory structure in Nuvolos
- Version Control: Initialize Git repository (local and remote)
- Environment: Configure Python packages for your project
- Development: Write code with proper documentation using Jupyter
- Testing: Validate your code works correctly
- Collaboration: Share via Nuvolos and push to GitHub
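The project setup step can be scripted so every new project starts from the same layout. This is only a sketch: the directory names, the `.gitignore` entries, and the choice to keep raw data out of Git are assumptions to adapt, not a structure prescribed by Nuvolos or this course.

```shell
set -eu

# Hypothetical project name and layout; adjust both to your own conventions.
project=$(mktemp -d)/research-project
mkdir -p "$project/data/raw" "$project/notebooks" "$project/src" "$project/tests"

# A README and a .gitignore are typical first files in a new repository.
printf '# Research Project\n\nSetup notes and usage go here.\n' > "$project/README.md"
cat > "$project/.gitignore" <<'EOF'
# Local environment and caches
.venv/
__pycache__/
.ipynb_checkpoints/
# Assumes raw data is stored outside Git
data/raw/
EOF

# Git does not track empty directories; a .gitkeep placeholder is a common workaround.
touch "$project/notebooks/.gitkeep" "$project/src/.gitkeep" "$project/tests/.gitkeep"

# Show the resulting layout.
find "$project" | sort
dir_count=$(find "$project" -mindepth 1 -type d | wc -l | tr -d ' ')
```

Running `git init` inside the new directory afterwards covers the version control step of the workflow.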
Code Examples
See the accompanying code examples for practical implementations:
- git-basics.py: Common Git operations in Python
- environment-setup.sh: Automated environment setup script
Resources
- Slides: Session 1 Slides
- Git Documentation: git-scm.com
- uv Documentation: astral.sh/uv
- Command Line Cheat Sheet: Available in course materials
Troubleshooting
Common Issues:
- Git permission errors: Check SSH key setup
- Python environment conflicts: Use virtual environments
- Command not found: Verify software installation
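For "command not found" errors, a quick first check is whether the tool is on your `PATH` at all. The helper below is a small sketch; `check_tool` is a hypothetical function name, and the list of tools simply mirrors the ones used in this lesson.

```shell
# Report whether a tool is on PATH; command -v is the POSIX way to look one up.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1 -> $(command -v "$1")"
    else
        echo "missing: $1"
        return 1
    fi
}

# Count how many of the lesson's tools are not installed.
missing=0
for tool in git python3 uv; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "$missing tool(s) missing"
```

A tool reported as missing is either not installed or installed in a directory that is not listed in `PATH`; `echo "$PATH"` shows which directories the shell searches.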
Assessment
This lesson contributes to:
- Assignment 1: Git Basics
- Final project setup and version control
Next Steps
Continue with the next lesson to build on these foundational tools.