Python is the backbone of modern AI and Machine Learning. Almost every major AI framework, research paper implementation, and production ML system uses Python because of its simplicity, flexibility, and powerful ecosystem.
This module builds strong programming foundations, focusing on how Python is actually used in AI/ML projects, not just syntax.
Python is an interpreted, high-level language that allows rapid experimentation — a critical requirement in ML where models are built, tested, and improved continuously.
Variables store data used in training and prediction.
Example:
learning_rate = 0.01
epochs = 100
Python does not require explicit type declaration, which speeds up experimentation.
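For example (the variable name and values here are only illustrative), the same name can be rebound to a different type without any declaration:
batch_size = 32            # starts out as an int
batch_size = "auto"        # the same name can later hold a string
print(type(batch_size))    # <class 'str'>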
ML code deals with large volumes of data, so understanding data types and flow control is critical.
Integers & Floats
Used for numerical features, weights, loss values.
age = 25
accuracy = 0.92
Strings
Used in labels, file paths, class names.
label = "spam"
Booleans
Used in conditions and flags.
is_trained = True
Lists
Used to store datasets, predictions.
scores = [0.85, 0.90, 0.88]
Tuples
Used for immutable configurations.
input_shape = (224, 224, 3)
Dictionaries
Used for feature mapping and configurations.
params = {"lr": 0.01, "epochs": 50}
Sets
Used for unique values.
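Example:
unique_classes = {0, 1, 2}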
Conditional Statements
Used for decision-making.
if accuracy > 0.9:
    print("Model is performing well")
Loops
Used for training epochs and batch processing.
for epoch in range(epochs):
    train_model()
Functions allow reusability and abstraction, which is crucial in ML pipelines.
A function represents a single logical operation.
def calculate_accuracy(y_true, y_pred):
    correct = sum(y_true[i] == y_pred[i] for i in range(len(y_true)))
    return correct / len(y_true)
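A quick check with made-up values:
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]
print(calculate_accuracy(y_true, y_pred))  # 0.75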
Good ML functions are small, do one thing (like calculate_accuracy above), and have clear inputs and outputs.
Modules help organize large ML projects.
Example file structure:
project/
├── data_loader.py
├── model.py
├── train.py
├── evaluate.py
Usage:
from data_loader import load_data
OOP helps structure complex ML systems like models, datasets, and pipelines.
class LinearRegressionModel:
    def __init__(self, lr):
        self.lr = lr
        self.weights = None

    def train(self, X, y):
        pass

    def predict(self, X):
        pass
Key OOP Concepts: classes and objects, encapsulation (keeping data and methods together), inheritance (sharing behaviour between related models), and polymorphism (calling train() or predict() on any model type).
ML systems rely heavily on datasets stored in files.
with open("data.txt", "r") as file:
    data = file.readlines()

with open("results.txt", "w") as file:
    file.write("Accuracy: 92%")
Use cases: loading datasets, writing logs and evaluation results, and saving experiment configurations.
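For example, an experiment configuration can be saved and reloaded with the standard json module (the file name here is illustrative):
import json

config = {"lr": 0.01, "epochs": 50}
with open("config.json", "w") as file:
    json.dump(config, file)

with open("config.json", "r") as file:
    config = json.load(file)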
ML code often fails due to missing files, corrupted data, wrong shapes, or invalid values. Exception handling prevents such failures from crashing the whole pipeline.
try:
    data = load_data("dataset.csv")
except FileNotFoundError:
    print("Dataset file not found")
except Exception as e:
    print("Unexpected error:", e)
Good practice: catch specific exceptions first, log enough context to reproduce the failure, and fail fast rather than silently continuing with bad data.
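A minimal sketch of this pattern, assuming the load_data helper from the example above and the standard logging module:
import logging

logger = logging.getLogger(__name__)

def safe_load(path):
    try:
        return load_data(path)  # assumed helper from the example above
    except FileNotFoundError:
        logger.error("Dataset file not found: %s", path)
        raise  # fail fast instead of continuing with missing data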
AI projects depend on specific library versions.
Virtual environments isolate dependencies.
python -m venv venv
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
pip install numpy pandas scikit-learn
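To make the environment reproducible, the installed versions are usually recorded in a requirements file and reinstalled from it elsewhere:
pip freeze > requirements.txt      # record exact versions
pip install -r requirements.txt    # recreate the environment on another machine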
Clean code is critical for collaboration and debugging.
def train_model(X, y, epochs, lr):
    for epoch in range(epochs):
        loss = compute_loss(X, y)
        update_weights(lr)
ml_project/
├── data/
├── notebooks/
├── src/
│ ├── preprocessing.py
│ ├── model.py
│ ├── train.py
│ └── utils.py
├── requirements.txt
└── README.md
# Check Python version (3.8+ recommended for ML)
python --version
# Using pip (Python package manager)
pip install numpy pandas matplotlib scikit-learn
# Check installed packages
pip list
# Interactive mode (REPL - Read-Eval-Print Loop)
>>> 2 + 2
4
>>> print("Hello ML!")
Hello ML!
# Script mode (save as script.py and run)
# python script.py
# Single-line comment
"""
Multi-line comment or docstring
Used for documentation
"""
def train_model(data):
    """
    Trains a machine learning model.

    Args:
        data: Training dataset

    Returns:
        Trained model object
    """
    pass
Numbers:
# Integers
num_samples = 1000
num_features = 784 # For MNIST dataset
# Floats (essential for ML calculations)
learning_rate = 0.001
accuracy = 0.95
# Complex numbers (used in signal processing)
z = 3 + 4j
# Type conversion
x = int(3.7) # 3
y = float(5) # 5.0
z = str(100) # "100"
Strings:
# String creation
model_name = "Random Forest"
dataset = 'CIFAR-10'
description = """Multi-line string
for longer descriptions"""
# String operations (useful for file paths, logging)
path = "/data/train/"
filename = "model_v1.pkl"
full_path = path + filename # Concatenation
# String formatting (for logging results)
epoch = 10
loss = 0.234
print(f"Epoch {epoch}: Loss = {loss:.4f}")
# Output: Epoch 10: Loss = 0.2340
# String methods
text = "machine learning"
print(text.upper()) # MACHINE LEARNING
print(text.split()) # ['machine', 'learning']
print(text.replace("machine", "deep")) # deep learning
Booleans:
# Boolean values (for flags and conditions)
is_training = True
has_converged = False
# Boolean operations
model_ready = (is_training and not has_converged)
# Comparison operators
accuracy > 0.9
loss <= 0.1
epoch != max_epochs
Lists (Most versatile, like arrays):
# Creating lists
features = [1, 2, 3, 4, 5]
mixed_data = [1, "feature", 3.14, True]
# Indexing (0-based)
first_feature = features[0] # 1
last_feature = features[-1] # 5
# Slicing (very important for ML data manipulation)
first_three = features[0:3] # [1, 2, 3]
last_two = features[-2:] # [4, 5]
every_second = features[::2] # [1, 3, 5]
# List operations
features.append(6) # Add to end
features.insert(0, 0) # Insert at position
features.remove(3) # Remove by value
popped = features.pop() # Remove and return last
# List comprehension (powerful for data processing)
squared = [x**2 for x in features]
filtered = [x for x in features if x > 2]
# Practical ML example
train_indices = [i for i in range(len(dataset)) if i % 5 != 0]
test_indices = [i for i in range(len(dataset)) if i % 5 == 0]
Tuples (Immutable, good for fixed data):
# Creating tuples
image_shape = (28, 28, 3) # Height, Width, Channels
train_val_split = (0.8, 0.2)
# Unpacking (very useful)
height, width, channels = image_shape
train_ratio, val_ratio = train_val_split
# Tuples are immutable
# image_shape[0] = 32 # This would raise an error
# Use case: returning multiple values from functions
def get_data_stats(data):
    return len(data), data.mean(), data.std()
size, mean, std = get_data_stats(dataset)
Dictionaries (Key-value pairs, essential for configs):
# Creating dictionaries
model_config = {
    'learning_rate': 0.001,
    'batch_size': 32,
    'epochs': 100,
    'optimizer': 'Adam'
}
# Accessing values
lr = model_config['learning_rate']
optimizer = model_config.get('optimizer', 'SGD') # With default
# Adding/modifying
model_config['momentum'] = 0.9
model_config['learning_rate'] = 0.0001 # Update
# Dictionary methods
print(model_config.keys()) # Get all keys
print(model_config.values()) # Get all values
print(model_config.items()) # Get key-value pairs
# Iterating
for key, value in model_config.items():
    print(f"{key}: {value}")
# Dictionary comprehension
squared_dict = {x: x**2 for x in range(5)}
# {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
# Nested dictionaries (for complex configs)
experiment_config = {
    'model': {
        'type': 'CNN',
        'layers': [64, 128, 256]
    },
    'training': {
        'epochs': 50,
        'batch_size': 32
    }
}
# Accessing nested values
model_type = experiment_config['model']['type']
Sets (Unique elements, useful for data cleaning):
# Creating sets
unique_labels = {0, 1, 2, 3, 4}
another_set = set([1, 2, 2, 3, 3, 3]) # {1, 2, 3}
# Set operations
train_ids = {1, 2, 3, 4, 5}
val_ids = {4, 5, 6, 7}
intersection = train_ids & val_ids # {4, 5} (overlap)
union = train_ids | val_ids # All unique IDs
difference = train_ids - val_ids # {1, 2, 3}
# Use case: finding unique classes
all_predictions = [0, 1, 1, 2, 0, 3, 2, 1]
unique_classes = set(all_predictions) # {0, 1, 2, 3}
If-Else Statements:
# Basic if-else
accuracy = 0.95
if accuracy > 0.9:
    print("Excellent model!")
elif accuracy > 0.7:
    print("Good model")
else:
    print("Needs improvement")
# Ternary operator (one-liner)
status = "Pass" if accuracy > 0.8 else "Fail"
# ML example: Early stopping logic
def check_early_stopping(val_loss, best_loss, patience_counter, patience):
    """Returns (should_stop, best_loss, patience_counter) so the caller keeps the updated state."""
    if val_loss < best_loss:
        best_loss = val_loss
        patience_counter = 0
        print("New best model!")
    else:
        patience_counter += 1
        if patience_counter >= patience:
            print("Early stopping triggered")
            return True, best_loss, patience_counter
    return False, best_loss, patience_counter
Loops:
For loops (iterating over sequences):
# Basic for loop
epochs = 10
for epoch in range(epochs):
    print(f"Training epoch {epoch + 1}")
# Iterating over lists
features = ['age', 'income', 'education']
for feature in features:
    print(f"Processing feature: {feature}")
# Enumerate (get index and value)
for idx, feature in enumerate(features):
    print(f"Feature {idx}: {feature}")
# Range with start, stop, step
for i in range(0, 100, 10):  # 0, 10, 20, ..., 90
    print(f"Processing batch {i}")
# Nested loops (training loop example)
for epoch in range(num_epochs):
    for batch_idx, (data, labels) in enumerate(train_loader):
        # Training logic here
        if batch_idx % 10 == 0:
            print(f"Epoch {epoch}, Batch {batch_idx}")
# List comprehension (faster alternative)
squared_features = [x**2 for x in range(10)]
While loops (condition-based):
# Basic while loop
loss = 1.0
epoch = 0
max_epochs = 1000
while loss > 0.01 and epoch < max_epochs:
    # Training step
    loss = loss * 0.95  # Simulated decrease
    epoch += 1
    print(f"Epoch {epoch}: Loss = {loss:.4f}")
# While with break
while True:
    user_input = input("Continue training? (y/n): ")
    if user_input.lower() == 'n':
        break
    # Continue training
# While with continue
batch_idx = 0
while batch_idx < len(dataset):
    if dataset[batch_idx] is None:
        batch_idx += 1
        continue  # Skip this batch
    # Process batch
    batch_idx += 1
Loop Control:
# Break: Exit loop immediately
for epoch in range(100):
    train_loss = train_model()
    if train_loss < threshold:
        print(f"Converged at epoch {epoch}")
        break
# Continue: Skip to next iteration
for sample_id in range(len(dataset)):
    if is_corrupted(dataset[sample_id]):
        continue  # Skip corrupted data
    process(dataset[sample_id])
# Pass: Placeholder (do nothing)
for epoch in range(num_epochs):
    pass  # TODO: Implement training loop
from typing import List, Dict, Tuple, Optional
def preprocess_data(
    data: List[float],
    labels: List[int],
    normalize: bool = True
) -> Tuple[List[float], List[int]]:
    """
    Preprocesses data with type hints for clarity.
    """
    if normalize:
        mean = sum(data) / len(data)
        data = [(x - mean) for x in data]
    return data, labels
# Optional types (can be None)
def load_model(path: Optional[str] = None) -> Dict:
    if path is None:
        return {}  # Return empty config
    # Load from path (placeholder return so the annotated Dict is always produced)
    return {'path': path}
# Complex types
DataPoint = Tuple[List[float], int] # (features, label)
Dataset = List[DataPoint]
def split_dataset(
    dataset: Dataset,
    ratio: float = 0.8
) -> Tuple[Dataset, Dataset]:
    split_idx = int(len(dataset) * ratio)
    return dataset[:split_idx], dataset[split_idx:]
Defining Functions:
# Basic function
def train_epoch(model, data, labels):
    """Trains model for one epoch."""
    # Training logic
    loss = 0.0
    # ... compute loss
    return loss
# Function with default arguments
def create_model(
    input_size: int,
    hidden_size: int = 128,
    output_size: int = 10,
    activation: str = 'relu'
):
    """Creates neural network with defaults."""
    model = {
        'input': input_size,
        'hidden': hidden_size,
        'output': output_size,
        'activation': activation
    }
    return model
# Usage
model1 = create_model(784) # Uses defaults
model2 = create_model(784, hidden_size=256, activation='tanh')
Return Values:
# Single return
def calculate_accuracy(predictions, labels):
    correct = sum([p == l for p, l in zip(predictions, labels)])
    return correct / len(labels)
# Multiple returns (as tuple)
def evaluate_model(model, test_data, test_labels):
    predictions = model.predict(test_data)
    accuracy = calculate_accuracy(predictions, test_labels)
    loss = calculate_loss(predictions, test_labels)
    return accuracy, loss, predictions
# Unpacking returns
acc, loss, preds = evaluate_model(model, X_test, y_test)
# Named returns using dictionary
def get_metrics(predictions, labels):
    return {
        'accuracy': calculate_accuracy(predictions, labels),
        'precision': calculate_precision(predictions, labels),
        'recall': calculate_recall(predictions, labels)
    }
metrics = get_metrics(preds, labels)
print(f"Accuracy: {metrics['accuracy']}")
Function Arguments:
# Positional arguments
def train(model, data, labels, epochs):
    pass
train(my_model, X_train, y_train, 100)
# Keyword arguments (more readable for ML)
train(
    model=my_model,
    data=X_train,
    labels=y_train,
    epochs=100
)
# *args - Variable number of positional arguments
def ensemble_predict(*models):
    """Combines predictions from multiple models."""
    predictions = []
    for model in models:
        pred = model.predict()
        predictions.append(pred)
    return average(predictions)
result = ensemble_predict(model1, model2, model3)
# **kwargs - Variable number of keyword arguments
def configure_model(**config):
    """Flexible model configuration."""
    learning_rate = config.get('learning_rate', 0.001)
    batch_size = config.get('batch_size', 32)
    optimizer = config.get('optimizer', 'Adam')
    print(f"LR: {learning_rate}, Batch: {batch_size}, Opt: {optimizer}")
configure_model(learning_rate=0.01, momentum=0.9, weight_decay=0.0001)
# Combining all argument types
def train_model(model, data, *regularizers, epochs=10, **hyperparams):
    """
    model: positional
    data: positional
    *regularizers: variable positional
    epochs: keyword with default
    **hyperparams: variable keyword
    """
    pass
Lambda Functions (Anonymous functions):
# Basic lambda
square = lambda x: x**2
print(square(5)) # 25
# Lambda with multiple arguments
multiply = lambda x, y: x * y
print(multiply(3, 4)) # 12
# Common use: sorting by custom key
data_points = [(1, 0.5), (2, 0.3), (3, 0.9)]
sorted_by_loss = sorted(data_points, key=lambda x: x[1])
# [(2, 0.3), (1, 0.5), (3, 0.9)]
# Lambda with map/filter
features = [1, 2, 3, 4, 5]
normalized = list(map(lambda x: (x - 3) / 2, features))
filtered = list(filter(lambda x: x > 2, features))
# ML example: applying transformation
import math

def apply_activation(values, activation='relu'):
    activations = {
        'relu': lambda x: max(0, x),
        'sigmoid': lambda x: 1 / (1 + math.exp(-x)),
        'tanh': lambda x: math.tanh(x)
    }
    return [activations[activation](v) for v in values]
# Global vs Local scope
learning_rate = 0.001 # Global
def train():
    loss = 0.0  # Local to train()
    print(learning_rate)  # Can access global
    # print(batch_loss)  # Error: not defined here
def validate():
    batch_loss = 0.5  # Local to validate()
    # print(loss)  # Error: not accessible
# Modifying global variables
counter = 0
def increment():
    global counter  # Explicitly declare as global
    counter += 1
# Closures (functions that remember enclosing scope)
def create_trainer(learning_rate):
    """Returns a training function with fixed learning rate."""
    def train(model, loss):
        # This function "closes over" learning_rate
        model.update(loss * learning_rate)
    return train
# Create specialized trainers
fast_trainer = create_trainer(0.01)
slow_trainer = create_trainer(0.0001)
# Practical example: Creating preprocessors
def create_normalizer(mean, std):
    """Factory function for normalization."""
    def normalize(value):
        return (value - mean) / std
    return normalize
# Create normalizer for specific dataset
dataset_mean = 0.5
dataset_std = 0.2
normalizer = create_normalizer(dataset_mean, dataset_std)
# Use it
normalized_value = normalizer(0.7)
import time
import functools
# Basic decorator: timing function execution
def timer(func):
    """Decorator to measure execution time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start_time = time.time()
        result = func(*args, **kwargs)
        end_time = time.time()
        print(f"{func.__name__} took {end_time - start_time:.4f} seconds")
        return result
    return wrapper
@timer
def train_model(epochs):
    """Training takes time..."""
    time.sleep(2)  # Simulated training
    return "Model trained"
result = train_model(100)
# Logging decorator
def log_execution(func):
    """Logs function calls."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__} with args={args}, kwargs={kwargs}")
        result = func(*args, **kwargs)
        print(f"{func.__name__} returned {result}")
        return result
    return wrapper
@log_execution
def predict(model, data):
    return model.forward(data)
# Stacking decorators
@timer
@log_execution
def train_epoch(model, data):
    pass  # Training logic
# Decorator with arguments
def validate_inputs(min_value, max_value):
    """Decorator factory."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(value):
            if not (min_value <= value <= max_value):
                raise ValueError(f"Value must be between {min_value} and {max_value}")
            return func(value)
        return wrapper
    return decorator
@validate_inputs(0.0, 1.0)
def set_learning_rate(lr):
    print(f"Learning rate set to {lr}")
set_learning_rate(0.01) # OK
# set_learning_rate(1.5) # Raises ValueError
Creating a Module:
# File: ml_utils.py
"""
Utility functions for machine learning.
"""
def normalize_data(data, mean=None, std=None):
    """Normalizes data to zero mean and unit variance."""
    if mean is None:
        mean = sum(data) / len(data)
    if std is None:
        std = (sum((x - mean)**2 for x in data) / len(data)) ** 0.5
    normalized = [(x - mean) / std for x in data]
    return normalized, mean, std
def split_data(data, labels, ratio=0.8):
    """Splits data into train and test sets."""
    split_idx = int(len(data) * ratio)
    return (
        data[:split_idx],
        labels[:split_idx],
        data[split_idx:],
        labels[split_idx:]
    )
# Module-level variables
DEFAULT_RANDOM_SEED = 42
Using Modules:
# Import entire module
import ml_utils
data, mean, std = ml_utils.normalize_data([1, 2, 3, 4, 5])
print(ml_utils.DEFAULT_RANDOM_SEED)
# Import specific functions
from ml_utils import normalize_data, split_data
data, mean, std = normalize_data([1, 2, 3])
# Import with alias
from ml_utils import normalize_data as norm
data, _, _ = norm([1, 2, 3])
# Import all (not recommended for large modules)
from ml_utils import *
Creating a Package:
ml_project/
│
├── __init__.py # Makes it a package
├── preprocessing/
│ ├── __init__.py
│ ├── normalization.py
│ └── feature_engineering.py
├── models/
│ ├── __init__.py
│ ├── linear_models.py
│ └── neural_nets.py
└── utils/
├── __init__.py
└── metrics.py
__init__.py example:
# ml_project/__init__.py
"""
ML Project Package
"""
__version__ = '1.0.0'
__author__ = 'Your Name'
# Import key functions for easy access
from .preprocessing.normalization import normalize
from .models.linear_models import LinearRegression
# Package-level configuration
DEFAULT_CONFIG = {
    'random_seed': 42,
    'test_size': 0.2
}
Using the package:
# Import from package
from ml_project.preprocessing.normalization import normalize
from ml_project.models.linear_models import LinearRegression
from ml_project import DEFAULT_CONFIG
# Or use package-level imports
import ml_project
model = ml_project.LinearRegression()
data = ml_project.normalize(raw_data)
Relative Imports (within package):
# In ml_project/models/neural_nets.py
from ..preprocessing.normalization import normalize # Go up one level
from .linear_models import LinearRegression # Same level
from ..utils.metrics import accuracy # Up then down
Basic Class:
class NeuralNetwork:
    """A simple neural network class."""

    def __init__(self, input_size, hidden_size, output_size):
        """
        Constructor - called when creating new object.

        Args:
            input_size: Number of input features
            hidden_size: Number of hidden neurons
            output_size: Number of output classes
        """
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.weights1 = None
        self.weights2 = None
        self.is_trained = False

    def initialize_weights(self):
        """Initializes network weights."""
        import random
        self.weights1 = [[random.random() for _ in range(self.hidden_size)]
                         for _ in range(self.input_size)]
        self.weights2 = [[random.random() for _ in range(self.output_size)]
                         for _ in range(self.hidden_size)]

    def train(self, X, y, epochs=100):
        """Trains the network."""
        self.initialize_weights()
        for epoch in range(epochs):
            # Training logic
            pass
        self.is_trained = True
        print(f"Training completed for {epochs} epochs")

    def predict(self, X):
        """Makes predictions."""
        if not self.is_trained:
            raise ValueError("Model must be trained first!")
        # Prediction logic
        return []

    def get_info(self):
        """Returns model information."""
        return {
            'input_size': self.input_size,
            'hidden_size': self.hidden_size,
            'output_size': self.output_size,
            'is_trained': self.is_trained
        }
# Creating objects (instances)
model1 = NeuralNetwork(784, 128, 10) # MNIST-like architecture
model2 = NeuralNetwork(100, 64, 2) # Binary classification
# Using methods
model1.train(X_train, y_train, epochs=50)
predictions = model1.predict(X_test)
info = model1.get_info()
print(info)
class Dataset:
    """Dataset class for managing ML data."""

    # Class attributes (shared by all instances)
    supported_formats = ['csv', 'json', 'parquet']
    num_instances = 0

    def __init__(self, data, labels, name="Unnamed"):
        # Instance attributes (unique to each object)
        self.data = data
        self.labels = labels
        self.name = name
        self.size = len(data)
        self._preprocessed = False  # Private attribute (convention)
        # Increment class counter
        Dataset.num_instances += 1

    # Instance method (requires self)
    def shuffle(self):
        """Shuffles data and labels together."""
        import random
        combined = list(zip(self.data, self.labels))
        random.shuffle(combined)
        self.data, self.labels = zip(*combined)
        self.data = list(self.data)
        self.labels = list(self.labels)

    def normalize(self):
        """Normalizes the data."""
        mean = sum(sum(row) for row in self.data) / (len(self.data) * len(self.data[0]))
        self.data = [[(x - mean) for x in row] for row in self.data]
        self._preprocessed = True

    # Property (accessed like attribute, but is a method)
    @property
    def is_preprocessed(self):
        """Check if data has been preprocessed."""
        return self._preprocessed

    # Setter for property
    @is_preprocessed.setter
    def is_preprocessed(self, value):
        if not isinstance(value, bool):
            raise TypeError("Must be boolean")
        self._preprocessed = value

    # Class method (works with class, not instance)
    @classmethod
    def from_file(cls, filename):
        """Creates Dataset from file."""
        # Load data from file
        data = []  # Load logic here
        labels = []
        return cls(data, labels, name=filename)

    # Static method (doesn't need class or instance)
    @staticmethod
    def is_valid_format(filename):
        """Checks if file format is supported."""
        extension = filename.split('.')[-1]
        return extension in Dataset.supported_formats

    # Special method for string representation
    def __repr__(self):
        return f"Dataset(name='{self.name}', size={self.size})"

    # Special method for len()
    def __len__(self):
        return self.size

    # Special method for indexing
    def __getitem__(self, index):
        return self.data[index], self.labels[index]
# Usage examples
ds = Dataset([[1, 2], [3, 4]], [0, 1], name="MyDataset")
# Instance methods
ds.shuffle()
ds.normalize()
# Property access
print(ds.is_preprocessed) # True (accessed like attribute)
# Class method
ds2 = Dataset.from_file("data.csv")
# Static method
if Dataset.is_valid_format("data.csv"):
    print("Valid format")
# Special methods
print(ds) # Uses __repr__
print(len(ds)) # Uses __len__
sample = ds[0] # Uses __getitem__
# Class attribute
print(f"Total datasets created: {Dataset.num_instances}")
# Base class
class Model:
    """Base model class."""

    def __init__(self, name):
        self.name = name
        self.is_trained = False
        self.training_history = []

    def train(self, X, y):
        """To be implemented by subclasses."""
        raise NotImplementedError("Subclass must implement train()")

    def predict(self, X):
        """To be implemented by subclasses."""
        raise NotImplementedError("Subclass must implement predict()")

    def evaluate(self, X, y):
        """Common evaluation logic."""
        predictions = self.predict(X)
        accuracy = sum([p == l for p, l in zip(predictions, y)]) / len(y)
        return accuracy

    def save_model(self, filepath):
        """Common save functionality."""
        print(f"Saving {self.name} to {filepath}")
# Derived class 1
class LinearRegressionModel(Model):
    """Linear regression implementation."""

    def __init__(self, name="LinearRegression"):
        super().__init__(name)  # Call parent constructor
        self.coefficients = None
        self.intercept = None

    def train(self, X, y, learning_rate=0.01, epochs=1000):
        """Implements training for linear regression."""
        # Initialize parameters
        self.coefficients = [0.0] * len(X[0])
        self.intercept = 0.0
        # Training loop
        for epoch in range(epochs):
            # Gradient descent logic
            loss = 0.0
            # ... compute and update
            self.training_history.append(loss)
        self.is_trained = True
        print(f"{self.name} training completed")

    def predict(self, X):
        """Makes predictions using linear model."""
        if not self.is_trained:
            raise ValueError("Model not trained!")
        predictions = []
        for sample in X:
            pred = self.intercept + sum([c * x for c, x in zip(self.coefficients, sample)])
            predictions.append(pred)
        return predictions
# Derived class 2
class DecisionTreeModel(Model):
    """Decision tree implementation."""

    def __init__(self, name="DecisionTree", max_depth=10):
        super().__init__(name)
        self.max_depth = max_depth
        self.tree = None

    def train(self, X, y):
        """Implements training for decision tree."""
        # Build tree
        self.tree = self._build_tree(X, y, depth=0)
        self.is_trained = True
        print(f"{self.name} training completed")

    def _build_tree(self, X, y, depth):
        """Helper method to build tree recursively."""
        if depth >= self.max_depth:
            return {'leaf': True, 'value': max(set(y), key=y.count)}
        # Tree building