10 NumPy Functions Every Data Scientist Should Know

Before Pandas. Before Scikit-learn. There is NumPy.

NumPy underpins most of Python's data science libraries, including Pandas and Scikit-learn. Once you understand NumPy, everything else makes more sense.

Here are the 10 most important NumPy functions — with examples and real use cases.


1. np.array() — Create Your First Array

Everything in NumPy starts with an array.

Example:

import numpy as np

data = np.array([10, 20, 30, 40, 50])
print(data)        # → [10 20 30 40 50]
print(data.shape)  # → (5,)
print(data.dtype)  # → int64 on most platforms (the default integer type is platform-dependent)

Use case: Converting a Python list into a NumPy array for fast mathematical operations.
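To make "fast mathematical operations" concrete, here is a minimal sketch of how arithmetic behaves on an array compared with a plain list. The names prices_list, prices, and grid are illustrative only.

```python
import numpy as np

prices_list = [10, 20, 30, 40, 50]
prices = np.array(prices_list)

# Arithmetic on an array is applied to every element at once
doubled = prices * 2
print(doubled)               # → [ 20  40  60  80 100]

# The same operator on a list repeats the list instead
print(len(prices_list * 2))  # → 10

# A nested list becomes a 2-D array
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.shape)            # → (2, 3)
print(grid.ndim)             # → 2
```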


2. np.zeros() and np.ones() — Create Pre-filled Arrays

Useful for initializing arrays before filling them with data.

Example:

zeros = np.zeros((3, 4))   # 3 rows, 4 columns of 0s
ones = np.ones((2, 3))     # 2 rows, 3 columns of 1s

print(zeros.shape)  # → (3, 4)
print(ones.shape)   # → (2, 3)

Use case: Creating placeholder arrays for machine learning model weights before training.
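One detail worth knowing when you create placeholders: both functions return floating-point arrays unless you pass a dtype. The sketch below shows this, using made-up names (weights, counts, bias).

```python
import numpy as np

# np.zeros and np.ones return floats unless a dtype is given
weights = np.zeros((2, 3))
print(weights.dtype)   # → float64

# Ask for integers explicitly, then fill a slot later
counts = np.zeros(4, dtype=int)
counts[1] = 7
print(counts)          # → [0 7 0 0]

# Scale an array of ones to get any constant value
bias = np.ones(3) * 0.5
print(bias)            # → [0.5 0.5 0.5]
```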


3. np.arange() — Create a Range of Numbers

Like Python's range() but returns a NumPy array.

Example:

arr = np.arange(0, 100, 10)
print(arr)  # → [ 0 10 20 30 40 50 60 70 80 90]

Use case: Generating evenly spaced values for plotting graphs or creating test datasets.
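Two behaviors are easy to miss here: the stop value is excluded, and the step can be a fraction. The sketch below also shows np.linspace(), a related function not covered in this list, which takes a number of points instead of a step.

```python
import numpy as np

# The stop value is excluded, just like range()
print(np.arange(5))          # → [0 1 2 3 4]

# np.arange accepts a non-integer step, which range() does not
halves = np.arange(0, 2, 0.5)
print(halves)                # → [0.  0.5 1.  1.5]

# np.linspace takes a count of points and includes the endpoint
points = np.linspace(0, 1, 5)
print(points)                # → [0.   0.25 0.5  0.75 1.  ]
```

For plotting, np.linspace() is often the more convenient choice because you control exactly how many points you get.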


4. np.reshape() — Change Array Shape

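Change the shape of an array without changing its data. The new shape must hold the same number of elements. The example uses the equivalent array method, arr.reshape().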

Example:

arr = np.arange(12)
reshaped = arr.reshape(3, 4)  # 3 rows, 4 columns
print(reshaped)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

Use case: Reshaping image data or feature matrices before feeding into a machine learning model.
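When you only know one dimension, you can pass -1 and let NumPy work out the other. This is a small sketch of that shortcut, plus what happens when the sizes do not match.

```python
import numpy as np

arr = np.arange(12)

# -1 asks NumPy to infer that dimension from the array size
inferred = arr.reshape(-1, 4)
print(inferred.shape)    # → (3, 4)

# Flatten back to one dimension
flat = inferred.reshape(-1)
print(flat.shape)        # → (12,)

# The element count must match, otherwise NumPy raises ValueError
try:
    arr.reshape(5, 3)    # 15 slots for 12 values
except ValueError as err:
    print("cannot reshape:", err)
```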


5. np.mean(), np.median(), np.std() — Basic Statistics

The three most used statistical functions in data science.

Example:

scores = np.array([85, 90, 78, 92, 88, 76, 95])

print(np.mean(scores))    # → ≈ 86.29
print(np.median(scores))  # → 88.0
print(np.std(scores))     # → ≈ 6.56

Use case: Calculating average salary, median house price, or standard deviation of test scores.
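On 2-D data these functions take an axis argument, and np.std() has a ddof argument that switches between the population and sample formulas. The sketch below uses made-up numbers to show both.

```python
import numpy as np

# Hypothetical data: 2 students (rows) x 3 exams (columns)
scores = np.array([[80, 90, 100],
                   [60, 70, 80]])

print(np.mean(scores))           # → 80.0  (all values)
print(np.mean(scores, axis=0))   # → [70. 80. 90.]  (per exam)
print(np.mean(scores, axis=1))   # → [90. 70.]  (per student)

# np.std defaults to the population formula (ddof=0);
# ddof=1 gives the sample standard deviation, which is larger
sample = np.array([2, 4, 4, 4, 5, 5, 7, 9])
pop_std = np.std(sample)
sample_std = np.std(sample, ddof=1)
print(pop_std)                   # → 2.0
print(sample_std > pop_std)      # → True
```

If you compare results with Pandas, keep in mind that its std() uses ddof=1 by default, so the two libraries give different numbers on the same data.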


6. np.min(), np.max(), np.sum() — Aggregations

Find the smallest, largest, or total value in an array.

Example:

sales = np.array([1200, 1500, 980, 2100, 1750])

print(np.min(sales))   # → 980
print(np.max(sales))   # → 2100
print(np.sum(sales))   # → 7530

Use case: Finding the lowest temperature in weather data, highest revenue month, total annual sales.
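Often you want to know where the extreme value sits, not only what it is. A minimal sketch using np.argmin() and np.argmax(), followed by per-row and per-column totals on a made-up 2-D array:

```python
import numpy as np

sales = np.array([1200, 1500, 980, 2100, 1750])

# argmax/argmin return the position of the extreme value
print(np.argmax(sales))   # → 3
print(np.argmin(sales))   # → 2

# Hypothetical 2-D data: 2 stores (rows) x 3 months (columns)
by_store = np.array([[100, 200, 300],
                     [400, 500, 600]])
print(np.sum(by_store, axis=0))  # → [500 700 900]  (per month)
print(np.sum(by_store, axis=1))  # → [ 600 1500]  (per store)
```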


7. np.where() — Conditional Selection

Choose values based on a condition. np.where(condition, x, y) returns x where the condition is true and y where it is false.

Example:

scores = np.array([45, 72, 88, 55, 91, 63])
result = np.where(scores >= 70, "Pass", "Fail")
print(result)  # → ['Fail' 'Pass' 'Pass' 'Fail' 'Pass' 'Fail']

Use case: Labeling students as pass or fail, flagging transactions above a threshold, categorizing data.
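If you want only the elements that meet a condition, or their positions, two related patterns help. This sketch shows boolean masking, the one-argument form of np.where(), and a three-argument call with numbers instead of labels.

```python
import numpy as np

scores = np.array([45, 72, 88, 55, 91, 63])

# A boolean mask keeps only the elements that meet the condition
passing = scores[scores >= 70]
print(passing)             # → [72 88 91]

# With one argument, np.where returns a tuple of index arrays
indices = np.where(scores >= 70)[0]
print(indices)             # → [1 2 4]

# x and y can be numbers or arrays, e.g. cap scores at 90
capped = np.where(scores > 90, 90, scores)
print(capped)              # → [45 72 88 55 90 63]
```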


8. np.unique() — Find Unique Values

Returns the sorted unique values in an array.

Example:

categories = np.array(["A", "B", "A", "C", "B", "A", "D"])
print(np.unique(categories))         # → ['A' 'B' 'C' 'D']
print(np.unique(categories, return_counts=True))
# → a tuple of two arrays: values ['A' 'B' 'C' 'D'] and counts [3 2 1 1]

Use case: Finding unique product categories, counting how many times each value appears in a column.
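Because return_counts=True gives back two arrays, it is usually easier to unpack them. A small sketch, with the variable names (values, counts, frequency) chosen for illustration:

```python
import numpy as np

categories = np.array(["A", "B", "A", "C", "B", "A", "D"])

# Unpack the unique values and their counts into two arrays
values, counts = np.unique(categories, return_counts=True)

# The most frequent value is the one with the highest count
most_common = values[np.argmax(counts)]
print(most_common)   # → A

# Build a plain dictionary of value -> count
frequency = {str(v): int(c) for v, c in zip(values, counts)}
print(frequency)     # → {'A': 3, 'B': 2, 'C': 1, 'D': 1}
```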


9. np.sort() — Sort an Array

np.sort() returns a sorted copy in ascending order. For descending order, reverse the result with [::-1].

Example:

prices = np.array([450, 120, 890, 230, 670])
print(np.sort(prices))           # → [120 230 450 670 890]
print(np.sort(prices)[::-1])     # → [890 670 450 230 120]

Use case: Sorting product prices, ranking exam scores, ordering time series data.
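Ranking usually means reordering a second array to match, which is what np.argsort() is for. The sketch below assumes a hypothetical products array of labels alongside the prices.

```python
import numpy as np

prices = np.array([450, 120, 890, 230, 670])
products = np.array(["desk", "lamp", "sofa", "chair", "bed"])  # hypothetical labels

# np.sort returns a sorted copy; the original is unchanged
sorted_prices = np.sort(prices)
print(prices)             # → [450 120 890 230 670]

# np.argsort returns the indices that would sort the array
order = np.argsort(prices)
print(order)              # → [1 3 0 4 2]
print(products[order])    # → ['lamp' 'chair' 'desk' 'bed' 'sofa']

# The .sort() method sorts the array in place
prices.sort()
print(prices)             # → [120 230 450 670 890]
```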


10. np.dot() — Matrix Multiplication

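Essential for machine learning. Used in neural networks, linear regression, and more. For 2-D arrays np.dot() performs matrix multiplication, although NumPy's documentation prefers the @ operator (np.matmul) for that case.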

Example:

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

result = np.dot(A, B)
print(result)
# → [[19 22]
#    [43 50]]

Use case: Calculating predictions in linear regression, forward pass in neural networks, feature transformations.
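To connect this to the linear regression use case, here is a minimal sketch with made-up numbers. X, w, and b are hypothetical features, weights, and bias. It also shows that * is element-wise and is not a matrix product.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# For 2-D arrays, A @ B gives the same result as np.dot(A, B)
print(np.array_equal(A @ B, np.dot(A, B)))   # → True

# * multiplies element by element, not as matrices
print(A * B)
# → [[ 5 12]
#    [21 32]]

# Linear regression prediction: y = X @ w + b (made-up numbers)
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
w = np.array([0.5, 1.0])
b = 1.0
predictions = X @ w + b
print(predictions)   # → [3.5 6.5]
```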


Quick Reference — Save This

Function                              What It Does
np.array()                            Create a NumPy array
np.zeros() / np.ones()                Create arrays of 0s or 1s
np.arange()                           Create a range of numbers
np.reshape()                          Change array shape
np.mean() / np.median() / np.std()    Basic statistics
np.min() / np.max() / np.sum()        Aggregations
np.where()                            Conditional selection
np.unique()                           Find unique values
np.sort()                             Sort array values
np.dot()                              Matrix multiplication

One Important Rule

NumPy arrays are faster than Python lists — but only if you use NumPy operations on them.

# Slow ❌ — using Python loop on NumPy array
total = 0
for x in data:
    total += x

# Fast ✅ — using NumPy operation
total = np.sum(data)

Prefer NumPy's vectorized functions on NumPy arrays. Avoid looping through them element by element whenever a built-in operation does the job.
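The same idea applies to any element-wise calculation, not only sums. As a sketch with made-up temperature readings, the loop and the single array expression below produce the same values:

```python
import numpy as np

temps_c = np.array([12.5, 18.0, 21.3, 9.8])  # made-up readings in Celsius

# Loop version: converts one element at a time
temps_f_loop = []
for t in temps_c:
    temps_f_loop.append(t * 9 / 5 + 32)

# Vectorized version: one expression over the whole array
temps_f = temps_c * 9 / 5 + 32

print(np.allclose(temps_f, temps_f_loop))   # → True
```

Both give the same numbers. The difference in speed only becomes noticeable as the array grows.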

Save this article and come back to it every time you work with numerical data.

