The asterisk in SQL selects all columns:
SELECT * FROM users;
This returns every column in the users table. Simple, but often wrong.
When SELECT * Is Fine
Exploring data in development:
SELECT * FROM users LIMIT 10;
Quick exploration doesn't need precision.
Small tables with few columns:
SELECT * FROM config WHERE key = 'api_url';
If the table has 3-4 columns and you need them all, SELECT * is readable.
Ad-hoc queries and scripts:
-- One-time data export
SELECT * FROM orders WHERE created_at > '2024-01-01';
When SELECT * Is Wrong
Production application code:
// Bad - application code
const users = await db.query('SELECT * FROM users');
Applications should specify exactly what they need.
Large tables with many columns:
-- Bad - 50 columns, only need 2
SELECT * FROM products;
-- Good
SELECT id, name FROM products;
Tables with BLOB/TEXT fields:
-- Bad - pulls massive binary data
SELECT * FROM documents;
-- Good - exclude large fields
SELECT id, title, created_at FROM documents;
Performance Impact
Network overhead: Transferring unnecessary data wastes bandwidth:
-- Transfers 1 MB per row
SELECT * FROM users;
-- Transfers 100 bytes per row
SELECT id, email FROM users;
Memory usage: Applications load all columns into memory.
Index coverage: Specific columns can use covering indexes:
-- May use covering index on (email, name)
SELECT email, name FROM users WHERE email = 'alice@example.com';
-- Cannot use covering index - needs all columns
SELECT * FROM users WHERE email = 'alice@example.com';
Schema Changes Break Code
Adding columns changes SELECT * output:
-- Original schema: id, name
SELECT * FROM users; // Returns 2 columns
-- After adding 'internal_notes'
SELECT * FROM users; // Now returns 3 columns
Applications expecting 2 columns might break.
Explicit columns are stable:
SELECT id, name FROM users; // Always 2 columns
Column Order Dependency
Some code relies on column order:
const [id, name] = row; // Assumes id is first
Schema changes that reorder columns break this. Explicit names are safer:
const { id, name } = row; // Order doesn't matter
Security Concerns
Sensitive columns might leak:
-- Bad - includes password_hash
SELECT * FROM users;
-- Good - excludes sensitive fields
SELECT id, username, email, created_at FROM users;
JOIN Performance
With joins, SELECT * pulls columns from all tables:
-- Bad - 100+ columns from 3 tables
SELECT * FROM orders
JOIN users ON orders.user_id = users.id
JOIN products ON orders.product_id = products.id;
-- Good - only needed columns
SELECT
orders.id,
orders.total,
users.email,
products.name
FROM orders
JOIN users ON orders.user_id = users.id
JOIN products ON orders.product_id = products.id;
Code Readability
Explicit columns document intent:
-- Unclear what's used
SELECT * FROM users WHERE active = true;
-- Clear requirements
SELECT id, email, name FROM users WHERE active = true;
Debugging
Explicit columns make query plans easier to read:
EXPLAIN SELECT id, email FROM users WHERE email LIKE '%@gmail.com';
Exceptions
COUNT(*) is fine:
SELECT COUNT(*) FROM users WHERE active = true;
COUNT(*) counts rows, not columns.
EXISTS checks:
SELECT EXISTS(SELECT 1 FROM users WHERE email = 'alice@example.com');
Only checks existence, column doesn't matter.
Best Practices
- List specific columns in application code
- Use
SELECT *only for exploration - Avoid in JOINs unless truly needed
- Consider covering indexes
- Document why if you must use it
Linting
SQL linters can flag SELECT *:
-- Warning: Avoid SELECT *
SELECT * FROM users;
Migration Strategy
Replacing existing SELECT *:
- Run query with
LIMIT 1 - Note which columns application uses
- Replace * with explicit list
- Test thoroughly
Further Reading
Use The Index, Luke's SQL performance guide covers index optimization.
PostgreSQL's EXPLAIN documentation shows query performance analysis.
SELECT * is convenient but rarely correct in production.
0 comments