Simple SQL: Filtering Rows
Posted by Data, Data Analytics
on March 19, 2015This is the second post in our series featuring SQL tutorials for those getting started. Check out our last post on how to join tables and come back for more Simple SQL posts.
The WHERE Clause
In SQL, the SELECT statement is used to return specific columns of data from a table. Similarly, the WHERE clause is used to choose the rows of a table, and the query will return only the rows that meet the given criteria.
SELECT column_name1, column_name2
FROM table_name
WHERE column_name1 operator value;
In this blog post I’ll be describing some common comparison operators and how to use them with the WHERE clause. I’ll be working with a demo database called Cities, that contains city names and population based on the 2010 census.
Comparison Operators
These comparison operators are used with WHERE to determine equality or difference between variables or values. You may or may not be familiar with these common comparison operators:
- = (Equal to)
- != (Not equal to)
- < (Less than)
-
(Greater than)
- <= (Less than or equal to)
-
= (Greater than or equal to)
Let’s see it in action. To demonstrate, we will ask a question:
What are the cities that have a population greater than 700,000 and less than 3,000,000?
First consider what the query looks like without using WHERE, and the outcome.
SELECT city, population FROM cities;
City | Population
San Francisco, CA | 805235
Portland, OR | 583776
Los Angeles, CA | 3792621
Seattle, WA | 608660
New York City, NY | 8175133
Houston, TX | 2099451
Boston, MA | 617594
Chicago, IL | 2695598
By selecting for the city and population column, the query returns all rows for the database.
To answer the question, we want to know which rows in the population column have a value of greater than 700,000 and a value of less than 3,000,000. In this example we also have two comparisons, so we need to define whether we want the query to return only the results when one is true or when both are true. Since we want to know the result when both are true, we use AND.
SELECT city FROM cities
WHERE population > 700000 AND population < 3000000;
And our result is three cities - San Francisco, Houston and Chicago.
City
San Francisco, CA
Houston, TX
Chicago, IL
If instead our question was: What is the population of either San Francisco or Houston? - our query would use OR because either comparison can be true.
SELECT city, population FROM cities
WHERE city = "San Francisco, CA" OR city = "Houston, TX";
City | Population
San Francisco, CA | 805235
Houston, TX | 2099451
If we had used AND in this example then we would have no result, because there are no cities that are named both San Franicsco and Houston.
Using the IN( ) Operator
The IN() operator is useful when you want to show the values that are inside of a list. Let’s ask the question:
What is the population of San Francisco, Portland, and Seattle?
In this scenario we are asking for the rows in the population column where our cities can be San Francisco, Portland or Seattle. Rather than writing a query for each of these cities, or writing a long WHERE clause, we can put the cities in a list.
SELECT city, population FROM cities
WHERE city IN('San Francisco, CA', 'Portland, OR', 'Seattle, WA');
And as you can see, the query returns only the rows for which we specified in the list.
City | Population
San Francisco, CA | 805235
Portland, OR | 583776
Seattle, WA | 608660
Using the LIKE Operator
The LIKE operator (or NOT LIKE) is used to perform pattern matching, combined with wildcards. If you are looking for a value that starts, ends or contains a certain search term, you can use LIKE to return all possible matches.
- LIKE %SearchTerm% → anything that contains this search term
- LIKE %SearchTerm → anything that ends with this search term
- LIKE SearchTerm% → anything that begins with this search term
Let’s see this in action. For example, what is the population of the cities located in California?
In this question, we want to pull out any rows from the cities column that end with “CA”. So our search term, “CA” will begin with the wildcard “%”.
SELECT city, population FROM cities
WHERE city LIKE '%CA';
City | Population
San Francisco, CA | 805235
Los Angeles, CA | 3792621
And our result is only the cities located in California.
I hope these examples help to get you started using WHERE. Stay posted for more SQL tutorials.