A boolean comparison on a pandas series returns a new series of True/False values, one per element. Boolean indexing involves two steps: first, evaluate a condition against a series to get those True/False values; then use the result to filter the dataframe, keeping only the rows marked True.

Operators

Here are the standard boolean operators:

  • a == b : Equality comparison
  • ~a : Negation (NOT)
  • a > b : Greater than
  • a < b : Less than
  • a & b : AND operator
  • a | b : OR operator
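
As a rough illustration of the operators above, here is a minimal sketch on two small series. The series a and b and their values are made up for this example; each operator works element by element and returns a boolean series.

```python
import pandas as pd

# Hypothetical sample values, chosen only to exercise each operator
a = pd.Series([1, 5, 10])
b = pd.Series([3, 5, 7])

equal = a == b               # True where the elements match
greater = a > b              # True where a is larger
less = a < b                 # True where a is smaller
both = (a > 2) & (b > 4)     # True where both conditions hold
either = (a > 8) | (b < 4)   # True where at least one condition holds
negated = ~(a == b)          # flips True and False

print(equal.tolist())    # [False, True, False]
print(negated.tolist())  # [True, False, True]
```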

Filtering a Dataframe

Let’s demonstrate filtering using a dataset. First, create an evaluation series:

evaluation = df['number'] < 10

The result is a series of True/False values, True wherever number is below 10. Next, pass that series to the dataframe to return only the matching rows:

df[evaluation]
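
To make the two steps concrete, here is a self-contained sketch. The dataframe contents and the name column are assumptions made up for illustration; only the number column comes from the example above.

```python
import pandas as pd

# Hypothetical data: 'name' and the values are invented for this sketch
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "number": [3, 12, 7, 25],
})

# Step 1: evaluate the condition, producing one True/False per row
evaluation = df["number"] < 10

# Step 2: keep only the rows where the evaluation is True
filtered = df[evaluation]

print(evaluation.tolist())        # [True, False, True, False]
print(filtered["name"].tolist())  # ['a', 'c']
```

Note that the filtered rows keep their original index labels (0 and 2 here) rather than being renumbered.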

Selecting Specific Data

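You can combine conditions for more advanced filtering. Wrap each condition in its own parentheses, because & and | bind more tightly than comparison operators such as == and <.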

Combining OR conditions across country values:

df[(df['country'] == 'USA') | (df['country'] == 'Canada')]
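
When the OR conditions all test the same column against different values, pandas also offers Series.isin as a shorter alternative. A minimal sketch, assuming a made-up dataframe (the rows and the names with_or and with_isin are illustrative only):

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "country": ["USA", "Canada", "Mexico", "USA"],
    "sector": ["Technology", "Finance", "Technology", "Energy"],
})

# Chained OR comparisons, as in the example above
with_or = df[(df["country"] == "USA") | (df["country"] == "Canada")]

# Same filter expressed as a membership test
with_isin = df[df["country"].isin(["USA", "Canada"])]

print(with_isin.index.tolist())  # [0, 1, 3]
```

Both forms select the same rows; isin tends to read better as the list of values grows.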

Using negation with AND to exclude a country while filtering by sector. The comparison goes inside the parentheses so that ~ negates its result rather than the column itself:

df[~(df['country'] == 'USA') & (df['sector'] == 'Technology')]
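
Negation interacts with Python operator precedence: ~ binds more tightly than ==, so the comparison has to be wrapped in parentheses before it is negated. Written without them, ~ is applied to the text column first, which does not produce a boolean mask and typically fails with an error. The sketch below uses made-up rows to show the parenthesized form and the equivalent != comparison.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "country": ["USA", "Canada", "USA", "Mexico"],
    "sector": ["Technology", "Technology", "Finance", "Technology"],
})

# Negate the result of the comparison, not the column
not_usa = ~(df["country"] == "USA")

# Equivalent mask using the inequality operator
same_mask = df["country"] != "USA"

# Combine with the sector condition
result = df[not_usa & (df["sector"] == "Technology")]

print(not_usa.tolist())            # [False, True, False, True]
print(result["country"].tolist())  # ['Canada', 'Mexico']
```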

Boolean indexing lets you explore data by building filtered subsets of a dataframe from conditions applied to its columns.