Methods for Selecting Variables¶
The step to select variables from a DataFrame is SelectColumnsStep / SelectStep.
It keeps columns based on their names or on the results of selectors.
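# Import paths for Recipe and SelectStep are assumed from the package layout
from yeast import Recipe
from yeast.steps import SelectStep
from yeast.selectors import AllNumeric
Recipe([
# AllNumeric() is a selector that keeps all numeric variables,
# so when executed this step keeps all numeric columns plus 'title'
SelectStep([AllNumeric(), 'title'])
])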
The selectors can choose columns based on their data type or name. They are shortcuts to select a subset of columns/predictors based on a common attribute:
- AllColumns: All variables
- AllString: All string variables
- AllBoolean: All boolean variables
- AllNumeric: All numerical variables
- AllDatetime: All date or time variables
- AllCategorical: All categorical variables
- AllMatching: All variables whose names match a regular expression
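To make the type-based selectors concrete, here is a minimal sketch of roughly equivalent selections written directly in pandas. It is not how yeast implements its selectors; the mapping to `select_dtypes` is an assumption for illustration, and the sample columns (`rating`, `watched`, `genre`) are hypothetical. String columns are left out because their dtype differs between pandas versions.

```python
import pandas as pd

# Hypothetical sample data; 'seasons' and 'aired' echo the examples on this page.
df = pd.DataFrame({
    "seasons": [9, 5],
    "rating": [8.9, 9.5],
    "watched": [True, False],
    "aired": pd.to_datetime(["1989-07-05", "2008-01-20"]),
    "genre": pd.Categorical(["comedy", "drama"]),
})

# Assumed pandas counterparts of the type-based selectors (illustration only).
by_type = {
    "AllNumeric": list(df.select_dtypes(include="number").columns),
    "AllBoolean": list(df.select_dtypes(include="bool").columns),
    "AllDatetime": list(df.select_dtypes(include="datetime").columns),
    "AllCategorical": list(df.select_dtypes(include="category").columns),
}

for selector, names in by_type.items():
    print(selector, names)
```

Each entry lists the column names that the corresponding selector would be expected to pick under this assumption.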
Usage is simple: pass a selector to any parameter that expects column names, and it selects the columns that share that attribute.
Recipe([
# Will keep all numeric and 2 more columns:
SelectStep([AllNumeric(), 'title', 'aired']),
# Will keep all the numeric variables:
SelectStep(AllNumeric()),
# Will keep only one column:
SelectStep('seasons')
])
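The three forms above (a list mixing selectors and names, a single selector, a single name) can be thought of as all reducing to a flat list of column names. The sketch below shows one way such a resolution could work. `resolve_columns` and `all_numeric` are hypothetical helpers written for this illustration, with a plain callable standing in for a selector; they are not part of yeast.

```python
import pandas as pd


def resolve_columns(df, spec):
    """Hypothetical helper: expand names and selector-like callables into column names."""
    items = spec if isinstance(spec, (list, tuple)) else [spec]
    resolved = []
    for item in items:
        # A callable stands in for a selector; a plain string is a column name.
        names = item(df) if callable(item) else [item]
        for name in names:
            if name not in resolved:  # keep the first occurrence, preserve order
                resolved.append(name)
    return resolved


def all_numeric(df):
    """Stand-in for AllNumeric(): names of the numeric columns."""
    return list(df.select_dtypes(include="number").columns)


df = pd.DataFrame({
    "title": ["Seinfeld", "Breaking Bad"],
    "seasons": [9, 5],
    "rating": [8.9, 9.5],
})

mixed = resolve_columns(df, [all_numeric, "title"])  # selector plus a name
single = resolve_columns(df, "seasons")              # a single name
selected = df[mixed]                                 # subset the DataFrame
```

Removing duplicates while preserving order is a design choice of this sketch, so a column named explicitly and also matched by a selector appears only once.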
Available Selectors¶
AllColumns¶
class yeast.selectors.AllColumns()
Return all columns in the DataFrame
Recipe([
# Will keep all columns
SelectStep(AllColumns())
])
AllString¶
class yeast.selectors.AllString()
Return all string columns
Recipe([
# Will keep all strings
SelectStep(AllString())
])
AllBoolean¶
class yeast.selectors.AllBoolean()
Return all boolean columns
Recipe([
# Will keep all booleans
SelectStep(AllBoolean())
])
AllNumeric¶
class yeast.selectors.AllNumeric()
Return all numerical columns
Recipe([
# Will keep all numerical columns such as int, float, etc.
SelectStep(AllNumeric())
])
AllDatetime¶
class yeast.selectors.AllDatetime()
Return all DateTime columns
Recipe([
# Will keep all dates and times
SelectStep(AllDatetime())
])
AllCategorical¶
class yeast.selectors.AllCategorical()
Return all Categorical columns
Recipe([
# Will keep all categorical
SelectStep(AllCategorical())
])
AllMatching¶
class yeast.selectors.AllMatching(pattern='')
Return all columns matching the regular expression given by pattern
Recipe([
# Will keep all the columns ending with "ed" (ed$)
SelectStep(AllMatching('ed$'))
])
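To see which names a pattern such as `ed$` would pick, the following sketch applies it to a list of column names with the standard `re` module. It assumes the pattern is applied with search semantics (a match anywhere in the name, anchored only by `$`); the column list, including `watched`, is hypothetical.

```python
import re

# Hypothetical column names; 'title', 'aired' and 'seasons' echo earlier examples.
columns = ["title", "aired", "seasons", "watched"]

# 'ed$' matches names that end with the letters "ed".
pattern = re.compile(r"ed$")
matching = [name for name in columns if pattern.search(name)]

print(matching)
```

Here `aired` and `watched` end in "ed" and are kept, while `title` and `seasons` are dropped.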