Cell Patterns
See the cell pattern tutorial for examples.
CellPatterns normalize the data values of a field.
Classes
Boolean([default_value]) | Normalize cell values to booleans.
Digit([default_value]) | Normalize cell values to an integer between 0 and 9.
Float([default_value]) | Normalize cell values to float.
Integer([default_value]) | Normalize cell values to int.
IntegerList() | Normalize cell values to a list of int.
String([default_value]) | Normalize cell values to str.
StringChoice(choices[, dict_use_keys, …]) | Return the “choice” that best fits the cell value.
StringChoiceMulti(choices[, case_sensitive]) | Check cell for desired strings.
WordList([default_value]) | Normalize cell values to a list of words (no digits, no punctuation).

class fuzzytable.cellpatterns.Boolean(default_value=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to booleans.
    # warm_colors.py
    from fuzzytable import FuzzyTable, FieldPattern, cellpatterns

    # Field whose cell values are normalized to True/False.
    iswarmcolor_field = FieldPattern(
        name="is_warm_color",
        cellpattern=cellpatterns.Boolean,
    )

    # 'color' is read as-is; the second column goes through the Boolean pattern.
    warmcolor_table = FuzzyTable(
        path='warm_colors.csv',
        fields=['color', iswarmcolor_field],
        approximate_match=True,
    )
warm_colors.csv

color  | is warm color
brown  | True
green  | False
yellow | yes
black  |

    >>> from warm_colors import warmcolor_table
    >>> for record in warmcolor_table.records:
    ...     print(record)
    ...
    {'color': 'brown', 'is_warm_color': True}
    {'color': 'green', 'is_warm_color': False}
    {'color': 'yellow', 'is_warm_color': True}
    {'color': 'black', 'is_warm_color': False}
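The mapping in the example above (True, "yes" and an empty cell becoming True, True and False) can be pictured with a small stand-alone sketch. The helper `to_bool` and the `TRUTHY` set are hypothetical and not part of fuzzytable; the spellings the library actually accepts may differ.

```python
# Hypothetical helper, not fuzzytable's implementation.
TRUTHY = {"true", "yes", "y", "1"}  # assumed set of truthy spellings


def to_bool(value, default=False):
    """Map a raw cell value to a bool; empty cells fall back to default."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if not text:
        return default
    return text in TRUTHY


cells = [True, False, "yes", ""]
print([to_bool(cell) for cell in cells])
```

With the four cells from warm_colors.csv this prints `[True, False, True, False]`, the same values as the records shown above.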

class fuzzytable.cellpatterns.Digit(default_value=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to an integer between 0 and 9.
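As a rough illustration of what normalizing to a single digit could look like, here is a stand-alone sketch. `to_digit` is a hypothetical helper; how fuzzytable treats out-of-range or float-like cells is not stated here, so the choices below (accept "7.0", reject "12") are assumptions.

```python
def to_digit(value, default=None):
    """Return value as an int in 0..9, or default if it is not a single digit."""
    text = str(value).strip()
    try:
        number = float(text)
    except ValueError:
        return default
    # Assumption: whole-number floats such as "7.0" count as the digit 7.
    if number.is_integer() and 0 <= number <= 9:
        return int(number)
    return default


print(to_digit("7"), to_digit("12"), to_digit("abc", default=0))
```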

class fuzzytable.cellpatterns.Float(default_value=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to float.

class fuzzytable.cellpatterns.Integer(default_value=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to int.
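The numeric patterns take a `default_value`, presumably used when a cell cannot be converted. A minimal sketch of that idea, using a hypothetical `to_number` helper rather than fuzzytable's own code:

```python
def to_number(value, kind=int, default=None):
    """Convert a cell to int or float, returning default when that fails."""
    try:
        number = float(str(value).strip())
    except ValueError:
        return default
    if kind is int:
        # In this sketch only whole numbers convert to int; "3.5" does not.
        return int(number) if number.is_integer() else default
    return number


print(to_number("42"), to_number("3.5", kind=float), to_number("n/a", default=0))
```

Whether the library rounds, truncates or rejects a cell like "3.5" for an integer field is not covered by this page; the sketch rejects it.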

class fuzzytable.cellpatterns.IntegerList
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to a list of int.
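One plausible way to turn a free-text cell into a list of integers is to collect every run of digits. The helper below, `to_int_list`, is hypothetical and ignores signs; fuzzytable's actual parsing rules may differ.

```python
import re


def to_int_list(value):
    """Return every run of digits in the cell as a list of ints."""
    return [int(match) for match in re.findall(r"\d+", str(value))]


print(to_int_list("1, 2 and 30"))
```

For the cell "1, 2 and 30" this yields `[1, 2, 30]`; a cell with no digits yields an empty list.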

class fuzzytable.cellpatterns.String(default_value='')
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to str.

class fuzzytable.cellpatterns.StringChoice(choices, dict_use_keys=True, default=None, approximate_match=False, min_ratio=0.6, case_sensitive=False, contains_match=True, mode=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Return the “choice” that best fits the cell value.
This pattern operates in one of these three modes:
- exact
- approx
- contains
    # cities.py
    from fuzzytable import FuzzyTable, FieldPattern, cellpatterns

    # Each cell in the 'state' column is mapped to the closest of three choices.
    state_field = FieldPattern(
        name="state",
        cellpattern=cellpatterns.StringChoice(
            choices='pennsylvania new_york north_carolina'.split(),
            case_sensitive=False,
            approximate_match=True,
            min_ratio=0.5,
        ),
    )

    cities_table = FuzzyTable(
        path='cities.csv',
        fields=['city', state_field],
    )
cities.csv

city         | state
New York     | New York
Philadelphia | Pennsylvania
Albany       | new york
Raleigh      | North Carolina
Wilmington   | Delaware

    >>> from cities import cities_table
    >>> for record in cities_table.records:
    ...     print(record)
    ...
    {'city': 'New York', 'state': 'new_york'}
    {'city': 'Philadelphia', 'state': 'pennsylvania'}
    {'city': 'Albany', 'state': 'new_york'}
    {'city': 'Raleigh', 'state': 'north_carolina'}
    {'city': 'Wilmington', 'state': None}
Parameters:
- choices (sequence of strings, or dict whose values are sequences of strings) – if a dict is given, the key is what is returned.
- case_sensitive (bool, default False)
- dict_use_keys (bool, default True) – the keys of the dict will be used as search terms.
- default (Any, default None) – any of the choices given.
- approximate_match (bool, default False) – Deprecated in v0.18. To be removed in v1.0. Use mode instead. True overrides contains_match.
- min_ratio (float within [0.0, 1.0], default 0.6)
- contains_match (bool, default True) – Deprecated in v0.18. To be removed in v1.0. Use mode instead.
- mode (None or str) – Choose from 'exact', 'approx', or 'contains'. mode overrides approximate_match and contains_match.
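To make the three modes and `min_ratio` concrete, here is a stand-alone sketch of how such a chooser might behave. The function `best_choice` is hypothetical, and `difflib.SequenceMatcher` from the standard library is swapped in for the similarity score; this page does not say which algorithm fuzzytable uses, so treat the exact ratios as illustrative only.

```python
from difflib import SequenceMatcher


def best_choice(value, choices, mode="exact", min_ratio=0.6,
                case_sensitive=False, default=None):
    """Pick the choice that best fits value under the given mode."""
    def prep(text):
        text = str(text)
        return text if case_sensitive else text.lower()

    cell = prep(value)
    if mode == "exact":
        # The whole cell must equal a choice.
        return next((c for c in choices if prep(c) == cell), default)
    if mode == "contains":
        # A choice only needs to appear somewhere inside the cell.
        return next((c for c in choices if prep(c) in cell), default)
    if mode == "approx":
        # Score every choice and keep the best one if it clears min_ratio.
        scored = [(SequenceMatcher(None, prep(c), cell).ratio(), c)
                  for c in choices]
        ratio, choice = max(scored, key=lambda pair: pair[0],
                            default=(0.0, default))
        return choice if ratio >= min_ratio else default
    raise ValueError(f"unknown mode: {mode!r}")


states = "pennsylvania new_york north_carolina".split()
print(best_choice("new york", states, mode="approx", min_ratio=0.5))
print(best_choice("Delaware", states, mode="approx", min_ratio=0.5))
```

Under these assumptions the first call returns `'new_york'` and the second returns `None`, mirroring the Albany and Wilmington rows in the example above. Raising `min_ratio` makes the approximate mode stricter; a value of 1.0 effectively reduces it to exact matching.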

class fuzzytable.cellpatterns.StringChoiceMulti(choices: List[str], case_sensitive=True)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Check cell for desired strings. Return list of found strings.
    # colors.py
    from fuzzytable import FuzzyTable, FieldPattern, cellpatterns

    # Each cell in the 'warm colors' column is searched for these four strings.
    warm_color_field = FieldPattern(
        name="warm_colors",
        cellpattern=cellpatterns.StringChoiceMulti(
            choices='red pink brown yellow'.split(),
            case_sensitive=False,
        ),
    )

    colors_table = FuzzyTable(
        path='colors.csv',
        fields=[warm_color_field, 'cool_colors'],
        approximate_match=True,
    )
colors.csv

warm colors      | cool colors
brown Yellow Red |
Brown            | green
yellow red       | blue
                 | black

    >>> from colors import colors_table
    >>> for record in colors_table.records:
    ...     print(record)
    ...
    {'warm_colors': ['red', 'brown', 'yellow'], 'cool_colors': None}
    {'warm_colors': ['brown'], 'cool_colors': 'green'}
    {'warm_colors': ['red', 'yellow'], 'cool_colors': 'blue'}
    {'warm_colors': [], 'cool_colors': 'black'}
Parameters:
- choices (sequence of strings)
- case_sensitive (bool, default True)
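The effect of `case_sensitive` on a multi-choice search can be shown with a short stand-alone sketch. `find_choices` is a hypothetical helper that uses plain substring tests, which is an assumption; fuzzytable may tokenize the cell differently.

```python
def find_choices(value, choices, case_sensitive=True):
    """Return every choice found in the cell, in the order of choices."""
    cell = str(value) if case_sensitive else str(value).lower()
    found = []
    for choice in choices:
        needle = choice if case_sensitive else choice.lower()
        if needle in cell:  # substring test, so "red" also matches "bored"
            found.append(choice)
    return found


warm = "red pink brown yellow".split()
print(find_choices("brown Yellow Red", warm, case_sensitive=False))
print(find_choices("brown Yellow Red", warm, case_sensitive=True))
```

The case-insensitive call returns `['red', 'brown', 'yellow']`, ordered as in `choices` like the first record above, while the case-sensitive call finds only `['brown']`.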

class fuzzytable.cellpatterns.WordList(default_value=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to a list of words (no digits, no punctuation).
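A word list with digits and punctuation stripped can be approximated with a single regular expression. The helper `to_word_list` is hypothetical and only recognizes ASCII letters, which is an assumption of this sketch rather than documented library behavior.

```python
import re


def to_word_list(value):
    """Split a cell into alphabetic words, dropping digits and punctuation."""
    if value is None:
        return []
    return re.findall(r"[A-Za-z]+", str(value))


print(to_word_list("red, 2 blue; green!"))
```

For the cell "red, 2 blue; green!" this gives `['red', 'blue', 'green']`. Note that an apostrophe splits a word in this sketch, so "don't" becomes two entries.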