Cell Patterns
See the cell pattern tutorial for examples.
Cell patterns normalize the data values of a field.
Classes
Boolean([default_value]) | Normalize cell values to booleans.
Digit([default_value]) | Normalize cell values to an integer between 0-9.
Float([default_value]) | Normalize cell values to float.
Integer([default_value]) | Normalize cell values to int.
IntegerList() | Normalize cell values to list of int.
String([default_value]) | Normalizes cell values to str.
StringChoice(choices[, dict_use_keys, …]) | Return “choice” that best fits the cell value.
StringChoiceMulti(choices[, case_sensitive]) | Check cell for desired strings.
WordList([default_value]) | Normalize cell values to a list of words (no digits, no punctuation).

class fuzzytable.cellpatterns.Boolean(default_value=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to booleans.

# warm_colors.py
from fuzzytable import FuzzyTable, FieldPattern, cellpatterns

# Field whose cell values are normalized to booleans.
iswarmcolor_field = FieldPattern(
    name="is_warm_color",
    cellpattern=cellpatterns.Boolean,
)

# Pass the field pattern defined above alongside the plain 'color' field.
warmcolor_table = FuzzyTable(
    path='warm_colors.csv',
    fields=['color', iswarmcolor_field],
    approximate_match=True,
)

warm_colors.csv

color  | is warm color
brown  | True
green  | False
yellow | yes
black  |

>>> python warm_colors.py
>>> for record in warmcolor_table.records:
...     print(record)
...
{'color': 'brown', 'is_warm_color': True}
{'color': 'green', 'is_warm_color': False}
{'color': 'yellow', 'is_warm_color': True}
{'color': 'black', 'is_warm_color': False}
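
The normalization in the example above can be pictured with a small standalone sketch. This is not fuzzytable's implementation: the helper name `to_bool` and the set of truthy spellings are assumptions chosen only so that the sample cells produce the same values as the records shown.

```python
# Hypothetical helper, not part of fuzzytable.
TRUTHY = {"true", "yes", "y", "1"}  # assumed truthy spellings


def to_bool(cell, default=False):
    """Coerce a cell to bool; empty or missing cells fall back to `default`."""
    if cell is None:
        return default
    if isinstance(cell, bool):
        return cell
    text = str(cell).strip().lower()
    if not text:
        return default
    return text in TRUTHY


cells = ["True", "False", "yes", ""]
print([to_bool(c) for c in cells])  # [True, False, True, False]
```

The fallback for empty cells is set to False here only to mirror the 'black' row in the example output.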

class fuzzytable.cellpatterns.Digit(default_value=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to an integer from 0 to 9.

class fuzzytable.cellpatterns.Float(default_value=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to float.

class fuzzytable.cellpatterns.Integer(default_value=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to int.

class fuzzytable.cellpatterns.IntegerList
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to list of int.
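
As a rough illustration of what normalizing a cell to a list of int can mean, the sketch below pulls every run of digits out of a string. The helper `to_int_list` is hypothetical, and the rule it applies (unsigned digit runs only) is an assumption; the library may treat signs, separators, or non-string cells differently.

```python
import re


def to_int_list(cell):
    """Collect every run of digits in the cell as an int (signs are ignored)."""
    return [int(token) for token in re.findall(r"\d+", str(cell))]


print(to_int_list("3, 14 and 15"))  # [3, 14, 15]
print(to_int_list("no numbers"))    # []
```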

class fuzzytable.cellpatterns.String(default_value='')
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalizes cell values to str.

class fuzzytable.cellpatterns.StringChoice(choices, dict_use_keys=True, default=None, approximate_match=False, min_ratio=0.6, case_sensitive=False, contains_match=True, mode=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Return the “choice” that best fits the cell value.
This pattern operates in one of these three modes:
- exact
- approx
- contains

# cities.py
from fuzzytable import FuzzyTable, FieldPattern, cellpatterns

# Field whose cell values are mapped onto one of the listed choices.
state_field = FieldPattern(
    name="state",
    cellpattern=cellpatterns.StringChoice(
        choices='pennsylvania new_york north_carolina'.split(),
        case_sensitive=False,
        approximate_match=True,
        min_ratio=0.5,
    ),
)

cities_table = FuzzyTable(
    path='cities.csv',
    fields=['city', state_field],
)

cities.csv

city         | state
New York     | New York
Philadelphia | Pennsylvania
Albany       | new york
Raleigh      | North Carolina
Wilmington   | Delaware

>>> python cities.py
>>> for record in cities_table.records:
...     print(record)
...
{'city': 'New York', 'state': 'new_york'}
{'city': 'Philadelphia', 'state': 'pennsylvania'}
{'city': 'Albany', 'state': 'new_york'}
{'city': 'Raleigh', 'state': 'north_carolina'}
{'city': 'Wilmington', 'state': None}

Parameters:
- choices (sequence of strings, or dict whose values are sequences of strings) – if a dict is given, the key is what is returned.
- case_sensitive (bool, default False)
- dict_use_keys (bool, default True) – the keys of the dict will be used as search terms.
- default (Any, default None) – any of the choices given.
- approximate_match (bool, default False) – Deprecated in v0.18. To be removed in v1.0. Use mode instead. True overrides contains_match.
- min_ratio (float within [0.0, 1.0], default 0.6)
- contains_match (bool, default True) – Deprecated in v0.18. To be removed in v1.0. Use mode instead.
- mode (None or str) – Choose from 'exact', 'approx', or 'contains'. mode overrides approximate_match and contains_match.
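
To make the three modes and the role of min_ratio concrete, here is a standalone sketch. It is not fuzzytable's code: the function `best_choice` is hypothetical, and scoring with `difflib.SequenceMatcher` from the standard library is an assumption standing in for whatever similarity measure the library actually uses.

```python
from difflib import SequenceMatcher


def best_choice(cell, choices, mode="exact", min_ratio=0.6, default=None):
    """Hypothetical stand-in for the 'exact', 'approx' and 'contains' modes."""
    text = str(cell).lower()  # case-insensitive comparison throughout
    if mode == "exact":
        return next((c for c in choices if c.lower() == text), default)
    if mode == "contains":
        return next((c for c in choices if c.lower() in text), default)
    if mode == "approx":
        # Score every choice and keep the best one if it clears min_ratio.
        scored = [(SequenceMatcher(None, text, c.lower()).ratio(), c) for c in choices]
        ratio, choice = max(scored)
        return choice if ratio >= min_ratio else default
    raise ValueError(f"unknown mode: {mode!r}")


states = ["pennsylvania", "new_york", "north_carolina"]
print(best_choice("new_york", states, mode="exact"))               # new_york
print(best_choice("Upstate new_york", states, mode="contains"))    # new_york
print(best_choice("New York", states, mode="approx", min_ratio=0.5))  # new_york
print(best_choice("Delaware", states, mode="approx", min_ratio=0.5))  # None
```

In this sketch a cell that resembles no choice closely enough falls back to `default`, which mirrors the 'Wilmington' row in the example above.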

class fuzzytable.cellpatterns.StringChoiceMulti(choices: List[str], case_sensitive=True)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Check cell for desired strings. Return list of found strings.

# colors.py
from fuzzytable import FuzzyTable, FieldPattern, cellpatterns

# Field that collects every listed color found in the cell.
warm_color_field = FieldPattern(
    name="warm_colors",
    cellpattern=cellpatterns.StringChoiceMulti(
        choices='red pink brown yellow'.split(),
        case_sensitive=False,
    ),
)

colors_table = FuzzyTable(
    path='colors.csv',
    fields=[warm_color_field, 'cool_colors'],
    approximate_match=True,
)

colors.csv

warm colors      | cool colors
brown Yellow Red |
Brown            | green
yellow red       | blue
                 | black

>>> python colors.py
>>> for record in colors_table.records:
...     print(record)
...
{'warm_colors': ['red', 'brown', 'yellow'], 'cool_colors': None}
{'warm_colors': ['brown'], 'cool_colors': 'green'}
{'warm_colors': ['red', 'yellow'], 'cool_colors': 'blue'}
{'warm_colors': [], 'cool_colors': 'black'}

Parameters:
- choices (sequence of strings)
- case_sensitive (bool, default True)
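
The behaviour described for this pattern can be approximated in a few lines of plain Python. The helper `find_choices` below is hypothetical and uses simple substring tests, which is an assumption; it was chosen because it yields the same lists, in the order of `choices`, as the example output above.

```python
def find_choices(cell, choices, case_sensitive=True):
    """Return every choice that appears in the cell, in the order of `choices`."""
    text = "" if cell is None else str(cell)
    if not case_sensitive:
        text = text.lower()
    found = []
    for choice in choices:
        needle = choice if case_sensitive else choice.lower()
        if needle in text:
            found.append(choice)
    return found


warm = "red pink brown yellow".split()
print(find_choices("brown Yellow Red", warm, case_sensitive=False))  # ['red', 'brown', 'yellow']
print(find_choices("brown Yellow Red", warm))                        # ['brown']
print(find_choices(None, warm))                                      # []
```

Because the test is a plain substring check, a choice such as 'red' would also match inside a longer word like 'bored'; a word-boundary check would be needed to rule that out.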

class fuzzytable.cellpatterns.WordList(default_value=None)
Bases: fuzzytable.patterns.cellpattern.CellPattern
Normalize cell values to a list of words (no digits, no punctuation).
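
One way to read "a list of words (no digits, no punctuation)" is to keep only runs of letters. The sketch below does that with a regular expression; the helper `to_word_list` is hypothetical, and restricting words to ASCII letters is an assumption rather than the library's documented rule.

```python
import re


def to_word_list(cell):
    """Split a cell into alphabetic words, discarding digits and punctuation."""
    return re.findall(r"[A-Za-z]+", str(cell))


print(to_word_list("red, green & blue 42"))  # ['red', 'green', 'blue']
```

Under this rule an apostrophe splits a word, so "don't" becomes two entries.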