Thursday, 19 September 2013

python - pandas: create dummies from column with multiple values

I am looking for a pythonic way to handle the following problem.
The pandas.get_dummies() function is a convenient way to create dummy
variables from a categorical column of a dataframe. For example, if the
column has values in ['A', 'B'], get_dummies() creates 2 dummy variables
and assigns 0 or 1 accordingly.
Now I need to handle a different situation. A column has values like
['A', 'B', 'C', 'D', 'A*C', 'C*D']. get_dummies() creates 6 dummies, but I
only want 4 of them (A, B, C, D), so that a row could have multiple 1s.
Is there a pythonic way to do this? I could only think of a step-by-step
algorithm to get it, but that would not use get_dummies(). Thanks
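
One possible approach, offered only as a sketch, is the string accessor
Series.str.get_dummies(), which splits each value on a separator before
building the indicator columns. The dataframe below and the column name
"col" are made up for illustration; only the six example values come from
the question.

```python
import pandas as pd

# Hypothetical sample data: the column name "col" is an assumption,
# the values are the ones listed in the question.
df = pd.DataFrame({"col": ["A", "B", "C", "D", "A*C", "C*D"]})

# Split every value on "*" and create one 0/1 column per distinct token,
# so "A*C" sets both the A and the C indicator.
dummies = df["col"].str.get_dummies(sep="*")

# Attach the indicator columns to the original frame.
result = df.join(dummies)
print(result)
```

With this input the sketch produces four indicator columns (A, B, C, D)
instead of six, and the combined rows carry a 1 in each matching column.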
