
Comments (2)

supatuffpinkpuff commented on May 29, 2024

I'm encountering this issue too, and I think I've found the cause.

The error comes from line 82 of flame_dame_helpers.py, in find_pe_for_covar_set:
binarized_df = pd.get_dummies(x_treated.loc[:, non_bool_cols].astype(str))

The error message from pd.get_dummies indicates that it is being passed an empty dataframe. I think this depends on how many of the columns used for analysis are boolean and how many are non-boolean. The code iteratively removes columns, potentially including the non-boolean ones, so depending on the dataset there can be an iteration with no non-boolean columns left. In that case pd.get_dummies receives an empty dataframe, which triggers this error.

I believe a fix would involve checking, before each call to pd.get_dummies, that it is not being passed an empty dataframe. Perhaps adding

if len(non_bool_cols) > 0:

before each call to pd.get_dummies would suffice? I will do some more testing tomorrow.
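
The check suggested above could be wrapped in a small helper. This is only a minimal sketch, not the package's actual fix: safe_get_dummies is a hypothetical name, and returning an empty frame that keeps the row index is an assumption about what the code after line 82 would need.

```python
import pandas as pd


def safe_get_dummies(df, non_bool_cols):
    """Hypothetical helper: one-hot encode only when there are columns to encode."""
    if len(non_bool_cols) == 0:
        # Nothing to binarize, so skip pd.get_dummies entirely and return
        # an empty frame that still carries the original row index.
        return pd.DataFrame(index=df.index)
    # Same expression as line 82 of flame_dame_helpers.py.
    return pd.get_dummies(df.loc[:, non_bool_cols].astype(str))


# Toy stand-in for x_treated (values are illustrative only).
x_treated = pd.DataFrame({"a": [100, 10, 10], "b": [0, 1, 1]})

encoded = safe_get_dummies(x_treated, ["a"])  # one dummy column per distinct value of "a"
empty = safe_get_dummies(x_treated, [])       # no columns, same rows as x_treated
```

Keeping the index means the empty result can still be concatenated column-wise with the boolean columns, if that is what the surrounding code does.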

I made some fake datasets as test cases, and found the following:

The dataset below has only boolean integer columns, and fails with this error on iteration 1.
data = pd.DataFrame({'bool_integer_categories1':[1, 1, 0, 0, 1],
'treatment':[0, 0, 0, 1, 1],
'outcome':[5, 1, 2, 3, 4],
'bool_integer_categories2':[1, 0, 1, 0, 0]})

The dataset below has only non-boolean integer columns, and succeeds after two iterations.
data = pd.DataFrame({'nonbool_integer_categories1':[100, 100, 10, 10, 1],
'treatment':[0, 0, 0, 1, 1],
'outcome':[5, 1, 2, 3, 4],
'nonbool_integer_categories2':[1, 2, 3, 1, 2]})

The dataset below has one non-boolean integer column and one boolean integer column, and fails on iteration 1.
data = pd.DataFrame({'nonbool_integer_categories':[100, 10, 10, 50, 50],
'treatment':[0, 0, 0, 1, 1],
'outcome':[5, 1, 2, 3, 4],
'boolean_integer_column':[0, 0, 1, 1, 0]})

Adding another boolean integer column still fails on iteration 1.
data = pd.DataFrame({'nonbool_integer_categories':[100, 10, 10, 50, 50],
'treatment':[0, 0, 0, 1, 1],
'outcome':[5, 1, 2, 3, 4],
'boolean_integer_column':[0, 0, 1, 1, 0],
'boolean_int_col_2':[1, 0, 1, 0, 1]})

Adding another non-boolean column still fails, but the run gets one iteration further and fails after iteration 2.
data = pd.DataFrame({'nonbool_integer_col_1':[100, 10, 10, 50, 50],
'nonbool_integer_col_2':[1, 1, 2, 2, 3],
'treatment':[0, 0, 0, 1, 1],
'outcome':[5, 1, 2, 3, 4],
'boolean_integer_column':[0, 0, 1, 1, 0]})
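
The pattern in these test cases matches the explanation given earlier in this comment. The sketch below checks that under two stated assumptions, neither confirmed against the dame_flame source: a column counts as boolean when all its values are 0 or 1, and the algorithm evaluates each candidate set obtained by dropping a single covariate. The function names are hypothetical.

```python
import pandas as pd


def non_bool_columns(df):
    """Assumed rule: a column is boolean if every value is 0 or 1."""
    return [c for c in df.columns if not df[c].isin([0, 1]).all()]


def some_drop_leaves_no_non_bool(data, treatment="treatment", outcome="outcome"):
    """True if dropping one covariate can leave zero non-boolean columns."""
    covariates = data.drop(columns=[treatment, outcome])
    for dropped in covariates.columns:
        remaining = covariates.drop(columns=[dropped])
        if len(non_bool_columns(remaining)) == 0:
            return True
    return False


common = {"treatment": [0, 0, 0, 1, 1], "outcome": [5, 1, 2, 3, 4]}

only_non_bool = pd.DataFrame({"nonbool_integer_categories1": [100, 100, 10, 10, 1],
                              "nonbool_integer_categories2": [1, 2, 3, 1, 2], **common})
mixed_one = pd.DataFrame({"nonbool_integer_categories": [100, 10, 10, 50, 50],
                          "boolean_integer_column": [0, 0, 1, 1, 0], **common})
mixed_two = pd.DataFrame({"nonbool_integer_col_1": [100, 10, 10, 50, 50],
                          "nonbool_integer_col_2": [1, 1, 2, 2, 3],
                          "boolean_integer_column": [0, 0, 1, 1, 0], **common})

for name, df in [("only_non_bool", only_non_bool),
                 ("mixed_one", mixed_one),
                 ("mixed_two", mixed_two)]:
    print(name, some_drop_leaves_no_non_bool(df))
```

Under these assumptions, only the dataset with one non-boolean and one boolean column is flagged in the first round, in line with its failure on iteration 1. The dataset with two non-boolean columns is not flagged until one of them has been dropped.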

Code, logs, and traceback from the last example are below:
`def test_mixed_bool_more_cols_1():

    data = pd.DataFrame({'nonbool_integer_col_1':[100, 10, 10, 50, 50],
                         'nonbool_integer_col_2':[1, 1, 2, 2, 3],
                         'treatment':[0, 0, 0, 1, 1],
                         'outcome':[5, 1, 2, 3, 4],
                         'boolean_integer_column':[0, 0, 1, 1, 0]})
    # return data

    model_flame = dame_flame.matching.FLAME(repeats=False, verbose=3, adaptive_weights='decisiontree')
    model_flame.fit(holdout_data=data, treatment_column_name='treatment', outcome_column_name='outcome')
    result_flame = model_flame.predict(data)

    print('ATE:')
    print(dame_flame.utils.post_processing.ATE(model_flame))

    # Visualizing CATE of matched groups from FLAME.
    group_size_treated = []
    group_size_overall = []
    cate_of_group = []
    for group in model_flame.units_per_group:

        # find len of just treated units
        df_mmg = data.loc[group]
        treated = df_mmg.loc[df_mmg["treatment"] == 1]

        group_size_treated.append(len(treated))
        group_size_overall.append(len(group))

        cate_of_group.append(dame_flame.utils.post_processing.CATE(model_flame, group[0]))

    plt.scatter(group_size_treated, cate_of_group, alpha=0.25, edgecolors='b')
    plt.axhline(y=0.0, color='r', linestyle='-')
    plt.xlabel('Number of Treatment units in group', fontsize=12)
    plt.ylabel('Estimated Treatment Effect of Group', fontsize=12)
    plt.title("Visualizing CATE of matched groups from FLAME", fontsize=14)
    plt.savefig('interpretability.png')

`
Iteration number: 1
Number of matched groups formed in total: 0
Unmatched treated units: 2 out of a total of 2 treated units
Unmatched control units: 3 out of a total of 3 control units
Predictive error of covariates chosen this iteration: 0
Number of matches made in this iteration: 0
Number of matches made so far: 0
In this iteration, the covariates dropped are: set()
Iteration number: 2
Number of matched groups formed in total: 0
Unmatched treated units: 2 out of a total of 2 treated units
Unmatched control units: 3 out of a total of 3 control units
Predictive error of covariates chosen this iteration: 0.0
Number of matches made in this iteration: 0
Number of matches made so far: 0
In this iteration, the covariates dropped are: nonbool_integer_col_2

Error

Traceback (most recent call last):
  File "test_mixed_bool_more_cols_1", line 1, in <module>
  File "test_mixed_bool_more_cols_1", line 12, in test_mixed_bool_more_cols_1
  File "/opt/conda/lib/python3.7/site-packages/dame_flame/matching.py", line 219, in predict
    pre_dame, C)
  File "/opt/conda/lib/python3.7/site-packages/dame_flame/matching.py", line 416, in _FLAME
    want_bf, mice_on_hold, early_stops, pre_dame, C)
  File "/opt/conda/lib/python3.7/site-packages/dame_flame/flame_algorithm.py", line 210, in flame_generic
    df_unmatched, return_matches, C, weight_array)
  File "/opt/conda/lib/python3.7/site-packages/dame_flame/flame_algorithm.py", line 78, in decide_drop
    adaptive_weights, alpha_given)
  File "/opt/conda/lib/python3.7/site-packages/dame_flame/flame_dame_helpers.py", line 82, in find_pe_for_covar_set
    binarized_df = pd.get_dummies(x_treated.loc[:, non_bool_cols].astype(str))
  File "/opt/conda/lib/python3.7/site-packages/pandas/core/reshape/reshape.py", line 903, in get_dummies
    result = concat(with_dummies, axis=1)
  File "/opt/conda/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 295, in concat
    sort=sort,
  File "/opt/conda/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 342, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate

from dame-flame-python-package.

nehargupta commented on May 29, 2024

Hi @supatuffpinkpuff
I just saw your comment. Thanks for using this package and thoroughly debugging it!

If I'm not mistaken, this issue should already be resolved in my local branch, but the fix somehow hasn't made it into master yet. You can see my branch here: https://github.com/nehargupta/DAME-FLAME-Python-Package, and my bug fix for this, assuming it's correct, is this one: e088707

Please feel free to use my branch if that suffices for your needs, and definitely let me know if you're still seeing this issue. I hope to sort out this version control problem and merge the fix into master soon, so we should be able to close this issue shortly. Please let me know if you think the issue persists or if I'm mistaken somehow.
