
I'm trying to merge a dataframe (df1) with another dataframe (df2), where df2 may be empty (df1 never is). The merge condition is df1.index == df2.z, but when df2 is empty I get the following error.

Is there any way to get this working?

In [31]:
import pandas as pd
In [32]:
df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [1, 2, 3]})
df2 = pd.DataFrame({'x':[], 'y':[], 'z':[]})
dfm = pd.merge(df1, df2, how='outer', left_index=True, right_on='z')
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-34-4e9943198dae> in <module>()
----> 1 dfmb = pd.merge(df1, df2, how='outer', left_index=True, right_on='z')

/usr/local/lib/python2.7/dist-packages/pandas/tools/merge.pyc in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy)
     37                          right_index=right_index, sort=sort, suffixes=suffixes,
     38                          copy=copy)
---> 39     return op.get_result()
     40 if __debug__:
     41     merge.__doc__ = _merge_doc % '\nleft : DataFrame'

/usr/local/lib/python2.7/dist-packages/pandas/tools/merge.pyc in get_result(self)
    185 
    186     def get_result(self):
--> 187         join_index, left_indexer, right_indexer = self._get_join_info()
    188 
    189         ldata, rdata = self.left._data, self.right._data

/usr/local/lib/python2.7/dist-packages/pandas/tools/merge.pyc in _get_join_info(self)
    277                 join_index = self.left.index.take(left_indexer)
    278             elif self.left_index:
--> 279                 join_index = self.right.index.take(right_indexer)
    280             else:
    281                 join_index = Index(np.arange(len(left_indexer)))

/usr/local/lib/python2.7/dist-packages/pandas/core/index.pyc in take(self, indexer, axis)
    981 
    982         indexer = com._ensure_platform_int(indexer)
--> 983         taken = np.array(self).take(indexer)
    984 
    985         # by definition cannot propogate freq

IndexError: cannot do a non-empty take from an empty axes.
  • 1
    Why not just check whether it's empty first? That takes almost no time at all. Mar 3, 2015 at 0:48
  • 1
    That's not the problem. In the subsequent code, I expect a merged dataframe with columns from df1 and df2 (even though some of them may be None/nan).
    – orange
    Mar 3, 2015 at 1:51

2 Answers


Another alternative, similar to Joran's:

try:
    dfm = pd.merge(df1, df2, how='outer', left_index=True, right_on='z')
except IndexError:
    dfm = df1.reindex_axis(df1.columns.union(df2.columns), axis=1)

I'm not sure which is clearer, but both of the following work:

In [11]: df1.reindex_axis(df1.columns.union(df2.columns), axis=1)
Out[11]:
   a  b  c   x   y   z
0  1  4  1 NaN NaN NaN
1  2  5  2 NaN NaN NaN
2  3  6  3 NaN NaN NaN

In [12]: df1.loc[:, df1.columns.union(df2.columns)]
Out[12]:
   a  b  c   x   y   z
0  1  4  1 NaN NaN NaN
1  2  5  2 NaN NaN NaN
2  3  6  3 NaN NaN NaN

(I prefer the former.)
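
For readers on current pandas: as far as I know, `reindex_axis` has since been removed, and `.loc` with column labels that don't exist raises a `KeyError`. Below is a minimal sketch of the same fallback using `reindex(columns=...)`, with an explicit emptiness check in place of catching `IndexError`. The helper name `merge_or_pad` is made up for illustration, and whether the original merge still raises depends on your pandas version.

```python
import pandas as pd


def merge_or_pad(left, right, right_on):
    """Outer-merge left's index against right[right_on].

    If right is empty, return left padded with right's column names
    (all NaN) instead of calling merge at all.
    """
    if right.empty:
        # Nothing to match against: keep left's rows, add right's columns.
        return left.reindex(columns=left.columns.union(right.columns))
    return pd.merge(left, right, how='outer', left_index=True, right_on=right_on)


df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [1, 2, 3]})
df2 = pd.DataFrame({'x': [], 'y': [], 'z': []})
dfm = merge_or_pad(df1, df2, right_on='z')
```

Note that `union` returns the combined labels in sorted order, so the column order may differ from what a real merge would produce.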

  • 1
    This works fine, but how would I go about preserving the type? union just copies the column names, but not the type. In my case, some of the values are datetimes, so I'd expect NaT instead of NaN as column value.
    – orange
    Mar 3, 2015 at 2:31
  • @orange I would just hit these columns with pd.to_datetime afterwards. Mar 3, 2015 at 7:31
  • Note that reindex_axis has been deprecated since pandas 0.21.0; use reindex on current versions. Feb 10, 2020 at 6:16
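
On the dtype question raised in the comments: one option that avoids a separate `pd.to_datetime` pass is to reindex the empty frame onto df1's index and concatenate column-wise, which should keep datetime columns as `datetime64[ns]` filled with `NaT`. This is only a sketch, and it assumes df2 was built with explicit dtypes; the typed df2 below is a made-up example, not the one from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [1, 2, 3]})

# Hypothetical empty frame whose columns carry explicit dtypes.
df2 = pd.DataFrame({
    'x': pd.Series(dtype='float64'),
    'y': pd.Series(dtype='datetime64[ns]'),
    'z': pd.Series(dtype='float64'),
})

# Reindexing the empty frame onto df1's index yields one all-missing row
# per df1 row, keeping each column's dtype where that dtype can hold NA.
padding = df2.reindex(df1.index)

# Column-wise concat on the shared index gives df1 plus the padded columns.
dfm = pd.concat([df1, padding], axis=1)
```

Integer columns cannot hold NaN, so they will typically come back as float after the reindex.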
6
try:
    dfm = pd.merge(df1, df2, how='outer', left_index=True, right_on='z')
except IndexError:
    dfm = df1 if not df1.empty else df2

This might be sufficient for your needs.

  • 1
    This isn't the same as merging. After a merge, I would expect all columns to be part of dfm.
    – orange
    Mar 3, 2015 at 1:50
