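Solution:
Another possible approach is to stack the three dataframes vertically, set the Timestamp column as the index, and then stack and unstack the result so that data1, data2 and data3 become side-by-side columns: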
pd.concat([df1, df2, df3]).set_index('Timestamp').stack().unstack(1).reset_index()
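For anyone who wants to run this, here is a minimal self-contained sketch. The three input frames are reconstructed from the Step 1 output shown further down, so their exact construction is an assumption. The extra .dropna() is also not part of the one-liner above: depending on the pandas version, stack() may keep the NaN placeholders, in which case unstack() would complain about duplicate index entries, and dropping them first should be harmless either way.

```python
import pandas as pd

# Input frames reconstructed from the Step 1 output (assumed construction).
df1 = pd.DataFrame({
    "Timestamp": ["2019/04/02 10:00:00", "2019/04/02 10:10:00",
                  "2019/04/02 10:20:00", "2019/04/02 10:30:00"],
    "data1": [1.0, 1.0, 1.0, 1.0],
})
df2 = pd.DataFrame({
    "Timestamp": ["2019/04/02 10:00:00", "2019/04/02 10:15:00",
                  "2019/04/02 10:30:00", "2019/04/02 10:45:00",
                  "2019/04/02 11:00:00"],
    "data2": [2.0, 22.0, 222.0, 2222.0, 22222.0],
})
df3 = pd.DataFrame({
    "Timestamp": ["2019/04/02 10:00:00", "2019/04/02 10:30:00",
                  "2019/04/02 11:00:00", "2019/04/02 11:30:00"],
    "data3": [3.0, 33.0, 333.0, 3333.0],
})

result = (
    pd.concat([df1, df2, df3])   # stack the frames vertically
    .set_index("Timestamp")      # timestamps become the (non-unique) index
    .stack()                     # long format: (Timestamp, column name) -> value
    .dropna()                    # added here: make sure no NaN placeholders remain
    .unstack(1)                  # column names move back to the columns
    .reset_index()               # Timestamp becomes a regular column again
)
print(result)
```

The timestamps are plain strings here, as in the outputs shown in this answer, so the final rows are ordered lexicographically, which for this format matches chronological order.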
Intermediate steps
Step 1
The code
pd.concat([df1, df2, df3]).set_index('Timestamp')
produces:
                     data1    data2   data3
Timestamp
2019/04/02 10:00:00    1.0      NaN     NaN
2019/04/02 10:10:00    1.0      NaN     NaN
2019/04/02 10:20:00    1.0      NaN     NaN
2019/04/02 10:30:00    1.0      NaN     NaN
2019/04/02 10:00:00    NaN      2.0     NaN
2019/04/02 10:15:00    NaN     22.0     NaN
2019/04/02 10:30:00    NaN    222.0     NaN
2019/04/02 10:45:00    NaN   2222.0     NaN
2019/04/02 11:00:00    NaN  22222.0     NaN
2019/04/02 10:00:00    NaN      NaN     3.0
2019/04/02 10:30:00    NaN      NaN    33.0
2019/04/02 11:00:00    NaN      NaN   333.0
2019/04/02 11:30:00    NaN      NaN  3333.0
Step 2
The code
pd.concat([df1, df2, df3]).set_index('Timestamp').stack()
produces:
Timestamp            variable
2019/04/02 10:00:00  data1           1.0
2019/04/02 10:10:00  data1           1.0
2019/04/02 10:20:00  data1           1.0
2019/04/02 10:30:00  data1           1.0
2019/04/02 10:00:00  data2           2.0
2019/04/02 10:15:00  data2          22.0
2019/04/02 10:30:00  data2         222.0
2019/04/02 10:45:00  data2        2222.0
2019/04/02 11:00:00  data2       22222.0
2019/04/02 10:00:00  data3           3.0
2019/04/02 10:30:00  data3          33.0
2019/04/02 11:00:00  data3         333.0
2019/04/02 11:30:00  data3        3333.0
dtype: float64
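Step 3
Finally, the code
pd.concat([df1, df2, df3]).set_index('Timestamp').stack().unstack(1)
produces the following (note that unstack(1) takes level 1 of the MultiIndex, that is, the level holding data1, data2 and data3, and turns it into the 3 columns of the final dataframe):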
                     data1    data2   data3
Timestamp
2019/04/02 10:00:00    1.0      2.0     3.0
2019/04/02 10:10:00    1.0      NaN     NaN
2019/04/02 10:15:00    NaN     22.0     NaN
2019/04/02 10:20:00    1.0      NaN     NaN
2019/04/02 10:30:00    1.0    222.0    33.0
2019/04/02 10:45:00    NaN   2222.0     NaN
2019/04/02 11:00:00    NaN  22222.0   333.0
2019/04/02 11:30:00    NaN      NaN  3333.0
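To make the role of the level argument concrete, here is a small sketch on a hypothetical two-level Series shaped like the Step 2 result. The shortened timestamps and the level names are illustrative assumptions, not taken from the real data.

```python
import pandas as pd

# Hypothetical two-level Series: level 0 = Timestamp, level 1 = variable.
idx = pd.MultiIndex.from_tuples(
    [("10:00", "data1"), ("10:00", "data2"), ("10:30", "data1")],
    names=["Timestamp", "variable"],
)
s = pd.Series([1.0, 2.0, 1.0], index=idx)

wide = s.unstack(1)              # level 1 (the data names) becomes the columns
by_name = s.unstack("variable")  # same thing, selecting the level by name
flipped = s.unstack(0)           # level 0 (the timestamps) becomes the columns

print(wide)
print(flipped)
```

Combinations that do not occur in the long Series, such as ("10:30", "data2") here, show up as NaN in the wide result, which is where the NaN cells in the Step 3 table come from.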
Output
             Timestamp  data1    data2   data3
0  2019/04/02 10:00:00    1.0      2.0     3.0
1  2019/04/02 10:10:00    1.0      NaN     NaN
2  2019/04/02 10:15:00    NaN     22.0     NaN
3  2019/04/02 10:20:00    1.0      NaN     NaN
4  2019/04/02 10:30:00    1.0    222.0    33.0
5  2019/04/02 10:45:00    NaN   2222.0     NaN
6  2019/04/02 11:00:00    NaN  22222.0   333.0
7  2019/04/02 11:30:00    NaN      NaN  3333.0