Convert Timestamp to str value python pandas dataframe

I have a DataFrame that looks like this:

    Date        Player          Fee
0   2017-01-08  Steven Berghuis 6500000
1   2017-07-18  Jerry St. Juste 4500000
2   2017-07-18  Ridgeciano Haps 600000
3   2017-01-07  Sofyan Amrabat  400000

I want to replace each date value with a str, depending on which date range it falls in:

def is_in_range(x):
    ses1 = pd.to_datetime('2013-02-01')
    ses2 = pd.to_datetime('2014-02-01')
    ses3 = pd.to_datetime('2015-02-01')
    ses4 = pd.to_datetime('2016-02-01')
    ses5 = pd.to_datetime('2017-02-01')
    ses6 = pd.to_datetime('2018-02-01')

    if x < ses1 :
        x = '2012-13'
    if x > ses2 and x < ses3 :
        x = '2013-14'
    if x > ses3 and x < ses4 :
        x = '2014-15'
    if x > ses4 and x < ses5 :
        x = '2015-16'
    if x > ses5 and x < ses6 :
        x = '2016-17'
    return ses6
aj = ajax_t['Date'].apply(is_in_range)
aj

TypeError                                 Traceback (most recent call last)
in ()
     18         x = '2016-17'
     19     return ses6
---> 20 aj = ajax_t['Date'].apply(is_in_range)
     21 aj

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   2353         else:
   2354             values = self.asobject
-> 2355             mapped = lib.map_infer(values, f, convert=convert_dtype)
   2356
   2357         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer (pandas/_libs/lib.c:66645)()

in is_in_range(x)
     15     if x > ses4 and x < ses5 :
     16         x = '2015-16'
---> 17     if x > ses5 and x < ses6 :
     18         x = '2016-17'
     19     return ses6

pandas/_libs/tslib.pyx in pandas._libs.tslib._Timestamp.__richcmp__ (pandas/_libs/tslib.c:20281)()

TypeError: Cannot compare type 'Timestamp' with type 'str'

I get this error. Any suggestions would be appreciated.


Use pd.cut:

ses1 = pd.to_datetime('2013-02-01')
ses2 = pd.to_datetime('2014-02-01')
ses3 = pd.to_datetime('2015-02-01')
ses4 = pd.to_datetime('2016-02-01')
ses5 = pd.to_datetime('2017-02-01')
ses6 = pd.to_datetime('2018-02-01')

pd.cut(df.Date,[ses1,ses2,ses3,ses4,ses5,ses6],labels=['2012-13','2013-14','2014-15','2015-16','2016-17'])


Out[1227]:
0    2015-16
1    2016-17
2    2016-17
3    2015-16
Name: Date, dtype: category

Or, with the bin edges in a single DatetimeIndex:

ses = pd.to_datetime(['2013-02-01','2014-02-01','2015-02-01','2016-02-01','2017-02-01','2018-02-01'])
pd.cut(df.Date,ses,labels=['2012-13','2013-14','2014-15','2015-16','2016-17'])
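Since the question asks for str values and pd.cut returns a categorical Series, the following is a minimal self-contained sketch of the same approach with a final cast to strings. The sample rows are copied from the question; the column name `Season` is an assumption introduced only for illustration.

```python
import pandas as pd

# Sample data copied from the question; `df` follows the naming in the answer above.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-01-08", "2017-07-18", "2017-07-18", "2017-01-07"]),
    "Player": ["Steven Berghuis", "Jerry St. Juste", "Ridgeciano Haps", "Sofyan Amrabat"],
    "Fee": [6500000, 4500000, 600000, 400000],
})

# Six bin edges produce five intervals, so five labels are needed.
ses = pd.to_datetime(["2013-02-01", "2014-02-01", "2015-02-01",
                      "2016-02-01", "2017-02-01", "2018-02-01"])
labels = ["2012-13", "2013-14", "2014-15", "2015-16", "2016-17"]

# pd.cut returns a categorical Series; intervals are right-closed by default.
season = pd.cut(df["Date"], bins=ses, labels=labels)

# Cast to plain strings when a str column is required ("Season" is a hypothetical name).
df["Season"] = season.astype(str)
print(df[["Date", "Season"]])
```

Dates that fall outside the outermost edges come back from pd.cut as missing values, so it may be worth checking for them before casting.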


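If necessary, convert the column with to_datetime first. Then store the result in a different variable, such as y, instead of x: in the original function x is overwritten with a string by the first matching if, so the next comparison checks a str against a Timestamp and raises the TypeError.

The function should also return y: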

import pandas as pd

# Make sure the column holds datetimes rather than strings
ajax_t['Date'] = pd.to_datetime(ajax_t['Date'])

def is_in_range(x):
    ses1 = pd.to_datetime('2013-02-01')
    ses2 = pd.to_datetime('2014-02-01')
    ses3 = pd.to_datetime('2015-02-01')
    ses4 = pd.to_datetime('2016-02-01')
    ses5 = pd.to_datetime('2017-02-01')
    ses6 = pd.to_datetime('2018-02-01')

    # Default for dates that match none of the ranges below
    # (for example, anything between ses1 and ses2)
    y = None
    if x < ses1:
        y = '2012-13'
    if x > ses2 and x < ses3:
        y = '2013-14'
    if x > ses3 and x < ses4:
        y = '2014-15'
    if x > ses4 and x < ses5:
        y = '2015-16'
    if x > ses5 and x < ses6:
        y = '2016-17'
    # Return the label, leaving the Timestamp x untouched
    return y

aj = ajax_t['Date'].apply(is_in_range)
print (aj)
0    2015-16
1    2016-17
2    2016-17
3    2015-16
Name: Date, dtype: object
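To make the cause of the error concrete, here is a small sketch that reproduces the failing comparison in isolation and then shows one way to avoid it by returning early. The helper name `season_of` is hypothetical, and it covers only the two ranges needed for the sample dates.

```python
import pandas as pd

ses5 = pd.Timestamp("2017-02-01")

# What the question's function does: x starts as a Timestamp,
# then an earlier `if` replaces it with a string.
x = pd.Timestamp("2017-01-08")
x = "2015-16"

try:
    x > ses5  # str compared with Timestamp
    raised = False
except TypeError:
    raised = True

def season_of(ts):
    """Hypothetical helper: return on the first match and never reassign ts."""
    if pd.Timestamp("2016-02-01") < ts < pd.Timestamp("2017-02-01"):
        return "2015-16"
    if pd.Timestamp("2017-02-01") < ts < pd.Timestamp("2018-02-01"):
        return "2016-17"
    return None  # no range matched

print(raised, season_of(pd.Timestamp("2017-07-18")))
```

Returning as soon as a range matches means no later comparison ever sees a string, which is the same effect the separate variable y achieves.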


Apparently, the Date column was not loaded into the DataFrame ajax_t as datetime. Try converting it:

ajax_t['Date'] = pd.to_datetime(ajax_t.Date)

Alternatively, if you load the DataFrame ajax_t from a file (for example, data.csv), you can pass a parameter that forces the Date column to be parsed as datetime:

ajax_t = pd.read_csv('data.csv', parse_dates=['Date'])
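A quick way to see what parse_dates changes is to read the same text twice and compare the column dtypes. This is only a sketch: it uses an in-memory string in place of data.csv, with two rows copied from the question.

```python
import io

import pandas as pd

# In-memory stand-in for data.csv (an assumption for illustration).
csv_text = """Date,Player,Fee
2017-01-08,Steven Berghuis,6500000
2017-07-18,Jerry St. Juste,4500000
"""

# Without parse_dates the Date column is read as text.
raw = pd.read_csv(io.StringIO(csv_text))

# With parse_dates the Date column is parsed into datetimes.
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

print(raw["Date"].dtype)
print(parsed["Date"].dtype)
```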

Hope this helps.


You can try specifying the date format explicitly:

# The format string must match the text being parsed: year-month-day
ses1 = pd.to_datetime('2017-01-08', format='%Y-%m-%d')
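The format argument has to describe the string exactly. As a minimal sketch, the snippet below parses '2017-01-08' with a matching format and shows that a format that does not match the text, such as '%Y%b/%d', is rejected with a ValueError.

```python
import pandas as pd

# Matching format: four-digit year, two-digit month, two-digit day.
ts = pd.to_datetime("2017-01-08", format="%Y-%m-%d")

# A format that does not match the text is rejected.
try:
    pd.to_datetime("2017-01-08", format="%Y%b/%d")
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(ts, mismatch_raised)
```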